The type of index a system uses dictates the kinds of queries it can run. Cordra’s default index can be configured with a selection of Lucene-based indexes, but some circumstances require something else. In such situations an additional index can be added to Cordra by inserting items into that index in response to Cordra’s lifecycle hooks.
This example shows how to install an extension that allows Cordra to insert data into the graph-based index Neo4j. A graph-based index allows for certain types of queries over highly connected data that are not possible in a single pass with other indexing systems.
This project contains lifecycle hooks written in Java that will create nodes and relationships in a Neo4j graph database. The nodes created correspond to Cordra objects and the relationships are determined by properties on the Cordra objects that are defined as handleReference in the JSON Schema for that type of object.
You can download a desktop installation for Neo4j here: https://neo4j.com/download/
On the version of the Neo4j Desktop app we tested, it was necessary to reset its DBMS password before it would allow clients to connect. To do this, run the Neo4j Desktop app and select “Movie DBMS” from the “Example Project”. This shows information about the DBMS in the right panel. In that panel, click the “Details” tab, then enter a new admin password in the “Reset DBMS password” section.
Ensure the sample database is empty using:
MATCH (n) DETACH DELETE n
Obtain the file cordra-neo4j-test-1.0.0.jar from the Cordra download distribution in the extensions/neo4j-test directory.
Then add it as a payload with the payload name “java” to the design object in Cordra. You can see the design object by
signing in to Cordra as admin and then from the menu select Admin->Design Object… Edit the design object and scroll all
the way down to add payloads.
Also add a payload named “neo4jConfig” containing a config.json file to the design object. An example config.json is below (and also in the extensions/neo4j-test directory):
{
    "user": "neo4j",
    "password": "password",
    "uri": "bolt://localhost:7687",
    "databaseName": "neo4j",
    "propertyNameMode": "topLevel",
    "excludeTypes": ["User", "Group", "CordraDesign", "Schema"],
    "includeTypes": ["Person", "Movie"]
}
The properties “user”, “password” and “uri” are required.
The property “databaseName” is optional and will default to the value “neo4j” if missing.
The property “propertyNameMode” configures the method used to generate the node properties from the Cordra object properties. It may take the string values “topLevel” or “jsonPointer”. The value “topLevel” is the default if the property is missing.
“topLevel” takes the top-level properties of the Cordra object content that have primitive values and uses their names as the Neo4j node property names.
“jsonPointer” creates a JSON Pointer for each of the properties of the Cordra object’s content and uses those JSON Pointers as the Neo4j node property names.
Note that in both cases the created node will be given the properties “_type” and “_id”.
The properties “excludeTypes” and “includeTypes” can be used independently or together to control which Cordra types have their objects created as nodes in Neo4j.
“excludeTypes” can be used to specify types that should not be created in Neo4j. If missing, no types will be explicitly excluded.
“includeTypes” can be used to specify types that should be created in Neo4j. If missing, all types not listed in “excludeTypes” will be created.
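The interaction of the two lists can be expressed as a small predicate. The Python sketch below is illustrative only; the function name is hypothetical, and two details are assumptions rather than documented behavior: a type listed in both properties is treated as excluded, and an “includeTypes” list that is present but empty admits no types.

```python
def should_index(object_type, config):
    """Hypothetical helper: decide whether a Cordra type becomes a Neo4j node."""
    exclude = config.get("excludeTypes")
    include = config.get("includeTypes")
    # Assumption: exclusion wins when a type appears in both lists.
    if exclude and object_type in exclude:
        return False
    # When includeTypes is present, only the listed types are indexed.
    if include is not None:
        return object_type in include
    # includeTypes missing: everything not excluded is indexed.
    return True


example = {
    "excludeTypes": ["User", "Group", "CordraDesign", "Schema"],
    "includeTypes": ["Person", "Movie"],
}
decisions = {t: should_index(t, example) for t in ["Person", "User", "Document"]}
```

With the example configuration, `Person` is indexed, `User` is excluded, and `Document` is skipped because it is not listed in “includeTypes”.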
The property “verbose” is an optional boolean with a default value of false. When set to true, queries sent to Neo4j will be logged to standard out.
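The required and optional properties described above can be summarized as a small validation routine. This Python sketch is not part of the extension; the names `load_config`, `REQUIRED`, and `DEFAULTS` are hypothetical, and rejecting an unknown “propertyNameMode” value is an assumption about how strict such a check would be.

```python
import json

REQUIRED = ("user", "password", "uri")
DEFAULTS = {
    "databaseName": "neo4j",
    "propertyNameMode": "topLevel",
    "verbose": False,
}


def load_config(text):
    """Hypothetical helper: parse config.json text and apply documented defaults."""
    config = json.loads(text)
    missing = [key for key in REQUIRED if key not in config]
    if missing:
        raise ValueError("missing required properties: " + ", ".join(missing))
    merged = {**DEFAULTS, **config}
    if merged["propertyNameMode"] not in ("topLevel", "jsonPointer"):
        raise ValueError("propertyNameMode must be topLevel or jsonPointer")
    return merged


minimal = load_config(
    '{"user": "neo4j", "password": "password", "uri": "bolt://localhost:7687"}'
)
```

Here `minimal` carries the three supplied values plus the defaults for “databaseName”, “propertyNameMode”, and “verbose”. The type lists are left absent so that a missing list stays distinguishable from an empty one.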
To demonstrate the functionality of this project, create the types “Movie” and “Person” in Cordra. The JSON Schemas for those types are in the extensions/neo4j-test directory. You could then create a Movie instance “Top Gun” and a Person instance “Tom Cruise”, and on that Person object add the “Top Gun” movie to the “ACTED_IN” array.
The example comes with a class MoviesImporter which will import many actors and movies into Cordra. Run the script import-movies in the extensions/neo4j-test directory with the base URI, username, and password for your running Cordra instance, for example:
./import-movies -b https://localhost:8443 -u admin -p password
Some sample Neo4j queries to try are in the file sample-queries.txt in the extensions/neo4j-test directory.
To get the Neo4j Browser to display Movie and Person objects with useful captions, you can create a file “style.grass” and drag it onto the Neo4j Browser window; an example is below and also in the extensions/neo4j-test directory:
node {
  diameter: 50px;
  color: #A5ABB6;
  border-color: #9AA1AC;
  border-width: 2px;
  text-color-internal: #FFFFFF;
  font-size: 10px;
}
relationship {
  color: #A5ABB6;
  shaft-width: 1px;
  font-size: 8px;
  padding: 3px;
  text-color-external: #000000;
  text-color-internal: #FFFFFF;
  caption: "<type>";
}
node.CordraObject {
  color: #F79767;
  border-color: #f36924;
  text-color-internal: #FFFFFF;
  defaultCaption: "<id>";
  caption: "{title}";
}
node.Person {
  color: #57C7E3;
  border-color: #23b3d7;
  text-color-internal: #2A2C34;
  defaultCaption: "<id>";
  caption: "{name}";
}
node.Movie {
  color: #F16667;
  border-color: #eb2728;
  text-color-internal: #FFFFFF;
  defaultCaption: "<id>";
  caption: "{title}";
}
All Neo4j nodes will be given the label “CordraObject” and a label that corresponds to their type.
This project automatically creates the relationships between nodes in Neo4j. It does this by inspecting the JSON Schema of the Cordra object that is being created or updated. Any property in the content of the Cordra object whose schema carries the JSON Schema property cordra.type.handleReference will be treated as a relationship in Neo4j.
The name of the relationship comes from the name of the property on the Cordra object. If the property is deep in a nested JSON structure, the JSON Pointer is processed from right to left, taking the first non-numeric part as the relationship name. For example, given the JSON object:
{
    "ACTED_IN": [
        "test/123"
    ]
}
The string value “test/123” is a reference to another Cordra object. The JSON Pointer to this value is “/ACTED_IN/0”, so the resulting relationship name used in Neo4j will be “ACTED_IN”.
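The right-to-left rule can be sketched as a short Python function. This is an illustration rather than the extension's Java code: the function name is hypothetical, and treating “non-numeric” as “not made up entirely of digits” is an assumption.

```python
def relationship_name(json_pointer):
    """Hypothetical helper: derive a relationship name from a JSON Pointer."""
    # A JSON Pointer starts with "/", so the first split element is empty.
    tokens = json_pointer.split("/")[1:]
    # Walk from the right and skip array indexes such as "0".
    for token in reversed(tokens):
        if token and not token.isdigit():
            # Undo RFC 6901 escaping: "~1" is "/", "~0" is "~".
            return token.replace("~1", "/").replace("~0", "~")
    raise ValueError(f"no property name found in pointer: {json_pointer!r}")


simple = relationship_name("/ACTED_IN/0")
nested = relationship_name("/credits/3/DIRECTED/1")
```

For the object above, `simple` is “ACTED_IN”. In the nested case, the pointer `/credits/3/DIRECTED/1` (an invented example) yields “DIRECTED”, because the trailing `1` is skipped and `DIRECTED` is the first non-numeric part from the right.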
Because the Cordra Neo4j indexer gives all nodes in Neo4j the label “CordraObject”, you can view the graph of all Cordra objects with the Neo4j query:
MATCH (n:CordraObject) RETURN n;
When you add the jar as a “java” payload to the design object, a number of static operations (also known as static type methods) are added to the design object. You can see these in the Cordra UI by selecting Admin->Design Object… and then clicking the “Methods” tab:
“reloadNeo4jConfig”: If you invoke this method the “neo4jConfig” JSON payload will be read and the connection to Neo4j will be reinitialized using that configuration. This allows you to change the configuration payload without needing to restart Cordra.
“deleteInNeo4j”: Accepts an “id” attribute for the single object that should be deleted from the Neo4j index. Note that the delete operation will fail if there are other nodes that have relationships pointing at the node being deleted.
“deleteAllInNeo4j”: Removes all Cordra object nodes and relationships from the Neo4j index.
“searchNeo4j”: Accepts a “query” attribute that is expected to contain a query in the Cypher query syntax. This query is passed to the Neo4j index and the full response is returned as JSON.
Neo4j does not permit creating a relationship unless the nodes on both sides of the relationship exist. For example, if you have A->B and you try to index A before indexing B, the operation will fail. The same is true if you have A->B and B->A, where no ordering of the objects can succeed in a single pass. The solution is to index objects in two passes: a first pass to create the nodes in the index and a second pass to create the relationships.
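The ordering problem and the two-pass fix can be demonstrated with a toy in-memory graph. The Python sketch below does not talk to Neo4j; `ToyGraph` and both indexing functions are hypothetical stand-ins that only reproduce the rule that a relationship needs both of its nodes to exist.

```python
class MissingNodeError(Exception):
    """Raised when a relationship refers to a node that does not exist yet."""


class ToyGraph:
    """In-memory stand-in for the graph index (not a Neo4j client)."""

    def __init__(self):
        self.nodes = set()
        self.relationships = set()

    def add_node(self, node_id):
        self.nodes.add(node_id)

    def add_relationship(self, source, name, target):
        if source not in self.nodes or target not in self.nodes:
            raise MissingNodeError(f"{source} -[{name}]-> {target}")
        self.relationships.add((source, name, target))


def index_single_pass(graph, objects):
    # Each node is created together with its relationships.
    for object_id, references in objects.items():
        graph.add_node(object_id)
        for name, target in references:
            graph.add_relationship(object_id, name, target)


def index_two_pass(graph, objects):
    # Pass 1: create every node first.
    for object_id in objects:
        graph.add_node(object_id)
    # Pass 2: all targets now exist, so relationships can be created.
    for object_id, references in objects.items():
        for name, target in references:
            graph.add_relationship(object_id, name, target)


# A -> B and B -> A: a cycle that no single-pass ordering can index.
cycle = {
    "test/A": [("KNOWS", "test/B")],
    "test/B": [("KNOWS", "test/A")],
}
graph = ToyGraph()
index_two_pass(graph, cycle)
```

Calling `index_single_pass` on `cycle` raises `MissingNodeError` on the first relationship, whereas `index_two_pass` leaves `graph` with both nodes and both relationships.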
This project provides a number of custom operations that allow you to instruct Cordra to reindex its objects in Neo4j. These operations either perform the reindex in two passes, as indicated by the name of the operation, or accept a boolean attribute “includeRelationships” which defaults to true. If you are using one of the operations that accepts “includeRelationships”, you will need to invoke the operation twice, first with “includeRelationships”:false and then again with “includeRelationships”:true.
“reindexAllInNeo4jTwoPass”: Reindexes all Cordra objects into Neo4j in two passes, first just the nodes and then the nodes with relationships.
“reindexAllInNeo4j”: Reindexes all Cordra objects into Neo4j in a single pass. Accepts a boolean attribute “includeRelationships” which defaults to true.
“reindexQueryResultsInNeo4jTwoPass”: Accepts a “query” attribute. This is a Lucene query that is applied to Cordra index. The results are indexed into Neo4j in two passes, first just the nodes and then the nodes with relationships.
“reindexQueryResultsInNeo4j”: Accepts a “query” attribute. This is a Lucene query that is applied to the Cordra index. The results are indexed into Neo4j in a single pass. Accepts a boolean attribute “includeRelationships” which defaults to true.
“reindexOneInNeo4j”: Accepts an “id” attribute and boolean attribute “includeRelationships”. This operation will reindex the object specified in the “id” attribute.
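For the operations that take “includeRelationships”, the required call sequence can be sketched as follows. This Python sketch makes two assumptions: `invoke` is a hypothetical caller-supplied function that sends an operation name and its attributes to Cordra by whatever client you use, and the attributes are passed as a JSON-style object. No particular Cordra client API is implied.

```python
def reindex_ids_in_two_calls(invoke, object_ids):
    """Hypothetical helper: reindex objects with nodes first, relationships second."""
    # First round creates the nodes only; second round adds the relationships.
    for include_relationships in (False, True):
        for object_id in object_ids:
            invoke(
                "reindexOneInNeo4j",
                {"id": object_id, "includeRelationships": include_relationships},
            )


# Stand-in for a real client: record the calls instead of sending them.
calls = []
reindex_ids_in_two_calls(
    lambda name, attributes: calls.append((name, attributes)),
    ["test/A", "test/B"],
)
```

Every object is sent once with “includeRelationships” set to false before any object is sent with it set to true, so all nodes exist by the time the first relationship is created.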