Partial Replication/Aggregation

This project demonstrates how to install a module into Cordra that performs partial replication/aggregation from other Cordra servers.

This is pull-based replication in that the destination Cordra server is the one doing the work of pulling objects from one or more remote Cordra servers. Which objects are pulled is highly configurable: types of objects can be explicitly included or excluded, and a custom query can be supplied to further filter which objects are pulled. One advantage of this pull-based approach is that it gives the destination Cordra server complete control over which objects are replicated and when they are replicated. This can be useful when the destination Cordra server has limited resources, or when the network connection between the two Cordra servers is unreliable.

Replication in this project is performed using the existing Cordra search API. In short, the pulling Cordra periodically searches each remote Cordra for new or updated objects that match the constructed query. This is possible because each object contains a transaction id, and that transaction id is indexed and searchable. As such, the pulling Cordra can record the highest transaction id it has seen so far and, on the next poll, include a range search from that highest seen transaction id to the latest transaction id.
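The high-water-mark idea described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the module's actual code: the remote search is replaced by filtering an in-memory list, the names poll_once, remote_objects and local_store are hypothetical, and the comparison is strictly "greater than" the recorded id (whether the real range search is inclusive is not stated here).

```python
def poll_once(remote_objects, local_store, latest_txn_id):
    """Copy objects newer than latest_txn_id and return the new high-water mark.

    remote_objects stands in for the result of a range search on the remote
    Cordra's indexed transaction id (an assumption made for this sketch).
    """
    changed = [o for o in remote_objects if o["txnId"] > latest_txn_id]
    # Apply changes in transaction order so the newest version of an object wins.
    for obj in sorted(changed, key=lambda o: o["txnId"]):
        local_store[obj["id"]] = obj
        latest_txn_id = max(latest_txn_id, obj["txnId"])
    return latest_txn_id


remote = [
    {"id": "test/1", "txnId": 5, "content": {"name": "a"}},
    {"id": "test/2", "txnId": 7, "content": {"name": "b"}},
]
local = {}

# First poll: nothing seen yet, so both objects are pulled.
mark = poll_once(remote, local, latest_txn_id=0)

# The source updates test/1, which gives it a newer transaction id.
remote[0] = {"id": "test/1", "txnId": 9, "content": {"name": "a2"}}

# Second poll: only the updated object is newer than the recorded mark.
mark = poll_once(remote, local, latest_txn_id=mark)
```

Because only the highest seen transaction id has to be remembered between polls, the destination needs very little state per remote service, which is what the latestTxnId property in the schema below appears to hold.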

Installation Instructions

Note that installation and configuration are only performed on the destination Cordra instance. This instance will pull objects from the source instance. From the source Cordra’s perspective, it is simply responding to ordinary search and retrieve requests. It may be necessary to add a user account and access control on the source instance.

  1. Obtain the file cordra-replication-1.0.0.jar from the Cordra download distribution in the extensions/replication directory.

  2. Create a new type in Cordra called CordraReplicationState.

The JsonSchema for this type is:

{
    "type": "object",
    "properties": {
        "serviceId": {
            "type": "string"
        },
        "pollingIntervalSeconds": {
            "type": "number"
        },
        "services": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "id": {
                        "type": "string"
                    },
                    "baseUri": {
                        "type": "string"
                    },
                    "ipAddress": {
                        "type": "string"
                    },
                    "port": {
                        "type": "number"
                    },
                    "username": {
                        "type": "string"
                    },
                    "password": {
                        "type": "string"
                    },
                    "protocol": {
                        "type": "string",
                        "enum": [
                            "DOIP",
                            "DOIP-HTTP",
                            "CORDRA-HTTP"
                        ]
                    },
                    "latestTxnId": {
                        "type": "number"
                    },
                    "error": {
                        "type": "string"
                    },
                    "lastRunTimestamp": {
                        "type": "string"
                    },
                    "includeTypes": {
                        "type": "array",
                        "items": {
                            "type": "string"
                        }
                    },
                    "excludeTypes": {
                        "type": "array",
                        "items": {
                            "type": "string"
                        }
                    },
                    "customQuery": {
                        "type": "string"
                    }
                }
            }
        }
    }
}
  3. Attach the jar file as a payload to the CordraReplicationState Schema object. You must name that payload “java”.

  4. Configure access control for the type CordraReplicationState such that only your admin user can read and write to objects of that type. This is important as instances of CordraReplicationState contain the credentials used to connect to other Cordra instances.

  5. Create an instance of the CordraReplicationState type. Give it the following values - change the details as needed:

    {
        "pollingIntervalSeconds": 60,
        "services": [
            {
                "id": "test/service",
                "baseUri": "https://10.0.1.157:8443/doip/",
                "ipAddress": "10.0.1.157",
                "port": 9000,
                "username": "admin",
                "password": "password",
                "protocol": "DOIP",
                "excludeTypes": [
                    "Schema"
                ]
            }
        ]
    }
    

“pollingIntervalSeconds”: Number of seconds to wait between the end of one replication process and starting the next.

The “services” property holds an array of objects where each object describes how to connect to another Cordra service that this Cordra will be pulling from.

“id”: The id of the service being pulled from.

“baseUri”: The base URI of the remote Cordra to use if connecting via DOIP-HTTP or CORDRA-HTTP protocols.

“ipAddress”: The IP address of the remote Cordra to use if connecting via the binary DOIP protocol.

“port”: The port of the remote Cordra to use if connecting via the binary DOIP protocol.

“username”: The username to use when talking to this remote Cordra.

“password”: The password to use when talking to this remote Cordra.

“protocol”: The protocol to use when talking to this remote Cordra. May be “DOIP”, “DOIP-HTTP”, or “CORDRA-HTTP”.

“excludeTypes”: Types of Cordra objects to be excluded when pulling from this remote Cordra.

“includeTypes”: (Optional) Types of Cordra objects to be included when pulling from this remote Cordra.

“customQuery”: (Optional) A custom query that will be ANDed into the replication query when pulling new objects.
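To make the interaction between “includeTypes”, “excludeTypes”, “customQuery” and the transaction id range more concrete, here is a minimal sketch of how such a replication query could be assembled. It is an assumption-laden illustration, not the module's implementation: the function name build_replication_query is hypothetical, the index field names "txnId" and "type" are placeholders that may differ from the fields Cordra actually indexes, and Lucene-style query syntax is assumed.

```python
def build_replication_query(latest_txn_id, include_types=None,
                            exclude_types=None, custom_query=None):
    """Combine the per-service filters into one query string.

    Field names ("txnId", "type") are placeholders for this sketch and are
    not confirmed Cordra index field names.
    """
    # Always restrict the search to objects at or after the recorded id.
    clauses = [f"txnId:[{latest_txn_id} TO *]"]
    if include_types:
        # Any one of the included types is acceptable.
        types = " OR ".join(f'type:"{t}"' for t in include_types)
        clauses.append(f"({types})")
    for t in exclude_types or []:
        # Every excluded type is ruled out individually.
        clauses.append(f'NOT type:"{t}"')
    if custom_query:
        # The custom query is ANDed in as its own parenthesized clause.
        clauses.append(f"({custom_query})")
    return " AND ".join(clauses)


query = build_replication_query(
    42, exclude_types=["Schema"], custom_query="/name:alice"
)
# query == 'txnId:[42 TO *] AND NOT type:"Schema" AND (/name:alice)'
```

Wrapping the custom query in parentheses keeps any OR operators inside it from changing the meaning of the surrounding clauses.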

Behavior

In order to demonstrate the replication behavior:

  1. Run a source Cordra.

  2. Run a destination Cordra, configured with CordraReplicationState as above, with the source Cordra as a service to be pulled from.

Any objects (except of types excluded by the configuration in the destination Cordra’s CordraReplicationState) which are created or updated in the source Cordra will be replicated into the destination Cordra, after at most the time interval in the configured pollingIntervalSeconds.
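The timing implied by pollingIntervalSeconds (a wait measured from the end of one replication run to the start of the next) can be sketched as a simple loop. This is a hypothetical illustration only: run_replication_loop, replicate and max_runs are names invented for the sketch, and the real module presumably runs until stopped rather than for a fixed number of runs.

```python
import time


def run_replication_loop(replicate, polling_interval_seconds, max_runs,
                         sleep=time.sleep):
    """Run replicate() max_runs times, waiting between consecutive runs.

    The wait starts only after a run has finished, so a slow run delays
    the next one instead of overlapping with it.
    """
    for run in range(max_runs):
        replicate()
        if run < max_runs - 1:
            sleep(polling_interval_seconds)


# Record what happens instead of really sleeping, to show the ordering.
events = []
run_replication_loop(
    replicate=lambda: events.append("run"),
    polling_interval_seconds=60,
    max_runs=3,
    sleep=lambda seconds: events.append(("sleep", seconds)),
)
```

Injecting the sleep function keeps the sketch testable without actually waiting for the interval to elapse.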

Custom Operations

The CordraReplicationState type comes with two custom operations, “start” and “stop”. These can be used to resume and pause replication, respectively, while Cordra is running.