Introduction

Cordra is a highly configurable software system. You can learn about design configuration here.

In general, Cordra is configured by (1) the config.json file in the data directory, or its equivalent ZooKeeper node for distributed deployment; and (2) the design object. The config.json file specifies which storage and indexing backends Cordra will use, along with other features that must be configured before Cordra can start. When Cordra is deployed as a single instance using its own built-in servlet container, the config.json file is also used to configure the HTTP and HTTPS interfaces. Other features, which can be modified at runtime and can be determined safely after Cordra startup, are configured in the design object.

For convenience, a repoInit.json file can be used to specify certain changes to be made to the design object at startup. This allows design-object-based configuration to be done using the filesystem. In particular, this can be used to set an initial admin password, which is needed to modify the design object further in a running Cordra. See repoInit.json for general use of repoInit.json, and Locally Run Instance for the specific case of the admin password.
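As a rough illustration of setting the initial admin password through the filesystem, the sketch below writes a minimal repoInit.json into a data directory. The helper name `write_repo_init` is hypothetical, and the `adminPassword` key is an assumption about the file's schema; check the repoInit.json reference for the exact property names before relying on it.

```python
import json
import tempfile
from pathlib import Path


def write_repo_init(data_dir: Path, admin_password: str) -> Path:
    """Write a minimal repoInit.json into data_dir and return its path."""
    if not admin_password:
        raise ValueError("admin_password must not be empty")
    # ASSUMPTION: "adminPassword" is the key Cordra reads at startup.
    # Confirm the name against the repoInit.json documentation.
    payload = {"adminPassword": admin_password}
    target = data_dir / "repoInit.json"
    target.write_text(json.dumps(payload, indent=2) + "\n", encoding="utf-8")
    return target


if __name__ == "__main__":
    # A temporary directory stands in for the real Cordra data directory.
    with tempfile.TemporaryDirectory() as tmp:
        path = write_repo_init(Path(tmp), "change-me")
        print(path.read_text(encoding="utf-8"))
```

In a real deployment the file would go into the Cordra data directory before startup, and the placeholder password would be replaced with a strong one.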

There are a number of options available for deploying Cordra. The simplest is a standalone Cordra installation that uses the local filesystem for storage and an embedded indexer. Cordra is configured to run in this mode by default, and it is a good setup for testing Cordra and any applications built using Cordra.

In addition to configuring the Cordra software to use the local filesystem and embedded indexer, you can also configure a standalone Cordra installation to use external storage and/or indexing services. Those services need to be set up independently, and Cordra then configured to interact with them. Cordra currently supports the following backend services:

Indexing: Apache Lucene (default), system memory Lucene, Elasticsearch, and Apache Solr. Click here for details.
Storage: Filesystem (default), system memory, MongoDB, and Amazon S3. Click here for details.
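The backend choice described above can be sketched as a small helper that checks a requested pairing against the listed options and emits a config.json-style fragment. This is only a sketch: the function name `build_backend_config`, the `index`/`storage`/`module` layout, and the lowercase backend labels are assumptions made for illustration, not confirmed Cordra identifiers. The linked backend details give the real module names and options.

```python
import json

# Labels mirror the list above; they are illustrative, not Cordra's
# actual module identifiers.
SUPPORTED_INDEXERS = {"lucene", "memory", "elasticsearch", "solr"}
SUPPORTED_STORAGE = {"filesystem", "memory", "mongodb", "s3"}


def build_backend_config(index: str = "lucene", storage: str = "filesystem") -> dict:
    """Return a config fragment selecting one indexer and one storage backend.

    Defaults follow the text: Lucene indexing and filesystem storage.
    """
    if index not in SUPPORTED_INDEXERS:
        raise ValueError(f"unsupported indexer: {index!r}")
    if storage not in SUPPORTED_STORAGE:
        raise ValueError(f"unsupported storage backend: {storage!r}")
    # ASSUMPTION: the "index"/"storage" sections with a "module" key are a
    # hypothetical layout for this sketch.
    return {"index": {"module": index}, "storage": {"module": storage}}


if __name__ == "__main__":
    fragment = build_backend_config(index="elasticsearch", storage="mongodb")
    print(json.dumps(fragment, indent=2))
```

Validating the pairing before writing the file catches typos early, since these settings must be in place before Cordra starts.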

Finally, multiple instances of Cordra can be configured as load-sharing nodes of an application, using external storage and indexing systems. This setup requires the use of Apache ZooKeeper and Apache Kafka to handle coordination between the nodes. For detailed instructions on setting up a distributed Cordra service, see Deploying Cordra as a Distributed System.

Managing complex infrastructure requires tools and tutorials covering key management, distributed session management, log management, user management, the administrative interface, the import-export tool, and environment migration.