Names Database Manager

Author: Daniel Needham (daniel.needham@manchester.ac.uk)

Overview

The Names Database Manager library provides an interface to a given Names Database instance in a standardised and consistent way.

The library provides a number of types that extend the normalised data types found in the Names Disambiguation library. The extended types add features such as the ability to be tracked within the database instance, and the ability to serialise and deserialise a record's attributes within the main record table (see Notes).

The library also provides functionality to search over a Names Database instance, and to update records within the database.

The Names Disambiguation library is intentionally abstracted from the Names Database Manager library as much as possible, so that the underlying data is independent of the disambiguation process that created it.

Dependencies

The Names Database Manager is a Maven-managed Java library. Its dependencies are:

  1. Log4j
    • This should be picked up from Maven's central repository
  2. names-disambiguator
    • This currently needs to be downloaded from here and added to your local Maven repository
  3. mysql-connector-java
    • This should be picked up from Maven's central repository

Example usage

// Create a new connection to a Names Database instance
// This can be reused across queries
DatabaseManager dm = new DatabaseManager(driver, database, username, password).connect();

// Build a query
Query query = new Query();
query.setNames("Smith OR Brown");
query.setGetCount(true);

// Execute the query
Response response = dm.find(query);

// Check the response status
// Change information
// Update the database

if (response.getStatus() == Response.OK
    && response.getResults().getCount() > 0){
    NamesRecord n = response.getResults().first();  // get first result
    Name newName = new Name();                      // add a new name
    newName.setChars("J. Smith");
    n.addName(newName);
    dm.addUpdate(response.getResults().getRecords());  // update
}

// Close the database connection
dm.close();

Notes

Currently the database manager interacts with a Names Database instance using the standard JDBC API.
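
As a rough illustration of what a lookup over plain JDBC can look like, the sketch below turns a names expression such as "Smith OR Brown" into a parameterised statement. It is not the library's actual implementation: the class NameSearchSketch, the table name_lookup, its columns record_id and name, and the assumption that OR simply separates alternative names are all hypothetical. Only the java.sql calls are standard JDBC.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; not part of the Names Database Manager API.
class NameSearchSketch {

    // Split an expression like "Smith OR Brown" into its alternative names.
    // Assumes OR is the only operator, which may not match the real query syntax.
    static List<String> terms(String names) {
        List<String> out = new ArrayList<>();
        for (String term : names.split("\\s+OR\\s+")) {
            String trimmed = term.trim();
            if (!trimmed.isEmpty()) {
                out.add(trimmed);
            }
        }
        return out;
    }

    // Build SQL with one placeholder per term against an assumed lookup table.
    static String sql(int termCount) {
        if (termCount < 1) {
            throw new IllegalArgumentException("at least one term is required");
        }
        String placeholders = String.join(", ", Collections.nCopies(termCount, "?"));
        return "SELECT record_id FROM name_lookup WHERE name IN (" + placeholders + ")";
    }

    // Run the lookup on an already open JDBC connection and collect matching ids.
    static List<Long> findRecordIds(Connection conn, String names) throws SQLException {
        List<String> terms = terms(names);
        List<Long> ids = new ArrayList<>();
        try (PreparedStatement ps = conn.prepareStatement(sql(terms.size()))) {
            for (int i = 0; i < terms.size(); i++) {
                ps.setString(i + 1, terms.get(i)); // JDBC parameters are 1-based
            }
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    ids.add(rs.getLong(1));
                }
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        // Prints the generated SQL only; no database is needed for this part.
        System.out.println(sql(terms("Smith OR Brown").size()));
    }
}
```

Binding each term through setString, rather than concatenating it into the SQL text, keeps the statement safe against injection whatever the real schema looks like.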

In the Names data architecture an entity (individual or institution) can have many attributes associated with it, and must be searchable by many of these attributes in combination. For performance reasons (both for insertion and retrieval), full database normalisation was judged unlikely to perform well, particularly with potentially millions of identified individuals, each with hundreds of related attributes.

Instead, entity attributes are serialised into columns in a main record table, and summary tables are then populated with the attributes that can be searched over.
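
The idea can be sketched as follows, under stated assumptions rather than as the library's real format: an entity's attributes are encoded into one string suitable for a single column, and rows for a summary table are derived only for the attributes chosen as searchable. The class AttributeCodec, the URL-encoded key=value wire format, and the (record id, key, value) row shape are all hypothetical. The Charset overloads of URLEncoder and URLDecoder used here need Java 10 or later.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; the real serialisation format is not documented here.
class AttributeCodec {

    // Serialise all attributes into one string for a single column.
    // Keys and values are URL-encoded so '&' and '=' inside the data stay unambiguous.
    // Attributes with an empty value list are not written out.
    static String encode(Map<String, List<String>> attrs) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, List<String>> e : attrs.entrySet()) {
            for (String value : e.getValue()) {
                if (sb.length() > 0) {
                    sb.append('&');
                }
                sb.append(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8))
                  .append('=')
                  .append(URLEncoder.encode(value, StandardCharsets.UTF_8));
            }
        }
        return sb.toString();
    }

    // Rebuild the attribute map from the stored column value.
    static Map<String, List<String>> decode(String column) {
        Map<String, List<String>> attrs = new LinkedHashMap<>();
        if (column.isEmpty()) {
            return attrs;
        }
        for (String pair : column.split("&")) {
            int eq = pair.indexOf('=');
            String key = URLDecoder.decode(pair.substring(0, eq), StandardCharsets.UTF_8);
            String value = URLDecoder.decode(pair.substring(eq + 1), StandardCharsets.UTF_8);
            attrs.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        }
        return attrs;
    }

    // Derive summary-table rows (recordId, key, value) for searchable attributes only.
    static List<String[]> summaryRows(long recordId,
                                      Map<String, List<String>> attrs,
                                      Set<String> searchable) {
        List<String[]> rows = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : attrs.entrySet()) {
            if (!searchable.contains(e.getKey())) {
                continue;
            }
            for (String value : e.getValue()) {
                rows.add(new String[] {Long.toString(recordId), e.getKey(), value});
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        Map<String, List<String>> attrs = new LinkedHashMap<>();
        attrs.put("name", Arrays.asList("J. Smith", "John Smith"));
        attrs.put("note", Arrays.asList("internal only"));
        System.out.println(encode(attrs));
        System.out.println(summaryRows(1L, attrs, Collections.singleton("name")).size());
    }
}
```

The trade-off this illustrates is that a full record can be read or written with a single row in the main table, while searches touch only the much narrower summary rows.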

For this reason it was difficult to find an ORM framework that would neatly fit this paradigm, and it was also difficult to judge how well an ORM would scale under these conditions. That said, further investigation might reveal that the database manager could be adapted to use an ORM such as Hibernate, possibly through Spring's ORM support.