3.1 Resource Descriptors

A Resource Descriptor is a JSON-formatted file that describes the data source you want to import. It always includes a name and a resource attribute, and depending on the resource type it may contain additional attributes. For example, an SQL resource descriptor provides connection information for the RDBMS, while a CSV file resource descriptor may optionally specify the column delimiter and the column names.


Resource names

A resource is identified by its name in the name attribute of the resource descriptor. The resource attribute can include a URI scheme for data that is not file-based or that must be accessed remotely. The following resource schemes are supported; a dash in the Scheme column means that no scheme is given, as with a plain file path.

| Scheme | Resource Type | Available Adapters |
|--------|--------------------|--------------------------------------|
| - | FileSystemResource | CSV, JSON, Excel, GPX, ITN, CIF, PGN |
| file: | FileSystemResource | CSV, JSON, Excel, GPX, ITN, CIF, PGN |
| jdbc: | JdbcResource | SQL databases |
| http: | HttpResource | CSV, JSON, Excel, GPX, ITN, CIF, PGN |
| ftp: | FileSystemResource | CSV, JSON, Excel, GPX, ITN, CIF, PGN |
| kafka: | MessageResource | JSON |
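
As a rough illustration of the table above, the following sketch maps a resource string to its resource type by inspecting the scheme prefix. The function `resourceTypeFor` and the lookup map are hypothetical and are not part of DataBridge. The sketch also assumes that `https:` behaves like `http:` (the table lists only `http:`, but the examples on this page use https URLs) and that anything without a recognised scheme is treated as a filesystem path.

```javascript
// Hypothetical lookup mirroring the scheme table; not DataBridge's own code.
const RESOURCE_TYPES = new Map([
  ["file:", "FileSystemResource"],
  ["jdbc:", "JdbcResource"],
  ["http:", "HttpResource"],
  ["https:", "HttpResource"], // assumption: treated like http:
  ["ftp:", "FileSystemResource"],
  ["kafka:", "MessageResource"],
]);

function resourceTypeFor(resource) {
  // A scheme is a leading letter, then letters/digits/+/./-, then a colon.
  const match = /^([a-z][a-z0-9+.-]*:)/i.exec(resource);
  const scheme = match ? match[1].toLowerCase() : null;
  // Unknown or missing scheme: assume a plain path (the "-" row of the table).
  return RESOURCE_TYPES.get(scheme) ?? "FileSystemResource";
}

console.log(resourceTypeFor("jdbc:mysql://localhost/test"));
console.log(resourceTypeFor("/imports/resources/satellites.csv"));
```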

CSV resources

The resource descriptor for a CSV file on the local filesystem will look something like this:

#!javascript

{
  "name": "satellites-resource",
  "resource": "/imports/resources/satellites.csv",
  "delimiter": ";",
  "columns": ["Object","Orbit","Alt","Program","Manned","Launched","Status"]
}

The resource attribute

This attribute points to the CSV file that should be imported.

If the file is on a local filesystem you can refer to it via its absolute or relative pathname. If the file you want is behind a web server, you can use an HTTP or HTTPS resource descriptor, for example:

"resource" : "https://data.gov.uk/data/resource_cache/.../motsitelist2015.csv"

The delimiter attribute

This attribute defines the column delimiter in the CSV file. Only single-character delimiters are handled. To define a non-printing delimiter, for example a TAB, you can use escape sequences:

#!javascript

  "\a" Bell (alert)
  "\b" Backspace
  "\t" Horizontal tab
  "\v" Vertical tab

If you don't specify a delimiter, the default is a TAB: "\t".
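
The delimiter rules above can be sketched as a small check. The helper name `effectiveDelimiter` is hypothetical, and throwing an error is an assumption: the page only states that multi-character delimiters are not handled, not how the importer reacts to them.

```javascript
// Hypothetical helper illustrating the documented rules; not DataBridge internals.
function effectiveDelimiter(descriptor) {
  // No delimiter attribute: fall back to the documented default, a TAB.
  if (descriptor.delimiter === undefined) return "\t";
  // Only single-character delimiters are handled (assumed here to be an error otherwise).
  if (typeof descriptor.delimiter !== "string" || descriptor.delimiter.length !== 1) {
    throw new Error("delimiter must be a single character");
  }
  return descriptor.delimiter;
}

// In a descriptor parsed from JSON, the "\t" escape becomes one TAB character.
const parsed = JSON.parse(
  '{"name": "satellites-resource", "resource": "/imports/resources/satellites.csv", "delimiter": "\\t"}'
);
console.log(effectiveDelimiter(parsed) === "\t"); // true
```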

The columns attribute

You should specify the columns in your CSV file if it doesn't contain a header row. If you don't specify a columns attribute, the importer will assume the first row of data in your CSV file is a header row, and will use that row to obtain the column names.

Note: The columns attribute must list exactly as many columns as the CSV file actually contains.
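
A minimal sketch of the header logic described above, assuming the CSV text is already in memory and contains no quoted fields. The function `readRows` and its return shape are hypothetical. Rejecting a column-count mismatch with an error is also an assumption, since the page only says that the counts should match.

```javascript
// Hypothetical illustration of how the columns attribute interacts with a header row.
function readRows(csvText, descriptor) {
  const delimiter = descriptor.delimiter ?? "\t"; // documented default is a TAB
  const lines = csvText.split(/\r?\n/).filter((line) => line.length > 0);
  const rows = lines.map((line) => line.split(delimiter));

  // With a columns attribute every line is data;
  // otherwise the first line is consumed as the header row.
  const columns = descriptor.columns ?? rows.shift();
  if (columns === undefined) return []; // empty input and no columns attribute

  return rows.map((row) => {
    if (row.length !== columns.length) {
      throw new Error(`expected ${columns.length} columns, found ${row.length}`);
    }
    // Pair each column name with the value at the same position.
    return Object.fromEntries(columns.map((name, i) => [name, row[i]]));
  });
}
```

Under these assumptions, `readRows("Object;Orbit\nISS;LEO\n", { delimiter: ";" })` and `readRows("ISS;LEO\n", { delimiter: ";", columns: ["Object", "Orbit"] })` produce the same single record.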


Excel resources

The resource descriptor for an Excel spreadsheet on the local filesystem will look something like this:

#!javascript

{
  "resource": "/import/budgets/budget.xlsx",
  "sheet": "current_year"
}

The resource attribute

This attribute points to the Excel file to be imported.

If the file is on a local filesystem you can refer to it via its absolute or relative pathname. If the file you want is behind a web server, you can use an HTTP or HTTPS resource descriptor, for example:

"resource" : "https://data.myob.uk/finacc/budget_2016.xls"

The sheet attribute

This attribute defines which sheet in the workbook should be imported. Currently only one sheet can be specified.


JDBC resources

A JDBC Resource Descriptor will look something like this:

#!javascript

{
  "resource": "jdbc:mysql://localhost/test",
  "user": "sa",
  "password": "",
  "query": "/imports/resource/satellites.sql"
}

The resource attribute

The resource attribute describes the JDBC connection string to be used by the appropriate JDBC driver when connecting to the database.

Please note: the JDBC driver for your database must be installed in the /lib folder of the importer. DataBridge does not bundle any JDBC drivers.

The user and password attributes

These attributes are used to provide connection credentials to the database. If your database doesn't require credentials you can omit these attributes.

The query attribute

The query attribute specifies the SQL query you wish to run against the JDBC resource. It can take one of two forms. It can be an inline SQL query, like this:

"query": "SELECT * FROM satellites"

Or it can refer to an external .sql file that contains the query to run:

"query": "/import/resources/query.sql"

Both absolute and relative paths are supported.
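
This page does not say how the importer tells the two forms apart. The sketch below assumes, purely for illustration, that a value ending in .sql is treated as a file path and anything else as inline SQL. The function name `classifyQuery` and its return shape are hypothetical.

```javascript
// Hypothetical classifier; the real importer may use a different rule.
function classifyQuery(query) {
  const trimmed = query.trim();
  // Assumption: a ".sql" suffix marks an external query file.
  if (/\.sql$/i.test(trimmed)) {
    return { kind: "file", path: trimmed };
  }
  // Everything else is taken to be the SQL text itself.
  return { kind: "inline", sql: trimmed };
}

console.log(classifyQuery("SELECT * FROM satellites").kind);
console.log(classifyQuery("/import/resources/query.sql").kind);
```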


JSON resources

The resource descriptor for a JSON file on the local filesystem is very simple:

#!javascript

{
  "resource": "/imports/resources/olympics.json"
}

Note that the filename must end in .json for the importer to choose the JsonAdapter automatically. If the filename has a different extension, you must define the adapter explicitly:

#!javascript

{
  "resource": "/imports/resources/olympics.data",
  "adapter": "com.graphaware.neo4j.databridge.adapters.json.JsonAdapter"
}
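
The adapter selection rule can be sketched as follows. Only the JsonAdapter class name comes from this page. The function `adapterFor` is hypothetical, the extension check is assumed to be case-sensitive, and returning null for other extensions is a placeholder because the selection rules for the remaining adapters are not described here.

```javascript
const JSON_ADAPTER = "com.graphaware.neo4j.databridge.adapters.json.JsonAdapter";

// Hypothetical selection logic; not the importer's actual implementation.
function adapterFor(descriptor) {
  // An explicit adapter attribute takes precedence over the file extension.
  if (descriptor.adapter) return descriptor.adapter;
  // A .json extension selects the JsonAdapter automatically.
  if (descriptor.resource.endsWith(".json")) return JSON_ADAPTER;
  // Other adapters are outside the scope of this sketch.
  return null;
}

console.log(adapterFor({ resource: "/imports/resources/olympics.json" }));
```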
