SDF Reader

SD-Files

The SD-file (structure-data file) is a widely used format for storing and sharing chemical structures together with their properties. A single file can contain anywhere from thousands to millions of structures.

Features

SdfReader is an efficient random access reader for SD-files. Random access means that the file does not have to be read sequentially from start to end.

  • you can retrieve a specific record by its index, regardless of the current position in the file (getRecord(index))
  • you can move forward (next()) or backward (previous()) one record at a time
  • you can move forward or backward by several records at once (next(numRecords), previous(numRecords))
  • you can fetch a contiguous range of records by calling getRecords(startIndex, endIndex)

Note that previous(numRecords) returns records in reverse order: the record with the highest index comes first in the returned list. The same holds for getRecords(startIndex, endIndex) when startIndex > endIndex.
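The ordering rule can be pictured with a small, self-contained sketch. It does not use SdfReader; the class IndexOrder and the method indexRange are hypothetical names, and treating both bounds as inclusive is an assumption based on the Random Access example in the Usage section (indexes 5 to 9 yield records 6 to 10).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of SdfReader: lists record indexes in the
// order described above, assuming both bounds are inclusive.
public class IndexOrder {

    public static List<Integer> indexRange(int startIndex, int endIndex) {
        List<Integer> indexes = new ArrayList<>();
        // Walk backward when startIndex > endIndex, forward otherwise.
        int step = startIndex <= endIndex ? 1 : -1;
        for (int i = startIndex; i != endIndex + step; i += step) {
            indexes.add(i);
        }
        return indexes;
    }

    public static void main(String[] args) {
        System.out.println(indexRange(5, 9)); // ascending order
        System.out.println(indexRange(9, 5)); // highest index first
    }
}
```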

Chemistry

SdfReader is not chemically intelligent. It returns raw textual data that can then be used by any chemistry toolkit for the desired actions.
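As an illustration of handing the raw text to downstream code, the sketch below reads the atom and bond counts from a molfile string. It assumes a V2000 molfile whose string contains the complete block including its three header lines, so that the fourth line is the counts line with the atom count in columns 1-3 and the bond count in columns 4-6. MolfileCounts and atomAndBondCounts are hypothetical names, not part of SdfReader, and a real chemistry toolkit would validate far more than this.

```java
// Hypothetical helper, not part of SdfReader: extracts the atom and bond
// counts from the counts line of a V2000 molfile.
public class MolfileCounts {

    // Returns {atomCount, bondCount} parsed from line 4 of the molfile.
    public static int[] atomAndBondCounts(String molfile) {
        String[] lines = molfile.split("\\r?\\n", -1);
        if (lines.length < 4 || lines[3].length() < 6) {
            throw new IllegalArgumentException("No V2000 counts line found");
        }
        String counts = lines[3];
        int atoms = Integer.parseInt(counts.substring(0, 3).trim());
        int bonds = Integer.parseInt(counts.substring(3, 6).trim());
        return new int[] {atoms, bonds};
    }

    public static void main(String[] args) {
        // Hand-written sample input (water), used only for demonstration.
        String molfile = String.join("\n",
            "water",
            "  example",
            "",
            "  3  2  0  0  0  0  0  0  0  0999 V2000",
            "    0.0000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0",
            "    0.9572    0.0000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0",
            "   -0.2400    0.9266    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0",
            "  1  2  1  0",
            "  1  3  1  0",
            "M  END");
        int[] counts = atomAndBondCounts(molfile);
        System.out.println("atoms: " + counts[0] + ", bonds: " + counts[1]);
    }
}
```

In practice the input string would come from a call such as record.getMolfile(), provided that method returns the full molfile block as assumed here.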

Usage

Accessing SdfRecord

// Requires: import java.util.Map;
SdfReader sdfReader = new SdfReader(filePath);
SdfRecord record1 = sdfReader.getRecord(1);
// print the molfile name and the raw molfile text
System.out.println(record1.getMolfileName());
System.out.println(record1.getMolfile());
// print every property attached to the record
for (Map.Entry<String, String> entry : record1.getProperties().entrySet()) {
    System.out.println(entry.getKey() + ": " + entry.getValue());
}

Getting records sequentially

SdfReader sdfReader = new SdfReader(filePath);

while (sdfReader.hasNext()){
    SdfRecord record = sdfReader.next();
    // do something
}

Batch wise iteration

int batchSize = 10;
SdfReader sdfReader = new SdfReader(filePath);

while (sdfReader.hasNext()){
    // Reads the complete file batch by batch. The last list can hold
    // fewer than batchSize elements.
    List<SdfRecord> records = sdfReader.next(batchSize);
    // do something
}

Random Access

//get records 6 to 10
int startIndex = 5;
int endIndex = 9;
SdfReader sdfReader = new SdfReader(filePath);
if (sdfReader.hasRecord(startIndex)) {
    List<SdfRecord> records = sdfReader.getRecords(startIndex, endIndex);
    //do something
}
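
One common way to support this kind of index-based access is to scan the data once, remember where each record starts, and jump straight to the stored position later. The sketch below shows the idea on an in-memory string. It illustrates the general technique only and is not a description of how SdfReader is implemented; RecordOffsetIndex and its methods are hypothetical names. It relies solely on the SD-file convention that each record ends with a line containing $$$$.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not SdfReader's implementation: builds an offset
// index over SD-file text so that any record can be sliced out directly.
public class RecordOffsetIndex {

    private static final String DELIMITER = "$$$$";

    private final String text;
    private final List<Integer> starts = new ArrayList<>(); // first char of each record
    private final List<Integer> ends = new ArrayList<>();   // just past each delimiter line

    public RecordOffsetIndex(String sdfText) {
        this.text = sdfText;
        int recordStart = 0;
        int lineStart = 0;
        // Single pass over the lines; text after the last $$$$ is ignored.
        while (lineStart < text.length()) {
            int newline = text.indexOf('\n', lineStart);
            int lineEnd = newline < 0 ? text.length() : newline;
            int nextLine = newline < 0 ? text.length() : newline + 1;
            // trim() also drops a trailing '\r' from Windows line endings
            if (text.substring(lineStart, lineEnd).trim().equals(DELIMITER)) {
                starts.add(recordStart);
                ends.add(nextLine);
                recordStart = nextLine;
            }
            lineStart = nextLine;
        }
    }

    public int size() {
        return starts.size();
    }

    public boolean hasRecord(int index) {
        return index >= 0 && index < starts.size();
    }

    // Zero-based lookup: no rescanning, only a slice at the stored offsets.
    public String getRecord(int index) {
        if (!hasRecord(index)) {
            throw new IndexOutOfBoundsException("No record " + index);
        }
        return text.substring(starts.get(index), ends.get(index));
    }

    public static void main(String[] args) {
        String sdf = "A\nM  END\n$$$$\nB\nM  END\n$$$$\n";
        RecordOffsetIndex index = new RecordOffsetIndex(sdf);
        System.out.println(index.size());
        System.out.print(index.getRecord(1));
    }
}
```

For a file on disk, the same idea would store byte offsets instead of string positions and pass them to something like java.io.RandomAccessFile.seek before reading, so that only the requested record is loaded.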