SDF Reader


SD-files (structure-data files, SDF) are a widely used format for storing and sharing chemical structures and their associated properties. A single file can contain anywhere from thousands to millions of structures.


SdfReader is an efficient random access reader for SD-files. Random access means that the file does not have to be read sequentially from start to end.

  • specific records can be retrieved by their index regardless of current position in the file (getRecord(index))
  • you can move forward (next()) or backward (previous())
  • you can move forward or backward multiple records (next(numRecords), previous(numRecords))
  • you can access several records sequentially by calling getRecords(startIndex, endIndex)

Note that previous(numRecords) returns records in reverse order, meaning the record with the highest index comes first in the returned list. The same is true for getRecords(startIndex, endIndex) if startIndex > endIndex.
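The ordering rule above can be illustrated with a small sketch. RangeOrder and indexOrder are hypothetical names that are not part of SdfReader, and the sketch assumes both bounds are inclusive, as the Random Access example further down (indices 5 to 9 for records 6 to 10) suggests.

```java
import java.util.ArrayList;
import java.util.List;

class RangeOrder {
    // Hypothetical helper, not part of SdfReader: lists record indices in the
    // order a getRecords(startIndex, endIndex)-style call is described to
    // return them, assuming both bounds are inclusive.
    static List<Integer> indexOrder(int startIndex, int endIndex) {
        List<Integer> order = new ArrayList<>();
        // walk upwards for a normal range, downwards when startIndex > endIndex
        int step = startIndex <= endIndex ? 1 : -1;
        for (int i = startIndex; i != endIndex + step; i += step) {
            order.add(i);
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(indexOrder(5, 9)); // prints [5, 6, 7, 8, 9]
        System.out.println(indexOrder(9, 5)); // prints [9, 8, 7, 6, 5]
    }
}
```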


SdfReader is not chemically intelligent. It returns the raw text of each record, which can then be passed to any chemistry toolkit for further processing.
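To make "raw textual data" more concrete, the following is a minimal sketch of how the data items in the raw text of one SD record could be turned into a name/value map. RawRecordProperties and parse are hypothetical names, and this is not SdfReader's own parser. It only assumes the usual SD-file layout: a structure block ending in "M  END", followed by data items whose header line starts with ">" and carries the field name in angle brackets, each terminated by a blank line.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class RawRecordProperties {
    // Hypothetical helper, not SdfReader's parser: extracts the data items
    // that follow the "M  END" line of a single raw SD record.
    static Map<String, String> parse(String rawRecord) {
        Map<String, String> props = new LinkedHashMap<>();
        boolean inData = false;   // true once the structure block has ended
        String name = null;       // data item currently being collected
        StringBuilder value = new StringBuilder();
        for (String line : rawRecord.split("\\R")) {
            if (!inData) {
                inData = line.startsWith("M  END");
                continue;
            }
            boolean header = line.startsWith(">");
            boolean terminator = header || line.trim().isEmpty() || line.startsWith("$$$$");
            if (name != null && terminator) {
                // a blank line, the next header, or $$$$ closes the current item
                props.put(name, value.toString());
                name = null;
            }
            if (header) {
                // the field name sits between '<' and '>' on the header line
                int open = line.indexOf('<');
                int close = line.indexOf('>', open + 1);
                name = (open >= 0 && close > open) ? line.substring(open + 1, close) : null;
                value.setLength(0);
            } else if (name != null) {
                // multi-line values are joined with a newline
                if (value.length() > 0) value.append('\n');
                value.append(line);
            }
        }
        if (name != null) props.put(name, value.toString());
        return props;
    }
}
```

The structure block itself is left untouched here; interpreting it is the job of a chemistry toolkit.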


Accessing SdfRecord

SdfReader sdfReader = new SdfReader(filePath);
SdfRecord record1 = sdfReader.getRecord(1);
// print every property stored with the record
for (Map.Entry<String, String> entry : record1.getProperties().entrySet()) {
    System.out.println(entry.getKey() + ": " + entry.getValue());
}

Getting records sequentially

SdfReader sdfReader = new SdfReader(filePath);

while (sdfReader.hasNext()) {
    // next() moves one record forward
    SdfRecord record = sdfReader.next();
    // do something
}

Batch-wise iteration

int batchSize = 10;
SdfReader sdfReader = new SdfReader(filePath);

while (sdfReader.hasNext()) {
    // the loop reads the complete file; the last list can hold
    // fewer than batchSize elements
    List<SdfRecord> records = sdfReader.next(batchSize);
    // do something
}

Random Access

// get records 6 to 10
int startIndex = 5;
int endIndex = 9;
SdfReader sdfReader = new SdfReader(filePath);
List<SdfRecord> records = sdfReader.getRecords(startIndex, endIndex);
// do something
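
For readers curious how random access over an SD-file can work in principle, here is a minimal sketch. It is not SdfReader's actual implementation, and SdfOffsetIndex, recordOffsets and readRecord are hypothetical names. The idea: scan the file once, remember the byte offset after every "$$$$" delimiter line, and later seek straight to the requested record with java.io.RandomAccessFile. Content after the last "$$$$" line is ignored in this sketch.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class SdfOffsetIndex {
    // Returns the start offset of every record plus one final entry marking
    // the end of the last record, so record i spans offsets[i]..offsets[i + 1].
    static List<Long> recordOffsets(Path file) {
        List<Long> offsets = new ArrayList<>();
        offsets.add(0L);
        try (InputStream in = new BufferedInputStream(Files.newInputStream(file))) {
            long pos = 0;                          // bytes consumed so far
            StringBuilder line = new StringBuilder();
            int b;
            while ((b = in.read()) != -1) {
                pos++;
                if (b == '\n') {
                    // trim() also drops the '\r' of Windows line endings
                    if (line.toString().trim().equals("$$$$")) {
                        offsets.add(pos);          // next record starts here
                    }
                    line.setLength(0);
                } else {
                    line.append((char) b);
                }
            }
            // delimiter on the last line without a trailing newline
            if (line.toString().trim().equals("$$$$")) {
                offsets.add(pos);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return offsets;
    }

    // Reads the raw text of one record (zero-based index) without touching
    // the records before it.
    static String readRecord(Path file, List<Long> offsets, int index) {
        long start = offsets.get(index);
        byte[] buffer = new byte[(int) (offsets.get(index + 1) - start)];
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            raf.seek(start);
            raf.readFully(buffer);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(buffer, StandardCharsets.UTF_8);
    }
}
```

With such an index the cost of fetching a record depends on the size of that record rather than on its position in the file, which is what makes jumping to an arbitrary index cheap after the initial scan.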