MoleculeDatabaseFramework / Chemistry

The chemical structure search functionality is provided by the Bingo PostgreSQL Cartridge. Bingo supports the SMILES and molfile formats.

ChemicalStructure Entity

The ChemicalStructure entity class provided by the framework has limited chemical intelligence; it is mainly a data container. The user of the framework must ensure that the structureData and structureKey fields are in sync. To create a new ChemicalStructure instance, always use the provided ChemicalStructureFactory.

ChemicalStructure structure =

This creates a new instance with the structureData and structureKey fields in sync. structureData must be either SMILES or in molfile format (always the same format for a given application or, more precisely, Spring context). The structureKey is the standard InChIKey for the given structureData. It is generated by the Indigo Toolkit (which uses the InChI library).

Chemical Formats

The Bingo Cartridge only supports chemical structure search on SMILES or molfile data, and all chemical structures must be in the same format. The format you choose must be set in the file:

chemistry.format = smi

or

chemistry.format = mol

If you choose SMILES (smi), the Indigo Toolkit converts between SMILES and molfile when importing or exporting SD files.

ChemistryFormatConverter offers methods to convert between SMILES, molfile and InChI, and to generate an InChIKey.
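Because only two values are valid for chemistry.format, an application may want to check the setting early. The following is a minimal sketch of such a check using only the JDK. The class ChemistryFormatConfig and its readFormat method are hypothetical and not part of MDF; how the framework itself reads the setting is not shown here.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Properties;
import java.util.Set;

// Hypothetical helper, not part of MDF: reads and validates chemistry.format.
final class ChemistryFormatConfig {

    // The two values named in the text above.
    private static final Set<String> SUPPORTED =
            new HashSet<>(Arrays.asList("smi", "mol"));

    private ChemistryFormatConfig() {
    }

    // Parses properties text and returns the configured format. Fails fast
    // when the key is missing or holds an unsupported value.
    static String readFormat(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String format = props.getProperty("chemistry.format");
        if (format == null || !SUPPORTED.contains(format.trim())) {
            throw new IllegalArgumentException(
                    "chemistry.format must be 'smi' or 'mol' but was: " + format);
        }
        return format.trim();
    }

    public static void main(String[] args) {
        System.out.println(readFormat("chemistry.format = smi"));
    }
}
```

Rejecting any other value at startup keeps a typo from surfacing later as a failed structure search.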

Salt Handling

As of version 1.1.0, MDF can deal with salts. By default, salt handling is disabled and the behavior is the same as in previous versions. To enable salt handling, add

chemistry.handleSalts = true

to the file.

Note that activating salt handling has a performance impact, especially when importing large SD files: every molecule must be checked to determine whether it is a salt, and every salt found must then be processed.

Salt handling works as follows:

  • Components of salts are stored as uncharged ChemicalStructures (Na, Cl)
  • The new entity SaltComposition contains the original charged structure (Na+, Cl-)
  • SaltComposition also contains the ratio of the component in the salt

The ratio in NaCl is 1 for both ions. A more complex example would be trimagnesium phosphate (Mg3(PO4)2). Here the ratio would be 3 Mg to 2 phosphates. So importing trimagnesium phosphate will lead to a ChemicalCompound consisting of the 2 SaltCompositions Mg and PO4 with the ratio set to 3 for Mg and 2 for PO4.

On the application side if you want to display the charged structures for salts and the uncharged ones for all other compounds you need to call

composition.getSmiles() or composition.getMolfile()

If you want to display only uncharged structures, call

composition.getChemicalStructure().getSmiles() or composition.getChemicalStructure().getMolfile()

Structure Key

The structure key is essential because it is used as the business key within the framework: hashCode() and equals() of ChemicalStructure are based only on the structureKey field. These methods can also matter to your JPA provider; for an example, see the Hibernate guidelines for equals() and hashCode(). The structureKey is the standard InChIKey.
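To illustrate what a business key based on structureKey means in practice, here is a minimal sketch. KeyedStructure is a hypothetical stand-in, not the MDF ChemicalStructure class; it only shows equals() and hashCode() delegating to the key, so two instances with different structureData but the same key count as one element in a hash-based collection.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical stand-in, not the MDF ChemicalStructure class.
final class KeyedStructure {
    private final String structureData;
    private final String structureKey;

    KeyedStructure(String structureData, String structureKey) {
        this.structureData = structureData;
        this.structureKey = Objects.requireNonNull(structureKey, "structureKey");
    }

    String getStructureData() {
        return structureData;
    }

    String getStructureKey() {
        return structureKey;
    }

    // Equality depends only on the business key, never on structureData.
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof KeyedStructure)) {
            return false;
        }
        return structureKey.equals(((KeyedStructure) other).structureKey);
    }

    @Override
    public int hashCode() {
        return structureKey.hashCode();
    }

    public static void main(String[] args) {
        // Two SMILES spellings of ethanol that share one standard InChIKey.
        String ethanolKey = "LFQSCWFLJHTTHZ-UHFFFAOYSA-N";
        Set<KeyedStructure> structures = new HashSet<>();
        structures.add(new KeyedStructure("CCO", ethanolKey));
        structures.add(new KeyedStructure("OCC", ethanolKey));
        System.out.println(structures.size()); // prints 1
    }
}
```

This is also why structureData and structureKey must stay in sync: a stale key would make two different structures compare as equal, or the same structure compare as different.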

Need for Chemistry Toolkit on application-side

Because a ChemicalCompound can be a mixture by design of the domain (database), the framework must take this into account when importing chemical structure data and exporting ChemicalCompounds. When importing, disconnected structures in the same molfile have to be separated, and a ChemicalStructure entity must be created for each one. When exporting, potentially multiple structures must be put into one molfile (remaining disconnected), and the coordinates must be adjusted so that no overlap occurs. While the Bingo Cartridge can do basic import and export, it does not offer this advanced functionality. Hence the need for the Indigo Toolkit.
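To make the separation step concrete, the sketch below splits a SMILES string into its disconnected components at the "." separator and joins them again. This is a deliberately naive illustration and not how MDF or Indigo does it: the class SmilesComponentSplitter is hypothetical, it ignores ring-closure digits that can bond atoms across a dot, and it does not address the molfile coordinate adjustment described above, for which a real toolkit is needed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, naive helper; a real application would use a chemistry toolkit.
final class SmilesComponentSplitter {

    private SmilesComponentSplitter() {
    }

    // Splits a SMILES string at the '.' that separates disconnected parts.
    static List<String> split(String smiles) {
        List<String> components = new ArrayList<>();
        for (String part : smiles.split("\\.")) {
            if (!part.isEmpty()) {
                components.add(part);
            }
        }
        return components;
    }

    // Joins components into one SMILES string; the parts remain disconnected.
    static String join(List<String> components) {
        return String.join(".", components);
    }

    public static void main(String[] args) {
        List<String> parts = split("[Na+].[Cl-]");
        System.out.println(parts);       // prints [[Na+], [Cl-]]
        System.out.println(join(parts)); // prints [Na+].[Cl-]
    }
}
```

On import, each returned component would become its own ChemicalStructure; on export, the reverse join keeps the components of a mixture in a single record.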