AMI is a software tool to extract and analyze the world's scientific facts in scholarly publications, theses and reports.
AMI will be used to power the Content Mine: see this video. Everyone is welcome, whether to help develop and document the software or to extract and analyze content.
AMI consists of about eight projects, with production and development versions (*-dev). The first (XHTML2STM) is where users should start.
- XHTML2STM. Top-level package for converting HTML to Science Technical Medical (STM). It uses domain-specific visitors (e.g. chemistry, phylogenetics, metabolism).
- SVG2XML. Converts SVG to XML.
- SVG library (based on
- SVGBuilder. Includes tools for building higher-level graphics primitives (squares, circles, arrows, etc.).
- PDF2SVG. Parser/converter from PDF to SVG, including normalization to Unicode where possible.
The actual order of execution is, however, normally:
- PDF2SVG. Uses the SVG library.
- SVGBuilder. Creates higher-level primitives as far as possible.
- SVG2XML. Creates text.
- XHTML2STM. Uses discipline-specific plugins to create science (should be renamed "AMI").
- CRAWLERREPO. Crawler for Open scientific publications, with a repository using CKAN (Datahub.io).
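
The order of execution described above (PDF2SVG, then SVGBuilder, then SVG2XML, then XHTML2STM) can be pictured as a simple chain of stages. The sketch below is illustrative only and is not AMI's actual API: the `Document` class, `make_stage`, `run_pipeline` and the format labels (`"pdf"`, `"svg"`, `"xml"`, `"stm"`) are all hypothetical names chosen for this example, and each stage only records that it ran rather than doing any real conversion.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Document:
    """Hypothetical container handed from one stage to the next."""
    content: str
    fmt: str  # assumed format label, e.g. "pdf", "svg", "xml", "stm"
    history: List[str] = field(default_factory=list)


Stage = Callable[[Document], Document]


def make_stage(name: str, expects: str, produces: str) -> Stage:
    """Build a placeholder stage that checks its input format and
    relabels the document. No real conversion is performed."""
    def stage(doc: Document) -> Document:
        if doc.fmt != expects:
            raise ValueError(f"{name} expects {expects!r}, got {doc.fmt!r}")
        return Document(doc.content, produces, doc.history + [name])
    return stage


# Stage names follow the README; the input/output formats are assumptions.
PIPELINE: List[Stage] = [
    make_stage("PDF2SVG", "pdf", "svg"),     # PDF to SVG
    make_stage("SVGBuilder", "svg", "svg"),  # higher-level primitives
    make_stage("SVG2XML", "svg", "xml"),     # creates text
    make_stage("XHTML2STM", "xml", "stm"),   # discipline-specific plugins
]


def run_pipeline(doc: Document, stages: List[Stage] = PIPELINE) -> Document:
    """Apply each stage in order and return the final document."""
    for stage in stages:
        doc = stage(doc)
    return doc


result = run_pipeline(Document("contents of a paper", "pdf"))
print(result.history)  # ['PDF2SVG', 'SVGBuilder', 'SVG2XML', 'XHTML2STM']
```

Checking the format at each step makes an out-of-order run fail early, for example when a document labelled `"svg"` is passed to the PDF2SVG stage.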