Content mining at Oxford
https://vimeo.com/78353557 5-minute video on the way forward
http://www.slideshare.net/petermurrayrust/the-content-mine-presented-at-uksg (especially slides 16++)
There is a chemical example already in place. Just click the button to analyze. Use the subsequent markup buttons to highlight the phrases. Mouseover the chemical names to reveal the structures.
Your own example
Find your own example from any chemical or biochemical recipe. It won't hit all of the fields, so let us know what you would like to see added.
The Atmospheric Chemistry example has a geotagger, so if you have text containing place names, try it out (no guarantees).
Choose a PDF and see how well Tabula extracts tables. We'll be joined by one of the authors (Manuel Aristaran) from Argentina.
We have bundled AMI2 for you on https://bitbucket.org/petermr/xhtml2stm-dev/downloads. For instructions see the AMI Tutorial.
It works for HTML input. Try the demos and then try your own examples (copy HTML files with species or sequences into your exampleData/html/ directory).
It also works with PDF and SVG, but that support is not in the distro. So if you have PDFs you want analyzed, let PMR have them to run for you.
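For the HTML case, copying your own files into place can be scripted. The following is only a minimal sketch: the function name copy_html_examples is hypothetical, and it assumes you run it from the directory that contains exampleData/ (the AMI Tutorial remains the authority on where files should go).

```python
import shutil
from pathlib import Path


def copy_html_examples(source_dir, target_dir="exampleData/html"):
    """Copy every .html/.htm file from source_dir into target_dir.

    Hypothetical helper: target_dir defaults to the exampleData/html/
    directory mentioned above, relative to the current working directory.
    Returns the list of destination paths, sorted by file name.
    """
    source = Path(source_dir)
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)  # create the directory if missing

    copied = []
    for path in sorted(source.iterdir()):
        if path.is_file() and path.suffix.lower() in {".html", ".htm"}:
            destination = target / path.name
            shutil.copy2(path, destination)  # copies content and timestamps
            copied.append(destination)
    return copied
```

Calling copy_html_examples("my_papers") would copy the HTML files found directly in a (hypothetical) my_papers/ folder; subfolders are not searched.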
We can store results in the OKFN/CKAN Datahub.
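As a rough illustration of what storing a result could look like, the sketch below builds (but does not send) a request to the CKAN action API, which accepts JSON POSTs at /api/3/action/<action> with an API key in the Authorization header. The base URL, dataset name, and key are placeholders, the helper name build_ckan_request is hypothetical, and a given CKAN instance may require extra fields (for example an owning organization).

```python
import json
import urllib.request


def build_ckan_request(base_url, action, payload, api_key):
    """Build a POST request for a CKAN action API call without sending it.

    Hypothetical helper: base_url and api_key must come from the CKAN
    instance you actually use.
    """
    url = f"{base_url.rstrip('/')}/api/3/action/{action}"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json", "Authorization": api_key},
        method="POST",
    )


# Placeholder values for illustration only.
request = build_ckan_request(
    "https://ckan.example.org",
    "package_create",
    {"name": "content-mining-demo", "notes": "Results from the workshop demos"},
    api_key="YOUR-API-KEY",
)
# To actually send it: urllib.request.urlopen(request)
```

Once a dataset exists, individual result files are normally attached to it with the resource_create action in the same way.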