- changed status to resolved
Dublin Core Scraper
Issue #1822
resolved
Some web pages provide metadata using the Dublin Core vocabulary. For example, we cannot scrape
http://www.nature.com/nm/journal/v10/n7s/full/nm1064.html
but the HTML metadata fields provide some information about the title and authors (unfortunately, the data is incomplete; information about the journal is missing).
Please implement a scraper that extracts Dublin Core metadata from web pages.
The scraper should be called pretty late in the chain (probably before the IEScraper), since we should first try to use scrapers for the "better" supported formats.
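To illustrate the kind of extraction requested, here is a minimal sketch in Python using only the standard library. It assumes the metadata appears as `<meta>` tags whose `name` attribute starts with `DC.` or `DCTERMS.` (matched case-insensitively), which is a common convention but not something this issue specifies. The names `DublinCoreMetaParser` and `extract_dublin_core`, as well as the sample HTML, are hypothetical and not part of the existing scraper code.

```python
from html.parser import HTMLParser


class DublinCoreMetaParser(HTMLParser):
    """Collects <meta name="DC.xxx" content="..."> pairs from an HTML page."""

    # Assumption: these are the name prefixes used for Dublin Core fields.
    PREFIXES = ("dc.", "dcterms.")

    def __init__(self):
        super().__init__()
        # Maps a field name without prefix (e.g. "title") to a list of
        # values, since fields such as "creator" may occur several times.
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        name = attr.get("name", "").lower()
        content = attr.get("content", "").strip()
        if not content:
            return
        for prefix in self.PREFIXES:
            if name.startswith(prefix):
                field = name[len(prefix):]
                self.fields.setdefault(field, []).append(content)
                break


def extract_dublin_core(html):
    """Returns a dict of Dublin Core fields found in the given HTML text."""
    parser = DublinCoreMetaParser()
    parser.feed(html)
    return parser.fields


if __name__ == "__main__":
    # Hypothetical sample page, not taken from the URL above.
    sample = (
        '<html><head>'
        '<meta name="DC.title" content="An example title">'
        '<meta name="DC.creator" content="First Author">'
        '<meta name="DC.creator" content="Second Author">'
        '</head><body></body></html>'
    )
    print(extract_dublin_core(sample))
```

Because the result may be incomplete (no journal in the example above), a scraper built this way would plausibly return whatever fields it finds and leave the rest empty, which fits its position late in the chain.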
Comments (1)