Scraping abstracts
Issue #6
resolved
Most digital libraries provide abstracts (short summaries) of the articles we extract. For each scraper, please check whether it already extracts the abstract by
- looking at the web page to see whether it contains an abstract, and then
- scraping the page with the bookmarklet and checking whether the abstract was scraped correctly.
Please create a table with the following columns:
- Scraper
- tested URL
- abstract on web page
- abstract scraped
The goal is then to add abstract scraping wherever this is possible.
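The table requested above could be kept as structured records and rendered as plain text. The following is only a minimal sketch: the names `AbstractCheck`, `render_table`, and `needs_work` are hypothetical, and the scraper names and URLs in the usage note are placeholders, not real test results.

```python
from dataclasses import dataclass


@dataclass
class AbstractCheck:
    """One row of the table: the result of checking a single scraper."""
    scraper: str
    tested_url: str
    abstract_on_page: bool   # does the web page show an abstract?
    abstract_scraped: bool   # did the scraper return it correctly?


# Column titles taken from the issue description.
HEADERS = ("Scraper", "tested URL", "abstract on web page", "abstract scraped")


def render_table(checks):
    """Render the checks as a plain-text table with aligned columns."""
    rows = [HEADERS] + [
        (
            c.scraper,
            c.tested_url,
            "yes" if c.abstract_on_page else "no",
            "yes" if c.abstract_scraped else "no",
        )
        for c in checks
    ]
    widths = [max(len(row[i]) for row in rows) for i in range(len(HEADERS))]
    return "\n".join(
        " | ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()
        for row in rows
    )


def needs_work(checks):
    """Scrapers whose page has an abstract that is not scraped yet."""
    return [c.scraper for c in checks if c.abstract_on_page and not c.abstract_scraped]
```

Usage, with placeholder data: `render_table([AbstractCheck("ExampleScraperA", "https://example.org/a", True, False)])` returns the header line plus one row, and `needs_work` lists the scrapers that still need abstract support.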
Comments (3)
- reporter changed status to open
- Account Deleted changed status to resolved
Add another column to this table, "5. scraper type", which identifies how the scraper gets the content.
Then, for scrapers that do not yet extract the abstract but already download the web page, extract the abstract from the web page and add it to the resulting BibTeX.
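As a rough illustration of that last step, the sketch below pulls an abstract out of downloaded HTML and appends it to a BibTeX entry. It is not the project's scraper code. It assumes the page exposes the abstract in a `<meta>` tag; the tag names in `ABSTRACT_META_NAMES` are assumptions that will not hold for every digital library, and `extract_abstract` and `add_abstract` are hypothetical helper names.

```python
import re
from html.parser import HTMLParser

# Assumption: meta tag names that some pages use for abstracts, in order of
# preference. Real pages may need site-specific extraction instead.
ABSTRACT_META_NAMES = ("citation_abstract", "dc.description", "description")


class AbstractMetaParser(HTMLParser):
    """Collects the content of <meta> tags whose name may carry an abstract."""

    def __init__(self):
        super().__init__()
        self.found = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        content = attr.get("content")
        if name in ABSTRACT_META_NAMES and content and name not in self.found:
            # Collapse runs of whitespace so the abstract fits on one line.
            self.found[name] = " ".join(content.split())


def extract_abstract(html):
    """Return the abstract found in the page's meta tags, or None."""
    parser = AbstractMetaParser()
    parser.feed(html)
    for name in ABSTRACT_META_NAMES:
        if name in parser.found:
            return parser.found[name]
    return None


def add_abstract(bibtex, abstract):
    """Append an abstract field to a single BibTeX entry, unless one exists.

    Limitation: abstracts containing unbalanced braces are not handled here.
    """
    if abstract is None or re.search(r"\babstract\s*=", bibtex, re.IGNORECASE):
        return bibtex
    body = bibtex.rstrip()
    if not body.endswith("}"):
        raise ValueError("expected a single BibTeX entry ending with '}'")
    body = body[:-1].rstrip()  # drop the entry's closing brace
    if not body.endswith(","):
        body += ","
    return body + "\n  abstract = {" + abstract + "}\n}"
```

A scraper that already has the page source could call `add_abstract(entry, extract_abstract(page_html))`; when no abstract is found, or the entry already has one, the entry is returned unchanged.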