Scraping abstracts

Issue #6 resolved
Robert Jäschke created an issue

Most digital libraries contain abstracts (short summaries) of the articles that we extract. Please check for each scraper whether it already extracts the abstract by

  1. looking at the web page to see whether it contains an abstract, and then
  2. scraping the page with the bookmarklet and checking whether the abstract was scraped correctly.

Please create a table with the following columns:

  1. Scraper
  2. tested URL
  3. abstract on web page
  4. abstract scraped

The goal is then to add abstract scraping wherever this is possible.
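
An example of the table layout (placeholder values only; the actual scraper names and URLs are to be filled in during testing):

  Scraper          | tested URL             | abstract on web page | abstract scraped
  <scraper class>  | <tested article URL>   | yes / no             | yes / no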

Comments (3)

  1. Robert Jäschke (reporter)

    Add another column to this table, "5. scraper type", which identifies how the scraper obtains the content (a short sketch follows the list):

    • by building a new URL (SimpleGenericUrlScraper)
    • by extracting the BibTeX/EndNote/etc. URL from the content of the given URL
    • otherwise (how?)
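
    A minimal sketch of the first two scraper types, using hypothetical class names and URL patterns rather than the actual BibSonomy scraper API:

        import java.util.Optional;
        import java.util.regex.Matcher;
        import java.util.regex.Pattern;

        /** Sketch of the first two scraper types (hypothetical names and URL layout). */
        public class ScraperTypeSketch {

            /** Type 1: build the export URL directly from the article URL (cf. SimpleGenericUrlScraper). */
            static String buildExportUrl(String articleUrl) {
                // Hypothetical library layout: the BibTeX export lives under /export/bibtex/...
                return articleUrl.replaceFirst("/article/", "/export/bibtex/article/");
            }

            /** Type 2: find the BibTeX/EndNote export link inside the already downloaded page content. */
            static Optional<String> extractExportUrl(String pageHtml) {
                Matcher m = Pattern
                        .compile("href=\"([^\"]*(?:bibtex|endnote)[^\"]*)\"", Pattern.CASE_INSENSITIVE)
                        .matcher(pageHtml);
                return m.find() ? Optional.of(m.group(1)) : Optional.empty();
            }

            public static void main(String[] args) {
                System.out.println(buildExportUrl("https://example.org/article/12345"));
                System.out.println(extractExportUrl("<a href=\"/export/12345?format=bibtex\">BibTeX</a>")
                        .orElse("no export link found"));
            }
        }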

    Then, for scrapers that do not yet get the abstract but already download the web page, extract the abstract from the page and add it to the resulting BibTeX.
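
    A minimal sketch of that step, assuming the page is already available as a string and the scraped entry as BibTeX text (helper names are hypothetical, not existing BibSonomy code):

        import java.util.Optional;
        import java.util.regex.Matcher;
        import java.util.regex.Pattern;

        /** Sketch: pull the abstract out of a downloaded page and append it to the scraped BibTeX entry. */
        public class AbstractExtractionSketch {

            // Many digital libraries expose the abstract in a meta tag such as
            // <meta name="citation_abstract" content="..."> or <meta name="description" content="...">.
            // Simplified: assumes the name attribute comes before the content attribute.
            private static final Pattern META_ABSTRACT = Pattern.compile(
                    "<meta\\s+name=\"(?:citation_abstract|dc\\.description|description)\"\\s+content=\"([^\"]+)\"",
                    Pattern.CASE_INSENSITIVE);

            static Optional<String> extractAbstract(String pageHtml) {
                Matcher m = META_ABSTRACT.matcher(pageHtml);
                return m.find() ? Optional.of(m.group(1)) : Optional.empty();
            }

            /** Insert an abstract field before the closing brace of the BibTeX entry, if an abstract was found. */
            static String addAbstractToBibtex(String bibtex, String pageHtml) {
                return extractAbstract(pageHtml)
                        .map(abs -> bibtex.replaceFirst("\\}\\s*$",
                                ",\n  abstract = {" + Matcher.quoteReplacement(abs) + "}\n}"))
                        .orElse(bibtex);
            }

            public static void main(String[] args) {
                String html = "<meta name=\"citation_abstract\" content=\"A short summary of the article.\">";
                String bibtex = "@article{key,\n  title = {Some Title}\n}";
                System.out.println(addAbstractToBibtex(bibtex, html));
            }
        }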
