scraping citations

Issue #1977 open
Robert Jäschke created an issue

Some scrapers have been extended to also scrape the citations of publications. This step is currently skipped because it takes additional time.

Therefore, the citation scraping process shall be run in a separate thread while the user is editing the post. Upon saving of the post, the citation metadata shall then be stored in the scraper metadata table.

  1. Implement a background thread that stores its result in the corresponding fields of the ScrapingContext (which is kept in the server session while a post is being edited).
  2. Ensure that threads are terminated correctly: either when the post is saved or, if the post is never saved, after a certain timeout.
  3. Augment the code that stores the (XML-serialized) scraper metadata in the database so that it also stores the (escaped and XML-serialized) cited-by/references metadata.
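
Steps 1 and 2 could be sketched roughly as follows. This is only an illustration under assumptions, not existing code: `BackgroundCitationScraper` and `finishOnSave` are hypothetical names, the `Callable` stands in for whatever the citation scraper actually does, and the returned string stands in for the fields of the ScrapingContext. Only standard `java.util.concurrent` classes are used.

```java
import java.util.Optional;
import java.util.concurrent.*;

/** Hypothetical helper: runs citation scraping in the background with a timeout. */
final class BackgroundCitationScraper implements AutoCloseable {

    private final ExecutorService worker =
            Executors.newSingleThreadExecutor(daemon("citation-scraper"));
    private final ScheduledExecutorService reaper =
            Executors.newSingleThreadScheduledExecutor(daemon("citation-reaper"));
    private final Future<String> result;

    /**
     * Starts scraping immediately. If the post is never saved, the task is
     * cancelled once the given timeout has elapsed.
     */
    BackgroundCitationScraper(Callable<String> scrapeCitations, long timeout, TimeUnit unit) {
        final Future<String> future = worker.submit(scrapeCitations);
        this.result = future;
        // Cancelling an already finished task is a harmless no-op.
        reaper.schedule(() -> { future.cancel(true); }, timeout, unit);
    }

    /**
     * To be called when the post is saved. Waits at most maxWait for the
     * citation metadata, then terminates the background threads either way.
     */
    Optional<String> finishOnSave(long maxWait, TimeUnit unit) {
        try {
            return Optional.ofNullable(result.get(maxWait, unit));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return Optional.empty();
        } catch (ExecutionException | TimeoutException | CancellationException e) {
            return Optional.empty(); // scraping failed, was too slow, or was cancelled
        } finally {
            close();
        }
    }

    @Override
    public void close() {
        result.cancel(true);   // interrupts the scraper if it is still running
        worker.shutdownNow();
        reaper.shutdownNow();
    }

    /** Daemon threads, so an abandoned scraper never keeps the JVM alive. */
    private static ThreadFactory daemon(String name) {
        return runnable -> {
            Thread thread = new Thread(runnable, name);
            thread.setDaemon(true);
            return thread;
        };
    }
}
```

In this sketch the object would be created when editing starts and kept next to the ScrapingContext in the session; the save handler calls `finishOnSave` and stores the metadata only if a value is present. Interrupting the worker only helps if the scraper's I/O reacts to interruption, so real scrapers may additionally need connection and read timeouts.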
