Scraper for CACM

Issue #2533 closed
Robert Jäschke created an issue

Extend the ACMBasicScraper to handle the Communications of the ACM journal.

Example: the page at http://cacm.acm.org/magazines/2015/8/189841-understanding-the-us-domestic-computer-science-phd-pipeline/fulltext contains a link to http://dl.acm.org/citation.cfm?id=2808213.2790854:

<a href="http://dl.acm.org/citation.cfm?id=2808213.2790854&amp;coll=portal&amp;dl=ACM" class="fav_acm_digital" target="_blank" title="View in ACM Digital Library">ACM Digital Library</a>

Extract the id from this link and then continue with the existing workflow.
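
A minimal sketch of the extraction step, assuming the page content is already available as a string. The class name `CacmIdExtractor`, its method names, and the regex are hypothetical and are not part of the existing ACMBasicScraper; they only illustrate how the id could be pulled out of the link shown above.

```java
import java.net.URI;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the existing scraper code.
final class CacmIdExtractor {

    // Assumption: CACM article pages are served from this host.
    private static final String CACM_HOST = "cacm.acm.org";

    // Matches the Digital Library link and captures the id,
    // e.g. "2808213.2790854" (digits, optionally followed by ".digits").
    private static final Pattern DL_LINK = Pattern.compile(
            "dl\\.acm\\.org/citation\\.cfm\\?id=([0-9]+(?:\\.[0-9]+)?)");

    private CacmIdExtractor() {
    }

    // Returns true if the given URL points to the CACM site.
    static boolean isCacmUrl(String url) {
        try {
            return CACM_HOST.equalsIgnoreCase(URI.create(url).getHost());
        } catch (IllegalArgumentException e) {
            return false; // not a syntactically valid URI
        }
    }

    // Returns the id of the first Digital Library link found in the page, if any.
    static Optional<String> extractAcmId(String pageContent) {
        Matcher matcher = DL_LINK.matcher(pageContent);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }
}
```

For the example link, `extractAcmId` yields `2808213.2790854`. Whether the existing workflow expects the full dotted id or only one of its parts is not stated here, so the sketch returns the id unchanged.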

Comments (4)

  1. Robert Jäschke reporter
    • changed status to open

    Should be straightforward: check whether the URL points to cacm.acm.org and, if so, use a regex to extract the ID from the page content. Then proceed with the ID as usual.
