scraper for DeGruyter is extracting wrong information

Issue #2664 on hold
Robert Jäschke created an issue

The publication on http://www.degruyter.com/view/j/itit.2014.56.issue-5/itit-2014-1048/itit-2014-1048.xml contains the following metadata:

Haustein, S., Larivière, V., Thelwall, M., et al. (2014). Tweets vs. Mendeley readers: How do these two social media metrics differ?. Special Issue: Social Media / Katrin Weller, Markus Strohmaier. it - Information Technology, 56(5), pp. 207-215. Retrieved 9 Aug. 2016, from doi:10.1515/itit-2014-1048

However, the DeGruyterScraper is extracting this BibTeX:

@misc{stefanie2014tweets,
author = {Haustein Stefanie and Larivière Vincent and Thelwall Mike and Amyot Didier and Peters Isabella},
booktitle = {it - Information Technology},
doi = {10.1515/itit-2014-1048},
issn = {21967032},
pages = {207--},
title = {Tweets vs. Mendeley readers: How do these two social media metrics differ?},
url = {http://www.degruyter.com/view/j/itit.2014.56.issue-5/itit-2014-1048/itit-2014-1048.xml},
volume = {56},
year = {2014}
}

The final page is missing, and the author names are not separated correctly: they appear as "Lastname Firstname" but should be separated by a comma ("Lastname, Firstname"). Furthermore, if possible, the BibTeX entry type should be determined correctly (in the above case, `article`).

Comments (9)

  1. Daniel Zoller

    The scraper downloads the EndNote (RIS) export from DeGruyter:

    TY  - GENERIC
    AU  - Haustein Stefanie
    AU  - Larivière Vincent
    AU  - Thelwall Mike
    AU  - Amyot Didier
    AU  - Peters Isabella
    TI  - Tweets vs. Mendeley readers: How do these two social media metrics differ?
    T2  - it - Information Technology
    J2  - itit
    VL  - 56
    M1  - 5
    SP  - 207
    PY  - 2014
    PY  - 2014
    SN  - 21967032
    DO  - 10.1515/itit-2014-1048
    UR  - //www.degruyter.com/view/j/itit.2014.56.issue-5/itit-2014-1048/itit-2014-1048.xml
    Y2  - 2016-09-20T23:53:43.097+02:00
    ER  -

    The author names are incorrect (the ',' between last and first name is missing), the type is always GENERIC, and the final page is also missing :(

  2. Johannes Büttner

    Scraping the data directly from the website seems to be the only way to get the metadata right.

  3. Robert Jäschke reporter

    OK, then I suggest contacting DeGruyter and letting them know that their RIS export is broken.
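Until the export is fixed upstream, part of the damage could in principle be repaired after download. The following is only a minimal Python sketch of that idea, not the scraper's actual code: `parse_ris`, `fix_author`, `bibtex_authors`, and `guess_entry_type` are hypothetical helper names. It assumes that the first token of each `AU` value is the complete last name (which fails for multi-word surnames) and that a GENERIC record carrying both a journal title (`T2`) and a volume (`VL`) is an article. The missing final page cannot be recovered this way, since the export contains no `EP` field.

```python
import re

# One RIS line: two-character tag, spaces, a hyphen, then the value.
RIS_LINE = re.compile(r"^([A-Z][A-Z0-9])\s+-\s?(.*)$")


def parse_ris(text):
    """Collect RIS fields into a dict mapping tag -> list of values."""
    fields = {}
    for raw in text.splitlines():
        match = RIS_LINE.match(raw.strip())
        if match:
            tag, value = match.groups()
            fields.setdefault(tag, []).append(value.strip())
    return fields


def fix_author(name):
    """Turn 'Lastname Firstname' into 'Lastname, Firstname'.

    Assumption: the first token is the whole last name.
    """
    name = name.strip()
    if "," in name or " " not in name:
        return name  # already separated, or a single-token name
    last, first = name.split(None, 1)
    return f"{last}, {first}"


def bibtex_authors(fields):
    """Join all AU values into a BibTeX author string."""
    return " and ".join(fix_author(a) for a in fields.get("AU", []))


def guess_entry_type(fields):
    """Heuristic (an assumption): journal title plus volume means article."""
    if fields.get("TY") == ["GENERIC"] and "T2" in fields and "VL" in fields:
        return "article"
    return "misc"


# Shortened copy of the RIS export quoted in the first comment.
sample = """TY  - GENERIC
AU  - Haustein Stefanie
AU  - Larivière Vincent
AU  - Thelwall Mike
T2  - it - Information Technology
VL  - 56
SP  - 207
ER  -
"""

fields = parse_ris(sample)
print(bibtex_authors(fields))
print(guess_entry_type(fields))
```

A workaround like this only masks the broken export, so reporting the problem to the publisher or scraping the page itself remains the more reliable fix.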
