ScienceMag sometimes does not work

Issue #1827 resolved
Robert Jäschke created an issue

We received an error report from a user: scraping the URL http://www.sciencemag.org/content/302/5651/1704.full failed with the error "Could not scrape the URL: Download link is not available".

When I tested the URL, it worked the first time, but on later attempts I got the same error.

Can you please find out what is happening and then repair the scraper? Thanks!

Comments (14)

  1. Robert Jäschke reporter

    Why is it not possible to post from the URL with ".full" at the end? Would it be possible to support these URLs? Which changes would be necessary in the scraper?

  2. Former user Account Deleted

    @jaeschke ScienceMagScraper uses the generic CitationManagerScraper to scrape from Science Magazine. CitationManagerScraper extracts a link and constructs a new URL by appending "&type=bibtex" to it. It does not have any problem if a URL ends with .full, .short, or .abstract. I wrote a simple Java class to test the two URLs:

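    import java.net.MalformedURLException;
    import java.net.URL;
    // ScienceMagScraper, ScrapingContext and ScrapingException come from the BibSonomy scraper module.

    public class SMMain {
        public static void main(String[] args) throws ScrapingException, MalformedURLException {
            ScienceMagScraper sms = new ScienceMagScraper();
            ScrapingContext sc = new ScrapingContext(new URL("http://www.sciencemag.org/content/302/5651/1704.full"));

            // Print the return value of scrape(), then the BibTeX stored in the context.
            System.out.println(sms.scrape(sc));
            System.out.println(sc.getBibtexResult());
        }
    }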
    

    In the end, I got the same BibTeX result. I did not change anything; the scraper works as it is. I also ran the JUnit test, and it reported no errors.
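
    The URL construction described in this comment can be sketched roughly as follows. This is only an illustration, not the actual CitationManagerScraper code: the class name CitationLinkSketch, the method bibtexUrl, the link pattern (an href containing "citmgr?gca=") and the sample HTML are all assumptions made for the example. It works on an HTML string, so it needs no network access.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CitationLinkSketch {
    // Hypothetical pattern: assumes the article page links to its citation
    // manager through an href that contains "citmgr?gca=".
    private static final Pattern CITMGR_LINK =
            Pattern.compile("href=\"([^\"]*citmgr\\?gca=[^\"]*)\"");

    /**
     * Looks for a citation-manager link in the page HTML, resolves it against
     * the page URL and appends "&type=bibtex". Returns an empty Optional when
     * the page contains no such link.
     */
    public static Optional<URL> bibtexUrl(URL pageUrl, String pageHtml)
            throws MalformedURLException {
        Matcher m = CITMGR_LINK.matcher(pageHtml);
        if (!m.find()) {
            return Optional.empty();
        }
        // Resolve a possibly relative link against the article URL.
        URL link = new URL(pageUrl, m.group(1));
        return Optional.of(new URL(link.toExternalForm() + "&type=bibtex"));
    }

    public static void main(String[] args) throws MalformedURLException {
        URL page = new URL("http://www.sciencemag.org/content/302/5651/1704.full");
        // Made-up page fragment, used only to exercise the method.
        String html = "<a href=\"/citmgr?gca=sci;302/5651/1704\">Download citation</a>";

        System.out.println(bibtexUrl(page, html)
                .map(URL::toString)
                .orElse("Download link is not available"));
    }
}
```

    In this sketch the suffix of the article URL (.full, .short, .abstract) plays no role; the result depends only on whether the delivered page contains the link. A page served without that link would yield the empty result, regardless of the suffix.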

  3. Robert Jäschke reporter

    @misgna Please test it again on BibSonomy. You will see that it does not work with the .full URL but works with the .short URL. The reason is that the BibSonomy server runs at the University of Kassel, which has different access rights to ScienceMag. Hence, it is difficult to test this behavior here (use the development system for testing).

    I have implemented a clean fix. Please have a look at it and try to understand what it does.
