Scraping BibTeX from Google Scholar is broken via the Chrome Plugin
When I try to add a publication via the BibTeX snippet provided by Google Scholar using the Chrome Plugin, I get the following error:
Could not scrape the URL https://scholar.google.de/scholar.bib?q=info:FrNsHbA-fzsJ:scholar.google.com/&output=citation&hl=de&ct=citation&cd=0. Message was: java.lang.StringIndexOutOfBoundsException: String index out of range: -1
Comments (18)
-
-
reporter CitiSense: improving geospatial environmental assessment of air quality using a wireless personal exposure monitoring system
The link reported above still works for me, but I have noticed that it does not work when I use my browser's private mode. So it seems to be a permission problem.
-
-
assigned issue to
-
assigned issue to
-
- changed status to open
-
I get this error when I open the link in Chrome, Internet Explorer, or Safari:
Your client does not have permission to get URL /scholar.bib?q=info:FrNsHbA-fzsJ:scholar.google.com/&output=citation&hl=de&ct=citation&cd=0 from this server.
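The permission error above might also explain the StringIndexOutOfBoundsException from the original report: if the scraper looks for the start of a BibTeX entry with indexOf and passes the result straight to substring, a response body without any entry yields the index -1. The following is only a minimal sketch under that assumption; BibtexGuard and extractEntry are hypothetical names, not BibSonomy's actual scraper code.

```java
import java.util.Optional;

final class BibtexGuard {
    // Hypothetical helper: returns the BibTeX entry found in a response body,
    // or Optional.empty() when the body contains none (e.g. a 403 error page).
    static Optional<String> extractEntry(String body) {
        int start = body.indexOf('@');    // BibTeX entries start with '@'
        int end = body.lastIndexOf('}');  // and end with a closing brace
        if (start < 0 || end < start) {
            return Optional.empty();      // nothing that looks like an entry
        }
        return Optional.of(body.substring(start, end + 1));
    }

    public static void main(String[] args) {
        String forbidden = "Your client does not have permission to get URL /scholar.bib from this server.";
        String bibtex = "@inproceedings{key, title={Some title}}\n";

        System.out.println(extractEntry(forbidden).isPresent());
        System.out.println(extractEntry(bibtex).orElse(""));

        // Without the guard, indexOf returns -1 and substring throws.
        try {
            forbidden.substring(forbidden.indexOf('@'));
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("unguarded substring failed");
        }
    }
}
```

With such a guard the caller could report "no BibTeX found at this URL" instead of surfacing the raw exception.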
-
It's not clear which page should be scrapable. One option would be to scrape the first hit on a search page; in this case, that would be the desired article.
-
reporter I was referring to the BibTeX page (reached by clicking on the "Import into BibTeX" button). In the mentioned case: https://scholar.google.de/scholar.bib?q=info:FrNsHbA-fzsJ:scholar.google.com/&output=citation&hl=en&ct=citation&cd=0 I guess it is not accessible without session information. Did Google change something that caused the Plugin to break?
-
When I access your URL, I get an error. When I do the following, BibSonomy can successfully scrape the BibTeX:
- Open this URL (the search from my previous message)
- Click on Cite below the first (and only) search result.
- In the overlay, click on BibTeX.
- The resulting page can be scraped by BibSonomy.
Tested on Firefox with BibSonomy's "postPublication" bookmarklet (not the plugin, but that should also work).
(It's even the GoogleScholarScraper which is extracting the data, though in that case the BibTeXScraper would do the job as well.)
-
reporter Indeed, if I do what you said, it will work.
Regarding my way of doing this (which saves 2 clicks): I know that the URL will give you an error, because it seems to be session-bound for some reason (so the scraper will probably not be able to fetch the URL the normal way). To reproduce, please try changing your Google Scholar settings to show BibTeX import links (Settings -> Show links to import citations into "BibTeX"). Then use the "Import into BibTeX" link, which is shown instead of "Cite", and try to import that BibTeX snippet.
-
I implemented it just now. After 3 attempts, Google blocked me :D I'm getting a 403 error using the provided link.
-
- changed status to resolved
fixes
#2544→ <<cset d4a9bafe2d3f>>
-
- changed status to closed
-
- changed status to open
It does not work; the test returns 403.
Is there any other solution?
-
- changed milestone to 3.6.0
-
To scrape information from this site, I need a parameter csisig, and this parameter is generated by JavaScript on the website. Robert Jäschke and I have decided that we will not scrape from this site when there are multiple articles on one page.
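The decision above could be sketched roughly as follows: count the hits on the page and treat it as scrapable only when there is exactly one. This is an illustrative sketch, not the committed scraper code; SingleResultCheck, isScrapable, and the RESULT_MARKER value are assumptions made for the example, and the real markup of a Google Scholar result may differ.

```java
final class SingleResultCheck {
    // Assumption: every search hit carries this marker in the page's HTML.
    static final String RESULT_MARKER = "class=\"gs_ri\"";

    // Counts non-overlapping occurrences of the marker in the HTML.
    static int countResults(String html) {
        int count = 0;
        int from = 0;
        while ((from = html.indexOf(RESULT_MARKER, from)) >= 0) {
            count++;
            from += RESULT_MARKER.length();
        }
        return count;
    }

    // A page is scraped only if it shows exactly one article.
    static boolean isScrapable(String html) {
        return countResults(html) == 1;
    }

    public static void main(String[] args) {
        String one = "<div class=\"gs_ri\">A</div>";
        String two = one + "<div class=\"gs_ri\">B</div>";
        System.out.println(isScrapable(one));
        System.out.println(isScrapable(two));
    }
}
```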
-
- changed title to Scraping BibTeX from Google Scholar is broken via the Chrome Plugin
-
- changed status to resolved
fixes
#2544 closing #2544 clean the Scraper → <<cset 57cc00520aa0>>
-
- changed status to closed
Could you please report the publication you tried to scrape. I'm getting a 403 error using the provided link.