Merged in bugfix/simple-Scraper-fixes (pull request #43)
73647aa·Author: Til Barthel·Closed by: Mario Holtmüller·2022-01-24
Description
removing html and fixing citekey from scrapingResult
fixed scrapeReferences
added new host and fixed scraper
modified generic CitMgrScraper to fit more scrapers; child classes were changed as well. reworked reference scraping for jap and aanda (regex didn't stop after the references). moved tests and modified them
fixed Iucr Scraper
fixed faseb Scraper and tests
fixed rsoc Scraper and tests
fixed rspb Scraper and tests
fixed sage Scraper and tests
fixed sciencemag Scraper and test, added new host
fixed mdpi Scraper and tests
fixed nber Scraper and tests
removed opac Scraper and tests; added hebis Scraper and tests
added opac Scraper again, but removed it from KDEUCScraper; fixed springer Scraper
springer Scraper really fixed now
WebUtils: changed getContentAsString to also accept POST methods; added a getHeadersMethod
fixed DOIUtils getDOIFromUrl to decode the URL first and changed the DOI regex; fixed SpringerScraper to use DOINegScraper (WorldCatScraper returns incorrect BibTeX); fixed SpringerLinkScraper; added and fixed tests for both scrapers
fixed apa Scraper and tests
fixed nasaads Scraper and tests
fixed osti Scraper and tests
fixed plos tests and cleaned the plos scraper up
fixed ProjectEuclidScraper and tests
added hebis and ahajournals scraper to KDEUrlCompositeScraper
CitMgrScraper clean up
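The jap/aanda note above (a regex that did not stop after the references) can be illustrated with a small sketch. It assumes the cause was a greedy quantifier, which is only one common reason for this kind of bug; the markup, the class name `ReferenceSectionSketch`, and both patterns are hypothetical and not taken from the actual scrapers.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration, not the code from this pull request.
final class ReferenceSectionSketch {

    // Greedy: ".*" runs to the LAST "</ol>" on the page, so content after the
    // reference list is swallowed as well.
    static final Pattern GREEDY =
            Pattern.compile("<ol class=\"references\">(.*)</ol>", Pattern.DOTALL);

    // Reluctant: ".*?" stops at the FIRST "</ol>" after the opening tag.
    static final Pattern RELUCTANT =
            Pattern.compile("<ol class=\"references\">(.*?)</ol>", Pattern.DOTALL);

    // Returns the captured reference block, or null if the pattern does not match.
    static String extract(Pattern pattern, String html) {
        Matcher matcher = pattern.matcher(html);
        return matcher.find() ? matcher.group(1).trim() : null;
    }

    public static void main(String[] args) {
        // Assumed sample markup: a reference list followed by an unrelated list.
        String html = "<ol class=\"references\"><li>Ref A</li><li>Ref B</li></ol>"
                + "<p>Footer</p><ol><li>Related</li></ol>";
        System.out.println("greedy:    " + extract(GREEDY, html));
        System.out.println("reluctant: " + extract(RELUCTANT, html));
    }
}
```

A reluctant quantifier only helps when the reference block contains no nested list of the same tag; for such pages an HTML parser is usually the safer choice.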
Many scraper and test data fixes. Some scraper fixes were more complicated (apa, nasaads, ProjectEuclidScraper), but most were straightforward. I had to adjust DOIUtils and WebUtils, but I tried to keep side effects to a minimum. I also moved the test data for each fixed scraper into its own directory.
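To show why decoding the URL first matters for the DOIUtils change, here is a minimal sketch. It is not the actual `getDOIFromUrl` implementation: the class name `DoiUrlSketch`, the method `getDoiFromUrl`, and the DOI pattern are assumptions chosen for illustration, and the sample URLs are made up.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration, not the DOIUtils code from this pull request.
final class DoiUrlSketch {

    // Assumed pattern: "10.", a 4-9 digit registrant code, "/", then a suffix
    // that ends at whitespace, quotes, angle brackets, '?', '#' or '&'.
    private static final Pattern DOI =
            Pattern.compile("10\\.\\d{4,9}/[^\\s\"'<>?#&]+");

    // Decodes percent-escapes (e.g. "%2F" becomes "/") before searching, so an
    // encoded DOI in the path is still found. Returns null if nothing matches.
    static String getDoiFromUrl(String url) {
        if (url == null) {
            return null;
        }
        String decoded;
        try {
            // URLDecoder.decode(String, Charset) requires Java 10 or later.
            // Note that it also turns '+' into a space.
            decoded = URLDecoder.decode(url, StandardCharsets.UTF_8);
        } catch (IllegalArgumentException e) {
            decoded = url; // malformed escape sequence: fall back to the raw URL
        }
        Matcher matcher = DOI.matcher(decoded);
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        // Made-up example URLs; the first one carries a percent-encoded slash.
        System.out.println(getDoiFromUrl("https://link.springer.com/article/10.1007%2Fs00000-000-0000-0"));
        System.out.println(getDoiFromUrl("https://example.org/doi/10.1000/abc123?journalCode=x"));
    }
}
```

Without the decoding step, the first URL contains no literal "/" after the registrant code, so a pattern like the one above would not match it.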