When scraping from D-Lib Magazine, the scraper should decode HTML entities. E.g.,
M&ouml;nnich, Michael should not appear in the author field, but instead Mönnich, Michael.
Please implement the decoding using
StringEscapeUtils.unescapeHtml(). To see how other scrapers do it, open the call hierarchy for that method.
Also add a JUnit test for the URL http://www.dlib.org/dlib/may08/monnich/05monnich.html.
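
For illustration, here is a minimal sketch of the expected behaviour, written against the plain JDK so it runs without extra dependencies. The class EntityDecodingSketch, its decode() method, and the small entity table are hypothetical stand-ins, not the scraper's code; the actual fix should call StringEscapeUtils.unescapeHtml() as requested above, which covers the full entity set.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EntityDecodingSketch {

    // Hypothetical, deliberately tiny entity table. This is only a stand-in:
    // the real scraper is expected to use StringEscapeUtils.unescapeHtml().
    private static final Map<String, String> NAMED = Map.of(
            "ouml", "\u00f6", "auml", "\u00e4", "uuml", "\u00fc", "szlig", "\u00df",
            "amp", "&", "lt", "<", "gt", ">", "quot", "\"");

    // Matches decimal (&#246;), hexadecimal (&#xF6;) and named (&ouml;) entities.
    private static final Pattern ENTITY =
            Pattern.compile("&(#\\d{1,7}|#[xX][0-9a-fA-F]{1,6}|[A-Za-z][A-Za-z0-9]*);");

    /** Decodes the entities this sketch knows; unknown ones are left untouched. */
    static String decode(String raw) {
        if (raw == null) {
            return null;
        }
        Matcher m = ENTITY.matcher(raw);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String body = m.group(1);
            String replacement;
            if (body.startsWith("#")) {
                boolean hex = body.length() > 1 && (body.charAt(1) == 'x' || body.charAt(1) == 'X');
                int codePoint = hex
                        ? Integer.parseInt(body.substring(2), 16)
                        : Integer.parseInt(body.substring(1));
                // Keep the original text if the number is not a valid code point.
                replacement = Character.isValidCodePoint(codePoint)
                        ? new String(Character.toChars(codePoint))
                        : m.group();
            } else {
                replacement = NAMED.getOrDefault(body, m.group());
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // The author string as it might appear in the scraped page (assumed form).
        String scraped = "M&ouml;nnich, Michael";
        // Compare against the decoded name, written with a Unicode escape.
        System.out.println(decode(scraped).equals("M\u00f6nnich, Michael")); // prints "true"
    }
}
```

In the scraper itself, the decode() call would be replaced by StringEscapeUtils.unescapeHtml() applied to the extracted author string, and the JUnit test for the URL above would assert that the author field contains the decoded name.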