title + author extraction from PDFs using Grobid

Issue #2839 new
Robert Jäschke created an issue

This is a subtask of issue #2836.

Grobid can extract the title, authors, affiliations, etc. of PDF files. Add this type of extraction to the Grobid class. Ideally, this should be done in the same pass as the extraction of the bibliographic data, but be careful not to break the thread-safety of the class. This may require changing the method signature to return a pair of Post and List<Post>; alternatively, the metadata could be stored in the first post of the returned list (this must be documented!).
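
A rough sketch of the first option (returning a pair), to make the trade-off concrete. Everything here is an assumption for illustration: `Post` is a minimal stand-in rather than the project's real class, `ExtractionResult` and `GrobidSketch` are hypothetical names, and the body of `extract` is a placeholder instead of an actual Grobid call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the project's Post type; only title and authors are modeled.
final class Post {
    final String title;
    final List<String> authors;

    Post(String title, List<String> authors) {
        this.title = title;
        // Defensive copy so that callers cannot mutate the post afterwards.
        this.authors = Collections.unmodifiableList(new ArrayList<>(authors));
    }
}

// Immutable pair: the PDF's own metadata plus the references found in it.
final class ExtractionResult {
    final Post header;           // title, authors, ... of the PDF itself
    final List<Post> references; // bibliographic entries cited in the PDF

    ExtractionResult(Post header, List<Post> references) {
        this.header = header;
        this.references = Collections.unmodifiableList(new ArrayList<>(references));
    }
}

// Hypothetical extractor. It has no mutable fields, so concurrent calls
// share no state and cannot interfere with each other.
final class GrobidSketch {
    ExtractionResult extract(String pdfPath) {
        // Placeholder for a single-pass Grobid run; all intermediate
        // results live in local variables only.
        Post header = new Post("title of " + pdfPath, Arrays.asList("first author"));
        List<Post> references = new ArrayList<>();
        references.add(new Post("cited work", Arrays.asList("cited author")));
        return new ExtractionResult(header, references);
    }
}
```

A caller would read `result.header` for the document's own metadata and `result.references` for the cited works. Compared with storing the metadata in the first post of the returned list, the pair keeps the two kinds of data apart in the type itself, so no caller has to remember that element 0 is special.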
