title + author extraction from PDFs using Grobid
Issue #2839
new
This is a subtask of issue #2836.
Grobid can extract the title, authors, affiliations, etc. of PDF files. Add this type of extraction to the Grobid class. Ideally, this should be done in the same pass as the extraction of the bibliographic data, but be careful not to break the thread-safety of the class. This may require changing the method signature (to return a pair of Post and List<Post>); alternatively, the metadata could be stored in the first post of the returned list (this must be documented!).
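
A minimal sketch of the "pair" option, to make the proposed signature change concrete. All names here (`ExtractionResult`, `GrobidSketch`, `extract`, and the simplified `Post` stand-in) are hypothetical and are not taken from the existing code base. The actual Grobid call is replaced by placeholder values.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical, simplified stand-in for the project's Post class. */
final class Post {
    private final String title;
    private final String author;

    Post(String title, String author) {
        this.title = title;
        this.author = author;
    }

    String getTitle() { return title; }
    String getAuthor() { return author; }
}

/** Immutable pair: the document's own metadata plus its extracted references. */
final class ExtractionResult {
    private final Post document;
    private final List<Post> references;

    ExtractionResult(Post document, List<Post> references) {
        this.document = document;
        // Defensive copy so later changes to the caller's list do not leak in.
        this.references = Collections.unmodifiableList(new ArrayList<>(references));
    }

    Post getDocument() { return document; }
    List<Post> getReferences() { return references; }
}

/** Hypothetical sketch of the extended extraction method. */
final class GrobidSketch {
    /**
     * Single-pass extraction. All intermediate state lives in local variables
     * and in the returned value, so concurrent calls share no mutable data.
     * Placeholder values are used where the real Grobid call would go.
     */
    ExtractionResult extract(String pdfPath) {
        Post document = new Post("Title of " + pdfPath, "Unknown Author");
        List<Post> references = new ArrayList<>();
        references.add(new Post("Cited work", "Some Author"));
        return new ExtractionResult(document, references);
    }
}
```

Compared with storing the metadata in the first post of the returned list, a dedicated result type keeps the two kinds of data apart in the type system, so callers do not depend on a documented convention about list position. The cost is that existing callers of the method have to be adapted.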