When I upload a PDF, the bibliographic metadata shall be extracted so that I can post it.
Issue #2836
open
What we can extract
There are two types of bibliographic metadata:
- The metadata of the article in the PDF itself (e.g., title and authors; it is probably difficult to extract much more).
- The bibliographic metadata of the referenced articles.
How we extract it
We use Grobid. The work happens on the grobid-pdf branch, where the Grobid class implements the getBibTeX method.
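To make the target of the extraction concrete, here is a minimal sketch of how extracted header metadata (title, authors) could be rendered as a BibTeX entry. It is not the getBibTeX implementation from the grobid-pdf branch and does not call Grobid. The class PdfHeaderMetadata, its fields, the toBibTeX method, and the choice of @misc as entry type are all assumptions made for illustration.

```java
import java.util.List;
import java.util.StringJoiner;

/** Hypothetical holder for the header metadata extracted from an uploaded PDF. */
final class PdfHeaderMetadata {
    final String title;
    final List<String> authors;

    PdfHeaderMetadata(String title, List<String> authors) {
        this.title = title;
        this.authors = authors;
    }

    /**
     * Renders the metadata as a BibTeX entry with the given citation key.
     * The entry type @misc is an assumption; the real type is unknown at this point.
     */
    String toBibTeX(String key) {
        StringJoiner fields = new StringJoiner(",\n");
        fields.add("  title = {" + title + "}");
        if (!authors.isEmpty()) {
            // BibTeX separates multiple authors with the keyword "and".
            fields.add("  author = {" + String.join(" and ", authors) + "}");
        }
        return "@misc{" + key + ",\n" + fields + "\n}";
    }
}
```

A post form could then be pre-filled from the resulting string, so the user only has to correct what the extraction got wrong.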
Subtasks
All tasks are collected in the milestone PDFextraction.
Comments (2)
- changed status to open
I propose the following implementation order:
After the first step, extraction should basically work and could be released as a beta. Each subsequent step would improve the handling.