TableAnnotator

This is the home of the PDF TableAnnotator project.

TableAnnotator is a tool for annotating tables in native PDF documents for benchmark creation. It can also be used to evaluate table detection and table understanding algorithms on a given benchmark.

The software is implemented in Java, with the GUI built on Eclipse RCP (SWT). PDF processing uses Apache PDFBox, which is published under the Apache License, version 2.0. We also use the Pdf-renderer project, which is published under the LGPL 2.1 license.

TableAnnotator is open source under the LGPL 2.1 license. It is distributed as is, without any warranties.

Features:

  • Built-in PDF viewer
  • Multiple document granularities (Instruction, Word, Text Line)
  • Zoom
  • Annotations can be saved to XML (benchmark creation)
  • Annotation XML and/or benchmark XML can be loaded separately
  • Reporting feature calculates precision/recall between annotations and ground truth
  • Adjustment of ground truth in case of unwanted offset (X, Y, scale)
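
To make the last two features concrete, here is a minimal sketch in Java of how precision/recall and an offset adjustment could be computed. It is not TableAnnotator's actual code: the class and method names are hypothetical, it assumes annotations and ground truth are compared as sets of element IDs (for example word IDs that belong to a table), and it assumes the adjustment scales a bounding box first and then shifts it by the X/Y offset. The tool itself may match and transform differently.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Illustrative sketch only; all names are hypothetical, not TableAnnotator's API. */
final class TableEvalSketch {

    /** Counts element IDs present in both sets (the true positives). */
    static int overlap(Set<String> predicted, Set<String> truth) {
        Set<String> common = new HashSet<>(predicted);
        common.retainAll(truth);
        return common.size();
    }

    /** Fraction of predicted elements that are in the ground truth (0.0 if nothing was predicted, by convention). */
    static double precision(Set<String> predicted, Set<String> truth) {
        return predicted.isEmpty() ? 0.0 : (double) overlap(predicted, truth) / predicted.size();
    }

    /** Fraction of ground-truth elements that were predicted (0.0 if the ground truth is empty, by convention). */
    static double recall(Set<String> predicted, Set<String> truth) {
        return truth.isEmpty() ? 0.0 : (double) overlap(predicted, truth) / truth.size();
    }

    /**
     * Adjusts a box given as {x, y, width, height}: scales all four values,
     * then shifts x and y by the offsets (assumed order: scale, then translate).
     */
    static double[] adjust(double[] box, double dx, double dy, double scale) {
        return new double[] {
            box[0] * scale + dx,
            box[1] * scale + dy,
            box[2] * scale,
            box[3] * scale
        };
    }

    public static void main(String[] args) {
        Set<String> truth = new HashSet<>(Arrays.asList("w1", "w2", "w3", "w4"));
        Set<String> predicted = new HashSet<>(Arrays.asList("w1", "w2", "w5"));
        System.out.println("precision = " + precision(predicted, truth));
        System.out.println("recall    = " + recall(predicted, truth));
        System.out.println(Arrays.toString(adjust(new double[] {10, 20, 30, 40}, 5, -5, 2.0)));
    }
}
```

In this example two of the three predicted elements appear in the ground truth, so precision is 2/3 and recall is 2/4. Comparing sets of IDs keeps the sketch simple; an evaluation based on bounding-box overlap would need a different matching step.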

Features Planned:

  • PDF overlay on instruction view

Screenshot:

screenshot
