Issue #7 open

Storage for scanned papers with OCR and thumbnails

Andy Mikhailenko created an issue

The user scans a paper document (one or more pages), adds it to the organizer via the CLI or the web interface, and writes a description and tags. The images are automatically run through OCR so that full-text search is possible. Thumbnails are generated so that the user can view multiple pages at once.

  • Multi-page documents
  • OCR on save (cuneiform seems to give the most accurate results compared with gocr and ocrad, especially if the language is defined by the user)
  • Allow the user to edit the OCRed text, but use it mainly for search.
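
The "OCR on save" step could look roughly like the sketch below. It is only an illustration, not the code from the repository: the function names `build_ocr_command` and `ocr_page` are hypothetical, and it assumes the `cuneiform` command-line tool is installed and accepts `-l` (language), `-f` (output format) and `-o` (output file). The language flag is passed only when the user has defined one.

```python
import subprocess
from pathlib import Path
from typing import List, Optional


def build_ocr_command(image: Path, output: Path,
                      language: Optional[str] = None) -> List[str]:
    """Build the cuneiform command line for one page image.

    The -l flag is added only when the user has chosen a language.
    """
    cmd = ["cuneiform", "-f", "text", "-o", str(output)]
    if language:
        cmd += ["-l", language]
    cmd.append(str(image))
    return cmd


def ocr_page(image: Path, language: Optional[str] = None) -> str:
    """Run OCR on one page and return the recognized text.

    Assumes cuneiform is on PATH; raises CalledProcessError on failure.
    """
    output = image.with_suffix(".txt")
    subprocess.run(build_ocr_command(image, output, language), check=True)
    return output.read_text(encoding="utf-8", errors="replace")
```

Keeping command construction separate from execution makes it easy to swap in gocr or ocrad later for comparison, and the returned text can be stored as an editable field that is used mainly for search.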

Comments (4)

  1. Andy Mikhailenko

    Mostly implemented in dd0e3fa389fe (adding a page via CLI; OCR on the fly or by request; manual or automatic summary; details extracted from the image via OCR; web interface with thumbnails for list and detail views).

    To do:

    • complete one-stage import process (scan, parse, save);
    • edit summary via web UI;
    • "papers" as ordered lists of "pages";
    • categorization (at least tags);
    • search (by summary and/or details);
    • link to other OrgTool documents (like projects, events, people, plans, messages, needs).
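
Three of the to-do items above ("papers" as ordered lists of "pages", tags, and search by summary and/or details) can be sketched together as a small data model. This is a hypothetical illustration, not OrgTool's actual schema: the class names `Page` and `Paper`, their fields, and the `search` helper are all assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Page:
    image_path: str
    summary: str = ""   # manual or automatic summary
    details: str = ""   # OCRed text, editable by the user


@dataclass
class Paper:
    title: str
    pages: List[Page] = field(default_factory=list)  # list order = page order
    tags: Set[str] = field(default_factory=set)

    def matches(self, query: str) -> bool:
        """Case-insensitive substring match on any page's summary or details."""
        q = query.lower()
        return any(q in p.summary.lower() or q in p.details.lower()
                   for p in self.pages)


def search(papers: List[Paper], query: str) -> List[Paper]:
    """Return the papers in which at least one page matches the query."""
    return [paper for paper in papers if paper.matches(query)]
```

A plain substring match is the simplest possible search; a real implementation would more likely use a full-text index, but the ordered `pages` list and the `tags` set carry over unchanged.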