Crashing when extracting non-English subtitles
OCR on English subtitles works splendidly.
But when I attempt to extract non-English subtitles, Subler crashes without warning. The loading bar is visible for only a second before the program crashes.
I have tried this with:
- several different sources
- several languages
- different Tesseract training files
The documentation is not clear on whether the .traineddata files need to have a specific name, but I tried both full and abbreviated language names.
I'm running Subler v1.4.5 on macOS High Sierra v10.13.1.
I thank you for your amazing program.
Comments (5)
-
reporter -
Just FYI, I do this regularly on media with Danish subtitles (Blu-ray and DVD sourced), which contain the letters æøå. Tesseract catches those just fine, and I never see crashes.
-
Egill, the file that you are supposed to put in '~/Library/Application Support/Subler/tessdata/' has to be downloaded from 'https://github.com/tesseract-ocr/tessdata'. Do not pick a file from the other repositories, for example 'https://github.com/tesseract-ocr/tessdata_fast' or 'https://github.com/tesseract-ocr/tessdata_best'. I hope this helps.
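To double-check what is actually in that folder, a minimal Python sketch like the one below lists the installed `.traineddata` files. The helper names `installed_languages` and `has_language` are made up for illustration. Treating the file stem (`dan` in `dan.traineddata`) as the language code follows Tesseract's usual naming; whether Subler requires exactly those names is an assumption.

```python
from pathlib import Path
from typing import List

# Folder named in the comment above; adjust it if your setup differs.
TESSDATA_DIR = Path.home() / "Library" / "Application Support" / "Subler" / "tessdata"


def installed_languages(tessdata_dir: Path) -> List[str]:
    """Return the sorted stems of all *.traineddata files in tessdata_dir.

    Assumption: the stem (e.g. "dan") is the language code, as in
    Tesseract's own naming scheme.
    """
    if not tessdata_dir.is_dir():
        return []
    return sorted(
        p.stem for p in tessdata_dir.glob("*.traineddata") if p.is_file()
    )


def has_language(tessdata_dir: Path, code: str) -> bool:
    """Check whether a training file for the given code is present."""
    return code in installed_languages(tessdata_dir)


if __name__ == "__main__":
    langs = installed_languages(TESSDATA_DIR)
    if langs:
        print("Found training files for:", ", ".join(langs))
    else:
        print("No .traineddata files found in", TESSDATA_DIR)
```

If the language you want is missing from the list, or the file has a different name than expected, that is the first thing to fix. This does not verify that a file matches the Tesseract version Subler uses.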
-
I experienced the same problem as the original poster. I found that the linked Tesseract files didn't work for me. It seems the Tesseract data files have been reorganized.
However, the Tesseract developers maintain a list of links to the old data files in their wiki: https://github.com/tesseract-ocr/tesseract/wiki/Data-Files#data-files-for-version-302
I strongly suggest updating the link in https://bitbucket.org/galad87/subler/wiki/Subtitles%20Guide to the working one.
BR Christian
-
repo owner - changed status to closed
The link has been updated.