No text extraction for PowerPoint slides when 'squishable' is not set.

Former user Account Deleted

Comment 1. originally posted by @ysavourel on 2013-03-22T12:00:53.000Z:

I can't reproduce the problem.
I've tried with a PPTX file with 'normal slides' and they get extracted.
The "slide+xml" type for those files seems to be handled in line 640.
If you could post an example file where the problem occurs it could help.
Thanks,
-yves

2013-03-22T12:00:53+00:00

attached resume-cover-letter-preparation-2011.pptx

Comment 2. originally posted by aurelien.tomass... on 2013-03-22T13:05:02.000Z:

I tried with this PPTX, which I found on the internet.
The OpenXMLFilter opens the zip file and correctly reads [Content_Types].xml and /ppt/slideMasters/slideMaster1.xml, but all the files in /ppt/slides are treated as "Document part" and are not read.
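
For anyone who wants to check which parts a package declares as slides, here is a minimal sketch, independent of Okapi, that reads [Content_Types].xml and lists the parts whose content type is the PresentationML slide type. The class and method names (SlidePartLister, slideParts) are made up for illustration, and it assumes the slides are declared through Override entries, as is usual for PPTX files.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of Okapi.
public class SlidePartLister {
    // Content type that PPTX packages use for ordinary slides.
    static final String SLIDE_TYPE =
        "application/vnd.openxmlformats-officedocument.presentationml.slide+xml";

    /** Returns the PartName of every Override entry declared with the slide content type. */
    static List<String> slideParts(InputStream contentTypesXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(contentTypesXml);
            NodeList overrides = doc.getElementsByTagName("Override");
            List<String> parts = new ArrayList<>();
            for (int i = 0; i < overrides.getLength(); i++) {
                Element override = (Element) overrides.item(i);
                if (SLIDE_TYPE.equals(override.getAttribute("ContentType"))) {
                    parts.add(override.getAttribute("PartName"));
                }
            }
            return parts;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Cannot read content types", e);
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to a PPTX file.
        try (ZipFile zip = new ZipFile(args[0])) {
            ZipEntry entry = zip.getEntry("[Content_Types].xml");
            if (entry == null) {
                System.err.println("No [Content_Types].xml found in " + args[0]);
                return;
            }
            try (InputStream in = zip.getInputStream(entry)) {
                slideParts(in).forEach(System.out::println);
            }
        }
    }
}
```

Run with the path to a PPTX as the only argument, it prints one slide part name per line.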

PS: I tried with okapi-lib v0.19.
http://code.google.com/p/okapi/source/browse/okapi/filters/openxml/src/main/java/net/sf/okapi/filters/openxml/OpenXMLFilter.java?name=m19
In this version of the file, I can't see the handler for the "slide+xml" type.

2013-03-22T13:05:02+00:00

Comment 3. originally posted by @ysavourel on 2013-03-22T13:38:26.000Z:

> PS: I tried with okapi-lib v0.19.
> In this version of the file, I can't see the handler for the "slide+xml" type.

Line 606 in that file.

Thanks for the example file. I'll try it.

2013-03-22T13:38:26+00:00

attached test319_1.out.pptx

Comment 4. originally posted by @ysavourel on 2013-03-22T13:45:45.000Z:

Maybe we fixed something since M19, but M21-snapshot seems to be extracting that file properly (see pseudo-translated output).
I haven't tried with M20 (which is the current release).

2013-03-22T13:45:45+00:00

Comment 5. originally posted by aurelien.tomass... on 2013-03-22T13:48:16.000Z:

Thanks!
In fact, if I set the boolean "bSquishable" to true, line 606 is reached, but if I set it to false, line 606 is never reached. So I don't know whether this is considered a bug for this version...
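
To make the described behaviour concrete, here is a hypothetical sketch of the control flow. It is not the actual OpenXMLFilter code, and every name in it is invented for illustration. It only assumes what is reported in this thread: the slide branch is reached when the flag is true and skipped when it is false, in which case the part is treated as a plain "Document part".

```java
// Hypothetical illustration only; none of these names come from Okapi.
public class SquishGateSketch {
    static final String SLIDE_TYPE =
        "application/vnd.openxmlformats-officedocument.presentationml.slide+xml";

    /** Assumed shape of the problem: the slide branch also depends on the flag. */
    static String classifyGated(String contentType, boolean squishable) {
        if (squishable && SLIDE_TYPE.equals(contentType)) {
            return "slide";
        }
        // With squishable == false, slides fall through to here and are not extracted.
        return "document part";
    }

    /** One possible alternative: the part type is decided without looking at the flag. */
    static String classifyIndependent(String contentType, boolean squishable) {
        if (SLIDE_TYPE.equals(contentType)) {
            return "slide";
        }
        return "document part";
    }

    public static void main(String[] args) {
        System.out.println(classifyGated(SLIDE_TYPE, true));
        System.out.println(classifyGated(SLIDE_TYPE, false));
        System.out.println(classifyIndependent(SLIDE_TYPE, false));
    }
}
```

In the gated variant a slide is only recognised when the flag is true, which would match the symptom of slides being skipped when 'squishable' is not set.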
Thanks for the help

2013-03-22T13:48:16+00:00

Comment 6. originally posted by @ysavourel on 2013-03-22T13:58:27.000Z:

Mmm... I'm not sure why an option about optimizing the text runs is tested there.
That variable also seems to be set to true everywhere.
It looks like there is something fishy about this part of the code.
I'll keep the issue open for now.
Thanks for the input/feedback.
-ys

2013-03-22T13:58:27+00:00

changed status to resolved

Comment 7. originally posted by @ysavourel on 2013-07-24T19:24:23.000Z:

I changed the bSquishable test.

2013-07-24T19:24:23+00:00
