.xlsx attachment issue

Issue #25 resolved
Dimple Mehta created an issue

Indexing and search works fine on .xls file but it fails for .xlsx file.

Comments (12)

  1. Janos SUTO repo owner

    Do you have libzip support compiled in? Show me piler-config.h. You should have

    #define HAVE_ZIP 1
    

    and NOT

    #undef HAVE_ZIP
    
  2. Dimple Mehta reporter

    /* piler-config.h. Generated from piler-config.h.in by configure. */
    /*
     * piler-config.h.in, SJ
     */

    #define CONFDIR "/usr/local/etc"
    #define DATADIR "/var"

    #define KEYFILE CONFDIR "/piler.key"

    #define HAVE_DAEMON 1

    #define HAVE_PDFTOTEXT "/usr/bin/pdftotext"
    #define HAVE_CATDOC "/usr/bin/catdoc"
    #define HAVE_CATPPT "/usr/bin/catppt"
    #define HAVE_XLS2CSV "/usr/bin/xls2csv"
    #define HAVE_UNRTF "/usr/bin/unrtf"
    #define HAVE_ZIP 1
  3. Janos SUTO repo owner

    Ok, it looks good. Now please take a message having an xlsx attachment, and please run:

    ./src/pilertest /path/to/message.eml
    

    and check if it displays the contents of the xlsx file, too.

  4. Dimple Mehta reporter

    It's not showing me the right contents. It produces output like "123 124 125 126 5 2 0 6 7 8 0 9 10 2 0 11 12 8 0 13 14 8 0 15 16 2 0 17 18 1 0 19 20 1 0 21 22 8 4 23 24 2 0 25 26 2 0 27 28 2 0 29 30 3 0 31 32 2 0 33 34 1 0 35 36 3 0 37 38 1 4 39 40 3 0 41 42 1 0 43 44 3 4 45 46 3 0 47 48 2 4 49 50 3 0 51 52 2 4 53 54 2 0 55 56 1 0 57 58 1 0 59 60 3 0 61 62 1 0 63 64 3 4 65 66 3 0 67 68 1 69 70 71 1 0 72 73 1 4 74 75 8 4 76 77 1 0 78 79 2 4 80 81 2 0 82 83 8 0 84 85 1 0 86 87 1 0 88 89 8 69 90 91 3 0 9293 8 0 94 95 1 0 96 97 8 0 98 99 2 0 100 101 3 4 102 103 3 69 104 105 1 0 106 107 3 4 108 109 3 69 110 111 2 69 112 113 1 69 114 115 1 69 116 117 8 0 118 119 8 4 120 121 8 0 122 123 124 125 126 123 124 125 126"

  5. Janos SUTO repo owner

    Please show me the contents of the zip(!) file (an xlsx is basically a zip file), so please run:

    unzip -l /path/to/file.xlsx
    
  6. Dimple Mehta reporter

    Archive:  test.xlsx
      Length      Date    Time    Name
    ---------  ---------- -----   ----
         1556  1980-01-01 00:00   [Content_Types].xml
          588  1980-01-01 00:00   _rels/.rels
          980  1980-01-01 00:00   xl/_rels/workbook.xml.rels
          637  1980-01-01 00:00   xl/workbook.xml
         7079  1980-01-01 00:00   xl/theme/theme1.xml
          322  1980-01-01 00:00   xl/worksheets/_rels/sheet1.xml.rels
        12580  1980-01-01 00:00   xl/worksheets/sheet2.xml
         1082  1980-01-01 00:00   xl/worksheets/sheet3.xml
          689  1980-01-01 00:00   xl/worksheets/sheet1.xml
         3844  1980-01-01 00:00   xl/sharedStrings.xml
         4602  1980-01-01 00:00   xl/styles.xml
          865  1980-01-01 00:00   docProps/app.xml
         7824  1980-01-01 00:00   xl/printerSettings/printerSettings1.bin
          616  1980-01-01 00:00   docProps/core.xml
    ---------                     -------
        43264                     14 files

  7. Janos SUTO repo owner

    Is it possible to send me that xlsx file? I'll treat it confidentially, and delete it after fixing the problem.

    Btw, the extraction utility searches the xl/worksheets/sheet* files, and perhaps it has a bug in stripping the XML markup to get the raw text.

  8. Janos SUTO repo owner

    Please apply the following patch, recompile piler, then replace the binaries, and try again. I was searching in the wrong files...

    --- a/src/extract.c
    +++ b/src/extract.c
    @@ -212,7 +212,7 @@ void extract_attachment_content(struct session_data *sdata, struct _state *state
        }
     
        if(strcmp(type, "xlsx") == 0){
    -      extract_opendocument(sdata, state, filename, "xl/worksheets/sheet");
    +      extract_opendocument(sdata, state, filename, "xl/sharedStrings.xml");
           return;
        }
     
    
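The numeric output in comment 4 is consistent with how xlsx stores text: a worksheet cell of type t="s" holds only an index into the string table in xl/sharedStrings.xml, so stripping the markup from a worksheet file yields indices rather than words. The following is a minimal C sketch of that effect. The helper strip_xml_tags and both XML fragments are made up for illustration; this is not piler's extract.c code, and real files carry namespaces and attributes omitted here.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (an assumption, not piler's code): copy `xml` into
 * `out` (capacity `outlen`), dropping everything between '<' and '>' and
 * leaving a single space where text nodes were separated by tags.
 */
void strip_xml_tags(const char *xml, char *out, size_t outlen)
{
    size_t n = 0;
    int in_tag = 0;

    assert(xml != NULL && out != NULL && outlen > 0);

    /* Each iteration writes at most one byte; keep room for the NUL. */
    for (; *xml != '\0' && n + 1 < outlen; xml++) {
        if (*xml == '<') {
            in_tag = 1;
        } else if (*xml == '>') {
            in_tag = 0;
            /* Separate adjacent text nodes with one space. */
            if (n > 0 && out[n - 1] != ' ')
                out[n++] = ' ';
        } else if (!in_tag) {
            out[n++] = *xml;
        }
    }

    /* Trim a trailing separator, then terminate the string. */
    if (n > 0 && out[n - 1] == ' ')
        n--;
    out[n] = '\0';
}

/* Shortened, made-up fragment shaped like xl/worksheets/sheet1.xml:
 * the <v> values are indices into the shared string table. */
const char SHEET_XML[] =
    "<row><c r=\"A1\" t=\"s\"><v>0</v></c>"
    "<c r=\"B1\" t=\"s\"><v>1</v></c></row>";

/* Shortened, made-up fragment shaped like xl/sharedStrings.xml:
 * this is where the human-readable cell text lives. */
const char SHARED_STRINGS_XML[] =
    "<sst><si><t>Invoice</t></si><si><t>Amount</t></si></sst>";
```

Running the helper over SHEET_XML leaves only the indices 0 and 1, while running it over SHARED_STRINGS_XML leaves the words Invoice and Amount, which is why pointing the extractor at xl/sharedStrings.xml gives searchable text.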