reindex fails with segmentation fault

Issue #1319 resolved
Mirko K. Mucko created an issue

Hi there,

I tried to switch over from sphinx to manticore. It works regardless of whether RT=1 is set, but I don't have access to my old mails. When I try to reindex, I get

piler@piler:~$ reindex -a -c /etc/piler/piler.conf
Segmentation fault

The system log says:

Oct 9 07:17:04 piler mariadbd[708]: 2023-10-09 7:17:04 4556 [Warning] Aborted connection 4556 to db: 'piler' user: 'piler' host: 'localhost' (Got an error reading communication packets)
Oct 9 07:17:04 piler kernel: [ 2253.185362] show_signal_msg: 1 callbacks suppressed
Oct 9 07:17:04 piler kernel: [ 2253.185367] reindex[6101]: segfault at 0 ip 0000000000000000 sp 00007ffc763206a8 error 14 in reindex[55fa732e8000+2000] likely on CPU 0 (core 0, socket 0)
Oct 9 07:17:04 piler kernel: [ 2253.185378] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
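
The kernel line above can be read a little further. On x86 the "error" field is printed in hexadecimal and is a bit mask describing the page fault. The following is a minimal sketch for decoding it; the function name decode_segv_error is made up for illustration and is not part of piler. For the value 14 (0x14) it reports a user-mode instruction fetch from a page that is not present, which together with "ip 0000000000000000" would be consistent with a call through a null function pointer, although the log alone does not prove that.

```shell
#!/bin/sh
# Decode the hexadecimal "error" field of an x86 kernel segfault line.
# Bit meanings: 1 = protection violation, 2 = write, 4 = user mode, 16 = instruction fetch.
# decode_segv_error is a hypothetical helper name used only for this sketch.
decode_segv_error() {
    code=$((0x$1))    # the kernel prints the field without a 0x prefix
    if [ $((code & 1)) -ne 0 ]; then echo "protection violation"; else echo "page not present"; fi
    if [ $((code & 2)) -ne 0 ]; then echo "write access"; else echo "read access"; fi
    if [ $((code & 4)) -ne 0 ]; then echo "user mode"; else echo "kernel mode"; fi
    if [ $((code & 16)) -ne 0 ]; then echo "instruction fetch"; fi
    return 0
}
```

It would be called as: decode_segv_error 14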

It’s running on Debian 12 (bookworm); installed are piler-1.4.4-jammy-553ebb4f and manticore-server-6.2.12-230822-dc5144d35.

diff piler.conf piler.conf.dist lists default_retention_days=3650, the mysql-credentials and the hostid. That's it.

diff manticore.conf manticore.conf.dist lists only the four mysql-related things.

piler runs OK, piler-smtp runs OK, pilersearch runs OK but without old mails. By the way, even switching back to sphinx produces the same error, so my system is now deadlocked ;-(

Any suggestions?

Kind regards

Mirko

Comments (16)

  1. Mirko K. Mucko reporter

    Perhaps this could help?

    piler@piler:~$ ldd /usr/local/bin/reindex
    linux-vdso.so.1 (0x00007ffd9dbee000)
    libpiler.so.0.1.1 => /lib/libpiler.so.0.1.1 (0x00007ff50beeb000)
    libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007ff50becc000)
    libcrypto.so.3 => /lib/x86_64-linux-gnu/libcrypto.so.3 (0x00007ff50ba00000)
    libssl.so.3 => /lib/x86_64-linux-gnu/libssl.so.3 (0x00007ff50b956000)
    libtre.so.5 => /usr/local/lib/libtre.so.5 (0x00007ff50beba000)
    libzip.so.4 => /lib/x86_64-linux-gnu/libzip.so.4 (0x00007ff50be9a000)
    libmariadb.so.3 => /lib/x86_64-linux-gnu/libmariadb.so.3 (0x00007ff50b901000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007ff50b720000)
    libbz2.so.1.0 => /lib/x86_64-linux-gnu/libbz2.so.1.0 (0x00007ff50be87000)
    /lib64/ld-linux-x86-64.so.2 (0x00007ff50bf2b000)
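
In the listing above every library resolves to a path, so a missing shared library is unlikely to be the cause. A small sketch that makes this check explicit by filtering ldd output for unresolved entries; the function name missing_libs is hypothetical.

```shell
#!/bin/sh
# Print the shared libraries that the dynamic linker cannot resolve for a binary.
# ldd marks such entries as "libname => not found".
# missing_libs is a hypothetical helper name used only for this sketch.
missing_libs() {
    ldd "$1" 2>/dev/null | awk '/=> not found/ { print $1 }'
}
```

Called as missing_libs /usr/local/bin/reindex, it prints nothing when all libraries are found.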

  2. Janos SUTO repo owner
    • changed status to open

    Try to recompile piler for your environment. See piler -V for the compile options.

  3. Mirko K. Mucko reporter

    That's what I did before, sorry. I downloaded piler-1.4.4.tar.gz, then built and installed it, with the same result. I also tried 1.4.3 without success. It seems some library is wrong or missing.

  4. Mirko K. Mucko reporter

    Hi Eduardo,

    thanks for the script. There is one point I’d like to ask before I nuke my 5 year archive:

    make postinstall

    This is the postinstall utility for piler
    It should be run only at the first install. DO NOT run on an existing piler installation!

    Continue? [Y/N] [N] n
    Aborted.
    make: *** [Makefile:129: postinstall] Error 1

    What do I have to back up before executing this or, in other words, how do I recover my mail archive?

    Cu

    Mirko

  5. Eduardo

    I never said not to run the postinstall on the new installation. You run the postinstall to get the conf files working, but it must be a new install (a new system without mails). In my case I had an old mailpiler system running on Ubuntu; I did a fresh install on a new Debian system, then exported every mail from the old system and imported it into the new one.

  6. Mirko K. Mucko reporter

    Sounds great, so how do I proceed with the export/import? Dump the MySQL database and tar /var/piler? And one thing I don’t understand: what is the purpose of the pem file in /etc/piler vs. the piler.key?

  7. Eduardo

    Search the documentation for “pilerexport”: export each year, copy the files to the new system, then run “pilerimport”.

  8. Mirko K. Mucko reporter

    Hi Eduardo,

    I did it (except that I use apache instead of nginx), then followed https://www.mailpiler.org/wiki/current:migration-to-new-host (so MariaDB runs OK). I took over my piler.key and left everything else unchanged except for the changes from your post. I started pilersearch, which was also OK. Starting piler itself also runs OK.

    But when I tried to reindex my old mails, it again throws a “Segmentation fault” (regardless of whether it is started as the piler user or as root), and the system log says

    Oct 10 07:00:30 piler mariadbd[29791]: 2023-10-10 7:00:30 84 [Warning] Aborted connection 84 to db: 'piler' user: 'piler' host: 'localhost' (Got an error reading communication packets)
    Oct 10 07:00:30 piler kernel: reindex[30567]: segfault at 0 ip 0000000000000000 sp 00007ffddf1f5518 error 14 in reindex[564459bee000+2000] likely on CPU 1 (core 0, socket 1)
    Oct 10 07:00:30 piler kernel: Code: Unable to access opcode bytes at 0xffffffffffffffd6.

    So I assume something could be wrong with my index files or with my SQL database…?

    Cu

    Mirko

  9. Janos SUTO repo owner

    Besides the above, it’s worth checking whether you can reindex a new email. E.g. grab the latest id from the metadata mysql table (e.g. 123), then run reindex -f 123 -t 123.
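
The two steps suggested here can be combined into one sketch. It assumes the mysql client can connect without a password prompt (for example via ~/.my.cnf), that the database is called piler, and that the config lives at /etc/piler/piler.conf; the function name reindex_latest is hypothetical.

```shell
#!/bin/sh
# Look up the highest id in the metadata table and reindex only that message.
# Assumptions: passwordless mysql client access, database "piler",
# config at /etc/piler/piler.conf. reindex_latest is a hypothetical helper name.
reindex_latest() {
    db=${1:-piler}
    conf=${2:-/etc/piler/piler.conf}
    # -N drops the column header, -B gives plain tab-separated output
    latest=$(mysql -N -B -e 'SELECT MAX(id) FROM metadata' "$db") || return 1
    # MAX() on an empty table yields NULL
    if [ -z "$latest" ] || [ "$latest" = "NULL" ]; then
        return 1
    fi
    reindex -f "$latest" -t "$latest" -p -c "$conf"
}
```

If this single-message run succeeds while the full run crashes, the problem is more likely tied to specific messages than to the installation as a whole.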

  10. Mirko K. Mucko reporter

    So I did:

    SELECT id FROM metadata ORDER BY id DESC LIMIT 0, 1;

    +-------+
    | id    |
    +-------+
    | 86479 |
    +-------+
    1 row in set (0.001 sec)

    back to bash:

    piler@piler:~$ reindex -f 86479 -t 86479 -p -c /etc/piler/piler.conf
    processed: 1 [100%]
    put 1 messages to sph_index table for reindexing

    So I started to process chunks of IDs, and found a lot of issues of this kind:

    piler@piler:~$ reindex -f 86381 -t 86381 -p -c /etc/piler/piler.conf
    Segmentation fault

    Back to mysql:

    MariaDB [piler]> SELECT * FROM metadata where id=86381\G
    *************************** 1. row ***************************
    id: 86381
    from: xxx
    fromdomain: xxx
    subject: xxx
    spam: 0
    arrived: 1696737978
    sent: 1696737969
    retained: 2012097978
    deleted: 0
    size: 5176
    hlen: 3931
    direction: 0
    attachments: 0
    piler_id: 4000000065222ac40e21aef400630b31a757
    message_id: 0102018b0d76c565-fc420531-76d2-4f04-ac72-0ae27ae9e7ed-000000@eu-west-1.amazonses.com
    reference:
    digest: 131cb1b398826379663cdac68ff937b00c9555bdd6b1071bda7200f6469dc619
    bodydigest: c5b54125df32181b9278bef07be95bd671bccdf9528e2d666fa1042b8f285ab7
    vcode: 80d71ce3bba3a04648da58f96d7e6c811809c62e463db216823bf72eb7243250
    1 row in set (0.001 sec)

    and again in bash

    piler@piler:~$ pilerget 4000000065222ac40e21aef400630b31a757

    works like a charm and shows the whole message.
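
The chunk-by-chunk search described in this comment can be sketched as a loop that reindexes one id at a time and reports the ids for which reindex exits abnormally. The function name find_failing_ids is hypothetical and the config path is taken from the commands above. Note that every successful call queues that message for reindexing, as the "put 1 messages to sph_index table" output shows, and that the behaviour for ids that do not exist is not covered by this thread.

```shell
#!/bin/sh
# Reindex each id in [from, to] on its own and print "id status" for every
# run that exits with a non-zero status. A status of 139 means the process
# was killed by SIGSEGV (128 + 11).
# find_failing_ids is a hypothetical helper name used only for this sketch.
find_failing_ids() {
    from=$1
    to=$2
    conf=${3:-/etc/piler/piler.conf}
    id=$from
    while [ "$id" -le "$to" ]; do
        reindex -f "$id" -t "$id" -c "$conf" >/dev/null 2>&1
        status=$?
        if [ "$status" -ne 0 ]; then
            echo "$id $status"
        fi
        id=$((id + 1))
    done
}
```

For example, find_failing_ids 86300 86479 would list the crashing ids in that range, which can then be looked up in the metadata table.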

  11. Janos SUTO repo owner

    It’s odd. Both pilerget and reindex retrieve the email from the archive store the same way.

  12. Mirko K. Mucko reporter

    I’ve got it. Finally. The main issue is ridiculously simple and I didn’t see it. We’re in Germany, so we have German umlauts. The dist setting for CHARSET_TABLE in manticore.conf is english, not "non_cjk". That’s one point. The other thing is that there seems to be a difference between the deb package and the self-compiled piler 1.4.4. Yes, of course, due to different libs, GCC, etc. But what can I say: with the deb package it still says “Segmentation fault”; if I build it from source, it runs!
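
To see which character set a given config currently uses, a case-insensitive search is enough. This is only a sketch: the thread does not show the exact syntax of the setting, so the search matches any line mentioning it, and the default path /etc/piler/manticore.conf as well as the function name show_charset_table are assumptions.

```shell
#!/bin/sh
# Print, with line numbers, every line of a config file that mentions
# charset_table in any letter case.
# The default path is an assumption; show_charset_table is a hypothetical name.
show_charset_table() {
    grep -n -i 'charset_table' "${1:-/etc/piler/manticore.conf}"
}
```

Running it against both manticore.conf and manticore.conf.dist shows whether the local file still carries the dist value.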

    While I’m writing, reindex has finished completely and I’ll start checking mails. Later on, I’ll change to RT=1, but now it’s time for a very large cup of coffee.

    Thank you very much for your support, it was great to understand a bit more of how piler works. Great!

    I will write some final words when my checks have passed.

    Cu

    Mirko

  13. Janos SUTO repo owner

    Nice debugging :-) I’m glad that you made it eventually. The deb package targets Ubuntu 22.04, and indeed there might be some differences.
