cannot start piler - Segmentation fault

Issue #146 resolved
Andy Vo created an issue

My installation of piler was working for a few hours. I'm not sure what I did, but I can no longer start the piler service; see the output below. Please help.

[root@localhost ~]# service piler start
starting piler . . .
/etc/init.d/piler: line 9: 1662 Segmentation fault (core dumped) /usr/local/sbin/piler -d

Thanks, Andy

Comments (32)

  1. Janos SUTO repo owner

    Please specify your exact OS type, distribution, and MySQL version and type. Also show me the output of piler -V.

  2. Andy Vo reporter

    Here's the requested info.

    OS type - fedora 19
    mysql = 5.5.33 (community)

    piler 0.1.24-master-branch, build 836, Janos SUTO <sj@acts.hu>

    Build Date: Sun Sep 15 14:26:45 EDT 2013
    ldd version: ldd (GNU libc) 2.17
    gcc version: gcc version 4.8.1 20130603 (Red Hat 4.8.1-1) (GCC)
    Configure command: ./configure --localstatedir=/home --with-database=mysql --enable-starttls

  3. Janos SUTO repo owner

    OK, then download the latest master branch, and run the following:

    ./configure --localstatedir=/home --with-database=mariadb --enable-starttls

    It will then complain about a missing mariadb installation and show you a URL. Download MariaDB Client Library for C 1.0.0 Stable. After unpacking it, run bin/mariadb_config to see where you should copy those files.

    Then run ./configure --localstatedir=/home --with-database=mariadb --enable-starttls again. This time it should run without complaining, and it will compile. Then run make install to overwrite the current binaries, and you can start piler.

    Let me know how it goes. By the way, please show me the output of mysqld --version.
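The rebuild steps in this comment can be sketched as a small shell function. This is only an illustration: the names `run` and `rebuild_piler`, the `DRY_RUN` switch, and the explicit `make` step are assumptions added here; the configure flags are the ones quoted in the comment.

```shell
# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Sketch of the rebuild described above; run it from the piler source tree,
# after the MariaDB client library files have been copied into place.
rebuild_piler() {
    # Flags as quoted in the comment.
    run ./configure --localstatedir=/home --with-database=mariadb --enable-starttls || return 1
    # Assumption: a plain 'make' before installing.
    run make || return 1
    # Overwrites the currently installed binaries.
    run make install
}
```

Calling `DRY_RUN=1 rebuild_piler` previews the three commands without running anything.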

  4. Andy Vo reporter

    Here's what I found in the messages log:

    Sep 15 13:42:16 XXXXXXX abrtd: Directory 'ccpp-2013-09-15-13:42:16-3854' creation detected
    Sep 15 13:42:16 XXXXXXX abrtd: Executable '/usr/local/sbin/piler' doesn't belong to any package and ProcessUnpackaged is set to 'no'
    Sep 15 13:42:16 XXXXXXX abrtd: 'post-create' on '/var/tmp/abrt/ccpp-2013-09-15-13:42:16-3854' exited with 1
    Sep 15 13:42:16 XXXXXXX abrtd: Deleting problem directory '/var/tmp/abrt/ccpp-2013-09-15-13:42:16-3854'
    Sep 15 13:43:08 XXXXXXX kernel: [ 285.083551] piler[3874]: segfault at 2000006 ip 00007f5754d69019 sp 00007fff005edcf8 error 4 in libmysqlclient.so.1018.0.0[7f5754d50000+257000]
    Sep 15 13:43:08 XXXXXXX abrt[3877]: Saved core dump of pid 3874 (/usr/local/sbin/piler) to /var/tmp/abrt/ccpp-2013-09-15-13:43:08-3874 (10174464 bytes)
    Sep 15 13:43:08 XXXXXXX abrtd: Directory 'ccpp-2013-09-15-13:43:08-3874' creation detected
    Sep 15 13:43:08 XXXXXXX abrtd: Executable '/usr/local/sbin/piler' doesn't belong to any package and ProcessUnpackaged is set to 'no'
    Sep 15 13:43:08 XXXXXXX abrtd: 'post-create' on '/var/tmp/abrt/ccpp-2013-09-15-13:43:08-3874' exited with 1
    Sep 15 13:43:08 XXXXXXX abrtd: Deleting problem directory '/var/tmp/abrt/ccpp-2013-09-15-13:43:08-3874'
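The kernel line in this log names the shared object in which the fault happened (here the MySQL client library, not the piler binary itself). A minimal sketch for pulling that name out of a log follows; the function name `segfault_lib` is hypothetical, and the pattern assumes the kernel message layout shown above.

```shell
# Read log lines on stdin and print the object named after "error N in"
# on each kernel segfault line.
segfault_lib() {
    sed -n 's/.*segfault at .* error [0-9]* in \([^[]*\)\[.*/\1/p'
}
```

For example, `grep segfault /var/log/messages | segfault_lib | sort | uniq -c` would show how often each library appears in such lines.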

  5. Andy Vo reporter

    I don't have mariadb installed, but I do have community-mysql and the server installed. Again, it was working for a few hours. I did not touch the database or the db configuration. I have checked and re-checked that the mysql server is running properly. What else should I look at?

  6. Andy Vo reporter

    Here is the mysql -v output:

    Your MySQL connection id is 7
    Server version: 5.5.33 MySQL Community Server (GPL)

    Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.

    Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners.

    Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

    mysql>

  7. Andy Vo reporter

    I also keep losing my /var/run/piler directory after rebooting the system. I have to re-create it for searchd to run. But I still have a problem starting up piler.

  8. Janos SUTO repo owner

    OK, can you download the mariadb client library and follow the steps suggested above? If you give me access, I can look around and fix it for you. If you are interested in such remote help, please write to my email address (see piler -V).

  9. Andy Vo reporter

    I think there's something wrong with the fedora 19 distro. I set up piler on another distro and it works great now.

    For the /var/run/piler directory, I had to modify the init script to re-create the folder and set the proper permissions.

    Thanks for your help.
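The init-script change mentioned in this comment could look roughly like the sketch below. On Fedora 19, /var/run lives on a tmpfs, so directories created there do not survive a reboot. The function name `ensure_rundir`, the default mode 755, and the `piler:piler` owner in the usage line are assumptions, not the reporter's actual script.

```shell
# Recreate a runtime directory that disappears on reboot.
# $1 = directory, $2 = owner in user:group form, $3 = mode (default 755)
ensure_rundir() {
    dir=$1
    owner=$2
    mode=${3:-755}
    # Create the directory (and any missing parents) only if it is absent.
    [ -d "$dir" ] || mkdir -p "$dir" || return 1
    # Set ownership and permissions every time, in case they drifted.
    chown "$owner" "$dir" && chmod "$mode" "$dir"
}
```

In an init script this would be called before the daemons are started, for example `ensure_rundir /var/run/piler piler:piler`.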

  10. Andy Vo reporter

    I spoke too soon about my new installation. I encountered the same issue after it had run for a few hours and processed ~3K messages. The piler service cannot start. Below is the trace from /var/log/messages:

    Sep 16 16:27:50 piler piler[2951]: starting piler . . .
    Sep 16 16:27:50 piler kernel: [ 1167.035849] piler[2956]: segfault at 7 ip 00007fb4def72617 sp 00007fff4d353a98 error 4 in libmysqlclient.so.18.0.0[7fb4def58000+257000]
    Sep 16 16:27:50 piler abrt[2959]: Saved core dump of pid 2956 (/usr/local/sbin/piler) to /var/spool/abrt/ccpp-2013-09-16-16:27:50-2956 (10203136 bytes)
    Sep 16 16:27:50 piler abrtd: Directory 'ccpp-2013-09-16-16:27:50-2956' creation detected
    Sep 16 16:27:50 piler piler[2951]: /etc/rc.d/init.d/piler: line 13: 2956 Segmentation fault (core dumped) /usr/local/sbin/piler -d
    Sep 16 16:27:50 piler abrtd: Executable '/usr/local/sbin/piler' doesn't belong to any package
    Sep 16 16:27:50 piler abrtd: 'post-create' on '/var/spool/abrt/ccpp-2013-09-16-16:27:50-2956' exited with 1
    Sep 16 16:27:50 piler abrtd: Corrupted or bad directory /var/spool/abrt/ccpp-2013-09-16-16:27:50-2956, deleting
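In a kernel segfault line, `ip` is the faulting instruction address and the bracketed `[base+size]` is where the library is mapped, so the offset of the fault inside the library is `ip - base`. A small sketch of that arithmetic follows; the function name `fault_offset` is hypothetical.

```shell
# Print the offset of the faulting instruction inside the mapped object.
# $1 = ip value, $2 = mapping base, both as hex strings without a 0x prefix
fault_offset() {
    printf '0x%x\n' "$(( 0x$1 - 0x$2 ))"
}
```

With the values from the log above, `fault_offset 00007fb4def72617 7fb4def58000` prints `0x1a617`. That offset can then be passed to a tool such as `addr2line -f -e <path to the library> 0x1a617`, where the path is whichever library file the log names on your system; without debug symbols it will likely report only `??`.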

  11. Andy Vo reporter

    It's strange: if I reboot the whole system, piler starts properly and receives mail again. But if, after the reboot, I run "service piler start" from the init.d script, piler terminates with the error above.

  12. Andy Vo reporter

    No, I have not. I don't have mariadb installed.

    Can I use community-mysql and the server, with the libs and devel packages, with piler?

  13. Jack Zielke

    I am also getting the segmentation fault. With --with-database=mysql, gdb says:

    Starting program: /usr/local/sbin/piler 
    [Thread debugging using libthread_db enabled]
    [New Thread 0xb7880b70 (LWP 25472)]
    [Thread 0xb7880b70 (LWP 25472) exited]
    
    Program received signal SIGSEGV, Segmentation fault.
    0xb7c1f3b8 in net_field_length () from /usr/lib/libmysqlclient_r.so.16
    

    Using --with-database=mariadb (and installing the client) I get:

    Starting program: /usr/local/sbin/piler 
    [Thread debugging using libthread_db enabled]
    
    Program received signal SIGSEGV, Segmentation fault.
    net_field_length (packet=0x17) at /home/wlad/trunk/libmysql/libmysql.c:455
    455 /home/wlad/trunk/libmysql/libmysql.c: No such file or directory.
        in /home/wlad/trunk/libmysql/libmysql.c
    

    I installed libmysql++3 to see if that would help, and it did not change anything.
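The gdb output above stops at the faulting frame. A full backtrace usually says more about who called `net_field_length`. One way to capture it non-interactively is sketched below; the helper name `gdb_backtrace_cmd` is hypothetical, while `-batch`, `-ex` and `--args` are standard gdb options.

```shell
# Print a gdb command line that runs the binary and dumps a backtrace
# when it stops on the crash.
# $1 = binary to debug (defaults to the path used in this thread)
gdb_backtrace_cmd() {
    bin=${1:-/usr/local/sbin/piler}
    printf 'gdb -batch -ex run -ex bt --args %s\n' "$bin"
}
```

Running the printed command (for example via `sh -c "$(gdb_backtrace_cmd)"`) starts piler under gdb and prints the call stack once the SIGSEGV is received.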

  14. Janos SUTO repo owner

    Jack, does piler start at all, or does it segfault right after starting? Can you show me the output of mysqld --version? Please also show me piler -V and the platform you are using, e.g. centos 6.4 x64...

    Can you show me what files the libmysql++3 package has?

  15. Jack Zielke
    # rc.piler start
    starting piler . . .
    Segmentation fault
    
    mysqld  Ver 5.1.66-0+squeeze1 for debian-linux-gnu on i486 ((Debian))
    
    piler 0.1.25-master-branch, build 844, Janos SUTO <sj@acts.hu>
    
    Debian GNU/Linux 6.0.7 (squeeze) (32 bit)
    

    http://packages.debian.org/squeeze/i386/libmysql++3/filelist says:

    /usr/lib/libmysqlpp.so.3
    /usr/lib/libmysqlpp.so.3.0.9
    /usr/share/doc/libmysql++3/CREDITS.txt
    /usr/share/doc/libmysql++3/HACKERS.txt.gz
    /usr/share/doc/libmysql++3/changelog.Debian.gz
    /usr/share/doc/libmysql++3/changelog.gz
    /usr/share/doc/libmysql++3/copyright
    /usr/share/lintian/overrides/libmysql++3
    
  16. Andy Vo reporter

    Update after more testing: I did two more test installs of piler on two different fedora-19 boxes, with the same releases of all the supporting packages I mentioned above. The 1st box seems to work fine, with all the services starting up automatically. I can manually restart the piler or searchd services and they start up fine.

    The 2nd box also works after booting up. However, I found that every time I manually restart the piler service (with the "service piler restart" command), piler fails with "Segmentation fault". I'm not sure what the difference is between the two installations.

  17. Andy Vo reporter

    The issue with the 2nd fedora-19 box is resolved. I installed piler and ran make postinstall (after removing the piler db), and I can manually restart piler now.

  18. János Csárdi-Braunstein

    Hi!

    I ran into this issue after upgrading.

    mysql  Ver 14.14 Distrib 5.1.70, for pc-linux-gnu (x86_64) using readline 5.1
    
    piler 0.1.24-master-branch, build 836, Janos SUTO <sj@acts.hu>
    
    Build Date: Fri Oct 4 15:36:34 CEST 2013
    ldd version: ldd (GNU libc) 2.15
    gcc version: gcc version 4.5.4 (Gentoo 4.5.4 p1.0, pie-0.4.7) 
    Configure command: ./configure --enable-memcached --localstatedir=/var --with-database=mysql
    

    gcc -v

    Using built-in specs.
    COLLECT_GCC=/usr/x86_64-pc-linux-gnu/gcc-bin/4.5.4/gcc
    COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-pc-linux-gnu/4.5.4/lto-wrapper
    Target: x86_64-pc-linux-gnu
    Configured with: /var/tmp/portage/sys-devel/gcc-4.5.4/work/gcc-4.5.4/configure --prefix=/usr --bindir=/usr/x86_64-pc-linux-gnu/gcc-bin/4.5.4 --includedir=/usr/lib/gcc/x86_64-pc-linux-gnu/4.5.4/include --datadir=/usr/share/gcc-data/x86_64-pc-linux-gnu/4.5.4 --mandir=/usr/share/gcc-data/x86_64-pc-linux-gnu/4.5.4/man --infodir=/usr/share/gcc-data/x86_64-pc-linux-gnu/4.5.4/info --with-gxx-include-dir=/usr/lib/gcc/x86_64-pc-linux-gnu/4.5.4/include/g++-v4 --host=x86_64-pc-linux-gnu --build=x86_64-pc-linux-gnu --disable-altivec --disable-fixed-point --without-ppl --without-cloog --disable-lto --enable-nls --without-included-gettext --with-system-zlib --enable-obsolete --disable-werror --enable-secureplt --enable-multilib --enable-libmudflap --disable-libssp --enable-libgomp --with-python-dir=/share/gcc-data/x86_64-pc-linux-gnu/4.5.4/python --enable-checking=release --disable-libgcj --enable-languages=c,c++,fortran --enable-shared --enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu --enable-targets=all --with-bugurl=http://bugs.gentoo.org/ --with-pkgversion='Gentoo 4.5.4 p1.0, pie-0.4.7'
    Thread model: posix
    gcc version 4.5.4 (Gentoo 4.5.4 p1.0, pie-0.4.7) 
    

    And here is the strace: http://pastebin.com/azrABZi5

    in dmesg:

    piler[16051]: segfault at 118 ip 00007fd421bf952f sp 00007fff6c5bb1c0 error 4 in libmysqlclient_r.so.16.0.0[7fd421bd0000+13a000]
    
  19. Bubreg István

    I had these segfaults too, when the owner of the binaries (after a manual upgrade from master) had become root. After changing the owner to piler:piler, everything's fine.
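A quick way to check for the ownership problem described in this comment is to list the installed binaries that are not owned by the expected user. This is a sketch only: the function name `not_owned_by` is hypothetical, and the `/usr/local/sbin` path and `piler*` pattern in the usage line are taken from elsewhere in this thread rather than from this comment.

```shell
# List files under a directory that match a name pattern and are NOT
# owned by the given user.
# $1 = directory, $2 = name pattern, $3 = user name or numeric uid
not_owned_by() {
    find "$1" -name "$2" ! -user "$3"
}
```

For example, `not_owned_by /usr/local/sbin 'piler*' piler` prints nothing when every matching file already belongs to piler; anything it does print is a candidate for `chown piler:piler`.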

  20. Jack Zielke

    Istvan, which binaries? chown piler:piler /usr/local/sbin/piler* did not fix it for me.

    Vo, you removed the piler database and then it worked with a new one. I wonder if there is something about the upgrade path for the database that is causing problems. I might try that next. What would I need to save before doing that? user, user_settings, ldap, archiving_rule, group, option, retention_rule?

    Jsuto, what system do you develop on? This is a VM for me, so I am willing to run this on whatever OS.

  21. Janos SUTO repo owner

    I develop piler on slackware linux 32-bit, and the demo site runs on the same. Piler also runs on debian 7 without issues.

    However, I've already experimented with creating a deb package for ubuntu (known for segfaulting), and now I'm creating an rpm package with the mariadb libraries for x64. So I try to support as many distributions as possible. By the way, what platform do you have?

  22. Andy Vo reporter

    In reply to Jack's comment ("Vo, you removed the piler database and then it worked with a new one. I wonder if there is something about the upgrade path for the database that is causing problems."):

    Since I re-installed both piler and the db, I needed to remove the old db (from the 1st installation). This was a test box, so I did not need to save any data before removing the db.

  23. Janos SUTO repo owner

    Andy, I've updated the upgrade docs a few days ago to include all required database schema upgrades. So does it work for you now?
