Segmentation fault during write test

Issue #21 resolved
Petter Strandh created an issue

Hello again,

I have been chasing a segfault that I get sometimes. It started with a socketcan-to-BLF logger that suffers from segfaults. Either there is a bug in the library, or there is a bug in the code that I have submitted with this ticket.

The segfault is not immediate. Sometimes the program needs to be run a few times before I get the segmentation fault. I have tested this on several platforms. I humbly accept that the fault may be in my code. However, it is a minimal rewrite of your own write example.

The debug backtrace points to UncompressedFile::write, line 150, where the memory address held in s cannot be accessed.

I have been looking at the code. Something is probably causing a thread race condition, but I have not been able to find it (or is it my bad code?).

Best regards

Petter

Comments (13)

  1. Tobias Lorenz repo owner
    • changed status to open

    Hi Petter, I'll check it. I will first compile and run the program to try to reproduce it here. Bye Tobias

  2. Tobias Lorenz repo owner

    Hi Petter,

    I can reproduce the issue here as well. I am still puzzled, since all relevant methods use lock_guards or condition_variable wait calls to ensure thread safety.

    Bye Tobias

  3. Tobias Lorenz repo owner

    Hi Petter,

    I made some improvements to the code (changed the LogContainer handling to get rid of std::shared_ptr<>), and I’m pretty confident that I also found and fixed the memory access violation. At least I haven’t been able to trigger it again so far.

    The problem was indeed in line 150:

    std::copy(s, s + pcount, logContainer->uncompressedFile.begin() + offset);

    pcount is a size, and this results in a copy of one character more than allowed. Correct is:

    std::copy(s, s + pcount - 1, logContainer->uncompressedFile.begin() + offset);

    Can you check whether this also solves the issue for you?

    I have the fix in branch https://bitbucket.org/tobylorenz/vector_blf/branch/fix-issue-21.

    Bye Tobias
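
For reference when evaluating the `pcount - 1` change: `std::copy` takes a half-open range `[first, last)`, so `std::copy(s, s + pcount, dst)` copies exactly `pcount` elements and does not touch the element at `s + pcount`. The sketch below illustrates this with a hypothetical helper, `copyChunk`, which is not part of vector_blf; it only mirrors the shape of the call quoted above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from vector_blf): copies exactly `count` bytes
// from `s` into `dst`, starting at index `offset`, and returns how many
// elements std::copy actually wrote.
std::size_t copyChunk(const char* s, std::size_t count,
                      std::vector<unsigned char>& dst, std::size_t offset) {
    // The caller must guarantee that the destination has enough room.
    assert(offset + count <= dst.size());
    auto first = dst.begin() + static_cast<std::ptrdiff_t>(offset);
    // [s, s + count) is half-open: s + count itself is never dereferenced.
    auto last = std::copy(s, s + count, first);
    return static_cast<std::size_t>(last - first);
}
```

Calling `copyChunk(src, 4, dst, 2)` on an 8-byte zeroed destination fills indices 2 to 5 and leaves index 6 untouched, so with this range form a trailing `- 1` would leave the last byte of each chunk unwritten.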

  4. Petter Strandh reporter

    Hello,

    I tested it. Unfortunately, I still get the same error with my socketcan logger. In addition, the parser is now unable to read any frame data from the file, although I do see that there is data in the header and that the file size is increasing.

    The core file is 18 MB, so not that great to paste here. I don’t know what further information would help.

    I am fairly confident that I am testing with the correct branch. I deleted all files and rebuilt everything. I got suspicious because I get the same error as before, with the added perk that the parser can’t collect any data.

    git branch

    * fix-issue-21

    Best regards

    Petter

    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
    Core was generated by `./socketcanany_save save.blf'.
    Program terminated with signal SIGSEGV, Segmentation fault.
    #0  0x00007fc7f7cea720 in std::__copy_move<false, false, std::random_access_iterator_tag>::__copy_m<char const*, unsigned char*> (
        __first=0x56239a201ea8 <error: Cannot access memory at address 0x56239a201ea8>, __last=0x56239a221ea7 "", __result=0x7fc7e8020d10 "") at /usr/include/c++/7/bits/stl_algobase.h:324
    324           *__result = *__first;
    [Current thread is 1 (Thread 0x7fc7f6b19700 (LWP 6400))]
    (gdb) bt full 
    #0  0x00007fc7f7cea720 in std::__copy_move<false, false, std::random_access_iterator_tag>::__copy_m<char const*, unsigned char*> (
        __first=0x56239a201ea8 <error: Cannot access memory at address 0x56239a201ea8>, __last=0x56239a221ea7 "", __result=0x7fc7e8020d10 "") at /usr/include/c++/7/bits/stl_algobase.h:324
            __n = 131071
    #1  0x00007fc7f7ce9e01 in std::__copy_move_a<false, char const*, unsigned char*> (__first=0x56239a201ea8 <error: Cannot access memory at address 0x56239a201ea8>, 
        __last=0x56239a221ea7 "", __result=0x7fc7e8020d10 "") at /usr/include/c++/7/bits/stl_algobase.h:386
            __simple = false
    #2  0x00007fc7f7ce9420 in std::__copy_move_a2<false, char const*, __gnu_cxx::__normal_iterator<unsigned char*, std::vector<unsigned char, std::allocator<unsigned char> > > > (
        __first=0x56239a201ea8 <error: Cannot access memory at address 0x56239a201ea8>, __last=0x56239a221ea7 "", __result=0 '\000') at /usr/include/c++/7/bits/stl_algobase.h:422
    No locals.
    #3  0x00007fc7f7ce8e13 in std::copy<char const*, __gnu_cxx::__normal_iterator<unsigned char*, std::vector<unsigned char, std::allocator<unsigned char> > > > (
        __first=0x56239a201ea8 <error: Cannot access memory at address 0x56239a201ea8>, __last=0x56239a221ea7 "", __result=0 '\000') at /usr/include/c++/7/bits/stl_algobase.h:456
    No locals.
    #4  0x00007fc7f7ce7b72 in Vector::BLF::UncompressedFile::write (this=0x7ffdb47517c8, s=0x56239a201ea8 <error: Cannot access memory at address 0x56239a201ea8>, n=131076)
        at /home/petter/vector_blf/src/Vector/BLF/UncompressedFile.cpp:164
            solc = {_M_off = 131072, _M_state = {__count = 0, __value = {__wch = 0, __wchb = "\000\000\000"}}}
            offset = 0
            logContainer = @0x7fc7e8040d50: {<Vector::BLF::ObjectHeaderBase> = {_vptr.ObjectHeaderBase = 0x7fc7f7f019a8 <vtable for Vector::BLF::LogContainer+16>, signature = 1245859660, 
                headerSize = 0, headerVersion = 1, objectSize = 0, objectType = Vector::BLF::ObjectType::LOG_CONTAINER}, compressionMethod = 0, reservedLogContainer1 = 0, 
              reservedLogContainer2 = 0, uncompressedFileSize = 131072, reservedLogContainer3 = 0, compressedFile = std::vector of length 0, capacity 0, 
              uncompressedFile = std::vector of length 131072, capacity 131072 = {0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 
                0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 
                0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 
                0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 
                0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 
                0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 
                0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 
                0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 
                0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 
                0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 
                0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 
                0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 0 '\000', 
                0 '\000', 0 '\000', 0 '\000'...}, compressedFileSize = 0}
            pcount = 131072
            lock = {_M_device = 0x7ffdb47518a8, _M_owns = true}
    #5  0x00007fc7f7ce3739 in Vector::BLF::ObjectHeader::write (this=0x56239a221e90, os=...) at /home/petter/vector_blf/src/Vector/BLF/ObjectHeader.cpp:42
    No locals.
    #6  0x00007fc7f7cc3744 in Vector::BLF::CanMessage::write (this=0x56239a221e90, os=...) at /home/petter/vector_blf/src/Vector/BLF/CanMessage.cpp:42
    No locals.
    #7  0x00007fc7f7cca4e0 in Vector::BLF::File::readWriteQueue2UncompressedFile (this=0x7ffdb4751610) at /home/petter/vector_blf/src/Vector/BLF/File.cpp:731
            ohb = 0x56239a221e90
    #8  0x00007fc7f7ccab02 in Vector::BLF::File::uncompressedFileWriteThread (file=0x7ffdb4751610) at /home/petter/vector_blf/src/Vector/BLF/File.cpp:823
    No locals.
    #9  0x00007fc7f7ccbc83 in std::__invoke_impl<void, void (*)(Vector::BLF::File*), Vector::BLF::File*> (
        __f=@0x56239a221730: 0x7fc7f7ccaac4 <Vector::BLF::File::uncompressedFileWriteThread(Vector::BLF::File*)>) at /usr/include/c++/7/bits/invoke.h:60
    No locals.
    #10 0x00007fc7f7ccb819 in std::__invoke<void (*)(Vector::BLF::File*), Vector::BLF::File*> (
        __fn=@0x56239a221730: 0x7fc7f7ccaac4 <Vector::BLF::File::uncompressedFileWriteThread(Vector::BLF::File*)>) at /usr/include/c++/7/bits/invoke.h:95
    No locals.
    #11 0x00007fc7f7cccc79 in std::thread::_Invoker<std::tuple<void (*)(Vector::BLF::File*), Vector::BLF::File*> >::_M_invoke<0ul, 1ul> (this=0x56239a221728) at /usr/include/c++/7/thread:234
    No locals.
    #12 0x00007fc7f7cccc1a in std::thread::_Invoker<std::tuple<void (*)(Vector::BLF::File*), Vector::BLF::File*> >::operator() (this=0x56239a221728) at /usr/include/c++/7/thread:243
    No locals.
    #13 0x00007fc7f7cccbea in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (*)(Vector::BLF::File*), Vector::BLF::File*> > >::_M_run (this=0x56239a221720)
        at /usr/include/c++/7/thread:186
    No locals.
    #14 0x00007fc7f79ba6df in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
    No symbol table info available.
    #15 0x00007fc7f74cd6db in start_thread (arg=0x7fc7f6b19700) at pthread_create.c:463
            pd = 0x7fc7f6b19700
            now = <optimized out>
            unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140496814053120, -2118110803864800367, 140496814051264, 0, 94710909769504, 140727631024992, 2095607663621092241, 2095611900325573521}, 
                  mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
            not_first_call = <optimized out>
    #16 0x00007fc7f71f688f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
    No locals.
    (gdb) 
    

  5. Petter Strandh reporter

    Hello again,

    I managed to break it with my write-experiments example as well and, as I wrote, the parser does not see anything.

    End of file.
    uncompressedFileSize: 655664
    objectCount: 0

    Now good night

    Petter

  6. Tobias Lorenz repo owner

    Hi,

    Yes, I see. Some tests fail with the new version, and in my new IDE I didn’t see this directly. So disregard the commits. I’ll reduce the change to the issue in line 150 and see if this works better.

    Bye Tobias

  7. Tobias Lorenz repo owner

    Hi Petter,

    I created another set of commits in the fix-issue-21 branch.

    It fixes the following: if UncompressedFile::write gets very large chunks of data, it’s not sufficient to create just one new log container; this has to be repeated until enough space is available for the write.

    With this change, the issue hasn’t shown up here yet. I also made sure that all unit tests pass.

    Give it a try. 🙂

    Bye Tobias
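
The idea in this comment, repeating container creation until the whole chunk fits, can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in the fix-issue-21 branch: `Container`, `writeChunked`, and the flat position counter `pos` are hypothetical names, and locking is left out entirely.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a log container: a fixed-size byte buffer.
using Container = std::vector<unsigned char>;

// Writes `n` bytes from `s` at absolute position `pos`, appending as many
// zero-filled containers of `containerSize` bytes as the write requires.
// Assumes containerSize > 0. Returns the position after the write.
std::size_t writeChunked(std::vector<Container>& containers,
                         std::size_t containerSize, std::size_t pos,
                         const char* s, std::size_t n) {
    while (n > 0) {
        const std::size_t index = pos / containerSize;
        const std::size_t offset = pos % containerSize;
        // One new container is not always enough: keep adding containers
        // until the one that holds `pos` exists.
        while (containers.size() <= index) {
            containers.emplace_back(containerSize);
        }
        // Copy only what fits into the current container.
        const std::size_t pcount = std::min(n, containerSize - offset);
        std::copy(s, s + pcount,
                  containers[index].begin() + static_cast<std::ptrdiff_t>(offset));
        s += pcount;
        n -= pcount;
        pos += pcount;
    }
    return pos;
}
```

With a container size of 4, writing 10 bytes at position 0 produces three containers, the last of which is only half filled. Because `offset` is always smaller than `containerSize` here, `pcount` cannot become zero or negative inside the loop.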

  8. Tobias Lorenz repo owner

    I’m still unhappy with the patch. Performance dropped, and maybe that is the only reason the issue no longer shows. I will try to implement a unit test in the master branch to see if I can trigger the issue reliably.

  9. Tobias Lorenz repo owner

    I added several assertions to the code and found that pcount can become negative in UncompressedFile::write. So I added a check that prevents writing in this case. The fix is in the fix-issue-21 branch.
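
A check of the kind described here might look like the sketch below. How pcount is actually computed in the library is not shown in this thread, so the formula is an assumption: the hypothetical function `writableCount` treats pcount as the smaller of the bytes requested and the bytes left in the current container, using signed arithmetic so that a write position outside the container becomes visible as an out-of-range offset instead of wrapping around.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper (not from vector_blf): returns how many bytes may be
// copied into the container that starts at `containerBegin` and holds
// `containerSize` bytes, or 0 when the write position lies outside it.
std::int64_t writableCount(std::int64_t containerBegin,
                           std::int64_t containerSize,
                           std::int64_t writePos, std::int64_t n) {
    const std::int64_t offset = writePos - containerBegin;
    // Without this guard, containerSize - offset would be negative for a
    // position past the container's end and would yield a negative count.
    if (offset < 0 || offset >= containerSize || n <= 0) {
        return 0;
    }
    return std::min(n, containerSize - offset);
}
```

For a 131072-byte container starting at 0, a 4-byte write at position 262144 would give 131072 - 262144 = -131072 without the guard; with it, the function returns 0 and the caller can skip the copy and move on to another container.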
