crash - can not reproduce - and ASAN experience
After some editing on a single RAW CR3 image, I played a bit with area masks, and then went to perspective correction and boom. Unfortunately not in the debug build and I can’t reproduce the crash.
So I did a debug build with the address sanitizer (WITH_ASAN=address), but ART didn’t even start because of a delete type mismatch. So I ignored that type of error but then ART did not start with a use after free in the thumbnail viewer.
I attach the ASAN results.
Question: do you find it useful if I use ASAN to try to solve these memory allocation/addressing errors and report my findings back here?
Comments (18)
-
repo owner -
reporter In my (limited) previous experience with ASAN I did not get any false positives. If I find anything useful I will report back here.
-
reporter The problem is in line 489 in rtengine/labmasks.cc:
if (show_mask_idx >= n || !masks[show_mask_idx].enabled) { show_mask_idx = -1; }
on entry n = 1 and show_mask_idx = -1, so masks[-1] gets hit, causing the ASAN alert.
When I change this to:
if (show_mask_idx != -1) { if (show_mask_idx >= n || !masks[show_mask_idx].enabled) { show_mask_idx = -1; } }
ASAN no longer complains. and ART starts up. I’m going to test some more with ASAN enabled now.
I’m going to ignore the “delete of inherited class through base class pointer where the base class destructor is not virtual” for now, it might be harmless (but the standard says the behaviour is undefined).
I also disabled tcmalloc in build-art because I read that using tcmalloc or jemalloc is not allowed when using ASAN, and I don’t know if this still true.
EDIT: this has apparently been fixed in current builds.
-
reporter - attached ASAN-2.pdf
Did some testing with ASAN and so far the only error I got was a heap buffer overrun when closing down ART (see ASAN2.pdf for the stacktrace and ASAN report).
This looks more difficult to figure out… It looks like the EditDataProvider has already been destroyed (ImageArea destructor) when it gets referenced further on in the Spot::on_hide() callback set in the Spot constructor.
0x6140000cf270 is located 48 bytes inside of 440-byte region [0x6140000cf240,0x6140000cf3f8)
freed by thread T0 here:
#0 0x7ffff6efb2c0 in operator delete(void*) (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xe12c0)
0x55555645384b in ImageArea::~ImageArea() /home/danny/programs/code-art/rtgui/imagearea.cc:72#1
0x55555645df28 in ImageAreaPanel::~ImageAreaPanel() /home/danny/programs/code-art/rtgui/imageareapanel.cc:43#2
0x55555645dfab in ImageAreaPanel::~ImageAreaPanel() /home/danny/programs/code-art/rtgui/imageareapanel.cc:44#3
0x555556156678 in EditorPanel::~EditorPanel() /home/danny/programs/code-art/rtgui/editorpanel.cc:857#4
0x555556157633 in EditorPanel::~EditorPanel() /home/danny/programs/code-art/rtgui/editorpanel.cc:932#5
0x7ffff58c899a in g_datalist_clear (/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x2e99a)#6previously allocated by thread T0 here:
#0 0x7ffff6efa448 in operator new(unsigned long) (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xe0448)
0x55555645d88f in ImageAreaPanel::ImageAreaPanel() /home/danny/programs/code-art/rtgui/imageareapanel.cc:24#1
0x55555614c0d6 in EditorPanel::EditorPanel(FilePanel*) /home/danny/programs/code-art/rtgui/editorpanel.cc:574#2
0x5555567980eb in RTWindow::createSetmEditor() /home/danny/programs/code-art/rtgui/rtwindow.cc:1145#3
0x55555678b5b3 in RTWindow::RTWindow() /home/danny/programs/code-art/rtgui/rtwindow.cc:370#4
0x555556504c1e in create_rt_window /home/danny/programs/code-art/rtgui/main.cc:257#5
0x555556507fb0 in main /home/danny/programs/code-art/rtgui/main.cc:566#6
0x7ffff11a5b96 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b96)#7
Not sure I’m up to figuring this one out.
-
repo owner seeing that you edited a CR3 file, this might be related to
#88… -
reporter The crash might be related, but the use after free in the onhide callbacks when closing ART is totally unrelated…
-
repo owner sure, I wasn't talking about that. Sorry for the confusion
-
reporter No need to apologize!
-
reporter I think I fixed the use after free problem when closing ART.
I forked ART and created a branch to test this: https://bitbucket.org/dheijl/art/src/ASANFIXES/.
-
repo owner great! I'll check it out as soon as I can. Thanks again!
-
reporter Although I think a better solution could be to check out why setProvider is used in the way it is, causing the problem in the first place, and fix that instead.
I fixed it the bottom up way I’m afraid.
-
repo owner Honestly I didn’t have the chance to take a look yet, so I don't know…
-
repo owner Hi @Danny Heijl , why can’t you just call
EditSubscriber::setEditProvider(nullptr)
from~EditDataProvider
instead of adding the extra flag and method? -
reporter That was the first thing I tried and it didn’t work because the currSubscriber in EditDataProvider was already gone. Then I added the flag and it worked. The clear is still there in ~EditDataProvider. I didn’t try to work out how it all ties together, never liked GUI programming.
-
reporter I suppose I could clear the provider in EditSubscriber:: unSubscribe instead of using the flag if I was sure that Subscribe is not going to be called again without setting the provider again.
-
repo owner nope. that wouldn’t work. But I’ve already simplified your patch a little bit, I’ll push shortly. Thanks!
-
reporter OK thanks!
-
reporter - changed status to resolved
Fixed by agriggio in commit d356318
- Log in to comment
surely useful! I don't have much experience with it though, does it give false positives or is it precise? I tried running on valgrind recently but it did not find anything worth investigating…