Segfault caused by security
I just had a segfault because security called net->SendToOne() on a player with uninitialized ConnData. This happens when net->SendToOne() is called before net->NewConnection() or net->MakeClientConnection().
(gdb) print conn.outlist
$7 = {[0] = {
prev = 0x0,
next = 0x0
},
[1] = {
prev = 0x0,
next = 0x0
},
[2] = {
prev = 0x0,
next = 0x0
},
[3] = {
prev = 0x0,
next = 0x0
},
[4] = {
prev = 0x0,
next = 0x0
}}
(gdb) bt full
#0 0x000000000040ad14 in DQAdd (base=0x7fffe0000ebc, node=0x805d90) at main/util.c:971
No locals.
#1 0x000000000041b813 in BufferPacket (conn=0x7fffe0000de4, data=0x7fffffffded0 "", len=6, flags=0, callback=0x0, clos=0x0) at core/net.c:2504
buf = 0x805d90
pri = 1
__PRETTY_FUNCTION__ = "BufferPacket"
#2 0x000000000041b8a7 in SendToOne (p=0x7fffe0000b40, data=0x7fffffffded0 "", len=6, flags=0) at core/net.c:2518
conn = 0x7fffe0000de4
#3 0x00007ffff2937bb2 in ?? () from /home/j/Desktop/zone/bin/security.so
No symbol table info available.
#4 0x0000000000415976 in RunLoop () at core/mainloop.c:63
ret = 1
td = 0x6f7970
l = 0x6f79a0
gtc = 130409037
#5 0x0000000000406b88 in main (argc=1, argv=0x7fffffffe068) at main/main.c:297
code = 0
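The all-zero prev/next pointers in conn.outlist, together with the crash inside DQAdd, are consistent with appending to a queue head that was never initialized. Below is a minimal sketch of that failure mode and a possible guard. It assumes a circular doubly linked queue with a self-referencing head; the names dq_node, dq_init, dq_is_initialized and dq_add_checked are hypothetical and not taken from the asss sources.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical circular doubly linked queue node; not the actual asss DQNode. */
typedef struct dq_node {
    struct dq_node *prev, *next;
} dq_node;

/* An initialized, empty queue head points at itself in both directions. */
static void dq_init(dq_node *head)
{
    head->prev = head->next = head;
}

/* A zero-filled head (as in the gdb dump) has never been through dq_init. */
static int dq_is_initialized(const dq_node *head)
{
    return head->prev != NULL && head->next != NULL;
}

/* Append node at the tail of the queue.
 * Returns 0 on success, -1 if the head was never initialized. */
static int dq_add_checked(dq_node *head, dq_node *node)
{
    assert(head != NULL && node != NULL);
    if (!dq_is_initialized(head))
        return -1; /* an unchecked add would dereference head->prev == NULL */
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
    return 0;
}
```

With a guard like this, a send issued before the connection is set up would fail with an error code instead of crashing the process.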
Comments (6)
-
-
Yup, pretty sure I hit this bug too. If I remember correctly it was a libc incompatibility. Would be nice if security.so were statically compiled... Or something :)
-
What platforms/libc versions are you running into trouble with?
-
reporter It is a rare occurrence. I am guessing the timing usually works out and everything remains okay.
The first time, I was just setting up a fresh asss install on my VM (VirtualBox). The second time, I happened to be running under gdb while trying to debug a deadlock. I suspect the deadlock triggered it this time: it was locked in net.c (due to the new player callback), which is probably why security ran too early. (Security should of course still wait for some kind of callback from net.c that indicates the player has been initialized.)
I am using https://bitbucket.org/grelminar/asss/downloads/security_x86-64_debian_libc2.11.3.so and a checkout from mercurial.
$ uname -a
Linux j-xubuntu-vm 3.11.0-17-generic #31-Ubuntu SMP Mon Feb 3 21:52:43 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
$ ldd --version
ldd (Ubuntu EGLIBC 2.17-93ubuntu4) 2.17
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 13.10
Release: 13.10
Codename: saucy
Also, I am thinking 0x7fffffff0000 is where some other kind of address space starts on my machine.
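The idea in this comment, that a module should wait until the network layer reports the player as initialized before sending, could be sketched roughly as follows. This is only an illustration: the player struct, on_connection_ready and send_to_one_checked are hypothetical names, not the asss API, and any locking needed in a multithreaded server is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-player state; asss keeps its real connection data elsewhere. */
typedef struct player {
    int conn_ready;     /* set once the network layer has initialized the connection */
    int packets_queued; /* stand-in for the real outgoing packet lists */
} player;

/* Would be invoked by the network layer after connection setup has finished. */
static void on_connection_ready(player *p)
{
    assert(p != NULL);
    p->conn_ready = 1;
}

/* Refuses to queue data for a player whose connection is not ready yet.
 * Returns 0 on success, -1 otherwise. */
static int send_to_one_checked(player *p, const void *data, size_t len)
{
    if (p == NULL || data == NULL || len == 0 || !p->conn_ready)
        return -1;
    p->packets_queued++; /* a real implementation would buffer the packet here */
    return 0;
}
```

A caller that gets -1 back could retry later or drop the packet, depending on how important it is.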
-
Hmm, the bug I was hitting was triggered every time I ran asss so it might've been something different. I was also running Xubuntu 13.10 but x86. I ended up running CentOS 5 to get it working without crashing if I recall.
-
reporter - changed status to resolved
A lot has changed in 1.6. If it occurs again, open a new issue.
It seems more likely this is some sort of binary incompatibility with security or an unrelated cause. I have never seen anything like this in 9 years of running this software. What steps are necessary to repeat this?
Are 0x7fffffff0000 range pointers normal by the way?