- changed status to closed
Invalid use of vfork in Command_execute leads to corrupted process state/crashing
Per the specification of vfork, the only valid actions after it are _exit or one of the exec-family functions. In practice you can often get away with most things that are async-signal-safe as long as you're not also using thread cancellation, but the more an action has potential to alter the memory state of the process (which is shared with the suspended parent) the more likely things are going to blow up.
I've had reports of monit crashing in Alpine Linux containers using musl libc, and the strace shows it happening right after vfork. Command_execute is doing a lot of things, all of them formally unsafe, in the child before exec, but the most suspicious is getpwuid, which can allocate memory, manipulate open file state, etc.
Use of vfork should be dropped unless you can ensure that it's done safely. Ideally posix_spawn would be used in its place but it looks like you need some functionality not offered by posix_spawn.
Comments (15)
-
repo owner -
reporter - changed status to open
That comment explains what vfork is. It does not define an invatiant which must hold for calling Command_exec; there are no circumstances under which a call to Command_exec in its present form is valid, as it has undefined behavior (and actively corrupts the parent process's memory space) no matter how it is called.
-
repo owner I should also include the second part of the comment, `The child also run first and suspend the parent process until exec or exit is called`. I.e. it is true what you say, but we thought as long as the control is sort of synchronous it would be safe.
Command_exec
does not differ much from (earlier) implementations of posix_spawn. Indeed posix_spawn typically uses vfork semantics itself. Though I hear you about `getpwuid`. Ps. It is an honour and I’m surprised you took the time to investigate. Thank you. -
Thanks Rich for investigating this. I’m replacing vfork with fork in Alpine Linux for now. This may blow up in future version of glibc/*bsd as well so it would be good to have it fixed upstream and it would be good to add a regression test for it.
Downstream report: https://gitlab.alpinelinux.org/alpine/aports/issues/11078
-
repo owner - changed status to resolved
Fix
#867move setuid guard from child to main process.→ <<cset 4d44453b0529>>
-
reporter - changed status to open
This change does not look correct. It changes group membership of the parent process before forking. It also looks like there is still a good deal of code that's likely to be unsafe not just in theory but in practice after the vfork, including getting error cause and emitting error messages. I'm not sure what mechanisms they use but if stdio is involved it's very unsafe.
-
repo owner This change does not look correct. It changes group membership of the parent process before forking.
It does not. It guard against setuid being used with a non-existing user. The child still call
setuid
as usual. Error reporting is done over the pipe. Appreciate if some could test with musl. -
reporter _guardSetuid is called in the parent before vfork, and it calls initgroups. I think what you want is getgrouplist in the parent and setgroups in the child.
-
repo owner Thanks. This last commit revert the previous nonsensical changes. The only change now is that we move the call to
getpwuid
out into the main thread. If this still breaks onmusl
please let us know exactly where and why it breaks and we’ll fix it, if we can. Though bear in mind that we do requirevfork
semantics, especially for error reporting back to the parent process etc. As long as vfork is supported on platforms Monit runs on we would prefer not to change this. -
reporter Conceptually initgroups is a call to getgrouplist followed by setgroups, and getgrouplist is the exact same type of heavy operation as getpwuid that’s extremely unsafe after vfork. This is why I suggested you probably want to call getgroupslist in the parent and setgroups in the child.
But really unless you’re prepared to dig into all this and make sure everything you’re doing is reasonably safe (note: formally NONE of it is safe; if it works it’s because you got lucky with implementation details), I think you really should just drop vfork. There are better ways to convey error status back to parent. My favorite is a close-on-exec pipe.
-
repo owner vfork is useful, especially in a constrained environment where Monit might run. Even though modern fork methods tries to minimise copy and use CoW etc, no copying is more efficient and as an extra bonus, error reporting from child to parent is trivial.
Anyway, I’m just curious where this crash on musl occurs and if it is possible to fix? The suggested fix of not using vfork is kind of unwanted. The code in
Command_execute
is not much different from e.g. posix_spawn on musl, except it don’t use syscalls directly. I.e. I want to narrow down where the crash occurs. Or is vfork just broken on musl. -
reporter vfork
is not broken on musl. It works exactly as specified: the only thing you are allowed to do aftervfork
isexecve
or_exit
. If something else happens to behave in a way that matches what you want, that’s purely by chance and is not behavior that can be relied on to continue to work. You were corrupting process state on glibc too; it just happened that the corruption was not having any immediately noticable consequences.I really don’t want to spend time tracking down the mechanism of how a particular instance of undefined behavior leads to a crash. It’s not a productive analysis because there’s no bug to find; the bug is already known (invalid use of
vfork
) and I’ve already spent an inordinate amount of time on this bug report.I agree that avoiding
fork
is valuable. There are two right ways to do this: you can either useposix_spawn
, orvfork
and immediatelyexecve
a separate program that the child setup takes place in. There are also hackish ways that are not valid and can’t be expected to work in arbitrary systems, but that might happen to work often enough that you don’t care. At the very least these would involve refraining from doing anything that’s not async-signal-safe in thevfork
child. I’ve mentioned factoringinitgroups
asgetgrouplist
andsetgroups
so that you avoid the former in the child, but there may be other things needed too. But in any case if you’re going to continue down this path you need to be very consciously thinking about the potential unsafety of every single thing you do in the child, not just haphazardly calling arbitrary functions there. -
repo owner Issue
#858was marked as a duplicate of this issue. -
repo owner - changed status to resolved
Fixed: Issue
#867Modify according to recommendation. This should address reports about Monit crashing in Alpine Linux containers using musl libc→ <<cset 8787ec12c6f1>>
-
repo owner Fixed: Issue
#867: Monit crashed when executing programs when linked with musl libc. Thanks to Rich Felker for report.→ <<cset cd22b3a1faa9>>
- Log in to comment
The comment above
Command_exec
is the invariant which must hold;Vfork has a special semantic in that the child process runs in the parent address space until exec is called in the child.