Invalid use of vfork in Command_execute leads to corrupted process state/crashing

Issue #867 resolved
Rich Felker created an issue

Per the specification of vfork, the only valid actions after it are _exit or one of the exec-family functions. In practice you can often get away with most things that are async-signal-safe as long as you're not also using thread cancellation, but the more an action has potential to alter the memory state of the process (which is shared with the suspended parent) the more likely things are going to blow up.

I've had reports of monit crashing in Alpine Linux containers using musl libc, and the strace shows it happening right after vfork. Command_execute is doing a lot of things, all of them formally unsafe, in the child before exec, but the most suspicious is getpwuid, which can allocate memory, manipulate open file state, etc.

Use of vfork should be dropped unless you can ensure that it's done safely. Ideally posix_spawn would be used in its place but it looks like you need some functionality not offered by posix_spawn.

Comments (15)

  1. Tildeslash repo owner

    The comment above Command_exec is the invariant which must hold; Vfork has a special semantic in that the child process runs in the parent address space until exec is called in the child.

  2. Rich Felker reporter
    • changed status to open

    That comment explains what vfork is. It does not define an invatiant which must hold for calling Command_exec; there are no circumstances under which a call to Command_exec in its present form is valid, as it has undefined behavior (and actively corrupts the parent process's memory space) no matter how it is called.

  3. Tildeslash repo owner

    I should also include the second part of the comment, `The child also run first and suspend the parent process until exec or exit is called`. I.e. it is true what you say, but we thought as long as the control is sort of synchronous it would be safe. Command_exec does not differ much from (earlier) implementations of posix_spawn. Indeed posix_spawn typically uses vfork semantics itself. Though I hear you about `getpwuid`. Ps. It is an honour and I’m surprised you took the time to investigate. Thank you.

  4. Rich Felker reporter
    • changed status to open

    This change does not look correct. It changes group membership of the parent process before forking. It also looks like there is still a good deal of code that's likely to be unsafe not just in theory but in practice after the vfork, including getting error cause and emitting error messages. I'm not sure what mechanisms they use but if stdio is involved it's very unsafe.

  5. Tildeslash repo owner

    This change does not look correct. It changes group membership of the parent process before forking.

    It does not. It guard against setuid being used with a non-existing user. The child still call setuid as usual. Error reporting is done over the pipe. Appreciate if some could test with musl.

  6. Rich Felker reporter

    _guardSetuid is called in the parent before vfork, and it calls initgroups. I think what you want is getgrouplist in the parent and setgroups in the child.

  7. Tildeslash repo owner

    Thanks. This last commit revert the previous nonsensical changes. The only change now is that we move the call to getpwuid out into the main thread. If this still breaks on musl please let us know exactly where and why it breaks and we’ll fix it, if we can. Though bear in mind that we do require vfork semantics, especially for error reporting back to the parent process etc. As long as vfork is supported on platforms Monit runs on we would prefer not to change this.

  8. Rich Felker reporter

    Conceptually initgroups is a call to getgrouplist followed by setgroups, and getgrouplist is the exact same type of heavy operation as getpwuid that’s extremely unsafe after vfork. This is why I suggested you probably want to call getgroupslist in the parent and setgroups in the child.

    But really unless you’re prepared to dig into all this and make sure everything you’re doing is reasonably safe (note: formally NONE of it is safe; if it works it’s because you got lucky with implementation details), I think you really should just drop vfork. There are better ways to convey error status back to parent. My favorite is a close-on-exec pipe.

  9. Tildeslash repo owner

    vfork is useful, especially in a constrained environment where Monit might run. Even though modern fork methods tries to minimise copy and use CoW etc, no copying is more efficient and as an extra bonus, error reporting from child to parent is trivial.

    Anyway, I’m just curious where this crash on musl occurs and if it is possible to fix? The suggested fix of not using vfork is kind of unwanted. The code in Command_execute is not much different from e.g. posix_spawn on musl, except it don’t use syscalls directly. I.e. I want to narrow down where the crash occurs. Or is vfork just broken on musl.

  10. Rich Felker reporter

    vfork is not broken on musl. It works exactly as specified: the only thing you are allowed to do after vfork is execve or _exit. If something else happens to behave in a way that matches what you want, that’s purely by chance and is not behavior that can be relied on to continue to work. You were corrupting process state on glibc too; it just happened that the corruption was not having any immediately noticable consequences.

    I really don’t want to spend time tracking down the mechanism of how a particular instance of undefined behavior leads to a crash. It’s not a productive analysis because there’s no bug to find; the bug is already known (invalid use of vfork) and I’ve already spent an inordinate amount of time on this bug report.

    I agree that avoiding fork is valuable. There are two right ways to do this: you can either use posix_spawn, or vfork and immediately execve a separate program that the child setup takes place in. There are also hackish ways that are not valid and can’t be expected to work in arbitrary systems, but that might happen to work often enough that you don’t care. At the very least these would involve refraining from doing anything that’s not async-signal-safe in the vfork child. I’ve mentioned factoring initgroups as getgrouplist and setgroups so that you avoid the former in the child, but there may be other things needed too. But in any case if you’re going to continue down this path you need to be very consciously thinking about the potential unsafety of every single thing you do in the child, not just haphazardly calling arbitrary functions there.

  11. Log in to comment