1. Shea Levy
  2. ref-trans

Overview

Original Motivation

Nix as it currently exists is very useful for using software, but for developing software it faces some significant shortcomings. In particular, trying to do incremental compilation with nix quickly runs into several problems:

  • There is very little nix-language library support for the individual building blocks of overall build processes (compilation, linking, etc.). It is unclear (to me) whether nix is even a good language for this (e.g. you can't currently iterate over all of the files in a directory, so there's no efficient way to express "this library is made by compiling all the .c files in that directory and linking them together").
  • Each derivation requires adding a .drv file to the store, which involves file I/O, hash calculation, database locking, etc. As the build process gets more granular, this cost starts to add up
  • The full dependency tree must be known in advance of any build. This has two problematic consequences:
  • the nix-language must fully process every derivation before it starts building any of them
  • you can't dynamically add build steps as the build progresses. This can be worked-around with imports from derivations, but those derivations don't benefit as much from nix's parallelism (since they are built as-needed at evaluation time and evaluation stops until the build completes) and has a very awkward interface (building nix expressions to be parsed back in by the evaluator).
  • The smallest unit of action is a process. As the build process gets more granular, the overhead of starting a whole new process (which is even greater if e.g. chroot builds are enabled) can start to dominate, especially in the case of a program like javac where just initializing the runtime has a non-negligible cost
  • The only form of sharing is files. If multiple stages in the build process want to use the same information generated by an earlier stage, their only option is to each read in a file created by that stage, getting hit with I/O penalties in the process
  • nix's parallelism is limited: while it will run multiple builds in parallel the computation required to determine which builds to start is all done single-threaded
  • nix's garbage collection is crude in some ways. In particular, it is impossible during a build to delete any intermediary products, even if they are no longer needed.

In particular, the facts that the smallest unit of action is a process and the only form of sharing is files are what led to the thinking that led to this project. For building things like C programs, this is in fact the status quo (it's how e.g. Make works), but for other more complex build processes this may be prohibitive (at work, the compiler we use in some cases requires communication with a service, implemented in java, running in another process. starting this service anew for every compile would kill performance). I hope, though, that my idea will be useful for much more than that if implemented well.

Reframing Nix

There are a lot of ways to think about what it is that nix does and why it's so valuable. In the context of this project, I think of nix as a tool for achieving referential transparency in a build process by making it possible to fully specify the inputs to the process, the outputs of the process, and the process itself. While it is possible to break these rules on purpose (e.g. using a random number generator), within the framework nix provides the default behavior is that the products of a build process are completely specified by the formal specification of that process, which opens up all of the benefits that referential transparency provides in a programming context (parallelism, reuse, reproducibility, etc.)

Generalizing Nix

Given the view of nix as a tool for achieving referential transparency, it becomes easier to see where nix might be made more general. The core of nix is the build process specification, which creates files as outputs, depends on files (whether specified completely in advance by copying them from the filesystem or produced by other builds) as inputs, and runs a process as its action. The core question motivating ref-trans is: What if there were a way to allow more kinds of outputs, inputs and actions while still retaining nix's core idea of referential transparency?

Architecture

As I currently envision it, ref-trans will be divided into a highly general core with plugins to provide specific functionality. The ref-trans core will be responsible for defining the interface action plugins and input/output plugins must follow, for actually running the actions in parallel while respecting the dependency requirements of the build process, for coordinating how the different inputs are passed to the actions, etc. The ref-trans core along with an input/output plugin for files and a "run a process" action plugin would be enough to re-implement nix, but more is possible (and planned, for now):

  • You could have a "memory page" input/output type, where a page of memory calculated in one build step can be mapped into the address space of another
  • You could have a "thread" action type, where a new thread in an existing process will be spun off, avoiding the overhead involved in creating a whole new process (certain care will be needed WRT global memory, file descriptors, etc. to make it possible for this to be referentially-transparent by default, but I don't think that's insurmountable)
  • You could have a "snapshotted process" input/output type and action type, where a one build step has a process that sets it up in a desirable state and then stops itself (i.e. with SIGSTOP) and then any number of other steps can be started with a fork of that process state (or multiple such processes running in communication with each other)

With just these three, you can do things like run multiple javac builds in parallel without re-incurring the JVM startup cost, or perform complex computations in parallel without worrying about locking (with the added benefit of sharing until the memory is GCd!), etc. But the real goal is to make the core interfaces generic enough that anything that can in principle be performed in a referentially transparent way can be implemented as a set of plugins to ref-trans and automatically gain all the benefits of sharing, parallelism, other plugin types, etc. For example, a versioned database might be able to provide plugins to bring an important kind of application state to ref-trans.

Design Goals

I plan on developing this project with a few design goals in mind from the start. Some of these are based on my experiences with nix and where it could be better, while others are simply what I think will help make this a maximally useful product:

  • A fully asynchronous (non-blocking), thread-safe API.
  • C at the interface (for easy integration with other languages).
  • Scalability as a religion. All design decisions will be considered (and, wherever possible, empirically tested) with a view toward how it will behave at scales at the upper limits of feasibility.
  • Comprehensively tested, both at an integration/black box level and a unit-test level. I'm initially starting with a test-driven design approach but am unsure of its effectiveness.
  • The primary interface will be an API. Individual programs will be provided when helpful, but they will ideally be simple combinations of functionality existing in the libraries, not implementing any complex logic of their own.
  • Inputs and outputs will be fully dynamic: ref-trans controlled build steps will be able to initiate builds of their own and declare that they need to create more outputs as they go along. There will be a mechanism whereby a caller can be notified of a dynamic output being declared by builds it has started.
  • The core and plugins will primarily be designed to make it possible and easy to write referentially-transparent builds. Features that enforce those properties (e.g. chroot builds) will have a lower priority.
  • Making it impossible for an unrelated build process to affect another's output (by spoofing, writing to files it doesn't belong to, etc.) without that other's permission will be a high priority.
  • Garbage collection will be more aggressive: un-referenced outputs will be deleted quickly.
  • Not fully decided on this, but reference counting may be manual. The "file" input/output plugin could provide a library function to scan for hashes like nix currently does and register those as referenced inputs, but the ref-trans programs themselves may not do that.
  • Portable wherever possible, with needed non-portability isolated in clearly identifiable ways to make porting a reasonable endeavour.