The First Year

Posted on December 29, 2020 by Rickard Nilsson

One year ago nixbuild.net was announced to the Nix community for the very first time. The service then ran as a closed beta for 7 months until it was made generally available on the 28th of August 2020.

This blog post will try to summarize how nixbuild.net has evolved since GA four months ago, and give a glimpse of the future for the service.

Stability and Performance

Thousands of Nix builds have been built by nixbuild.net so far, and every build helps in making the service more reliable by uncovering possible edge cases in the build environment.

These are some of the stability-related improvements and fixes that have been deployed since GA:

  • Better detection and handling of builds that time out or hang.

  • Improved retry logic should our backend storage not deliver Nix closures as expected.

  • Fixes to the virtual file system inside the KVM sandbox.

  • Better handling of builds that have binary data in their log output.

  • Changes to the virtual sandbox environment so it looks even more like a “standard” Linux environment.

  • Application of the Nix sandbox inside our KVM sandbox. This basically guarantees that the Nix environment provided through nixbuild.net is identical to the Nix environment for local builds.

  • Support for following HTTP redirects from binary caches.

Even Better Build Reuse

One of the fundamental ideas in nixbuild.net is to try as hard as possible to avoid running a build at all if an existing build result can be reused instead. We can trivially reuse an account’s own builds since they are implicitly trusted by the user, but untrusted builds can also be reused under certain circumstances. This has been described in detail in an earlier blog post.

Since GA we’ve introduced a number of new ways build results can be reused.

Reuse of Build Failures

Build failures are now also reused. This means that if someone submits a build that is identical (in the sense that the derivation and its transitive input closure are bit-by-bit identical) to a previously failed build, nixbuild.net will immediately serve back the failed result instead of re-running the build. You will even get the build log replayed.

Build failures can be reused since we are confident that our sandbox is pure, meaning that it will behave exactly the same as long as the build is exactly the same. Only non-transient failures will be reused. So if the builder misbehaves in some way that is outside of Nix’s control, for example if the builder machine breaks down, that failure will not be reused. In such cases we will automatically re-run the build.

When we fix bugs or make major changes in our sandbox, it can happen that we alter which builds succeed and which fail. For example, we could find a build that fails just because we have missed implementing some specific detail in the sandbox. Once that is fixed, we don’t want to reuse such failures. To avoid that, all existing build failures will be “invalidated” on each major update of the sandbox.

If a user really wants to re-run a failed build on nixbuild.net, failure reuse can be turned off using the new user settings (see below).
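The rules above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not the actual nixbuild.net implementation: the `BuildResult` record, the `sandbox_version` field and the `reuse_failures` flag are hypothetical names introduced for this example, and `derivation_hash` stands in for "the derivation and its transitive input closure".

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class BuildResult:
    # Hypothetical record type, invented for this sketch.
    derivation_hash: str   # identifies the derivation and its input closure
    succeeded: bool
    transient: bool        # True if the builder itself misbehaved
    sandbox_version: int   # major sandbox version that produced the result
    log: str


def reusable_result(
    previous: Iterable[BuildResult],
    derivation_hash: str,
    current_sandbox_version: int,
    reuse_failures: bool = True,
) -> Optional[BuildResult]:
    """Return a previous result to serve back, or None to run the build."""
    matches = [r for r in previous if r.derivation_hash == derivation_hash]

    # A successful result is always preferred.
    for result in matches:
        if result.succeeded:
            return result

    # The user can turn failure reuse off to force a re-run.
    if not reuse_failures:
        return None

    for result in matches:
        if result.transient:
            continue  # builder breakdowns are never reused
        if result.sandbox_version != current_sandbox_version:
            continue  # invalidated by a major sandbox update
        return result  # non-transient failure: replay it, including the log
    return None
```

Modelling the invalidation as a version comparison is a design choice made for the sketch; the post only states that failures are invalidated on each major sandbox update, not how.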

Reuse of Build Timeouts

In a similar vein to reused build failures, we can also reuse build timeouts. This is not enabled by default, since users can select different timeout limits. A user can activate reuse of build timeouts through the user settings.

The reuse of timed out builds works like this: Each time a new build is submitted, we check if we have any previous build results of the exact same build. If no successful results or plain failures are found, we look for builds that have timed out. We then check if any of the existing timed out builds ran for longer than the user-specified timeout for the new build. If we can find such a result, it will be served back to the user instead of re-running the build.

This feature can be very useful if you want to avoid re-running builds that time out over and over again (which can be a very time-consuming exercise). For example, say that you have your build timeout set to two hours, and some input needed for a build takes longer than that to build. The first time that input is needed you have to wait two hours to detect that the build will fail. If you then try building something else that happens to depend on the very same input, you will save two hours by directly being served the build failure from nixbuild.net!
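As a rough sketch, the timeout check described above could look like the following. The `TimedOutBuild` type and the `reuse_timeouts` flag are hypothetical names for this example. The comparison uses "at least as long as" so that a build which hit a two-hour limit can be reused for a new build with the same two-hour limit, as in the example; whether the service treats the equal case this way is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class TimedOutBuild:
    # Hypothetical record type, invented for this sketch.
    derivation_hash: str
    ran_for_seconds: int   # how long the build ran before it was stopped


def reusable_timeout(
    timed_out: Iterable[TimedOutBuild],
    derivation_hash: str,
    requested_timeout_seconds: int,
    reuse_timeouts: bool = False,   # off by default, as described in the post
) -> Optional[TimedOutBuild]:
    """Return an earlier timed-out build that proves the new one would time out."""
    if not reuse_timeouts:
        return None
    for build in timed_out:
        if build.derivation_hash != derivation_hash:
            continue
        # An earlier run that lasted at least as long as the new limit shows
        # that the new build cannot finish within that limit either.
        if build.ran_for_seconds >= requested_timeout_seconds:
            return build
    return None
```

Note that a build that timed out after one hour says nothing about a new build with a two-hour limit, which is why the earlier run time is compared against the new limit rather than the other way around.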

Wait for Running Builds

When a new build is submitted, nixbuild.net will now check if there is any identical build currently running (after checking for previous build results or failures). If there is, the new build will simply hold until the running build has finished. After that, the result of the running build will likely be served back as the result of the new build (as long as the running build wasn’t terminated in a transient way, in which case the new build will have to run from scratch). The identical running builds are checked and reused across accounts.

Before this change, nixbuild.net would simply start another build in parallel even if the builds were identical.
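The waiting behavior can be illustrated with a small in-process model. This is a sketch under simplifying assumptions, not how nixbuild.net is implemented: builds are modelled as Python callables, identical builds are identified by a `derivation_hash` string, and `BuildCoordinator` and `TransientFailure` are names invented for the example.

```python
import threading
from concurrent.futures import Future
from typing import Callable, Dict


class TransientFailure(Exception):
    """The build was cut short for reasons unrelated to the build itself."""


class BuildCoordinator:
    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._running: Dict[str, Future] = {}

    def build(self, derivation_hash: str, run_build: Callable[[], str]) -> str:
        while True:
            with self._lock:
                future = self._running.get(derivation_hash)
                owner = future is None
                if owner:
                    future = Future()
                    self._running[derivation_hash] = future

            if not owner:
                try:
                    # Hold until the identical running build has finished,
                    # then take over its result (or its failure).
                    return future.result()
                except TransientFailure:
                    continue  # the running build was lost: start over

            try:
                result = run_build()
            except BaseException as exc:
                self._forget(derivation_hash)
                future.set_exception(exc)
                raise
            self._forget(derivation_hash)
            future.set_result(result)
            return result

    def _forget(self, derivation_hash: str) -> None:
        # Unregister before waking waiters so a retry never sees a stale entry.
        with self._lock:
            del self._running[derivation_hash]
```

The coordinator keeps no per-account state, which mirrors the statement that identical running builds are checked and reused across accounts.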

New Features

User Settings

A completely new feature has been launched since GA: User Settings. This allows end users to tweak the behavior of nixbuild.net. For example, the build reuse described above can be controlled by user settings. Other settings include controlling the maximum used build time per month, and the possibility to lock down specific SSH keys, which is useful in CI setups.

The user settings can be set in various ways: through the nixbuild.net shell, the SSH client environment and even through the Nix derivations themselves.

Even if many users probably never need to change any settings, it can be helpful to read through the documentation to get a feeling for what is possible. If you need to differentiate permissions in any way (different settings for account administrators, developers, CI etc) you should definitely look into the various user settings.

GitHub CI Action

A GitHub Action has been published. This action makes it very easy to use nixbuild.net as a remote Nix builder in your GitHub Actions workflows. Instead of running your Nix builds on the two vCPUs provided by GitHub you can now enjoy scale-out Nix builds on nixbuild.net with minimal setup required.

The nixbuild.net GitHub Action is developed by the nixbuild.net team, and there are plans to add more functionality that nixbuild.net can offer users, like automatically generated cost and performance reports for your Nix builds.

Shell Improvements

Various minor improvements have been made to the nixbuild.net shell. For example, it is now much easier to get an overview of how large your next invoice will be, through the usage command.

The Future

After one year of real world usage, we are very happy with the progress of nixbuild.net. It has been well received in the Nix community, proved both reliable and scalable, and it has delivered on our initial vision of a simple service that can integrate into any setup using Nix.

We feel that we can go anywhere from here, but we also realize that we must be guided by our users’ needs. We have compiled a small and informal roadmap below. The items on this list are things that we, based on the feedback we’ve received throughout the year, think are natural next steps for nixbuild.net.

The roadmap has no dates and no prioritization, and should be seen as merely a hint about which direction the development is heading. Questions and comments about this list (or about what is missing from it) are very welcome at [email protected].

Support aarch64-linux Builds

Work is already underway to add support for aarch64-linux builds to nixbuild.net, and so far it is looking good. With the current surge in performant ARM hardware (Apple M1, Ampere Altra etc), we think having aarch64 support in nixbuild.net is an obvious feature. It is also something that has been requested by our users.

We don’t know yet how the pricing of aarch64 builds will look, or what scalability promises we can make. If you are interested in evaluating aarch64 builds on nixbuild.net in an early access setting, just send us an email to [email protected].

Provide an API over SSH and HTTP

Currently the nixbuild.net shell is the administrative tool we offer end users. We will keep developing the shell and make it more intuitive for interactive use. But we will also add an alternative, more scriptable variant of the shell.

This alternative version will provide roughly the same functionality as the original shell, only more adapted to scripting instead of interactive use. The reason for providing such an SSH-based API is to make it easy to integrate nixbuild.net more tightly into CI and similar scenarios.

There is in fact already a tiny version of this API deployed. You can run the following command to try it out:

$ ssh eu.nixbuild.net api show public-signing-key
{"keyName":"nixbuild.net/bob-1","publicKey":"PmUhzAc4Ug6sf1uG8aobbqMdalxW41SHWH7FE0ie1BY="}

The above API command is in use by the nixbuild-action for GitHub. So far, this is the only API command implemented, and it should be seen as a very first proof of concept. Nothing has been decided on how the API should look and work in the future.
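To illustrate how a script might consume this command, here is a minimal Python sketch. It parses the JSON shown above and joins the two fields into the `name:key` form that Nix uses for entries in its `trusted-public-keys` setting. The helper names are invented for this example, and since the API is a proof of concept the output format may change.

```python
import json
import subprocess

# The sample output shown above, used here so the parsing can be tried offline.
SAMPLE_OUTPUT = (
    '{"keyName":"nixbuild.net/bob-1",'
    '"publicKey":"PmUhzAc4Ug6sf1uG8aobbqMdalxW41SHWH7FE0ie1BY="}'
)


def fetch_signing_key_json(host: str = "eu.nixbuild.net") -> str:
    """Run the API command over SSH (requires a configured nixbuild.net account)."""
    completed = subprocess.run(
        ["ssh", host, "api", "show", "public-signing-key"],
        check=True,
        capture_output=True,
        text=True,
    )
    return completed.stdout


def trusted_public_key_entry(api_output: str) -> str:
    """Turn the API's JSON output into a `name:key` entry."""
    data = json.loads(api_output)
    return f"{data['keyName']}:{data['publicKey']}"


if __name__ == "__main__":
    print(trusted_public_key_entry(SAMPLE_OUTPUT))
```

Replacing `SAMPLE_OUTPUT` with `fetch_signing_key_json()` would query the service directly instead of using the canned sample.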

The API will also be offered over HTTP in addition to SSH.

Upload builds to binary caches

Adding custom binary caches that nixbuild.net can fetch dependencies from is supported today, although such requests are still handled manually through support.

We also want to support uploading to custom binary caches. That way users could gain performance by not having to first download build results from nixbuild.net and then upload them somewhere else. This could be very useful for CI setups that can spend a considerable amount of their time just uploading closures.

Provide an HTTP-based binary cache

Using nixbuild.net as a binary cache is handy since you don’t have to wait for any uploads after a build has finished. Instead, the closures will be immediately available in the binary cache, backed by nixbuild.net.

It is actually possible to use nixbuild.net as a binary cache today, by configuring an SSH-based cache (ssh://eu.nixbuild.net). This works out of the box right now. You can even use nix-copy-closure to upload paths to nixbuild.net. We just don’t yet give any guarantees on how long store paths are kept.

However, there are benefits to providing an HTTP-based cache. It would most probably have better performance (serving nar files over HTTP instead of using the nix-store protocol over SSH), but more importantly it would let us use a CDN for serving cache contents. This could help mitigate the fact that nixbuild.net is only deployed in Europe so far.

Support builds that use KVM

The primary motivation for this is to be able to run NixOS tests (with good performance) on nixbuild.net.

Thank You!

Finally, we’d like to thank all our users. We look forward to an exciting new year with lots of Nix builds!