If you build this derivation on your local machine, it will fail [1]:

    nix repl '<nixpkgs>'

    nix-repl> :b runCommandLocal "myfetch" {
                outputHash = "sha256:04jnq6arig0amz0scadavbzn9bg9k4zphmrm1562n6ygfj1dnj45";
              } ''Garbage. This should be a script executable by bash calling curl. It isn’t.''
        /build/.attr-0: line 1: Garbage.: command not found
    [0 built (1 failed)]
    error: build of '/nix/store/0bn9k5s227v6mb2pvyjdh9rh1rw4l1rm-myfetch.drv' failed

Reasonably so. Yet, on my machine, Nix contentedly answers with an output path. Wuuuut?

For experienced Nixers, this is not a surprise: this happens if you change a so-called “Fixed Output Derivation” without updating the hash.

However, when I first made this mistake, I didn’t know whether I did something wrong, had a crucial misunderstanding, or Nix had a bug. I was frustrated. I learned that I am not alone. Eventually, I understood what had happened and I could appreciate the tough implementation choices! Yet, there was this nagging feeling that there had to be a better way.

In this article, I explore your choices as a fellow Nixer to verify your Fixed Output Derivations without refetching and rebuilding everything all the time.

NOTE: This article is aimed at Nixers already familiar with writing derivations. For an introduction to Nix, see Nix: superglue for immutable infrastructure.

Fixed Output Derivations

Normally, Nix doesn’t allow derivations (= build steps) to access the network [2].

If you want network access from a derivation to download some sources, Nix requires you to set clear expectations: an output hash. A derivation with such a predetermined output hash is called “Fixed Output Derivation” or, abbreviated, “FOD”.

This prevents you from silently introducing randomness into your build. If you interact with the network, you interact with the messy world beyond Nix’s control. But, at least, Nix can shout and abort the build, when your derivation does not produce the expected output.
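To make this concrete, here is a minimal sketch of a hand-written FOD. It assumes a reasonably recent nixpkgs; the name "example-page" and the URL are placeholders, and lib.fakeSha256 is a dummy hash, so the first build is expected to fail with a hash mismatch that reports the real hash.

```nix
{ pkgs ? import <nixpkgs> { } }:

pkgs.runCommand "example-page" {
  # Declaring the expected output hash up front is what makes this a FOD
  # and what allows the builder to reach the network.
  outputHashMode = "flat";           # the output is a single file
  outputHashAlgo = "sha256";
  outputHash = pkgs.lib.fakeSha256;  # placeholder, replace with the real hash

  nativeBuildInputs = [ pkgs.curl ];
  # Assumption: curl needs a CA bundle inside the sandbox to verify HTTPS.
  SSL_CERT_FILE = "${pkgs.cacert}/etc/ssl/certs/ca-bundle.crt";
} ''
  curl --fail --location https://example.org/ --output "$out"
''
```

If the downloaded content ever differs from the declared hash, Nix aborts the build instead of passing the surprise on to dependent derivations.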

Enter caching (substitution)

We surely do not want to re-download the Internet, again, in every build.

Therefore, Nix will helpfully cache the result by the name of the derivation and the output hash [3]. You have given Nix a promise: "Trust me, whatever command I specify, it will result in the given hash." If Nix can already find the right result, why do all the hard work again?
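You can observe this without building anything. The following sketch uses only the builtin derivation function, so it does not need nixpkgs; the helper name mkFod and both build scripts are made up for illustration. It can be evaluated with nix-instantiate --eval --strict.

```nix
let
  # Hypothetical helper: same name and hash every time, only the script varies.
  mkFod = script: derivation {
    name = "myfetch";
    system = "x86_64-linux";
    builder = "/bin/sh";
    args = [ "-c" script ];
    outputHashMode = "flat";
    outputHashAlgo = "sha256";
    outputHash = "04jnq6arig0amz0scadavbzn9bg9k4zphmrm1562n6ygfj1dnj45";
  };

  a = mkFod "echo one > $out";
  b = mkFod "Garbage. This is not a valid command.";
in
{
  # The output path is derived from the name and the output hash only.
  sameOutPath = a.outPath == b.outPath; # true
  # The .drv files differ because the build commands differ.
  sameDrvPath = a.drvPath == b.drvPath; # false
}
```

Once either variant has produced the store path, Nix has no reason to run the other one.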

Preventing huge rebuilds

It goes further than that. If we want to re-execute the derivation without messing with our Nix store, which acts like a cache, we need to change the output path. If we change the output path, all the beautiful derivations that process that output further will need to be rebuilt.

Ooooh, that can be expensive. E.g. think of GNU libc. All the things we need to rebuild with exactly the same code that we happen to store at a different path!
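The effect on dependent derivations can be sketched in pure Nix as well. The helpers mkFod and mkConsumer and all names below are hypothetical; the example only compares output paths and never builds anything.

```nix
let
  system = "x86_64-linux";

  # Hypothetical FOD with a configurable name and build script.
  mkFod = { name, script }: derivation {
    inherit name system;
    builder = "/bin/sh";
    args = [ "-c" script ];
    outputHashMode = "flat";
    outputHashAlgo = "sha256";
    outputHash = "04jnq6arig0amz0scadavbzn9bg9k4zphmrm1562n6ygfj1dnj45";
  };

  # Hypothetical normal derivation that processes the FOD's output.
  mkConsumer = src: derivation {
    name = "consumer";
    inherit system;
    builder = "/bin/sh";
    args = [ "-c" "cp ${src} $out" ];
  };

  original     = mkFod { name = "src";         script = "echo fetch > $out"; };
  editedScript = mkFod { name = "src";         script = "echo fetch differently > $out"; };
  renamed      = mkFod { name = "src-renamed"; script = "echo fetch > $out"; };
in
{
  # Editing the fetch script leaves the consumer's output path alone.
  consumerUnchangedByScriptEdit =
    (mkConsumer original).outPath == (mkConsumer editedScript).outPath; # true
  # Renaming the FOD moves its output path, so the consumer moves too.
  consumerChangedByRename =
    (mkConsumer original).outPath != (mkConsumer renamed).outPath; # true
}
```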

Back to the problem

All this very reasonable thinking leads to the problem with which I started this article. The commands of the derivations are ignored if we already have a matching derivation result!

In fact, Nixpkgs contains plenty of code that will typically be evaluated but not executed because there is still a matching result in the cache. fetchurl etc. are all implemented with FODs (Fixed Output Derivations), so this applies to the hypothetical PR in which I change something like:

    fetchurl {
      url = "https://pigeonhole.dovecot.org/releases/2.3/dovecot-2.3-pigeonhole-${version}.tar.gz";
      sha256 = "0pk0579ifl3ymfzn505396bsjlg29ykwr7ag8prcbafayg4rrj28";
    }

to this:

    fetchurl {
      url = "https://there.was.never.anything/here";
      sha256 = "0pk0579ifl3ymfzn505396bsjlg29ykwr7ag8prcbafayg4rrj28";
    }

And most likely, the CI job would rubber-stamp this lovely PR while a human would obviously object.

Note that we can never perfectly ensure that a URL in a derivation still works: it is outside of our control. But we could ensure that it worked at least once. A big difference, in my opinion!

Rerunning FODs on all input changes

As an experiment, let’s see what is necessary to re-execute FODs (Fixed Output Derivations) on any input change.

FODs are rebuilt when their output hash or their name changes. Therefore, if we include a hash over all inputs in the name, we rebuild the FOD on every input change. We do this in four simple steps:

  1. To get Nix to calculate that hash conveniently for us, we “unfix” the given FOD and create a normal derivation based on it:

    let unfixedDerivation =
        fixedOutputDerivation.overrideAttrs(attrs: {
            outputHash = null;
            outputHashAlgo = null;
            outputHashMode = null;
        });
    
  2. We grab the output path with the hash:

    let outputPath = builtins.unsafeDiscardStringContext unfixedDerivation.outPath;
    

    We are using unsafeDiscardStringContext to make Nix allow us to use the string as part of our name. Despite the name, it works in pure/strict eval mode and is quite harmless, yet unusual.

  3. Extract the hash:

    let inputsHash = builtins.substring 11 32 outputPath;
    
  4. And use it as part of the derivation name:

    let name = "${inputsHash}_${fixedOutputDerivation.name}";
    

You have just created a Fixed Output Derivation that will re-execute whenever any of its inputs changes!
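Put together, the four steps could look roughly like the function below. This is a sketch, not the implementation behind this article: the name rerunOnInputChange is made up, it assumes the FOD was created with mkDerivation (so that overrideAttrs exists), and it derives the hash offset from builtins.storeDir instead of hard-coding 11.

```nix
# Sketch only: `rerunOnInputChange` is a hypothetical name for this file's function.
fixedOutputDerivation:
let
  # 1. Drop the fixed hash so the output path depends on all inputs again.
  unfixedDerivation = fixedOutputDerivation.overrideAttrs (_: {
    outputHash = null;
    outputHashAlgo = null;
    outputHashMode = null;
  });

  # 2. Take the output path as a plain string without string context.
  outputPath = builtins.unsafeDiscardStringContext unfixedDerivation.outPath;

  # 3. Skip the store directory plus the following "/", then take the
  #    32-character hash. For /nix/store the offset is 11.
  hashStart = builtins.stringLength builtins.storeDir + 1;
  inputsHash = builtins.substring hashStart 32 outputPath;
in
# 4. Return the original FOD with the inputs hash baked into its name.
fixedOutputDerivation.overrideAttrs (_: {
  name = "${inputsHash}_${fixedOutputDerivation.name}";
})
```

Saved as, say, rerun-on-input-change.nix, it would be applied as import ./rerun-on-input-change.nix (fetchurl { ... }).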

Rebalancing

For me, rebuilding FODs on every input change would have been easier to understand. For small projects, it is even a reasonable approach. But for medium and larger projects, it is simply unacceptable: if we use this technique in this extreme form, we will rebuild almost everything if

  • we update curl,
  • we change the mirror list,
  • any other input changes of fetchurl!

For medium-sized projects I’d recommend another approach:

  1. Use only your original Fixed Output Derivation as a dependency for your real build.
  2. Then, use the instrumented Fixed Output Derivation in your continuous integration testing. That will rebuild all the Fixed Output Derivations themselves on any input change but it stops there. Nothing that depends on them needs to be rebuilt!
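As a rough sketch of that split, assume a function rerunOnChange that instruments a FOD as described in the previous section; it is taken as an argument here. The package name, URL, hash and attribute names are all placeholders.

```nix
# Sketch: `rerunOnChange` is passed in; everything else here is hypothetical.
{ pkgs ? import <nixpkgs> { }, rerunOnChange }:
let
  src = pkgs.fetchurl {
    url = "https://example.org/project-1.0.tar.gz"; # placeholder URL
    sha256 = pkgs.lib.fakeSha256;                    # placeholder hash
  };
in
{
  # The real build depends only on the original FOD, so its output path
  # stays stable when curl or the mirror list changes.
  package = pkgs.stdenv.mkDerivation {
    name = "project-1.0";
    inherit src;
  };

  # Built only by CI: re-executes the fetch whenever any of its inputs change.
  checks.src = rerunOnChange { } src;
}
```

CI would build both attributes, while users and downstream derivations only ever reference package.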

For huge projects like nixpkgs, further fine-tuning is desirable. For every change on fetchurl, you may want to execute a couple of basic tests but you may find it unnecessary to re-execute all fetchurls. In addition, you might want to execute each fetch at least once every 30 days or so, even without changes, so that you ensure the URLs are still meaningful. Elaborating on that deserves another article.

Easy Experimentation

I have created a “rerunFixed” function which can be used like this:

    rerunOnChange {} myFixedOutputDerivation

By default, it will rerun the given FOD on every input change as a starting point. You can pre-override the attrs before they are used for the hash calculation:

    rerunOnChange {
      preOverrideAttrs = attrs: { irrelevant = null; };
    } myFixedOutputDerivation

Look at the docs to see how you can tune it. And let me know via Twitter if you have questions or suggestions. Follow me if you want to see more articles like this.

Conclusion

I think that it is worthwhile to check occasionally that the code of your Fixed Output Derivations still works. When and how often depends heavily on the context. I am curious about what you come up with!

Thanks

Special thanks to Andreas Rammhold (@andir0815) for experimenting with the idea with me and telling me about unsafeDiscardStringContext. In my earlier prototype, I needed to do "Import from Derivation" to get rid of the string context!

Thank you to Florian Klink (@flokli) and Andi (again!) for giving me valuable feedback on an earlier draft of this article.

Mistakes are mine.

[1] runCommandLocal is from nixpkgs.

[2] When running in sandboxed mode, which you should do to get all the beautiful properties of Nix.

[3] To be more exact: Nix will check if the output path of the derivation already exists locally or at any of the given remote substituters (like cache.nixos.org). The output path, in turn, depends on the name and the outputHash.