Is there any cache advantage to using ADD <url> vs RUN wget/curl <url> in a Dockerfile?

Does using ADD instead of RUN offer any advantage for layer cache invalidation?

Background

I frequently see Dockerfiles that install wget or curl just to RUN wget … or RUN curl … to install some dependency that cannot be found in package management.

I suspect these could be converted to simple ADD <url> <dest> lines, and that would at least obviate the need for adding curl or wget to the image.

Further, it seems like the docker daemon could rely on HTTP cache invalidation to inform its own layer cache invalidation. At a minimum (e.g. in the absence of HTTP cache headers), it could GET the resource, hash it, and calculate invalidation the same way it does for local files.

NOTE: I am familiar with the usage of ADD vs RUN …, but I am looking for a strong reason to choose one over the other. In particular, I want to know if ADD <url> can behave any more intelligently with regard to layer cache invalidation.

Solution:

Certainly.

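The cache for a RUN instruction is keyed on the instruction's text (and on the layers before it), not on anything the command fetches, so it is not invalidated unless that text changes. If the remote file is updated, you won't get it; Docker will reuse the cached layer.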

The ADD instruction will always download the file and the cache will be invalidated if the checksum of the file no longer matches.
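As a rough illustration of the difference, the two forms might look like the sketch below. The base image, the URL, and the destination path are placeholders chosen for this example, not taken from the question; in a real Dockerfile you would keep only one of the two variants.

```dockerfile
# Placeholder base image, assumed for illustration.
FROM debian:bookworm-slim

# Variant 1: RUN with curl. The cache key is the instruction text, so this
# layer is reused on later builds even if the file behind the URL changed.
# It also installs curl into the image just to perform the download.
RUN apt-get update \
 && apt-get install -y --no-install-recommends ca-certificates curl \
 && rm -rf /var/lib/apt/lists/* \
 && curl -fsSL -o /opt/tool.bin https://example.com/tool.bin

# Variant 2: ADD with a URL (hypothetical URL). The builder fetches the file
# itself, so no download tool is needed in the image, and the layer is
# invalidated when the content of the remote file changes.
ADD https://example.com/tool.bin /opt/tool.bin
```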

I would recommend using ADD instead of RUN wget ... or RUN curl .... I imagine people use the latter because it's more familiar, but the ADD instruction is quite powerful: it can set ownership on the files it adds, and it automatically extracts local tar archives (archives fetched from a URL are downloaded as-is and are not extracted). It's also considered best practice to avoid installing packages that your process does not need at runtime (though there are multiple ways to accomplish this, such as multi-stage builds).
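A minimal sketch of combining ADD with a multi-stage build is shown here. The stage name `fetch`, the URL, the paths, and the UID/GID `1000:1000` are all assumptions made for the example. Because an archive fetched from a URL is not unpacked by ADD, the extraction is done explicitly in the first stage, and only the extracted files are copied into the final image.

```dockerfile
# Stage 1: download and unpack. Nothing from this stage ships in the final
# image unless it is copied out explicitly.
FROM debian:bookworm-slim AS fetch

# Hypothetical URL. The archive is stored as a regular file at the
# destination; remote archives are not auto-extracted.
ADD https://example.com/tool.tar.gz /tmp/tool.tar.gz
RUN mkdir -p /opt/tool \
 && tar -xzf /tmp/tool.tar.gz -C /opt/tool

# Stage 2: the runtime image, with no download tools and no leftover archive.
FROM debian:bookworm-slim

# --chown sets ownership at copy time (the UID:GID here is a placeholder).
COPY --from=fetch --chown=1000:1000 /opt/tool /opt/tool
```

The first stage still benefits from ADD's content-based cache invalidation, while the final image contains only the unpacked files.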

Docs on cache invalidation:

https://docs.docker.com/develop/develop-images/dockerfile_best-practices/#leverage-build-cache
