Cache Docker layers between builds

Issue #14144 resolved
JoshuaT staff created an issue

Builds that build and push Docker images could be faster if the image layers were cached between builds.

Comments (79)

  1. Marc-Andre Roy

    Same here! That would save us a lot of time, since our private docker repo is kind of slow on upload and our builder image is kind of huge (~900 MB) with all the requirements we need.

  2. Matt Ryall staff

    A quick update on this ticket. We investigated whether we could do this, but the caching introduced some bugs in building Docker images correctly on our infrastructure.

    So unfortunately this turned out not to be a quick fix, and we had to postpone further work until we get a few other higher priority tickets finished off. We hope to get back to fixing this before the end of the year, and will keep you posted here with any progress.

    We're aware that this is a high priority issue for many people building Docker images on Pipelines, so this is very close to the top of our list of priorities.

  3. Paul Carter-Brown

    Really need this feature. Build & push jobs which took 30s on our Jenkins take 4 minutes in Bitbucket Pipelines. Considering that we pay for build time, and the productivity impact, this is really a must-have feature.

  4. Nick Boultbee

    Yep, for us this is the difference between builds of 17-60 minutes and less than a minute locally (with the base image cached, i.e. almost always). This is particularly bad as we have heavy (>3 GB) builder images with a cache of precompiled (Haskell) binaries - to save (CPU) time, ironically...

  5. Kyle Cordes

    This really seems like a showstopper for anything other than occasional, minor use of the Docker build feature. I was really, really surprised to see it do the whole thing again on the second build; it sent me researching for quite a while to figure out whether I was maybe using Docker wrong.

  6. Matt Ryall staff

    Thanks for all the interest in this issue. We're well aware that building Docker images on Pipelines is not as fast as it could be.

    As mentioned above, we've been closing off a few higher-priority improvements, so this ticket is now close to the top of our development queue. We aim to start work on it in January next year, and will have a further update around that time.

  7. NicolásM

    This happened for me 15 hours ago, but I was not building a Docker image. The first "Pulling images" stage, which pulls the base image of the pipeline, had its time reduced to 20-25% of what it was. So I don't think it was exactly this issue that was fixed.

  8. Matt Ryall staff

    Yes, there's been some good news for Docker users on Pipelines lately.

    Last week, we changed the filesystem driver used by the Pipelines Docker-in-Docker daemon (the one used for Docker commands within Pipelines), switching from vfs (the default for DinD) to overlay2. This has the benefits of matching how our hosts are configured and also dramatically improving the performance of disk operations during Docker builds.
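
    You can check which storage driver your build's daemon is using by adding something like this to your script (this assumes a Docker CLI recent enough to support the format flag):

    docker info --format '{{ .Driver }}'
    # prints the storage driver in use, e.g. "overlay2"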

    We're also starting work on Docker image caching this week, which we hope to have available to customers in February. This is planned to work automatically, creating a docker cache behind the scenes that works the same as our other build caches, and will store up to 1 GB of image layers across builds.

    Thanks for your continued interest in this issue.

  9. Nick Boultbee

    @Matt Ryall thanks for the update. We thought it might be a filesystem driver change - everything is faster including pulling layers, and "lightweight" Docker build steps.

    Personally we could do with more than 1GB of Docker cache, but I'm sure this will increase eventually.

    Anyway good news and look forward to the updates.

  10. Anthony Lazam

    Hello! Is there any additional configuration needed besides using "options: docker", and how long is the cache retained? It would be great to see this info, similar to the Caches button for dependency caches.

  11. Matt Ryall staff

    @Anthony Lazam - just to be clear, the cache is not yet available. Expect it to appear in a few weeks. We'll post here when that happens.

    If our planned approach works, the Docker image cache will appear in the Caches dialog in the Pipelines UI and no additional configuration will be needed on top of enabling Docker.

  12. Nathan Burrell staff

    Hi

    Docker layer caching has been released :)

    What do you have to do to take advantage of faster docker builds*... Nothing!

    If you currently use docker as a service, we will automatically cache the docker image layers between steps, allowing for faster docker builds by taking advantage of the immutable layers generated.

    "Run Docker commands in Pipelines" has a section detailing this feature, and links to the dependency caching documentation for further information about the limitations and behaviours of the cache.
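
    For reference, a minimal step that uses the Docker service (and therefore benefits from this cache) looks something like this (image name and build command are placeholders):

    pipelines:
      default:
        - step:
            services:
              - docker
            script:
              - docker build -t myapp .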

    Regards Nathan

    * provided you follow good dockerfile practices

  13. Brett Orr

    @Nathan Burrell Super excited to start using the docker caching! Can you please elaborate on what you're referring to with 'good dockerfile practices'? Additionally, should we be seeing a docker entry in the Caches tab in Pipelines?

  14. Nathan Burrell staff

    If we upload a cache as part of your step for docker (provided it's less than 1 GB when compressed, as per our regular cache limitations) you will see:

    • An entry in the caches dropdown called docker allowing you to see the size of the docker cache and the ability to manually expire it before the weekly automatic expiry
    • Logs in the Build Setup and Build Teardown section showing the status of download/uploading the cache for each step

    As for the good dockerfile practices:

    Let's say you create a dockerfile and add your application binary as one of the first layers with the ADD instruction. The layer cache would then be invalidated on every build: the hash of each layer is influenced by the contents of the previous ones, and your application binary has a different hash on each build as you add/remove functionality. You would see potentially no speed-up, as every subsequent layer in your dockerfile would have to be rebuilt because its hash no longer matches the one from the previous step.

    bad.dockerfile

    FROM some/base-image
    # Add my binary on each build
    ADD mybinary /app/mybinary
    # This layer is always rebuilt as the previous layers content have changed generating a different hash
    RUN apt update -y && apt install all-the-things -y
    

    Now, if you were to add the binary as one of the last instructions in the dockerfile, only the layer containing your binary would be invalidated. The hashes of the previous layers remain unchanged, since they execute the same instructions and aren't influenced by the changing binary, so you only have to "build" the last layer on each build, taking full advantage of the layer cache in docker.

    good.dockerfile

    FROM some/base-image
    # This layer is never rebuilt unless you change the instructions
    RUN apt update -y && apt install all-the-things -y
    # Add my binary on each build and only this layer is "rebuilt"
    ADD mybinary /app/mybinary
    

    For further information (and potentially a better explanation than mine ;)) please refer to the documentation provided by docker

    https://docs.docker.com/engine/userguide/eng-image/dockerfile_best-practices/#build-cache

    There are many blogs/books that have recommendations for best practices as well :)
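
    To see the cache in action: rebuilding good.dockerfile after changing only mybinary should produce output roughly like this (image IDs here are placeholders):

    $ docker build -t myapp .
    Step 1/3 : FROM some/base-image
     ---> 1a2b3c4d5e6f
    Step 2/3 : RUN apt update -y && apt install all-the-things -y
     ---> Using cache
     ---> 6f5e4d3c2b1a
    Step 3/3 : ADD mybinary /app/mybinary
     ---> 9e8d7c6b5a4f
    Successfully built 9e8d7c6b5a4f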

  15. Brett Orr

    @Nathan Burrell Thanks for the detailed reply! I'm not seeing anything in the Caches tab yet, and I'm a little confused as to whether I've declared my cache correctly (as I understand it, we don't add a cache entry, we just declare docker as a service in the step?).

    My pipelines YAML is below:

    image: node:9
    
    pipelines:
      branches:
        master:
          - step:
              script:
                - docker login -u $DOCKER_USER -p $DOCKER_PASSWORD
                - docker build -t xxxx/app:latest .
                - docker push xxxx/app:latest
              services:
                - docker
          - step:
              script:
                - mkdir -p ~/.ssh
                - (umask 077; echo $PRIVATE_KEY | base64 --decode > ~/.ssh/id_rsa)
                - echo 'cd XXXX && docker-compose pull && docker-compose down && docker-compose up -d' > dc.sh
                - ssh ubuntu@X.X.X.X 'bash -s' < dc.sh
    

    Overall my build times don't seem to be affected (they're variable for me thanks to how long Create-React-App takes to build a production folder), but I'm not sure I'm caching correctly?

  16. Nathan Burrell staff

    Hi @Brett Orr

    Correct, you just need to have docker in your services section for this to work, along with the following conditions being met:

    • The layer cache has to be < 1 GB compressed
    • The size of the images in the docker daemon must be < 2 GB for a cache to be created; you can check this by adding this command to your yml:

    docker image inspect $(docker image ls -aq) --format {{.Size}} | awk '{totalSizeInBytes += $0} END {print totalSizeInBytes}'

    What do the logs say in the build setup/teardown section for a step that should be consuming/producing a docker cache?

    Can I have a link to the repository/step that is exhibiting this behaviour or alternatively can you raise a support case so I can investigate for you?

  17. Nathan Burrell staff

    Hi @João Malcata

    What do the logs show in the Build Teardown section for your docker cache?

    Also can you confirm for me that your docker cache is within the limits:

    • The layer cache has to be < 1 GB compressed
    • The size of the images in the docker daemon must be < 2 GB (2147483648 bytes) for a cache to be created; you can check this by adding this command to your yml:

    docker image inspect $(docker image ls -aq) --format {{.Size}} | awk '{totalSizeInBytes += $0} END {print totalSizeInBytes}'

  18. Matt Ryall staff

    @Manzoor Hussain - there are now two different kinds of Docker caching we do in Pipelines, which might lead to a bit of confusion:

    • DockerHub pull-through image cache: any public images pulled from DockerHub for build and service containers are cached by a "pull-through" image cache. This is a shared cache across all our customers, and has been in effect for six months or so. This cache is unbounded, but only works for publicly accessible images on DockerHub used as build or service containers.
    • Docker-in-docker (DinD) layer cache: we implemented this for this feature request, which caches Docker image layers privately for each repository, to support reuse across builds. It also caches images which are pulled when you execute docker run or similar in your build script.

      This new cache is limited to 1GB per repository, with an additional requirement that the total of all images used in your build must be less than 2GB for us to attempt compression and storage (see @Nathan Burrell's example above). It works for any image pulled or built by Docker commands within your build, including docker run, docker build, etc., and will incorporate images pulled from any Docker registry.

    Based on what you've said, it sounds like your ECS image is used as a build or service container, meaning it will not be cached by Pipelines. To get it cached, you have the option of starting it via docker run yourself, so it uses the DinD cache instead of the pull-through image cache, as long as it falls within the size limits above.

    Adding caching for Docker images pulled from ECS for build/service containers is not covered by this feature request, and is actually fairly complex due to how the Docker pull-through proxy works. If that's something you'd like, please raise a separate ticket and we can review and respond to it there.
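
    As a sketch of that workaround (registry, image name and command here are hypothetical), you would run the image from your script rather than declaring it as the step's build image, so it is pulled through the DinD daemon:

    pipelines:
      default:
        - step:
            services:
              - docker
            script:
              # pulled and run via the DinD daemon, so its layers are eligible for the layer cache
              - docker run --rm my.registry.example/builder:latest make build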

  19. Anthony Lazam

    Hello, per the comments above the total size of the images must be less than 2GB. My current setup only creates two docker images (3 tags), which come to around 400 MB, but when I add up the total size in bytes it comes to 3.7GB. This results in Pipelines not caching our docker images. May I know why it shows a different value?

    docker image ls
    REPOSITORY                                                         TAG                 IMAGE ID            CREATED             SIZE
    **************         1.6.0-SNAPSHOT      54671081c605        43 seconds ago      222MB
    **************         latest              54671081c605        43 seconds ago      222MB
    **************   1.1.0               bf69ce97c4a4        6 weeks ago         172MB
    
    docker image inspect $(docker image ls -aq) --format {{.Size}} | awk '{totalSizeInBytes += $0} END {print totalSizeInBytes}'
    3791369347
    

    Since it exceeds a total of 2GB, it doesn't cache the docker as seen below:

    Cache "docker": Skipping upload for empty cache
    

    We're using our own Docker image for building this, which is around 878 MB.

  20. Perit Bezek

    @Anthony Lazam, as far as I know, the 2GB limit includes the images your final image depends on, like the Debian/Ubuntu/Alpine Linux image your image is based on. Running just docker image inspect $(docker image ls -aq) will display all the images that would have been put into the cache.

  21. Nathan Burrell staff

    @Anthony Lazam your docker image ls command above is missing the -a argument, which hides the intermediate layers that your images depend on and that we include in the cache :)

    If you add that option it will show you those as well (the command I provided above, which you quoted, contains that argument too).
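
    For comparison:

    docker image ls       # only tagged, top-level images
    docker image ls -a    # also the intermediate layers, which count toward the cache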

  22. Matt Ryall staff

    @Dana Asbury - please see my comment above for a summary of the Docker caching currently offered by Pipelines.

    If you would like to have caching for images hosted privately on ECR/GCR/etc. that are used as build or service containers, please raise a separate feature request for that. As far as I know, we don't have one open currently for this.

  23. Rick Kuipers

    While trying to use this cache, we seem to be getting the following error during "Build Setup".

    Cache "docker": Downloading
    Cache "docker": Error downloading. Please contact support if this error persists.
    

    Any idea why this may be? All the layers together are about 550 MB so that should be fine.

  24. Alexander Shulgin

    @Matt Ryall The size calculation that you are using is completely wrong. You cannot just add together the bytes of all sub-layers. When you build an image, each sub-layer may be just a JSON directive that extends the previous layer and inherits its size information. As a result, an image cache your approach estimates at approximately 8 GB is in reality only 150 MB. (I am using a custom script to create a docker cache with all sub-layers, by the way.)
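
    A quick way to see the discrepancy for yourself (the first command is the one quoted above; the second asks the daemon for its deduplicated view):

    # naive sum: counts shared and metadata-only layers at full size
    docker image inspect $(docker image ls -aq) --format {{.Size}} | awk '{total += $0} END {print total}'
    # actual deduplicated disk usage as the daemon reports it
    docker system df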

  25. Markus Hanslik

    @Nathan Burrell @Matt Ryall Is there any known workaround, or do you plan to fix the erroneous image size calculation? As pointed out by Martin and Alexander, we now cannot use caching at all: even though our resulting Docker cache files are only about 100 MB, Bitbucket thinks they are 14+ GB, because we have a couple of USER, WORKDIR, ENV etc. layers. These add nothing to the actual file size in Docker, but apparently appear to Bitbucket as separate full-sized images and thus greatly exceed the Bitbucket limit.

  26. Adam Robertson

    This doesn't seem to be working at all for me. Even between steps in a single build that use the same public container, I get this in every step:

    + (./.pipelines/mvp/js/pull.sh)
    Unable to find image 'node:8' locally
    8: Pulling from library/node
    4176fe04cefe: Pulling fs layer
    .... etc
    Status: Downloaded newer image for node:8
    

    Takes 25-30 seconds to pull a container to run 20 seconds worth of tests :(

  27. Marvin Killing

    +1 to what Alexander, Martin and Markus said. Our final image is < 1 GB, but the size calculation snippet posted above shows the intermediate images weighing in at around 18 GB. It would be great if somebody could look into whether the size calculation is correct. Thanks!

  28. Alexander Shulgin

    As a workaround you can still create a custom cache file:

    docker save $(docker images -q | tr '\n' ' ') $(docker images -q | while read line ; do docker history -q $line; done | tail -n +2 | grep -v \<missing\> | uniq -u | tr '\n' ' ')> ~/docker/cache.tar
    

    and then just load it on the next build

    docker load < ~/docker/cache.tar
    
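    If you want to wire this into your pipeline, here is a sketch using a custom cache definition (the cache name docker-tar and the image name are arbitrary):

    definitions:
      caches:
        docker-tar: ~/docker   # arbitrary name, caches the saved tarball
    pipelines:
      default:
        - step:
            services:
              - docker
            caches:
              - docker-tar
            script:
              - mkdir -p ~/docker
              # restore previously saved layers, if any
              - '[ -f ~/docker/cache.tar ] && docker load < ~/docker/cache.tar || true'
              - docker build -t myimage .
              - docker save myimage > ~/docker/cache.tar
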
  29. Matt Ryall staff

    Thanks for all the feedback, folks. We're definitely still looking at how to improve this feature, which is why the ticket is still open.

    For the cache/no-cache heuristic, we have identified a Docker upgrade as the easiest way to improve this, so that's on our short term list.

    For the performance concerns, we're measuring the relative build time changes (positive or negative) on repositories using Docker, so we can decide whether to continue with this cache as a default setting or make it a configurable option.

    As @Alexander Shulgin mentions, it is possible to implement a cache yourself if the image size calculation is not working for you. But it is not currently possible to turn the default caching off. So we're prioritising the build time investigation first.

    We'll keep you posted on the changes here.

  30. Martin Norbäck Olivers

    The cache does indeed work now, but it takes almost a minute in the pre-build stage just to initialize the cache (it's 500 MB zipped), so I think it would be better for us to cache the builds of the components separately.

  31. Alexander Shulgin
    Is it possible to turn off the docker caching?
    

    Even if it is impossible, you can always delete your images after the build process, so an empty cache is created in seconds:

    docker rmi -f $(docker images -q)
    
  32. Matt Ryall staff

    Hi everyone,

    We've completed our testing and analysis of the impact of Docker layer caching and found that about 25% of repositories had worse performance with the caches enabled. Based on this, we've decided to make the caching opt-in for each pipeline.

    So Docker layer caching is now available as a pre-defined cache with a name of docker, configured like this:

    pipelines:
      default:
        services:
          - docker
        caches:
          - docker
        script:
          - # .. do cool stuff with Docker ..
    

    If you were previously relying on the automatically enabled cache to make your build faster, you will now need to enable it explicitly as shown above. Docker layer caches have the same limitations and behaviours as regular caches, as described on the Caching Dependencies page: maximum size of 1 GB, refreshed after a week, etc.

    Based on our test results, this caching significantly speeds up many types of Docker builds, so the majority of people using Docker will want to enable this cache. However we thought it was important that you have control over this feature so you can enable it, or not, depending on your needs.

    Docker layer caching is now available for everyone to enable and use, so I'll resolve this ticket now. Please let us know here, or via a new ticket, if you have any ideas on how to improve it further.

    Thanks,
    Matt

  33. Conor Sheehan

    I'm having issues with this feature.
    I have the docker cache enabled, but the only thing in the cache is docker.tar.
    It doesn't cache any of the layers created by docker build commands in steps in my pipeline.
    Is there a way I can explicitly cache layers created by steps in my pipeline?

    I'm currently using docker save to preserve layers between steps,
    but that doesn't solve the problem of the pipeline building the image from scratch every time it runs, rather than loading from cache.
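
    In case it helps anyone, the save/load pattern I use between steps looks roughly like this (image name is a placeholder):

    - step:
        services:
          - docker
        script:
          - docker build -t myapp .
          - docker save myapp -o myapp.tar
        artifacts:
          - myapp.tar
    - step:
        services:
          - docker
        script:
          - docker load -i myapp.tar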

  34. Chris Stryczynski

    How do we add this to a specific branch?

    The below is considered invalid:

    options:
      docker: true
    
    pipelines:
      branches:
        staging:
          - step:
              deployment: staging
              script:
                - thisDoesNotWork???
            services:
              - docker
            caches:
              - docker
    
  35. Nathan Burrell staff

    Hi @Chris Stryczynski

    Caches are supported on branches. Your yaml example above is invalid: you need to indent the services and caches settings one more level, as they are properties of a step, not elements of the pipeline :)

    This is a valid example of your yaml (note the indentation of services and caches)

    options:
      docker: true
    
    pipelines:
      branches:
        staging:
          - step:
              deployment: staging
              script:
                - echo "valid"
              services:
                - docker
              caches:
                - docker
    
  36. Rahul Raj

    I am building an Android app, but the cache doesn't seem to be working for me either.

    This is my yaml file

    image: bitriseio/docker-android
    
    pipelines:
      branches:
        release:
          - step:
              services:
                - docker
              caches:
                - docker
              script:
                - fastlane beta
    

    Build Setup

    Cache "docker": Downloading
    Cache "docker": Not found
    

    Build teardown

    Cache "docker": Skipping upload for empty cache
    

    Build

    Image: bitriseio/docker-android
    Memory: 3072 MB
    

    Docker

    Image: atlassian/pipelines-docker-daemon:prod-stable
    Memory: 1024 MB
    

    Can anyone please help? The cache limit documentation is hard for beginners.

  37. Nathan Burrell staff

    We only cache layers if:

    a) there is no docker cache already uploaded; you can verify this from the Build Setup logs, which should say

    Cache "docker": Not found

    b) the total cache size uncompressed is less than 2 GB; you can verify this by running the following as the final command in your step and confirming that its output is less than 2 GB (it prints a human-readable size in KB, MB, GB...):

    /usr/bin/docker system df | grep Images | awk '{print $4}'
    

    c) once compressed, the cache must be less than 1 GB; if this condition fails, the Build Teardown logs should say

    Skipping cache upload.

    There is currently an issue where, if your cache is more than 2 GB, it simply prints Cache "docker": Skipping upload for empty cache. We have plans to improve this message to better indicate that the second condition is failing, but we haven't yet due to priorities.

    If your cache passes all of the above conditions and still isn't uploading, feel free to raise a support case so we can investigate further.
