Using Squid To Cache Apt Updates For Debian And Ubuntu

I run several Debian-based Linux machines and virtual machines at home and periodically install or reinstall one to test something. They all need updates—and mostly the same updates—so I wanted to cache the updates locally rather than download them several times when I upgrade.

There is an apt-proxy package, and although I can’t recall the specific problems with it, I remember deciding it was not going to work well for me. I could rsync the entire package archive, but that would waste bandwidth and disk space on packages I never install. I finally decided on setting up a Squid proxy dedicated (by intent, not by access controls) to caching deb packages from Debian and Ubuntu archives, and RPMs and the like should I use other distros.

So I set up Squid and looked through the configuration options. Squid’s defaults (LRU replacement and a 4096 KB maximum object size) are tuned for a high request hit rate on ordinary web traffic. I wanted to be sure it doesn’t expire the seldom-accessed large deb files to make room for tiny files, so I changed the cache replacement policy to LFUDA to optimize the byte hit rate. I also increased the maximum object size from the default of 4096 kilobytes to 100 megabytes. A typical Squid cache doesn’t keep larger files because web surfers request them less often than smaller files, but the purpose of my cache is to keep these large files locally for updating several machines.
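To get a feel for why LFUDA suits this job, here is a rough sketch of the idea in Python. It is a toy model, not Squid’s implementation: each object gets a key equal to its reference count plus a cache-wide “age”, the object with the lowest key is evicted first, and the age is raised to the evicted object’s key. The class name, the file names, and the sizes are all made up for illustration.

```python
class LFUDACache:
    """Toy LFUDA cache: evict the entry with the lowest (frequency + age) key."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.age = 0          # dynamic aging factor, raised on every eviction
        self.entries = {}     # name -> {"size": ..., "freq": ..., "key": ...}

    def request(self, name, size):
        """Return True on a cache hit, False on a miss."""
        entry = self.entries.get(name)
        if entry is not None:
            entry["freq"] += 1
            entry["key"] = entry["freq"] + self.age
            return True
        if size > self.capacity:
            return False      # too big to cache at all
        # Evict lowest-key entries until the new object fits.
        while self.used + size > self.capacity:
            victim = min(self.entries, key=lambda n: self.entries[n]["key"])
            self.age = self.entries[victim]["key"]
            self.used -= self.entries[victim]["size"]
            del self.entries[victim]
        self.entries[name] = {"size": size, "freq": 1, "key": 1 + self.age}
        self.used += size
        return False


# Hypothetical workload: one big package fetched by three machines,
# followed by a few small files that are each requested once.
cache = LFUDACache(capacity_bytes=100)
for _ in range(3):
    cache.request("big.deb", 80)
for name in ("small1", "small2", "small3"):
    cache.request(name, 10)

print(sorted(cache.entries))
```

In this run the large file stays cached because it has been requested three times, and one of the small single-use files is evicted instead. Object size never enters the key, which is what keeps big, popular files from being pushed out by many small ones.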

Now I needed to make my machines use the proxy for apt. For that I could add a snippet to each machine’s /etc/apt/apt.conf, but in my case I just dropped a file named jimproxy into /etc/apt/apt.conf.d/ with this content:

Acquire {
        Retries "0";
        HTTP {
                Proxy "http://address-or-URL-of-squid-proxy.example.tld:3128/";
        };
};

Now when I run apt, aptitude, or any other package manager that uses apt, it goes through my Squid proxy to fetch the distribution packages.

This worked quite well, but I recently noticed a problem. It looked as if deb files were missing from the archives. What was really happening was that the archives had new Packages.bz2 lists, but my Squid cache was still serving older lists it had cached, and those listed older packages that were no longer on the archives. So my “apt-get update” would read an old package list, and then “apt-get -u upgrade” would fail to find those older packages. I needed to tell Squid to check for new package lists. To do that I added the “refresh-ims” option to the refresh pattern, which makes Squid check with the origin server whenever a client sends an If-Modified-Since request. Voilà, it works properly now.
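A bit of arithmetic shows how a cached list can stay “fresh” for hours. The sketch below is a simplified reading of the percent and max fields of a refresh_pattern line, for an object that has no explicit expiry time. Real Squid also looks at Expires and Cache-Control headers and at client request headers, and the min field is left out here because it is 0 in my catch-all pattern. The function name and the two-day figure are made up for illustration.

```python
def heuristic_fresh_minutes(minutes_since_modified, percent, max_minutes):
    """Rough estimate of how long an object is served without revalidation.

    minutes_since_modified: how old the file already was (by Last-Modified)
                            when the cache fetched it.
    percent, max_minutes:   the percent and max fields of refresh_pattern.
    """
    lm_based = minutes_since_modified * percent / 100
    return min(lm_based, max_minutes)


# Hypothetical case: a package list last modified two days before it was
# fetched, matched by the catch-all pattern ". 0 20% 4320".
two_days = 2 * 24 * 60
print(heuristic_fresh_minutes(two_days, 20, 4320))
```

Under these assumptions the list would be handed out from cache for several hours after it was fetched, even if the archive published a newer one in the meantime.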

The squid.conf lines before:

# maximum_object_size 4096 KB
# cache_replacement_policy lru
refresh_pattern ^ftp:           1440    20%     10080
refresh_pattern ^gopher:        1440    0%      1440
refresh_pattern .               0       20%     4320

The squid.conf lines after:

maximum_object_size 100 MB
cache_replacement_policy heap LFUDA
refresh_pattern ^ftp:           1440    20%     10080
refresh_pattern ^gopher:        1440    0%      1440
refresh_pattern .               0       20%     4320 refresh-ims

I turned on refresh-ims for everything. I probably would have been fine turning it on for just the frequently changing files, as shown in the following code, but in my case I don’t think turning it on for all files will adversely affect things.

maximum_object_size 100 MB
cache_replacement_policy heap LFUDA
refresh_pattern ^ftp:          1440    20%     10080
refresh_pattern ^gopher:       1440    0%      1440
refresh_pattern Packages\.bz2$ 0       20%     4320 refresh-ims
refresh_pattern Sources\.bz2$  0       20%     4320 refresh-ims
refresh_pattern Release\.gpg$  0       20%     4320 refresh-ims
refresh_pattern Release$       0       20%     4320 refresh-ims
refresh_pattern .              0       20%     4320
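Squid checks refresh_pattern lines in order and uses the first one whose regular expression matches the URL, which is why the specific patterns have to come before the catch-all “.” line. The Python sketch below mimics that lookup for the patterns above so you can see which URLs would get refresh-ims. It uses Python’s re module rather than Squid’s own regex engine, which behaves the same for patterns this simple. The function name is made up, and the index file URLs are illustrative examples of the usual archive layout.

```python
import re

# (regex, has refresh-ims) in the same order as the squid.conf lines above.
REFRESH_PATTERNS = [
    (r"^ftp:", False),
    (r"^gopher:", False),
    (r"Packages\.bz2$", True),
    (r"Sources\.bz2$", True),
    (r"Release\.gpg$", True),
    (r"Release$", True),
    (r".", False),
]


def uses_refresh_ims(url):
    """Return True if the first matching pattern carries refresh-ims."""
    for regex, refresh_ims in REFRESH_PATTERNS:
        if re.search(regex, url):   # unanchored, case-sensitive match
            return refresh_ims
    return False


base = "http://us.archive.ubuntu.com/ubuntu/"
for path in (
    "dists/maverick/main/binary-i386/Packages.bz2",
    "dists/maverick/Release",
    "dists/maverick/Release.gpg",
    "dists/maverick/main/binary-i386/Packages.gz",
    "pool/main/s/sqlite3/libsqlite3-0_3.7.2-1_i386.deb",
):
    print(path, uses_refresh_ims(base + path))
```

Note that a Packages.gz URL falls through to the catch-all line, so if your apt fetches gzip-compressed lists you would need a pattern for those too.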

You may also be interested in using Squid in web accelerator mode in a small VPS to boost performance.

5 comments on “Using Squid To Cache Apt Updates For Debian And Ubuntu”
  1. Nick says:

    Hi,

    thanks for these instructions. Works great!

  2. Nick says:

    ok, I take that back. I am getting this in store.log:

    1287110828.080 RELEASE 00 00000944 56044C7925720070B8383C195EA9E44A 200 1287110064 1278360306 -1 application/x-debian-package 216428/216428 GET http://us.archive.ubuntu.com/ubuntu/pool/main/n/ncurses/libncursesw5_5.7+20100626-0ubuntu1_i386.deb
    1287110830.201 SWAPOUT 00 00000B00 56044C7925720070B8383C195EA9E44A 200 1287110827 1278360306 -1 application/x-debian-package 216428/216428 GET http://us.archive.ubuntu.com/ubuntu/pool/main/n/ncurses/libncursesw5_5.7+20100626-0ubuntu1_i386.deb
    1287110830.361 RELEASE 00 00000946 D967C8201CCCC8F7339E9D6FF38DC828 200 1287110067 1283771177 -1 application/x-debian-package 381040/381040 GET http://us.archive.ubuntu.com/ubuntu/pool/main/s/sqlite3/libsqlite3-0_3.7.2-1_i386.deb
    1287110832.825 SWAPOUT 00 00000B01 D967C8201CCCC8F7339E9D6FF38DC828 200 1287110830 1283771177 -1 application/x-debian-package 381040/381040 GET http://us.archive.ubuntu.com/ubuntu/pool/main/s/sqlite3/libsqlite3-0_3.7.2-1_i386.deb

    which looks to me like the packages were in the cache, are released and then downloaded and swapped out into cache again. The download speed for the second upgrade is not faster than for the first either. What could be the problem here?

  3. Nick says:

    well, sorry for all those messages, but it does help to configure apt _correctly_ on the second machine as well :( Now it works like a charm. Sorry again for spamming you, the instructions work great!

  4. tcpdump says:

    Hi,

    very nice information.
    For your problem with Packages.bz2 you may consider a no_cache statement to prevent those files from being cached.

    Regards tcpdump
