zsync – transfer large files efficiently

A few days ago I stumbled upon zsync, a tool for fast transfer of very large files. I noticed it because Ubuntu started to use it for its daily live images. Being curious, I studied it and realized that zsync is great! 🙂 I also ran some tests to see how well it works.

So what is zsync?

zsync is a file transfer program. It allows you to download a file from a remote server, where you have a copy of an older version of the file on your computer already. zsync downloads only the new parts of the file. It uses the same algorithm as rsync. However, where rsync is designed for synchronising data from one computer to another within an organisation, zsync is designed for file distribution, with one file on a server to be distributed to thousands of downloaders. zsync requires no special server software — just a web server to host the files — and imposes no extra load on the server, making it ideal for large scale file distribution.
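
The "just a web server" part works because an ordinary web server can already hand out pieces of a file through HTTP range requests. Here is a minimal Python sketch of how a client could turn a list of missing pieces into a `Range` header. The function names, the piece size and the idea of numbering pieces are my own assumptions for illustration; this is not zsync's actual code.

```python
def block_ranges(missing_blocks, block_size, file_size):
    """Merge consecutive piece indices into inclusive (first_byte, last_byte) spans."""
    spans = []
    for idx in sorted(missing_blocks):
        start = idx * block_size
        # The last piece of a file may be shorter than block_size.
        end = min(start + block_size, file_size) - 1
        if spans and spans[-1][1] + 1 == start:
            # Directly follows the previous span, so extend it.
            spans[-1] = (spans[-1][0], end)
        else:
            spans.append((start, end))
    return spans


def range_header(spans):
    """Format the spans as the value of an HTTP Range request header."""
    return "bytes=" + ",".join(f"{first}-{last}" for first, last in spans)


# Hypothetical example: pieces 1, 2 and 5 of a 22-byte file, 4 bytes per piece.
spans = block_ranges([1, 2, 5], block_size=4, file_size=22)
print(range_header(spans))  # bytes=4-11,20-21
```

Merging neighbouring pieces keeps the header short, which matters when thousands of small pieces are missing.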

How does it work?

You simply generate a small .zsync file on the server for each big file you offer for download. This .zsync file contains a description of the contents of the big file. The user can then run the “zsync” tool with your .zsync file as an argument and use an arbitrary local file as a base for the new big file. It can really be any file: a previous version of the big file, a pre-pre-pre-previous version of the big file, or even a newer version of it! The more similar it is, the better. The zsync tool will then compare the files and download only the differences needed to assemble the new big file.
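
To make the idea more concrete, here is a toy Python sketch of this kind of comparison. It is only an illustration under my own assumptions: the block size, the function names and the use of MD5 as the strong checksum are made up for the example, and the real tool's checksums and file format differ. The "description" is a list of per-block checksums of the new file, and the client slides a window over its local file to see which blocks it already has.

```python
import hashlib

BLOCK = 4  # tiny block size, for illustration only


def weak(data):
    """Cheap rsync-style checksum: two 16-bit sums packed into one integer."""
    a = sum(data) & 0xFFFF
    b = sum((len(data) - i) * x for i, x in enumerate(data)) & 0xFFFF
    return (b << 16) | a


def describe(target):
    """Server side: one (weak, strong) checksum pair per block of the new file."""
    blocks = [target[i:i + BLOCK] for i in range(0, len(target), BLOCK)]
    return [(weak(b), hashlib.md5(b).hexdigest()) for b in blocks]


def plan(local, description):
    """Client side: find new-file blocks that exist anywhere in the local file.

    Returns ({block_index: offset_in_local_file}, [block indices to download]).
    """
    wanted = {}
    for idx, (w, s) in enumerate(description):
        wanted.setdefault(w, []).append((idx, s))
    found = {}
    # Check every byte offset. A real tool would update the weak checksum
    # incrementally ("rolling") instead of recomputing it for each window.
    for off in range(len(local) - BLOCK + 1):
        window = local[off:off + BLOCK]
        for idx, s in wanted.get(weak(window), []):
            if idx not in found and hashlib.md5(window).hexdigest() == s:
                found[idx] = off
    missing = [i for i in range(len(description)) if i not in found]
    return found, missing


local = b"__AAAABBBBDDDD"     # the file the user already has
target = b"AAAAXXXXBBBBDDDD"  # the new file on the server
print(plan(local, describe(target)))  # ({0: 2, 2: 6, 3: 10}, [1])
```

Only block 1 ("XXXX") has to be downloaded, even though the matching data sits at different offsets in the local file.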

Usages

zsync is a great tool for developers who regularly download updated big files such as daily CD/DVD images. It can save you a lot of time and bandwidth if the files haven’t changed much. You will appreciate it especially if you or the server have a slow internet connection. So what are you waiting for?

Pitfalls

Oh, I haven’t told you yet. zsync is not currently available in Fedora because of some license problems 😦 But you can always find RPMs elsewhere.

Testing the efficiency

Here are the tests I have performed to see how well it works.

Ubuntu Karmic development images:

| base image | resulting image | download size | saved bandwidth |
|---|---|---|---|
| 20090830 (703 MB) | 20090831 (687 MB) | 57 MB | 91.7% |
| 20090831 (687 MB) | 20090901 (687 MB) | 1 MB | 99.9% |
| 20090830 (703 MB) | 20090901 (687 MB) | 57 MB | 91.7% |
| 20090901 (687 MB) | 20090830 (703 MB) | 73 MB | 89.6% |

As you can see, zsync works great for Ubuntu images. When updating from a very recent image, you can save about 90% of bandwidth (and time). Unfortunately I didn’t have any older images at hand to try updating from a not-so-recent image. But notice that with zsync you can also go “back in time” from a more recent image to an older one (the last line).
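
The “saved bandwidth” column is consistent with comparing the download size to the size of the resulting image. A small Python sketch of that arithmetic (the function name is my own, and the sizes are the rounded MB values from the table, so the last digit may occasionally differ from the tables):

```python
def saved_percent(download_mb, result_mb):
    """Share of the resulting image that did not have to be downloaded, in percent."""
    return round(100 * (1 - download_mb / result_mb), 1)


print(saved_percent(57, 687))  # 20090830 -> 20090831: 91.7
print(saved_percent(1, 687))   # 20090831 -> 20090901: 99.9
print(saved_percent(73, 703))  # 20090901 -> 20090830: 89.6
```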

Fedora 12 Alpha images:

| base image | resulting image | download size | saved bandwidth |
|---|---|---|---|
| RC1 (701 MB) | RC2 (705 MB) | 241 MB | 65.8% |
| RC2 (705 MB) | Final (705 MB) | <1 MB | 100.0% |
| RC1 (701 MB) | Final (705 MB) | 241 MB | 65.8% |
| Final (705 MB) | RC1 (701 MB) | 237 MB | 66.2% |

For Fedora 12 Alpha images you can still save more than half of the download when updating from RC1 to RC2, and you will be extremely pleased when updating from RC2 to Alpha Final. Although those two files are not identical, they are nearly the same (probably just some system label has changed), so you will get the image almost instantly.

Fedora 12 nightly composes:

| base image | resulting image | download size | saved bandwidth |
|---|---|---|---|
| 20090809 (703 MB) | 20090818 (702 MB) | 287 MB | 59.1% |
| 20090818 (702 MB) | 20090824 (630 MB) | 404 MB | 35.8% |
| 20090824 (630 MB) | 20090825 (636 MB) | 361 MB | 43.3% |
| 20090825 (636 MB) | 20090827 (640 MB) | 340 MB | 46.9% |
| 20090809 (703 MB) | 20090827 (640 MB) | 423 MB | 33.9% |
| 20090827 (640 MB) | 20090809 (703 MB) | 486 MB | 30.8% |

For Fedora 12 nightly composes the savings are clearly not as great as for Ubuntu images. Interestingly, some long-term updates may be more efficient than some short-term updates; it depends on how much the contents have changed over time. You still save around 45% of bandwidth, which is very nice. Why do the Fedora images differ so much compared to the Ubuntu images? You can read Oxf13’s explanation from the Fedora QA IRC meeting: the whole ISO is a squashfs image, which is not suitable for this kind of difference comparison. Ubuntu is probably using another method that is more compatible with the zsync algorithm.

Conclusion

So what do you think? Should there be .zsync files for ISO images in Fedora? I hope this article will help the infrastructure team to decide 🙂

13 thoughts on “zsync – transfer large files efficiently”

  1. I’d be curious to see the difference in efficiency compared to just using rsync on the same files. If there isn’t one (or isn’t much of one), I’d rather just use rsync, even if it means having an additional server.

  2. Hey,

    zsync will not be included in Fedora because it needs a modified zlib to be rsync compatible. rsync uses a forked zlib which is compiled statically into the rsync binary.

    rsync doesn’t meet the packaging guidelines, but no one cares, because you can’t remove it from Fedora, and a build against the system zlib would make Fedora’s rsync incompatible with rsync itself.

    I’m a fan of zsync and I tried to get it included in Fedora, but I have no chance. rsync upstream must drop the modified zlib or create a proper fork of it. A fork means its own name (e.g. rsync-zlib), its own source tarball and so on… then you can compile against the forked zlib and all will be fine! 🙂

  3. Nice article, Kamil!

    Zsync efficiency is a surprise for me. It looks like something I could use for getting current iso images for testing.

  4. For installation images, which contain RPMs (as opposed to live images), deltaisos are much more efficient. I’ve provided some for installation DVD images at

    http://thepiratebay.org/user/andre14965/

    Going from 12 Alpha TC to 12 Alpha RC1 saves around 88%, and from 12 Alpha RC1 to 12 Alpha Final (same as RC2) saves about 98%. Going from Fedora N Final to Fedora (N+1) Final normally saves about half, except that due to the change in payload compression from Gzip to XZ shortly after 11 Final, it won’t save anything going from 11 to 12 (although the change should make all the F12 ISO images around 20% smaller). The savings should be back to usual going from 12 to 13 and so on.

    1. Deltaisos have the disadvantage that you have to go from one particular version to another particular version. Zsync allows you to go from any version to any version (and you can use multiple images as a base image, which I forgot to mention in the article), and it works for any data files (not only ISOs with RPMs). On the other hand, when it is possible to use a deltaiso, it really seems to save much more bandwidth.

      I have read that applying a deltaiso is quite slow, can you comment on that? Zsync is super fast, it has just a few tens of seconds of overhead (reading the base file).

      1. In the description for each of my torrents I list the estimated time for running applydeltaiso with typical hardware. The time is greatest when the most deltarpms are being applied. In cases where most of the RPMs in the target are either the same as the source or not in the source at all (so deltarpms aren’t used at all for those RPMs) it’s fast. Although in the latter case it won’t save much bandwidth either.

        A typical time going from Fedora N to Fedora (N+1) Final DVD is around 45-60 minutes (you can see this in the F9 -> F10 deltaisos). The time between the 12 Alpha candidates was around 4 minutes (here most of the source and target RPMs were the same). Between F10 and F11, there was an i586 package rebuild, which had a severe effect on the size of the i386 deltaiso (around 83% of full size instead of the usual 50%), but since most of the target RPMs didn’t exist in the source (since the arch changed from i386 to i586), the running time was much faster than usual, around 15 minutes.
