Sunday, July 22, 2012

rpmbundle - Copy RPMs for offline installation

Installing Fedora (and most other free distros) on computers without a fast (and economical) Internet connection is a bit painful since most of the popular multimedia file formats aren't supported out of the box. Being a free distribution, Fedora only supports and includes free and open source software.

Support for the patent-encumbered formats is available from the RPM Fusion repository. However, you'll need a fast (and reasonably cheap) Internet connection to download several megabytes worth of RPM files from these repositories.

I present here the steps and a small utility program for copying RPM files from a PC with an Internet connection for offline installation on other computers. This can be used to update existing installations or to install new software on stand-alone PCs.


Step 1 - Download RPMs

The simplest way to harvest RPMs is to enable YUM's cache on the source machine. This way, everything YUM fetches on the machine will be available in one place from where we can copy the required ones.
 
To enable yum cache, edit /etc/yum.conf and set keepcache=1 and cachedir to a directory of your choice (for example, cachedir=/mnt/disk/yumcache). Keep in mind that you might be downloading quite a lot of RPMs when you update your computer and install new software and so the cache location must have sufficient free space. Don't forget to manually create the cache directory (/mnt/disk/yumcache).
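If you want to double-check the two settings without opening an editor, here is a minimal Python sketch. It assumes yum.conf is a plain INI-style file with a [main] section; the function name read_cache_settings is my own and is not part of yum.

```python
import configparser

def read_cache_settings(conf_text):
    """Return (keepcache_enabled, cachedir) from the [main] section of yum.conf text."""
    # interpolation=None keeps values such as $basearch untouched;
    # strict=False tolerates duplicate keys instead of raising an error.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    main = parser["main"]
    keepcache = main.get("keepcache", "0").strip() == "1"
    cachedir = main.get("cachedir")  # None when the option is absent
    return keepcache, cachedir

# Sample content matching the settings suggested in this step.
sample = "[main]\nkeepcache=1\ncachedir=/mnt/disk/yumcache\n"

if __name__ == "__main__":
    print(read_cache_settings(sample))
```

To check the real file, pass it the contents of /etc/yum.conf, for example read_cache_settings(open("/etc/yum.conf").read()).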

From now on, every time you install/update a package using YUM, the corresponding RPM file will be available in the cache directory.
For example, when you do something like:
yum install vlc
all RPM files downloaded by YUM (including the VLC package and any other dependency/update packages) will be cached.

Step 2 - Copy RPM files for bundling

When you're ready to copy out files, follow these steps:
mkdir ~/rpms
cd /mnt/disk/yumcache
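find . -iname '*.rpm' -exec cp -avu {} ~/rpms \;
The above commands copy every RPM file in the cache directory to ~/rpms. The pattern is quoted so that the shell passes it to find unexpanded. The -u option makes cp copy a file only when it is missing from the destination or older there, which avoids unnecessarily overwriting files that were copied earlier.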
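If you'd rather do the copy step from Python, the following is a rough equivalent of the find/cp commands above. It is only a sketch: the function name copy_new_rpms is made up for this example, and it flattens all RPMs into one directory just like the shell version does.

```python
import os
import shutil

def copy_new_rpms(cache_dir, dest_dir):
    """Copy *.rpm files found under cache_dir into dest_dir, skipping up-to-date copies."""
    os.makedirs(dest_dir, exist_ok=True)
    copied = []
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            if not name.lower().endswith(".rpm"):
                continue
            src = os.path.join(root, name)
            dst = os.path.join(dest_dir, name)
            # Same rule as cp -u: copy only if the destination is missing
            # or older than the source.
            if not os.path.exists(dst) or os.path.getmtime(src) > os.path.getmtime(dst):
                shutil.copy2(src, dst)  # copy2 also preserves timestamps
                copied.append(name)
    return sorted(copied)
```

With the directories used in this post, the call would be copy_new_rpms("/mnt/disk/yumcache", os.path.expanduser("~/rpms")). It returns the names of the files it actually copied, so a second run with nothing new returns an empty list.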

Step 3 - Removing old versions

One problem with using yum's cache is that it tends to bloat over time. After a while, the cache will contain multiple old versions of the same package. Although yum should gracefully handle these multiple versions and select the newest one during installation, copying all these files only serves to increase the size of the offline install bundle. Further, I recently noticed a (probably x86_64 specific) bug that causes problems when you try to install multiple versions of the same package from RPM files.

So, I cooked up an extremely naive Python script that scans the list of RPMs and finds the old versions. It tells you which files are to be kept and which ones are to be deleted and, if you confirm, it deletes the redundant ones too.
Here's how to use it:
cd ~/rpms
./rpmbundle.py
Files to keep:
        anjuta-3.4.3-1.fc17.x86_64.rpm
        apg-2.3.0b-16.fc17.x86_64.rpm
        apper-0.7.2-2.fc17.x86_64.rpm
        argyllcms-1.4.0-2.fc17.x86_64.rpm
        .....
Files to remove:
        apper-0.7.1-5.fc17.x86_64.rpm
        apper-0.7.2-1.fc17.x86_64.rpm
        ...

confirm delete[y/n]:
As you can see from the above example, two old versions of the package apper have been marked for deletion. If the user chooses 'y' for the confirmation, these files will be deleted from the current directory.
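To give an idea of how this kind of cleanup can work, here is a small sketch. It is not the actual rpmbundle code. It assumes file names follow the usual name-version-release.arch.rpm pattern, and it compares versions with a simplified rule (numeric parts as integers, letters as strings) that ignores epochs and other corner cases of RPM's real comparison. All function names are made up for this example.

```python
import re
from collections import defaultdict

def split_rpm_filename(filename):
    """Split 'name-version-release.arch.rpm' into (name, version, release, arch)."""
    stem = filename[:-len(".rpm")]
    nvr, arch = stem.rsplit(".", 1)
    name, version, release = nvr.rsplit("-", 2)
    return name, version, release, arch

def version_key(text):
    """Simplified sort key: numeric parts compare as ints and outrank letters."""
    return tuple(
        (1, int(part)) if part.isdigit() else (0, part)
        for part in re.findall(r"\d+|[A-Za-z]+", text)
    )

def pick_old_versions(filenames):
    """Return (keep, remove): the newest file per (name, arch) and the rest."""
    groups = defaultdict(list)
    for filename in filenames:
        name, version, release, arch = split_rpm_filename(filename)
        groups[(name, arch)].append(
            (version_key(version), version_key(release), filename))
    keep, remove = [], []
    for entries in groups.values():
        entries.sort()  # oldest first, newest last
        keep.append(entries[-1][2])
        remove.extend(entry[2] for entry in entries[:-1])
    return sorted(keep), sorted(remove)
```

Feeding it the three apper files from the listing above puts apper-0.7.2-2.fc17.x86_64.rpm in the keep list and the other two in the remove list. The sketch only reports; deleting the files is left to the caller.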

You can download the script file from GitHub, or even clone the repository by doing:
git clone git://github.com/syamcr/rpmbundle.git

Step 4 - Copy the RPMs & install on target machine

You need to manually copy the RPM files to the target computer. Note that the distro version and architecture (i686, x86_64, etc.) of both machines must match. You can't use a 32-bit machine to download the RPMs and then try to install them on a 64-bit machine. Nor can you install the RPMs for Fedora 17 on a Fedora 16 installation. Technically, it is possible to download the RPMs matching the target machine on any other machine, but the steps involved are a lot more complicated than those described here.
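A quick sanity check before carrying the bundle over is to look at the architecture field in each file name. The sketch below is only a rough filter under the assumption that names end in .arch.rpm; the function name wrong_arch_files is hypothetical, and the target architecture is something you supply yourself.

```python
def wrong_arch_files(filenames, target_arch):
    """Return RPM file names whose arch field is neither target_arch nor 'noarch'."""
    flagged = []
    for filename in filenames:
        stem = filename[:-len(".rpm")] if filename.endswith(".rpm") else filename
        arch = stem.rsplit(".", 1)[-1]  # text after the last dot, e.g. 'x86_64'
        if arch not in (target_arch, "noarch"):
            flagged.append(filename)
    return flagged
```

For example, wrong_arch_files(os.listdir("."), "x86_64") after importing os would list anything in the current directory that isn't an x86_64 or noarch package. Treat the result as a list of files to look at rather than a list of errors, since a 64-bit system may legitimately carry some i686 packages.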

Once you've copied the RPMs to the target machine, you can install them with yum:
cd ~/rpms
yum install *.rpm
Now just sit back and enjoy as yum does all the heavy lifting involved in the installation/updates.

------------------------------------------------


PS: I'm a pathetic noob when it comes to Python programming. This is my first serious Python program and my first upload to GitHub. So feel free to offer advice and suggestions.


2 comments:

  1. Thanks for this clear explanation.

    One thing I require clarification on though: do the two machines (i.e. both the online machine used to download packages, and the offline machine) require the same dependencies to have already been installed?
    For instance, I'm installing package X which requires dependencies Y and Z. The online machine already has Y and Z packages installed, but the offline machine doesn't; would yum still download Y and Z packages?

  2. No. That's a problem. Yum would only download the packages required on that machine. But if you had yum caching enabled on the online machine, you'd already have the Y and Z RPMs in the cache. Otherwise, you'll have to manually download the RPMs. There's a tool, yumdownloader (from yum-utils, I believe), that can download dependencies as well. But even that one uses the deplist on the particular machine, if I remember correctly.
