Monday, May 30, 2011

The Natty Narwhal Upgrade

Like Travis McGee, this week I'm taking an installment of my retirement. Unlike McGee, my retirement seems to consist of fixing up computers.

So today, Ladies and Gentlemen, Boys and Girls, Desktops and Netbooks, we're going to talk about upgrading Hal here to Ubuntu 11.04.

Yawn. This shouldn't really be that big a deal, should it? You just hit the update button, or type do-release-upgrade at the command line, right?

Well, it's not always that simple. A friend of mine did that, and he seemed to have a lot of problems. I'm not sure why, as I never got to examine his system before he reinstalled 10.10. Maybe it was the change from the classic Gnome desktop to the new Unity desktop. Or not. Maybe it was just the way the wind was blowing that day. Either way, I decided not to take a chance on the upgrade, but rather to install 11.04 from the ground up.

I didn't have to be pushed hard, because I was leaning toward a clean install anyway. When I bought this version of Hal I was in a hurry to get Linux set up, but wanted to keep Windows 7 around. As a result, I let Ubuntu do the repartitioning of the disk, and ended up with the entire Linux system in one giant partition.

This is less than optimal. (Thanks, Dave. Don't mention it, Hal.) Ideally one wants /home, and possibly /usr/local and /opt, in partitions separate from the root operating system. That way, when you upgrade the OS, you don't lose your previous data. (That doesn't mean you shouldn't back up that data before upgrading. There's a word for people who don't. The polite form of it is idiot.)

But this requires repartitioning Hal's disk. Fortunately, all of Hal's memories reside in the extended partition, so I don't have to fiddle with the Windows partition. After about 30 seconds of thought, I decided to split Hal's big Linux partition into 15GB for /, the root system, another 15GB for /usr/local, /opt, and /scratch (more on how to do this later), and leave the rest for /home. That's probably too much for /, and maybe not enough for /scratch, depending on what I put there, but that's how I'm going to do it this go-round. Ideally gparted would leave the data on /home as is, but that takes forever because of all the data that needs to be moved around, so we're just going to repartition everything and hope the backup holds.

Preparation

First, get the software we'll need. I downloaded and burned CDs for the 64-bit versions of:

  • Ubuntu 11.04 (Natty Narwhal) itself
  • The Gparted live CD
  • A couple of alternative Linux distributions, just in case

OK, Gparted is obvious: that's what's going to do the disk repartitioning. But why the alternative distributions?

Because you never know. The largely hypothetical long-time reader will remember that I used to use Fedora, but that I switched to Ubuntu when the installation of Fedora 5 failed miserably, and Ubuntu was there, ready and waiting. So having alternative Linux CDs on hand seems to be a really good idea. Besides, we bought 100 CDs maybe five years ago and still have a bunch left. They just aren't used much anymore. (That's right, rcjhawk is still a trailing indicator for device popularity.)

OK, we've got hopefully every bit of software that we need, so let's go.

Find out what's on your machine

I never remember which packages are installed from one upgrade to the next. So let's make a list:

dpkg --get-selections > installed.txt

Edit installed.txt, leaving out any packages you think will be installed by the system (e.g., kernel, window manager, etc.). If you can't remember what a package does, delete it too. If you need it, it will eventually show up as a dependency of a package you want, or you can reinstall it with apt-get or synaptic. The important thing to do here is to remove cruft from the system. For example, I have the bsd-games package on here. I never play those games, so why leave them in place?
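For reference, dpkg writes one package per line with its selection state after a tab, so the file is easy to prune in any editor. The top of mine looked something like this (your list will differ):

abiword         install
acpi            install
bsd-games       install

Deleting the bsd-games line is all it takes to keep it from coming back.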

It would have been nice to make a copy of /etc/fstab, but some dingbat forgot to do that (more on this later).

Back it all up

Well, not everything. No need to back up the kernel, or Emacs, or LaTeX, or any of that. The distribution will provide. No, I want to back up my data (basically $HOME), the Intel Fortran compiler (a royal pain to install), and my renegade version of SoX. So do the following (/backup is my ext4-formatted USB disk):

$ cd /home
# -rpv == recurse directories,
#  preserve permissions and timestamps,
#  and speak up about it verbosely.
$ cp -rpv dave /backup
# Assuming you're named dave, of course
$ cd /usr
$ cp -rpv local /backup
$ cd /
$ cp -rpv opt /backup
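Before reformatting anything, it's worth a quick check that the copy actually took. diff can compare the two trees and report only the differences:

# -r recurses, -q just names files that differ rather than showing how
$ diff -rq /home/dave /backup/dave
# No output is good news: the trees match.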

To be safe, I also had backintime, my backup manager, do an additional labeled backup, which should keep it on the disk forever and a day, or at least until the end of 2012.

Repartition

The next step is the scariest one. I've taken what was an 800+GB partition and broken it up into three parts: 15GB for /, 15GB for what will be /usr/local, /opt and /scratch, and the remainder for /home. It's scary because this wipes out the primary data, so you are now relying on the kindness of backups. It's theoretically possible to get gparted to keep all the data in what's going to be the /home partition, but that requires moving a lot of stuff around, at least the way I tried it, and an estimated 18 hours to get it done. So I just said a prayer, bit my lip, and reformatted. Fortunately it all went well.

Installation

Not much to say here. I told Ubuntu that I wanted a custom installation, had it mount everything where I wanted it, and let it go. The usual waiting around occurred.

Copying Files

This was rather easy: turn on the backup USB disk drive and copy the files back. I copied back the configuration files I knew I wanted to keep, e.g., .emacs and the .devilspie directory, but I'm letting Ubuntu set up the desktop. This means I'll have to play with preferences later, but it should eliminate any conflicts between old and new configuration parameters.

Mounting the Backup Drive

I could just plug the backup drive in and access it at /media/really_long_hexadecimal_string, but what I really want is to have the backup drive mounted at /backup. I've covered this before, but as I mentioned above, I forgot the UUID of the drive. You get that with the command:

blkid

which gives you the UUIDs of all the connected partitions.
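With the UUID in hand, one line in /etc/fstab puts the drive at /backup. A sketch, with a made-up UUID (substitute whatever blkid actually reports):

# backup drive on /backup; the UUID below is hypothetical -- use your own
UUID=0fab5c6d-1234-5678-9abc-def012345678  /backup  ext4  defaults  0  2

Then sudo mkdir /backup (once), and sudo mount /backup will pick up the new entry.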

Restoring /usr/local and /opt

This is awkward. Add-on programs that you compile yourself generally go into /usr/local. Other programs, such as the Intel Fortran compiler and Google's Picasa, end up in /opt. I want both to stick around between upgrades, so they need to be out of the / partition. But I don't want two extra partitions. So here's what I did:

  • Mount the second 15GB partition as /usr/local (I had the installer do this).
  • Create a directory /usr/local/opt.
  • Copy all of the /usr/local files from the backup to where they belong.
  • Copy all of the /opt files to /usr/local/opt.
  • Then, as root, replace the empty /opt directory the fresh install created with a symlink (the whole sequence is sketched below):
    # rmdir /opt
    # ln -s /usr/local/opt /opt
  • If you want a /scratch directory, do the same thing, but don't forget to make the original world-writable:
    # mkdir /usr/local/scratch
    # chmod 777 /usr/local/scratch
    # ln -s /usr/local/scratch /scratch
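Spelled out as commands, the restore goes something like this (a sketch, assuming the backup disk is mounted at /backup and the new 15GB partition at /usr/local):

# Run all of this as root.
cp -rpv /backup/local/* /usr/local/     # put /usr/local back
mkdir /usr/local/opt
cp -rpv /backup/opt/* /usr/local/opt/   # /opt's contents live under /usr/local
rmdir /opt                              # the fresh install leaves an empty /opt
ln -s /usr/local/opt /opt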
    

Restoring your add-on packages

Now we want to get back all of our old packages, the ones that aren't installed by default, at least so far as Ubuntu will let us. Remember that file installed.txt I mentioned you should create and edit? Here's how we'll use it:

  • Select System => Administration => Update Manager
  • Click on Settings in the lower left corner
  • Under Ubuntu Software, make sure all of the sources you want enabled are enabled. If you don't want proprietary drivers, etc., turn them off. Do the same under Other Software.
  • Open a terminal window and go to the directory where you have that installed.txt file.
  • Run the following commands:
    sudo apt-get update
    sudo apt-get install `awk '{print $1}' installed.txt | xargs`
    
  • You'll probably get some error messages, but they're pretty clear about what you need to fix. Edit the installed.txt file to match.
  • This will take a while, as all the packages have to be downloaded from the net.
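As an aside, dpkg has a built-in way to do this job, provided you kept an unedited copy of the file in the two-column format that dpkg --get-selections produced:

# Feed the saved selections back into dpkg's database,
# then tell apt to install whatever is marked.
sudo dpkg --set-selections < installed.txt
sudo apt-get dselect-upgrade

I prefer the awk route, though, since editing the list down is the whole point.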

What Works

Pretty much everything. I selected Gnome-classic for my desktop, and after a little fiddling with System => Preferences => Startup Applications got the system to look pretty much as it did before.

As for third party software, the Intel Fortran compiler still works fine. I was able to reinstall Picasa from Google's supplied 64-bit .deb.

For all the pre-release talk about Wayland, 11.04 still runs on good old Xorg, so remote windows work just as they did before. I can run, say,
ssh -X majel
to get me to another machine, and then run
firefox
on majel, and the window pops up here on Hal. So no major problems for me there.

What Doesn't Work

Google Earth. Maybe this is a 64-bit problem; I don't know. I tried Google's .deb file, and the official Ubuntu method. Neither worked.

By default Ubuntu installs the 3-D version of Unity, which requires a pretty good graphics accelerator. Hal lives in 2-D, so that wouldn't run. (Funny, the Unity desktop shows up when you run the live CD.) I installed the 2-D version and that works. Which brings me to

What Sucks

Unity. From my limited experience (about two minutes, after which I ran away screaming), it's an overblown and somewhat hideous version of the Mac desktop. But don't mind me, I was the last person on Earth to use fvwm. I suppose if I played around awhile I could make Unity behave the way I wanted it to, but why bother, since Ubuntu still supplies the classic Gnome desktop? Mind you, I've seen Gnome 3, and I don't like that, either. (The word you're looking for is Luddite.)

Summary

All in all, a successful update, as long as I stay away from Unity. I have one more computer that needs updating to 11.04; I think I'll just try that as a distribution upgrade. If it doesn't work, I can always do a full install.

More Later

Troubles will surface, they always do. When they do, I'll write about them here.

Saturday, May 21, 2011

Compression

The other day an email arrived from SourceForge mentioning that they were hosting the file compression program 7-zip. Now, I had used 7-zip under Windows as an all-purpose archiving tool, mostly for reading zip files, but I'd never thought of it in a Linux context. Both openSUSE and Ubuntu have the command-line version available, though, and what more do you need?

In openSUSE, the RPM is called, simply enough, 7z. In Ubuntu it's a bit more complicated. The package p7zip provides 7zr, a bare-bones standalone version of the compression program, along with a wrapper script, also called p7zip, that makes 7zr work like gzip. For the full-blown 7z program, you want the package p7zip-full, which includes 7z. While you're at it, you might want to grab the p7zip-rar package, which lets you decompress RAR files.

The reason you want 7z and not just 7zr is that the smaller program only handles the 7z format, while, as it says in the blurb,

not only does [7z] handle 7z but also ZIP, Zip64, CAB, RAR, ARJ, GZIP, BZIP2, TAR, CPIO, RPM, ISO and DEB archives.

And by handle, they mean read and write all of these formats (except RAR, which it can only read). So I could create a zip archive with 7z, or a gzip/bzip2 compressed tarball. And guess what?

7z compression is 30-50% better than ZIP compression.

Is it? Well, let's find out.

Test 1

Presented for your consideration: an uncompressed tarball:

$ ls -l Ru.tar
-rw-r--r-- 1 dave dave 26716160 2011-05-21 15:41 Ru.tar

This is I/O from the elk FP-LAPW code, so it has a lot of repeated text, plus what are, for our purposes, a bunch of random numbers. First we'll try compressing it with each program using its native format. I'll go for maximum compression in all cases. Note that zip and 7z create separate archives, while gzip and bzip2 compress the file in place.

Program   Command                     File Size   Ratio
zip       zip -9 Ru Ru.tar              3899523   0.146
gzip      gzip -9 Ru.tar                3899386   0.146
bzip2     bzip2 -9 Ru.tar               2992422   0.112
7z        7zr a -mx=9 Ru.7z Ru.tar      2242708   0.084

Pretty good, huh? As advertised, 7z is about 40% better than zip/gzip, and 25% better than bzip2. But wait, there's more. Not every computer is going to have 7z available, so you may want to compress files using a more established protocol. 7z can do that, too, which is why we wanted it, not just 7zr:

Format    Command                                File Size   Ratio
zip       7z a -mx=9 -tzip Ru.zip Ru.tar           3287420   0.123
gzip      7z a -mx=9 -tgzip Ru.tar.gz Ru.tar       3287335   0.123
bzip2     7z a -mx=9 -tbzip2 Ru.tar.bz2 Ru.tar     2989193   0.112

So 7z compresses to zip/gzip better than the native programs do it themselves. It doesn't really outperform bzip2 here, though. The only disadvantage compared to gzip or bzip2 is that it doesn't compress the files in place, unless you go through a script such as the one in /usr/bin/p7zip.
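That script is the gzip-like wrapper mentioned earlier, and if in-place behavior is what you want, it does the trick (at least on my reading of the script):

$ p7zip Ru.tar        # replaces Ru.tar with Ru.tar.7z
$ p7zip -d Ru.tar.7z  # decompresses, giving Ru.tar back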

Test 2

Pi to 4 million Decimals has, duh, π to, actually, 4,194,034 places. The file pi.tar.gz has it in ASCII, with a bit of header information. If we uncompress that file, it comes in at 4362370 bytes. Since the digits of π never settle into a repeating pattern, it's hard for a compression program to find redundancy to squeeze out. The following table lists the compression achieved by our test programs, in whatever formats they can use. (See the above tables for the appropriate commands.) Let's see how everybody does:

Program   Protocol   File Size   Ratio
None      None         4362370   1.000
zip       zip          2041130   0.468
gzip      gzip         2040997   0.468
bzip2     bzip2        1863892   0.427
7z        zip          1983378   0.455
7z        gzip         1983297   0.455
7z        bzip2        1860047   0.426
7z        7z           1884999   0.432

Here, bzip2 is competitive with 7z; in fact the native 7z format loses to it slightly, and the best result of all comes from using 7z to do the bzip2 compression. Weird, huh? But still, 7z is pretty good.

Wrapping Up

So what's not to like?

For one thing, 7z is slow. If you just want to quickly compress a file, go ahead and use gzip, or bzip2. There is a price to pay for better compression.
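Don't take my word for it; time the programs on your own files. Something like this, with big.tar standing in for whatever large file you have handy:

$ time gzip -9 big.tar
$ gunzip big.tar.gz          # put it back for a fair fight
$ time 7zr a -mx=9 big.tar.7z big.tar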

Then, too, there's a warning on the 7z man page:

DO NOT USE the 7-zip format for backup purpose on Linux/Unix because 7-zip does not store the owner/group of the file.

You can get around this by piping tar into 7z:

tar cf - directory_to_be_archived | 7z a -si directory.tar.7z

which creates the analog of a gzip/bzip2 tarball.
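Going the other way, the -so flag tells 7z to write to standard output, so unpacking is the mirror image:

$ 7z x -so directory.tar.7z | tar xf -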

And finally, the native 7z format isn't standard yet, so it's not going to be available everywhere, and might even vanish. But 7z and its compression algorithm, LZMA, are open source, so they are likely to stay around for a while. A few years ago bzip2 wasn't standard, and once upon a time neither was gzip. It's probably safe to compress your files to 7z format, but if you want to be really safe, use 7z to compress to gzip or bzip2 format.