Saturday, February 10, 2007

The Perils of Packratitis

Well, the day started out promising. I'd slept well, read the paper, had breakfast, and headed downstairs with a cup of coffee to read my email.

Turn on the computer, and I'm greeted with a text screen including the words

Uncompressing Linux ...  OK, booting the kernel
[17179570.784000] Kernel panic - not syncing:  VFS : Unable to mount rootfs on unknown-block (0,0)
[17179570.784000]

and other such unpleasantness.

I remembered that a kernel upgrade had been pushed through last night, so I suspected that was the trouble. After all, it was a Fedora kernel upgrade that left me without sound for nearly a week a couple of years ago.

The new kernel is 2.6.15-28. (2.6.15 is the actual kernel version; the 28 is the latest Ubuntu fix.) So I shut off the box, turned it on again, and hit the ESC key when the option was offered. This brought up the GRUB menu, from which I selected the 2.6.15-27 kernel. Booting continued normally.

But what went wrong? How do I fix it? After searching many forums, I learned that kernel panics have many more possible causes than you'd expect. Mine? Eventually I was prompted to look at the GRUB menu, which is controlled by the file /boot/grub/menu.lst (that's an "ell", not a "one"). In there, I found these lines:

## ## End Default Options ##

title           Ubuntu, kernel 2.6.15-28-686
root            (hd0,0)
kernel          /vmlinuz-2.6.15-28-686 root=/dev/hda2 ro quiet splash
savedefault
boot

title           Ubuntu, kernel 2.6.15-28-686 (recovery mode)
root            (hd0,0)
kernel          /vmlinuz-2.6.15-28-686 root=/dev/hda2 ro single
boot

title           Ubuntu, kernel 2.6.15-27-686 (recovery mode)
root            (hd0,0)
kernel          /vmlinuz-2.6.15-27-686 root=/dev/hda2 ro single
initrd          /initrd.img-2.6.15-27-686
boot

title           Ubuntu, kernel 2.6.15-27-386
root            (hd0,0)
kernel          /vmlinuz-2.6.15-27-386 root=/dev/hda2 ro quiet splash
initrd          /initrd.img-2.6.15-27-386
savedefault
boot

See what's missing? The 15-28 kernel entries don't have the "initrd" lines. Simple enough, huh? This kernel update had problems. Maybe this manifested itself here by not updating the menu.lst file properly. Easy enough to fix, right? Just edit (as root) the menu.lst file, adding the line

initrd          /initrd.img-2.6.15-28-686

in the appropriate place.

Wrong. I got an edit error, suggesting that the file system was full. Oops.

/boot is a separate partition, so let's see how full it really is:

$ df /boot
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/hda1                97826     97862         0 100% /boot

Oh boy. What's going on?

Well, I suffer from packratitis: the inability to throw out something that might be useful. Turns out that I had several copies of the old kernel. Well, more than several. It looks like I had a copy of every kernel since I first started with Ubuntu. Both i686 and i386 versions, though the latter are completely unnecessary.

Go back to the old kernel, reboot, then go into synaptic and remove a few old kernels. The relevant packages are linux-image-version_number and linux-restricted-modules-version_number, where "version_number" is something like 2.6.15... This freed up enough space so I could successfully edit the menu.lst file.
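A quick way to see which kernel packages are candidates for removal is to list the installed linux-image packages and set aside the one that is currently running. The following is a minimal sketch, not the method used above (that was synaptic). The function name removal_candidates is made up for illustration, and it only prints names without removing anything.

```shell
# Reads package names on stdin and prints every versioned
# linux-image-<version> package except the one for the running kernel.
# $1 is the running kernel release, e.g. the output of `uname -r`.
# Metapackages such as linux-image-686 are skipped by the pattern.
removal_candidates() {
    running=$1
    grep -E '^linux-image-[0-9]+\.[0-9]' | grep -v -x -F "linux-image-$running"
}

# Typical use on a Debian/Ubuntu system (assumes dpkg is available):
#   dpkg -l 'linux-image-*' | awk '$1 == "ii" { print $2 }' \
#       | removal_candidates "$(uname -r)"
```

It is worth keeping at least one older kernel that is known to boot, so treat the output as a list to choose from rather than a list to delete wholesale.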

This time, when I booted, I got the message that the 2.6.15-28 compressed kernel was corrupted. Obviously the install hadn't completed. So I went back to synaptic, reinstalled linux-image-2.6.15-28-686 and linux-restricted-modules-2.6.15-28-686, and, once again, rebooted.

Success. Finally.
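The missing initrd lines that started all this can also be spotted mechanically. The sketch below assumes a GRUB legacy menu.lst laid out like the excerpt above, where each stanza starts with a title line. missing_initrd is a hypothetical helper name, not part of GRUB or Ubuntu.

```shell
# Print the title of every menu.lst stanza that has a "kernel" line
# but no "initrd" line.
# Usage: missing_initrd /boot/grub/menu.lst
missing_initrd() {
    awk '
        # Emit the previous stanza title if it lacked an initrd line.
        function report() {
            if (title != "" && has_kernel && !has_initrd) print title
        }
        /^[ \t]*#/ { next }            # skip comment lines
        $1 == "title" {
            report()
            title = $0
            sub(/^[ \t]*title[ \t]+/, "", title)
            has_kernel = 0
            has_initrd = 0
            next
        }
        $1 == "kernel" { has_kernel = 1 }
        $1 == "initrd" { has_initrd = 1 }
        END { report() }
    ' "$1"
}
```

A stanza without initrd is not always a mistake (a memtest entry, for example, normally has none), so the output is a list of entries to look at, not a list of errors.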

Lessons learned?

  • You don't need every version of the kernel that comes down the pike. The upgrade process leaves the old kernel lying around for good and sufficient reason, as we have seen, but if a kernel is working well you can delete most of the older ones.
  • I don't think we need any of the 386 kernels. Even the earliest Celeron chips were Pentium II-class (P6) processors, so the 686 kernel should be sufficient.
  • Every once in a while, you need to check all your partitions to see how full they are getting.
  • Don't blame the upgrade process — at least until you've checked out the possibility that you've fraked yourself.
  • You know, it's kind of fun working on a broken computer — just like the old days when we booted "alternative" OSes on an Apple //e.
  • During my (as opposed to the kernel's) panic, I downloaded Slackware (still reading, Pete?). I tried installing it on my alternative Linux box. I failed, but I think I know why. It's certainly a more hands-on distribution. I will get it installed on the alternative Linux box, and at some time in the future I may change over entirely.
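The "check all your partitions" lesson is easy to automate. Here is a minimal sketch that reads df output and reports any filesystem at or above a given percentage. The function name report_full and the default threshold of 90 are arbitrary choices for illustration, and mount points containing spaces are not handled.

```shell
# Reads `df -P` output on stdin and prints each mount point whose usage
# is at or above the threshold in $1 (percent, default 90).
# The -P flag asks df for the POSIX format, one filesystem per line,
# which is safer to parse than the default layout.
report_full() {
    awk -v limit="${1:-90}" '
        NR > 1 {                       # skip the header line
            use = $5
            sub(/%$/, "", use)         # "100%" -> "100"
            if (use + 0 >= limit + 0) print $6 " is " use "% full"
        }'
}

# Typical use:
#   df -P | report_full 90
```

Run from cron or just by hand now and then, something like this would have flagged /boot well before the kernel update ran out of room.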

So fun's over, and we're back to running regular, updated Ubuntu Dapper. Stay tuned for more adventures in Linux debugging.

2 comments:

Anonymous said...

Penguin Pete said:

That was kind of jarring... Just read through a post at random and it calls you by name.

Those init files can be tricky little devils. Just tonight, I gave a thought to making a few custom styles for TWM, just for the giggles. Except I hosed my xinitrc putting it from fluxbox to twm. Then I crashed X by starting Fluxbox slit programs on TWM. Then I went root to copy the /etc/ xinitrc template over to my home directory, and instead accidentally overwrote the master copy with the botched-up one. Yes, it was like Homer Simpson on fast forward for a while over here.

So, Ubuntu kp's on you one time, and you bail out to Slackware? Come on, where's your brand loyalty?

rcjhawk said...

Not being a devotee of NASCAR, I'm not beholden to any one brand, except for something Unix-like, maybe with a penguin.

I do, however, believe in having a plan B, and here was this poor little orphan Pentium IV just sitting there, begging to be noticed. So I noticed it.

Anyway, my first experience with Linux was with Slackware, way back in '95 or there'bouts.

And Slackware looks like there's more to tinker with, hence more to write about.