When building bootdisks, the first few tries often will not boot. The general
approach to building a root disk is to assemble components from your existing
system, and try to get the diskette-based system to the point where it displays
messages on the console. Once it starts talking to you, the battle is half over
because you can see what it is complaining about, and you can fix individual
problems until the system works smoothly. If the system just hangs with no
explanation, finding the cause can be difficult. The recommended procedure for
investigating a system that will not talk to you is as follows:

You may see a message like this:

    Kernel panic: VFS: Unable to mount root fs on XX:YY

This is a common problem and it has only a few causes. First, check the device
XX:YY against the list of device codes in
/usr/src/linux/Documentation/devices.txt. If it is
incorrect, you probably didn't do an rdev -R, or you did it
on the wrong image. If the device code is correct, then carefully check the
device drivers compiled into your kernel. Make sure it has floppy disk, ramdisk
and ext2 filesystem support built in.

If you see many errors like:

    end_request: I/O error, dev 01:00 (ramdisk), sector NNN

this is an I/O error from the ramdisk driver, usually because the kernel is
trying to write beyond the end of the device. The ramdisk is too small to hold
the root filesystem. Check your bootdisk kernel's initialization messages for a
line like:

    Ramdisk driver initialized : 16 ramdisks of 4096K size

Check this size against the uncompressed size of the root filesystem. If the
ramdisks aren't large enough, make them larger.

Check that the root disk actually contains the directories you think
it does. It is easy to copy at the wrong level so that you end up
with something like /rootdisk/bin instead of
/bin on your root diskette.
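One way to catch a copy made at the wrong level is to list which expected top-level directories are missing from the mounted root diskette. The following is a minimal sketch, not a standard tool: the function name check_layout, the directory list, and the /mnt mount point in the usage comment are all assumptions to adapt to your own disk.

```sh
# check_layout ROOT: print each expected top-level directory that is
# missing under ROOT. The directory list is an assumption; edit it to
# match what your root diskette is supposed to contain.
check_layout() {
    root=$1
    for dir in bin dev etc lib proc sbin tmp usr var; do
        [ -d "$root/$dir" ] || echo "missing: /$dir"
    done
}

# Example (assumes the root diskette filesystem is mounted on /mnt):
#   check_layout /mnt
```

If the copy went in one level too deep, nearly every directory is reported missing, which is a strong hint to look for a stray subdirectory such as /rootdisk.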

Check that /lib/libc.so exists on the root disk and is the same link that
appears in the /lib directory on your hard disk.

Check that any symbolic links in the /dev directory of your existing system
also exist on your root diskette filesystem, where those links point to devices
that you have included on your root diskette. In particular, /dev/console
links are essential in many cases.

Check that you have included the /dev/tty1, /dev/null, /dev/zero,
/dev/mem, /dev/ram and /dev/kmem files.
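The device file check can be scripted. This sketch assumes the root filesystem is mounted somewhere you pass in; check_devs is a hypothetical helper name. It only tests that each name exists and does not verify that the major and minor numbers are right.

```sh
# check_devs ROOT: report which of the device files listed above are
# missing from ROOT/dev. Existence only (-e); device numbers are not
# checked.
check_devs() {
    root=$1
    for dev in tty1 null zero mem ram kmem; do
        [ -e "$root/dev/$dev" ] || echo "missing: /dev/$dev"
    done
}

# Example (assumes the root diskette filesystem is mounted on /mnt):
#   check_devs /mnt
```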

Check your kernel configuration: support for all resources required up to the
login point must be built in, not compiled as modules. In particular, ramdisk
and ext2 support must be built in.
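If you still have the .config file the bootdisk kernel was built from, a quick grep shows whether the required drivers are built in or only modules. This is a sketch under assumptions: check_builtin is a hypothetical helper, and the option names CONFIG_BLK_DEV_FD (floppy), CONFIG_BLK_DEV_RAM (ramdisk) and CONFIG_EXT2_FS (ext2) should be verified against your kernel version.

```sh
# check_builtin CONFIG: report options that are not set to "y" (built in)
# in a kernel .config file. A value of "m" means module, which is not
# enough for a bootdisk kernel.
check_builtin() {
    config=$1
    for opt in CONFIG_BLK_DEV_FD CONFIG_BLK_DEV_RAM CONFIG_EXT2_FS; do
        grep -q "^$opt=y\$" "$config" || echo "not built in: $opt"
    done
}

# Example (the path is an assumption about where your kernel source lives):
#   check_builtin /usr/src/linux/.config
```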

Check that your kernel root device and ramdisk settings are correct.
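Two small helpers can support this check. Both are hypothetical names in a sketch, not existing tools. dev_code prints a device node's major and minor numbers in the two-digit hexadecimal form that the kernel messages quoted earlier appear to use, and assumes GNU stat. ramdisk_fits compares the uncompressed size of a gzip-compressed root filesystem image against a ramdisk size in kilobytes, which you read from the kernel's initialization messages.

```sh
# dev_code NODE: print NODE's major:minor numbers as two-digit hex.
# Assumes GNU stat, whose %t and %T formats give hex major and minor.
dev_code() {
    printf '%02x:%02x\n' "0x$(stat -c %t "$1")" "0x$(stat -c %T "$1")"
}

# ramdisk_fits IMAGE_GZ RAMDISK_KB: check that the uncompressed image
# fits in a ramdisk of RAMDISK_KB kilobytes. Returns 1 if it does not.
ramdisk_fits() {
    bytes=$(gzip -dc "$1" | wc -c)
    need_kb=$(( (bytes + 1023) / 1024 ))   # round up to whole kilobytes
    if [ "$need_kb" -le "$2" ]; then
        echo "ok: need ${need_kb}K, ramdisk is ${2}K"
    else
        echo "too small: need ${need_kb}K, ramdisk is ${2}K"
        return 1
    fi
}

# Examples (device and file names are assumptions):
#   dev_code /dev/fd0
#   ramdisk_fits rootfs.gz 4096
```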

Once these general aspects have been covered, here are some more
specific files to check:

Make sure init is included as /sbin/init or /bin/init, and make sure it is
executable.

Run ldd init to check init's libraries. Usually this is just libc.so, but
check anyway. Make sure you included the necessary libraries and loaders.

Make sure you have the right loader for your libraries: ld.so for a.out or
ld-linux.so for ELF.
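The ldd check can be turned into a comparison against the root diskette. This sketch assumes ELF-style ldd output, where each line names a library or loader by absolute path, and assumes the root filesystem is mounted at a directory you pass in. ldd_paths and missing_libs are hypothetical helper names.

```sh
# ldd_paths: read ldd-style output on stdin and print the first absolute
# path on each line (a library or the loader). Lines without an absolute
# path are skipped.
ldd_paths() {
    awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^\//) { print $i; break } }'
}

# missing_libs ROOT BINARY: list the files BINARY needs, according to
# the host's ldd, that do not exist under ROOT.
missing_libs() {
    ldd "$2" | ldd_paths | while read -r lib; do
        [ -e "$1$lib" ] || echo "missing: $lib"
    done
}

# Example (assumes the root diskette filesystem is mounted on /mnt):
#   missing_libs /mnt /mnt/sbin/init
```

The same helper works for getty, login, the shell, or any other binary you copied to the root disk.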

Check the /etc/inittab on your bootdisk filesystem for the calls to getty (or
some getty-like program, such as agetty, mgetty or getty_ps). Double-check
these against your hard disk inittab. Check the man pages of the program you
use to make sure these make sense. inittab is possibly the trickiest part
because its syntax and content depend on the init program used and the nature
of the system. The only way to tackle it is to read the man pages for init and
inittab and work out exactly what your existing system is doing when it boots.
Check to make sure /etc/inittab has a system initialization entry. This should
contain a command to execute the system initialization script, which must
exist.
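The system initialization entry can be checked mechanically. This sketch assumes a SysV-style inittab, with colon-separated lines of the form id:runlevels:action:process and the action sysinit. Other init programs use different formats, so treat it as illustrative. sysinit_cmds and check_sysinit are hypothetical helper names.

```sh
# sysinit_cmds INITTAB: print the command of every sysinit entry,
# ignoring comment lines.
sysinit_cmds() {
    awk -F: '!/^#/ && $3 == "sysinit" { print $4 }' "$1"
}

# check_sysinit ROOT: confirm that ROOT/etc/inittab has a sysinit entry
# and that the script each entry runs exists under ROOT and is executable.
check_sysinit() {
    cmds=$(sysinit_cmds "$1/etc/inittab")
    [ -n "$cmds" ] || { echo "no sysinit entry"; return 1; }
    echo "$cmds" | while read -r script rest; do
        [ -x "$1$script" ] || echo "missing or not executable: $script"
    done
}

# Example (assumes the root diskette filesystem is mounted on /mnt):
#   check_sysinit /mnt
```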

As with init, run ldd on your getty to see what it needs, and make sure the
necessary library files and loaders were included in your root filesystem.

Be sure you have included a shell program (e.g., bash or ash) capable of
running all of your rc scripts.

If you have a /etc/ld.so.cache file on your rescue disk, remake it.

If init starts, but you get a message like:

    Id xxx respawning too fast: disabled for 5 minutes

it is coming from init, usually indicating that getty or login is dying as
soon as it starts up. Check the getty and login executables and the libraries
they depend upon. Make sure the invocations in /etc/inittab are correct. If
you get strange messages from getty, it may mean the calling form in
/etc/inittab is wrong.
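A first step in tracking down a respawn failure is to confirm that each respawn entry names a program that exists on the root disk. This sketch assumes a SysV-style inittab (id:runlevels:action:process); respawn_progs and check_respawn are hypothetical helper names. It checks only that the program is present and executable, not that its arguments are right.

```sh
# respawn_progs INITTAB: print "id program" for every respawn entry,
# where program is the first word of the process field.
respawn_progs() {
    awk -F: '!/^#/ && $3 == "respawn" { split($4, w, " "); print $1, w[1] }' "$1"
}

# check_respawn ROOT: report respawn entries whose program is missing
# or not executable under ROOT.
check_respawn() {
    respawn_progs "$1/etc/inittab" | while read -r id prog; do
        [ -x "$1$prog" ] || echo "Id $id: $prog missing or not executable"
    done
}

# Example (assumes the root diskette filesystem is mounted on /mnt):
#   check_respawn /mnt
```

The id printed here is the same one that appears in init's "respawning too fast" message, so the two can be matched up directly.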

If you get a login prompt, and you enter a valid login name but the system
immediately prompts you for another login name, the problem may be with PAM or
NSS. See Section 4.4. The problem may also be that you use shadow passwords
and didn't copy /etc/shadow to your bootdisk.
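The shadow password case is easy to test for. This sketch assumes the usual convention that an "x" in the second field of /etc/passwd means the real password is kept in /etc/shadow. check_shadow is a hypothetical helper name.

```sh
# check_shadow ROOT: if any account in ROOT/etc/passwd has "x" as its
# password field, warn when ROOT/etc/shadow is absent.
check_shadow() {
    if awk -F: '$2 == "x" { found = 1 } END { exit !found }' "$1/etc/passwd"; then
        [ -f "$1/etc/shadow" ] || echo "passwd uses shadow passwords but /etc/shadow is missing"
    fi
}

# Example (assumes the root diskette filesystem is mounted on /mnt):
#   check_shadow /mnt
```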

If you try to run some executable, such as df, which is on your rescue disk
but it yields a message like "df: not found", check two things: (1) make sure
the directory containing the binary is in your PATH, and (2) make sure you
have the libraries (and loaders) the program needs.