After upgrading from Mandrake Linux 10.0 to Mandriva Linux LE 2005 10.2, I found out that there is a serious issue with either Mandriva or with kernel 2.6.11.
This article describes the issues I faced after the upgrade, and how to work around them.
Background
Under Mandrake 10.0 and kernel 2.6.3, my Seagate IDE Travan ST20000A worked perfectly, with SCSI over IDE emulation. After the upgrade to 10.2, I noticed several things that were not right.
To give a bit of background, my backup script has three phases:
- The first phase is to collect a list of files to be backed up.
- The second phase is to do the actual backup, using the cpio command, with the -C 32768 option to increase the block size. This ensures faster backups, since it makes the tape stream.
- The last phase is to verify the backup. While it does not compare every backed up file with its original, it does the next best thing: it lists the names of the files on the tape and compares this list to the list of files that were backed up in phase two.
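The comparison in the last phase can be sketched as follows. This is a minimal illustration and not the actual backup script: the function name compare_lists and the list file names are hypothetical, and it assumes that both lists hold one file name per line.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if both lists contain the same names,
# regardless of the order in which they appear.
compare_lists() {
    tmp_a=$(mktemp) || return 2
    tmp_b=$(mktemp) || { rm -f "$tmp_a"; return 2; }
    sort "$1" > "$tmp_a"    # names collected before the backup
    sort "$2" > "$tmp_b"    # names listed back from the tape
    cmp -s "$tmp_a" "$tmp_b"
    rc=$?
    rm -f "$tmp_a" "$tmp_b"
    return $rc
}

# Possible use in a verify phase (file names are placeholders):
#   cpio -it -C 32768 -I /dev/st0 > tape.list
#   compare_lists backup.list tape.list || echo "Verify failed"
```

Sorting both lists first keeps the check independent of the order in which cpio lists the archive members.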
Issues
After the upgrade, I noticed the following issues:
- The tape was not autoconfigured. This was not a big deal: all I needed was to add hdd=ide-scsi to my boot command in GRUB's menu.lst (LILO is similar, with an append command).
- The weekly backup ran fine, but the verify phase of the script took 19 hours, as opposed to about 3 hours normally.
That drastic increase in time bothered me. So I did some more investigation, and found that a more serious problem was present: the drive could not read back anything that was written with a block size other than the default of 512 bytes.
This means that all my old tapes are unusable, since the data is unreadable.
So, I ran some tests to troubleshoot this issue.
First, I ran the same command that I use for backup, with the time command preceding it. It uses a 64K block size.
time cpio -ovH crc -C 65536 -O /dev/st0
t
t2
t
t2
157 blocks
0:42.13elapsed
This backed up two test files, t and t2. Judging by the block counts that cpio reports in these tests, each was about 10,000 blocks of 512 bytes (roughly 5 MB). It took only 42 seconds to do the backup.
Now, when attempting to read them, cpio will not recognize the tape as a valid cpio archive:
time cpio -itv -C 65536 -I /dev/st0
Found end of tape. Load next tape and press RETURN.
Then, I tried a smaller block size (2048 bytes):
time cpio -ovH crc -C 2048 -O /dev/st0
t
t2
t
t2
5001 blocks
1:24.68elapsed
Notice that the backup time has increased, since the tape does not stream as much.
Again, when trying to read the tape, cpio will not recognize the tape as a valid cpio archive:
time cpio -itv -C 65536 -I /dev/st0
Found end of tape. Load next tape and press RETURN.
Default Block Size
Now, when using the default block size, the backup takes much longer than before.
time cpio -ovH crc -O /dev/st0
t
t2
t
t2
20001 blocks
2:35.59elapsed
But, at least the tape is finally readable:
time cpio -it -I /dev/st0
t
t2
20001 blocks
2:15.33elapsed
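The three write runs can be compared as throughput. The sketch below assumes that cpio reports its block count in units of the block size given with -C (512 bytes by default), which the three counts above are consistent with: under that assumption each run wrote roughly 10 MB. The function name throughput_kib is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: prints throughput in KiB per second, given a block
# count, a block size in bytes, and an elapsed time in seconds.
throughput_kib() {
    awk -v blocks="$1" -v size="$2" -v secs="$3" \
        'BEGIN { printf "%.0f\n", blocks * size / 1024 / secs }'
}

throughput_kib 157 65536 42.13     # 64K blocks: prints 238
throughput_kib 5001 2048 84.68     # 2K blocks: prints 118
throughput_kib 20001 512 155.59    # default 512-byte blocks: prints 64
```

By this measure the 64K run wrote nearly four times as fast as the run with the default block size.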
Using strace
Next, I used the strace command, and was able to confirm that any read from a tape that was written with a non-default block size returns a string of null bytes.
Here is the write, using tar cvf /dev/st0 directory, which defaults to 10K blocks:
read(4, "X11R6/lib/X11/fonts/100dpi/courB"..., 10240) = 10240
write(3, "X11R6/lib/X11/fonts/100dpi/courB"..., 10240) = 10240
And here is the read, using tar tvf /dev/st0:
open("/dev/st0", O_RDONLY|O_LARGEFILE) = 3
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 10240) = 10240
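The same symptom can be checked without strace by reading one block and testing whether it contains anything other than null bytes. This is only a sketch: the function name first_block_is_nul is hypothetical, and the 10240-byte block size in the usage comment simply matches tar's default shown above.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the first block read from $1 (using a
# block size of $2 bytes) is non-empty and consists only of NUL bytes.
first_block_is_nul() {
    tmp=$(mktemp) || return 2
    dd if="$1" of="$tmp" bs="$2" count=1 2>/dev/null
    total=$(wc -c < "$tmp")
    nonnul=$(tr -d '\000' < "$tmp" | wc -c)
    rm -f "$tmp"
    [ "$total" -gt 0 ] && [ "$nonnul" -eq 0 ]
}

# Possible use after writing a tape with tar:
#   first_block_is_nul /dev/st0 10240 && echo "Tape reads back as null bytes"
```

The block is copied to a temporary file first so that the tape is read only once.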
Bugzilla
After writing this article, I noticed that Bugzilla has an almost identical issue, albeit on Fedora Core 3, and an earlier version of the kernel (2.6.10). The symptom is that old tar tapes are not readable after a Fedora kernel upgrade. This confirms my finding that tar does not work because it uses a 10K block size by default.
Craig Goodyear, the reporter of the above Bugzilla issue, has posted on the mailing lists stating that disabling DMA on the disks solves this problem. I tried adding ide1=nodma in GRUB's configuration, but it did not seem to take effect, perhaps because the IDE controller that has the tape (/dev/hdd) also has a CD-RW on it (/dev/hdc). Upon checking the IDE.txt that comes with the kernel source, it turns out that disabling DMA for a single interface is not possible, and DMA has to be turned off for the whole IDE subsystem.
I emailed Craig, as well as Dave Jones, the Red Hat developer who has this issue assigned. I am still awaiting replies.
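When a boot option does not seem to take effect, a first step is to confirm that the kernel actually received it. The sketch below looks for an exact option in /proc/cmdline. The function name has_boot_param is hypothetical. A match only shows that the option was passed at boot, not that DMA was really disabled.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the exact option $1 appears on the
# kernel command line. $2 optionally names a file to read instead of
# /proc/cmdline.
has_boot_param() {
    param=$1
    cmdline_file=${2:-/proc/cmdline}
    # Put each option on its own line, then look for a whole-line match.
    tr ' ' '\n' < "$cmdline_file" | grep -qxF -- "$param"
}

# Possible use after rebooting:
#   has_boot_param ide1=nodma || echo "ide1=nodma was not passed to the kernel"
```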
Other Issues
I also noticed some other issues and observations:
- If I press Interrupt (Ctrl-C) when a tape operation is being done, I get back to the prompt immediately. Previously, the command blocked until the tape did a rewind.
- Sometimes, the tape would give an I/O error, and the mt -f /dev/st0 status command would return only IM_REP_EN as opposed to other status bits, like BOT ONLINE. To reset the tape without rebooting, enter the command mt -f /dev/st0 load.
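The check and reset described in the last item can be scripted. The sketch below assumes that a healthy drive reports the ONLINE status bit, as described above. The function name tape_online is hypothetical, and the example status line in the comment only illustrates the bits mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the status text in $1 (the output of
# "mt -f DEVICE status") contains the ONLINE status bit, for example in a
# line ending in "BOT ONLINE IM_REP_EN".
tape_online() {
    printf '%s\n' "$1" | grep -qw 'ONLINE'
}

# Possible use before starting a backup:
#   status=$(mt -f /dev/st0 status 2>&1)
#   tape_online "$status" || mt -f /dev/st0 load
```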
Solution: Use IDE Device
The solution I ended up with was to revert to configuring the tape drive as an IDE tape, rather than using SCSI over IDE emulation. To do this, I replaced ide-scsi with ide-tape in the boot command in GRUB.
I also added the following lines to /etc/rc.local so that they are executed on boot.
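echo "Loading IDE tape module"
modprobe ide-tape
echo "Creating tape devices"
# Character major 37 is ide-tape; minor 0 rewinds on close, minor 128 does not.
# Create each node only if it does not already exist.
[ -c /dev/ht0 ] || mknod /dev/ht0 c 37 0
[ -c /dev/nht0 ] || mknod /dev/nht0 c 37 128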
Instead of /dev/st0 I use /dev/ht0 as the tape device. The tape now works with non-default block size, and my old tapes are readable.
Disadvantages
There is a disadvantage to this workaround, of course. The command mt -f /dev/ht0 status does not return a detailed report on the media density and drive status (e.g. write protected, beginning of tape, etc.).
My backups relied on the above command to check that a tape was loaded in the drive, and that it was write enabled. I could do the first part by issuing an mt -f /dev/ht0 load and checking the return status. If it is 0, a tape is loaded; if it is 1, no tape is loaded and a message (drive not ready) is printed.
So far, I have not found a way to check that the tape is write enabled. Of course, I could try using dd to write a block and see if it is successful, but that is too drastic at the moment.
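The tape-loaded check described above could look like the following in a backup script. It is a minimal sketch that relies only on the return status of mt load as observed above. The function name tape_loaded is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if "mt load" on device $1 returns 0, which
# was observed above to mean that a tape is loaded. A status of 1 came with
# a "drive not ready" message.
tape_loaded() {
    mt -f "$1" load >/dev/null 2>&1
}

# Possible use at the start of a backup:
#   tape_loaded /dev/ht0 || { echo "No tape in drive" >&2; exit 1; }
```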
Comments
Anonymous (not verified)
On Fedora
Thu, 2006/04/27 - 03:17
On Fedora Core 4 and 5, do the following:
1. In /boot/grub/menu.lst, add the following to the kernel line:
ide=nodma hdx=ide-scsi
2. Reboot.
3. Try "tar czfv /dev/st0 /home".
4. You should then be able to list the files with "tar tzfv /dev/st0".
Tony Groves (not verified)
Excellent article
Fri, 2007/06/22 - 07:55
Thanks for that analysis. I had been trying to read, on my home machine (kernel 2.6.18, ide-tape), tapes written on a work machine (ide-scsi, block size 5120) and getting nowhere. Now I know what's happening.