Recovering files from tape

Hello friends,

We have recently had an issue with one of our tapes - the label would fail to verify and retrieve operations could not be started in cta. I manually read the label block from the tape, and instead of something valid like this:

mt -f /dev/st16 rewind
dd if=/dev/st16 bs=80
VOL1A01157                           CASTOR                                  0231+0 records in
1+0 records out
80 bytes (80 B) copied, 0.0157375 s, 5.1 kB/s

the output was actually binary data. We are not sure how it happened - the tape has 6T of data in the cta catalog, so it was definitely writing the data fine. I manually wrote the label out to the tape in hopes of fixing the issue:

echo "VOL1A01157                           CASTOR                                  023" > label.file
dd if=label.file of=/dev/st16 bs=80
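
A side note on the two commands above, offered as a sketch rather than a diagnosis: echo appends a trailing newline, so a file created this way from an 80-character string holds 81 bytes, and dd with bs=80 would pass it on as one full 80-byte record plus a 1-byte partial record. The check below touches no tape device; the padded string is a hypothetical stand-in for a real label, and the variable names are only illustrative.

```shell
# Hypothetical 80-character stand-in for a label (padded with spaces).
label=$(printf '%-80s' 'VOL1A01157')

# echo appends a newline; printf '%s' writes the string and nothing else.
echo_bytes=$(echo "$label" | wc -c | tr -d ' ')
printf_bytes=$(printf '%s' "$label" | wc -c | tr -d ' ')

echo "echo: $echo_bytes bytes, printf: $printf_bytes bytes"
# prints: echo: 81 bytes, printf: 80 bytes
```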

The label now reads back fine, and the label is verified by cta, but cta is now not able to read the files back (HDR1 is not verifying for the files).

I am looking at how I can restore some or all of the files from the tape using dd / cpio / tar / anything else. Is there a manual way to read the files back from tape? I am confident the files are there, maybe just at an incorrect offset. I would like to try to recover them manually if possible.

Any suggestions?

Warm Regards,

Denis

Denis,

I am not sure I understand the issue.

One thing is clear though: if you wrote the label back onto that tape, you cannot recall anything beyond that label (short of sending the tape to the vendor for data recovery, which is often a paid service).

Could you please explain what you are trying to do? Are you creating a procedure for recovering binary blobs (which could be valid files) in case some metadata is missing? Or do you have a real problem with a really problematic tape?

Best regards,

Vladimir

One more side remark - are you familiar with the cta-readtp command?

[root@tpsrv480 ~]# rpm -qi cta-readtp-4.6.0-1.el7.cern.x86_64
Name        : cta-readtp
Version     : 4.6.0
Release     : 1.el7.cern
Architecture: x86_64
Install Date: St 23. február 2022, 16:38:45 CET
Group       : Application/CTA
Size        : 731064
License     : GPLv3+
Signature   : (none)
Source RPM  : cta-4.6.0-1.el7.cern.src.rpm
Build Date  : Pi 18. február 2022, 10:24:54 CET
Build Host  : runner-vpwyqtnr-project-6044-concurrent-0
Relocations : (not relocatable)
Summary     : The command-line tool for reading files from a CTA tape.
Description :
CERN Tape Archive:
The command-line tool for reading files from a CTA tape.

It can be used to read and validate files on tape. If a tape can not be read on one drive, maybe it can be read on another one. This command helps to confirm that hypothesis before we launch repack.

Hello Vlado,

Thank you very much for your reply. I think you answered my question - if the label was written back to tape, the tape is now unreadable (aside from vendor data recovery). That makes the path forward clear. Also, thank you for pointing me towards cta-readtp. It appears we don’t have it installed on our hosts, but I will have a good play with it.

Sorry I did not explain the issue properly. We had an actual issue with a tape. In summary:

We issued a retrieve request, but it failed. Looking at the error, it was not validating the tape label.
We manually checked the tape label (read the tape blocks with dd) and saw binary output instead of a tape label. This confirmed there were label issues.
We manually wrote the tape label back to the tape (with dd). A risky move which did not pay off.

Thank you again,

Denis

Denis,

can you please elaborate on the phrase: “binary output instead of tape label”?

CTA tape file labels do contain some data that look a bit binary.

I would encourage you to check this discussion from 2020:

and then look at this document in particular:

https://gitlab.cern.ch/cta/CTA/-/blob/master/doc/TapeServer.pdf

where section 1.6.5.1 explains the format of the CTA tape file labels.

HTH. Best regards,

Vladimir Bahyl

Hello Vlado,

Thank you again for your reply! Sorry for the confusion.

When I say “binary output instead of tape label”, I mean the output of dd is not ASCII. If I load a healthy tape into a tape drive, and run:

dd if=/dev/st19 bs=80 count=

I will get back something like this:

VOL1A01157                           CASTOR                                  023

The troublesome tape, however, comes back with non-ASCII output, similar to what you would see if you ran:

dd if=/dev/urandom bs=80

That’s why I suspect it’s reading the actual DATA for some reason, and not VOL1. I actually did study that section 1.6.5.1. It’s quite useful.
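
For anyone repeating this check, the ASCII-versus-binary comparison can be scripted instead of eyeballed. This is a minimal sketch, assuming the first block has already been dumped to a regular file (for example by adding of=block0.bin to the dd command above) and that a healthy label is 80 bytes of printable ASCII starting with VOL1, as in the example output. The function name looks_like_vol1 and the file name block0.bin are hypothetical, not part of CTA.

```shell
# Hypothetical helper (not part of CTA): succeeds if the dump file holds at
# least 80 bytes, starts with "VOL1", and its first 80 bytes are printable ASCII.
looks_like_vol1() {
    dump_file="$1"
    # Require a full 80-byte block.
    size=$(head -c 80 "$dump_file" | wc -c | tr -d ' ')
    [ "$size" -eq 80 ] || return 1
    # od -c prints the first four bytes as characters; strip its padding.
    prefix=$(head -c 4 "$dump_file" | od -An -c | tr -d ' \n')
    [ "$prefix" = "VOL1" ] || return 1
    # Delete every printable ASCII byte (space through tilde), count the rest.
    leftover=$(head -c 80 "$dump_file" | LC_ALL=C tr -d ' -~' | wc -c | tr -d ' ')
    [ "$leftover" -eq 0 ]
}
```

It would be used as: looks_like_vol1 block0.bin && echo "looks like a VOL1 label". The function only reads the dump file, so it cannot change the tape position or contents.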

We did use the cta-tape-label binary to label our tapes. Our other tapes are working perfectly fine, no issues. I don’t suspect the cta software had anything to do with this tape going funny. I would put it down to operator error maybe.

Warm Regards,

Denis

Denis,

thank you for the clarification. Indeed, always make sure (with mt stat) that you are at BOT when reading VOL1.
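
The BOT check can be sketched as a small filter over the status text. This assumes the Linux mt-st output format, where the words following the "General status bits on" line include BOT when the tape is at the beginning; the function name at_bot is hypothetical, and /dev/st19 is simply the device used earlier in this thread.

```shell
# Hypothetical helper: reads `mt status` text on stdin and succeeds only if
# BOT appears as a whole word (assumes Linux mt-st style output).
at_bot() {
    grep -qw 'BOT'
}

# Intended use on a tape server (commented out because it needs a drive):
#   mt -f /dev/st19 rewind
#   mt -f /dev/st19 status | at_bot && dd if=/dev/st19 bs=80 count=1
```

Gating the dd on at_bot means a tape positioned somewhere in the data area is reported instead of being mistaken for a corrupted label.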

Bon weekend.

Vladimir

Hi Vlado,

Sorry for the late reply. Excellent tip regarding BOT!

Have a great week,

Cheers,

Denis