I have written some Enstore-reading code and tested it using MHVTL. However, when trying it in the wild I am running into issues.
First caveat: things are not set up 100% correctly, in that we have the tapes marked as LTO-8 in the CTA database but they are really M8 tapes being read by LTO-8 drives. I assume this doesn’t matter and that the M8 designation is just used to set the capacity at 9 TB instead of 12 TB.
As you probably know, Enstore has a similar label format to CTA. If I read the label with dd, I get 84 bytes instead of 80, so I assume the last four bytes are a checksum (it looks like one). Our labels don’t specify whether there is a crc32 checksum or not, but I assume I should just enable it for reads only?
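To make the 84-versus-80 observation concrete, here is a minimal Python sketch of how I would check whether the trailing four bytes are a checksum of the first 80. It assumes the algorithm is CRC32C (which I understand LBPM 2 to denote) and tries both byte orders, since I do not know which one is used on tape. The function names are mine, not anything from CTA or Enstore.

```python
def crc32c(data: bytes) -> int:
    """Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def split_record(record: bytes, trailer_len: int = 4):
    """Split a raw tape record into (payload, trailer)."""
    if len(record) <= trailer_len:
        raise ValueError("record too short to carry a trailer")
    return record[:-trailer_len], record[-trailer_len:]


def trailer_matches(record: bytes) -> bool:
    """True if the last 4 bytes equal CRC32C(payload) in either byte order.

    Assumption: the trailer is CRC32C; another algorithm would need a
    different checksum function here.
    """
    payload, trailer = split_record(record)
    expected = crc32c(payload)
    return trailer in (expected.to_bytes(4, "little"),
                       expected.to_bytes(4, "big"))
```

Feeding it the 84 bytes captured with dd would either confirm or rule out the CRC32C guess.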
Then, when CTA tries to read the first real data block (I think), I get an error like this in journalctl:
Jul 29 18:43:34 gmv18018.fnal.gov cta-taped[142940]: LVL="ERROR" PID="142940" TID="142989" MSG="In RecallReportPacker::ReportError::execute(): failing retrieve job after exception." thread="RecallReportPacker" tapeDrive="LTO8D0" tapeVid="VR5775" mountId="66" failureLog="Jul 29 18:43:34.367608 gmv18018 In DriveGeneric::readBlock: Failed ST read (with checksum) Errno=12: Cannot allocate memory" fileId="161"
The "Cannot allocate memory" seems to come from the system-level read (I get the same sort of error when trying to read these blocks back with dd). If I don’t set the read-only crc32 mode, I get the same error but without the "(with checksum)" part, so I can identify which code path in DriveGeneric is being taken.
So what I am thinking is that I need to get the block size and checksumming settings on the drive right. For instance, when reading back a checksummed data block (256k), should I set the block size to 256k or to 256k + 4 bytes? Or does the low-level block reading take care of the extra 4 bytes?
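The arithmetic behind that question, as a small Python sketch. As far as I understand the Linux st driver documentation, in variable-block mode a read() whose byte count is smaller than the block on tape fails with ENOMEM, which would fit the symptom if the drive hands back payload plus checksum. That is my guess, not a confirmed diagnosis, and the helper name is hypothetical.

```python
KIB = 1024


def read_buffer_size(payload_bytes: int, lbp_read_enabled: bool,
                     lbpil: int = 4) -> int:
    """Bytes a read() must ask for to hold one full block.

    Assumption: with LBP_R enabled the drive returns the payload followed
    by LBPIL bytes of protection information in the same block.
    """
    return payload_bytes + (lbpil if lbp_read_enabled else 0)


print(read_buffer_size(256 * KIB, lbp_read_enabled=False))  # 262144
print(read_buffer_size(256 * KIB, lbp_read_enabled=True))   # 262148
print(read_buffer_size(80, lbp_read_enabled=True))          # 84, like the label
```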
In MM, Jorge asked to see the output of sdparm for the tape drives, so here it is:
[root@gmv18018 tmp]# lsscsi -g
[2:0:0:0] disk ATA HGST HUS722T1TAL WA07 /dev/sda /dev/sg0
[10:0:0:0] tape IBM ULT3580-TD8 N4Q0 /dev/st0 /dev/sg1
[10:0:0:1] mediumx IBM 03584L32 1705 /dev/sch0 /dev/sg2
[11:0:0:0] tape IBM ULT3580-TD8 N4Q0 /dev/st1 /dev/sg3
[11:0:0:1] mediumx IBM 03584L32 1705 /dev/sch1 /dev/sg4
[root@gmv18018 tmp]# sdparm --page=10,240 /dev/sg1
/dev/sg1: IBM ULT3580-TD8 N4Q0 [tape]
Control data protection (SSC) mode page:
LBPM 2 [cha: y, def: 0, sav: 0]
LBPIL 4 [cha: y, def: 0, sav: 0]
LBP_W 0 [cha: y, def: 0, sav: 0]
LBP_R 1 [cha: y, def: 0, sav: 0]
RBDP 0 [cha: n, def: 0, sav: 0]
[root@gmv18018 tmp]# sdparm --page=10,240 /dev/sg3
/dev/sg3: IBM ULT3580-TD8 N4Q0 [tape]
Control data protection (SSC) mode page:
LBPM 0 [cha: y, def: 0, sav: 0]
LBPIL 0 [cha: y, def: 0, sav: 0]
LBP_W 0 [cha: y, def: 0, sav: 0]
LBP_R 0 [cha: y, def: 0, sav: 0]
RBDP 0 [cha: n, def: 0, sav: 0]
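To make the difference between the two drives easier to see, here is a small Python sketch that parses the two sdparm dumps above and lists the fields whose current values differ. The parsing is only what these particular lines need, and the function names are mine.

```python
import re

SG1 = """\
LBPM 2 [cha: y, def: 0, sav: 0]
LBPIL 4 [cha: y, def: 0, sav: 0]
LBP_W 0 [cha: y, def: 0, sav: 0]
LBP_R 1 [cha: y, def: 0, sav: 0]
RBDP 0 [cha: n, def: 0, sav: 0]
"""

SG3 = """\
LBPM 0 [cha: y, def: 0, sav: 0]
LBPIL 0 [cha: y, def: 0, sav: 0]
LBP_W 0 [cha: y, def: 0, sav: 0]
LBP_R 0 [cha: y, def: 0, sav: 0]
RBDP 0 [cha: n, def: 0, sav: 0]
"""


def parse_fields(text: str) -> dict:
    """Map each field name to its current value (the number before '[')."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"\s*(\w+)\s+(-?\d+)\s+\[", line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def diff_fields(a: dict, b: dict) -> dict:
    """Fields whose values differ, as {name: (value_in_a, value_in_b)}."""
    return {k: (a[k], b.get(k)) for k in a if a[k] != b.get(k)}


print(diff_fields(parse_fields(SG1), parse_fields(SG3)))
```

So /dev/sg1 (the drive that had the tape mounted) currently has LBPM, LBPIL and LBP_R set, while /dev/sg3 is still at the defaults.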
Thanks for any insight.