Occasional I/O errors on LTO-10 drives

At Fermilab, we have recently placed LTO-10 drives in production. A Spectra TFinity Plus library with these drives has not shown any problems. An IBM TS4500 library is showing occasional issues.

After a drive has written for anywhere from 30 minutes to several hours, the following errors occur:

Failed ST write with crc32c in DriveGeneric::writeBlock Errno=5: Input/output error
Failed ST write with crc32c in DriveGeneric::writeBlock Errno=5: Input/output error
SCSI error in DriveLTO::isEncryptionCapEnabled status=0 host_status=0xb driver_status=0: SCSI command failed with host_status: SOFT ERROR
Failed ST ioctl (MTUNLOAD) in DriveGeneric::unloadTape Errno=5: Input/output error
Failed SG_IO ioctl in DriveGeneric::getTapeAlerts Errno=5: Input/output error
Failed SG_IO ioctl in DriveLTO::isEncryptionCapEnabled Errno=5: Input/output error
Failed SG_IO ioctl in DriveLTO::isEncryptionCapEnabled Errno=5: Input/output error

The last few errors occur during a cleaner/unload operation. After that, the drive goes down and the tape is left in the drive. I cannot immediately dismount it using cta-smc:

[root@tpsrvg2603 cta]# cta-smc -d -D 39

[root@tpsrvg2603 cta]# cta-smc -d -D 39
smc_dismount: read_elem_status of  on drive 39 detected Drive Not Unloaded
smc_dismount: read_elem_status of  on drive 39 detected Drive Not Unloaded
smc_dismount: read_elem_status of  on drive 39 detected Drive Not Unloaded
smc_dismount: read_elem_status of  on drive 39 detected Drive Not Unloaded
smc_dismount: read_elem_status of  on drive 39 detected Drive Not Unloaded
smc_dismount: read_elem_status of  on drive 39 detected Drive Not Unloaded
smc_dismount: read_elem_status of  on drive 39 detected Drive Not Unloaded
smc_dismount: read_elem_status of  on drive 39 detected Drive Not Unloaded
smc_dismount: read_elem_status of  on drive 39 detected Drive Not Unloaded
smc_dismount: read_elem_status of  on drive 39 detected Drive Not Unloaded
smc_dismount: read_elem_status of  on drive 39 detected Drive Not Unloaded
smc_dismount: read_elem_status of  on drive 39 detected Drive Not Unloaded
smc_dismount: read_elem_status of  on drive 39 detected Drive Not Unloaded
smc_dismount: read_elem_status of  on drive 39 detected Drive Not Unloaded
smc_dismount: read_elem_status of  on drive 39 detected Drive Not Unloaded
smc_dismount: read_elem_status of  on drive 39 detected Drive Not Unloaded
smc_dismount: read_elem_status of  on drive 39 detected Drive Not Unloaded
smc_dismount: read_elem_status of  on drive 39 detected Drive Not Unloaded
smc_dismount: read_elem_status of  on drive 39 detected Drive Not Unloaded
smc_dismount: read_elem_status of  on drive 39 detected Drive Not Unloaded
smc_dismount: SR018 - demount of  on drive 39 failed : Drive Not Unloaded

I have to run mt -f [device] rewoffl first; after that, cta-smc -d -D [drive] works.
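For anyone who needs to repeat this workaround, here is a minimal sketch of the two steps as a small Python wrapper. It only uses the two commands quoted above (mt ... rewoffl, then cta-smc -d -D ...). The function names, the dry-run default, and the example device path /dev/nst2 are assumptions for illustration and are not part of CTA.

```python
import shlex
import subprocess


def recovery_commands(device, drive_ordinal):
    """Return the two workaround commands, in the order they must run."""
    return [
        # Step 1: rewind the tape and take the drive offline (unload).
        ["mt", "-f", device, "rewoffl"],
        # Step 2: ask the library to dismount the cartridge from the drive.
        ["cta-smc", "-d", "-D", str(drive_ordinal)],
    ]


def recover_stuck_tape(device, drive_ordinal, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in recovery_commands(device, drive_ordinal):
        print("+ " + shlex.join(cmd))
        if not dry_run:
            # check=True stops the sequence if the rewind/unload step fails.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical device path; drive ordinal 39 is the one from the post.
    recover_stuck_tape("/dev/nst2", 39)
```

By default the wrapper only prints what it would run, so it can be checked against the right device and drive ordinal before passing dry_run=False.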

This issue is occurring with both IBM SBN0 and the newest T3S0 firmware.

Has anyone else seen these issues? Are there any suggestions?

Hi Tim,

Please note that (as mentioned in a different thread to PIC), CERN does not have LTO-10 tape drives.

Still, I will try to give you some general comments, which might (or might not) be relevant.

  1. Regarding the error messages you mention, I guess these are from the cta-taped.log file. Is there more (sense) information in the /var/log/messages file around that time? Occasionally, there is additional information there that suggests whether the problem is with the drive or with the tape.
  2. cta-smc or even cta-rmcd will not dismount the tape cartridge if the drive is not unloaded, so this behavior is correct. The real question is why the session failed so badly that it could not rewind the tape.
  3. Did you try to capture the drive logs (either with itdt or with the library GUI) and open a ticket with IBM so they can inspect them? Maybe they will see something there.

Best regards,

Vladimir Bahyl
CERN

  1. The following errors generally appear in /var/log/messages around that time:
[Sat May 16 23:43:14 2026] mpt3sas_cm0: log_info(0x31120101): originator(PL), code(0x12), sub_code(0x0101)
[Sat May 16 23:43:14 2026] st 1:0:2:0: [st2] Error b0000 (driver bt 0, host bt 0xb).
[Sat May 16 23:43:15 2026] mpt3sas_cm0: log_info(0x31110e05): originator(PL), code(0x11), sub_code(0x0e05)
[Sat May 16 23:43:22 2026] st 1:0:2:0: Power-on or device reset occurred
[Sat May 16 23:43:37 2026] st 1:0:2:0: [st2] Power on/reset recognized.

I believe that whatever error is occurring is serious enough to trigger a reset of the drive, or the bus.

  2. Understood.
  3. I provided a drive dump to IBM from the TS4500 CLI. They said they did not see any errors, and suggested I put the tape and drive back in service to try the operation again.

These errors are less frequent with the newer T3S0 firmware, but they have not gone away completely.
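As an aside, the numeric fields in the kernel messages quoted above can be split apart mechanically. The sketch below assumes the bit layouts used by the Linux mpt3sas driver for log_info (bus type, originator, code, sub_code) and by the Linux SCSI layer for the result word printed by st (status, host and driver bytes). The function names are made up for illustration, and the vendor-specific meaning of the code/sub_code values is not interpreted.

```python
# Originator field values as printed by the mpt3sas driver.
ORIGINATORS = {0x0: "IOP", 0x1: "PL", 0x2: "IR"}

# Host byte values from the Linux SCSI layer (subset).
HOST_BYTES = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x03: "DID_TIME_OUT",
    0x07: "DID_ERROR",
    0x08: "DID_RESET",
    0x0B: "DID_SOFT_ERROR",
}


def decode_log_info(value):
    """Split an mpt3sas log_info word into its four fields."""
    return {
        "bus_type": (value >> 28) & 0xF,
        "originator": ORIGINATORS.get((value >> 24) & 0xF, "unknown"),
        "code": (value >> 16) & 0xFF,
        "sub_code": value & 0xFFFF,
    }


def decode_scsi_result(value):
    """Split a SCSI result word (e.g. 'Error b0000' from st) into bytes."""
    host = (value >> 16) & 0xFF
    return {
        "status": value & 0xFF,
        "host": host,
        "host_name": HOST_BYTES.get(host, "unknown"),
        "driver": (value >> 24) & 0xFF,
    }


if __name__ == "__main__":
    for word in (0x31120101, 0x31110E05):
        print(hex(word), decode_log_info(word))
    print(hex(0xB0000), decode_scsi_result(0xB0000))
```

With these assumptions, 0x31120101 splits into originator PL, code 0x12, sub_code 0x0101, matching what the driver already prints, and b0000 has host byte 0xb, which is the same SOFT ERROR host status reported in the cta-taped log.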

Hi Tim,

Thank you for the additional details.

This does indeed look like a firmware issue, because whatever is happening is causing the drive to reset / lose the connection.

This is one for IBM, even though I understand they do not see any errors, because the dump you provided to them was taken AFTER the drive reset (so the internal drive memory was most likely emptied).

Good luck with LTO-10.

Best regards,

Vladimir