Problems with containerized instructions

I didn’t see any issues with MHVTL yesterday. Today I rebuilt the image and went through the script line by line, and I still see no issues:

$ systemctl list-units | grep vtl
mhvtl-load-modules.service                                         loaded active exited    Load mhvtl modules
vtllibrary@10.service                                              loaded active running   Robot Library Daemon for Virtual Tape & Robot Library
vtllibrary@30.service                                              loaded active running   Robot Library Daemon for Virtual Tape & Robot Library
vtltape@11.service                                                 loaded active running   Tape Daemon for Virtual Tape & Robot Library
vtltape@12.service                                                 loaded active running   Tape Daemon for Virtual Tape & Robot Library
vtltape@13.service                                                 loaded active running   Tape Daemon for Virtual Tape & Robot Library
vtltape@14.service                                                 loaded active running   Tape Daemon for Virtual Tape & Robot Library
vtltape@31.service                                                 loaded active running   Tape Daemon for Virtual Tape & Robot Library
vtltape@32.service                                                 loaded active running   Tape Daemon for Virtual Tape & Robot Library
vtltape@33.service                                                 loaded active running   Tape Daemon for Virtual Tape & Robot Library
vtltape@34.service                                                 loaded active running   Tape Daemon for Virtual Tape & Robot Library
system-vtllibrary.slice                                            loaded active active    system-vtllibrary.slice
system-vtltape.slice                                               loaded active active    system-vtltape.slice
mhvtl.target                                                       loaded active active    mhvtl service allowing to start/stop all vtltape@.service and vtllibrary@.service instances at once

and

$ systemctl status mhvtl.target
● mhvtl.target - mhvtl service allowing to start/stop all vtltape@.service and vtllibrary@.service instances at once
   Loaded: loaded (/usr/lib/systemd/system/mhvtl.target; enabled; vendor preset: disabled)
   Active: active since Wed 2021-06-02 15:14:18 CEST; 3min 30s ago
     Docs: man:man:vtltape(1)
           man:man:vtllibrary(1)

and

$ lsmod | grep mhvtl
mhvtl                  36617  30 

and

$ lsscsi -g
[2:0:0:0]    mediumx STK      L700             0106  /dev/sch1  /dev/sg9 
[2:0:1:0]    tape    IBM      ULT3580-TD5      0106  /dev/st3   /dev/sg3 
[2:0:2:0]    tape    IBM      ULT3580-TD5      0106  /dev/st2   /dev/sg2 
[2:0:3:0]    tape    IBM      ULT3580-TD4      0106  /dev/st4   /dev/sg4 
[2:0:4:0]    tape    IBM      ULT3580-TD4      0106  /dev/st7   /dev/sg8 
[2:0:8:0]    mediumx STK      L80              0106  /dev/sch0  /dev/sg6 
[2:0:9:0]    tape    STK      T10000B          0106  /dev/st6   /dev/sg7 
[2:0:10:0]   tape    STK      T10000B          0106  /dev/st5   /dev/sg5 
[2:0:11:0]   tape    STK      T10000B          0106  /dev/st1   /dev/sg1 
[2:0:12:0]   tape    STK      T10000B          0106  /dev/st0   /dev/sg0 

Then I had to switch to root (permission denied on /dev/sg):

# mtx -f `lsscsi -g | awk '$2~/mediumx/{print $7}' | head -1` status
  Storage Changer /dev/sg9:4 Drives, 43 Slots ( 4 Import/Export )
Data Transfer Element 0:Empty
Data Transfer Element 1:Empty
Data Transfer Element 2:Empty
Data Transfer Element 3:Empty
      Storage Element 1:Full :VolumeTag=E01001L4                            
      Storage Element 2:Full :VolumeTag=E01002L4                            
      Storage Element 3:Full :VolumeTag=E01003L4                            
      Storage Element 4:Full :VolumeTag=E01004L4                            
      Storage Element 5:Full :VolumeTag=E01005L4                            
      Storage Element 6:Full :VolumeTag=E01006L4                            
      Storage Element 7:Full :VolumeTag=E01007L4                            
      Storage Element 8:Full :VolumeTag=E01008L4                            
      Storage Element 9:Full :VolumeTag=E01009L4                            
      Storage Element 10:Full :VolumeTag=E01010L4                            
      Storage Element 11:Full :VolumeTag=E01011L4                            
      Storage Element 12:Full :VolumeTag=E01012L4                            
      Storage Element 13:Full :VolumeTag=E01013L4                            
      Storage Element 14:Full :VolumeTag=E01014L4                            
      Storage Element 15:Full :VolumeTag=E01015L4                            
      Storage Element 16:Full :VolumeTag=E01016L4                            
      Storage Element 17:Full :VolumeTag=E01017L4                            
      Storage Element 18:Full :VolumeTag=E01018L4                            
      Storage Element 19:Full :VolumeTag=E01019L4                            
      Storage Element 20:Full :VolumeTag=E01020L4                            
      Storage Element 21:Empty
      Storage Element 22:Full :VolumeTag=CLN101L4                            
      Storage Element 23:Full :VolumeTag=CLN102L5                            
      Storage Element 24:Empty
      Storage Element 25:Empty
      Storage Element 26:Empty
      Storage Element 27:Empty
      Storage Element 28:Empty
      Storage Element 29:Empty
      Storage Element 30:Full :VolumeTag=F01030L5                            
      Storage Element 31:Full :VolumeTag=F01031L5                            
      Storage Element 32:Full :VolumeTag=F01032L5                            
      Storage Element 33:Full :VolumeTag=F01033L5                            
      Storage Element 34:Full :VolumeTag=F01034L5                            
      Storage Element 35:Full :VolumeTag=F01035L5                            
      Storage Element 36:Full :VolumeTag=F01036L5                            
      Storage Element 37:Full :VolumeTag=F01037L5                            
      Storage Element 38:Full :VolumeTag=F01038L5                            
      Storage Element 39:Full :VolumeTag=F01039L5                            
      Storage Element 40 IMPORT/EXPORT:Empty
      Storage Element 41 IMPORT/EXPORT:Empty
      Storage Element 42 IMPORT/EXPORT:Empty
      Storage Element 43 IMPORT/EXPORT:Empty

and finally

# ps auxw | grep vtl
root     30043  0.0  0.0  15208  3104 ?        Ss   15:14   0:00 /usr/bin/vtltape -F -q11 -v1
root     30044  0.0  0.0  15208  3092 ?        Ss   15:14   0:00 /usr/bin/vtltape -F -q32 -v1
root     30045  0.0  0.0  15208  3096 ?        Ss   15:14   0:00 /usr/bin/vtltape -F -q33 -v1
root     30046  0.0  0.0  15208  3096 ?        Ss   15:14   0:00 /usr/bin/vtltape -F -q13 -v1
root     30047  0.0  0.0  15208  3096 ?        Ss   15:14   0:00 /usr/bin/vtltape -F -q14 -v1
root     30048  0.0  0.0  15208  3100 ?        Ss   15:14   0:00 /usr/bin/vtltape -F -q34 -v1
root     30049  0.0  0.0  15208  3088 ?        Ss   15:14   0:00 /usr/bin/vtltape -F -q31 -v1
root     30050  0.0  0.0  15208  3100 ?        Ss   15:14   0:00 /usr/bin/vtltape -F -q12 -v1
root     30053  0.0  0.0   9764  2012 ?        Ss   15:14   0:00 /usr/bin/vtllibrary -F -q30 -v1
root     30059  0.0  0.0   9764  2012 ?        Ss   15:14   0:00 /usr/bin/vtllibrary -F -q10 -v1

So as far as I can tell everything looks OK, right? But still:

# systemctl status mhvtl
Unit mhvtl.service could not be found.

I see now that mhvtl.service is only relevant if you used our RPMs. It’s not relevant if you’ve built mhvtl from source, so you can ignore this.

So now you have a working mhvtl installation, but the drives and libraries are the default ones. You’ll need to run ./recreate_buildtree_running_environment.sh to configure it, then try the mtx load and mtx unload commands I quoted above to check that it functions as expected.
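The load/unload check described above could be scripted roughly as follows. This is only a sketch: first_full_slot is a hypothetical helper name, and it assumes the `mtx status` line format shown earlier in this thread ("Storage Element N:Full :VolumeTag=..."). Import/export slots are deliberately ignored.

```shell
#!/bin/sh
# first_full_slot: read `mtx status` output on stdin and print the number of
# the first regular storage element that is reported as Full.
# (Hypothetical helper, not part of mtx.)
first_full_slot() {
    awk '$1 == "Storage" && $2 == "Element" && $3 ~ /^[0-9]+:Full$/ {
             sub(/:Full$/, "", $3)   # "1:Full" -> "1"
             print $3
             exit
         }'
}

# Intended use (run as a user with access to the changer device):
#   changer=/dev/smc    # on the VM, use the sg device reported by lsscsi -g
#   slot=$(mtx -f "$changer" status | first_full_slot)
#   mtx -f "$changer" load "$slot" 0 && mtx -f "$changer" unload "$slot" 0
```

If the load step prints "done" and the unload succeeds, the library and drive 0 are at least answering MOVE MEDIUM requests.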

Thanks. I’m going to start over again because I did things a little out of order. Is there a version of the mtx commands I can issue on the VM rather than in the pods to try to take that layer out?

Yes, mtx is packaged with the OS; yum install mtx should be sufficient.

I don’t have /dev/smc either on the pod or the VM:

[cta@ewv-cta2 orchestration (master)]$ kubectl exec -it tpsrv01 -n cta -- ls -l /dev/s*
Defaulting container name to rmcd.
Use 'kubectl describe pod/tpsrv01' to see all of the containers in this pod.
crw-rw----. 1 root cdrom 86,   0 Jun  2 21:12 /dev/sch0
crw-rw-rw-. 1 root tape  21,   0 Jun  2 21:12 /dev/sg0
crw-rw-rw-. 1 root tape  21,   1 Jun  2 21:12 /dev/sg1
crw-rw-rw-. 1 root tape  21,   2 Jun  2 21:12 /dev/sg2
crw-rw-rw-. 1 root tape  21,   3 Jun  2 21:12 /dev/sg3
crw-------. 1 root root  10, 231 Jun  2 21:12 /dev/snapshot
crw-rw-rw-. 1 root tape   9,   0 Jun  2 21:12 /dev/st0
crw-rw-rw-. 1 root tape   9,  96 Jun  2 21:12 /dev/st0a
crw-rw-rw-. 1 root tape   9,  32 Jun  2 21:12 /dev/st0l
crw-rw-rw-. 1 root tape   9,  64 Jun  2 21:12 /dev/st0m
crw-rw-rw-. 1 root tape   9,   1 Jun  2 21:12 /dev/st1
crw-rw-rw-. 1 root tape   9,  97 Jun  2 21:12 /dev/st1a
crw-rw-rw-. 1 root tape   9,  33 Jun  2 21:12 /dev/st1l
crw-rw-rw-. 1 root tape   9,  65 Jun  2 21:12 /dev/st1m
crw-rw-rw-. 1 root tape   9,   2 Jun  2 21:12 /dev/st2
crw-rw-rw-. 1 root tape   9,  98 Jun  2 21:12 /dev/st2a
crw-rw-rw-. 1 root tape   9,  34 Jun  2 21:12 /dev/st2l
crw-rw-rw-. 1 root tape   9,  66 Jun  2 21:12 /dev/st2m

So I guess it’s no surprise that actions on /dev/smc are not functional. Is there a missing alias in this setup? Should smc be one of the devices above, or is it something else entirely?

I thought I was about to have a breakthrough. By uncommenting a line in 00-cta-tape.rules I get an smc device on the pod:

lrwxrwxrwx. 1 root root 8 Jun 2 22:09 /dev/smc -> /dev/sg2

but the archive retrieve test still fails with this:

Aborting: Failed to mount tape for read/write access: vid=V01001 slot=smc0: Failed to mount tape in SCSI tape-library for read/write access: vid=V01001 librarySlot=smc0: Received error from rmcd: rmcRc=2203 rmcErrorStream=smc_mount: SR018 - mount of V01001 on drive 0 failed : /dev/smc : scsi error : Hardware error ASC=4 ASCQ=3

 RMC03 - illegal function 4

Jun  2 22:11:40.344408 tpsrv01 cta-tape-label: LVL="WARN" PID="224" TID="224" MSG="Drive does not support LBP" userName="UNKNOWN" tapeVid="V01001" tapeOldLabel="" force="false" 
ERROR: failed to label the tape V01001
ERROR: failed to prepare namespace for the tests

and the load/unload commands also don’t look great:

[cta@ewv-cta2 tests (master)]$ kubectl exec -n cta tpsrv01 -- mtx -f /dev/smc load 1 0
Defaulting container name to rmcd.
Use 'kubectl describe pod/tpsrv01' to see all of the containers in this pod.
Loading media from Storage Element 1 into drive 0...mtx: Request Sense: Long Report=yes
mtx: Request Sense: Valid Residual=yes
mtx: Request Sense: Error Code=70 (Current)
mtx: Request Sense: Sense Key=Hardware Error
mtx: Request Sense: FileMark=no
mtx: Request Sense: EOM=no
mtx: Request Sense: ILI=no
mtx: Request Sense: Residual = 00 00 00 00
mtx: Request Sense: Additional Sense Code = 04
mtx: Request Sense: Additional Sense Qualifier = 03
mtx: Request Sense: BPV=no
mtx: Request Sense: Error in CDB=no
mtx: Request Sense: SKSV=no
MOVE MEDIUM from Element Address 1000 to 500 Failed
[cta@ewv-cta2 tests (master)]$ kubectl exec -n cta tpsrv01 -- mtx -f /dev/smc status
Defaulting container name to rmcd.
Use 'kubectl describe pod/tpsrv01' to see all of the containers in this pod.
  Storage Changer /dev/smc:3 Drives, 11 Slots ( 1 Import/Export )
Data Transfer Element 0:Empty
Data Transfer Element 1:Empty
Data Transfer Element 2:Empty
      Storage Element 1:Full :VolumeTag=V01001TA                            
      Storage Element 2:Full :VolumeTag=V01002TA                            
      Storage Element 3:Full :VolumeTag=V01003TA                            
      Storage Element 4:Full :VolumeTag=V01004TA                            
      Storage Element 5:Full :VolumeTag=V01005TA                            
      Storage Element 6:Full :VolumeTag=V01006TA                            
      Storage Element 7:Full :VolumeTag=V01007TA                            
      Storage Element 8:Empty
      Storage Element 9:Empty
      Storage Element 10:Empty
      Storage Element 11 IMPORT/EXPORT:Empty
[cta@ewv-cta2 tests (master)]$ kubectl exec -n cta tpsrv01 -- mtx -f /dev/smc unload 1 0
Defaulting container name to rmcd.
Use 'kubectl describe pod/tpsrv01' to see all of the containers in this pod.
Data Transfer Element 0 is Empty

Maybe I made the symlink to the wrong device?
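One way to check whether a symlink such as /dev/smc points at the medium changer is to look up its target in the `lsscsi -g` listing. The sketch below assumes that lsscsi is available where the check is run and that the sg device is the last column of each line, as in the listing earlier in the thread; sg_device_type is a hypothetical helper name.

```shell
#!/bin/sh
# sg_device_type: given an sg device path as $1 and `lsscsi -g` output on
# stdin, print the device type lsscsi reports for it (e.g. mediumx or tape).
# Returns non-zero if the device is not listed.
# (Hypothetical helper, not part of lsscsi.)
sg_device_type() {
    awk -v dev="$1" '
        $NF == dev { print $2; found = 1; exit }
        END        { if (!found) exit 1 }'
}

# Intended use:
#   lsscsi -g | sg_device_type "$(readlink -f /dev/smc)"
# A result of "mediumx" means the link targets the robot; "tape" would mean
# it points at a drive instead.
```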

Hi Eric,

The /dev/smc symlink is created automatically on the tpsrv01 and tpsrv02 pods, which are the only places it’s really needed. In the other pods, and on your VM, you need to run lsscsi -g and find the mediumx device, typically something like /dev/sg1, and use that as the argument for mtx.

So, from the VM, both of the following should work:

mtx -f `lsscsi -g | awk '$2~/mediumx/{print $7}' | head -1` status
kubectl exec tpsrv01 -- mtx -f /dev/smc status

Oliver.
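The device lookup in the first command above can be wrapped in a small function. This is a sketch, with find_changer as a hypothetical name. It takes the last column rather than a fixed `$7`, on the assumption that the sg node is always printed last by `lsscsi -g`; that keeps it working even if a model string were to contain a space.

```shell
#!/bin/sh
# find_changer: read `lsscsi -g` output on stdin and print the SCSI generic
# (sg) device of the first medium changer found.
# (Hypothetical helper, not part of lsscsi or mtx.)
find_changer() {
    # $2 is the device type column; $NF is the last column (the sg node).
    awk '$2 == "mediumx" { print $NF; exit }'
}

# Intended use on the VM (needs access to the sg device):
#   mtx -f "$(lsscsi -g | find_changer)" status
```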

So status works on both the VM (/dev/sg2) and the pod (/dev/smc, after updating the 00-cta-tape.rules file). But the load and unload commands appear to give an error:

[cta@ewv-cta2 ~]$ mtx -f /dev/sg2 load 1 0
Loading media from Storage Element 1 into drive 0...mtx: Request Sense: Long Report=yes
mtx: Request Sense: Valid Residual=yes
mtx: Request Sense: Error Code=70 (Current)
mtx: Request Sense: Sense Key=Hardware Error
mtx: Request Sense: FileMark=no
mtx: Request Sense: EOM=no
mtx: Request Sense: ILI=no
mtx: Request Sense: Residual = 00 00 00 00
mtx: Request Sense: Additional Sense Code = 04
mtx: Request Sense: Additional Sense Qualifier = 03
mtx: Request Sense: BPV=no
mtx: Request Sense: Error in CDB=no
mtx: Request Sense: SKSV=no
MOVE MEDIUM from Element Address 1000 to 500 Failed

and tests/archive_retrieve.sh -n cta has a problem in the tape labeling part of the test:

Cannot create rule to assign mount-policy ctasystest to requester-group ctaeos:powerusers because a rule already exists assigning the requester-group to mount-policy ctasystest
Labeling tapes:
  cta-tape-label --vid V01001
Jun  3 16:25:03.269135 tpsrv01 cta-tape-label: LVL="WARN" PID="26472" TID="26472" MSG="Drive does not support LBP" userName="UNKNOWN" tapeVid="V01001" tapeOldLabel="" force="false" 
Aborting: Failed to mount tape for read/write access: vid=V01001 slot=smc0: Failed to mount tape in SCSI tape-library for read/write access: vid=V01001 librarySlot=smc0: Received error from rmcd: rmcRc=2203 rmcErrorStream=smc_mount: SR018 - mount of V01001 on drive 0 failed : /dev/smc : scsi error : Hardware error ASC=4 ASCQ=3

 RMC03 - illegal function 4

ERROR: failed to label the tape V01001
ERROR: failed to prepare namespace for the tests

Is this to be expected?
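When comparing failures like the ones above, it can help to reduce the long Request Sense dump from mtx to one line. The sketch below assumes the "mtx: Request Sense: ..." line format shown in this thread; sense_summary is a hypothetical helper name. It does not parse the rmcd message, which prints the same codes in a different form (ASC=4 ASCQ=3). For reference, the SCSI ASC/ASCQ tables list 04h/03h as "logical unit not ready, manual intervention required"; what that implies for mhvtl in this setup is not something the output itself tells us.

```shell
#!/bin/sh
# sense_summary: read mtx error output on stdin and print the sense key,
# additional sense code (ASC) and qualifier (ASCQ) on a single line.
# (Hypothetical helper, not part of mtx.)
sense_summary() {
    awk -F'=' '
        # Strip leading and trailing blanks from a field.
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        /Request Sense: Sense Key/                  { key  = trim($2) }
        /Request Sense: Additional Sense Code/      { asc  = trim($2) }
        /Request Sense: Additional Sense Qualifier/ { ascq = trim($2) }
        END { printf "sense_key=%s asc=%s ascq=%s\n", key, asc, ascq }'
}

# Intended use (mtx writes the sense report to the terminal, so both output
# streams are captured here):
#   mtx -f /dev/sg2 load 1 0 2>&1 | sense_summary
```

For the failed load quoted above, this should print `sense_key=Hardware Error asc=04 ascq=03`.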

Hi,
I think you’ve managed to find an inconsistency in our scripts: the “non-CERN” mhvtl configures its tapes slightly differently. I suspect the contents of your /opt/mhvtl dir are inconsistent or simply empty.
I’ve updated recreate_buildtree_running_environment.sh, please refresh your checkout and run it and try mtx load again.
Thanks,
Oliver.

Thanks, Oliver!

[cta@ewv-cta2 orchestration (master)]$ kubectl -n cta exec tpsrv01 -- mtx -f /dev/smc load 1 0
Defaulting container name to rmcd.
Use 'kubectl describe pod/tpsrv01' to see all of the containers in this pod.
Loading media from Storage Element 1 into drive 0...done

And the archiving test seems to be working as well.

Have a good weekend. I’ll keep playing around now that I seem to have something functional.

Phew! Enjoy the weekend!