Mark a drive down without waiting for the complete tape session

Hi everyone,

I was wondering if it’s possible to quickly take a drive down, that is, as soon as the current read or write request finishes.

Currently, the drive only goes down once the tape session has finished, and in most cases this takes several hours.
So when we need to drain the drives quickly, I’m not sure how to achieve it. I’ve already tried the force parameter, but I got the same result.

Thanks in advance!

Esther

If we are in a hurry, then we just restart the taped for that drive. I don’t think there is a middle ground between the two extremes.

Tim

Dear Esther (and Tim),

Thank you for bringing this up. We are aware of this issue and of the lack of a solution for the situation you describe.

We have had a ticket open since 2023 ("Eject tape when a tape server is stopped", issue #516 in the CTA / CTA project on GitLab) that should resolve this.

Unfortunately, this wasn’t high on our priority list, but now, @nbugel is assigned to work on improving cta-taped.

The main problem is that while the CTA tapeserver daemon can be terminated/restarted quickly, the tape will not be properly dismounted from the drive.

The current workaround is to identify the drive process of cta-taped with ps -u cta and terminate it. The parent process will then start a cleaner and the tape will be dismounted.
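The lookup step of this workaround could be scripted roughly as follows. This is only a sketch: find_drive_pid is a hypothetical helper name, and it assumes the drive process shows up in ps with a command name made of the short drive name followed by "-drive", as in the ps listing further down in this post.

```shell
#!/bin/sh
# Print the PID of the cta-taped drive process for one drive.
# Reads `ps -u cta` style output on stdin.
# Assumption: the command name is "<short drive name>-drive" (e.g. F12C4R4-drive).
find_drive_pid() {
    # NR > 1 skips the "PID TTY TIME CMD" header; $NF is the command name.
    awk -v name="${1}-drive" 'NR > 1 && $NF == name { print $1 }'
}

# Possible use on the tape server (not executed here):
#   pid=$(ps -u cta | find_drive_pid F12C4R4)
#   [ -n "$pid" ] && kill -9 "$pid"
```

Matching on the exact process name avoids hitting the parent process or the drive process of another drive on the same server.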

Here is an example of how to terminate an ongoing retrieve request. You may want to put the drive DOWN first, which prevents new requests from jumping in; I didn’t do that in this example:

[root@tpsrv072 ~]# cta-admin dr ls first
     library                drive     host desired  request   status since    vid    tapepool  vo files data  MB/s session priority activity scheduler      instance age reason 
IBMLIB1-LTO9 IBMLIB1-LTO9-F12C4R4 tpsrv072      Up Retrieve Transfer 23404 L66506 vo_CMS_2025 CMS   783 4.2T 177.2 2312656        0        -  cephUser ctaproduction  16 -      

[root@tpsrv072 ~]# mt -f /dev/nst0 status
/dev/nst0: Device or resource busy

[root@tpsrv072 ~]# ps -u cta
    PID TTY          TIME CMD
   2795 ?        00:00:21 cta-rmcd
  15272 ?        1-12:43:37 cta-maintd
1855249 ?        00:02:52 F12C4R4-parent
3988343 ?        02:33:04 F12C4R4-drive
[root@tpsrv072 ~]# kill -9 3988343

A few seconds later the cleaner will start to dismount the tape (this can take some time, depending on how much tape has to be unwound, plus the unload and the dismount):

[root@tpsrv072 ~]# cta-admin dr ls first
     library                drive     host desired request  status since    vid    tapepool  vo files data MB/s session priority activity scheduler      instance age reason 
IBMLIB1-LTO9 IBMLIB1-LTO9-F12C4R4 tpsrv072      Up       - CleanUp    32 L66506 vo_CMS_2025 CMS     -    -    -       0        0        -  cephUser ctaproduction  32 -      

[root@tpsrv072 ~]# cta-admin dr ls first
     library                drive     host desired request  status since    vid    tapepool  vo files data MB/s session priority activity scheduler      instance age reason 
IBMLIB1-LTO9 IBMLIB1-LTO9-F12C4R4 tpsrv072      Up       - CleanUp   218 L66506 vo_CMS_2025 CMS     -    -    -       0        0        -  cephUser ctaproduction 218 -      

and then the drive is free:

[root@tpsrv072 ~]# cta-admin dr ls first
     library                drive     host desired request status since vid tapepool vo files data MB/s session priority activity scheduler      instance age reason 
IBMLIB1-LTO9 IBMLIB1-LTO9-F12C4R4 tpsrv072      Up       -   Free    59   -        -  -     -    -    -       -        0        -  cephUser ctaproduction   6 -      

[root@tpsrv072 ~]# mt -f /dev/nst0 status | grep IM
 DR_OPEN IM_REP_EN
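
Instead of re-running the listing by hand, the wait for the Free status could be scripted along these lines. This is a sketch under assumptions: drive_status and wait_until_free are hypothetical helper names, LIST_CMD is a made-up override so the loop can be tried without cta-admin, and the parsing assumes the text output keeps the column order shown above (status as the 6th field).

```shell
#!/bin/sh
# Print the status column for one drive from `cta-admin dr ls` text output on stdin.
# Assumption: column order as in the listings above, so status is field 6.
drive_status() {
    awk -v drive="$1" '$2 == drive { print $6 }'
}

# Poll until the drive reports Free, or give up after max_tries attempts.
# $1: full drive name, $2: seconds between polls, $3: maximum number of polls.
wait_until_free() {
    drive=$1
    interval=${2:-30}
    max_tries=${3:-120}
    tries=0
    while [ "$tries" -lt "$max_tries" ]; do
        # LIST_CMD (hypothetical) replaces the real listing command for dry runs.
        status=$(${LIST_CMD:-cta-admin dr ls} | drive_status "$drive")
        [ "$status" = "Free" ] && return 0
        tries=$((tries + 1))
        sleep "$interval"
    done
    return 1
}

# Possible use (not executed here):
#   wait_until_free IBMLIB1-LTO9-F12C4R4 30 && echo "drive is free"
```

The retry limit is there so the loop does not spin forever if the cleaner fails and the drive never reaches Free.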

Hope this helps. Best regards,

Vladimir Bahyl
CERN

Dear Vlado and Tim,

Thank you so much for your replies and the detailed example. We will do so.

Best regards,

Esther