Configure drives to write simultaneously in the same tapepool?

Hi all :slight_smile:

Is there any way to adjust how many drives can write simultaneously to tapes of the same tapepool?

Thank you

Eli

Dear Eli,

first of all, there have to be enough free tapes in the given tape pool.

At CERN, we have a script that supplies empty tapes to pools by looking at the partial tapes parameter of each tape pool.
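
For illustration, such a supply loop could look roughly like the sketch below. This is not CERN’s actual script: the spare pool name, the empty-tape threshold, the awk column position, and the cta-admin tape ch syntax for reassigning a tape’s pool are all assumptions to verify against your CTA version.

#!/bin/bash
# Hypothetical supply loop (not CERN’s script): top up each pool with
# empty tapes taken from a "spare" pool; all names and thresholds assumed.
TARGET_EMPTY=3        # desired number of empty tapes per pool
SPARE_POOL=spare      # assumed pool holding unassigned empty tapes

for POOL in dteam1; do
    # Count empty tapes in the pool: "last fseq" (assumed to be column 10
    # of the default 'tape ls' output) is 0 for a tape with no files
    EMPTY=$(cta-admin tape ls -t "$POOL" -f false | awk '$10 == 0' | wc -l)
    NEEDED=$((TARGET_EMPTY - EMPTY))
    [ "$NEEDED" -le 0 ] && continue
    # Reassign that many empty tapes from the spare pool into this pool
    cta-admin tape ls -t "$SPARE_POOL" -f false | awk '$10 == 0 {print $1}' |
        head -n "$NEEDED" |
        while read -r VID; do
            cta-admin tape ch --vid "$VID" --tapepool "$POOL"   # assumed syntax
        done
done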

Once you have enough empty tapes, you need to look at the connection between the VO and the tape pool:

  • the storage class defines which VO the given sc belongs to and how many copies a file of that class should have
  • the archive route tells you which file copy of a given storage class goes to which tape pool (see the example after this list)
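
For example, for a VO called dteam with storage class dteam.dteam@osm (the names used later in this thread), you can trace that chain like this:

# Which VO does the storage class belong to, and how many copies does it request?
cta-admin sc ls | grep dteam

# Which tape pool does copy 1 of that storage class go to?
cta-admin ar ls | grep dteam.dteam@osm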

So, in order to increase or decrease the number of drives that can write into a tape pool:

  • make sure there are enough free tapes
  • make sure the --wmd (write max drives) option of the VO is set to the value you want (see the sketch after this list).
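
A minimal check-and-adjust sketch, assuming your cta-admin version supports vo ch with the --vo and --wmd options (verify with cta-admin vo ch --help):

# Show the VO’s current read/write max drives
cta-admin vo ls | grep your_VO

# Raise the write max drives to 3 (assumed syntax)
cta-admin vo ch --vo your_VO --wmd 3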

Did I answer your question? Let me know if you need more.

Vladimir

Hi Vladimir, thanks for the reply.

Suppose I have a VO with 3 write max drives. I also have a tapepool with more than 3 tapes available. I have thousands of TBs waiting to be written to this tapepool (just one copy to that tp).

I observed that CTA’s behavior is to start writing to a single tape with a single drive and keep going until the tape is full; the drive then grabs the next available tape. If the requests are spread over several tapepools linked to that VO, the three drives start working on them simultaneously. But as long as all the requests go to the same tapepool, at least in my case, CTA uses just one drive. I suppose this is to avoid ending up with several half-written tapes. But then, even if the amount of queued data would occupy more than two tapes of the tapepool, there is no way to start writing to two tapes of the same tp in parallel. Right? Or am I missing something?

Cheers!
Eli

Eli,

what you describe is not normal.

During data taking we occasionally have more than 30 tape drives writing into the same pool in parallel.

Please check why the other drives are not picking up the work.

Are the drives up and free in the logical tape library? What is written in the cta-taped.log file? There should be something like: “ignoring this mount because x, y, z”.
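
Assuming a default installation (the log path and the exact message wording may differ on your setup), a quick check could be:

# Drive status as seen by CTA
cta-admin drive ls

# Recent scheduler decisions about skipped mounts
grep -i "ignoring" /var/log/cta/cta-taped.log | tail -n 20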

Vladimir

Oh, okay… That is good news, though. I never saw any such messages, but I’ll try to reproduce it as soon as we get rid of the humidity issues :slight_smile:

Thank you again for your help Vladimir!

Eli

Maybe the log message I mentioned is only shown for retrieves.

To debug this, please send me the output of:

  • cta-admin sq | grep -i archive | grep your_VO
  • cta-admin tape ls -t tape_pool_for_VO -f false
  • cta-admin vo ls | grep your_VO
  • cta-admin sc ls | grep your_VO
  • cta-admin ar ls | grep sc_of_your_VO

We’ve got everything stopped due to the humidity issue now, but I’ve queued one random file so you can see that it points at the right tp and vo while in the queue:

#queue
[root@ctatps001 ~]# cta sq | grep -i archive | grep dteam
ArchiveForUser   dteam1  dteam       -   -            1       10.2G   9199     9199        1   14500               2                2           0          0         0          36.0T             78        798.7G          0              3 

#tapes
[root@ctatps001 ~]# cta tape ls -t dteam1 -f false
   vid media type vendor library tapepool    vo encryption key name capacity occupancy last fseq  full from castor  state state reason label drive       label time last w drive      last w time w mounts last r drive      last r time r mounts c.user    c.host           c.time m.user    m.host           m.time comment 
V03646       LTO8    IBM     cta   dteam1 dteam                   -    12.0T    798.7G        78 false       false ACTIVE            -    IBML9534 2023-05-11 12:43     IBML9534 2023-06-20 17:20       11     IBML9541 2023-06-29 14:12        2    cta ctatps001 2023-05-11 12:38    cta ctatps001 2023-06-14 13:20 -       
V03653       LTO8    IBM     cta   dteam1 dteam                   -    12.0T         0         0 false       false ACTIVE            -    IBML9534 2023-05-11 12:51            -                -        0            -                -        0    cta ctatps001 2023-05-11 12:38    cta ctatps001 2023-06-14 13:20 -       
V03654       LTO8    IBM     cta   dteam1 dteam                   -    12.0T         0         0 false       false ACTIVE            -    IBML9534 2023-05-11 12:53            -                -        0            -                -        0    cta ctatps001 2023-05-11 12:38    cta ctatps001 2023-06-14 13:20 -   

#vo
[root@ctatps001 ~]# cta-admin vo ls | grep dteam
 dteam               3                3             0           cta    cta ctatps001 2023-05-11 12:12    cta ctatps001 2023-05-11 12:13 dteam       

#sc
[root@ctatps001 ~]# cta-admin sc ls | grep dteam
   dteam.dteam@osm                1  dteam    cta ctatps001 2023-05-15 13:25    cta ctatps001 2023-06-15 10:05 dteam   

#ar
[root@ctatps001 ~]# cta-admin ar ls | grep dteam
   dteam.dteam@osm           1   dteam1    cta ctatps001 2023-06-14 13:21    cta ctatps001 2023-06-14 13:21 dteam1  

I’ll try again as soon as I can start writing and will keep you informed!

Hi Eli,

unfortunately one file is not sufficient to see why more tapes are not being used.

By default, CTA tries to fill the tapes one after another, so it will always mount the tape that already has some data (i.e. the one with the least free space).

In order to see more tapes being mounted, you need to queue a lot more data / files.

See the option MountCriteria of cta-taped:

# Criteria to mount a tape, specified as a tuple (number of bytes, number of files). An archival or
# retrieval queue must contain at least this number of bytes or this number of files before a tape mount
# will be triggered. This does not apply when the timeout specified in the applicable mount rule is
# exceeded. Defaults to 500 GB and 10000 files.
taped MountCriteria 500000000000,10000

So if you queue <500 GB of data, the system should be able to cope with it using a single tape drive.

Only if you queue >500 GB of data (let’s say 700 GB or even 1 TB) will the 2nd tape drive kick in and mount the 2nd tape.

You can lower those default values to see what happens.
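
For instance, assuming cta-taped reads its configuration from /etc/cta/cta-taped.conf (adjust the path to your installation), you could temporarily lower the thresholds and restart cta-taped so it picks them up:

# /etc/cta/cta-taped.conf: trigger a mount at 100 GB or 2000 queued files
# instead of the default 500 GB / 10000 files (for testing only)
taped MountCriteria 100000000000,2000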

Otherwise, I do not see anything obviously wrong with your configuration.

Please confirm how the system behaves with >500 GB of data queued.

Best regards,

Vladimir

Hi Vladimir,

Just writing to confirm that everything seems to be working well now! I probably had something misconfigured in the past… Anyway, thank you again!

Eli