Question about tape pools for file copies

dmitrylitvintsev · 9 November 2023 21:16

Hi,

I am a storage developer at Fermilab. I am trying to understand how Enstore concepts map to CTA concepts, and I have a question about tape_pool and how it maps to the following real-life scenario.

In Enstore, the closest thing to virtual_organization_name is the so-called storage_group, and the closest thing to storage_class is storage_group + “.” + file_family (file_family is a concept associated with datasets; datasets do not share tapes).
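
A minimal sketch of that name mapping, for illustration only. The function names and the sample values ("cms", "raw_2023") are hypothetical and come from neither Enstore nor CTA; only the storage_group + “.” + file_family rule is taken from the paragraph above.

```python
def virtual_organization_name(storage_group: str) -> str:
    """The Enstore storage_group is used directly as the CTA VO name."""
    return storage_group


def storage_class_name(storage_group: str, file_family: str) -> str:
    """CTA storage class name built as storage_group + "." + file_family."""
    return f"{storage_group}.{file_family}"


# Hypothetical Enstore values, not real metadata.
print(virtual_organization_name("cms"))       # cms
print(storage_class_name("cms", "raw_2023"))  # cms.raw_2023
```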

I may need some guidance with the copies. Please direct me to docs/conversations if this has already been covered.

In Enstore, file copies are written to different volumes physically located in different tape libraries and “connected” to incoming data by a virtual library name passed as an option to the Enstore client executable called “encp”. Enstore does not have a concept of tape pools; there is a single volume table with a library field that is pre-defined when labels are loaded into the DB. So “archive_route” is essentially the library name volume attribute in Enstore.

To translate this to CTA, I will speak in DB-level terms (rather than the cta-admin CLI) because the migration script works directly with the DB schemas.

I create two logical_library entries, named A and B
Then I need to create two tape_pools: pool_A and pool_B
Then insert tapes with logical_library A, pool_A and logical_library B, pool_B

I create storage_class entry with storage_class_name = “foo” and nb_copies = 2

Then I create 2 archive_routes:
archive_route_foo_A with copy_nb = 1 for storage_class_name foo
archive_route_foo_B with copy_nb = 2 for storage_class_name foo
(I explicitly used storage_class and library name in the route name to express connection and because this is “machine generated” from Enstore metadata)
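
The steps above could be sketched roughly like this. It is not the actual migration script and does not use the real CTA schema: the ArchiveRoute class and the build_routes and check_routes helpers are hypothetical names. The check that two routes of one storage class must not share a tape pool is an assumption based on the unique constraint mentioned later in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchiveRoute:
    storage_class_name: str
    copy_nb: int
    tape_pool_name: str


def build_routes(storage_class_name, libraries):
    """One tape pool (pool_<library>) and one route per library, copy_nb from 1."""
    return [
        ArchiveRoute(storage_class_name, copy_nb, f"pool_{library}")
        for copy_nb, library in enumerate(libraries, start=1)
    ]


def check_routes(routes, nb_copies):
    """Raise ValueError if the routes do not match the storage class definition."""
    copy_nbs = sorted(r.copy_nb for r in routes)
    if copy_nbs != list(range(1, nb_copies + 1)):
        raise ValueError(f"expected copy numbers 1..{nb_copies}, got {copy_nbs}")
    pools = [r.tape_pool_name for r in routes]
    if len(set(pools)) != len(pools):
        raise ValueError("two routes of one storage class share a tape pool")


routes = build_routes("foo", ["A", "B"])
check_routes(routes, nb_copies=2)
for route in routes:
    print(route.copy_nb, route.tape_pool_name)
```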

Does the above sound right? (this is what I coded, but wanted to run it by CTA experts before I move further into migration)

I apologize if I missed existing documentation that covers this case.

Additional question: is archive_route only relevant for writes?

Dmitry

vlado · 10 November 2023 13:57

Hello Dmitry,

thank you for your comment, let me try to help.

First of all, the terminology.

In CTA, we use the word ARCHIVE when we talk about writing to tape and the word RETRIEVE when we talk about reading from tape. We decided to re-use the terminology from an IBM product. By using ARCHIVE we wanted to emphasize the archive functionality of CTA (= CERN Tape Archive).

Regarding your example, what you outline looks correct to me.

I will nevertheless give you an example of how we do it at CERN for dual-copy data, in this case old data from the LEP experiments:

pcvlado ~ > cta-admin vo ls | egrep 'name|PRES'
         name read max drives write max drives max file size    disk instance is repack vo   c.user                  c.host           c.time   m.user                  m.host           m.time comment
 PRESERVATION               2                2        128.8G eosctapublicdisk        false   jleduc           eosctafst0114 2020-09-03 09:30 vyurchen ctaproductionfrontend01 2022-05-05 15:47 PRESERVATION on eosctapublic, eosctapublicdisk

pcvlado ~ > cta-admin sc ls -n delphi
storage class number of copies           vo c.user c.host           c.time m.user m.host           m.time comment
       delphi                2 PRESERVATION CASTOR CASTOR 2021-12-03 14:33 CASTOR CASTOR 2021-12-03 14:33 Imported from CASTOR

pcvlado ~ > cta-admin ar ls | egrep "class|delphi"
           storage class copy number                   tapepool   c.user                  c.host           c.time   m.user                  m.host           m.time comment              
                  delphi           1               r_lep_data_1 vyurchen ctaproductionfrontend01 2021-12-06 10:38 vyurchen ctaproductionfrontend01 2021-12-06 10:38 delphi -> r_lep_data_1
                  delphi           2               r_lep_data_2 vyurchen ctaproductionfrontend01 2021-12-06 11:47 vyurchen ctaproductionfrontend01 2021-12-06 11:47 delphi -> r_lep_data_2

pcvlado ~ > cta-admin tp ls | egrep "name|lep_data"
                           name            vo #tapes #partial #phys files   size   used  avail   use% encrypt                                                                supply   c.user                  c.host           c.time   m.user                  m.host           m.time comment                                                                                   
                   r_lep_data_1  PRESERVATION     50        1    16325995 801.0T   1.1P      0 132.3%   false                                                         supply_IBM513   CASTOR                  CASTOR 2021-12-03 14:33    vlado ctaproductionfrontend01 2021-12-06 15:07 Data from various LEP experiments - 1st copy                                              
                   r_lep_data_2  PRESERVATION     85        1    16325822 834.0T   1.1P      0 127.1%   false                                                         supply_LTO613   CASTOR                  CASTOR 2021-12-03 14:33    vlado ctaproductionfrontend02 2022-07-09 10:13 Data from various LEP experiments - 2nd copy                                              

pcvlado ~ > cta-admin tape ls -t r_lep_data_1 | head -2
   vid media type   vendor library order     tapepool           vo encryption key name capacity occupancy last fseq  full from castor  state state reason label drive       label time last w drive      last w time w mounts last r drive      last r time r mounts c.user                  c.host           c.time     m.user                  m.host           m.time comment
I40318   3592JC7T      IBM  IBM460     - r_lep_data_1 PRESERVATION                   -     7.0T     13.3T     70585  true        true ACTIVE            -      CASTOR 1970-01-01 01:00     tpsrv215 2017-12-03 16:46       22     I4550832 2023-05-15 06:28       47 CASTOR                  CASTOR 2021-12-03 14:33      vlado ctaproductionfrontend02 2023-10-31 20:42 -
pcvlado ~ > cta-admin tape ls -t r_lep_data_2 | head -2
   vid media type       vendor library order     tapepool           vo encryption key name capacity occupancy last fseq  full from castor  state state reason label drive       label time last w drive      last w time w mounts last r drive      last r time r mounts c.user                  c.host           c.time     m.user                  m.host           m.time comment
L70441      LTO7M IBM-FUJIFILM  IBM1L8     - r_lep_data_2 PRESERVATION                   -     9.0T     19.3T      5374  true        true ACTIVE            -      CASTOR 1970-01-01 01:00     tpsrv114 2019-02-09 23:28       15     I1L80914 2023-08-31 15:15       11 CASTOR                  CASTOR 2021-12-03 14:33 tape-local ctaproductionfrontend02 2023-08-31 18:00 -

In this example you see the whole path for the DELPHI experiment (which, as an old LEP experiment, we made part of the PRESERVATION virtual organization). In the storage class you define how many copies you need, then you use archive routes to define into which tape pools the data will flow (depending on the copy number). The listing of one tape from each tape pool shows different physical tape technologies in distinct logical libraries.
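
To make the lookup concrete, here is a small illustrative sketch (not CTA code). The storage class, copy numbers and tape pool names are the ones from the listings above; the dictionaries and the destination_pools function are hypothetical.

```python
# Number of copies per storage class, as shown by "cta-admin sc ls".
STORAGE_CLASSES = {"delphi": 2}

# (storage class, copy number) -> tape pool, as shown by "cta-admin ar ls".
ARCHIVE_ROUTES = {
    ("delphi", 1): "r_lep_data_1",
    ("delphi", 2): "r_lep_data_2",
}


def destination_pools(storage_class):
    """Return the tape pool for each copy of a file in this storage class."""
    nb_copies = STORAGE_CLASSES[storage_class]
    return [ARCHIVE_ROUTES[(storage_class, n)] for n in range(1, nb_copies + 1)]


print(destination_pools("delphi"))  # ['r_lep_data_1', 'r_lep_data_2']
```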

To answer the last question: yes, the archive routes are only relevant for writing. For retrieve, there is some randomisation of which copy is read (if I am not mistaken).

Hope this answers all your questions. Best regards,

Vladimir Bahyl

dmitrylitvintsev · 10 November 2023 16:04

Hi Vladimir,

Thank you for the prompt reply and thorough explanation.
The main reason I brought up the “archival only” use of archive_route was the following:
We have many physical tape libraries that have been purchased over time.
Say one library is full and we start writing data belonging to the same storage class to a different library. For that we need to create a new tape pool, drop the “old” archive_route and create a new archive_route. At least this is my understanding.

vlado · 11 November 2023 09:36

Dmitry,

in the example above I concentrated on the situation where you have a storage class with 2 copies. At CERN most of the files only have 1 copy.

That is why I feel I should clarify that a tape pool can contain tapes from multiple tape libraries. A tape pool is nothing more than a set of tapes.

So in your latest scenario, if you have archive route AR1 pointing to a tape pool TP1 and you want to migrate data from one library to another, simply mark all TP1 tapes located in L1 as full and add new empty tapes from L2, where the data should flow.

Then start the repack of the tapes and the data will flow from L1 to L2. TP1 will contain a mix of old and new tapes. As the old tapes become empty, you remove them, and at some point TP1 will only contain new tapes from L2.

There is no need to touch/modify AR1.
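
A toy simulation of that procedure, only to show that the pool changes while the route does not. Nothing here is CTA code: the Tape class, the migrate function and the VIDs are hypothetical, and the "repack" step just moves file counts between in-memory objects.

```python
from dataclasses import dataclass


@dataclass
class Tape:
    vid: str
    library: str
    full: bool = False
    files: int = 0


def migrate(pool, old_library, new_tapes):
    """Return the pool contents after moving data off old_library."""
    # 1. Mark the tapes in the old library as full so nothing new lands on them.
    for tape in pool:
        if tape.library == old_library:
            tape.full = True
    # 2. Add empty tapes from the new library to the same pool.
    pool.extend(new_tapes)
    # 3. "Repack": move the files from the old tapes onto a tape that is not full.
    target = next(t for t in pool if not t.full)
    for tape in pool:
        if tape.library == old_library:
            target.files += tape.files
            tape.files = 0
    # 4. Remove the old tapes once they are empty.
    return [t for t in pool if not (t.library == old_library and t.files == 0)]


archive_route = {"storage_class": "foo", "copy_nb": 1, "tape_pool": "TP1"}
tp1 = [Tape("V00001", "L1", files=10), Tape("V00002", "L1", files=5)]
tp1 = migrate(tp1, "L1", [Tape("V10001", "L2")])

print([(t.vid, t.library, t.files) for t in tp1])  # [('V10001', 'L2', 15)]
print(archive_route["tape_pool"])                  # TP1, unchanged
```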

BTW - for large repack automation we recently made public our ATRESYS system here: tools/pip/atresys · master · cta / CTA Operations utilities · GitLab

HTH, bon weekend.

Vladimir

dmitrylitvintsev · 15 November 2023 18:02

Thanks Vladimir,

So, just to make sure: I can have one tape pool for a VO, and the tapes in each tape pool can be physically located in multiple libraries? That is, tape_pool serves a purely accounting purpose.

If yes, I may be overcomplicating things by mapping Enstore (vo, library, file_family) → CTA (tape_pool). I can just create, say, one pool per VO and it will work fine (based on what you are saying).

Thanks,
Dmitry

P.S.: Well, with the caveat that if there are 2 copies, there must be 2 corresponding tape pools, because of the (storage_class_id, tape_pool_id) unique constraint in archive_route.
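
To illustrate the caveat, a minimal sketch using an in-memory SQLite table. The table below is a deliberately simplified stand-in, not the real CTA archive_route definition; only the (storage_class_id, tape_pool_id) unique constraint is taken from the P.S. above, and the ID values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical table; the real CTA schema has more columns.
conn.execute(
    """
    CREATE TABLE archive_route (
        storage_class_id INTEGER NOT NULL,
        copy_nb          INTEGER NOT NULL,
        tape_pool_id     INTEGER NOT NULL,
        UNIQUE (storage_class_id, tape_pool_id)
    )
    """
)

# Copy 1 of storage class 1 goes to tape pool 10.
conn.execute("INSERT INTO archive_route VALUES (1, 1, 10)")

# Routing copy 2 to the same pool violates the unique constraint.
try:
    conn.execute("INSERT INTO archive_route VALUES (1, 2, 10)")
    same_pool_accepted = True
except sqlite3.IntegrityError:
    same_pool_accepted = False

# Routing copy 2 to a second pool works.
conn.execute("INSERT INTO archive_route VALUES (1, 2, 20)")
rows = conn.execute(
    "SELECT copy_nb, tape_pool_id FROM archive_route ORDER BY copy_nb"
).fetchall()

print(same_pool_accepted)  # False
print(rows)                # [(1, 10), (2, 20)]
```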