Identifying repack candidate tapes

How do you identify which tapes are ripe for a repack? I recently went through and deleted a large number of files. Going tape by tape and listing the files I can see that I have tapes where the majority of files are no longer in tapefile. I naively expected that tape ls would tell me how many active files were still on a tape.

Thanks!

Eric

Hi Eric,
more often than not, when we repack, we pick tapes based on media generation: we move data off an older generation that is being phased out and onto a newer generation with higher capacity.

However, the Automated Tape REpacking SYStem (ATRESYS) comes with a tool called cta-ops-repack-0-scan, which is intended for precisely this use case. It lets you automatically select tape VIDs by occupancy percentage, filter by tape pool, etc., and optionally feed the selected VIDs into the automated repacking system.

Additional info and install instructions can be found on the dedicated ops tool wiki: Home · Wiki · cta / CTA Operations utilities · GitLab

Cheers,
Richard

Perfect! Thank you! This looks great.

On further examination, that script appears to be looking at master data in bytes divided by capacity, which makes sense and is what I naively expected. However, we have a tape where every file has been put into the recycle log (visible in recycletf ls), and yet the tape looks like this:

[
  {
    "vid": "FL0590",
    "mediaType": "LTO8",
    "vendor": "Unknown",
    "logicalLibrary": "TS4500G1_CTACMS",
    "tapepool": "cms.Run3Winter20DRPremixMiniAODMCGenSimRaw",
    "vo": "cms",
    "encryptionKeyName": "-",
    "capacity": "12000000000000",
    "occupancy": "11636563312201",
    "lastFseq": "1665",
    "full": true,
    "fromCastor": false,
    "readMountCount": "45",
    "writeMountCount": "28",
    "nbMasterFiles": "1665",
    "masterDataInBytes": "11636563312201",
    "state": "ACTIVE",
    "stateReason": "",
    "stateUpdateTime": "1707864003",
    "stateModifiedBy": "eosdev@cmscta01",
    "dirty": true,
    "verificationStatus": ""
  }
]

So any deletions are not reflected in the usage. Is there, perhaps, something we are missing which should be updating these columns as deletions happen?

[root@cmscta01 atresys]# XrdSecPROTOCOL=sss XrdSecSSSKT=/etc/cta/ctafrontend_forwardable_sss.keytab cta-admin --json rtf ls --vid FL0590|jq | grep fseq | wc
   1665    3330   32193

So all the files have been deleted in EOS.
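To make the ratio concrete, here is a minimal Python sketch of the check described above: live data (masterDataInBytes) divided by capacity, applied to the record shown for FL0590. The function names and the 20% threshold are illustrative assumptions, not taken from cta-ops-repack-0-scan; only the JSON field names come from the output above.

```python
import json

# Trimmed copy of the tape record shown above; numeric fields arrive as strings.
TAPE_LS_JSON = """
[
  {
    "vid": "FL0590",
    "tapepool": "cms.Run3Winter20DRPremixMiniAODMCGenSimRaw",
    "capacity": "12000000000000",
    "occupancy": "11636563312201",
    "nbMasterFiles": "1665",
    "masterDataInBytes": "11636563312201"
  }
]
"""


def live_fraction(tape):
    """Fraction of the tape's capacity still reported as live (master) data."""
    return int(tape["masterDataInBytes"]) / int(tape["capacity"])


def repack_candidates(tapes, max_live_fraction=0.20):
    """VIDs whose live fraction is at or below the (hypothetical) threshold."""
    return [t["vid"] for t in tapes if live_fraction(t) <= max_live_fraction]


tapes = json.loads(TAPE_LS_JSON)
for tape in tapes:
    print(f'{tape["vid"]}: {live_fraction(tape):.1%} live')
print("candidates:", repack_candidates(tapes))
```

With the values as reported, FL0590 comes out at about 97% live and is not selected, even though every file on it is in the recycle log.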

Hi Eric,
so the masterDataInBytes of a tape is supposed to be computed by CTA as something along the lines of

masterDataInBytes = occupancy - size_of(deleted_files)

However, certain operations in CTA, such as a file being marked as deleted, don’t trigger immediate statistics updates.

You can force a statistics update when needed using the cta-statistics-update script (shipped in the CTA RPMs), which should result in an up-to-date masterDataInBytes value.

We run this script about once a day as part of our monitoring.
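As a rough illustration of the formula above, the following Python sketch computes the value masterDataInBytes should have once statistics are refreshed, and flags a tape whose reported value has not caught up. The function names and the example file sizes are hypothetical; this is plain arithmetic, not what cta-statistics-update does internally.

```python
def expected_master_bytes(occupancy, deleted_sizes):
    """masterDataInBytes = occupancy - size_of(deleted_files)."""
    return occupancy - sum(deleted_sizes)


def statistics_are_stale(reported_master_bytes, occupancy, deleted_sizes):
    """True when the reported value does not yet reflect the deletions."""
    return reported_master_bytes != expected_master_bytes(occupancy, deleted_sizes)


# Hypothetical tape: 10 TB written, then three files deleted.
occupancy = 10_000_000_000_000
deleted_sizes = [1_000_000_000_000, 2_000_000_000_000, 500_000_000_000]

expected = expected_master_bytes(occupancy, deleted_sizes)
print("expected masterDataInBytes:", expected)

# Before a statistics update the reported value still equals the occupancy.
print("stale:", statistics_are_stale(occupancy, occupancy, deleted_sizes))
```

For a tape like FL0590, where every file has been deleted, the same calculation would be expected to give a value at or near zero after the update.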

Thanks. This works great, of course. Are there any other such things which need to be done for a functional system? (As opposed to monitoring-only tasks?)

Hi Eric,
yes, we know that having to call this somewhat obscure script is sub-optimal. It is scheduled for review at some point.

The other needed-for-production scripts/executions I can think of are more conditional, and not part of the core CTA software:

How do we repack all M8 tapes in one logical library to LTO9 tapes in another logical library using ATRESYS? Can this be done regardless of VO?

Dear Jeff,

Identifying all M8 tapes in a particular logical library is easy with these two commands (because we call M8 tapes LTO7M at CERN):

pcvlado ~ > cta-admin mt ls|grep LTO7M
     LTO7M     LTO-7  9000000000000                   93                      0             168     2696   171097 mdavis           ctadevmichael 2020-06-22 16:54  vlado ctaproductionfrontend01 2020-11-02 19:39 ctaproduction LTO-7 M8 cartridge formated at 9 TB                              

pcvlado ~ > cta-admin tape ls --mt LTO7M -l IBMLIB1-LTO8 | wc -l
6600

Once you have the list of VIDs, you can put them into a file (one VID per line) and submit all the tapes for repack using the cta-ops-repack-1-prepare -F FILE command.
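If you prefer to build the VID file from JSON rather than from the table output, a minimal Python sketch could look like this. It assumes that tape ls accepts the same --json flag used with rtf ls earlier in this thread and that each record carries the vid, mediaType and logicalLibrary fields seen in the FL0590 output. The sample records, VIDs and output file name are made up for illustration.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical sample standing in for the JSON output of a tape listing.
SAMPLE_TAPES_JSON = """
[
  {"vid": "V00001", "mediaType": "LTO7M", "logicalLibrary": "IBMLIB1-LTO8"},
  {"vid": "V00002", "mediaType": "LTO8",  "logicalLibrary": "IBMLIB1-LTO8"},
  {"vid": "V00003", "mediaType": "LTO7M", "logicalLibrary": "OTHERLIB"},
  {"vid": "V00004", "mediaType": "LTO7M", "logicalLibrary": "IBMLIB1-LTO8"}
]
"""


def select_vids(tapes, media_type, logical_library):
    """VIDs of tapes matching both the media type and the logical library."""
    return [
        t["vid"]
        for t in tapes
        if t["mediaType"] == media_type and t["logicalLibrary"] == logical_library
    ]


def write_vid_file(vids, path):
    """Write one VID per line, the layout described for the -F FILE option."""
    Path(path).write_text("".join(f"{vid}\n" for vid in vids))


tapes = json.loads(SAMPLE_TAPES_JSON)
vids = select_vids(tapes, "LTO7M", "IBMLIB1-LTO8")

vid_file = Path(tempfile.gettempdir()) / "m8_vids.txt"  # hypothetical file name
write_vid_file(vids, vid_file)
print(vid_file.read_text(), end="")
```

The resulting file can then be passed as FILE to the prepare command mentioned above.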

What is trickier is making sure that the destination tapes are LTO9 tapes in another logical library. The only way to do this is to make sure that, for each tape pool where the M8 tapes are located, the only supply tapes available come from your desired destination LTO9 logical library.
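As a starting point for that check, the Python sketch below groups the M8 tapes by tape pool, so you know which pools need their supply reviewed. It only reads the tapepool, mediaType and vid fields seen in the JSON earlier in this thread; the sample records and pool names are hypothetical, and the script neither inspects nor changes any supply configuration.

```python
from collections import defaultdict

# Hypothetical records, shaped like entries in the JSON tape listing.
tapes = [
    {"vid": "V00001", "mediaType": "LTO7M", "tapepool": "pool_a"},
    {"vid": "V00002", "mediaType": "LTO8", "tapepool": "pool_a"},
    {"vid": "V00003", "mediaType": "LTO7M", "tapepool": "pool_b"},
    {"vid": "V00004", "mediaType": "LTO7M", "tapepool": "pool_a"},
]


def pools_holding(tapes, media_type):
    """Map each tape pool to the VIDs of its tapes of the given media type."""
    pools = defaultdict(list)
    for tape in tapes:
        if tape["mediaType"] == media_type:
            pools[tape["tapepool"]].append(tape["vid"])
    return dict(pools)


m8_pools = pools_holding(tapes, "LTO7M")
for pool in sorted(m8_pools):
    print(f"{pool}: {len(m8_pools[pool])} M8 tape(s) -> check supply for this pool")
```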

Let me know if you need more explanations. Best regards,

Vladimir