What should happen during repack?

Summary

We have a repack of about 5 TB in progress, but what is happening seems very strange.

The repack read the tape just fine and placed all the contents into an NFS-mounted disk area. The archive then started, and in cta-admin dr ls everything looks fine. So far it has written over 3 TB:

G2_LTO8_DEV G2_F8C4R2 tpsrvg2105      Up ArchiveForRepack Transfer 12808 VR7100 test.cta_test_2copy_copy_1 dev_repack  1563 3.3T 255.6   29113        0        -  cephUser  13 -      

and looking at the drive logs all looks normal too:

{"epoch_time":1747853655.086340930,"local_time":"2025-05-21T13:54:15-0500","hostname":"tpsrvg2105","program":"cta-taped","log_level":"INFO","pid":3027568,"tid":3062633,"message":"File successfully read from disk","drive_name":"G2_F8C4R2","instance":"dev","sched_backend":"cephUser","thread":"DiskRead","tapeDrive":"G2_F8C4R2","tapeVid":"VR7100","mountId":"29113","vo":"dev_repack","tapePool":"test.cta_test_2copy_copy_1","threadID":7,"path":"file:////pnfs/Migration/VR1871/000000029","actualURL":"file:////pnfs/Migration/VR1871/000000029","fileId":54454163,"readWriteTime":3.632026,"checksumingTime":0.0,"waitFreeMemoryTime":69.5891090000001,"waitDataTime":0.0,"waitReportingTime":0.0,"checkingErrorTime":0.000555000000000003,"openingTime":0.007903,"transferTime":73.229596,"totalTime":73.229596,"dataVolume":2097152000,"globalPayloadTransferSpeedMBps":28.6380386421905,"diskPerformanceMBps":28.6380386421905,"openRWCloseToTransferTimeRatio":0.04970570915071}
{"epoch_time":1747853655.096964629,"local_time":"2025-05-21T13:54:15-0500","hostname":"tpsrvg2105","program":"cta-taped","log_level":"INFO","pid":3027568,"tid":3062633,"message":"Opened disk file for read","drive_name":"G2_F8C4R2","instance":"dev","sched_backend":"cephUser","thread":"DiskRead","tapeDrive":"G2_F8C4R2","tapeVid":"VR7100","mountId":"29113","vo":"dev_repack","tapePool":"test.cta_test_2copy_copy_1","threadID":7,"path":"file:////pnfs/Migration/VR1871/000000380","actualURL":"file:////pnfs/Migration/VR1871/000000380","fileId":54452244}
{"epoch_time":1747853661.966283774,"local_time":"2025-05-21T13:54:21-0500","hostname":"tpsrvg2105","program":"cta-taped","log_level":"INFO","pid":3027568,"tid":3062637,"message":"File successfully transmitted to drive","drive_name":"G2_F8C4R2","instance":"dev","sched_backend":"cephUser","thread":"TapeWrite","tapeDrive":"G2_F8C4R2","tapeVid":"VR7100","mountId":"29113","vo":"dev_repack","tapePool":"test.cta_test_2copy_copy_1","mediaType":"LTO7M","logicalLibrary":"G2_LTO8_DEV","mountType":"ArchiveForRepack","vendor":"Unknown","capacityInBytes":9000000000000,"fileId":54452357,"fileSize":2097152000,"fSeq":1543,"diskURL":"file:////pnfs/Migration/VR1871/000000368","readWriteTime":5.97889,"checksumingTime":0.901273,"waitDataTime":0.00250699999999999,"waitReportingTime":0.000175,"transferTime":6.882845,"totalTime":6.882817,"dataVolume":2097152000,"headerVolume":480,"driveTransferSpeedMBps":304.693918202387,"payloadTransferSpeedMBps":304.6938484635,"reconciliationTime":1699296224,"LBPMode":"LBP_On"}

with messages like these repeating. We seem to have had several mounts of the destination tape, with these messages:

{"epoch_time":1747835106.071864160,"local_time":"2025-05-21T08:45:06-0500","hostname":"tpsrvg2105","program":"cta-taped","log_level":"INFO","pid":2357472,"tid":2925532,"message":"No more data to write on tape, unconditional flushing to the client","drive_name":"G2_F8C4R2","instance":"dev","sched_backend":"cephUser","thread":"TapeWrite","tapeDrive":"G2_F8C4R2","tapeVid":"VR7100","mountId":"29112","vo":"dev_repack","tapePool":"test.cta_test_2copy_copy_1","mediaType":"LTO7M","logicalLibrary":"G2_LTO8_DEV","mountType":"ArchiveForRepack","vendor":"Unknown","capacityInBytes":9000000000000,"files":2,"bytes":4194304000,"flushTime":1.949508}
{"epoch_time":1747835106.086162223,"local_time":"2025-05-21T08:45:06-0500","hostname":"tpsrvg2105","program":"cta-taped","log_level":"INFO","pid":2357472,"tid":2925532,"message":"Logging mount general statistics","drive_name":"G2_F8C4R2","instance":"dev","sched_backend":"cephUser","thread":"TapeWrite","tapeDrive":"G2_F8C4R2","tapeVid":"VR7100","mountId":"29112","vo":"dev_repack","tapePool":"test.cta_test_2copy_copy_1","driveManufacturer":"IBM     ","driveType":"ULT3580-TD8     ","firmwareVersion":"Q3A0","serialNumber":"0007880A1B","mountTotalNonMediumErrorCounts":0}
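
For anyone who wants to check how often each of these messages occurs per mount, here is a minimal sketch. It assumes only what is visible in the records above: one JSON object per line with "message" and "mountId" keys. The function name tally_messages and the shortened sample records are made up for illustration; in practice you would feed it the lines of the cta-taped log file.

```python
import json
from collections import Counter

def tally_messages(lines):
    """Count log records per (mountId, message); skip anything that is not a JSON object."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            rec = json.loads(line)
        except json.JSONDecodeError:
            continue  # e.g. a record that was wrapped across several lines
        if not isinstance(rec, dict):
            continue
        counts[(str(rec.get("mountId", "-")), str(rec.get("message", "-")))] += 1
    return counts

# Shortened, illustrative records using the message texts quoted in this thread.
sample = [
    '{"message":"File successfully read from disk","mountId":"29113"}',
    '{"message":"File successfully transmitted to drive","mountId":"29113"}',
    '{"message":"Logging mount general statistics","mountId":"29112"}',
]

for (mount, msg), n in sorted(tally_messages(sample).items()):
    print(mount, n, msg)
```

Records that were broken across lines when pasted will not parse and are silently skipped, so the counts are a lower bound.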

Even that doesn’t look bad, but when we look at the output of cta-admin repack ls we see no progress: archivedBytes is 0, the failedToArchive file and byte counts are non-zero, and destinationInfos is empty. Listing the tape files for the destination tape also returns nothing.

One thing that is odd, but seems OK to me, is that the tape pool of the source tape is not the same as the tape pool of the destination tape. As far as I can tell this is fine, because the archive routes for the storage class of the files changed in the interim, and the files are correctly routed according to the storage class, the new archive routes, and the new tape pools.

Any ideas of what might be wrong or what to check? We’re trying this in our dev environment first of course.

Dear Eric,

We discussed this issue today, but we are not sure we are best placed to understand what you are doing.

We learned from DESY that when using CTA with dCache there is no need for low-level repack: dCache will just move the files from one tape to another.

Maybe @timkrtch or @Mwai can comment?

Best regards,

Vladimir

Hi Vlado,

What we are trying to do is run the repack procedure using a repack buffer that is a mounted NFS file system (served by dCache, but that should not be relevant; it could be any NFS server or a local file system).

I believe we used to run it successfully in 5.10.

I think, for clarity, @ewv, we need to provide all the commands that we used plus the relevant configuration.

(As part of the dCache team, I am personally not aware of any method by which dCache could move files from one tape to another other than CTA using a buffer, or what you call “low level repack”, so I am also interested.)

Hi all,

We perform low-level repacking using a buffer shared among the tape movers, with dCache playing no role in the tape repacking process.

regards,
mwai

Let me come back to this question and why I asked. We have a repack of a bad tape running, and from dr ls we see the following:

  {
    "logicalLibrary": "F2_LTO9",
    "driveName": "F2_F9B3D3",
    "host": "tpsrvf2216",
    "desiredDriveState": "UP",
    "mountType": "ARCHIVE_FOR_REPACK",
    "driveStatus": "TRANSFERRING",
    "driveStatusSince": "36652",
    "vid": "FB7345",
    "tapepool": "uboone.data_reco",
    "filesTransferredInSession": "16488",
    "bytesTransferredInSession": "14134607185731",
    "sessionId": "73083",
    "timeSinceLastUpdate": "14",
    "currentPriority": "0",
    "currentActivity": "",
    "ctaVersion": "5.11.2.3-1",
    "devFileName": "/dev/tape/by-id/scsi-103300575A-nst",
    "rawLibrarySlot": "smc10",
    "comment": "",
    "reason": "",
    "vo": "prd_repack",
    "diskSystemName": "",
    "reservedBytes": "0",
    "sessionElapsedTime": "36990",
    "logicalLibraryDisabled": false,
    "physicalLibrary": "F2",
    "physicalLibraryDisabled": false,
    "schedulerBackendName": "cephUser"
  },

So we have written 14 TB of data from the repack onto tape. That looks great. However, when I look at that tape:

  {
    "vid": "FB7345",
    "mediaType": "LTO9",
    "vendor": "Fujifilm",
    "logicalLibrary": "F2_LTO9",
    "tapepool": "uboone.data_reco",
    "vo": "uboone",
    "encryptionKeyName": "-",
    "capacity": "18000000000000",
    "occupancy": "2594542786018",
    "lastFseq": "5801",
    "full": false,
    "fromCastor": false,
    "readMountCount": "0",
    "writeMountCount": "58",
    "comment": "",
    "nbMasterFiles": "5801",
    "masterDataInBytes": "2594542786018",
    "state": "ACTIVE",
    "stateReason": "",
    "stateUpdateTime": "1746623502",
    "stateModifiedBy": "jonest@cta03",
    "dirty": false,
    "verificationStatus": "",
    "purchaseOrder": "",
    "physicalLibrary": "F2",
    "labelFormat": "CTA"
  }

I only see the 2.6 TB of data that was on that tape in the first place (written before the repack was started). I would expect to see the byte and fSeq counters on this tape incrementing as we go along.
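
To make the mismatch concrete, here is a minimal sketch of the comparison I am doing by eye. It assumes only the fields shown in the two JSON records above (bytesTransferredInSession from dr ls, occupancy from the tape listing), which arrive as strings. The function name unrecorded_bytes and the occupancy_at_start argument are hypothetical; the starting occupancy has to be noted separately before the repack begins.

```python
def unrecorded_bytes(drive, tape, occupancy_at_start):
    """Bytes the drive session reports as written that the tape record does not yet show."""
    session_bytes = int(drive["bytesTransferredInSession"])
    recorded_growth = int(tape["occupancy"]) - int(occupancy_at_start)
    return session_bytes - recorded_growth

# Values copied from the two records above; the tape held the same
# 2594542786018 bytes before the repack started.
drive = {"vid": "FB7345", "bytesTransferredInSession": "14134607185731"}
tape = {"vid": "FB7345", "occupancy": "2594542786018"}

gap = unrecorded_bytes(drive, tape, 2594542786018)
print(f"{gap / 1e12:.2f} TB written in session but not reflected on the tape record")
```

A gap that keeps growing over the whole session, as it does here, is what made me suspicious; I do not know at which point the tape counters are normally updated during a mount.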

To conclude this thread for the community - the problem turned out to be caused by files with owner_uid=0.

This is normally not allowed, but it can happen during the import of files from another system.

The only workaround for now is to make sure the files are owned by a non-root user.
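
As a rough illustration of the check implied by this workaround, here is a minimal sketch that flags root-owned entries before a repack is started. The record layout is hypothetical: it assumes you have already exported the file metadata for the tape into a list of dicts with an owner_uid key (the name used above) and an archive_file_id key. How you obtain that list depends on your site, and the IDs below are made up.

```python
def root_owned(files):
    """Return the records whose owner UID is 0 (root)."""
    return [f for f in files if int(f["owner_uid"]) == 0]

# Made-up records, for illustration only.
files = [
    {"archive_file_id": 1, "owner_uid": 0},
    {"archive_file_id": 2, "owner_uid": 1000},
    {"archive_file_id": 3, "owner_uid": "0"},  # UIDs may arrive as strings
]

for f in root_owned(files):
    print("root-owned file:", f["archive_file_id"])
```

An empty result only says that the exported records contain no owner UID of 0; it does not prove the repack will succeed.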