Dear George,
We discussed this use case at our weekly ops meeting, and it was decided that I will take over and write proper documentation for the CTA Tape Verify framework.
Let's start from this page:
The tape verification framework consists of three components:
/opt/cta-ops/ops-venv/bin/cta-ops-verification-feeder (Python script) first selects the tapes to be verified, based on various criteria.
cta-ops-verify-tape (Python script) then selects which files on each selected tape should be verified, based on various criteria.
/usr/bin/cta-verify-file (binary) finally submits verification requests to the queue for the selected files from the given tape.
This is what the execution of cta-ops-verification-feeder should look like:
[tape-local@ctaproductionfrontend11 ~]$ /opt/cta-ops/ops-venv/bin/cta-ops-verification-feeder --maxverify 20 --min_data_on_tape 1000000000000 --verify_options "--first 10 --last 10 --read_time 30"
2025-09-12 11:00:37 [INFO] [main] Executed as: /opt/cta-ops/ops-venv/bin/cta-ops-verification-feeder --maxverify 20 --min_data_on_tape 1000000000000 --verify_options --first 10 --last 10 --read_time 30
2025-09-12 11:00:56 [INFO] [main] Currently running verification on tapes: I04202, I04212, L50613, L53219, L60572, L60616, L65149, L65568, L65760, L92019, L92787, L98074
2025-09-12 11:00:56 [INFO] [main] 12 tapes are currently being verified, target is 20
2025-09-12 11:00:56 [INFO] [main] 51937 tapes are eligible for verification
2025-09-12 11:00:56 [INFO] [main] --min_data_on_tape specified, selecting only tapes with at least 1000000000000 bytes written
2025-09-12 11:00:56 [INFO] [main] After selecting only tapes with at least 1000000000000 bytes written, 51873 are eligible
2025-09-12 11:00:57 [INFO] [main] Based on policy: random, following new tapes have been selected for verification: I56818, L50997, L95366, I75988, L97267, I62014, I76537, I56470
2025-09-12 11:00:57 [INFO] [main] Submitting verification for tape I56818 (media type: 3592JD15T, logical library: IBMLIB4-TS1160, tape pool: vo_ATLAS_raw, total files: 9050, total bytes: 15449035707560) using command: cta-ops-verify-tape --vid I56818 --first 10 --last 10 --read_time 30
2025-09-12 11:01:02 [INFO] [main] Tape I56818 successfully submitted for verification.
2025-09-12 11:01:02 [INFO] [main] Waiting 120 seconds to start the next verification job...
2025-09-12 11:03:02 [INFO] [main] Submitting verification for tape L50997 (media type: LTO9, logical library: IBMLIB1-LTO9, tape pool: vo_CMS_2025, total files: 4648, total bytes: 18747095131345) using command: cta-ops-verify-tape --vid L50997 --first 10 --last 10 --read_time 30
2025-09-12 11:03:04 [INFO] [main] Tape L50997 successfully submitted for verification.
2025-09-12 11:03:04 [INFO] [main] Waiting 120 seconds to start the next verification job...
:
:
:
2025-09-12 11:15:35 [INFO] [main] All verification jobs submitted successfully
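The numbers in this log fit a simple top-up rule: 12 tapes are already being verified, the target is 20, so 8 new tapes are picked at random from the eligible ones, and the --verify_options string is appended to the cta-ops-verify-tape command for each of them. Here is a minimal Python sketch of that logic. The function names and the dictionary layout of a tape record are made up for illustration; this is not the feeder's actual code.

```python
import random
import shlex


def select_tapes(eligible, running, max_verify, min_data_on_tape=0, rng=random):
    """Pick enough tapes to bring the number of running verifications up to max_verify.

    eligible: list of dicts with (hypothetical) keys "vid" and "bytes".
    running:  set of VIDs that are already being verified.
    """
    slots = max(0, max_verify - len(running))
    candidates = [
        tape for tape in eligible
        if tape["bytes"] >= min_data_on_tape and tape["vid"] not in running
    ]
    # "random" policy: draw without replacement
    return rng.sample(candidates, min(slots, len(candidates)))


def build_verify_command(vid, verify_options, tool="cta-ops-verify-tape"):
    """Compose the per-tape command the way it appears in the feeder log."""
    return [tool, "--vid", vid, *shlex.split(verify_options)]


if __name__ == "__main__":
    print(" ".join(build_verify_command("I56818", "--first 10 --last 10 --read_time 30")))
```

With 12 running tapes and --maxverify 20, select_tapes returns at most 8 tapes, which matches the 8 VIDs listed in the log.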
This is an example of what cta-ops-verify-tape would do:
2025-09-12 11:00:57 [INFO] [verify_tape] Running verify-tape for tape with vid I56818, read speed of 300 MB/s, and data size: 566.2G
2025-09-12 11:00:59 [INFO] [partial_tape_scan] Performing partial tape scan.
2025-09-12 11:00:59 [INFO] [verify_tape] Verifying 30 files and 56.3G from tape I56818
2025-09-12 11:00:59 [INFO] [verify_files] ArchiveId of files to verify: 1723971173,1723969110,1723973684,1723970920,1723971908,1723971012,1723963741,1723962770,1723952600,1723972380,1723972082,1723986609,1723990392,1723998596,1724020618,1724020070,1724107269,1724125779,1724126480,1724135833,1724159502,1724159687,1724157621,1724159132,1724158227,1724158771,1724158420,1724159497,1724159135,1724158162
2025-09-12 11:01:01 [INFO] [verify_files] All file verifications queued for tape I56818
2025-09-12 11:03:02 [INFO] [verify_tape] Running verify-tape for tape with vid L50997, read speed of 300 MB/s, and data size: 566.2G
2025-09-12 11:03:03 [INFO] [partial_tape_scan] Performing partial tape scan.
2025-09-12 11:03:03 [INFO] [verify_tape] Verifying 30 files and 115.5G from tape L50997
2025-09-12 11:03:03 [INFO] [verify_files] ArchiveId of files to verify: 4885744601,4885744597,4885744787,4885744616,4885744609,4885744645,4885744729,4885744782,4885744709,4885744789,4885744620,4885758367,4885959003,4885962075,4885965022,4885971897,4885979944,4886016608,4886026103,4886065410,4886084467,4886084285,4886084402,4886084468,4886084471,4886084282,4886084411,4886084388,4886084397,4886084448
2025-09-12 11:03:04 [INFO] [verify_files] All file verifications queued for tape L50997
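Two numbers in this output can be cross-checked. The "data size: 566.2G" is what you get from 30 minutes of reading at 300 MiB/s (300 x 1024^2 x 1800 = 566,231,040,000 bytes), so --read_time appears to be in minutes and "MB/s" appears to mean binary megabytes; that is my reading of the numbers, not a documented fact. The 30 files per tape match --first 10 and --last 10 plus default_random: 10 from the configuration. The sketch below reproduces both calculations. The function names are hypothetical, and it does not model how the data-size budget limits the file selection in the real script.

```python
import random

MIB = 1024 ** 2


def read_budget_bytes(read_time_minutes, read_speed_mib_s=300):
    """Bytes readable in the given time, assuming minutes and MiB/s."""
    return read_time_minutes * 60 * read_speed_mib_s * MIB


def pick_fseqs(fseqs, first=10, last=10, rand=10, rng=random):
    """Partial scan: the first N, the last N and a random sample of the files in between."""
    ordered = sorted(fseqs)
    if len(ordered) <= first + last:
        return ordered  # small tape: verify everything
    head = ordered[:first]
    tail = ordered[len(ordered) - last:]
    middle = ordered[first:len(ordered) - last]
    sampled = sorted(rng.sample(middle, min(rand, len(middle))))
    return head + sampled + tail


if __name__ == "__main__":
    print(read_budget_bytes(30))                # 566231040000
    print(len(pick_fseqs(range(1, 9051))))      # 30
```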
These requests are then submitted using the verification mount policy:
[tape-local@ctaproductionfrontend11 cta-ops]$ cta-admin mp ls | egrep "c.user|verification"
mount policy a.priority a.minAge r.priority r.minAge c.user c.host c.time m.user m.host m.time instance comment
verification 50 14400 50 600 vlado ctaproductionfrontend01 2021-08-12 08:55 vlado ctaproductionfrontend02 2022-09-06 12:01 ctaproduction Tape Media Verification framework mount policy
and look like this when queued:
[tape-local@ctaproductionfrontend11 ~]$ cta-admin sq | egrep "I56818|L50997"
Retrieve ctaproduction cephUser vo_ATLAS_raw ATLAS IBMLIB4-TS1160 I56818 30 56.3G 908 906 50 600 10 50 0 0 0 15.0T 9050 15.4T 1 0
Retrieve ctaproduction cephUser vo_CMS_2025 CMS IBMLIB1-LTO9 L50997 0 0 0 0 0 0 10 60 1 10 41.4G 18.0T 4648 18.7T 1 0
The second one (L50997) is already running on a drive:
[tape-local@ctaproductionfrontend11 ~]$ cta-admin dr ls | grep L50997
IBMLIB1-LTO9 IBMLIB1-LTO9-F09C1R1 tpsrv449 Up Retrieve Transfer 151 L50997 vo_CMS_2025 CMS 11 44.8G 269.6 2002091 0 - cephUser ctaproduction 12 -
[root@tpsrv449 cta]# cta-admin dr ls first
library drive host desired request status since vid tapepool vo files data MB/s session priority activity scheduler instance age reason
IBMLIB1-LTO9 IBMLIB1-LTO9-F09C1R1 tpsrv449 Up Retrieve Transfer 481 L50997 vo_CMS_2025 CMS 27 102.4G 206.4 2002091 0 - cephUser ctaproduction 12 -
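Between the two snapshots the drive went from 11 to 27 of the 30 queued files, and the session number 2002091 is the mountId to look for in the cta-taped log. If you want to follow this from a script, a quick way is to pair the header line with the data line. This is only a sketch: it splits on whitespace, so it works for rows like the one above, where no field (for example the reason) contains spaces.

```python
HEADER = ("library drive host desired request status since vid tapepool vo files "
          "data MB/s session priority activity scheduler instance age reason")
ROW = ("IBMLIB1-LTO9 IBMLIB1-LTO9-F09C1R1 tpsrv449 Up Retrieve Transfer 481 L50997 "
       "vo_CMS_2025 CMS 27 102.4G 206.4 2002091 0 - cephUser ctaproduction 12 -")


def parse_drive_row(header, row):
    """Map the column names of 'cta-admin dr ls' onto the values of one row."""
    names = header.split()
    values = row.split()
    if len(names) != len(values):
        raise ValueError("column count mismatch: a field probably contains spaces")
    return dict(zip(names, values))


drive = parse_drive_row(HEADER, ROW)
print(drive["vid"], drive["status"], drive["files"], drive["data"], drive["session"])
```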
In the cta-taped log file on the tape server you should see lines like these:
{"epoch_time":1757668849.349325545,"local_time":"2025-09-12T11:20:49+0200","hostname":"tpsrv449","program":"cta-taped","log_level":"INFO","pid":1533109,"tid":1533343,"message":"File successfully read from tape","drive_name":"IBMLIB1-LTO9-F09C1R1","instance":"ctaproduction","sched_backend":"cephUser","thread":"TapeRead","tapeDrive":"IBMLIB1-LTO9-F09C1R1","tapeVid":"L50997","mountId":"2002091","vo":"CMS","tapePool":"vo_CMS_2025","mediaType":"LTO9","logicalLibrary":"IBMLIB1-LTO9","mountType":"Retrieve","labelFormat":"0000","vendor":"IBM-SONY","capacityInBytes":18000000000000,"fileId":4886084411,"BlockId":71511836,"fSeq":4645,"dstURL":"file://dummy","isRepack":false,"isVerifyOnly":true,"positionTime":0.039334,"readWriteTime":6.650987,"waitFreeMemoryTime":4e-05,"waitReportingTime":0.00243800000000001,"transferTime":6.653465,"totalTime":6.693677,"dataVolume":2797578535,"headerVolume":480,"driveTransferSpeedMBps":417.943533128354,"payloadTransferSpeedMBps":417.943461418888,"LBPMode":"LBP_On","repackFilesCount":0,"repackBytesCount":0,"userFilesCount":0,"userBytesCount":0,"verifiedFilesCount":1,"verifiedBytesCount":2797578535,"checksumType":"ADLER32","checksumValue":"e99533da"}
{"epoch_time":1757668849.349429577,"local_time":"2025-09-12T11:20:49+0200","hostname":"tpsrv449","program":"cta-taped","log_level":"INFO","pid":1533109,"tid":1533347,"message":"File successfully verified","drive_name":"IBMLIB1-LTO9-F09C1R1","instance":"ctaproduction","sched_backend":"cephUser","thread":"DiskWrite","tapeDrive":"IBMLIB1-LTO9-F09C1R1","tapeVid":"L50997","mountId":"2002091","vo":"CMS","tapePool":"vo_CMS_2025","threadCount":10,"threadID":3,"fileId":4886084411,"dstURL":"file://dummy","fSeq":4645,"readWriteTime":0.0,"checksumingTime":0.0,"waitDataTime":174.262309,"waitReportingTime":0.000122,"checkingErrorTime":0.0,"openingTime":0.0,"closingTime":0.0,"transferTime":174.262696,"totalTime":174.262696,"dataVolume":0,"globalPayloadTransferSpeedMBps":0.0,"diskPerformanceMBps":0.0,"openRWCloseToTransferTimeRatio":0.0}
Note that the destination is a dummy file ("dstURL":"file://dummy") and that this is a verification session ("isVerifyOnly":true).
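Since the cta-taped log is one JSON object per line, these entries are easy to pick out with a few lines of Python. The sketch below uses only field names visible in the log lines above; the function name is my own, and the sample lines are trimmed copies of the real ones.

```python
import json


def verify_only_reads(lines, vid):
    """Yield (fileId, fSeq, dataVolume) for verify-only reads of the given tape."""
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip anything that is not a JSON log record
        try:
            rec = json.loads(line)
        except json.JSONDecodeError:
            continue
        if (rec.get("tapeVid") == vid
                and rec.get("message") == "File successfully read from tape"
                and rec.get("isVerifyOnly") is True):
            yield rec["fileId"], rec["fSeq"], rec["dataVolume"]


SAMPLE = [
    '{"message":"File successfully read from tape","tapeVid":"L50997",'
    '"fileId":4886084411,"fSeq":4645,"dstURL":"file://dummy",'
    '"isVerifyOnly":true,"dataVolume":2797578535}',
    '{"message":"File successfully verified","tapeVid":"L50997",'
    '"fileId":4886084411,"fSeq":4645,"dstURL":"file://dummy","dataVolume":0}',
]

for file_id, fseq, nbytes in verify_only_reads(SAMPLE, "L50997"):
    print(file_id, fseq, nbytes)
```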
The final message of the session is:
{"epoch_time":1757669252.570209822,"local_time":"2025-09-12T11:27:32+0200","hostname":"tpsrv449","program":"cta-taped","log_level":"INFO","pid":7996,"tid":7996,"message":"Tape session finished","drive_name":"IBMLIB1-LTO9-F09C1R1","instance":"ctaproduction","sched_backend":"cephUser","capacityInBytes":"18000000000000","logicalLibrary":"IBMLIB1-LTO9","mediaType":"LTO9","mountAttempted":"1","mountId":"2002091","mountType":"Retrieve","tapePool":"vo_CMS_2025","tapeVid":"L50997","vendor":"IBM-SONY","vo":"CMS","volReqId":"2002091","wasTapeMounted":"1","mountTime":"18.853548","positionTime":"272.10096","waitInstructionsTime":"0.554444","waitFreeMemoryTime":"0.002791","waitDataTime":"0.0","waitReportingTime":"0.107093","checksumingTime":"0.0","readWriteTime":"295.94293","flushTime":"0.0","unloadTime":"242.245613","unmountTime":"20.070627","encryptionControlTime":"0.008753","transferTime":"296.607258","totalTime":"849.529503","deliveryTime":"587.606845","drainingTime":"0.0","dataVolume":"115527137288","filesCount":"30","headerVolume":"14400","payloadTransferSpeedMBps":"135.989552899612","driveTransferSpeedMBps":"135.989569850172","repackFilesCount":"0","userFilesCount":"0","verifiedFilesCount":"30","repackBytesCount":"0","userBytesCount":"0","verifiedBytesCount":"115527137288","status":"success","tapeDrive":"IBMLIB1-LTO9-F09C1R1","subprocessPid":1533109,"exitCode":0,"killSignal":0}
where you should note the non-zero verification counters: "verifiedFilesCount":"30" and "verifiedBytesCount":"115527137288".
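To check a finished session from a script, keep in mind that in this final message the counters and timings are JSON strings, not numbers. The sketch below converts them and recomputes the payload speed as verifiedBytesCount / totalTime, which for this session gives about 136 MB/s and agrees with the logged payloadTransferSpeedMBps. The function name is hypothetical, and the sample is a trimmed copy of the line above.

```python
import json


def summarize_session(line):
    """Extract the verification counters from a 'Tape session finished' record."""
    rec = json.loads(line)
    if rec.get("message") != "Tape session finished":
        raise ValueError("not a session summary record")
    verified_bytes = int(rec["verifiedBytesCount"])   # logged as a string
    total_time = float(rec["totalTime"])              # logged as a string
    return {
        "vid": rec["tapeVid"],
        "status": rec["status"],
        "verified_files": int(rec["verifiedFilesCount"]),
        "verified_bytes": verified_bytes,
        "payload_mb_per_s": verified_bytes / total_time / 1e6,
    }


SAMPLE = ('{"message":"Tape session finished","tapeVid":"L50997","mountId":"2002091",'
          '"totalTime":"849.529503","verifiedFilesCount":"30",'
          '"verifiedBytesCount":"115527137288","status":"success"}')

summary = summarize_session(SAMPLE)
print(summary)
```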
To configure all of this, there is the above-mentioned verification mount policy, plus these settings in the various configuration files:
[/etc/cta/cta-cli.conf]:
eos.instance eosctabogus
eos.requester.user verification
eos.requester.group it
[/etc/cta/cta-frontend-xrootd.conf]:
cta.verification.mount_policy verification
[/etc/cta-ops/cta-ops-config.yaml]:
# -------------------------------
# CTA Tape Verification
# -------------------------------
cta-ops-tape-verify:
  debug: false
  logger:
    log_dir: "/var/log/cta-ops/verification/"
  cta-ops-verify-tape:
    default_read_data_size: '0B'
    default_read_time: 0
    default_first: 10
    default_random: 10
    default_last: 10
  cta-ops-verification-feeder:
    verification_mount_policy: 'verification'
    default_min_age: 0
    default_max_verify: 10
    default_min_data_on_tape: 0
    default_min_relative_capacity: 0
    default_verify_options: '--first 10 --last 10 --read_time 30'
    default_verify_policy: 'random'
    default_tape_verify_path: 'cta-ops-verify-tape'
    default_feeder_log_path: '/var/log/cta/verification/cta-verification-feeder.log'
    ts_format: '%Y-%m-%d %H:%M:%S'
    sleep_time: 120 # 2 * 60
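The default_* keys are the values the feeder falls back on when the corresponding command-line option is not given; in the run shown earlier, --maxverify 20 overrides default_max_verify: 10. The sketch below illustrates that pattern with argparse. The option names are the ones used on the command line above, but which option maps to which key is my assumption, and the defaults are copied into a plain dictionary here instead of being read from the YAML file.

```python
import argparse

# Values copied from the cta-ops-verification-feeder section of the config
FEEDER_DEFAULTS = {
    "default_max_verify": 10,
    "default_min_data_on_tape": 0,
    "default_verify_options": "--first 10 --last 10 --read_time 30",
}


def build_parser(defaults):
    """Command-line options that fall back on the configured defaults."""
    parser = argparse.ArgumentParser(prog="cta-ops-verification-feeder")
    parser.add_argument("--maxverify", type=int,
                        default=defaults["default_max_verify"])
    parser.add_argument("--min_data_on_tape", type=int,
                        default=defaults["default_min_data_on_tape"])
    parser.add_argument("--verify_options",
                        default=defaults["default_verify_options"])
    return parser


args = build_parser(FEEDER_DEFAULTS).parse_args(
    ["--maxverify", "20", "--min_data_on_tape", "1000000000000"])
print(args.maxverify, args.min_data_on_tape, args.verify_options)
```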
[/etc/cta-ops/error-messages.yaml]:
cta-ops-tape-verify:
  cta-verify-file:
    - tool_string: "Optional string parameter :MOUNT_POLICY_NAME is an empty string"
      translation: "Verification Mount Policy for this verification request is not correctly defined."
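The idea of this file is that a cryptic message from cta-verify-file is replaced by something an operator can act on. A minimal sketch of such a lookup, assuming a plain substring match (the real matching rule may differ, and the function name is mine):

```python
# Same content as the cta-verify-file entry of error-messages.yaml
ERROR_MESSAGES = {
    "cta-verify-file": [
        {
            "tool_string": "Optional string parameter :MOUNT_POLICY_NAME is an empty string",
            "translation": "Verification Mount Policy for this verification request "
                           "is not correctly defined.",
        },
    ],
}


def translate_error(tool, output, messages=ERROR_MESSAGES):
    """Return the operator-friendly translation, or the raw output if none matches."""
    for entry in messages.get(tool, []):
        if entry["tool_string"] in output:
            return entry["translation"]
    return output


print(translate_error(
    "cta-verify-file",
    "Optional string parameter :MOUNT_POLICY_NAME is an empty string"))
```

If you ever see this translated message, the first thing to check is cta.verification.mount_policy in cta-frontend-xrootd.conf.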
I suggest you try to reproduce our setup, starting with cta-verify-file to submit a verification request for a single file on any tape, just to see whether it gets queued.
Example:
[tape-local@ctaproductionfrontend11 cta-ops]$ /usr/bin/cta-verify-file --vid L50997 --id 4885744601
RetrieveRequest-Frontend-ctaproductionfrontend11.cern.ch-1130-20250706-09:41:39-0-38711
[tape-local@ctaproductionfrontend11 cta-ops]$ cta-admin sq|grep L50997
Retrieve ctaproduction cephUser vo_CMS_2025 CMS IBMLIB1-LTO9 L50997 1 3.9G 14 14 50 600 10 60 0 0 0 18.0T 4648 18.7T 1 0
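If you prefer to script this test, the two commands can be wrapped as follows. This is a sketch only: it uses exactly the options shown above (cta-verify-file --vid and --id, then cta-admin sq) and assumes both commands are available on the host. The runner argument is there so that the logic can be tried without a CTA installation.

```python
import subprocess


def run(cmd):
    """Run a command and return its standard output; raises if it fails."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def submit_and_check(vid, archive_id, runner=run):
    """Submit one file for verification, then list the queue lines for that tape."""
    request_id = runner(
        ["/usr/bin/cta-verify-file", "--vid", vid, "--id", str(archive_id)]).strip()
    queue = runner(["cta-admin", "sq"])
    queued = [line for line in queue.splitlines() if vid in line.split()]
    return request_id, queued


if __name__ == "__main__":
    request, lines = submit_and_check("L50997", 4885744601)
    print(request)
    print("\n".join(lines) or "nothing queued for this tape")
```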
If anything needs further clarification, I will try to explain it here. Once we have all the information in place, I will turn this ticket into documentation.
Please let me know how it goes.
Vladimir Bahyl
CERN