Use of cta-ops-verify-tape

Hello,

Would it be possible to get some clues on how to use cta-ops-verify-tape tool?

I tried to run the tool on one of our preprod tapes using the default config options in cta-operations-utilities/cta-ops-config.yaml, but I got the following error:

2025-08-19 10:24:38 [INFO] [verify_tape] Running verify-tape for tape with vid JL0504, read speed of 300 MB/s, and data size: 0B
2025-08-19 10:24:43 [INFO] [partial_tape_scan] Performing partial tape scan.
2025-08-19 10:24:50 [INFO] [verify_tape] Verifying 30 files and 31.5M from tape JL0504
2025-08-19 10:24:50 [INFO] [verify_files] ArchiveId of files to verify: 4294991264,4294991252,4294991247,4294991253,4294991267,4294991266,4294991257,4294991268,4294991259,4294991270,4295021090,4295022162,4295033073,4295038281,4295038974,4295042002,4295044056,4295044384,4295049625,4295049913,4294983452,4294983451,4294983455,4294983453,4294983454,4294983456,4294983460,4294983457,4294983462,4294983458
2025-08-19 10:24:50 [CRITICAL] [log_and_exit] Could not submit verification request for archiveId 4294991264 of tape JL0504 (STDERR: instance must be specified in /etc/cta/cta-cli.conf)

I can see this option in /etc/cta/cta-cli.conf.example.

Also, don't we need to specify a drive for the examined tape to be mounted on?

Thanks,

George

Hi George,

I can see this option in /etc/cta/cta-cli.conf.example.

Could you check that you have this option set in your /etc/cta/cta-cli.conf file as well (not just the example config file)?
The error is odd, as the tapeadmin library should specify a made-up instance ("eosctabogus" by default), in order to clearly distinguish the verification job from production activity in the logs. It should be this bit in the code here: Files · master · CTA / CTA Operations Utilities · GitLab

For the verification, the tool calls the CTA command cta-verify-file under the hood. Could you check that it works by running something like cta-verify-file --instance eosctabogus --id 4294991264 --vid JL0504, and see if that succeeds?

Also, don't we need to specify a drive for the examined tape to be mounted on?

No, there should be no need. From what I remember (perhaps someone else can fact check?), the command will queue the file for verification on the CTA-side, and CTA will use any available & eligible drive for that particular tape to schedule/perform it.

Hi Richard,

Thanks for the reply.

I meant to say that I cannot see the instance option in /etc/cta/cta-cli.conf.example, so that I can add it to my /etc/cta/cta-cli.conf (which does exist). I tried to run the cta-verify-file command you suggested and I got the same error as when I ran cta-ops-verify-tape itself:

2025-08-19 12:10:50 [CRITICAL] [log_and_exit] Could not submit verification request for archiveId 4294991264 of tape JL0504 (STDERR: Error in Google Protocol Buffers: Instance name "eosctabogus" does not match key identifier "cta-admin"

Ah, then I get you.

Upon closer inspection, it looks like that config example is out of date…
Please set the `eos.instance eosctabogus` option in the /etc/cta/cta-cli.conf file on the CTA frontend node you use for operator actions.
The cta-verify-file tool has a minor weakness in that it first checks for the presence of that option in the config file, and only afterwards overwrites the config option with any optionally provided command line argument.
That should do it, I hope.
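For clarity, the line in question would look something like this (a sketch; eosctabogus is just the default bogus instance name mentioned above, adjust if your setup differs):

```
# /etc/cta/cta-cli.conf
eos.instance  eosctabogus
```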

I added eos.instance eosctabogus to the /etc/cta/cta-cli.conf on the operator frontend, but:

(venv) [root@cta-adm-preprodfac georgep]# cta-verify-file --instance eosctabogus --id 4294991264 --vid JL0504
instance must be specified in /etc/cta/cta-cli.conf

Adding the same option in /etc/cta/cta-cli.conf on the CTA admin node where I run the tool results in:

(venv) [root@cta-adm-preprodfac georgep]# cta-verify-file --instance eosctabogus --id 4294991264 --vid JL0504
instance must be specified in /etc/cta/cta-cli.conf
(venv) [root@cta-adm-preprodfac georgep]# vim /etc/cta/cta-cli.conf
(venv) [root@cta-adm-preprodfac georgep]# cta-verify-file --instance eosctabogus --id 4294991264 --vid JL0504
Error in Google Protocol Buffers: Instance name "eosctabogus" does not match key identifier "cta-admin"
/usr/lib64/libctacommon.so.0(cta::exception::Backtrace::Backtrace(bool)+0x6b) [0x7f106b9a9fcd]
/usr/lib64/libctacommon.so.0(cta::exception::Exception::Exception(std::basic_string_view<char, std::char_traits >, bool)+0x91) [0x7f106b9ab095]
/lib64/libXrdSsiCta.so(cta::exception::PbException::Exception(std::basic_string_view<char, std::char_traits >, bool)+0x4c) [0x7f106e457aae]
/lib64/libXrdSsiCta.so(cta::frontend::WorkflowEvent::WorkflowEvent(cta::frontend::FrontendService const&, cta::common::dataStructures::SecurityIdentity const&, cta::eos::Notification const&)+0x6a9) [0x7f106e50ebfb]
/lib64/libXrdSsiCta.so(cta::xrd::RequestMessage::process(cta::xrd::Request const&, cta::xrd::Response&, XrdSsiStream*&)+0x30a) [0x7f106e456fcc]
/lib64/libXrdSsiCta.so(XrdSsiPb::RequestProc<cta::xrd::Request, cta::xrd::Response, cta::xrd::Alert>::ExecuteAction()+0x169) [0x7f106e45301b]
/lib64/libXrdSsiCta.so(XrdSsiPb::RequestProc<cta::xrd::Request, cta::xrd::Response, cta::xrd::Alert>::Execute()+0xd0) [0x7f106e4505d0]
/lib64/libXrdSsiCta.so(XrdSsiPb::Service<cta::xrd::Request, cta::xrd::Response, cta::xrd::Alert>::ProcessRequest(XrdSsiRequest&, XrdSsiResource&)+0x8f) [0x7f106e44f4a7]
/lib64/libXrdUtils.so.3(XrdScheduler::Run()+0x14a) [0x7f10714637aa]
/lib64/libXrdUtils.so.3(XrdStartWorking(void*)+0xd) [0x7f10714638ad]
/lib64/libXrdUtils.so.3(XrdSysThread_Xeq+0x3c) [0x7f10714a1b0c]
/lib64/libc.so.6(+0x8a19a) [0x7f1070c8a19a]
/lib64/libc.so.6(+0x10f210) [0x7f1070d0f210]

i.e. the error I pasted above.

Come to think of it, the eosctabogus disk instance may need to exist. I'm unsure, because I don't think I ever touched this exact part. You could try to create it, and see if that does the trick.
The disk instance shouldn’t really be used, as the verification doesn’t write the files to any instance.

Actually, indeed: at CERN we created this disk instance for verification. Something like this should create exactly what we have configured on all our CTA instances:

cta-admin di add --name eosctabogus --comment "Bogus eoscta instance that does not exist, for verification"

Thanks for this. I created the eosctabogus disk instance and added the eos.instance directive to /etc/cta/cta-cli.conf on the client and the frontend, but I still get the error:

Error in Google Protocol Buffers: Instance name "eosctabogus" does not match key identifier "cta-admin"

Is this because I need to add it to /etc/cta/eos.sss.keytab with its own SSS key?

No need for any SSS entry, but you need a VERIFICATION VO defined in CTA, and a verification mount policy configured in the frontend config file (with the mount policy itself defined as well):

# grep verification /etc/cta/cta-frontend-xrootd.conf
cta.verification.mount_policy  verification

I guess configuring verification deserves a dedicated documentation page.

Hi Julien,

Thanks for this but I still get an error

2025-09-02 16:52:57 [CRITICAL] [log_and_exit] Could not submit verification request for archiveId 4294991264 of tape JL0504 (STDERR: Error in Google Protocol Buffers: Instance name "eosctabogus" does not match key identifier "cta-admin"

I defined a VERIFICATION VO (upper case) as well as a verification mount policy, and I added the cta.verification.mount_policy directive in /etc/cta/cta-frontend-xrootd.conf. What am I missing?

Dear George,

we discussed this use case at our weekly ops meeting and it was decided that I will take over and write proper documentation for the CTA Tape Verify framework.

Please let’s start from this page:

The tape verification framework consists of 3 components:

  • /opt/cta-ops/ops-venv/bin/cta-ops-verification-feeder (Python script), which first selects tapes to be verified based on various criteria.
  • cta-ops-verify-tape (Python script), which then selects which files on each selected tape should be verified, based on various criteria.
  • /usr/bin/cta-verify-file (binary), which finally submits verification requests to the queue for the selected files from that given tape.

This is what the execution of cta-ops-verification-feeder should look like:

[tape-local@ctaproductionfrontend11 ~]$ /opt/cta-ops/ops-venv/bin/cta-ops-verification-feeder --maxverify 20 --min_data_on_tape 1000000000000 --verify_options "--first 10 --last 10 --read_time 30"
2025-09-12 11:00:37 [INFO] [main] Executed as: /opt/cta-ops/ops-venv/bin/cta-ops-verification-feeder --maxverify 20 --min_data_on_tape 1000000000000 --verify_options --first 10 --last 10 --read_time 30
2025-09-12 11:00:56 [INFO] [main] Currently running verification on tapes: I04202, I04212, L50613, L53219, L60572, L60616, L65149, L65568, L65760, L92019, L92787, L98074
2025-09-12 11:00:56 [INFO] [main] 12 tapes are currently being verified, target is 20
2025-09-12 11:00:56 [INFO] [main] 51937 tapes are eligible for verification
2025-09-12 11:00:56 [INFO] [main] --min_data_on_tape specified, selecting only tapes with at least 1000000000000 bytes written
2025-09-12 11:00:56 [INFO] [main] After selecting only tapes with at least 1000000000000 bytes written, 51873 are eligible
2025-09-12 11:00:57 [INFO] [main] Based on policy: random, following new tapes have been selected for verification: I56818, L50997, L95366, I75988, L97267, I62014, I76537, I56470
2025-09-12 11:00:57 [INFO] [main] Submitting verification for tape I56818 (media type: 3592JD15T, logical library: IBMLIB4-TS1160, tape pool: vo_ATLAS_raw, total files: 9050, total bytes: 15449035707560) using command: cta-ops-verify-tape --vid I56818 --first 10 --last 10 --read_time 30
2025-09-12 11:01:02 [INFO] [main] Tape I56818 successfully submitted for verification.
2025-09-12 11:01:02 [INFO] [main] Waiting 120 seconds to start the next verification job...
2025-09-12 11:03:02 [INFO] [main] Submitting verification for tape L50997 (media type: LTO9, logical library: IBMLIB1-LTO9, tape pool: vo_CMS_2025, total files: 4648, total bytes: 18747095131345) using command: cta-ops-verify-tape --vid L50997 --first 10 --last 10 --read_time 30
2025-09-12 11:03:04 [INFO] [main] Tape L50997 successfully submitted for verification.
2025-09-12 11:03:04 [INFO] [main] Waiting 120 seconds to start the next verification job...
     :
     :
     :
2025-09-12 11:15:35 [INFO] [main] All verification jobs submitted successfully
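The filtering and selection steps visible in the log above can be sketched roughly as follows (an illustrative Python sketch, not the actual feeder code; the tape records and the select_tapes helper are made up for the example):

```python
import random

def select_tapes(eligible, running, target, min_data_on_tape=0):
    """Illustrative sketch of the feeder's selection logic: filter by
    bytes written, then randomly pick enough new tapes to reach the
    target number of concurrent verifications ('random' policy)."""
    # Keep only tapes with enough data written, excluding tapes
    # that are already being verified
    candidates = [t for t in eligible
                  if t["data_in_bytes"] >= min_data_on_tape
                  and t["vid"] not in running]
    slots = max(0, target - len(running))
    # 'random' policy: sample without replacement
    return random.sample(candidates, min(slots, len(candidates)))

# Toy inventory: 100 tapes with increasing amounts of data
tapes = [{"vid": f"L{i:05d}", "data_in_bytes": i * 10**11} for i in range(100)]
picked = select_tapes(tapes, running={"L00042"}, target=20,
                      min_data_on_tape=10**12)
print(len(picked))  # → 19 (20 target minus 1 already running)
```

The real feeder adds further criteria (minimum age, relative capacity, verification policy), but the overall shape, filter then sample up to the target, matches the log output.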

This is an example of what cta-ops-verify-tape would do:

2025-09-12 11:00:57 [INFO] [verify_tape] Running verify-tape for tape with vid I56818, read speed of 300 MB/s, and data size: 566.2G
2025-09-12 11:00:59 [INFO] [partial_tape_scan] Performing partial tape scan.
2025-09-12 11:00:59 [INFO] [verify_tape] Verifying 30 files and 56.3G from tape I56818
2025-09-12 11:00:59 [INFO] [verify_files] ArchiveId of files to verify: 1723971173,1723969110,1723973684,1723970920,1723971908,1723971012,1723963741,1723962770,1723952600,1723972380,1723972082,1723986609,1723990392,1723998596,1724020618,1724020070,1724107269,1724125779,1724126480,1724135833,1724159502,1724159687,1724157621,1724159132,1724158227,1724158771,1724158420,1724159497,1724159135,1724158162
2025-09-12 11:01:01 [INFO] [verify_files] All file verifications queued for tape I56818

2025-09-12 11:03:02 [INFO] [verify_tape] Running verify-tape for tape with vid L50997, read speed of 300 MB/s, and data size: 566.2G
2025-09-12 11:03:03 [INFO] [partial_tape_scan] Performing partial tape scan.
2025-09-12 11:03:03 [INFO] [verify_tape] Verifying 30 files and 115.5G from tape L50997
2025-09-12 11:03:03 [INFO] [verify_files] ArchiveId of files to verify: 4885744601,4885744597,4885744787,4885744616,4885744609,4885744645,4885744729,4885744782,4885744709,4885744789,4885744620,4885758367,4885959003,4885962075,4885965022,4885971897,4885979944,4886016608,4886026103,4886065410,4886084467,4886084285,4886084402,4886084468,4886084471,4886084282,4886084411,4886084388,4886084397,4886084448
2025-09-12 11:03:04 [INFO] [verify_files] All file verifications queued for tape L50997

These requests are then submitted using the verification mount policy:

[tape-local@ctaproductionfrontend11 cta-ops]$ cta-admin mp ls | egrep "c.user|verification"
        mount policy a.priority a.minAge r.priority r.minAge   c.user                  c.host           c.time   m.user                  m.host           m.time      instance comment
        verification         50    14400         50      600    vlado ctaproductionfrontend01 2021-08-12 08:55    vlado ctaproductionfrontend02 2022-09-06 12:01 ctaproduction Tape Media Verification framework mount policy

and look like this when queued:

[tape-local@ctaproductionfrontend11 ~]$ cta-admin sq | egrep "I56818|L50997"
        Retrieve ctaproduction  cephUser           vo_ATLAS_raw       ATLAS   IBMLIB4-TS1160 I56818           30       56.3G    908      906       50     600              10               50           0          0         0          15.0T           9050         15.4T          1              0
        Retrieve ctaproduction  cephUser            vo_CMS_2025         CMS     IBMLIB1-LTO9 L50997            0           0      0        0        0       0              10               60           1         10     41.4G          18.0T           4648         18.7T          1              0

The 2nd one is already running on a drive:

[tape-local@ctaproductionfrontend11 ~]$ cta-admin dr ls | grep L50997
    IBMLIB1-LTO9     IBMLIB1-LTO9-F09C1R1 tpsrv449      Up         Retrieve Transfer    151 L50997            vo_CMS_2025    CMS    11  44.8G 269.6 2002091        0              -  cephUser ctaproduction  12 -

[root@tpsrv449 cta]# cta-admin dr ls first
     library                drive     host desired  request   status since    vid    tapepool  vo files   data  MB/s session priority activity scheduler      instance age reason
IBMLIB1-LTO9 IBMLIB1-LTO9-F09C1R1 tpsrv449      Up Retrieve Transfer   481 L50997 vo_CMS_2025 CMS    27 102.4G 206.4 2002091        0        -  cephUser ctaproduction  12 -

In the log file of the cta-taped on the tape server you should see lines like this:

{"epoch_time":1757668849.349325545,"local_time":"2025-09-12T11:20:49+0200","hostname":"tpsrv449","program":"cta-taped","log_level":"INFO","pid":1533109,"tid":1533343,"message":"File successfully read from tape","drive_name":"IBMLIB1-LTO9-F09C1R1","instance":"ctaproduction","sched_backend":"cephUser","thread":"TapeRead","tapeDrive":"IBMLIB1-LTO9-F09C1R1","tapeVid":"L50997","mountId":"2002091","vo":"CMS","tapePool":"vo_CMS_2025","mediaType":"LTO9","logicalLibrary":"IBMLIB1-LTO9","mountType":"Retrieve","labelFormat":"0000","vendor":"IBM-SONY","capacityInBytes":18000000000000,"fileId":4886084411,"BlockId":71511836,"fSeq":4645,"dstURL":"file://dummy","isRepack":false,"isVerifyOnly":true,"positionTime":0.039334,"readWriteTime":6.650987,"waitFreeMemoryTime":4e-05,"waitReportingTime":0.00243800000000001,"transferTime":6.653465,"totalTime":6.693677,"dataVolume":2797578535,"headerVolume":480,"driveTransferSpeedMBps":417.943533128354,"payloadTransferSpeedMBps":417.943461418888,"LBPMode":"LBP_On","repackFilesCount":0,"repackBytesCount":0,"userFilesCount":0,"userBytesCount":0,"verifiedFilesCount":1,"verifiedBytesCount":2797578535,"checksumType":"ADLER32","checksumValue":"e99533da"}
{"epoch_time":1757668849.349429577,"local_time":"2025-09-12T11:20:49+0200","hostname":"tpsrv449","program":"cta-taped","log_level":"INFO","pid":1533109,"tid":1533347,"message":"File successfully verified","drive_name":"IBMLIB1-LTO9-F09C1R1","instance":"ctaproduction","sched_backend":"cephUser","thread":"DiskWrite","tapeDrive":"IBMLIB1-LTO9-F09C1R1","tapeVid":"L50997","mountId":"2002091","vo":"CMS","tapePool":"vo_CMS_2025","threadCount":10,"threadID":3,"fileId":4886084411,"dstURL":"file://dummy","fSeq":4645,"readWriteTime":0.0,"checksumingTime":0.0,"waitDataTime":174.262309,"waitReportingTime":0.000122,"checkingErrorTime":0.0,"openingTime":0.0,"closingTime":0.0,"transferTime":174.262696,"totalTime":174.262696,"dataVolume":0,"globalPayloadTransferSpeedMBps":0.0,"diskPerformanceMBps":0.0,"openRWCloseToTransferTimeRatio":0.0}

You may notice that the destination file is a dummy ("dstURL":"file://dummy") and that this is a verification session ("isVerifyOnly":true).

The final message of the session is:

{"epoch_time":1757669252.570209822,"local_time":"2025-09-12T11:27:32+0200","hostname":"tpsrv449","program":"cta-taped","log_level":"INFO","pid":7996,"tid":7996,"message":"Tape session finished","drive_name":"IBMLIB1-LTO9-F09C1R1","instance":"ctaproduction","sched_backend":"cephUser","capacityInBytes":"18000000000000","logicalLibrary":"IBMLIB1-LTO9","mediaType":"LTO9","mountAttempted":"1","mountId":"2002091","mountType":"Retrieve","tapePool":"vo_CMS_2025","tapeVid":"L50997","vendor":"IBM-SONY","vo":"CMS","volReqId":"2002091","wasTapeMounted":"1","mountTime":"18.853548","positionTime":"272.10096","waitInstructionsTime":"0.554444","waitFreeMemoryTime":"0.002791","waitDataTime":"0.0","waitReportingTime":"0.107093","checksumingTime":"0.0","readWriteTime":"295.94293","flushTime":"0.0","unloadTime":"242.245613","unmountTime":"20.070627","encryptionControlTime":"0.008753","transferTime":"296.607258","totalTime":"849.529503","deliveryTime":"587.606845","drainingTime":"0.0","dataVolume":"115527137288","filesCount":"30","headerVolume":"14400","payloadTransferSpeedMBps":"135.989552899612","driveTransferSpeedMBps":"135.989569850172","repackFilesCount":"0","userFilesCount":"0","verifiedFilesCount":"30","repackBytesCount":"0","userBytesCount":"0","verifiedBytesCount":"115527137288","status":"success","tapeDrive":"IBMLIB1-LTO9-F09C1R1","subprocessPid":1533109,"exitCode":0,"killSignal":0}

where you should note the non-zero verification counters: "verifiedFilesCount":"30" and "verifiedBytesCount":"115527137288".

To configure all of this, there is the above-mentioned verification mount policy, plus these settings in the various configuration files:

[/etc/cta/cta-cli.conf]:
eos.instance  eosctabogus
eos.requester.user  verification
eos.requester.group  it

[/etc/cta/cta-frontend-xrootd.conf]:
cta.verification.mount_policy  verification
[/etc/cta-ops/cta-ops-config.yaml]:
  # -------------------------------
  # CTA Tape Verification
  # -------------------------------
  cta-ops-tape-verify:
    debug: false
    logger:
      log_dir: "/var/log/cta-ops/verification/"
    cta-ops-verify-tape:
      default_read_data_size: '0B'
      default_read_time: 0
      default_first: 10
      default_random: 10
      default_last: 10
    cta-ops-verification-feeder:
      verification_mount_policy: 'verification'
      default_min_age: 0
      default_max_verify: 10
      default_min_data_on_tape: 0
      default_min_relative_capacity: 0
      default_verify_options: '--first 10 --last 10 --read_time 30'
      default_verify_policy: 'random'
      default_tape_verify_path: 'cta-ops-verify-tape'
      default_feeder_log_path: '/var/log/cta/verification/cta-verification-feeder.log'
      ts_format: '%Y-%m-%d %H:%M:%S'
      sleep_time: 120  # 2* 60

[/etc/cta-ops/error-messages.yaml]:
  cta-ops-tape-verify:
    cta-verify-file:
      - tool_string: "Optional string parameter :MOUNT_POLICY_NAME is an empty string"
        translation: "Verification Mount Policy for this verification request is not correctly defined."
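Presumably those error-messages.yaml entries are used for a lookup along these lines (a hypothetical sketch; the translate_error helper and the inlined dictionary are made up, only the two strings come from the config above):

```python
# Sketch of how the tool_string -> translation mapping from
# error-messages.yaml might be applied (hypothetical helper).
ERROR_TRANSLATIONS = {
    "Optional string parameter :MOUNT_POLICY_NAME is an empty string":
        "Verification Mount Policy for this verification request "
        "is not correctly defined.",
}

def translate_error(stderr: str) -> str:
    """Return a friendlier operator message for a known cta-verify-file
    stderr fragment, falling back to the raw text."""
    for tool_string, translation in ERROR_TRANSLATIONS.items():
        if tool_string in stderr:
            return translation
    return stderr

print(translate_error("Optional string parameter "
                      ":MOUNT_POLICY_NAME is an empty string"))
```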

I would suggest you try to reproduce our setup and start with cta-verify-file to submit verification for just one file of any tape - just to see if it gets queued.

Example:

[tape-local@ctaproductionfrontend11 cta-ops]$ /usr/bin/cta-verify-file --vid L50997 --id 4885744601
RetrieveRequest-Frontend-ctaproductionfrontend11.cern.ch-1130-20250706-09:41:39-0-38711
[tape-local@ctaproductionfrontend11 cta-ops]$ cta-admin sq|grep L50997
        Retrieve ctaproduction  cephUser            vo_CMS_2025    CMS     IBMLIB1-LTO9 L50997            1        3.9G     14       14       50     600              10               60           0          0         0          18.0T           4648         18.7T          1              0

If you need further clarifications, I will try to provide them here. Once we have all the information in place, I will then turn this ticket into documentation.

Please let me know how it goes.

Vladimir Bahyl
CERN

Hi Vlado,

Many thanks for your detailed reply. I have created a verification mount policy and an eosctabogus disk instance, which is referenced - via the directive eos.instance eosctabogus - in /etc/cta/cta-cli.conf of the frontend node that is responsible for admin commands.

After exporting the CTA client keytab

export XrdSecSSSKT=/etc/cta/cta-cli.keytab

I tried to run the following

cta-ops-verify-tape -v JL0504 -C cta-operations-utilities/cta-ops-config.yaml

but unfortunately I got, as before, the following error:

2025-09-23 15:19:06 [INFO] [verify_tape] Running verify-tape for tape with vid JL0504, read speed of 300 MB/s, and data size: 0B
2025-09-23 15:19:11 [INFO] [partial_tape_scan] Performing partial tape scan.
2025-09-23 15:19:18 [INFO] [verify_tape] Verifying 30 files and 31.5M from tape JL0504
2025-09-23 15:19:18 [INFO] [verify_files] ArchiveId of files to verify: 4294991264,4294991252,4294991247,4294991253,4294991267,4294991266,4294991257,4294991268,4294991259,4294991270,4294991447,4295001962,4295009317,4295019797,4295020051,4295020856,4295021018,4295022449,4295042933,4295049274,4294983452,4294983451,4294983455,4294983453,4294983454,4294983456,4294983460,4294983457,4294983462,4294983458
2025-09-23 15:19:18 [CRITICAL] [log_and_exit] Could not submit verification request for archiveId 4294991264 of tape JL0504 (STDERR: Error in Google Protocol Buffers: Instance name "eosctabogus" does not match key identifier "cta-admin"
/usr/lib64/libctacommon.so.0(cta::exception::Backtrace::Backtrace(bool)+0x6b) [0x7f0be71a9fcd]
/usr/lib64/libctacommon.so.0(cta::exception::Exception::Exception(std::basic_string_view<char, std::char_traits >, bool)+0x91) [0x7f0be71ab095]
/lib64/libXrdSsiCta.so(cta::exception::PbException::Exception(std::basic_string_view<char, std::char_traits >, bool)+0x4c) [0x7f0be9c57aae]
/lib64/libXrdSsiCta.so(cta::frontend::WorkflowEvent::WorkflowEvent(cta::frontend::FrontendService const&, cta::common::dataStructures::SecurityIdentity const&, cta::eos::Notification const&)+0x6a9) [0x7f0be9d0ebfb]
/lib64/libXrdSsiCta.so(cta::xrd::RequestMessage::process(cta::xrd::Request const&, cta::xrd::Response&, XrdSsiStream*&)+0x30a) [0x7f0be9c56fcc]
/lib64/libXrdSsiCta.so(XrdSsiPb::RequestProc<cta::xrd::Request, cta::xrd::Response, cta::xrd::Alert>::ExecuteAction()+0x169) [0x7f0be9c5301b]
/lib64/libXrdSsiCta.so(XrdSsiPb::RequestProc<cta::xrd::Request, cta::xrd::Response, cta::xrd::Alert>::Execute()+0xd0) [0x7f0be9c505d0]
/lib64/libXrdSsiCta.so(XrdSsiPb::Service<cta::xrd::Request, cta::xrd::Response, cta::xrd::Alert>::ProcessRequest(XrdSsiRequest&, XrdSsiResource&)+0x8f) [0x7f0be9c4f4a7]
/lib64/libXrdUtils.so.3(XrdScheduler::Run()+0x14a) [0x7f0becd467aa]
/lib64/libXrdUtils.so.3(XrdStartWorking(void*)+0xd) [0x7f0becd468ad]
/lib64/libXrdUtils.so.3(XrdSysThread_Xeq+0x3c) [0x7f0becd84b0c]
/lib64/libc.so.6(+0x8a19a) [0x7f0bec68a19a]
/lib64/libc.so.6(+0x10f210) [0x7f0bec70f210])

What exactly am I missing?

George

Dear George,

as I said earlier, let us ignore the cta-ops-* framework for now; please start debugging this with the lowest-level command and just one file to verify.

Could you please look at the last example in my previous reply and try this command (modified for your values):

/usr/bin/cta-verify-file --vid JL0504 --id 4295020856

What happens?
Is the retrieve request created as in my example?
Is the tape JL0504 queued in the output of the cta-admin sq|grep JL0504 command?
Do you get the same error / exception?

Please let me know. Best regards,

Vladimir Bahyl
CERN

Dear George,

I will also take a look now at this issue to see if maybe there is a bug in the tool code that we need to fix.

In the meantime, I would also like to suggest something to try out to see if it bypasses this error:

Could you try setting the instance name to “cta-admin” instead of “eosctabogus” and see if that gets rid of the error?

Best regards,
Konstantina

Hi Vlado,

Apologies, I missed your suggestion to run /usr/bin/cta-verify-file only. I ran this command for a single file from a tape and the request was queued. The tape was mounted on the drive, but I don't think much else happened.

Konstantina, thank you very much for your suggestion! Indeed, adding a cta-admin disk instance and updating /etc/cta/cta-cli.conf on the admin node where I run the ops tool and on the operator frontend with the directive eos.instance cta-admin got rid of the error I mentioned above.

Best,

George

Sorry for the hassle, Vlado. As I said, I did manage to run cta-verify-file successfully. Am I now ready to run cta-ops-verify-tape on a whole tape? Is this tool supposed to be used only for tapes with data, or also on empty (repacked) tapes?

Hello George,
When cta-verify-file works, you should in principle be ready to use the cta-ops-verify-tape tool.
If the latter still fails, it likely indicates some issue with the config supplied to the cta-ops-* tools.

The cta-ops-verify-tape tool is intended to be used on tapes with files/data on them, to perform periodic checks that the data on them matches what is expected (by way of checksumming). It is just a wrapper on top of the cta-verify-file tool, to make it easier to automate these checks or perform them in bulk.
You could for instance set up a workflow which samples X files on Y tapes each day, to see if there were any issues during writing, or with the media.
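In other words, the wrapper essentially loops over the selected archive IDs and shells out to cta-verify-file for each one. A minimal sketch (the helper names are made up; only the --vid/--id flags come from the examples in this thread):

```python
import subprocess

def build_verify_cmd(vid: str, archive_id: int) -> list[str]:
    """Build the cta-verify-file invocation for one file,
    using the flags shown earlier in this thread."""
    return ["/usr/bin/cta-verify-file", "--vid", vid, "--id", str(archive_id)]

def verify_files(vid: str, archive_ids: list[int]) -> list[tuple[int, str]]:
    """Submit one verification request per archive ID; collect any
    failures as (archive_id, stderr) pairs rather than stopping at
    the first error."""
    failures = []
    for aid in archive_ids:
        result = subprocess.run(build_verify_cmd(vid, aid),
                                capture_output=True, text=True)
        if result.returncode != 0:
            failures.append((aid, result.stderr.strip()))
    return failures

print(build_verify_cmd("L50997", 4885744601))
# → ['/usr/bin/cta-verify-file', '--vid', 'L50997', '--id', '4885744601']
```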

For repacking (verifying that a tape is indeed fully empty and such), please use the cta-ops-repack-* tools provided by the atresys pip package instead.

Thanks Richard.

Although I can run cta-verify-file - by defining a cta-admin disk instance as Konstantina suggested - I still get a frontend error/crash when I try to run cta-ops-verify-tape.
As you can see from the error pasted below, the tool issues a PREPARE event with eosInstance="eosctabogus". This must be hard-coded somewhere. I have removed all references to the eosctabogus instance in cta-cli.conf and deleted the disk instance from the CTA DB altogether.

Oct  1 11:53:09.997765323 cta-front03 cta-frontend: LVL="INFO" PID="2134346" TID="2135054" MSG="In WorkflowEvent::WorkflowEvent(): received event." instance="antares-preprod" sched_backend="cephUser" user="cta-admin@cta-front03" eventType="PREPARE" eosInstance="eosctabogus" diskFilePath="dummy" diskFileId=""
Oct  1 11:53:09.999397993 cta-front03 cta-frontend: LVL="ERROR" PID="2134346" TID="2135054" MSG="In RequestProc::ExecuteAction(): RSP_ERR_PROTOBUF: Instance name "eosctabogus" does not match key identifier "cta-admin"
/usr/lib64/libctacommon.so.0(cta::exception::Backtrace::Backtrace(bool)+0x6b) [0x7f187a5a9fcd]
/usr/lib64/libctacommon.so.0(cta::exception::Exception::Exception(std::basic_string_view<char, std::char_traits<char> >, bool)+0x91) [0x7f187a5ab095]
/lib64/libXrdSsiCta.so(cta::exception::PbException::Exception(std::basic_string_view<char, std::char_traits<char> >, bool)+0x4c) [0x7f187d057aae]
/lib64/libXrdSsiCta.so(cta::frontend::WorkflowEvent::WorkflowEvent(cta::frontend::FrontendService const&, cta::common::dataStructures::SecurityIdentity const&, cta::eos::Notification const&)+0x6a9) [0x7f187d10ebfb]
/lib64/libXrdSsiCta.so(cta::xrd::RequestMessage::process(cta::xrd::Request const&, cta::xrd::Response&, XrdSsiStream*&)+0x30a) [0x7f187d056fcc]
/lib64/libXrdSsiCta.so(XrdSsiPb::RequestProc<cta::xrd::Request, cta::xrd::Response, cta::xrd::Alert>::ExecuteAction()+0x169) [0x7f187d05301b]
/lib64/libXrdSsiCta.so(XrdSsiPb::RequestProc<cta::xrd::Request, cta::xrd::Response, cta::xrd::Alert>::Execute()+0xd0) [0x7f187d0505d0]
/lib64/libXrdSsiCta.so(XrdSsiPb::Service<cta::xrd::Request, cta::xrd::Response, cta::xrd::Alert>::ProcessRequest(XrdSsiRequest&, XrdSsiResource&)+0x8f) [0x7f187d04f4a7]
/lib64/libXrdUtils.so.3(XrdScheduler::Run()+0x14a) [0x7f18801477aa]
/lib64/libXrdUtils.so.3(XrdStartWorking(void*)+0xd) [0x7f18801478ad]
/lib64/libXrdUtils.so.3(XrdSysThread_Xeq+0x3c) [0x7f1880185b0c]
/lib64/libc.so.6(+0x8a19a) [0x7f187fa8a19a]
/lib64/libc.so.6(+0x10f240) [0x7f187fb0f240]
" instance="antares-preprod" sched_backend="cephUser"
Oct  1 11:54:00.178738496 cta-front03 cta-frontend: LVL="INFO" PID="2134346" TID="2134888" MSG="In Scheduler::authorizeAdmin(): success." instance="antares-preprod" sched_backend="cephUser" user="cta-admin@cta-front03" catalogueTime="0.003159"
Oct  1 11:54:00.182636938 cta-front03 cta-frontend: LVL="INFO" PID="2134346" TID="2135051" MSG="In Scheduler::authorizeAdmin(): success." instance="antares-preprod" sched_backend="cephUser" user="cta-admin@cta-front03" catalogueTime="2.1e-05"
Oct  1 11:54:00.186831188 cta-front03 cta-frontend: LVL="INFO" PID="2134346" TID="2135053" MSG="In Scheduler::authorizeAdmin(): success." instance="antares-preprod" sched_backend="cephUser" user="cta-admin@cta-front03" catalogueTime="1.7e-05"
Oct  1 11:54:00.201599775 cta-front03 cta-frontend: LVL="INFO" PID="2134346" TID="2134888" MSG="In RequestMessage::process(): Admin command succeeded." instance="antares-preprod" sched_backend="cephUser" user="cta-admin@cta-front03" command="tapepool" subcommand="ls" status="success" adminTime="0.022547"

Thank you George for confirming,

I will now look at this issue as a priority, because it seems the problem might be in the CTA code then, and not on the configuration side.

Let me quickly read through the thread and the implementation of the tools and I will get back to you by tomorrow.

Best,
Konstantina

Hi George,
That… is actually indeed a shortcoming in the code.
I remember now that this value was hard-coded due to "Standalone CLI tools don't support specifying a path for the cta-cli.conf file" (#514) · Issues · CTA / CTA · GitLab.

This here, https://gitlab.cern.ch/cta/cta-operations-utilities/-/merge_requests/65, is completely untested for now, but should be a workaround that allows setting the instance in the cta-ops-config.yaml file.