Hello,
We have set up a second EOS disk cluster on our development EOS/CTA instance. I can write files to an EOS directory that is not associated with any workflows without any problem.
However, as soon as I assign the ACLs, storage class and workflows, I get the following error:
Run: [ERROR] Server responded with an error: [3007] Unable to store file - file has been cleaned because of a queueing to archive error; reason=“” /eos/antaresfacdev/daas-clf/CASTOR_hosts_all; input/output error (destination)
The SYNC::CREATE event is sent successfully to the Frontend, but for some unclear reason the FST logs “sending error message to manager” and the SYNC::ARCHIVE event fails.
Can you please give any pointers? Many thanks, George
220805 15:38:27 time=1659710307.981855 func=SendArchiveFailedToManager level=INFO logid=44dc6176-14cc-11ed-9027-0c42a1f42af0 unit=fst@antares-eos15.scd.rl.ac.uk:1095 tid=00007f46f66fb700 source=XrdFstOfsFile:3867 tident=georgep.2723735:68@lcgui06.gridpp.rl.ac.uk sec= uid=1100 gid=1100 name=nobody geo=“” msg=“sending error message to manager” path=“/eos/antaresfacdev/daas-clf/CASTOR_hosts_all” manager=“antares-eos15.scd.rl.ac.uk:1094 ” errorReportOpaque=“/?mgm.pcmd=event&mgm.fid=f&mgm.logid=cta&mgm.event=sync::archive_failed&mgm.workflow=default&mgm.path=/dummy_path&mgm.ruid=0&mgm.rgid=0&mgm.errmsg=”
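For anyone reading along, the `errorReportOpaque` value in that log line is an ordinary `key=value&...` string, so it can be pulled apart with the Python standard library. This is only a minimal sketch: the string is copied from the line above, and `parse_opaque` is a hypothetical helper name, not part of EOS.

```python
from urllib.parse import parse_qsl

# Opaque string copied verbatim from the FST log line above.
opaque = (
    "/?mgm.pcmd=event&mgm.fid=f&mgm.logid=cta"
    "&mgm.event=sync::archive_failed&mgm.workflow=default"
    "&mgm.path=/dummy_path&mgm.ruid=0&mgm.rgid=0&mgm.errmsg="
)


def parse_opaque(text):
    """Split an opaque string into a dict, keeping keys with empty values."""
    _, _, query = text.partition("?")
    return dict(parse_qsl(query, keep_blank_values=True))


fields = parse_opaque(opaque)
for key in ("mgm.event", "mgm.fid", "mgm.path", "mgm.errmsg"):
    print(f"{key} = {fields[key]!r}")
```

The point of interest is that `mgm.errmsg` comes back as an empty string, which matches the empty `reason=“”` in the client-side error: the FST reports a failure but gives no cause.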
Hello,
Could you please provide EOS MGM and CTA Frontend logs for this error? Also, are there any error messages in FST log before the one you posted?
Cheers,
Vova
Hi Vova,
Thanks for the reply
Please see below for the MGM and FST logs (I have highlighted what I think are the relevant bits). I didn’t see any messages in the CTA Frontend, presumably because the ARCHIVE request never reached it.
MGM log
220808 11:42:04 time=1659955324.251301 func=open level=INFO logid=bdf476e8-1706-11ed-adcc-0c42a1f42af0 unit=mgm@antares-eos15.scd.rl.ac.uk:1094 tid=00007f60c046d700 source=XrdMgmOfsFile:500 tident=georgep.3087654:354@lcgui06.gridpp.rl.ac.uk sec=sss uid=1100 gid=1100 name=daaas-clf-backup geo=“” op=write trunc=0 path=/eos/antaresfacdev/daas-clf/echo_eos_tpc.log info=oss.asize=107865
220808 11:42:04 time=1659955324.251526 func=open level=INFO logid=bdf476e8-1706-11ed-adcc-0c42a1f42af0 unit=mgm@antares-eos15.scd.rl.ac.uk:1094 tid=00007f60c046d700 source=XrdMgmOfsFile:660 tident=georgep.3087654:354@lcgui06.gridpp.rl.ac.uk sec=sss uid=1100 gid=1100 name=daaas-clf-backup geo=“” msg=“rewrote symlinks” sym-path=/eos/antaresfacdev/daas-clf/echo_eos_tpc.log realpath=/eos/antaresfacdev/daas-clf/echo_eos_tpc.log
220808 11:42:04 time=1659955324.253607 func=open level=INFO logid=bdf476e8-1706-11ed-adcc-0c42a1f42af0 unit=mgm@antares-eos15.scd.rl.ac.uk:1094 tid=00007f60c046d700 source=XrdMgmOfsFile:1076 tident=georgep.3087654:354@lcgui06.gridpp.rl.ac.uk sec=sss uid=1100 gid=1100 name=daaas-clf-backup geo=“” acl=1 r=1 w=1 wo=0 egroup=0 shared=0 mutable=1 facl=0
220808 11:42:04 time=1659955324.256402 func=open level=INFO logid=bdf476e8-1706-11ed-adcc-0c42a1f42af0 unit=mgm@antares-eos15.scd.rl.ac.uk:1094 tid=00007f60c046d700 source=XrdMgmOfsFile:1633 tident=georgep.3087654:354@lcgui06.gridpp.rl.ac.uk sec=sss uid=1100 gid=1100 name=daaas-clf-backup geo=“” blocksize=4096 lid=100012
220808 11:42:04 time=1659955324.258172 func=HandleProtoMethodEvents level=INFO logid=static… unit=mgm@antares-eos15.scd.rl.ac.uk:1094 tid=00007f60c046d700 source=WFE:1619 tident= sec=(null) uid=99 gid=99 name=- geo=“” default SYNC::CREATE /eos/antaresfacdev/daas-clf/echo_eos_tpc.log cta-front02.scd.rl.ac.uk:10955 fxid=00000016 mgm.reqid=“”
220808 11:42:04 time=1659955324.321055 func=SendProtoWFRequest level=INFO logid=static… unit=mgm@antares-eos15.scd.rl.ac.uk:1094 tid=00007f60c046d700 source=WFE:2524 tident= sec=(null) uid=99 gid=99 name=- geo=“” protoWFEndPoint=“cta-front02.scd.rl.ac.uk:10955 ” protoWFResource=“/ctafrontend” fullPath=“/eos/antaresfacdev/daas-clf/echo_eos_tpc.log” event=“sync::create” timeSpentMs=58 msg=“Sent SSI protocol buffer request”
220808 11:42:04 time=1659955324.321792 func=open level=INFO logid=bdf476e8-1706-11ed-adcc-0c42a1f42af0 unit=mgm@antares-eos15.scd.rl.ac.uk:1094 tid=00007f60c046d700 source=XrdMgmOfsFile:2960 tident=georgep.3087654:354@lcgui06.gridpp.rl.ac.uk sec=sss uid=1100 gid=1100 name=daaas-clf-backup geo=“” msg=“workflow trigger returned” retc=0 errno=0
220808 11:42:04 time=1659955324.322106 func=open level=INFO logid=bdf476e8-1706-11ed-adcc-0c42a1f42af0 unit=mgm@antares-eos15.scd.rl.ac.uk:1094 tid=00007f60c046d700 source=XrdMgmOfsFile:3031 tident=georgep.3087654:354@lcgui06.gridpp.rl.ac.uk sec=sss uid=1100 gid=1100 name=daaas-clf-backup geo=“” op=write path=/eos/antaresfacdev/daas-clf/echo_eos_tpc.log info=oss.asize=107865 target[0]=(antares-eos15.scd.rl.ac.uk ,2) redirection=antares-eos15.scd.rl.ac.uk?&cap.sym= <…>&cap.msg=<…>&mgm.logid=bdf476e8-1706-11ed-adcc-0c42a1f42af0&mgm.replicaindex=0&mgm.replicahead=0&mgm.id=00000016&mgm.event=sync::closew&mgm.workflow=default&mgm.instance=eosantaresfacdev&mgm.owner_uid=1100&mgm.owner_gid=1100&mgm.requestor=daaas-clf-backup&mgm.requestorgroup=daaas-clf&mgm.attributes=c3lzLmFyY2hpdmUuZmlsZV9pZD00Mjk0OTY5NjE3Ozs7c3lzLmFyY2hpdmUuc3RvcmFnZV9jbGFzcz1jbGZfdGVzdA== xrd_port=1095 http_port=8001
220808 11:42:04 time=1659955324.322139 func=open level=INFO logid=bdf476e8-1706-11ed-adcc-0c42a1f42af0 unit=mgm@antares-eos15.scd.rl.ac.uk:1094 tid=00007f60c046d700 source=XrdMgmOfsFile:3039 tident=georgep.3087654:354@lcgui06.gridpp.rl.ac.uk sec=sss uid=1100 gid=1100 name=daaas-clf-backup geo=“” info=“redirection” hostport=antares-eos15.scd.rl.ac.uk?&cap.sym= <…>&cap.msg=<…>&mgm.logid=bdf476e8-1706-11ed-adcc-0c42a1f42af0&mgm.replicaindex=0&mgm.replicahead=0&mgm.id=00000016&mgm.event=sync::closew&mgm.workflow=default&mgm.instance=eosantaresfacdev&mgm.owner_uid=1100&mgm.owner_gid=1100&mgm.requestor=daaas-clf-backup&mgm.requestorgroup=daaas-clf&mgm.attributes=c3lzLmFyY2hpdmUuZmlsZV9pZD00Mjk0OTY5NjE3Ozs7c3lzLmFyY2hpdmUuc3RvcmFnZV9jbGFzcz1jbGZfdGVzdA==:1095
220808 11:42:04 time=1659955324.322189 func=open level=INFO logid=bdf476e8-1706-11ed-adcc-0c42a1f42af0 unit=mgm@antares-eos15.scd.rl.ac.uk:1094 tid=00007f60c046d700 source=XrdMgmOfsFile:3089 tident=georgep.3087654:354@lcgui06.gridpp.rl.ac.uk sec=sss uid=1100 gid=1100 name=daaas-clf-backup geo=“” path=/eos/antaresfacdev/daas-clf/echo_eos_tpc.log rt=71.03
220808 11:42:04 time=1659955324.345419 func=HandleProtoMethodEvents level=INFO logid=static… unit=mgm@antares-eos15.scd.rl.ac.uk:1094 tid=00007f60fcb2c700 source=WFE:1619 tident= sec=(null) uid=99 gid=99 name=- geo=“” default SYNC::ARCHIVE_FAILED /eos/antaresfacdev/daas-clf/echo_eos_tpc.log cta-front02.scd.rl.ac.uk:10955 fxid=00000016 mgm.reqid=" "
[root@antares-eos15 ~]#
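As a side note, the `mgm.attributes` value in the redirection lines above looks like base64. A minimal sketch for decoding it, assuming standard base64 and a `;;;` separator between attributes (that separator is inferred from the decoded text, not taken from EOS documentation; `decode_attributes` is a hypothetical helper):

```python
import base64

# Value of mgm.attributes copied from the MGM redirection log line above.
encoded = (
    "c3lzLmFyY2hpdmUuZmlsZV9pZD00Mjk0OTY5NjE3Ozs7"
    "c3lzLmFyY2hpdmUuc3RvcmFnZV9jbGFzcz1jbGZfdGVzdA=="
)


def decode_attributes(value, separator=";;;"):
    """Decode a base64 attribute string into a {name: value} dict."""
    decoded = base64.b64decode(value).decode("utf-8")
    pairs = (item.split("=", 1) for item in decoded.split(separator) if item)
    return {name: val for name, val in pairs}


attrs = decode_attributes(encoded)
for name, val in attrs.items():
    print(f"{name} = {val}")
```

This shows that the archive file ID and the storage class were already attached to the file when the client was redirected to the FST, so the CREATE step itself appears to have completed.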
FST log
220808 11:42:04 time=1659955324.325081 func=open level=INFO logid=bdf476e8-1706-11ed-adcc-0c42a1f42af0 unit=fst@antares-eos15.scd.rl.ac.uk:1095 tid=00007f7fe4662700 source=XrdFstOfsFile:133 tident=georgep.3087654:31@lcgui06.gridpp.rl.ac.uk sec=sss uid=0 gid=0 name=daaas-clf-backup geo=“” path=/eos/antaresfacdev/daas-clf/echo_eos_tpc.log info=cap.msg=<…>&cap.sym=<…>&mgm.attributes=c3lzLmFyY2hpdmUuZmlsZV9pZD00Mjk0OTY5NjE3Ozs7c3lzLmFyY2hpdmUuc3RvcmFnZV9jbGFzcz1jbGZfdGVzdA==&mgm.event=sync::closew&mgm.id=00000016&mgm.instance=eosantaresfacdev&mgm.logid=bdf476e8-1706-11ed-adcc-0c42a1f42af0&mgm.owner_gid=1100&mgm.owner_uid=1100&mgm.replicahead=0&mgm.replicaindex=0&mgm.requestor=daaas-clf-backup&mgm.requestorgroup=daaas-clf&mgm.workflow=default&oss.asize=107865 open_mode=102
220808 11:42:04 time=1659955324.325478 func=ProcessCapOpaque level=INFO logid=bdf476e8-1706-11ed-adcc-0c42a1f42af0 unit=fst@antares-eos15.scd.rl.ac.uk:1095 tid=00007f7fe4662700 source=XrdFstOfsFile:2487 tident=georgep.3087654:31@lcgui06.gridpp.rl.ac.uk sec=(null) uid=99 gid=99 name=(null) geo=“” capability=&tapeenabled=1&mgm.access=create&mgm.ruid=1100&mgm.rgid=1100&mgm.uid=99&mgm.gid=99&mgm.path=/eos/antaresfacdev/daas-clf/echo_eos_tpc.log&mgm.manager:1094=antares-eos15.scd.rl.ac.uk &mgm.fid=00000016&mgm.cid=12&mgm.sec=sss|daaas-clf-backup|lcgui06.gridpp.rl.ac.uk||daaas-clf|||&mgm.lid=1048594&mgm.bookingsize=107865&mgm.targetsize=107865&mgm.fsid=2&mgm.url0=root://antares-eos15.scd.rl.ac.uk:1095//&mgm.fsid0=2&cap.valid=1659958924
220808 11:42:04 time=1659955324.325576 func=open level=INFO logid=bdf476e8-1706-11ed-adcc-0c42a1f42af0 unit=fst@antares-eos15.scd.rl.ac.uk:1095 tid=00007f7fe4662700 source=XrdFstOfsFile:212 tident=georgep.3087654:31@lcgui06.gridpp.rl.ac.uk sec=(null) uid=1100 gid=1100 name=nobody geo=“” ns_path=/eos/antaresfacdev/daas-clf/echo_eos_tpc.log fst_path=/eos/data-sdc/00000000/00000016
220808 11:42:04 time=1659955324.325954 func=open level=INFO logid=bdf476e8-1706-11ed-adcc-0c42a1f42af0 unit=fst@antares-eos15.scd.rl.ac.uk:1095 tid=00007f7fe4662700 source=XrdFstOfsFile:522 tident=georgep.3087654:31@lcgui06.gridpp.rl.ac.uk sec=(null) uid=1100 gid=1100 name=nobody geo=“” fst_path=/eos/data-sdc/00000000/00000016 open-mode=102 create-mode=41a4 layout-name=replica oss-opaque=&mgm.lid=1048594&mgm.bookingsize=107865
220808 11:42:04 time=1659955324.340285 func=open level=INFO logid=bdf476e8-1706-11ed-adcc-0c42a1f42af0 unit=fst@antares-eos15.scd.rl.ac.uk:1095 tid=00007f7fe4662700 source=XrdFstOfsFile:698 tident=georgep.3087654:31@lcgui06.gridpp.rl.ac.uk sec=(null) uid=1100 gid=1100 name=nobody geo=“” open-duration=15.202ms path=‘/eos/antaresfacdev/daas-clf/echo_eos_tpc.log’ fxid=00000016 path::print=0.537ms creation::barrier=0.093ms layout::exists=0.007ms get::localfmd=0.018ms resync::localfmd=0.252ms clone::fst=0.001ms layout::open=0.028ms layout::opened=13.565ms layout::stat=0.011ms full::mutex=0.001ms layout::fallocate=0.002ms layout::fallocated=0.603ms fileio::object=0.045ms open::accountingt=0.032ms end=0.007ms open=15.202ms
220808 11:42:04 time=1659955324.340343 func=stat level=INFO logid=bdf476e8-1706-11ed-adcc-0c42a1f42af0 unit=fst@antares-eos15.scd.rl.ac.uk:1095 tid=00007f7fe4662700 source=XrdFstOfsFile:1090 tident=georgep.3087654:31@lcgui06.gridpp.rl.ac.uk sec= uid=1100 gid=1100 name=nobody geo=“” path=/eos/antaresfacdev/daas-clf/echo_eos_tpc.log fxid=00000016 size=0 mtime=1659955324.70154420
220808 11:42:04 time=1659955324.342915 func=_close level=INFO logid=bdf476e8-1706-11ed-adcc-0c42a1f42af0 unit=fst@antares-eos15.scd.rl.ac.uk:1095 tid=00007f7fe4662700 source=XrdFstOfsFile:1317 tident=georgep.3087654:31@lcgui06.gridpp.rl.ac.uk sec= uid=1100 gid=1100 name=nobody geo=“” viaDelete=0 writeDelete=0
220808 11:42:04 time=1659955324.342942 func=VerifyChecksum level=INFO logid=bdf476e8-1706-11ed-adcc-0c42a1f42af0 unit=fst@antares-eos15.scd.rl.ac.uk:1095 tid=00007f7fe4662700 source=XrdFstOfsFile:3292 tident=georgep.3087654:31@lcgui06.gridpp.rl.ac.uk sec= uid=1100 gid=1100 name=nobody geo=“” (write) checksum type: adler checksum hex: 7f650010 requested-checksum hex: -none-
220808 11:42:04 time=1659955324.343216 func=SendArchiveFailedToManager level=INFO logid=bdf476e8-1706-11ed-adcc-0c42a1f42af0 unit=fst@antares-eos15.scd.rl.ac.uk:1095 tid=00007f7fe4662700 source=XrdFstOfsFile:3867 tident=georgep.3087654:31@lcgui06.gridpp.rl.ac.uk sec= uid=1100 gid=1100 name=nobody geo=“” msg=“sending error message to manager” path=“/eos/antaresfacdev/daas-clf/echo_eos_tpc.log” manager=“antares-eos15.scd.rl.ac.uk:1094 ” errorReportOpaque=“/?mgm.pcmd=event&mgm.fid=16&mgm.logid=cta&mgm.event=sync::archive_failed&mgm.workflow=default&mgm.path=/dummy_path&mgm.ruid=0&mgm.rgid=0&mgm.errmsg=”
220808 11:42:04 time=1659955324.349144 func=QueueForArchiving level=INFO logid=bdf476e8-1706-11ed-adcc-0c42a1f42af0 unit=fst@antares-eos15.scd.rl.ac.uk:1095 tid=00007f7fe4662700 source=XrdFstOfsFile:3426 tident=georgep.3087654:31@lcgui06.gridpp.rl.ac.uk sec= uid=1100 gid=1100 name=nobody geo=“” Return code rc=-1 errc=5
220808 11:42:04 time=1659955324.350996 func=_close level=INFO logid=bdf476e8-1706-11ed-adcc-0c42a1f42af0 unit=fst@antares-eos15.scd.rl.ac.uk:1095 tid=00007f7fe4662700 source=XrdFstOfsFile:1790 tident=georgep.3087654:31@lcgui06.gridpp.rl.ac.uk sec= uid=1100 gid=1100 name=nobody geo=“” info=“deleting on close” fn=/eos/antaresfacdev/daas-clf/echo_eos_tpc.log fstpath=/eos/data-sdc/00000000/00000016
220808 11:42:04 6610 FstOfs__close: georgep.3087654:31@lcgui06.gridpp.rl.ac.uk Unable to store file - file has been cleaned because of a queueing to archive error; reason=“” /eos/antaresfacdev/daas-clf/echo_eos_tpc.log; input/output error
220808 11:42:04 time=1659955324.351245 func=_close level=WARN logid=bdf476e8-1706-11ed-adcc-0c42a1f42af0 unit=fst@antares-eos15.scd.rl.ac.uk:1095 tid=00007f7fe4662700 source=XrdFstOfsFile:1910 tident=georgep.3087654:31@lcgui06.gridpp.rl.ac.uk sec= uid=1100 gid=1100 name=nobody geo=“” info=“deleting on close” fn=/eos/antaresfacdev/daas-clf/echo_eos_tpc.log fstpath=/eos/data-sdc/00000000/00000016 reason=“”
220808 11:42:04 time=1659955324.351271 func=_close level=INFO logid=bdf476e8-1706-11ed-adcc-0c42a1f42af0 unit=fst@antares-eos15.scd.rl.ac.uk:1095 tid=00007f7fe4662700 source=XrdFstOfsFile:2005 tident=georgep.3087654:31@lcgui06.gridpp.rl.ac.uk sec= uid=1100 gid=1100 name=nobody geo=“” msg=“done close” rc=-1 errc=5
220808 11:42:04 time=1659955324.610404 func=Report level=INFO logid=static… unit=fst@antares-eos15.scd.rl.ac.uk:1095 tid=00007f7fc86d8700 source=Report:49 tident= sec=(null) uid=99 gid=99 name=- geo=“” log=bdf476e8-1706-11ed-adcc-0c42a1f42af0&path=/eos/antaresfacdev/daas-clf/echo_eos_tpc.log&fstpath=/eos/data-sdc/00000000/00000016&ruid=1100&rgid=1100&td=georgep.3087654:31@lcgui06.gridpp.rl.ac.uk&host=antares-eos15.scd.rl.ac.uk &lid=1048594&fid=22&fsid=2&ots=1659955324&otms=325&cts=1659955324&ctms=350&nrc=0&nwc=1&rb=0&rb_min=0&rb_max=0&rb_sigma=0.00&rv_op=0&rvb_min=0&rvb_max=0&rvb_sum=0&rvb_sigma=0.00&rs_op=0&rsb_min=0&rsb_max=0&rsb_sum=0&rsb_sigma=0.00&rc_min=0&rc_max=0&rc_sum=0&rc_sigma=0.00&wb=107865&wb_min=107865&wb_max=107865&wb_sigma=0.00&sfwdb=0&sbwdb=0&sxlfwdb=0&sxlbwdb=0&nfwds=0&nbwds=0&nxlfwds=0&nxlbwds=0&ot=15.202rt=0.00&rvt=0.00&wt=0.10&osize=0&csize=0&delete_on_close=1&prio_c=2&prio_l=4&prio_d=1&forced_bw=0&ms_sleep=0&sec.prot=sss&sec.name=daaas-clf-backup&sec.host=lcgui06.gridpp.rl.ac.uk &sec.vorg=&sec.grps=daaas-clf&sec.role=&sec.info=&sec.app=
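The `errc=5` in the `QueueForArchiving` and `_close` lines can be mapped to a symbolic errno name, which ties it to the "input/output error" text the client sees. A minimal sketch, assuming the code is a plain POSIX errno value (`explain_errc` is a hypothetical helper):

```python
import errno
import os
import re


def explain_errc(line):
    """Return (code, symbolic name, OS message) for an 'errc=N' field, or None."""
    match = re.search(r"\berrc=(\d+)", line)
    if match is None:
        return None
    code = int(match.group(1))
    return code, errno.errorcode.get(code, "unknown"), os.strerror(code)


# Tail of the QueueForArchiving log line above.
print(explain_errc("Return code rc=-1 errc=5"))
```

So the FST log carries no more detail than a generic I/O error at this log level; the code only says that queueing for archival failed, not why.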
Please see also output from cta-frontend.log
Aug 8 12:05:12.721973 cta-front02 cta-frontend: LVL=“INFO” PID=“7846” TID=“8407” MSG=“In RequestMessage::process(): processing SSI event” user=“eosantaresfacdev@cta-front02” eventType=“CREATE” eosInstance=“eosantaresfacdev” diskFilePath=“/eos/antaresfacdev/daas-clf/echo_eos_tpc.log” diskFileId=“25”
Aug 8 12:05:12.726791 cta-front02 cta-frontend: LVL=“INFO” PID=“7846” TID=“8407” MSG=“Checked request and got next archive file ID” user=“eosantaresfacdev@cta-front02” instanceName=“eosantaresfacdev” username=“daaas-clf-backup” usergroup=“daaas-clf” storageClass=“clf_test” fileId=“4294969620” catalogueTime=“0.004695” schedulerDbTime=“0.004695”
Aug 8 12:05:12.726869 cta-front02 cta-frontend: LVL=“INFO” PID=“7846” TID=“8407” MSG=“In RequestMessage::processCREATE(): assigning new archive file ID.” user=“eosantaresfacdev@cta-front02” diskFileId=“25” diskFilePath=“/eos/antaresfacdev/daas-clf/echo_eos_tpc.log” fileId=“4294969620” schedulerTime=“0.004814”
Really sorry for the hassle, do you have any idea what might be wrong here? As I said, without the workflows and ACLs assigned, the EOS cluster works fine. It is only after the workflow assignment that the above weirdness happens.
The last thing I thought I had got wrong was the SSS key setup, which I corrected, but the problem remains. Here is my SSS setup (eosantaresdev and eosantaresfacdev are the two EOS clusters supported by the same CTA)
MGM
[root@antares-eos15 ~]# xrdsssadmin list /etc/eos.keytab
Number Len Date/Time Created Expires Keyname User & Group
------ --- --------- ------- -------- -------
1 32 08/04/22 14:17:42 -------- daaas-clf daaas-clf-backup daaas-clf
1 32 08/04/22 09:54:27 -------- eosantaresfacdev daemon daemon
Frontend
[root@cta-front02 etc]# xrdsssadmin list /etc/cta/eos.sss.keytab
Number Len Date/Time Created Expires Keyname User & Group
------ --- --------- ------- -------- -------
1 32 09/21/21 17:38:15 -------- cta-admin cta-admin tape
2 32 02/23/22 17:46:35 -------- eosantaresdev eosantaresdev daemon
1 32 08/04/22 09:54:27 -------- eosantaresfacdev eosantaresfacdev daemon
Tape server
Number Len Date/Time Created Expires Keyname User & Group
------ --- --------- ------- -------- -------
2 32 02/23/22 17:46:35 -------- eosantaresdev daemon daemon
1 32 08/04/22 09:54:27 -------- eosantaresfacdev daemon daemon
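Comparing these listings by eye is error-prone, so here is a minimal sketch that parses two `xrdsssadmin list` outputs and reports which columns differ for key names present in both. It assumes each data row has exactly eight whitespace-separated fields, as in the listings above, and that a key name appears once per keytab; the function names are hypothetical. Matching number and creation time is only a hint that two entries are the same key, since the key material itself is not shown.

```python
MGM_LISTING = """\
Number Len Date/Time Created Expires Keyname User & Group
1 32 08/04/22 14:17:42 -------- daaas-clf daaas-clf-backup daaas-clf
1 32 08/04/22 09:54:27 -------- eosantaresfacdev daemon daemon
"""

FRONTEND_LISTING = """\
Number Len Date/Time Created Expires Keyname User & Group
1 32 09/21/21 17:38:15 -------- cta-admin cta-admin tape
2 32 02/23/22 17:46:35 -------- eosantaresdev eosantaresdev daemon
1 32 08/04/22 09:54:27 -------- eosantaresfacdev eosantaresfacdev daemon
"""

FIELDS = ("number", "created", "user", "group")


def parse_listing(text):
    """Map key name -> row fields, skipping header and separator lines."""
    keys = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 8 or not parts[0].isdigit():
            continue
        number, _length, date, time, _expires, keyname, user, group = parts
        keys[keyname] = {
            "number": int(number),
            "created": f"{date} {time}",
            "user": user,
            "group": group,
        }
    return keys


def compare_listings(first, second):
    """For key names in both listings, list the fields whose values differ."""
    return {
        name: [f for f in FIELDS if first[name][f] != second[name][f]]
        for name in sorted(set(first) & set(second))
    }


report = compare_listings(parse_listing(MGM_LISTING), parse_listing(FRONTEND_LISTING))
for name, diffs in report.items():
    print(name, "differs in:", diffs or "nothing")
```

Whether a differing user column matters depends on how the keys are meant to be mapped on each host; the sketch only reports the difference and does not judge it.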
Hello George,
I was going through the CTA logs (at CERN instances); the only similar error messages we got were when one of the experiments was deleting the files themselves in the middle of archival. In that case the transfers were cancelled, which deleted the files that were in the buffer. When CTA finished archival, it couldn’t find the source in the EOS buffer and threw exactly the same errors.
Sorry, but I don’t have any other ideas what could cause this. Perhaps the files get corrupted on the FST and the checksums do not match?
Is this problem reproducible on all FSTs and in a different directory?
Cheers,
Vova
Hi George,
Your logs show that the CREATE event is processed by the MGM and forwarded to CTA. When the file is written to the FST, it fails on the CLOSEW. The FST then deletes the file and reports this.
It’s not possible for us to diagnose why the CLOSEW is failing from the log fragments provided. Most likely it is an EOS misconfiguration. Possibly an authentication problem between the FST and MGM, but I am just guessing. I suggest that you compare all of the configuration files between your two instances and check if you missed something when setting up the second instance.
Best regards,
Michael
Hi Michael
Thanks for the reply. How can I turn on additional (DEBUG) logging for the xrdlog.mgm and xrdlog.fst?
After reading your comment above, I went through the SSS auth setup again, which I paste below for reference (reminder: this is a second EOS instance/cluster attached to the same CTA dev instance).
If I remove the CTA workflows and storage class from the EOS dir I am testing, the cluster works fine, but when I put them back it breaks… I have added the new disk instance in CTA and defined the group and requester mount rules using that; I can’t figure out what I am missing!
xrdsssadmin list /etc/eos.keytab
Number Len Date/Time Created Expires Keyname User & Group
------ --- --------- ------- -------- -------
1 32 08/10/22 17:09:58 -------- cta-taped cta tape
1 32 08/04/22 14:17:42 -------- daaas-clf daaas-clf-backup daaas-clf
2 32 08/10/22 17:10:33 -------- eosantaresfacdev daemon daemon
xrdsssadmin list /etc/cta/eos.sss.keytab
Number Len Date/Time Created Expires Keyname User & Group
------ --- --------- ------- -------- -------
1 32 09/21/21 17:38:15 -------- cta-admin cta-admin tape
2 32 02/23/22 17:46:35 -------- eosantaresdev eosantaresdev daemon
2 32 08/10/22 17:10:33 -------- eosantaresfacdev eosantaresfacdev daemon
xrdsssadmin list cta-taped.sss.keytab
Number Len Date/Time Created Expires Keyname User & Group
------ --- --------- ------- -------- -------
1 32 08/10/22 17:09:58 -------- cta-taped cta tape
I did (I was wondering if there are any MGM/FST config directives for additional logging). This is the report for the file I am trying to write
log=e28c62d8-1a2e-11ed-a8ca-0c42a1f42af0&path=/eos/antaresfacdev/daas-clf/tpconfig-spectra.pan.copy&fstpath=/eos/data-sdc/00000000/00000026&ruid=1100&rgid=1100&td=georgep.3604635:51@lcgui06.gridpp.rl.ac.uk&host=antares-eos15.scd.rl.ac.uk &lid=1048594&fid=38&fsid=2&ots=1660302419&otms=137&cts=1660302419&ctms=142&nrc=0&nwc=1&rb=0&rb_min=0&rb_max=0&rb_sigma=0.00&rv_op=0&rvb_min=0&rvb_max=0&rvb_sum=0&rvb_sigma=0.00&rs_op=0&rsb_min=0&rsb_max=0&rsb_sum=0&rsb_sigma=0.00&rc_min=0&rc_max=0&rc_sum=0&rc_sigma=0.00&wb=3073&wb_min=3073&wb_max=3073&wb_sigma=0.00&sfwdb=0&sbwdb=0&sxlfwdb=0&sxlbwdb=0&nfwds=0&nbwds=0&nxlfwds=0&nxlbwds=0&ot=1.096rt=0.00&rvt=0.00&wt=0.02&osize=0&csize=0&delete_on_close=1&prio_c=2&prio_l=4&prio_d=1&forced_bw=0&ms_sleep=0&sec.prot=sss&sec.name=daaas-clf-backup&sec.host=lcgui06.gridpp.rl.ac.uk &sec.vorg=&sec.grps=daaas-clf&sec.role=&sec.info=&sec.app=
which I think doesn’t show anything unusual. One of the FST log messages for this uuid is
220812 12:06:59 time=1660302419.139876 func=SendArchiveFailedToManager level=INFO logid=e28c62d8-1a2e-11ed-a8ca-0c42a1f42af0 unit=fst@antares-eos15.scd.rl.ac.uk:1095 tid=00007f6bd3fff700 source=XrdFstOfsFile:3867 tident=georgep.3604635:51@lcgui06.gridpp.rl.ac.uk sec= uid=1100 gid=1100 name=nobody geo=“” msg=“sending error message to manager” path=“/eos/antaresfacdev/daas-clf/tpconfig-spectra.pan.copy” manager=“antares-eos15.scd.rl.ac.uk:1094 ” errorReportOpaque=“/?mgm.pcmd=event&mgm.fid=26&mgm.logid=cta&mgm.event=sync::archive_failed&mgm.workflow=default&mgm.path=/dummy_path&mgm.ruid=0&mgm.rgid=0&mgm.errmsg=”
George
Sorry Michael, we got something wrong in the config management templates and missed these two lines in the FST config!
fstofs.protowfendpoint cta-front02.scd.rl.ac.uk:10955
fstofs.protowfresource /ctafrontend
Apologies for goofing this up.
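In case it helps someone else whose templates drop the same lines, here is a minimal sketch of a check that could run against the rendered FST config. The two directive names are the ones quoted above; everything else (the helper name, treating lines starting with `#` as comments, the sample config text) is an assumption for illustration.

```python
REQUIRED = ("fstofs.protowfendpoint", "fstofs.protowfresource")


def missing_directives(config_text, required=REQUIRED):
    """Return the required directive names that never start a config line."""
    present = set()
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        present.add(line.split()[0])
    return [name for name in required if name not in present]


# Hypothetical config text standing in for the rendered template.
broken = "# FST config rendered without the workflow settings\n"
fixed = broken + (
    "fstofs.protowfendpoint cta-front02.scd.rl.ac.uk:10955\n"
    "fstofs.protowfresource /ctafrontend\n"
)

print("before:", missing_directives(broken))
print("after: ", missing_directives(fixed))
```

In practice the text would be read from the real FST configuration file on each node, and a non-empty result would fail the deployment check.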
mdavis
12 August 2022 11:41
Glad to hear you found the problem.