Hi, there,
After I upgraded cta to 4.2-3 from 4.0-2, I notice that when I run cta-admin dr ls with a file being transferred, it will crash and the error is:
210916 08:33:36 256076 ssi_Pb::Request: pid:256068 tid:140486964729600 ProcessResponseData(): fatal error from XRootD framework
Unable to read stream /ctafrontend; GetBuff failed: basic_string::at
library drive host desired request status since vid tapepool vo files data MB/s session priority activity age reason disk_system_name reserved_bytes
Meanwhile, the cta-taped will throw an exception which may cause the backup fail. Some error log like
Sep 16 08:29:35.107969 laso-tape01 cta-taped: LVL="ERROR" PID="96932" TID="96932" MSG="Aborting cta-taped on uncaught exception. Stack trace follows." Message="Uncaught exception of type 'std::logic_error' in Thread.run(): >>>><<<< End of uncaught exception"
Sep 16 08:29:35.108079 laso-tape01 cta-taped: LVL="ERROR" PID="96932" TID="96932" MSG="Stack trace" traceFrameNumber="0" traceFrame="/lib64/libctacommon.so.0(cta::exception::Backtrace::Backtrace(bool)+0x6e) [0x7fb155fd73bc]"
Sep 16 08:29:35.108146 laso-tape01 cta-taped: LVL="ERROR" PID="96932" TID="96932" MSG="Stack trace" traceFrameNumber="1" traceFrame="/lib64/libctacommon.so.0(cta::exception::Exception::Exception(std::string const&, bool)+0x89) [0x7fb155fd8c53]"
Sep 16 08:29:35.108197 laso-tape01 cta-taped: LVL="ERROR" PID="96932" TID="96932" MSG="Stack trace" traceFrameNumber="2" traceFrame="/lib64/libctacommon.so.0(cta::threading::UncaughtExceptionInThread::UncaughtExceptionInThread(std::string const&)+0x37) [0x7fb155ff3517]"
Sep 16 08:29:35.108244 laso-tape01 cta-taped: LVL="ERROR" PID="96932" TID="96932" MSG="Stack trace" traceFrameNumber="3" traceFrame="/lib64/libctacommon.so.0(cta::threading::Thread::wait()+0x13b) [0x7fb155ff2e2d]"
Sep 16 08:29:35.108288 laso-tape01 cta-taped: LVL="ERROR" PID="96932" TID="96932" MSG="Stack trace" traceFrameNumber="4" traceFrame="/usr/bin/cta-taped(castor::tape::tapeserver::daemon::TaskWatchDog::stopAndWaitThread()+0x39) [0x594437]"
Sep 16 08:29:35.108334 laso-tape01 cta-taped: LVL="ERROR" PID="96932" TID="96932" MSG="Stack trace" traceFrameNumber="5" traceFrame="/usr/bin/cta-taped(castor::tape::tapeserver::daemon::DataTransferSession::executeWrite(cta::log::LogContext&, cta::ArchiveMount*)+0x66a) [0x58e4e0]"
Sep 16 08:29:35.108382 laso-tape01 cta-taped: LVL="ERROR" PID="96932" TID="96932" MSG="Stack trace" traceFrameNumber="6" traceFrame="/usr/bin/cta-taped(castor::tape::tapeserver::daemon::DataTransferSession::execute()+0xbd3) [0x58c4bb]"
Sep 16 08:29:35.108426 laso-tape01 cta-taped: LVL="ERROR" PID="96932" TID="96932" MSG="Stack trace" traceFrameNumber="7" traceFrame="/usr/bin/cta-taped(cta::tape::daemon::DriveHandler::runChild()+0x178b) [0x5502bb]"
Sep 16 08:29:35.108468 laso-tape01 cta-taped: LVL="ERROR" PID="96932" TID="96932" MSG="Stack trace" traceFrameNumber="8" traceFrame="/usr/bin/cta-taped(cta::tape::daemon::ProcessManager::runForkManagement()+0x3dd) [0x57293b]"
Sep 16 08:29:35.108510 laso-tape01 cta-taped: LVL="ERROR" PID="96932" TID="96932" MSG="Stack trace" traceFrameNumber="9" traceFrame="/usr/bin/cta-taped(cta::tape::daemon::ProcessManager::run()+0x7e) [0x571934]"
Sep 16 08:29:35.108552 laso-tape01 cta-taped: LVL="ERROR" PID="96932" TID="96932" MSG="Stack trace" traceFrameNumber="10" traceFrame="/usr/bin/cta-taped(cta::tape::daemon::TapeDaemon::mainEventLoop()+0x2fb) [0x53bd85]"
Sep 16 08:29:35.108594 laso-tape01 cta-taped: LVL="ERROR" PID="96932" TID="96932" MSG="Stack trace" traceFrameNumber="11" traceFrame="/usr/bin/cta-taped(cta::tape::daemon::TapeDaemon::exceptionThrowingMain()+0x1d9) [0x53b9a1]"
Sep 16 08:29:35.108636 laso-tape01 cta-taped: LVL="ERROR" PID="96932" TID="96932" MSG="Stack trace" traceFrameNumber="12" traceFrame="/usr/bin/cta-taped(cta::tape::daemon::TapeDaemon::main()+0x37) [0x53b379]"
Sep 16 08:29:35.108676 laso-tape01 cta-taped: LVL="ERROR" PID="96932" TID="96932" MSG="Stack trace" traceFrameNumber="13" traceFrame="/usr/bin/cta-taped() [0x52c73a]"
Sep 16 08:29:35.108714 laso-tape01 cta-taped: LVL="ERROR" PID="96932" TID="96932" MSG="Stack trace" traceFrameNumber="14" traceFrame="/usr/bin/cta-taped(main+0x334) [0x52ce23]"
Sep 16 08:29:35.108752 laso-tape01 cta-taped: LVL="ERROR" PID="96932" TID="96932" MSG="Stack trace" traceFrameNumber="15" traceFrame="/lib64/libc.so.6(__libc_start_main+0xf5) [0x7fb14f654555]"
Sep 16 08:29:35.108808 laso-tape01 cta-taped: LVL="ERROR" PID="96932" TID="96932" MSG="Stack trace" traceFrameNumber="16" traceFrame="/usr/bin/cta-taped() [0x52c3f9]"
Sep 16 08:29:44.171453 laso-tape01 cta-taped: LVL="ERROR" PID="95526" TID="95526" MSG="Aborting cta-taped on uncaught exception. Stack trace follows." Message="Uncaught exception of type 'std::logic_error' in Thread.run(): >>>><<<< End of uncaught exception"
Sep 16 08:29:44.171591 laso-tape01 cta-taped: LVL="ERROR" PID="95526" TID="95526" MSG="Stack trace" traceFrameNumber="0" traceFrame="/lib64/libctacommon.so.0(cta::exception::Backtrace::Backtrace(bool)+0x6e) [0x7fb155fd73bc]"
Sep 16 08:29:44.171677 laso-tape01 cta-taped: LVL="ERROR" PID="95526" TID="95526" MSG="Stack trace" traceFrameNumber="1" traceFrame="/lib64/libctacommon.so.0(cta::exception::Exception::Exception(std::string const&, bool)+0x89) [0x7fb155fd8c53]"
Sep 16 08:29:44.171753 laso-tape01 cta-taped: LVL="ERROR" PID="95526" TID="95526" MSG="Stack trace" traceFrameNumber="2" traceFrame="/lib64/libctacommon.so.0(cta::threading::UncaughtExceptionInThread::UncaughtExceptionInThread(std::string const&)+0x37) [0x7fb155ff3517]"
Sep 16 08:29:44.171847 laso-tape01 cta-taped: LVL="ERROR" PID="95526" TID="95526" MSG="Stack trace" traceFrameNumber="3" traceFrame="/lib64/libctacommon.so.0(cta::threading::Thread::wait()+0x13b) [0x7fb155ff2e2d]"
Sep 16 08:29:44.171916 laso-tape01 cta-taped: LVL="ERROR" PID="95526" TID="95526" MSG="Stack trace" traceFrameNumber="4" traceFrame="/usr/bin/cta-taped(castor::tape::tapeserver::daemon::TaskWatchDog::stopAndWaitThread()+0x39) [0x594437]"
Sep 16 08:29:44.171988 laso-tape01 cta-taped: LVL="ERROR" PID="95526" TID="95526" MSG="Stack trace" traceFrameNumber="5" traceFrame="/usr/bin/cta-taped(castor::tape::tapeserver::daemon::DataTransferSession::executeWrite(cta::log::LogContext&, cta::ArchiveMount*)+0x66a) [0x58e4e0]"
Sep 16 08:29:44.172067 laso-tape01 cta-taped: LVL="ERROR" PID="95526" TID="95526" MSG="Stack trace" traceFrameNumber="6" traceFrame="/usr/bin/cta-taped(castor::tape::tapeserver::daemon::DataTransferSession::execute()+0xbd3) [0x58c4bb]"
Sep 16 08:29:44.172136 laso-tape01 cta-taped: LVL="ERROR" PID="95526" TID="95526" MSG="Stack trace" traceFrameNumber="7" traceFrame="/usr/bin/cta-taped(cta::tape::daemon::DriveHandler::runChild()+0x178b) [0x5502bb]"
Sep 16 08:29:44.172202 laso-tape01 cta-taped: LVL="ERROR" PID="95526" TID="95526" MSG="Stack trace" traceFrameNumber="8" traceFrame="/usr/bin/cta-taped(cta::tape::daemon::ProcessManager::runForkManagement()+0x3dd) [0x57293b]"
Sep 16 08:29:44.172269 laso-tape01 cta-taped: LVL="ERROR" PID="95526" TID="95526" MSG="Stack trace" traceFrameNumber="9" traceFrame="/usr/bin/cta-taped(cta::tape::daemon::ProcessManager::run()+0x7e) [0x571934]"
Sep 16 08:29:44.172333 laso-tape01 cta-taped: LVL="ERROR" PID="95526" TID="95526" MSG="Stack trace" traceFrameNumber="10" traceFrame="/usr/bin/cta-taped(cta::tape::daemon::TapeDaemon::mainEventLoop()+0x2fb) [0x53bd85]"
Sep 16 08:29:44.172411 laso-tape01 cta-taped: LVL="ERROR" PID="95526" TID="95526" MSG="Stack trace" traceFrameNumber="11" traceFrame="/usr/bin/cta-taped(cta::tape::daemon::TapeDaemon::exceptionThrowingMain()+0x1d9) [0x53b9a1]"
Sep 16 08:29:44.172478 laso-tape01 cta-taped: LVL="ERROR" PID="95526" TID="95526" MSG="Stack trace" traceFrameNumber="12" traceFrame="/usr/bin/cta-taped(cta::tape::daemon::TapeDaemon::main()+0x37) [0x53b379]"
Sep 16 08:29:44.172541 laso-tape01 cta-taped: LVL="ERROR" PID="95526" TID="95526" MSG="Stack trace" traceFrameNumber="13" traceFrame="/usr/bin/cta-taped() [0x52c73a]"
Sep 16 08:29:44.172599 laso-tape01 cta-taped: LVL="ERROR" PID="95526" TID="95526" MSG="Stack trace" traceFrameNumber="14" traceFrame="/usr/bin/cta-taped(main+0x334) [0x52ce23]"
Sep 16 08:29:44.172661 laso-tape01 cta-taped: LVL="ERROR" PID="95526" TID="95526" MSG="Stack trace" traceFrameNumber="15" traceFrame="/lib64/libc.so.6(__libc_start_main+0xf5) [0x7fb14f654555]"
Sep 16 08:29:44.172722 laso-tape01 cta-taped: LVL="ERROR" PID="95526" TID="95526" MSG="Stack trace" traceFrameNumber="16" traceFrame="/usr/bin/cta-taped() [0x52c3f9]"
I lookuped the files on tapes, and these files are archived successfully,
[cta@cta ~]$ cta tf ls -l -v B02705
archive id copy no vid fseq block id instance disk fxid size checksum type checksum value storage class owner group creation time path
2 2 B02705 3 21 eoslaso 3e 158 ADLER32 9c6d36f9 laso-raw 12142 600 2021-09-15 23:47 /eos/laso/test/nn
21 2 B02705 4 31 eoslaso 3f 158 ADLER32 9c6d36f9 laso-raw 12142 600 2021-09-16 06:52 /eos/laso/raw/test/hosts
22 2 B02705 5 41 eoslaso 40 158 ADLER32 9c6d36f9 laso-raw 12142 600 2021-09-16 06:57 /eos/laso/raw/test/aa
23 2 B02705 6 51 eoslaso 41 158 ADLER32 9c6d36f9 laso-raw 12142 600 2021-09-16 07:08 /eos/laso/test/oo
24 2 B02705 7 61 eoslaso 42 158 ADLER32 9c6d36f9 laso-raw 12142 600 2021-09-16 07:21 /eos/laso/raw/test/bb
26 2 B02705 8 71 eoslaso 43 158 ADLER32 9c6d36f9 laso-raw 12142 600 2021-09-16 08:01 /eos/laso/raw/test/cc
27 2 B02705 9 81 eoslaso 44 2.1G ADLER32 baf4aec3 laso-raw 12142 600 2021-09-16 08:09 /eos/laso/raw/test/test.img
28 2 B02705 10 8282 eoslaso 45 2.1G ADLER32 baf4aec3 laso-raw 12142 600 2021-09-16 08:17 /eos/laso/raw/test/2.img
29 2 B02705 11 16483 eoslaso 46 2.1G ADLER32 baf4aec3 laso-raw 12142 600 2021-09-16 08:23 /eos/laso/test/2.img
30 2 B02705 12 24684 eoslaso 47 2.1G ADLER32 baf4aec3 laso-raw 12142 600 2021-09-16 08:28 /eos/laso/raw/test/3.img
31 2 B02705 13 32885 eoslaso 48 2.1G ADLER32 baf4aec3 laso-raw 12142 600 2021-09-16 08:32 /eos/laso/test/3.img
32 2 B02705 14 41086 eoslaso 49 2.1G ADLER32 baf4aec3 laso-raw 12142 600 2021-09-16 08:38 /eos/laso/test/4.img
But the eos don’t recognize the archive result as Files archived but eoscta disagrees.
We don’t know what the problem is. Did anyone else have such problems? Thanks.
Cheers,
Yujiang