Hi again,
We wanted to start testing the statistics tools, but so far we have observed that their usage is quite different from what is described in the documentation (Introduction - EOSCTA Docs).
cta-statistics-update
On the one hand, when we try to use cta-statistics-update, it throws an error about another SQL query being in progress, even though there is no activity in our system and we have checked the existing queries in the DB:
[root@ctatps001 ~]# cta-statistics-update /etc/cta/cta-catalogue.conf
Updating tape statistics in the catalog...
Aborting: virtual void cta::statistics::DatabaseStatisticsService::updateStatisticsPerTape(): executeNonQuery failed: executeNonQuery failed for SQL statement UPDATE TAPE TAPE_TO_UPDATE SET(DIRTY,NB_MASTER_FILES,MASTER_DATA_IN_BYTES,NB_...: can not execute sql, another query is in progress
The documentation also states that this tool needs the SUPERSEDED_BY_VID and SUPERSEDED_BY_FSEQ columns to exist in the TAPE_FILE table, which our catalogue schema certainly doesn’t have. Still, we don’t know whether this has anything to do with the error.
cta-statistics-save
On the other hand, the cta-statistics-save command does not recognize the --build, --drop, or --statisticsconf parameters described in the documentation, and only allows us to provide the connection string to the catalogue. It also returns its output in JSON even without the --json flag.
Is the documentation up to date? Is this tool being used in another way now, or not used at all?
It seems, at least, that a MySQL database is no longer needed, right?
There might be something totally wrong in our way of understanding the tool, so apologies if that’s the case.
I hope you can clarify this matter for us.
Many thanks in advance,
Jordi.
Hi @jcasals,
We are currently reviewing the state and use of the statistics tool internally. The wiki page is indeed out of date for recent versions of CTA, apologies for that. We will update it in the near future, once we have sorted out this part of the statistics gathering on our end.
A likely outcome is that we will move away from using cta-statistics-save at CERN. The output of cta-admin --json tape ls --all gives more detail for monitoring and statistics display purposes if you store it in a DB.
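As a rough illustration of storing that output in a DB, here is a minimal Python sketch that runs the command and keeps each tape record as a timestamped JSON blob in SQLite. The table name `tape_snapshot`, the choice of SQLite, and the `vid` field name are assumptions made for illustration, and the sketch assumes the command prints a single JSON array; check the actual output of your CTA version before relying on any of it.

```python
import json
import sqlite3
import subprocess
import time


def fetch_tapes():
    """Run cta-admin and return the parsed list of tape records.

    Assumes the command prints one JSON array on stdout.
    """
    result = subprocess.run(
        ["cta-admin", "--json", "tape", "ls", "--all"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


def store_snapshot(conn, tapes, taken_at=None):
    """Insert one row per tape record and return the number of rows written.

    Each record is stored whole as JSON text, so no knowledge of the
    individual fields is needed apart from the assumed "vid" key.
    """
    taken_at = int(time.time()) if taken_at is None else taken_at
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tape_snapshot ("
        "taken_at INTEGER NOT NULL, vid TEXT, record TEXT NOT NULL)"
    )
    rows = [
        (taken_at, tape.get("vid"), json.dumps(tape, sort_keys=True))
        for tape in tapes
    ]
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT INTO tape_snapshot VALUES (?, ?, ?)", rows)
    return len(rows)
```

Calling `store_snapshot(sqlite3.connect("tape_stats.db"), fetch_tapes())` periodically would build up a history that can be queried later. Storing the whole record keeps the sketch independent of schema changes between CTA releases.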
cta-statistics-update is still needed for the time being. It is used to update the internal tape stats for ‘dirty’ tapes. Based on the error message above, it appears that you have run into a bug in the tool; thank you for reporting it. I have created this issue to track progress on resolving the problem: cta-statistics-update can fail for catalogues in postgres (#181) · Issues · cta / CTA · GitLab
We run the update command in just the same way as you, but it appears to work for us because we are still using an Oracle DB.
The MySQL DB is indeed no longer needed as of CTA v4.4.0-1, and the superseded concept was replaced in v3.2-1.
Hope this is helpful to you, cheers,
-Richard
Hi @jcasals,
As promised I’ve updated the docs for the statistics tool. I hope that the explanations of the returned values are a bit clearer now: The cta-statistics-update tool - EOSCTA Docs
But as mentioned, we really recommend you base your statistics gathering and monitoring on the output of cta-admin --json ta ls --all instead of using the output of cta-statistics-save, especially if you’re setting up something new. The tape ls output is much more flexible and allows you to create more interesting plots/dashboards.
We’re not using cta-statistics-save anymore ourselves, but we do (as should you, once it is fixed) use the cta-statistics-update command to update the statistics for recently written tapes.
Hi Richard,
Thank you so much for fixing this. We will follow your advice on monitoring the system. We already have some scripts that use the JSON output to send metrics to our metrics backend, run from a cron job every minute.
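For readers setting up something similar, a script of this kind might look roughly like the following Python sketch, which sums capacity and occupancy per tape pool from the JSON and renders simple "name value" lines. The field names `tapepool`, `capacity`, and `occupancy`, the `cta.tapepool` metric prefix, and the line format are all assumptions for illustration, not a description of the scripts mentioned above; numeric values are cast with `int()` in case they arrive as strings.

```python
import json
from collections import defaultdict


def pool_metrics(tapes):
    """Sum tape count, capacity and occupancy per tape pool.

    The field names used here are assumed, not taken from the CTA docs.
    """
    totals = defaultdict(lambda: {"tapes": 0, "capacity": 0, "occupancy": 0})
    for tape in tapes:
        pool = totals[tape.get("tapepool", "unknown")]
        pool["tapes"] += 1
        pool["capacity"] += int(tape.get("capacity", 0))
        pool["occupancy"] += int(tape.get("occupancy", 0))
    return dict(totals)


def to_lines(totals, prefix="cta.tapepool"):
    """Render the totals as 'name value' lines (hypothetical naming scheme)."""
    lines = []
    for pool in sorted(totals):
        for key in ("tapes", "capacity", "occupancy"):
            lines.append(f"{prefix}.{pool}.{key} {totals[pool][key]}")
    return lines


def lines_from_json(text):
    """Parse a JSON array of tape records and return the metric lines."""
    return to_lines(pool_metrics(json.loads(text)))
```

Feeding the text printed by cta-admin --json tape ls --all into `lines_from_json` yields lines that can then be forwarded to whatever metrics backend is in use.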
We will look into running cta-statistics-update. For the moment we are still putting in place a small test system with a semi-automated installation.
We’ll let you know, but thank you so much for fixing the documentation.