Releasing the CTA operator tools

Hi everyone,
At CERN we’re thinking about making a push to release some of our CTA operator tools, which have so far lived in a separate private repo.
The operator tools are what we use to let operators and automation systems perform CTA-related tasks; they also power the data-gathering parts of our monitoring.

Here is the work-in-progress plan for releasing these tools:

  • The code will be GPLv3 licensed.
  • We will create a new public repo at gitlab.cern.ch, then we’ll gradually migrate tools there.
  • Packaging-wise, we’re thinking of providing pip packages, since most of the code is Python, plus RPMs that wrap the installation of those pip packages and the other required dependencies (see the packaging sketch after this list).
  • We’ll start with three components to test the waters and work out the details. These would be:
    • ctautils - a set of utility modules used by most of the tools
    • tapeadmin - a module for tape-specific functionality
    • ATRESYS - a newly written set of tools for automating and tracking repacks and the surrounding tape lifecycle
  • We have some internal documentation for our tools and metrics; this can be published on the Gitlab wiki together with the code as it moves.
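To make the packaging idea a bit more concrete, here is a minimal sketch of what the pip side could look like. The file name, version and metadata below are illustrative assumptions, not the actual packaging files:

```python
# Hypothetical setup.py for the ctautils package -- the name, version and
# metadata are placeholders used only to illustrate the pip packaging idea.
from setuptools import setup, find_packages

setup(
    name="ctautils",                    # assumed distribution name
    version="1.0",                      # illustrative; see the x.y scheme later in the thread
    description="Utility modules shared by the CTA operator tools",
    license="GPL-3.0-or-later",
    packages=find_packages(),
    python_requires=">=3.6",
    install_requires=[],                # external deps are pinned in requirements.txt
)
```

The RPMs would then mostly be thin wrappers that pull in these pip packages together with the non-Python dependencies.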

The released software will be as generic as possible: CERN-specific details should either go/stay in our Puppet profiles (which will not be shared), or be replaced by config files so that everyone can adapt the tools to their own setup.
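As a rough illustration of the config-file approach (the file location, section and option names are all assumptions for this sketch):

```python
# Sketch of a tool reading site-specific values from a config file instead of
# hard-coding CERN details. Path, section and option names are hypothetical.
import configparser

config = configparser.ConfigParser()
config.read("/etc/cta-ops/ctaops.conf")   # hypothetical config location

# Site-local values a tool might look up rather than hard-code:
eos_instance = config.get("site", "eos_instance", fallback="eosdev")
tape_library = config.get("site", "tape_library", fallback="lib0")
```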

I would be interested in hearing your thoughts on the plan above and how well this works for various sites once we get going.
There are also some things I’m not sure how to do properly.
For instance, it would be nice to share the Grafana dashboards that go with the metrics, but as far as I know there is no systematic way of exposing them from Grafana’s built-in version control.
Perhaps committing the exported JSON to the repo from time to time would be sufficient?
The situation is similar for the Rundeck job definitions.
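For the Grafana part, one possible take on the “commit the exported JSON” idea is a small script that pulls every dashboard through Grafana’s HTTP API and writes it into the repo; the server URL, token and output directory below are placeholders:

```python
# Sketch: export all dashboards via Grafana's HTTP API so the JSON can be
# committed to the repo. URL, token and output directory are placeholders.
import json
import pathlib
import requests

GRAFANA = "https://grafana.example.org"            # placeholder Grafana URL
HEADERS = {"Authorization": "Bearer <api-token>"}  # placeholder API token
OUT = pathlib.Path("dashboards")
OUT.mkdir(exist_ok=True)

# /api/search lists dashboards; /api/dashboards/uid/<uid> returns their JSON.
for hit in requests.get(f"{GRAFANA}/api/search", headers=HEADERS,
                        params={"type": "dash-db"}).json():
    dash = requests.get(f"{GRAFANA}/api/dashboards/uid/{hit['uid']}",
                        headers=HEADERS).json()["dashboard"]
    (OUT / f"{hit['uid']}.json").write_text(json.dumps(dash, indent=2, sort_keys=True))
```

Something similar could presumably be scripted against the Rundeck API for the job definitions.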

@snorenberg @mwai @kotlyar (it looked like you might be interested, based on previous discussions)


Hi Richard!

Super! I have actually been thinking recently about the supply pool implementation at CERN and about possible monitoring solutions, so as far as I understand this is a good direction to go. Good luck!

Some notes from my point of view:

  • we do not use Grafana or Rundeck, so I cannot help you there
  • using a wiki does not look great to me; maybe .md-based documentation in a self-documenting git repo is better (with or without a *.md site generator)
  • regarding pip for Python: maybe it would be good to do everything Python-related in a dedicated Python environment (perhaps based on Miniconda) and avoid the system default packages completely. Such projects usually provide a requirements file for the dependencies
  • we are actually interested in using containers as soon as we have Ubuntu everywhere
  • we use Ansible playbooks for setups where possible
  • in theory, if you build something for CI/CD testing based on containers and Ansible (as a replacement for bash scripts), that would also be worth sharing
  • a .gitlab-ci.yml could also be a helper here

Cheers,
Victor


Thanks Victor, this is all good to know and I’ll take it into consideration.

The Gitlab wiki feature is entirely Markdown-based, so each article would be a .md file in the project. One can even clone the wiki part as a repo of its own and work with it locally, or render it in some other way.

Yes, a dedicated Python environment is something we could work on. We faced a similar issue with colliding package versions from the CERN monitoring setup. For now we rely on creating a special path /opt/ctaops-lib/... where we put our dependencies, and then making our scripts prefer these when importing (a rough sketch of that trick follows below).
External dependencies are already tracked and version-locked using a requirements.txt, so we’ll include this as well.
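For illustration only (the directory name is from the post above; the rest is an assumption about how such a preference could be expressed):

```python
# Sketch of the "/opt/ctaops-lib takes precedence" trick: put the private
# dependency directory at the front of sys.path so it wins over any
# system-wide copy of the same package.
import sys

sys.path.insert(0, "/opt/ctaops-lib")   # dependencies installed here win

import requests  # example import; resolved from /opt/ctaops-lib first, if present there
```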

Hi Richard,

We gladly welcome the idea! Some time back Vlado shared some of those tools with us, which we customised for our setup. So it’s a nod from us!

cheers,
mwai

Hi @Mwai and @kotlyar,
I’ve now made the repository for our public operator tools available at: cta / CTA Operations utilities · GitLab.
The pre-built pip packages can be found at: Index of /cta-operations/pip/simple.

Please note the versioning and release scheme:
Releases and pip packages are tagged with versions x.y, where x is incremented when CTA (in particular tape-admin) changes in a backwards-incompatible way, and y is incremented when there is an update to the operator tools for the same CTA release. The package index may also contain packages versioned with x.y-devz. These are release candidates which we are testing internally and should not be used unless you are feeling particularly adventurous.

The present 1.y tags of the operator tools are tested with CTA v4.8.*.
The 1.2 tag is the candidate for the switchover on our production setup, where we will start using the contents of the public repo instead of the corresponding contents in our private repo.

What is available now

  • CTA Operations script libraries:
    • ctautils - collection of helpers and wrappers which are re-used across the various operator tools
    • tapeadmin - library for interacting with tape media and libraries
  • ATRESYS - Automated Tape REpacking SYStem, a tool for managing the tape life-cycle surrounding repack operations. Vlado will present this tool at the EOS workshop.
    • Also includes some config and examples for monitoring.
  • Documentation published in the Gitlab project’s wiki: Home · Wiki · cta / CTA Operations utilities · GitLab
  • A requirements.txt file with recommended dependency versions. The idea is to install the packages specified here into a dedicated venv (a minimal bootstrap sketch follows this list).
  • A config file template. You will have to install this manually and adjust it to work with your setup. Configuration for future tools will be added to this template as well.
  • Makefiles and .gitlab-ci.yml for easy building.
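Purely as a sketch of the “dedicated venv” idea (the venv location is an assumption, and this is not the official install procedure):

```python
# Minimal bootstrap: create an isolated venv and install the pinned
# dependencies from requirements.txt into it. The path is hypothetical.
import subprocess
import venv

VENV_DIR = "/opt/cta-ops-venv"           # assumed venv location

venv.create(VENV_DIR, with_pip=True)     # create the isolated environment
subprocess.run(
    [f"{VENV_DIR}/bin/pip", "install", "-r", "requirements.txt"],
    check=True,
)
```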

Things coming next

  • An RPM for managing non-Python dependencies and version-locking to specific CTA versions.
  • More tools; we’re considering the EOSCTA namespace reconciliation scripts and our ACL management tools.
  • More monitoring examples

Feedback

When you find the time to play around with this, we’d appreciate any feedback you may have so far, in particular:

  • Whether there are any dependencies we missed.
  • For ATRESYS in particular: whether or not the installation of psycopg2 (the pip package for PostgreSQL interactions) works for you. On CC7 we had issues with both the binary and non-binary pip packages on PyPI, so we built it from source and installed that instead. Instructions for this maneuver are in the wiki.
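If it helps, a quick sanity check after the source build could look like the snippet below (this is just a suggestion, not part of the wiki instructions):

```python
# Confirm that the locally built psycopg2 imports and report the libpq it was
# linked against before pointing ATRESYS at it.
import psycopg2
import psycopg2.extensions

print("psycopg2 version:", psycopg2.__version__)
print("libpq version:", psycopg2.extensions.libpq_version())
```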

Hi Richard,

Many thanks! Sure, I will post feedback if I have any.

Cheers,
Victor

Many thanks Richard!

Thank you Richard! I am having a look now.

Hi everyone, just wanted to let you know that thanks to the efforts of @lwardena and our summer student Thomas, we now have a new release of the operations utilities (1.4), which includes many new tools:

  • The tape verification tool, which is used to periodically read back tape data in order to verify its integrity
  • The tape supply tool, which we use to automatically re-fill the tapepools belonging to VOs from pools of fresh tapes
  • The cta-ops-admin command, which is a customisable wrapper around cta-admin, and a set of tape scripts used for testing drives, labelling tapes, etc. (a rough sketch of the wrapper idea follows this list)
  • Two EOS-specific tools, which allow for changing storage classes and for fetching the path in EOS corresponding to a file in CTA.
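To make the wrapper idea concrete without describing the real tool, here is a heavily simplified sketch; the config path, section and option names are assumptions, and no cta-admin options are invented, arguments are simply passed through:

```python
# Illustration of a customisable cta-admin wrapper, NOT the actual
# cta-ops-admin implementation: read site defaults from a config file and
# pass them, plus whatever the operator typed, to the real cta-admin binary.
import configparser
import subprocess
import sys

config = configparser.ConfigParser()
config.read("/etc/cta-ops/ctaops.conf")                    # hypothetical config
default_args = config.get("cta-admin", "default_args", fallback="").split()

subprocess.run(["cta-admin", *default_args, *sys.argv[1:]], check=False)
```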

There are also a few tweaks to the repack automation system.

We’ll move our setup to use the public tools instead of the internal edition in the coming days, so there will likely be a new tag with bugfixes soon. In the meantime, feel free to check these out, play with them, and report bugs!
