Ceph/librados version requirements for CTA

Hello,

We noted that the version of librados used by CTA is pinned to 14.2.8, both for building RPMs and for install dependencies. Is there a reason for not allowing newer librados versions to be installed alongside CTA packages (or is it just because that was the rados library you tested against)?

We were planning to keep the librados version on the CTA nodes roughly in step with the version running on the Ceph cluster, but the spec file indicates that you don’t do this. Can you shed any light on what you do in production with librados versions and Ceph cluster versions?

Many thanks,

George, Tom

Dear George and Tom,
the versionlock file allows us to maintain consistent and reproducible builds and tests in the CI environment.

It also does more than that: the versionlock defines the set of RPMs we used for all our tests of a specific CTA version, and that we run in production.

If your software stack diverges from this set you are mostly on your own as we cannot support all possible library mixes (xrootd, rados, protobuf,…).

You are right that while we allow newer versions of xrootd to be installed alongside the CTA RPMs, librados is strictly versioned in the spec file. We have removed this constraint from CTA master today and applied the same constraint as for xrootd (>= version).

Our Ceph cluster version is 14.2.19 while the client side is 14.2.8, so anything between these two should be fine.
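
As a rough illustration of the difference between an exact pin (=) and a minimum version (>=), and of the "between these two" range, here is a minimal Python sketch. It is not part of CTA or of RPM: the function names are hypothetical, and the comparison assumes purely numeric dotted versions, which is much simpler than the real RPM version comparison (no epoch or release handling).

```python
def parse_version(text):
    """Turn a dotted version such as '14.2.8' into a tuple of ints.

    Comparing tuples of ints avoids the string-comparison trap where
    '14.2.19' would sort before '14.2.8'.
    """
    return tuple(int(part) for part in text.split("."))


def satisfies(installed, operator, required):
    """Evaluate a simplified '=' or '>=' dependency on dotted versions."""
    have, want = parse_version(installed), parse_version(required)
    if operator == "=":
        return have == want
    if operator == ">=":
        return have >= want
    raise ValueError(f"unsupported operator: {operator}")


def in_tested_range(installed, client="14.2.8", cluster="14.2.19"):
    """True if 'installed' lies between the client and cluster versions
    quoted in this thread (both bounds inclusive)."""
    return parse_version(client) <= parse_version(installed) <= parse_version(cluster)
```

With an exact pin, satisfies("14.2.19", "=", "14.2.8") is false, whereas the relaxed form satisfies("14.2.19", ">=", "14.2.8") is true. The relaxed form also accepts versions newer than the cluster, which in_tested_range would reject.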

Best regards,
Julien Leduc

Hi all,

I unfortunately had to revert my commit that relaxed the version of librados used with CTA:

commit 680f2a96c527e48dddd428a5071254f88bfd3ea3
Author: Steven Murray <Steven.Murray@cern.ch>
Date: Tue Apr 27 10:19:20 2021 +0200

Revert "Relaxed dependency on radosVersion.  Instead of = we now use >="

This reverts commit 6bbd0afaecf1ff195e94fec80be59c0188ff185c.

It is too dangerous to not pin the version rados to exact version.

I had no choice but to revert because the Ceph developers have demonstrated in the past that they do not respect ABI compatibility between minor versions of Rados.

There are now two possible ways forward:

  1. RAL modifies their local copy of the CTA source to build against and use the exact version of Rados they are happy with.
  2. CERN simply uses the same version of Rados as RAL.

Thanks for the replies, and thanks for confirming that you don’t expect/require the client (CTA) librados versions to match the cluster version.

The simple option seems to be that we mirror what you do:

  • Our CTA machines will stick with whatever version of librados CERN build and test against.
  • Our Ceph cluster can start on a recent-ish Nautilus release (e.g. 14.2.19), then we can upgrade as needed.

The fact that Ceph Nautilus (14.2.x) is planned to reach EOL around June is another issue that we can worry about later. In our experience, there is good compatibility when running mismatched client and cluster versions, so I wouldn’t expect any issues upgrading the cluster to the next major version without upgrading the client librados version at the same time. Still, it definitely feels like something we should test before we go into production, if at all possible.
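
A small pre-production check along these lines could flag when the client and cluster are on different major releases. This is only a sketch under assumptions: the helper names are made up, the version strings are passed in by hand rather than queried from Ceph, and the release table covers only the majors around Nautilus.

```python
# Ceph major version numbers and their release names.
CEPH_RELEASES = {14: "Nautilus", 15: "Octopus", 16: "Pacific"}


def major_of(version):
    """Return the major number of a dotted version such as '14.2.8'."""
    return int(version.split(".")[0])


def release_name(version):
    """Map a version to its release name, or a placeholder if unknown."""
    major = major_of(version)
    return CEPH_RELEASES.get(major, f"unknown (major {major})")


def describe_pair(client_version, cluster_version):
    """Summarise whether client librados and cluster share a major release."""
    client = release_name(client_version)
    cluster = release_name(cluster_version)
    if major_of(client_version) == major_of(cluster_version):
        return f"client and cluster are both {client}"
    return f"client is {client}, cluster is {cluster}: test before production"
```

For example, describe_pair("14.2.8", "14.2.19") reports a matching pair, while a Nautilus client against an Octopus cluster is flagged as a combination to test first.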

Cheers,
Tom