Can you please confirm the exact target year for deploying the PostgreSQL SchedulerDB in production at CERN?
I do recall from the CTA 2025 workshop that the aim is to have the new scheduler setup ready before the beginning of LHC Run 4. Does this mean 2029 or 2030?
We are targeting Q1 2026 to deploy the PostgreSQL SchedulerDB on our Repack instance only.
All physics instances will stay on the objectstore SchedulerDB until the end of Run 3. During the next year we will focus on scale/stress testing with the goal of full deployment in 2027.
The context of my question is that our CTA Ceph nodes (purchased in 2019) are no longer under warranty and need to be replaced soon, which means that we will be running an objectstore SchedulerDB past 2027. Are you going to support the Ceph object store (at least at the level of tickets) past this date?
Once we have migrated to the PostgreSQL SchedulerDB we would like to deprecate the Ceph objectstore as soon as possible. One of the motivations to switch is to free ourselves from certain constraints of the objectstore. Future developments of the CTA scheduler are not guaranteed to be backwards compatible.
Once we have a stable production release of the PostgreSQL SchedulerDB we will be in contact with all CTA sites to work on a migration schedule. This is foreseen for 2027.
Given that all sites will need to work on a migration schedule in 2027, I conclude that there is no point in purchasing new Ceph nodes (that will arrive in 2026) with a 5-year lifetime.
One last question: what are the hardware provisioning requirements for a cluster/node that runs the PostgreSQL SchedulerDB? Do we need hosts with the spec of those running a standard PostgreSQL DB?
Our standard PostgreSQL high-availability (HA) configuration consists of three nodes: one primary, one secondary (replica), and one witness node for quorum and failover management. All PostgreSQL instances will run on Proxmox virtual machine hosts to ensure flexibility and resource efficiency.
We do not yet have any information on hardware requirements. We plan to use the central IT department PostgreSQL service in the first instance, so we will not run the DB backend ourselves.
Once we have it running, we will collect telemetry, run stress tests, and observe performance over time. In particular, we want to see whether VACUUM causes any performance penalty. At that point we should be able to give some informed advice.
I realise that you are ordering hardware now, so if you can’t wait, I can suggest asking the CERN DB team. Let me know if you need a contact.
Thanks, I will not need a contact from the CERN DB team. As our DB team has confirmed that they will not be deploying PostgreSQL on physical hardware, I think it makes sense to wait until 2027 and pick up this issue when we work on the migration schedule with you.