[EU Only] Emergency Maintenance
Incident Report for Cirrus Assessment
Postmortem

Impact

During off-peak hours, our EU (non-Premium) customers may have experienced minimal downtime (< 5 minutes) due to an unplanned restart of our caching cluster. No reports were received, and the few candidates taking exams should not have been impacted.

Root Cause

Human error during scheduling of maintenance.

Resolution

The platform automatically recovered from the caching cluster restart.

Preventative Measures

  • [DONE] CSOP220_Cirrus_SysOps_Procedures reviewed.
  • [DONE] Caching cluster maintenance scripted.
Posted Nov 03, 2023 - 14:44 CET

Resolved
Our EU (non-Premium) customers may have experienced minimal downtime (< 5 minutes) this evening due to an unplanned restart of our caching cluster. The few candidates taking exams should not have been impacted but may have briefly noticed a connection issue.
Posted Oct 16, 2023 - 19:08 CEST