Proceedings of the 33rd European Safety and Reliability Conference (ESREL 2023)
3 – 8 September 2023, Southampton, UK

A Discussion on the use of Eliminative Argumentation (EA) to Identify Key Performance Indicators (KPIs) for the CERN LHC Machine Protection System

Chris Rees1,a, Adam Casey1,b, Jeff Joyce1,c, Jan Uythoven2,d, Markus Zerlauth2,e, Lukas Felsberger2,f and Torin Viger3

1Critical Systems Labs, Vancouver, Canada.

2European Organization for Nuclear Research (CERN).

3University of Toronto.

ABSTRACT

Key Performance Indicators (KPIs) and Safety Performance Indicators (SPIs) form an integral part of the Safety Management System (SMS) for a selected system. They provide key insight into the system's safety performance and risk management, and enable data-driven decision-making.

A KPI for a system is defined as "a quantifiable measure used to evaluate the success of an organization, employee, etc. in meeting objectives for performance". The KPIs discussed in this paper measure the success or performance of the relevant identified sub-systems. Together, KPIs and SPIs provide a means of evaluating the performance and safety of the systems they are associated with. KPIs can be used to estimate the safety performance of a system, as well as to support the safety case and ensure that it remains "fit for purpose" and "live".

The paper also discusses how KPIs can be grouped into "leading" and "lagging" indicators. A leading indicator tracks the occurrence of events that, while not themselves harmful, are expected to precede, or indicate the potential for, more harmful events. A lagging indicator tracks the occurrence rate of hazards and/or loss events, such as crashes, injuries and fatalities. Each type of indicator has its own advantages and limitations, which are discussed in the paper, along with the challenges of collecting accurate data to support KPIs.

KPIs have a variety of potential uses, such as tracking safety trends over time, measuring system compliance with regulations and legislation, and providing evidence for the system's safety case. This paper focuses on how KPIs can be defined from the safety (assurance) case assessment process. Specifically, it demonstrates the use of Eliminative Argumentation (EA) to define the potential hazards associated with the machine protection system at the nuclear research facility CERN. We discuss the evaluation and identification of the KPIs for each of the relevant sub-systems. Further, we show how performance indicators are identified with the EA assessment and the corresponding nodes, and how the content of this assessment is linked via a "golden thread". We show how the indicators can be analysed post-mortem to ensure that the safety case remains valid and "live" as the system changes. Finally, we discuss how the use of KPIs can benefit the safety case and why ensuring that it remains "live" (fit for purpose) is critical to the continued safe operation of a system.

In summary, KPIs play a critical role in keeping a safety case live by providing ongoing monitoring, driving continuous improvement, providing documentation, and establishing accountability for safety performance. By using them effectively, organizations can ensure that safety goals are being met over time.

Keywords: Safety case, CERN, Nuclear research, Machine protection system, LHC, Risk assessment, HAZOP, FMEA, SWIFT, EA, Performance indicators, SPIs, KPIs.


