Proceedings of the
35th European Safety and Reliability Conference (ESREL2025) and
the 33rd Society for Risk Analysis Europe Conference (SRA-E 2025)
15 – 19 June 2025, Stavanger, Norway
Enhancement of a Hydrogen Incident and Accident Database Using Large Language Models
1Dept. Gas Technology, SINTEF Energy Research, Norway.
2Dept. Civil, Chemical, Environmental, and Materials Engineering, University of Bologna, Italy.
3Dept. Mechanical and Industrial Engineering, NTNU Norwegian University of Science and Technology, Norway.
4Dept. Electronic Systems, NTNU Norwegian University of Science and Technology, Norway.
5Dept. Mechanical and Aerospace Engineering, Sapienza University of Rome, Italy.
ABSTRACT
Hydrogen holds significant potential for decarbonizing various industries, including energy and mobility. However, the limited availability of accident data hinders effective safety risk analysis and assessment. This study leverages large language models to fill gaps in the Hydrogen Incidents and Accidents Database (HIAD) 2.1, a prominent repository of hydrogen-related unwanted events. A three-step artificial intelligence-driven algorithm is proposed: (i) a preprocessing phase that standardizes and prepares each event description, (ii) a processing phase that uses OpenAI's sentence embedding technology to capture semantic relationships, and (iii) an enhancement phase that employs trained multilayer perceptrons to impute missing data. The algorithm shows promising results in predicting categorical entries and is applied to enhance the entire database, with a specific focus on the 2019 fueling station fire in Sandvika (Norway). This case study highlights the algorithm's potential to improve the understanding of hydrogen-related incidents and to contribute to enhanced risk management strategies.
Keywords: Hydrogen, HIAD, Safety, Large language model, Artificial intelligence, Deep learning, Machine learning.
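The three-step algorithm outlined in the abstract (preprocess, embed, impute) can be illustrated with a minimal, offline sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the field names `description` and `event_type`, the invented toy records (not HIAD entries), the network size, and the training settings. In particular, `embed` is a hashed bag-of-words stand-in, not OpenAI's embedding model, and `TinyMLP` is a hand-written single-hidden-layer perceptron rather than the authors' trained networks.

```python
import math
import random
import re
import zlib

DIM = 16  # toy embedding size (assumption); real sentence embeddings are far larger


def preprocess(text):
    """Step (i): lowercase, drop punctuation, collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9 ]+", " ", text.lower()).split())


def embed(text):
    """Step (ii) stand-in: hashed bag-of-words, L2-normalised.
    Placeholder only, NOT an OpenAI embedding, so the sketch runs offline."""
    vec = [0.0] * DIM
    for token in text.split():
        vec[zlib.crc32(token.encode()) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


class TinyMLP:
    """One hidden ReLU layer and a softmax output, trained with plain SGD."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
        self.b2 = [0.0] * n_out

    def forward(self, x):
        h = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        p = softmax([sum(w * hj for w, hj in zip(row, h)) + b
                     for row, b in zip(self.w2, self.b2)])
        return h, p

    def train(self, xs, ys, epochs=200, lr=0.1):
        for _ in range(epochs):
            for x, y in zip(xs, ys):
                h, p = self.forward(x)
                # Gradient of cross-entropy w.r.t. the output logits.
                d_out = [pk - (1.0 if k == y else 0.0) for k, pk in enumerate(p)]
                # Back-propagate to the hidden layer before touching w2.
                d_hid = [sum(d_out[k] * self.w2[k][j] for k in range(len(d_out)))
                         if h[j] > 0 else 0.0 for j in range(len(h))]
                for k, dk in enumerate(d_out):
                    for j, hj in enumerate(h):
                        self.w2[k][j] -= lr * dk * hj
                    self.b2[k] -= lr * dk
                for j, dj in enumerate(d_hid):
                    for i, xi in enumerate(x):
                        self.w1[j][i] -= lr * dj * xi
                    self.b1[j] -= lr * dj

    def predict(self, x):
        _, p = self.forward(x)
        return p.index(max(p))


def impute(records, field):
    """Step (iii): train on records where `field` is known, fill in the rest."""
    known = [r for r in records if r.get(field) is not None]
    labels = sorted({r[field] for r in known})
    xs = [embed(preprocess(r["description"])) for r in known]
    ys = [labels.index(r[field]) for r in known]
    model = TinyMLP(DIM, 8, len(labels))
    model.train(xs, ys)
    filled = []
    for r in records:
        r = dict(r)  # copy, so the input records are left untouched
        if r.get(field) is None:
            r[field] = labels[model.predict(embed(preprocess(r["description"])))]
        filled.append(r)
    return filled


# Invented toy records for illustration only; these are not HIAD entries.
records = [
    {"description": "Leak at a valve ignited and produced a jet fire.", "event_type": "fire"},
    {"description": "Hose leak ignited; small fire at the dispenser.", "event_type": "fire"},
    {"description": "Tank rupture followed by an explosion.", "event_type": "explosion"},
    {"description": "Gas accumulated in the enclosure and an explosion occurred.", "event_type": "explosion"},
    {"description": "Leak from a fitting ignited, causing a fire.", "event_type": None},
]
filled = impute(records, "event_type")
```

In a realistic setting, `embed` would be replaced by a call to an embedding service and `TinyMLP` by a library implementation with held-out validation; the sketch only shows how the three phases connect, with one classifier trained per categorical field that has missing entries.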