Proceedings of the
35th European Safety and Reliability Conference (ESREL2025) and
the 33rd Society for Risk Analysis Europe Conference (SRA-E 2025)
15 – 19 June 2025, Stavanger, Norway

Knowledge Graph Construction of Large Language Model Retrieval Augmented Generation for Oil and Gas HSE Professional Q&A System

Yiyue Chen, Jiyu Zhai, Shuncheng Wu, Kun Tian and Xu Song

Research Institute of Safety and Environmental Technology, CNPC, China.

ABSTRACT

In oil and gas HSE (Health, Safety, and Environment) management, supervision and inspection are essential to stable production. During inspections, handling complex professional knowledge is hampered by heavy manual review and low efficiency. Although Large Language Models (LLMs) show powerful text comprehension and generation capabilities, their performance in the HSE field is limited by a lack of specialized knowledge and data resources. As a result, there is a gap between the demand for professional answers and responses that are untargeted and lack a reference basis. Knowledge graph technology is therefore introduced to enrich knowledge resources and improve question-answering effectiveness. This study focuses on constructing a knowledge graph for the oil and gas HSE field to improve the accuracy, recall and pertinence of LLM question answering. First, HSE-related policy files are collected and the ontology model concepts are designed, including production process, applicable object, business domain, theory and policy, and data sources, among others. Second, the ontology model relations are designed, including part-of, kind-of, instance-of, attribute-of, and meet-above, among others. Third, entities in the policy files are identified and relationships are extracted to construct the domain knowledge graph. Finally, the policy files are sliced and their vectorized content is stored in a vector knowledge base, providing the large model with rich and accurate external knowledge sources. Question-answering tests show that the constructed knowledge graph significantly enhances LLM answering performance. For small-scene queries, it can directly answer with the requirements of the standard specifications; for standard queries, the most applicable files are matched according to multi-dimensional information; for reasoning questions, it supports chain-of-thought analysis and provides the legal basis step by step.
Overall, the construction of the knowledge graph effectively tackles the challenge of evidence-based question-answering by large models within the professional field, offering valuable knowledge support to technicians and managers during production activities and HSE inspections.

Keywords: Knowledge graph, Domain ontology model, Large Language Model (LLM), Retrieval Augmented Generation (RAG), Oil and gas production, HSE management system.
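
As an illustration only, the ontology relations named in the abstract can be pictured as typed triples that are looked up to ground an answer. The sketch below is a minimal in-memory version in Python and is not the authors' implementation: the class and function names, the example entities (hot work, welding) and the substring-based entity matching are all assumptions made for illustration, and the vector knowledge base is left out.

```python
from dataclasses import dataclass

# Relation types listed in the abstract (the abstract notes the list is not exhaustive).
RELATIONS = {"part-of", "kind-of", "instance-of", "attribute-of", "meet-above"}


@dataclass(frozen=True)
class Triple:
    head: str
    relation: str
    tail: str


class KnowledgeGraph:
    """Hypothetical in-memory triple store; a real system would use a graph database."""

    def __init__(self):
        self.triples = set()

    def add(self, head, relation, tail):
        # Reject relations that are not part of the ontology model.
        if relation not in RELATIONS:
            raise ValueError(f"unknown relation: {relation}")
        self.triples.add(Triple(head, relation, tail))

    def neighbors(self, entity):
        # Triples that mention the entity as head or tail, sorted for stable output.
        return sorted(
            (t for t in self.triples if entity in (t.head, t.tail)),
            key=lambda t: (t.head, t.relation, t.tail),
        )


def build_context(kg, question):
    """Collect triples for every known entity that appears verbatim in the question.

    Assumption: plain substring matching stands in for real entity recognition.
    """
    entities = {t.head for t in kg.triples} | {t.tail for t in kg.triples}
    hits = sorted(e for e in entities if e.lower() in question.lower())
    lines = []
    for entity in hits:
        for t in kg.neighbors(entity):
            line = f"{t.head} --{t.relation}--> {t.tail}"
            if line not in lines:
                lines.append(line)
    return lines


# Hypothetical example entities, not taken from the paper.
kg = KnowledgeGraph()
kg.add("welding", "kind-of", "hot work")
kg.add("hot work", "part-of", "production process")

context = build_context(kg, "What rules apply to welding?")
print(context)
```

In a retrieval-augmented setup, the returned lines would be placed in the prompt alongside the retrieved policy-file slices so that the model can cite them as its basis.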


