ABSTRACT

Pipeline for Machine Reading of Unstructured Maintenance Work Order Records

Yiyang Gao¹, Caitlin Woods²ᵃ, Wei Liu²ᵇ, Tim French²ᶜ and Melinda Hodkiewicz³
¹Electrical & Electronic Engineering, The University of Western Australia, Australia. gaoyiyang150@gmail.com
²Computer Science & Software Engineering, The University of Western Australia, Australia. ᵃcaitlin.woods@uwa.edu.au ᵇwei.liu@uwa.edu.au ᶜtim.french@uwa.edu.au
³Mechanical Engineering, The University of Western Australia, Australia. melinda.hodkiewicz@uwa.edu.au

ABSTRACT

Maintenance work order records contain vital information, including inspection data, asset health observations and records of work planned and executed. However, the text is unstructured, contains maintenance-specific jargon and abbreviations, and consists mostly of short grammatical sentences. Conventional natural language processing methods based on lexical dictionaries and part-of-speech tagging do not perform well in this domain. Our challenge is the extraction and tagging of machine-readable text to support the calculation of asset performance metrics such as mean time to failure. For example, it is necessary to differentiate between an asset that has been repaired and one that has been replaced. We have a corpus of 690,000 mining maintenance work order records with a mean length of 4.57 words. In this research we first pre-process the data and create a set of reusable context-specific n-gram dictionaries. Domain-relevant rules based on structures in the data are then used to tag maintenance item, activity, and state n-grams. The pipeline is tested on a randomly selected set of maintenance records that have been manually tagged by an expert. Performance is assessed using the Jaccard Index and compared with a baseline pipeline using conventional tools. The new pipeline outperforms the baseline due to the incorporation of domain knowledge, particularly in the method for maintenance item identification. The maintenance dictionaries are usable for other maintenance NLP applications.
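As an illustration only, the tagging and scoring steps described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the pipeline evaluated in this work: the dictionary entries, the greedy longest-match rule, the names `DICTIONARIES`, `tag_record` and `jaccard`, and the example records are all hypothetical. The Jaccard Index itself is the standard ratio of the size of the intersection to the size of the union of two sets, here the pipeline's tags and the expert's tags for a record.

```python
from typing import Dict, Set, Tuple

# Hypothetical n-gram dictionaries. The entries are illustrative and are
# not taken from the corpus described in the abstract.
DICTIONARIES: Dict[str, Set[Tuple[str, ...]]] = {
    "item": {("pump",), ("conveyor", "belt"), ("hydraulic", "hose")},
    "activity": {("replace",), ("repair",), ("inspect",)},
    "state": {("leaking",), ("worn",), ("broken",)},
}
MAX_N = 2  # longest n-gram held in the hypothetical dictionaries


def tag_record(text: str) -> Set[Tuple[str, str]]:
    """Tag a short work order text by greedy longest-match lookup.

    Assumption: longest match wins. The actual domain rules may differ.
    """
    tokens = text.lower().split()
    tags: Set[Tuple[str, str]] = set()
    i = 0
    while i < len(tokens):
        step = 1  # skip one token when nothing matches at position i
        # Try the longest candidate n-gram first, then shorter ones.
        for n in range(min(MAX_N, len(tokens) - i), 0, -1):
            gram = tuple(tokens[i:i + n])
            label = next(
                (name for name, grams in DICTIONARIES.items() if gram in grams),
                None,
            )
            if label is not None:
                tags.add((" ".join(gram), label))
                step = n
                break
        i += step
    return tags


def jaccard(a: Set[Tuple[str, str]], b: Set[Tuple[str, str]]) -> float:
    """Jaccard Index |A ∩ B| / |A ∪ B| of two tag sets."""
    if not a and not b:
        return 1.0  # assumed convention: two empty tag sets agree fully
    return len(a & b) / len(a | b)


predicted = tag_record("repair conveyor belt worn")
# Hypothetical expert tags that disagree with the pipeline on the state.
expert = {("repair", "activity"), ("conveyor belt", "item"), ("torn", "state")}
score = jaccard(predicted, expert)  # 2 shared tags out of 4 distinct tags
```

In this sketch a record scores 1.0 only when the pipeline and the expert agree on every n-gram and its tag, so a repaired item that is tagged as replaced lowers the score.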

Keywords: Natural language, Technical language processing, Work order, Unstructured, Short text.
