TY - JOUR
T1 - Narrative texts-based anomaly detection using accident report documents
T2 - The case of chemical process safety
AU - Song, Bomi
AU - Suh, Yongyoon
N1 - Publisher Copyright: © 2018 Elsevier Ltd
PY - 2019/1
Y1 - 2019/1
N2 - To detect anomalous accident conditions, previous studies have usually relied on numeric data such as physical signals and conditions. However, the invaluable text information contained in accident report documents, such as accident descriptions and situations, has yet to be used to analyze anomalous conditions. This study therefore proposes a text mining-based local outlier factor (LOF) approach to detecting anomalous conditions from accident report documents, focusing on their text information. Here, anomalous conditions are defined as previously unexperienced accidents that occur under unusual conditions. The unusual conditions are identified in terms of qualitative variables of the accident narrative texts, such as locations, processes, and work types. Text mining is applied to systematically investigate these unusual contexts through the text of the accident report documents, and the LOF algorithm is used to identify anomalous accidents in terms of local density clusters. LOF is an established density-based anomaly detection algorithm that identifies outliers among data clusters. As a result, four major types of anomalous accidents in chemical processes are derived: filling-related, detection-related, ventilation-related, and waste-related accidents. Risk keywords of the anomalous accidents of each type are also extracted and compared with the keywords of normal accidents to understand the anomalous conditions in detail. By extracting and prioritizing anomalous conditions based on text information rather than numeric values, the proposed approach enables safety managers to monitor natural language-based risk factors and the reasons for infrequent, anomalous, and critical accidents.
KW - Accident documents
KW - Anomaly detection
KW - Local outlier factor (LOF)
KW - Narrative texts
KW - Process safety
KW - Text mining
UR - http://www.scopus.com/inward/record.url?scp=85056642922&partnerID=8YFLogxK
U2 - 10.1016/j.jlp.2018.08.010
DO - 10.1016/j.jlp.2018.08.010
M3 - Article
AN - SCOPUS:85056642922
SN - 0950-4230
VL - 57
SP - 47
EP - 54
JO - Journal of Loss Prevention in the Process Industries
JF - Journal of Loss Prevention in the Process Industries
ER -