TY - JOUR
T1 - Multi-document summarization for patent documents based on generative adversarial network
AU - Kim, Sunhye
AU - Yoon, Byungun
N1 - Publisher Copyright: © 2022 Elsevier Ltd
PY - 2022/11/30
Y1 - 2022/11/30
N2 - Given the exponential growth of patent documents, automatic patent summarization methods that facilitate patent analysis are in strong demand. Recent advances in natural language processing (NLP), text mining, and deep learning have greatly improved the performance of text summarization models for general documents. However, existing models cannot be applied successfully to patent documents, because patents describe inventive technologies in domain-specific vocabulary and therefore differ in many ways from general documents. To address this challenge, this study proposes a deep learning-based multi-patent summarization approach that generates abstractive summaries reflecting the characteristics of patents. Single-patent and multi-patent summarization were performed through a patent-specific feature extraction process, a summarization model based on a generative adversarial network (GAN), and an inference process using topic modeling. The proposed model was verified by applying it to a patent in the drone technology field. In this evaluation, the proposed model performed better than existing deep learning summarization models. The proposed approach enables high-quality summarization of large numbers of patent documents for use by R&D researchers and decision-makers. It can also serve as a guideline for deep learning research using patent data.
AB - Given the exponential growth of patent documents, automatic patent summarization methods that facilitate patent analysis are in strong demand. Recent advances in natural language processing (NLP), text mining, and deep learning have greatly improved the performance of text summarization models for general documents. However, existing models cannot be applied successfully to patent documents, because patents describe inventive technologies in domain-specific vocabulary and therefore differ in many ways from general documents. To address this challenge, this study proposes a deep learning-based multi-patent summarization approach that generates abstractive summaries reflecting the characteristics of patents. Single-patent and multi-patent summarization were performed through a patent-specific feature extraction process, a summarization model based on a generative adversarial network (GAN), and an inference process using topic modeling. The proposed model was verified by applying it to a patent in the drone technology field. In this evaluation, the proposed model performed better than existing deep learning summarization models. The proposed approach enables high-quality summarization of large numbers of patent documents for use by R&D researchers and decision-makers. It can also serve as a guideline for deep learning research using patent data.
KW - Generative adversarial network (GAN)
KW - Natural language processing (NLP)
KW - Patent analysis
KW - Patent summarization
KW - Text mining
UR - http://www.scopus.com/inward/record.url?scp=85133717721&partnerID=8YFLogxK
U2 - 10.1016/j.eswa.2022.117983
DO - 10.1016/j.eswa.2022.117983
M3 - Article
AN - SCOPUS:85133717721
SN - 0957-4174
VL - 207
JO - Expert Systems with Applications
JF - Expert Systems with Applications
M1 - 117983
ER -