TY - GEN
T1 - Meta learning for imbalanced big data analysis by using generative adversarial networks
AU - Seo, Sanghyun
AU - Jeon, Yongjin
AU - Kim, Juntae
N1 - Publisher Copyright: © 2018 Association for Computing Machinery.
PY - 2018/4/28
Y1 - 2018/4/28
AB - Imbalanced big data is big data in which the proportion of a certain class is relatively small compared to the other classes. When a machine learning model is trained on imbalanced big data, its performance on the minority class drops. For this reason, various oversampling methodologies have been proposed, but simple oversampling leads to overfitting. In this paper, we propose a meta learning methodology for efficient analysis of imbalanced big data. The proposed methodology uses the meta information of data generated by a generative model based on Generative Adversarial Networks. It prevents the output of the generative model from becoming too similar to the real data in the minority class. Compared to simple oversampling, it is less likely to cause overfitting. Experimental results show that the proposed method can efficiently analyze imbalanced big data.
KW - Generative adversarial network
KW - Imbalanced big data analysis
KW - Meta learning
KW - Oversampling
UR - http://www.scopus.com/inward/record.url?scp=85051547899&partnerID=8YFLogxK
U2 - 10.1145/3220199.3220205
DO - 10.1145/3220199.3220205
M3 - Conference contribution
AN - SCOPUS:85051547899
SN - 9781450364263
T3 - ACM International Conference Proceeding Series
SP - 5
EP - 9
BT - ICBDC 2018 - Proceedings of 2018 International Conference on Big Data and Computing
PB - Association for Computing Machinery
T2 - 2018 International Conference on Big Data and Computing, ICBDC 2018
Y2 - 28 April 2018 through 30 April 2018
ER -