TY - JOUR
T1 - Enhanced Evaluation Method of Musical Instrument Digital Interface Data based on Random Masking and Seq2Seq Model
AU - Jiang, Zhe
AU - Li, Shuyu
AU - Sung, Yunsick
N1 - Publisher Copyright: © 2022 by the authors.
PY - 2022/8
Y1 - 2022/8
N2 - With developments in artificial intelligence (AI), novel applications can use deep learning to compose music in the musical instrument digital interface (MIDI) format, even without any knowledge of music theory. The composed music is generally evaluated by a human-based Turing test, which is a subjective approach and does not provide any quantitative criteria. Therefore, objective evaluation approaches with many general descriptive parameters are applied to the evaluation of MIDI data, considering MIDI features such as pitch distances, chord rates, tone spans, and drum patterns. However, manually setting several general descriptive parameters on large datasets is difficult and has considerable generalization limitations. In this paper, an enhanced evaluation method based on random masking and a sequence-to-sequence (Seq2Seq) model is proposed to evaluate MIDI data. An experiment was conducted on real MIDI data, generated MIDI data, and random MIDI data. The bilingual evaluation understudy (BLEU), a common MIDI data evaluation approach, is used here to evaluate the performance of the proposed method in a comparative study. With the proposed method, the ratio of the average evaluation score of the generated MIDI data to that of the real MIDI data was 31%, while that of BLEU was 79%. The lower the ratio, the greater the difference between the real MIDI data and the generated MIDI data. This implies that the proposed method quantified the gap while accurately distinguishing between real and generated MIDI data.
AB - With developments in artificial intelligence (AI), novel applications can use deep learning to compose music in the musical instrument digital interface (MIDI) format, even without any knowledge of music theory. The composed music is generally evaluated by a human-based Turing test, which is a subjective approach and does not provide any quantitative criteria. Therefore, objective evaluation approaches with many general descriptive parameters are applied to the evaluation of MIDI data, considering MIDI features such as pitch distances, chord rates, tone spans, and drum patterns. However, manually setting several general descriptive parameters on large datasets is difficult and has considerable generalization limitations. In this paper, an enhanced evaluation method based on random masking and a sequence-to-sequence (Seq2Seq) model is proposed to evaluate MIDI data. An experiment was conducted on real MIDI data, generated MIDI data, and random MIDI data. The bilingual evaluation understudy (BLEU), a common MIDI data evaluation approach, is used here to evaluate the performance of the proposed method in a comparative study. With the proposed method, the ratio of the average evaluation score of the generated MIDI data to that of the real MIDI data was 31%, while that of BLEU was 79%. The lower the ratio, the greater the difference between the real MIDI data and the generated MIDI data. This implies that the proposed method quantified the gap while accurately distinguishing between real and generated MIDI data.
KW - deep learning
KW - music evaluation
KW - musical instrument digital interface
KW - random masking
KW - sequence-to-sequence model
UR - http://www.scopus.com/inward/record.url?scp=85136779465&partnerID=8YFLogxK
U2 - 10.3390/math10152747
DO - 10.3390/math10152747
M3 - Article
AN - SCOPUS:85136779465
SN - 2227-7390
VL - 10
JO - Mathematics
JF - Mathematics
IS - 15
M1 - 2747
ER -