TY - JOUR
T1 - From intuition to intelligence
T2 - a text mining–based approach for movies' green-lighting process
AU - Kim, Jongdae
AU - Lee, Youseok
AU - Song, Inseong
N1 - Publisher Copyright: © 2021, Emerald Publishing Limited.
PY - 2022/5/9
Y1 - 2022/5/9
N2 - Purpose: This paper develops a predictive model for box office performance based on the textual information in movie scripts, for use in the green-lighting process of movie production. Design/methodology/approach: The authors use Latent Dirichlet Allocation to uncover the hidden textual structure in movie scripts, extracting topic probabilities that serve as inputs to a predictive model of box office performance. For the predictive model, the authors apply a variety of classification algorithms (logistic classification, decision trees, random forests, k-nearest neighbor algorithms, support vector machines and artificial neural networks) and compare their relative performance in predicting movies' market performance. Findings: This approach to extracting textual information from movie scripts produces a valuable typology for movies. Moreover, the modeling approach has significant power to predict movie scripts' profitability, and it provides superior prediction performance compared to previous benchmarks, such as that of Eliashberg et al. (2007). Research limitations/implications: This work contributes to the literature on predicting box office performance in the green-lighting process and to the literature on models for the idea screening stage of the new product development process. In addition, this is one of the few studies that use movie script data to predict movies' financial performance, and it proposes an approach that integrates text mining models and machine learning algorithms with movie experts' intuition. Practical implications: First, the authors' approach can significantly reduce the financial risk associated with movie production decisions before the pre-production stage. Second, this paper proposes an approach that is applicable at a very early stage of new product development, such as the idea screening stage. The authors also introduce an online movie scenario database system that can help movie studios make more systematic and profitable decisions in the green-lighting process. Third, this approach can help movie studios estimate movie scripts' financial value. Originality/value: This study is one of the few to forecast market performance in the green-lighting process.
KW - Latent Dirichlet allocation
KW - Machine learning
KW - Movie industry
KW - New product development
KW - Predictive models
UR - https://www.scopus.com/pages/publications/85129365881
U2 - 10.1108/INTR-11-2020-0651
DO - 10.1108/INTR-11-2020-0651
M3 - Article
AN - SCOPUS:85129365881
SN - 1066-2243
VL - 32
SP - 1003
EP - 1022
JO - Internet Research
JF - Internet Research
IS - 3
ER -