TY - JOUR
T1 - Exploring the relationship between the predictability and the behavioral reaction time in sentence processing using corpus and deep-learning language models
AU - Seo, Hye Jin
AU - Shin, Jeong Ah
N1 - Publisher Copyright: © 2020 KASELL.
PY - 2020
Y1 - 2020
N2 - This study examined whether predictability is associated with behavioral reaction times in sentence processing. Information complexity measures have been proposed to quantify predictability in word-by-word human sentence processing. The most traditional of these measures is surprisal, which calculates the relative unexpectedness of each word in a sentence (Hale 2001, Levy 2005, 2008), and some studies have suggested that surprisal and reading times are positively correlated (Monsalve, Frank and Vigliocco 2012, Smith and Levy 2013). To calculate surprisal, previous studies used one of two approaches: corpus-based language models or deep-learning-based language models. This study, however, used both to analyze human reading times, comparing surprisal calculated from corpus-based language models with that calculated from deep-learning-based language models. The results showed that surprisal calculated from corpus-based language models better explains the behavioral reaction time data. Although deep learning technology performs very well in natural language processing, its processing does not seem to be human-like. Nonetheless, this study can contribute to the development of deep learning technology as well as computational psycholinguistic research in that it compared the outcomes of corpus-based and deep learning technology with human behavioral responses.
AB - This study examined whether predictability is associated with behavioral reaction times in sentence processing. Information complexity measures have been proposed to quantify predictability in word-by-word human sentence processing. The most traditional of these measures is surprisal, which calculates the relative unexpectedness of each word in a sentence (Hale 2001, Levy 2005, 2008), and some studies have suggested that surprisal and reading times are positively correlated (Monsalve, Frank and Vigliocco 2012, Smith and Levy 2013). To calculate surprisal, previous studies used one of two approaches: corpus-based language models or deep-learning-based language models. This study, however, used both to analyze human reading times, comparing surprisal calculated from corpus-based language models with that calculated from deep-learning-based language models. The results showed that surprisal calculated from corpus-based language models better explains the behavioral reaction time data. Although deep learning technology performs very well in natural language processing, its processing does not seem to be human-like. Nonetheless, this study can contribute to the development of deep learning technology as well as computational psycholinguistic research in that it compared the outcomes of corpus-based and deep learning technology with human behavioral responses.
KW - Corpus-based language model
KW - Deep-learning-based language model
KW - Predictability
KW - Surprisal
UR - http://www.scopus.com/inward/record.url?scp=85102295396&partnerID=8YFLogxK
U2 - 10.15738/kjell.20..202012.881
DO - 10.15738/kjell.20..202012.881
M3 - Article
AN - SCOPUS:85102295396
SN - 1598-1398
VL - 2020
SP - 881
EP - 903
JO - Korean Journal of English Language and Linguistics
JF - Korean Journal of English Language and Linguistics
IS - 20
ER -