TY - JOUR
T1 - Unveiling Cryptocurrency Conversations
T2 - Insights From Data Mining and Unsupervised Learning Across Multiple Platforms
AU - Jung, Hae Sun
AU - Lee, Haein
AU - Kim, Jang Hyun
N1 - Publisher Copyright:
© 2013 IEEE.
PY - 2023
Y1 - 2023
N2 - The rapid growth of the cryptocurrency market has led to increasing interest in the subject. Cryptocurrency is now recognized as an asset, and laws and financial regulations have begun to emerge to support its practical use. As a result, it has become essential to mine text data related to cryptocurrency and extract knowledge from it. Previous studies have focused on analyzing data from a single source, such as Twitter. However, there are unique insights to be gained from data across multiple platforms. In the present study, we used data mining techniques to extract insights from LexisNexis, Web of Science, and Reddit, representing the media, academia, and the general public, respectively. Among unsupervised learning techniques, topic modeling was employed for the analysis. Topic modeling is a methodology that uncovers hidden themes within the collected data. Among the diverse topic modeling techniques available, BERTopic (bidirectional encoder representations from transformers topic) was chosen for the analysis. BERTopic is considered to be state-of-the-art in the field of topic modeling. Dynamic topic modeling was employed to track changes in themes over time. Our experimental results reveal a tendency in the news to cover major events related to cryptocurrencies, such as regulatory developments and market trends. Academic papers, on the other hand, tend to focus on the technology behind cryptocurrencies and related research. Finally, social media conversations center more on sharing information from an investor's psychological perspective, such as market sentiment and investment strategies.
KW - BERTopic
KW - Bitcoin
KW - cryptocurrency
KW - data mining
KW - machine learning
KW - natural language processing
KW - topic modeling
KW - unsupervised learning
UR - https://www.scopus.com/pages/publications/85178015311
U2 - 10.1109/ACCESS.2023.3334617
DO - 10.1109/ACCESS.2023.3334617
M3 - Article
AN - SCOPUS:85178015311
SN - 2169-3536
VL - 11
SP - 130573
EP - 130583
JO - IEEE Access
JF - IEEE Access
ER -