KSII Transactions on Internet and Information Systems
Monthly Online Journal (eISSN: 1976-7277)

Text Mining and Sentiment Analysis for Predicting Box Office Success

Vol. 12, No. 8, August 31, 2018
DOI: 10.3837/tiis.2018.08.030

Abstract

With the rise of online communication, text mining and sentiment analysis have frequently been applied to the analysis of electronic word-of-mouth. This study aims to develop a domain-specific sentiment lexicon for predicting box office success in the Korean film market and to validate the feasibility of that lexicon. Natural language processing, a machine learning algorithm, and a lexicon-based sentiment classification method are employed. To create the movie-domain sentiment lexicon, 233,631 reviews of 147 movies with popularity ratings were collected using an XML crawling package in R. We achieved 81.69% accuracy in sentiment classification with the Korean sentiment dictionary, which includes 706 negative words and 617 positive words. The results showed a stronger positive relationship between box office success and consumers’ sentiment, as well as a significant positive effect in the linear regression of the prediction model. In addition, they reveal that emotion in user-generated content can be a more accurate clue for predicting business success.
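
As a rough illustration of the lexicon-based classification step mentioned in the abstract, the following minimal sketch counts positive and negative lexicon hits in a review and labels it by the sign of the difference. It is not the authors' implementation: the English toy word lists, the function names `classify` and `accuracy`, and the simple regex tokenizer are all assumptions made for illustration, whereas the study itself uses a Korean dictionary of 617 positive and 706 negative words.

```python
import re

# Hypothetical toy lexicons (assumption); the study's dictionary is Korean
# and contains 617 positive and 706 negative words.
POSITIVE = {"great", "moving", "fun"}
NEGATIVE = {"boring", "waste", "bad"}


def classify(review: str) -> str:
    """Label a review by comparing positive and negative lexicon hits."""
    tokens = re.findall(r"\w+", review.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def accuracy(reviews, labels) -> float:
    """Fraction of reviews whose predicted label matches the given label."""
    hits = sum(classify(r) == y for r, y in zip(reviews, labels))
    return hits / len(labels)
```

A classification accuracy figure such as the one reported in the abstract would come from comparing predictions of this kind against labeled reviews, for example labels derived from the popularity ratings; how the authors derived their labels is not stated here.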



Cite this article

[IEEE Style]
Yoosin Kim, Mingon Kang and Seung Ryul Jeong, "Text Mining and Sentiment Analysis for Predicting Box Office Success," KSII Transactions on Internet and Information Systems, vol. 12, no. 8, pp. 4090-4102, 2018. DOI: 10.3837/tiis.2018.08.030

[ACM Style]
Kim, Y., Kang, M., and Jeong, S. R. 2018. Text Mining and Sentiment Analysis for Predicting Box Office Success. KSII Transactions on Internet and Information Systems, 12, 8, (2018), 4090-4102. DOI: 10.3837/tiis.2018.08.030