Statistic-Based Sentiment Analysis of Social Media Data

Abstract

This paper presents a text and opinion mining practice for web data that can support the preparation of reports for Customer Relationship Management (CRM). For convenience, we use Twitter tweet data for sentiment analysis. The resulting sentiment values are then evaluated statistically with tests such as the z-test and the chi-squared test. Because social media data are treated here as normally distributed, tests such as ANOVA, the t-test (for small sample sizes), and other hypothesis tests can also be applied. The Python programming language is used for the task because it offers several libraries and packages for natural language processing, statistics, and data visualization while retaining the features of a general-purpose programming language. The paper also describes the process of fetching, storing, cleaning, and translating the data, evaluating it for sentiment and statistics, and simulating and building a theoretical model for hypothesis testing.
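As a rough illustration of the sentiment-then-statistics workflow summarized above, the following minimal Python sketch scores short texts and computes a one-sample z statistic and a chi-squared statistic. It is not the authors' implementation: the toy word list `LEXICON` and the function names `sentiment_score`, `one_sample_z`, and `chi_squared_stat` are hypothetical, only the standard library is used, and the z-test substitutes the sample standard deviation for the population value, which is reasonable only for large samples.

```python
import math
from statistics import mean, stdev

# Hypothetical toy lexicon (assumption for illustration only);
# a real pipeline would rely on an NLP library instead.
LEXICON = {"good": 1.0, "great": 1.0, "love": 1.0,
           "bad": -1.0, "poor": -1.0, "hate": -1.0}


def sentiment_score(text):
    """Return the average lexicon polarity of the words in text (0.0 if none match)."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    hits = [LEXICON[w] for w in words if w in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0


def one_sample_z(scores, mu0=0.0):
    """Return (z, two-sided p-value) for H0: the mean score equals mu0.

    Assumes approximately normal scores and uses the sample standard
    deviation as a stand-in for the population value.
    """
    n = len(scores)
    z = (mean(scores) - mu0) / (stdev(scores) / math.sqrt(n))
    # Two-sided p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


def chi_squared_stat(observed, expected):
    """Return the chi-squared statistic sum((O - E)^2 / E) over paired counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))
```

In use, one would map `sentiment_score` over the cleaned tweets, pass the resulting scores to `one_sample_z` to test whether the average sentiment differs from neutral, and pass counts of positive, neutral, and negative tweets to `chi_squared_stat` together with the counts expected under the null hypothesis.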

Country: Nepal

Aman Karn (1), Anish Shrestha (2), Anil Pudasaini (3), Binay Mahara (4), Anku Jaiswal (5)

  1. Student, Advanced College of Engineering and Management, Kupondole, Kathmandu, Nepal / Tribhuvan University, Kirtipur, Nepal
  2. Student, Advanced College of Engineering and Management, Kupondole, Kathmandu, Nepal / Tribhuvan University, Kirtipur, Nepal
  3. Student, Advanced College of Engineering and Management, Kupondole, Kathmandu, Nepal / Tribhuvan University, Kirtipur, Nepal
  4. Student, Advanced College of Engineering and Management, Kupondole, Kathmandu, Nepal / Tribhuvan University, Kirtipur, Nepal
  5. Lecturer, Computer and Electronics Engineering Department, Advanced College of Engineering and Management, Kupondole, Kathmandu, Nepal

IRJIET, Volume 2, Issue 5, July 2018, pp. 28-32
