Statistic-Based Sentiment Analysis of Social Media Data

Aman Karn, Student, Advanced College of Engineering and Management, Kupondole, Kathmandu, Nepal / Tribhuvan University, Kritipur, Nepal

Anish Shrestha, Student, Advanced College of Engineering and Management, Kupondole, Kathmandu, Nepal / Tribhuvan University, Kritipur, Nepal

Anil Pudasaini, Student, Advanced College of Engineering and Management, Kupondole, Kathmandu, Nepal / Tribhuvan University, Kritipur, Nepal

Binay Mahara, Student, Advanced College of Engineering and Management, Kupondole, Kathmandu, Nepal / Tribhuvan University, Kritipur, Nepal

Anku Jaiswal, Lecturer, Computer and Electronics Engineering Department, Advanced College of Engineering and Management, Kupondole, Kathmandu, Nepal

Vol 2 No 5 (2018): Volume 2, Issue 5, July 2018 | Pages: 28-32

International Research Journal of Innovations in Engineering and Technology

OPEN ACCESS | Research Article | Published Date: 05-07-2018


Abstract

This paper presents an approach to text and opinion mining of web data that can assist in preparing reports for Customer Relationship Management (CRM). For convenience, we use Twitter data (tweets) for sentiment analysis. The resulting sentiment values are then evaluated statistically with z-tests and chi-squared tests. Because social media data are assumed to be normally distributed, tests such as ANOVA, the t-test (for small sample sizes), and other hypothesis tests can also be used. The Python programming language is used for the task because it offers several libraries and packages for Natural Language Processing, statistics, and data visualization while remaining a general-purpose programming language. The paper also describes the process of fetching, storing, cleaning, and translating the data, evaluating its sentiment and statistics, and simulating and building a theoretical model for hypothesis testing.
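As an illustration of the statistical step named in the abstract, the following minimal sketch applies a two-proportion z-test and a Pearson chi-squared statistic to sentiment counts. It is not the authors' code: the function names `two_proportion_z_test` and `chi_squared_statistic` and the tweet counts are hypothetical, chosen only for illustration, and the sketch uses only the Python standard library rather than any particular statistics package.

```python
import math


def two_proportion_z_test(pos_a, n_a, pos_b, n_b):
    """Two-sided z-test for the difference between two proportions."""
    p_a, p_b = pos_a / n_a, pos_b / n_b
    # Pooled proportion under the null hypothesis of equal proportions.
    pooled = (pos_a + pos_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def chi_squared_statistic(table):
    """Pearson chi-squared statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Hypothetical counts (not from the paper): positive vs. non-positive tweets
# for two topics, 100 tweets each.
z, p = two_proportion_z_test(60, 100, 45, 100)
stat, dof = chi_squared_statistic([[60, 40], [45, 55]])
print(f"z = {z:.3f}, two-sided p = {p:.4f}")
print(f"chi-squared = {stat:.3f}, dof = {dof}")
```

For a 2x2 table without continuity correction, the chi-squared statistic equals the square of the z statistic, so the two tests agree here; the chi-squared form extends to more than two sentiment categories or topics.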

Keywords

Sentiment Analysis, Natural Language Processing (NLP), Twitter Sentiment, Opinion Mining, Tweets analysis, Chi-squared test, Z-test, Social media data visualization

