Sentiment Analysis: Current Issues


Abstract. This text reports on current issues in automated sentiment analysis. It covers (1) problems in the area of sentiment analysis, both solved and unsolved, (2) data sources for sentiment analysis, (3) current techniques and tools, and (4) the limitations of these techniques and tools.

1. Introduction to Sentiment Analysis

Sentiment analysis is generally treated as a two-level task: (1) identifying the topic and (2) classifying the sentiment related to that topic. The texts to be analyzed often contain many statements that are unrelated to any topic or sentiment.
Sentiment analysis deals with the computational treatment of opinion, sentiment, and subjectivity in text. It starts with a small question, “What do other people think?”, and ends up driving commercial deals worth billions of dollars. After the great success of Web 2.0, sentiment analysis became an in-demand and commercially supported research field.
A Web 2.0 site lets its users interact and collaborate with each other in a social media dialogue, as creators of user-generated content in a virtual community. This has resulted in social-networking sites, blogs, wikis, video-sharing sites, hosted services, web applications, mashups, and folksonomies. The large growth in the number of Internet users (see the chart below; source: http://www.internetworldstats.com/stats.htm) has in turn increased e-commerce activity.

1.1 Data Source for Sentiment Analysis
The data used in sentiment analysis generally consists of unstructured text from (1) blog posts, (2) user reviews of products, (3) chat records, (4) opinion polls, etc. It may contain noisy symbols, casual language, and emoticons. For example, if you search Twitter for “hungry” with an arbitrary number of u's in the middle (e.g. huuuungry, huuuuuuungry, huuuuuuuuuungry), there will most likely be a nonempty result set.
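Elongated spellings like the ones above are usually normalized before any analysis. The following is a minimal sketch, assuming a simple rule that is not taken from this report: any character repeated three or more times in a row is collapsed to two occurrences, so that all elongated variants map to one token. The function name `squeeze_repeats` is hypothetical.

```python
import re

# Matches any character followed by two or more copies of itself.
_REPEATS = re.compile(r"(.)\1{2,}")

def squeeze_repeats(text: str) -> str:
    """Collapse runs of 3+ identical characters down to 2 (assumed rule)."""
    return _REPEATS.sub(r"\1\1", text)

variants = ["huuuungry", "huuuuuuungry", "huuuuuuuuuungry"]
normalized = {squeeze_repeats(v) for v in variants}
print(normalized)  # {'huungry'}: all three variants collapse to one token
```

The result ("huungry") is still not the dictionary form; mapping it to "hungry" would need a further step such as a dictionary lookup.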
Datasets: Standard datasets for sentiment analysis can be obtained from:
  • Wiki Blog Lists: contains web links to a large number of well-known English blogs and can be obtained from http://en.wikipedia.org/wiki/List_of_blogs
  • BLOGS06 (Macdonald and Ounis, 2006) collection: contains a 148 GB crawl of approximately 100,000 blogs and their respective RSS feeds. The collection has been used for 3 consecutive years by the Text REtrieval Conferences (TREC). Participants of the conference are given the task of finding documents (i.e. web pages) expressing an opinion about specific entities X, which may be people, companies, films, etc. The results are given to human assessors, who judge the content of the web pages (i.e. blog posts and comments) and assign each web page a score: “1” if the document contains relevant, factual information about the entity but no expression of opinion, “2” if the document contains an explicit negative opinion towards the entity, and “4” if the document contains an explicit positive opinion towards the entity. The data set can be found at http://www.trec.nist.gov
  • Multi-Domain Sentiment Dataset (version 2.0) (http://www.cs.jhu.edu/~mdredze/datasets/sentiment/): contains product reviews taken from Amazon.com for many product types (domains). Some domains (books and DVDs) have hundreds of thousands of reviews; others (musical instruments) have only a few hundred. Reviews carry star ratings (1 to 5 stars) that can be converted into binary labels if needed. The dataset has been used in many recent publications.
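The conversion of star ratings into binary labels can be sketched as follows. The thresholds are an assumption made for illustration (more than 3 stars is positive, fewer than 3 is negative, exactly 3 is skipped as ambiguous); the function name `star_to_binary` is hypothetical.

```python
from typing import Optional

def star_to_binary(stars: float) -> Optional[str]:
    """Map a 1-5 star rating to a binary label (assumed thresholds)."""
    if not 1 <= stars <= 5:
        raise ValueError(f"star rating out of range: {stars}")
    if stars > 3:
        return "positive"
    if stars < 3:
        return "negative"
    return None  # 3-star reviews are treated as ambiguous and skipped

ratings = [5, 4, 3, 2, 1]
labels = [star_to_binary(r) for r in ratings]
print(labels)  # ['positive', 'positive', None, 'negative', 'negative']
```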
1.2 Problem Definition
In this section I explore the problem statements related to sentiment analysis. I start with problems of a very basic nature and finish with some unsolved problems.
Analyzing sentiment in clear reviews: Such reviews contain either a negative or a positive opinion about a product or topic(s).
In these reviews it is very simple to identify the positive or negative sentiment. For example:
Product Reviews: Inspiron 1525
Title: Where has customer service gone
Review:I have an inspirion 1525-it was not listed in the models to review. DO NOT BUY THIS COMPUTER!!! The LCD has cracked after less than 9 months and Dell refuses to fix it under warranty. They will not tell me why they will not fix it and now after sending it all the way to Ontario to find out why there was lines on my screen- the service depot has returned it to me and the keyboard no longer functions. I can not use it all all now-lines or not!!!
Title: Insprion 1525
Review:--I rec'd my Inspiron 1525 about 1 month ago, and I LOVE it!! It is quicker than my Dell desktop, very portable and I love Windows 7. I opted for the 6 cell battery and am so thankful that I did - I almost wish I would have got the 9 cell. So, if you are looking for an everyday computer - this one is a great deal and a great computer...but I would reccommend upgrading the battery!
RESULT: 1 out of 2 (50%) customers would recommend this product to a friend.
Analyzing sentiment in multi-theme documents:
In this type of document the problem statement is not always so clear. It can be split into several different problems, and successful sentiment analysis depends on many issues, including (but not limited to):
  • Sometimes such texts contain multiple sentiments related to two or more issues.
  • Sometimes such documents contain both kinds of sentiment, negative and positive. Here, identifying the dominant one is a major issue.
  • In some cases the problem can be converted into multi-subjective sentiment analysis.
  • Example: “(1) I bought an iPhone a few days ago. (2) It was such a nice phone. (3) The touch screen was really cool. (4) The voice quality was clear too. (5) Although the battery life was not long, that is ok for me. (6) However, my mother was mad with me as I did not tell her before I bought it. (7) She also thought the phone was too expensive, and wanted me to return it to the shop. … ”
    Description: The above text contains seven sentences. It contains both kinds of sentiment: positive sentiment on the part of the buyer and negative sentiment on the part of his mother. It also covers two issues: the quality of the product (with positive sentiment attached) and its cost (with negative sentiment attached). Deciding which sentiment is more important is therefore also a problem.
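One way to make the multi-holder, multi-issue structure of this example explicit is to record each opinion as a (holder, target, polarity) triple. The sketch below does this by hand for the iPhone text. The labels, the `Opinion` type, and the function `polarity_by_holder` are illustrative assumptions, not the output of any tool discussed in this report. Sentence (6) is left out because its target is the purchase decision rather than the phone.

```python
from collections import defaultdict
from typing import NamedTuple

class Opinion(NamedTuple):
    sentence: int   # sentence number in the example
    holder: str     # who holds the opinion
    target: str     # what the opinion is about
    polarity: int   # +1 positive, -1 negative

# Hand-assigned labels for illustration only; an automated system
# would have to infer holder, target, and polarity from the text.
opinions = [
    Opinion(2, "author", "phone", +1),
    Opinion(3, "author", "touch screen", +1),
    Opinion(4, "author", "voice quality", +1),
    Opinion(5, "author", "battery life", -1),
    Opinion(7, "mother", "price", -1),
]

def polarity_by_holder(items):
    """Sum polarities per opinion holder instead of per document."""
    totals = defaultdict(int)
    for op in items:
        totals[op.holder] += op.polarity
    return dict(totals)

print(polarity_by_holder(opinions))  # {'author': 2, 'mother': -1}
```

Summing over the whole document would give +1 and hide the fact that the two holders disagree, which is the difficulty described above.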

2 Current Trends and Techniques
Based on the problems and issues discussed above, the techniques used in sentiment analysis can be categorized as follows:
  • Document-level sentiment classification: This technique identifies whether a given document expresses positive or negative sentiment about a topic. Classification techniques are generally used for this. The features commonly used are (1) terms and their occurrence frequency (for example tf-idf) [3], [4], [5], (2) part-of-speech (POS) tags [2], (3) opinion words and phrases, (4) syntactic dependencies, and (5) negative and positive words. Examples: [2], [3], [4], [5], and [6].
  • Unsupervised learning: For example, [7] uses a POS tagger to identify two-word phrases and estimates the orientation of the extracted phrases using pointwise mutual information (PMI).
  • Sentence-level sentiment analysis: Techniques using this approach treat each sentence as the source of a single opinion [8], [9]. For a given sentence s, two sub-tasks are applied: (a) subjectivity classification: determine whether s is a subjective or an objective sentence; and (b) sentence-level sentiment classification: if s is subjective, determine whether it expresses a positive or a negative opinion.
  • Some other approaches: [10] presents a unified framework in which one can use background lexical information in terms of word-class associations and refine this information for specific domains using any available training examples. [11] analyzes sentiment in financial blogs and raises the issue of topic shift. It conducts an analysis of an annotated corpus, showing that there is a significant level of topic shift within the collection, and illustrates the difficulty human annotators have with certain sentiment categories. To deal with topic shift within blog articles, it proposes text extraction techniques that create topic-specific sub-documents, which are used to train a sentiment classifier. It shows that such approaches provide a substantial improvement over full-document classification and that word-based approaches perform better than sentence-based or paragraph-based approaches.
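The PMI-based orientation estimate mentioned in the unsupervised item above can be sketched from co-occurrence counts. The sketch assumes the orientation of a phrase is the difference between its PMI with a positive reference word and its PMI with a negative reference word (for example "excellent" and "poor"); with raw counts the corpus size cancels out. The function name, the smoothing constant, and all counts below are assumptions made for illustration.

```python
import math

def semantic_orientation(near_pos, near_neg, hits_pos, hits_neg, smooth=0.01):
    """SO(phrase) = PMI(phrase, pos_word) - PMI(phrase, neg_word).

    near_pos / near_neg: co-occurrence counts of the phrase with the
    positive / negative reference word; hits_pos / hits_neg: total counts
    of the reference words. `smooth` avoids log(0) and division by zero.
    """
    numerator = (near_pos + smooth) * hits_neg
    denominator = (near_neg + smooth) * hits_pos
    return math.log2(numerator / denominator)

# Hypothetical counts, invented for illustration only.
so_good = semantic_orientation(near_pos=80, near_neg=10, hits_pos=1000, hits_neg=1000)
so_bad = semantic_orientation(near_pos=5, near_neg=60, hits_pos=1000, hits_neg=1000)
print(so_good > 0, so_bad < 0)  # True True
```

A document can then be labeled by the sign of the average orientation of its extracted phrases.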
2.1 Some Freely Available Tools with Brief Technical Details
Twitter Sentiment (http://twittersentiment.appspot.com/): a freely available, simple sentiment analysis tool. It supports the following uses: 1. brand management (e.g. windows 7), 2. polling (e.g. obama), 3. purchase planning (e.g. kindle), 4. technology planning (e.g. streaming api), 5. discovery (e.g. iphone app).
Basic techniques used [2]: It uses n-grams (N = 1, 2, and 3) to identify the emotions attached to Twitter statements, together with the Stanford POS tagger. It strips emoticons (symbols that show emotion, such as ":)") from the text, as they can mislead the classifier. Finally, it compares a keyword-based baseline with three machine-learning classifiers, (1) Naïve Bayes, (2) maximum entropy, and (3) support vector machines, to classify the sentiment of Twitter statements.
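The combination of n-gram features with a Naïve Bayes classifier can be sketched in a few lines. This is a toy illustration under stated assumptions, not the implementation behind Twitter Sentiment: it uses whitespace tokenization, word n-grams for N = 1 to 3, add-one smoothing, no POS tags, and invented training sentences. The names `ngrams` and `NaiveBayes` are hypothetical.

```python
import math
from collections import Counter, defaultdict

def ngrams(text, max_n=3):
    """Return all word n-grams of the text for n = 1..max_n."""
    tokens = text.lower().split()
    feats = []
    for n in range(1, max_n + 1):
        feats += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return feats

class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.feat_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.feat_counts[label].update(ngrams(text))
        self.vocab = {f for c in self.feat_counts.values() for f in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.feat_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for feat in ngrams(text):
                if feat in self.vocab:  # ignore features never seen in training
                    score += math.log((counts[feat] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Toy training data, invented for illustration.
train_texts = ["i love this phone", "great screen and great battery",
               "i hate this phone", "terrible battery and bad screen"]
train_labels = ["positive", "positive", "negative", "negative"]
model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("love the great screen"))       # positive
print(model.predict("bad phone terrible battery"))  # negative
```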


LingPipe (http://alias-i.com/lingpipe/index.html) ([3], [4], [5]): LingPipe is a text-processing toolkit based on computational linguistics. It treats sentiment analysis as a classification problem and divides it into two classification tasks:
  • Subjective (opinion) vs. objective (fact) sentences.
  • Positive (favorable) vs. negative (unfavorable) movie reviews.
Method used: It uses the concept of sentence polarity. To determine sentiment polarity, it applies a machine-learning method in which text-categorization techniques are applied only to the subjective portions of the document. For this, as depicted in Figure 1, it uses a subjectivity detector that determines whether each sentence is subjective or not; discarding the objective sentences creates an extract that should better represent a review's subjective content to a default polarity classifier.
In [4], a graph-cut (essentially minimum-cut) algorithm is used for the subjectivity step, partitioning the sentences into subjective and objective ones.
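The two-stage idea described above (first extract the subjective sentences, then classify the polarity of the extract) can be sketched as a small pipeline. This sketch is not LingPipe code and does not reproduce the graph-cut step: both stages are replaced by hypothetical word-list stand-ins so that the flow of data is visible, and the word lists and sample review are invented for illustration.

```python
import re

# Hypothetical cue lists standing in for trained classifiers.
POSITIVE_WORDS = {"love", "great", "wonderful"}
NEGATIVE_WORDS = {"boring", "terrible", "awful"}
SUBJECTIVE_CUES = POSITIVE_WORDS | NEGATIVE_WORDS

def words(sentence):
    """Lowercased set of word tokens in a sentence."""
    return set(re.findall(r"[a-z']+", sentence.lower()))

def is_subjective(sentence):
    """Stage 1 stand-in: subjective if the sentence has an opinion cue."""
    return bool(words(sentence) & SUBJECTIVE_CUES)

def subjective_extract(review):
    """Drop objective sentences, keeping only the subjective ones."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", review) if s.strip()]
    return [s for s in sentences if is_subjective(s)]

def polarity(sentences):
    """Stage 2 stand-in: compare counts of positive and negative words."""
    score = 0
    for s in sentences:
        w = words(s)
        score += len(w & POSITIVE_WORDS) - len(w & NEGATIVE_WORDS)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

review = ("The film runs for two hours. It was shot in Paris. "
          "I love the cast. The ending is wonderful!")
extract = subjective_extract(review)
print(extract)            # ['I love the cast.', 'The ending is wonderful!']
print(polarity(extract))  # positive
```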

3 Automated Sentiment Analysis: Reality

A recent report on automated sentiment analysis tools says: “Automated sentiment analysis is less accurate than flipping a coin when it comes to determining whether brand mentions in social media are positive or negative, according to a white paper from FreshMinds” [1].
Tests of a range of different social media monitoring tools conducted by the research consultancy found that comments were, on average, correctly categorized only 30% of the time.
FreshMinds’ experiment involved tools from Alterian (http://www.alterian.com/), Biz360 (http://www.biz360.com/), Brandwatch (http://www.brandwatch.com/), Nielsen (http://www.nielsen.com/), Radian6 (http://www.radian6.com/), Scoutlabs (http://www.scoutlabs.com/) and Sysomos (http://www.sysomos.com/). The products were tested on how well they assessed comments made about the coffee chain Starbucks, with the comments also having been manually coded.
On aggregate the results look good, said FreshMinds. Accuracy levels were between 60% and 80% when the automated tools were reporting whether a brand mention was positive, negative, or neutral.
“However, this masks what is really going on here,” writes Matt Rhodes, a director of sister company FreshNetworks, in a blog post. “In our test case on the Starbucks brand, approximately 80% of all comments we found were neutral in nature.”
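A short calculation shows how a large neutral share can mask poor performance on opinionated comments. The sketch assumes that the 30% figure applies to positive and negative comments only, and it uses an invented 90% accuracy on neutral comments; only the 80% neutral share and the 30% figure come from the text above.

```python
def overall_accuracy(neutral_share, acc_neutral, acc_opinionated):
    """Accuracy over all comments as a weighted mix of the two groups."""
    return neutral_share * acc_neutral + (1 - neutral_share) * acc_opinionated

# Baseline that labels every comment "neutral".
baseline = overall_accuracy(0.80, acc_neutral=1.0, acc_opinionated=0.0)

# Hypothetical tool: 90% right on neutral comments (assumed figure),
# 30% right on positive/negative ones (figure reported above).
tool = overall_accuracy(0.80, acc_neutral=0.90, acc_opinionated=0.30)

print(f"{baseline:.2f} {tool:.2f}")  # 0.80 0.78
```

Under these assumptions, a tool that never detects any opinion at all would still score 80% overall, so an aggregate accuracy of 60% to 80% says little about how well positive and negative mentions are recognized.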

3.1 Some Other Limitations
Sentiment analysis depends on many techniques, none of which is 100% accurate: (1) data-mining techniques (classification, clustering, etc.), (2) linguistic techniques (POS taggers, dictionaries, lexical analyzers, etc.), and (3) predefined opinion words or tags, which can often be misleading.
Similarly, because datasets contain many noisy statements, it becomes harder to achieve highly reliable results.

5 Current Research Problems
This section presents some problems and issues that require more attention in order to achieve better results in the field of sentiment analysis.
1. Instead of concentrating on only one of (a) the document level, (b) the paragraph level, (c) the sentence level, or (d) a feature-based approach, can a suitable combination of these techniques give better results?
2. Topic shift within sentences is still little studied in sentiment analysis.
3. Sentiment labels are generally drawn from scales such as (1) very negative to very positive (very negative, negative, neutral, positive, very positive), (2) negative to positive, (3) negative to neutral, and (4) positive to neutral. Most of the papers I read use this kind of shift between scales in classification. Such shifting should have some effect on the results, and accounting for that effect may give more effective results.
References
1. FreshMinds Research. Turning conversations into insights: A comparison of Social Media Monitoring Tools. White paper, 14 May 2010. FreshMinds, 229-231 High Holborn, London WC1V 7DA. www.freshminds.co.uk.
2. Alec Go, Richa Bhayani, and Lei Huang. Twitter Sentiment Classification using Distant Supervision. Technical report, Stanford University.
3. Bo Pang, Lillian Lee, and Shivakumar Vaithyanathan. Thumbs up? Sentiment Classification using Machine Learning Techniques. EMNLP Proceedings, 2002.
4. Bo Pang and Lillian Lee. A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts. ACL Proceedings, 2004.
5. Bo Pang and Lillian Lee. Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales. ACL Proceedings, 2005.
6. Chenghua Lin and Yulan He. Joint Sentiment/Topic Model for Sentiment Analysis. CIKM’09, November 2–6, 2009, Hong Kong, China. ACM 978-1-60558-512-3/09/11.
7. P. Turney. Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews. Proceedings of the Association for Computational Linguistics (ACL), pp. 417–424, 2002.
8. R. Ghani, K. Probst, Y. Liu, M. Krema, and A. Fano. Text mining for product attribute extraction. SIGKDD Explorations Newsletter, vol. 8, pp. 41–48, 2006.
9. E. Riloff, S. Patwardhan, and J. Wiebe. Feature subsumption for opinion analysis. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2006.
10. Prem Melville, Wojciech Gryc, and Richard D. Lawrence. Sentiment Analysis of Blogs by Combining Lexical Knowledge with Text Classification. KDD’09, June 28–July 1, 2009, Paris, France. ACM 978-1-60558-495-9/09/06.
11. Neil O’Hare, Michael Davy, Adam Bermingham, Paul Ferguson, Páraic Sheridan, Cathal Gurrin, and Alan F. Smeaton. Topic-Dependent Sentiment Analysis of Financial Blogs. TSA’09, November 6, 2009, Hong Kong, China. ACM 978-1-60558-805-6/09/11.
