---
abstract: "This paper presents a simple unsupervised learning algorithm for classifying reviews as recommended (thumbs up) or not recommended (thumbs down). The classification of a review is predicted by the average semantic orientation of the phrases in the review that contain adjectives or adverbs. A phrase has a positive semantic orientation when it has good associations (e.g., \"subtle nuances\") and a negative semantic orientation when it has bad associations (e.g., \"very cavalier\"). In this paper, the semantic orientation of a phrase is calculated as the mutual information between the given phrase and the word \"excellent\" minus the mutual information between the given phrase and the word \"poor\". A review is classified as recommended if the average semantic orientation of its phrases is positive. The algorithm achieves an average accuracy of 74% when evaluated on 410 reviews from Epinions, sampled from four different domains (reviews of automobiles, banks, movies, and travel destinations). The accuracy ranges from 84% for automobile reviews to 66% for movie reviews."
altloc:
  - http://extractor.iit.nrc.ca/reports/acl02.pdf
chapter: ~
commentary: ~
commref: ~
confdates: July 8-10
conference: 40th Annual Meeting of the Association for Computational Linguistics (ACL'02)
confloc: 'Philadelphia, Pennsylvania'
contact_email: ~
creators_id: []
creators_name:
  - family: Turney
    given: Peter D.
    honourific: ''
    lineage: ''
date: 2002
date_type: published
datestamp: 2002-07-15
department: ~
dir: disk0/00/00/23/21
edit_lock_since: ~
edit_lock_until: ~
edit_lock_user: ~
editors_id: []
editors_name: []
eprint_status: archive
eprintid: 2321
fileinfo: /style/images/fileicons/application_postscript.png;/2321/1/turney%2Dacl02%2Dfinal.ps|/style/images/fileicons/application_pdf.png;/2321/5/turney%2Dacl02%2Dfinal.pdf
full_text_status: public
importid: ~
institution: ~
isbn: ~
ispublished: pub
issn: ~
item_issues_comment: []
item_issues_count: 0
item_issues_description: []
item_issues_id: []
item_issues_reported_by: []
item_issues_resolved_by: []
item_issues_status: []
item_issues_timestamp: []
item_issues_type: []
keywords: ~
lastmod: 2011-03-11 08:54:57
latitude: ~
longitude: ~
metadata_visibility: show
note: ~
number: ~
pagerange: 417-424
pubdom: FALSE
publication: ~
publisher: ~
refereed: TRUE
referencetext: |
  Agresti, A. 1996. An introduction to categorical data analysis. New York: Wiley.
  Brill, E. 1994. Some advances in transformation-based part of speech tagging. Proceedings of the Twelfth National Conference on Artificial Intelligence (pp. 722-727). Menlo Park, CA: AAAI Press.
  Church, K.W., & Hanks, P. 1989. Word association norms, mutual information and lexicography. Proceedings of the 27th Annual Conference of the ACL (pp. 76-83). New Brunswick, NJ: ACL.
  Frank, E., & Hall, M. 2001. A simple approach to ordinal classification. Proceedings of the Twelfth European Conference on Machine Learning (pp. 145-156). Berlin: Springer-Verlag.
  Hatzivassiloglou, V., & McKeown, K.R. 1997. Predicting the semantic orientation of adjectives. Proceedings of the 35th Annual Meeting of the ACL and the 8th Conference of the European Chapter of the ACL (pp. 174-181). New Brunswick, NJ: ACL.
  Hatzivassiloglou, V., & Wiebe, J.M. 2000. Effects of adjective orientation and gradability on sentence subjectivity. Proceedings of 18th International Conference on Computational Linguistics. New Brunswick, NJ: ACL.
  Hearst, M.A. 1992. Direction-based text interpretation as an information access refinement. In P. Jacobs (Ed.), Text-Based Intelligent Systems: Current Research and Practice in Information Extraction and Retrieval. Mahwah, NJ: Lawrence Erlbaum Associates.
  Landauer, T.K., & Dumais, S.T. 1997. A solution to Plato’s problem: The latent semantic analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review, 104, 211-240.
  Santorini, B. 1995. Part-of-Speech Tagging Guidelines for the Penn Treebank Project (3rd revision, 2nd printing). Technical Report, Department of Computer and Information Science, University of Pennsylvania.
  Spertus, E. 1997. Smokey: Automatic recognition of hostile messages. Proceedings of the Conference on Innovative Applications of Artificial Intelligence (pp. 1058-1065). Menlo Park, CA: AAAI Press.
  Tong, R.M. 2001. An operational system for detecting and tracking opinions in on-line discussions. Working Notes of the ACM SIGIR 2001 Workshop on Operational Text Classification (pp. 1-6). New York, NY: ACM.
  Turney, P.D. 2001. Mining the Web for synonyms: PMI-IR versus LSA on TOEFL. Proceedings of the Twelfth European Conference on Machine Learning (pp. 491-502). Berlin: Springer-Verlag.
  Wiebe, J.M. 2000. Learning subjective adjectives from corpora. Proceedings of the 17th National Conference on Artificial Intelligence. Menlo Park, CA: AAAI Press.
  Wiebe, J.M., Bruce, R., Bell, M., Martin, M., & Wilson, T. 2001. A corpus study of evaluative and speculative language. Proceedings of the Second ACL SIG on Dialogue Workshop on Discourse and Dialogue. Aalborg, Denmark.
relation_type: []
relation_uri: []
reportno: ~
rev_number: 14
series: ~
source: ~
status_changed: 2007-09-12 16:44:12
subjects:
  - comp-sci-art-intel
  - comp-sci-lang
  - comp-sci-mach-learn
  - comp-sci-stat-model
succeeds: ~
suggestions: ~
sword_depositor: ~
sword_slug: ~
thesistype: ~
title: 'Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews'
type: confpaper
userid: 2175
volume: ~