Conference proceedings
Empath: A Framework For Evaluating Entity-level Sentiment Analysis
Registration in:
9781457715914
2011 8th International Conference and Expo on Emerging Technologies for a Smarter World, CEWIT 2011, 2011.
10.1109/CEWIT.2011.6135866
2-s2.0-84857221210
Authors
Ward C.B.
Choi Y.
Skiena S.
Xavier E.C.
Abstract
Sentiment analysis is a fundamental component of text-driven monitoring and forecasting systems, in which the general sentiment towards real-world entities (e.g., people, products, organizations) is analyzed based on the sentiment signals embedded in the vast amount of web text available today. Building such systems involves several practically important problems, from data cleansing (e.g., boilerplate removal, web-spam detection) and sentiment analysis at the individual mention level (e.g., phrase-, sentence-, document-level) to the aggregation of sentiment for entity-level (e.g., person, company) analysis. Most previous research in sentiment analysis, however, has focused only on individual mention-level analysis, and relatively little work has addressed the other practically important problems involved in enabling a large-scale sentiment monitoring system. In this paper, we propose Empath, a new framework for evaluating entity-level sentiment analysis. Empath leverages objective measurements of entities in various domains, such as people, companies, countries, movies, and sports, to facilitate entity-level sentiment analysis and tracking. We demonstrate the utility of Empath for the evaluation of a large-scale sentiment system by applying it to various lexicons using Lydia, our own large-scale text-analytics tool, over a corpus consisting of more than a terabyte of newspaper data. We expect that Empath will encourage research that encompasses end-to-end pipelines to enable large-scale text-driven monitoring and forecasting systems. © 2011 IEEE.