Conference proceedings
A qualitative analysis of a corpus of opinion summaries based on aspects
Conference of the North American Chapter of the Association for Computational Linguistics - Human Language Technologies; Linguistic Annotation Workshop, 9th, 2015, Denver.
López, Roque E.
Avanço, Lucas Vinicius
Balage Filho, Pedro Paulo
Bokan, Alessandro Y.
Cardoso, Paula Christina Figueira
Dias, Márcio de Souza
Nóbrega, Fernando Antônio Asevêdo
Cabezudo, Marco Antonio Sobrevilla
Souza, Jackson W. C.
Zacarias, Andressa C. I.
Seno, Eloize M. R.
Felippo, Ariani di
Pardo, Thiago Alexandre Salgueiro
Aspect-based opinion summarization is the task of automatically generating a summary for some aspects of a specific topic from a set of opinions. In most cases, evaluating the quality of automatic summaries requires a reference corpus of human summaries against which they can be compared. The scarcity of such corpora has been a limiting factor for many research works. In this paper, we introduce OpiSums-PT, a corpus of extractive and abstractive summaries of opinions written in Brazilian Portuguese. We use this corpus to analyze how similar human summaries are to one another and how people take aspect coverage and sentiment orientation into account when writing manual summaries. The results of these analyses show that human summaries are diverse and that people generate summaries for only some aspects, keeping the overall sentiment orientation with little variation.