dc.creatorSOARES, M. V. B.
dc.creatorPRATI, R. C.
dc.creatorMONARD, M. C.
dc.date.accessioned2012-10-20T03:36:07Z
dc.date.accessioned2018-07-04T15:38:50Z
dc.date.available2012-10-20T03:36:07Z
dc.date.available2018-07-04T15:38:50Z
dc.date.created2012-10-20T03:36:07Z
dc.date.issued2009
dc.identifierIEEE LATIN AMERICA TRANSACTIONS, v.7, n.4, p.472-477, 2009
dc.identifier1548-0992
dc.identifierhttp://producao.usp.br/handle/BDPI/28979
dc.identifier10.1109/TLA.2009.5349047
dc.identifierhttp://dx.doi.org/10.1109/TLA.2009.5349047
dc.identifier.urihttp://repositorioslatinoamericanos.uchile.cl/handle/2250/1625621
dc.description.abstractThe amount of digitally stored textual information is growing every day. However, our capability to process and analyze that information is not growing at the same pace. To overcome this limitation, it is important to develop semiautomatic processes, such as text mining, to extract relevant knowledge from textual information. One of the main and most expensive stages of the text mining process is text pre-processing, in which the unstructured text is transformed into a structured format such as an attribute-value table. The stemming process, i.e., linguistic normalization, is usually used to find the attributes of this table. However, stemming is strongly dependent on the language in which the original text is written. Furthermore, for most languages, the stemming algorithms proposed in the literature are computationally expensive. In this work, several improvements to the well-known Porter stemming algorithm for the Portuguese language, which exploit the characteristics of this language, are proposed. Experimental results show that the proposed algorithm executes in far less time without affecting the quality of the generated stems.
dc.languagepor
dc.publisherIEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
dc.relationIeee Latin America Transactions
dc.rightsCopyright IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
dc.rightsrestrictedAccess
dc.subjectAttribute Reduction
dc.subjectStemming
dc.subjectText Mining
dc.subjectText Pre-Processing
dc.titleImprovements on the Porter's Stemming Algorithm for Portuguese
dc.typeJournal articles

