dc.contributor | FGV | |
dc.creator | Santos, Cicero Nogueira dos | |
dc.creator | Zadrozny, Bianca | |
dc.date.accessioned | 2018-05-10T13:36:50Z | |
dc.date.accessioned | 2019-05-22T14:08:44Z | |
dc.date.available | 2018-05-10T13:36:50Z | |
dc.date.available | 2019-05-22T14:08:44Z | |
dc.date.created | 2018-05-10T13:36:50Z | |
dc.date.issued | 2014 | |
dc.identifier | 978-3-319-09761-9; 978-3-319-09760-2 | |
dc.identifier | 0302-9743 | |
dc.identifier | http://hdl.handle.net/10438/23485 | |
dc.identifier | 000358252900008 | |
dc.identifier.uri | http://repositorioslatinoamericanos.uchile.cl/handle/2250/2690648 | |
dc.description.abstract | Part-of-speech (POS) tagging for morphologically rich languages normally requires the use of handcrafted features that encapsulate clues about the language's morphology. In this work, we tackle Portuguese POS tagging using a deep neural network that employs a convolutional layer to learn character-level representations of words. We apply the network to three different corpora: the original Mac-Morpho corpus; a revised version of the Mac-Morpho corpus; and the Tycho Brahe corpus. Using the proposed approach, while avoiding the use of any handcrafted features, we produce state-of-the-art POS taggers for the three corpora: 97.47% accuracy on the Mac-Morpho corpus; 97.31% accuracy on the revised Mac-Morpho corpus; and 97.17% accuracy on the Tycho Brahe corpus. These results represent error reductions of 12.2%, 23.6% and 15.8%, respectively, relative to the best previously known result for each corpus. | |
dc.language | eng | |
dc.publisher | Springer International Publishing AG | |
dc.relation | Computational processing of the Portuguese language | |
dc.rights | restrictedAccess | |
dc.source | Web of Science | |
dc.subject | Portuguese part-of-speech tagging | |
dc.subject | Deep learning | |
dc.subject | Convolutional neural networks | |
dc.subject | Recognition | |
dc.title | Training state-of-the-art Portuguese POS taggers without handcrafted features | |
dc.type | Conference Proceedings | |