dc.creator: RAFAEL GUZMAN CABRERA
dc.creator: MANUEL MONTES Y GOMEZ
dc.creator: Paolo ROSSO
dc.creator: LUIS VILLASEÑOR PINEDA
dc.date: 2009
dc.date.accessioned: 2022-10-12T19:48:09Z
dc.date.available: 2022-10-12T19:48:09Z
dc.identifier: http://inaoe.repositorioinstitucional.mx/jspui/handle/1009/1190
dc.identifier.uri: https://repositorioslatinoamericanos.uchile.cl/handle/2250/4122307
dc.description: Most current methods for automatic text categorization are based on supervised learning techniques and, therefore, they face the problem of requiring a great number of training instances to construct an accurate classifier. In order to tackle this problem, this paper proposes a new semi-supervised method for text categorization, which considers the automatic extraction of unlabeled examples from the Web and the application of an enriched self-training approach for the construction of the classifier. This method, even though language independent, is more pertinent for scenarios where large sets of labeled resources do not exist. That, for instance, could be the case of several application domains in different non-English languages such as Spanish. The experimental evaluation of the method was carried out in three different tasks and in two different languages. The achieved results demonstrate the applicability and usefulness of the proposed method.
dc.format: application/pdf
dc.language: eng
dc.publisher: Springer Science+Business Media
dc.relation: citation:Guzmán-Cabrera, R., et al., (2009). Using the Web as corpus for self-training text categorization, Springer Science Inf. Retrieval (12): 400–415
dc.rights: info:eu-repo/semantics/openAccess
dc.rights: http://creativecommons.org/licenses/by-nc-nd/4.0
dc.subject: info:eu-repo/classification/Text categorization/Text categorization
dc.subject: info:eu-repo/classification/Semi-supervised learning/Semi-supervised learning
dc.subject: info:eu-repo/classification/Self-training/Self-training
dc.subject: info:eu-repo/classification/Web as corpus/Web as corpus
dc.subject: info:eu-repo/classification/Authorship attribution/Authorship attribution
dc.subject: info:eu-repo/classification/cti/1
dc.subject: info:eu-repo/classification/cti/12
dc.subject: info:eu-repo/classification/cti/1203
dc.title: Using the Web as corpus for self-training text categorization
dc.type: info:eu-repo/semantics/article
dc.type: info:eu-repo/semantics/acceptedVersion
dc.audience: students
dc.audience: researchers
dc.audience: generalPublic

