dc.creatorJosé Fernando Sánchez Vega
dc.creatorEsau Villatoro Tello
dc.creatorManuel Montes y Gómez
dc.creatorLuis Villaseñor Pineda
dc.creatorPaolo Rosso
dc.date2013
dc.date.accessioned2023-07-25T16:25:31Z
dc.date.available2023-07-25T16:25:31Z
dc.identifierhttp://inaoe.repositorioinstitucional.mx/jspui/handle/1009/2393
dc.identifier.urihttps://repositorioslatinoamericanos.uchile.cl/handle/2250/7807569
dc.descriptionAn important task in plagiarism detection is determining and measuring similar text portions between a given pair of documents. One of the main difficulties of this task lies in the fact that reused text is commonly modified with the aim of covering or camouflaging the plagiarism. Another difficulty is that not all similar text fragments are examples of plagiarism, since thematic coincidences also tend to produce portions of similar text. In order to tackle these problems, we propose a novel method for detecting likely portions of reused text. This method is able to detect common actions performed by plagiarists, such as word deletion, insertion and transposition, which makes it possible to obtain plausible portions of reused text. We also propose representing the identified reused text by means of a set of features that denote its degree of plagiarism, relevance and fragmentation. This new representation aims to facilitate the recognition of plagiarism by considering diverse characteristics of the reused text during the classification phase. Experimental results employing a supervised classification strategy showed that the proposed method is able to outperform traditionally used approaches.
dc.formatapplication/pdf
dc.languageeng
dc.publisherElsevier Ltd.
dc.relationcitation:Sánchez, F., et al. (2013). Determining and characterizing the reused text for plagiarism detection. Expert Systems with Applications, Vol. 40 (2013): 1804–1813
dc.rightsinfo:eu-repo/semantics/openAccess
dc.rightshttp://creativecommons.org/licenses/by-nc-nd/4.0
dc.subjectinfo:eu-repo/classification/Plagiarism detection/Plagiarism detection
dc.subjectinfo:eu-repo/classification/Text reuse/Text reuse
dc.subjectinfo:eu-repo/classification/Machine learning/Machine learning
dc.subjectinfo:eu-repo/classification/Supervised classification/Supervised classification
dc.subjectinfo:eu-repo/classification/cti/1
dc.subjectinfo:eu-repo/classification/cti/12
dc.subjectinfo:eu-repo/classification/cti/1203
dc.titleDetermining and characterizing the reused text for plagiarism detection
dc.typeinfo:eu-repo/semantics/article
dc.typeinfo:eu-repo/semantics/acceptedVersion
dc.audiencestudents
dc.audienceresearchers
dc.audiencegeneralPublic

