dc.creator | Mahmood Abdullah, Sura | |
dc.creator | Mazin Ali, Sura | |
dc.creator | Abduljaleel Makttof, Mohammed | |
dc.date | 2019-06-19 | |
dc.date.accessioned | 2022-11-05T02:29:57Z | |
dc.date.available | 2022-11-05T02:29:57Z | |
dc.identifier | https://produccioncientificaluz.org/index.php/opcion/article/view/31072 | |
dc.identifier.uri | https://repositorioslatinoamericanos.uchile.cl/handle/2250/5141502 | |
dc.description | Calculating the similarity between texts written in any language remains one of the most important challenges in natural language processing. This paper presents a modified Jaccard similarity coefficient for texts; the main aim of the modification is to count the number of similar sentences between texts instead of the number of similar words, as in previous work. The modification is realized by an equation that combines the Jaccard coefficient with the similarity coefficient. Two criteria are employed in the proposed equation: the first is multiplied by the Jaccard coefficient and the second by the similarity coefficient. The purpose of these criteria is to keep the similarity degree between 0 and 1. The experimental results are consistent with expectations: the similarity degree given by the proposed equation was approximately 3% higher than that of the Jaccard coefficient when the texts were chosen from the same class, and lower than that of the Jaccard coefficient when the texts were chosen from different classes. | es-ES |
dc.format | application/pdf | |
dc.language | spa | |
dc.publisher | Universidad del Zulia | es-ES |
dc.relation | https://produccioncientificaluz.org/index.php/opcion/article/view/31072/32115 | |
dc.rights | Copyright 2020 Opción | es-ES |
dc.source | Opción; Vol. 35 (2019): Special Issue No. 19; 28 | es-ES |
dc.source | 2477-9385 | |
dc.source | 1012-1587 | |
dc.subject | Text Mining | es-ES |
dc.subject | Text Similarity | es-ES |
dc.subject | Lexical Similarity | es-ES |
dc.subject | String-Based Similarity | es-ES |
dc.subject | Jaccard Coefficient | es-ES |
dc.title | Modifying Jaccard Coefficient for Texts Similarity | es-ES |
dc.type | info:eu-repo/semantics/article | |
dc.type | info:eu-repo/semantics/publishedVersion | |
dc.type | Peer-reviewed article | es-ES |
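
The abstract contrasts counting shared words with counting shared sentences. The sketch below illustrates only that contrast, using the standard Jaccard coefficient |A ∩ B| / |A ∪ B| on two different kinds of sets. It is not the equation proposed in the paper: the record does not give that equation, its two criteria, or its definition of the similarity coefficient. The function names, the regex-based tokenization, the exact-match notion of "similar" sentences, and the sample texts are all assumptions made for illustration.

```python
import re


def jaccard(a: set, b: set) -> float:
    """Standard Jaccard coefficient |A & B| / |A | B|; 0.0 if both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def words(text: str) -> set:
    """Lowercased word set (assumed tokenization: runs of word characters)."""
    return set(re.findall(r"\w+", text.lower()))


def sentences(text: str) -> set:
    """Lowercased sentence set (assumed splitting on '.', '!' and '?').

    Whitespace is normalized so that sentences compare by exact match.
    """
    parts = re.split(r"[.!?]+", text.lower())
    return {" ".join(p.split()) for p in parts if p.strip()}


# Hypothetical sample texts, not taken from the paper.
doc_a = "Text mining is useful. Jaccard compares sets."
doc_b = "Jaccard compares sets. Text mining is hard."

word_sim = jaccard(words(doc_a), words(doc_b))              # 6 shared of 8 distinct words
sentence_sim = jaccard(sentences(doc_a), sentences(doc_b))  # 1 shared of 3 distinct sentences

print(f"word-level Jaccard:     {word_sim:.3f}")
print(f"sentence-level Jaccard: {sentence_sim:.3f}")
```

With these sample texts the word-level score is 0.75 and the sentence-level score is 1/3, which shows how the two units of comparison can rank the same pair of texts quite differently.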