dc.contributorBugatti, Pedro Henrique
dc.contributorhttp://lattes.cnpq.br/2177467029991118
dc.contributorPaschoal, Alexandre Rossi
dc.contributor0000-0002-8887-0582
dc.contributorhttp://lattes.cnpq.br/5834088144837137
dc.contributorFujita, André
dc.contributorhttp://lattes.cnpq.br/0247990329725342
dc.contributorKashiwabara, Andre Yoshiaki
dc.contributorhttp://lattes.cnpq.br/3194328548975437
dc.contributorLopes, Fabricio Martins
dc.contributorhttp://lattes.cnpq.br/1660070580824436
dc.creatorCruz, Murilo Horacio Pereira da
dc.date.accessioned2020-10-27T01:06:31Z
dc.date.accessioned2022-12-06T14:21:08Z
dc.date.available2020-10-27T01:06:31Z
dc.date.available2022-12-06T14:21:08Z
dc.date.created2020-10-27T01:06:31Z
dc.date.issued2020-03-13
dc.identifierCRUZ, Murilo Horacio Pereira da. Classificação de elementos transponíveis por redes neurais convolucionais. 2020. Dissertação (Mestrado em Bioinformática) - Universidade Tecnológica Federal do Paraná, Cornélio Procópio, 2020.
dc.identifierhttp://repositorio.utfpr.edu.br/jspui/handle/1/5309
dc.identifier.urihttps://repositorioslatinoamericanos.uchile.cl/handle/2250/5247086
dc.description.abstractTransposable elements are the most represented sequences in eukaryotic genomes. They are capable of transposing and producing multiple copies throughout the host genome. By doing so, these sequences can produce a variety of effects on organisms, such as the regulation of gene expression. There are several kinds of these elements, which are classified hierarchically into classes, orders and superfamilies. Few methods in the literature classify these sequences at the deeper levels of the classification hierarchy, such as superfamily. Moreover, most methods use handcrafted features, such as k-mers, presence of ORFs, presence of protein domains, and homology-based search. These features can generalize poorly to non-homologous sequences and are time-consuming to compute. In this work, we introduce an approach, called Transposable Element Representation Learner (TERL), which is capable of representing 1D sequences as 2D sequence images. Our approach is generic and can be used to classify any type of biological sequence at any level of the classification system; it is also flexible regarding the type of architecture used for the classification. In this work we use seven databases to create nine data sets. These data sets were used in a series of 21 experiments designed to assess the performance of the methods TEclass, PASTEC and the proposed approach. TERL obtained an accuracy and F1-score of 0.95 and 0.71, respectively, on the classification of 11 superfamilies. For accuracy and specificity, our approach obtained 0.89 and 0.93, respectively, on the classification of order sequences from a data set created with sequences from different organisms and from different databases. These results surpass the metrics obtained by TEclass and PASTEC. Our approach also showed a great advantage in classification time: it is on average 76 times faster than TEclass and four orders of magnitude faster than PASTEC.
dc.publisherUniversidade Tecnológica Federal do Paraná
dc.publisherCornélio Procópio
dc.publisherBrasil
dc.publisherPrograma de Pós-Graduação em Bioinformática
dc.publisherUTFPR
dc.rightsopenAccess
dc.subjectGenoma
dc.subjectRedes neurais (Neurobiologia)
dc.subjectClassificação
dc.subjectGenomes
dc.subjectNeural networks (Neurobiology)
dc.subjectClassification
dc.titleClassificação de elementos transponíveis por redes neurais convolucionais
dc.typemasterThesis

