Thesis
MIDB: um modelo de integração de dados biológicos
Date
2012-02-29
Registered in:
PERLIN, Caroline Beatriz. MIDB : um modelo de integração de dados biológicos. 2012. 105 f. Dissertação (Mestrado em Ciências Exatas e da Terra) - Universidade Federal de São Carlos, São Carlos, 2012.
Author
Perlin, Caroline Beatriz
Institution
Abstract
In bioinformatics, a huge volume of data on biomolecules and on nucleotide and amino acid sequences resides, almost entirely, in several Biological Databases (BDBs). A given sequence can be described by several classes of information: genomic data, evolutionary data, structural data, and others. Some BDBs store only one or a few of these classes. The BDBs are hosted on different sites and servers and run on various database management systems with different data models. In addition, their instances and schemas may be semantically heterogeneous. In this scenario, the objective of this project is to propose a biological data integration model that adopts new schema integration and instance integration techniques. The proposed model has one mechanism for schema integration and another for instance integration, which is supported by a dictionary and allows conflicts in attribute values to be resolved; a Clustering Algorithm groups similar entities, and a domain specialist takes part in managing the resulting clusters. The model was validated through a case study focusing on schema and instance integration of nucleotide sequence data from organisms of the genus Actinomyces, collected from four different data sources. In the schema integration, about 97.91% of the attributes were correctly categorized, and the instance integration identified that about 50% of the clusters created need support from a specialist, which avoids errors in instance resolution. The contributions presented include the Attributes Categorization, the Clustering Algorithm, the proposed distance functions, and the proposed model itself.
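The instance integration step described in the abstract (dictionary-supported normalization, clustering of similar entities, and referral of conflicting clusters to a specialist) can be illustrated with a minimal sketch. This is not the algorithm or the distance functions of the thesis, which the abstract does not specify. The record fields, the dictionary entry, the `difflib`-based distance, the greedy clustering rule, and the 0.1 threshold are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical synonym dictionary: maps a source-specific spelling to a canonical value.
DICTIONARY = {"a. naeslundii": "actinomyces naeslundii"}


def normalize(value):
    """Lowercase and trim a value, then replace it by its dictionary entry if one exists."""
    v = value.strip().lower()
    return DICTIONARY.get(v, v)


def distance(a, b):
    """Illustrative distance in [0, 1]: 1 minus the difflib similarity of the organism names."""
    matcher = SequenceMatcher(None, normalize(a["organism"]), normalize(b["organism"]))
    return 1.0 - matcher.ratio()


def cluster(records, threshold=0.1):
    """Greedy clustering: a record joins the first cluster that holds a close enough member."""
    clusters = []
    for rec in records:
        for members in clusters:
            if any(distance(rec, m) <= threshold for m in members):
                members.append(rec)
                break
        else:
            # No existing cluster is close enough, so the record starts a new one.
            clusters.append([rec])
    return clusters


def needs_specialist(members, attribute):
    """Flag a cluster whose members still disagree on an attribute after normalization."""
    return len({normalize(m[attribute]) for m in members}) > 1


# Made-up records standing in for entries captured from four data sources.
records = [
    {"source": "A", "organism": "Actinomyces naeslundii", "length": "1500"},
    {"source": "B", "organism": "A. naeslundii", "length": "1500"},
    {"source": "C", "organism": "Actinomyces naeslundi", "length": "1498"},
    {"source": "D", "organism": "Actinomyces israelii", "length": "1200"},
]

clusters = cluster(records)
flags = [needs_specialist(c, "length") for c in clusters]
```

In this toy data, the first three records fall into one cluster whose `length` values conflict, so that cluster would be passed to the specialist instead of being resolved automatically, while the fourth record forms a cluster of its own with nothing to resolve.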