Dissertation
CALI: a novel model for visual mining of biological relevant patterns in protein-ligand graphs
Date
2016-10-27
Author
Susana Medina Gordillo
Institution
Abstract
Protein-ligand interaction (PLI) networks show how proteins interact with small nonprotein
ligands and can be used to study molecular recognition, which plays an important
role in biological systems. The binding and interaction of molecules depend on a
combination of conformational and physicochemical complementarity. There are several
methods to predict protein-ligand interactions, but few are designed to identify
and describe implications of intelligible factors in protein-ligand recognition.
We propose CALI (Complex network-based Analysis of protein-Ligand Interactions),
a strategy based on complex network modeling of protein-ligand interactions,
revealing frequent and relevant patterns among them. We compared patterns obtained
with CALI to those computed using the Frequent Subgraph Mining (FSM) paradigm. FSM
must be run several times for a variety of support values and it also needs a mapping
step, in which computed patterns are mapped to the graph input dataset through a
subgraph isomorphism algorithm. On the other hand, CALI is executed once and without
applying the mapping step to the input dataset. Additionally, patterns obtained
with CALI were compared to experimentally determined protein-ligand interactions
from previous studies involving two datasets: one composed of the well-studied CDK2
enzymes and the other of the Ricin toxin. For the CDK2 dataset, CALI found 90% of
the experimentally determined residues and, for the Ricin dataset, CALI found all residues that interact with ligands.
CALI was able to predict residues experimentally determined as relevant in
protein-ligand interactions for two diverse datasets. This new model requires neither
running FSM nor analyzing its large number of output patterns to find the most
common protein-ligand interactions. Instead, we propose using network topological
properties coupled with a powerful visual and interactive representation of data to
analyze interactions. Furthermore, our strategy provides a general view of the input
interaction dataset, showing the most common PLIs from a global perspective.
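The idea of reading common interactions off network topology in a single pass, rather than re-running a miner at several support values, can be illustrated with a minimal sketch. This is not the CALI implementation: the contact tuples, the residue and atom-type labels, and the function names `build_network`, `weighted_degree`, and `rank_residues` are all hypothetical, and weighted degree is only one example of a topological property that could be used.

```python
from collections import Counter

# Hypothetical toy data (not from the thesis): one tuple per observed contact,
# as (complex id, protein residue label, ligand atom type).
contacts = [
    ("c1", "RES_A", "N"),
    ("c1", "RES_B", "N"),
    ("c2", "RES_A", "N"),
    ("c2", "RES_A", "O"),
    ("c3", "RES_A", "O"),
    ("c3", "RES_C", "N"),
]


def build_network(contacts):
    """Merge all complexes into one weighted residue/atom-type network.

    The weight of an edge is the number of distinct complexes in which
    that residue/atom-type pair is observed.
    """
    unique = set(contacts)  # count each pair at most once per complex
    return Counter((residue, atom) for _, residue, atom in unique)


def weighted_degree(edges):
    """Sum the edge weights incident to each residue node."""
    degree = Counter()
    for (residue, _atom), weight in edges.items():
        degree[residue] += weight
    return degree


def rank_residues(contacts):
    """Rank residues by weighted degree, highest first."""
    return weighted_degree(build_network(contacts)).most_common()


if __name__ == "__main__":
    for residue, score in rank_residues(contacts):
        print(residue, score)
```

In this sketch the merged network is built once, and the ranking needs neither a support threshold nor a subgraph isomorphism step to map patterns back to the input complexes.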