dc.creatorEspariz, Martin
dc.creatorZuljan, Federico Alberto
dc.creatorEsteban, Luis
dc.creatorMagni, Christian
dc.date.accessioned2018-06-29T18:33:53Z
dc.date.available2018-06-29T18:33:53Z
dc.date.created2018-06-29T18:33:53Z
dc.date.issued2016-09
dc.identifierEspariz, Martin; Zuljan, Federico Alberto; Esteban, Luis; Magni, Christian; Taxonomic identity resolution of highly phylogenetically related strains and selection of phylogenetic markers by using genome-scale methods: The Bacillus pumilus group case; Public Library of Science; PLOS ONE; 11; 9; 9-2016; 1-17; e0163098
dc.identifier1932-6203
dc.identifierhttp://hdl.handle.net/11336/50748
dc.identifierCONICET Digital
dc.identifierCONICET
dc.description.abstractBacillus pumilus group strains have been studied due to their agronomic, biotechnological or pharmaceutical potential. Classifying strains of this taxonomic group at the species level is challenging because the group comprises seven species that share over 99.5% 16S rRNA gene identity. In this study, a whole-genome in silico approach was first used to accurately demarcate B. pumilus group strains, as a case of highly phylogenetically related taxa, at the species level. To achieve this, and consequently to validate or correct the taxonomic identities of genomes in public databases, an average nucleotide identity correlation analysis, a core-based phylogenomic analysis and a gene function repertoire analysis were performed. More than 50% of these genomes were found to be misclassified. Hierarchical clustering of gene functional repertoires was also used to infer ecotypes among B. pumilus group species. Furthermore, for the first time, the machine-learning algorithm Random Forest was used to rank genes in order of their importance for species classification. We found that ybbP, a gene involved in the synthesis of cyclic di-AMP, was the most important gene for accurately predicting species identity among B. pumilus group strains. Finally, principal component analysis was used to classify strains based on the distances between their ybbP genes. The methodologies described could be used more broadly to identify other highly phylogenetically related species in metagenomic or epidemiological assessments.
dc.languageeng
dc.publisherPublic Library of Science
dc.relationinfo:eu-repo/semantics/altIdentifier/doi/http://dx.doi.org/10.1371/journal.pone.0163098
dc.relationinfo:eu-repo/semantics/altIdentifier/url/http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0163098
dc.rightshttps://creativecommons.org/licenses/by-nc-sa/2.5/ar/
dc.rightsinfo:eu-repo/semantics/openAccess
dc.subjectB. pumilus
dc.subjectRandom Forests
dc.subjectTaxonomic Resolution
dc.titleTaxonomic identity resolution of highly phylogenetically related strains and selection of phylogenetic markers by using genome-scale methods: The Bacillus pumilus group case
dc.typeinfo:eu-repo/semantics/article
dc.typeinfo:ar-repo/semantics/artículo
dc.typeinfo:eu-repo/semantics/publishedVersion

