info:ar-repo/semantics/artículo
ddRADseq‑mediated detection of genetic variants in sugarcane
Date
2022-11-11
Authors
Molina, Catalina
Aguirre, Natalia Cristina
Vera, Pablo Alfredo
Filippi, Carla Valeria
Puebla, Andrea Fabiana
Marcucci Poltri, Susana Noemi
Paniego, Norma Beatriz
Acevedo, Alberto
Abstract
Sugarcane (Saccharum sp.), a feedstock known worldwide for sugar, bioethanol, and energy production, has an extremely complex genome, being highly polyploid and aneuploid. A double-digestion restriction site-associated DNA sequencing protocol (ddRADseq) was tested in four commercial sugarcane hybrids and one high-fibre biotype for the detection of single nucleotide polymorphisms (SNPs). In this work we tested two Illumina sequencing platforms, read size (70 vs. 150 bp), different sequencing coverage per individual (medium and high coverage), and single reads versus paired-end reads. We also explored different variant calling strategies (with and without a reference genome) and filtering schemes [combining two minor allele frequencies (MAFs) with three depth-of-coverage thresholds]. For the discovery of a large number of novel SNPs in sugarcane, we recommend longer, paired-end reads, medium sequencing coverage per individual and the Illumina NovaSeq6000 platform for a cost-effective approach, and filter parameters of lower MAF and higher depth-of-coverage thresholds. Although the de novo analysis retrieved more SNPs, the reference-based method allows downstream characterization of variants. For the two best performing matrices, the number of SNPs per chromosome correlated positively with chromosome length, demonstrating the presence of variants throughout the genome. Multivariate comparisons, with both matrices, showed closer relationships among commercial hybrids than with the high-fibre biotype. Functional analysis of the SNPs demonstrated that more than half of them landed within regulatory regions, whereas the remainder affected coding, intergenic and intronic regions. Allelic distance values were lower than 0.07 when analysing two replicated genotypes, confirming the robustness of the protocol.
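The filtering schemes mentioned in the abstract (two MAFs combined with three depth-of-coverage thresholds, giving six SNP matrices) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the threshold values, the toy SNP records, the field names (`alt_freq`, `depths`) and the rule that every sample must reach the minimum depth are hypothetical and are not taken from the article.

```python
from itertools import product


def minor_allele_frequency(alt_freq):
    """Return the frequency of the rarer allele at a biallelic site."""
    return min(alt_freq, 1.0 - alt_freq)


def filter_snps(snps, min_maf, min_depth):
    """Keep the IDs of SNPs that pass both the MAF and the depth threshold.

    Assumption: a SNP passes the depth filter only if every sample
    reaches `min_depth`; the article may apply a different rule.
    """
    kept = []
    for snp in snps:
        maf_ok = minor_allele_frequency(snp["alt_freq"]) >= min_maf
        depth_ok = all(depth >= min_depth for depth in snp["depths"])
        if maf_ok and depth_ok:
            kept.append(snp["id"])
    return kept


def filtering_grid(snps, maf_values, depth_values):
    """Apply every MAF x depth combination and return one SNP list per scheme."""
    return {
        (maf, depth): filter_snps(snps, maf, depth)
        for maf, depth in product(maf_values, depth_values)
    }


# Hypothetical toy data: three SNPs scored in five individuals.
TOY_SNPS = [
    {"id": "snp1", "alt_freq": 0.30, "depths": [25, 30, 22, 40, 28]},
    {"id": "snp2", "alt_freq": 0.06, "depths": [12, 15, 11, 18, 10]},
    {"id": "snp3", "alt_freq": 0.92, "depths": [6, 7, 5, 9, 8]},
]

# Placeholder thresholds, not the values used in the study.
MAF_VALUES = (0.05, 0.10)
DEPTH_VALUES = (5, 10, 20)

grid = filtering_grid(TOY_SNPS, MAF_VALUES, DEPTH_VALUES)

for (maf, depth), kept in sorted(grid.items()):
    print(f"MAF >= {maf:.2f}, depth >= {depth:>2}: {len(kept)} SNPs kept")
```

In this toy example the most permissive scheme (lower MAF, lowest depth) keeps all three SNPs, while raising either threshold removes the low-frequency or low-depth sites. This reflects the general trade-off between the number of SNPs retained and the confidence in each call.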