dc.creatorGuerra Soler, Aníbal José
dc.creatorLotero García, Jaime Andrés
dc.creatorAedo Cobo, José Édinson
dc.creatorIsaza Ramírez, Sebastián
dc.date2023-06-01T18:27:29Z
dc.date2019
dc.date.accessioned2024-04-23T18:13:33Z
dc.date.available2024-04-23T18:13:33Z
dc.identifierGuerra A, Lotero J, Aedo JÉ, Isaza S. Tackling the Challenges of FASTQ Referential Compression. Bioinform Biol Insights. 2019 Feb 14;13:1177932218821373. doi: 10.1177/1177932218821373. Erratum in: Bioinform Biol Insights. 2019 Sep 17;13:1177932219876803.
dc.identifier1177-9322
dc.identifierhttps://hdl.handle.net/10495/35246
dc.identifier10.1177/1177932218821373
dc.identifier.urihttps://repositorioslatinoamericanos.uchile.cl/handle/2250/9230561
dc.descriptionABSTRACT: The exponential growth of genomic data has recently motivated the development of compression algorithms to tackle storage capacity limitations in bioinformatics centers. Referential compressors could theoretically achieve much higher compression than their nonreferential counterparts; however, the latest tools have not yet harnessed that potential. Reaching that goal requires an efficient encoding model to represent the differences between the input and the reference. In this article, we introduce a novel approach for referential compression of FASTQ files. The core of our compression scheme is a referential compressor that combines local alignments with binary encoding optimized for long reads. We present the algorithms and performance tests developed for our read compression algorithm, named UdeACompress. Our compressor achieved the best results when compressing long reads and competitive compression ratios for shorter reads when compared with the best state-of-the-art programs. As an added value, it also showed reasonable execution times and memory consumption in comparison with similar tools.
dc.descriptionCOL0010717
dc.format19
dc.formatapplication/pdf
dc.languageeng
dc.publisherSAGE Publications
dc.publisherSistemas Embebidos e Inteligencia Computacional (SISTEMIC)
dc.publisherThousand Oaks, United States
dc.relationBioinform. Biol. Insights.
dc.rightsinfo:eu-repo/semantics/openAccess
dc.rightshttp://creativecommons.org/licenses/by-nc/2.5/co/
dc.rightshttp://purl.org/coar/access_right/c_abf2
dc.rightshttps://creativecommons.org/licenses/by-nc/4.0/
dc.subjectBioinformatics
dc.subjectData compression (computer science)
dc.subjectCoding theory
dc.titleTackling the Challenges of FASTQ Referential Compression
dc.typeinfo:eu-repo/semantics/article
dc.typeinfo:eu-repo/semantics/publishedVersion
dc.typehttp://purl.org/coar/resource_type/c_2df8fbb1
dc.typehttps://purl.org/redcol/resource_type/ART
dc.typeResearch article

