Artículo de revista
Efficient Fully-Compressed Sequence Representations
Fecha
2012Registro en:
Algorithmica (2014) 69:232–268
DOI 10.1007/s00453-012-9726-3
Autor
Barbay, Jérémy
Claude, Francisco
Gagie, Travis
Navarro, Gonzalo
Nekrich, Yakov
Institución
Resumen
We present a data structure that stores a sequence s[1..n] over alphabet
[1..σ ] in nH0(s) + o(n)(H0(s)+1) bits, where H0(s) is the zero-order entropy
of s. This structure supports the queries access, rank and select, which are fundamental
building blocks for many other compressed data structures, in worst-case
time O(lg lgσ) and average time O(lgH0(s)). The worst-case complexity matches
the best previous results, yet these had been achieved with data structures using
nH0(s) + o(n lgσ) bits. On highly compressible sequences the o(n lgσ) bits of the
redundancy may be significant compared to the nH0(s) bits that encode the data. Our
representation, instead, compresses the redundancy as well. Moreover, our averagecase
complexity is unprecedented.