Artículo de revista
Efficient Fully-Compressed Sequence Representations
Date
2012Registration in:
Algorithmica (2014) 69:232–268
DOI 10.1007/s00453-012-9726-3
Author
Barbay, Jérémy
Claude, Francisco
Gagie, Travis
Navarro, Gonzalo
Nekrich, Yakov
Institutions
Abstract
We present a data structure that stores a sequence s[1..n] over alphabet
[1..σ ] in nH0(s) + o(n)(H0(s)+1) bits, where H0(s) is the zero-order entropy
of s. This structure supports the queries access, rank and select, which are fundamental
building blocks for many other compressed data structures, in worst-case
time O(lg lgσ) and average time O(lgH0(s)). The worst-case complexity matches
the best previous results, yet these had been achieved with data structures using
nH0(s) + o(n lgσ) bits. On highly compressible sequences the o(n lgσ) bits of the
redundancy may be significant compared to the nH0(s) bits that encode the data. Our
representation, instead, compresses the redundancy as well. Moreover, our averagecase
complexity is unprecedented.