Compression of Very Sparse Column Oriented Data

Garcia, Vinicius Fulber; Mergen, Sergio Luis Sardi

info:eu-repo/semantics/article

Record at:

https://periodicos.ufsm.br/coming/article/view/22772

10.5902/2448190422772

https://repositorioslatinoamericanos.uchile.cl/handle/2250/8941714

Authors

Garcia, Vinicius Fulber

Mergen, Sergio Luis Sardi

Institution

Universidade Federal de Santa Maria (Brasil)

Abstract

Column-oriented databases store the values of each column contiguously on disk. Because adjacent values come from the same domain, the information entropy of the stored data is lower, and compression algorithms achieve better results. Columns with high cardinality are usually compressed with variations of the LZ method. In this paper, we consider simpler methods based on run-length encoding and symbol probability for scenarios in which the datasets are very sparse. Our experiments show the cases in which the evaluated simple methods provide promising results.
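
As an illustration of why run-length methods suit very sparse columns, the following is a minimal sketch and not the authors' implementation. The function names rle_encode and rle_decode, the use of None to mark a missing value, and the sample column are all assumptions made for this example.

```python
from itertools import groupby


def rle_encode(column):
    """Collapse each run of consecutive equal values into a (value, run_length) pair."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(column)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original column."""
    column = []
    for value, count in pairs:
        column.extend([value] * count)
    return column


# Hypothetical sparse column: None marks a missing value (an assumption of this sketch).
column = [None] * 6 + [42] + [None] * 5 + [7, 7] + [None] * 4

encoded = rle_encode(column)
# 18 stored values collapse into 5 pairs, because the long runs of None cost one pair each.
assert rle_decode(encoded) == column
```

In a sparse column most of the space goes to long runs of the missing-value marker, so the number of pairs grows with the number of non-missing entries rather than with the column length.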

Subjects

compression

column oriented databases
