info:eu-repo/semantics/article
Gaussian distribution of trie depth for strongly tame sources
Date
2015-01
Record in:
Cesaratto, Eda; Vallée, Brigitte; Gaussian distribution of trie depth for strongly tame sources; Cambridge University Press; Combinatorics, Probability & Computing (print); 24; 1; 1-2015; 54-103
0963-5483
CONICET Digital
CONICET
Authors
Cesaratto, Eda
Vallée, Brigitte
Abstract
The depth of a trie has been deeply studied when the source which produces the words is a simple source (a memoryless source or a Markov chain). When a source is simple but not an unbiased memoryless source, the expectation and the variance are both of logarithmic order and their dominant terms involve characteristic objects of the source, for instance the entropy. Moreover, there is an asymptotic Gaussian law, even though the speed of convergence towards the Gaussian law has not yet been precisely estimated. The present paper describes a 'natural' class of general sources, which does not contain any simple source, where the depth of a random trie, built on a set of words independently drawn from the source, has the same type of probabilistic behaviour as for simple sources: the expectation and the variance are both of logarithmic order and there is an asymptotic Gaussian law. There are precise asymptotic expansions for the expectation and the variance, and the speed of convergence toward the Gaussian law is optimal. The paper first provides analytical conditions on the Dirichlet series of probabilities of a general source under which this Gaussian law can be derived: a pole-free region where the series is of polynomial growth. In a second step, the paper focuses on sources associated with dynamical systems, called dynamical sources, where the Dirichlet series of probabilities is expressed with the transfer operator of the dynamical system. Then, the paper extends results due to Dolgopyat, already generalized by Baladi and Vallée, and shows that the previous analytical conditions are fulfilled for 'most' dynamical sources, provided that they 'strongly differ' from simple sources. Finally, the present paper describes a class of sources not containing any simple source, where the trie depth has the same type of probabilistic behaviour as for simple sources, even with more precise estimates.
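As a purely illustrative aside, the baseline behaviour the abstract recalls for simple sources can be observed numerically. The sketch below is not taken from the paper: it assumes a biased binary memoryless source with parameter `p`, truncates words to a fixed `length`, and uses hypothetical helper names (`random_word`, `lcp`, `trie_depths`, `simulate`). It computes the depth of every leaf of the trie built on `n` independent words and compares the empirical mean with log n divided by the entropy, the classical first-order term for memoryless sources. The sources studied in the paper are explicitly not of this simple type, so this only shows the behaviour they are compared against.

```python
import math
import random


def random_word(p, length, rng):
    """Draw a binary word from a memoryless source: symbol 1 with probability p."""
    return tuple(1 if rng.random() < p else 0 for _ in range(length))


def lcp(a, b):
    """Length of the longest common prefix of two words."""
    k = 0
    for x, y in zip(a, b):
        if x != y:
            break
        k += 1
    return k


def trie_depths(words):
    """Depth of each leaf in the trie built on `words` (root has depth 0).

    A word's leaf sits one level below its longest common prefix with any
    other word; in sorted order that maximum is reached by a neighbour.
    Returns depths in sorted-word order.
    """
    ws = sorted(words)
    n = len(ws)
    if n == 1:
        return [0]
    depths = []
    for i in range(n):
        best = 0
        if i > 0:
            best = max(best, lcp(ws[i], ws[i - 1]))
        if i < n - 1:
            best = max(best, lcp(ws[i], ws[i + 1]))
        depths.append(best + 1)
    return depths


def simulate(n=2000, p=0.3, length=64, seed=0):
    """Return (empirical mean depth, ln(n) / entropy) for one random trie.

    Assumption: `length` is large enough that truncating the (conceptually
    infinite) words does not affect the depths.
    """
    rng = random.Random(seed)
    words = [random_word(p, length, rng) for _ in range(n)]
    depths = trie_depths(words)
    entropy = -p * math.log(p) - (1 - p) * math.log(1 - p)
    return sum(depths) / n, math.log(n) / entropy


if __name__ == "__main__":
    mean_depth, first_order = simulate()
    print(f"empirical mean depth: {mean_depth:.2f}")
    print(f"ln(n) / entropy:      {first_order:.2f}")
```

The two printed values should be of the same order but need not coincide, since the expectation also carries lower-order terms beyond log n over the entropy. Sorting the words first keeps the computation near O(n log n) instead of comparing all pairs.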