Journal articles
Data-driven representations for testing independence: modeling, analysis and connection with mutual information estimation
Date
2022
Published in:
IEEE Transactions on Signal Processing, Vol. 70, 2022
10.1109/TSP.2021.3135689
Authors
González, Mauricio E.
Silva Sánchez, Jorge
Videla, Miguel
Orchard Concha, Marcos Eduardo
Institution
Abstract
This work addresses testing the independence of two continuous and finite-dimensional random variables from the design of a data-driven partition. The empirical log-likelihood statistic is adopted to approximate the sufficient statistics of an oracle test against independence (that knows the two hypotheses). It is shown that approximating the sufficient statistics of the oracle test offers a learning criterion for designing a data-driven partition that connects with the problem of mutual information estimation. Applying these ideas in the context of a data-dependent tree-structured partition (TSP), we derive conditions on the TSP’s parameters to achieve a strongly consistent distribution-free test of independence over the family of probabilities equipped with a density. Complementing this result, we present finite-length results that show our TSP scheme’s capacity to detect the scenario of independence structurally with the data-driven partition as well as new sampling complexity bounds for this detection. Finally, some experimental analyses provide evidence regarding our scheme’s advantage for testing independence compared with some strategies that do not use data-driven representations.
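
To illustrate the link between a partition-based log-likelihood statistic and mutual information estimation, the following is a minimal sketch in Python. It is not the authors' tree-structured partition: as a simplifying assumption it uses a product of equal-mass (quantile) bins on two scalar variables, and the names `quantile_bins`, `empirical_mi`, `independence_test`, and the parameters `k` and `threshold` are hypothetical, introduced only for this example.

```python
import numpy as np


def quantile_bins(v, k):
    """Assign each sample to one of k roughly equal-mass bins.

    The bin edges are empirical quantiles, so the partition depends on the data.
    """
    edges = np.quantile(v, np.linspace(0.0, 1.0, k + 1)[1:-1])
    return np.searchsorted(edges, v, side="right")


def empirical_mi(x, y, k):
    """Plug-in mutual information (in nats) on a k-by-k product partition.

    This equals the empirical log-likelihood ratio of the joint cell
    frequencies against the product of their marginals, divided by n.
    """
    n = len(x)
    bx = quantile_bins(x, k)
    by = quantile_bins(y, k)
    joint = np.zeros((k, k))
    np.add.at(joint, (bx, by), 1.0)  # count samples per cell
    joint /= n
    px = joint.sum(axis=1, keepdims=True)  # marginal frequencies of x bins
    py = joint.sum(axis=0, keepdims=True)  # marginal frequencies of y bins
    mask = joint > 0  # empty cells contribute zero
    return float(np.sum(joint[mask] * np.log(joint[mask] / (px @ py)[mask])))


def independence_test(x, y, k, threshold):
    """Return True when independence is rejected (statistic above threshold).

    The threshold is a hypothetical tuning parameter for this sketch.
    """
    return empirical_mi(x, y, k) > threshold
```

A possible usage is to draw `x` and `y` from a seeded `numpy.random.default_rng`, once independently and once with `y = x + noise`, and compare the two statistics; the dependent pair should yield a clearly larger value. In this sketch the bin count `k` and the threshold are fixed by hand, whereas the abstract refers to conditions on the partition parameters that make the test strongly consistent.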