Book chapter
HMM-Based Speech Synthesis Enhancement with Hybrid Postfilters
Date
2018
Registered in:
978-1-53613-842-9
Author
Coto Jiménez, Marvin
Goddard Close, John
Institution
Abstract
In this chapter, we introduce hybrid postfilters into speech synthesis,
with the objective of enhancing the quality of the synthesized speech.
Our approach combines a Wiener filter with deep neural networks. Several
attempts to enhance synthetic speech have relied on single-stage
deep-learning-based postfilters, which learn a mapping from the
synthetic speech parameters to the natural ones. We have measured
low-level noise components in the synthetic speech produced by
statistical methods, so a common single-stage postfilter must both
reduce that noise and model the complex relationship between the
parameters of the synthetic and the natural speech. We therefore
consider a two-stage approach. In the first stage, the Wiener filter
deals with the noise components of the synthetic speech. In the second
stage, a set of multi-stream postfilters, comprising a collection of
autoencoders and auto-associative networks, models the relationship
between the output of the Wiener filter and the natural speech. Results
show that the hybrid approach improves on a single-stage approach in
most cases.
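
The two-stage idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `wiener_1d` and `hybrid_postfilter`, the window size, and the noise estimate (the average local variance) are all assumptions made for illustration, and the second stage is a placeholder callable standing in for the trained autoencoders and auto-associative networks, which are not reproduced here.

```python
from statistics import fmean
from typing import Callable, List, Sequence


def wiener_1d(x: Sequence[float], window: int = 5) -> List[float]:
    """Local-statistics Wiener filter over one speech-parameter trajectory.

    Hypothetical helper: the window size and noise estimate are assumptions.
    """
    half = window // 2
    n = len(x)
    means, variances = [], []
    for i in range(n):
        # Window around frame i, clipped at the ends of the trajectory.
        seg = list(x[max(0, i - half): min(n, i + half + 1)])
        m = fmean(seg)
        means.append(m)
        variances.append(fmean([(v - m) ** 2 for v in seg]))
    # Assumption: noise power is estimated as the average local variance.
    noise = fmean(variances)
    out = []
    for xi, m, var in zip(x, means, variances):
        if var <= noise:
            # Region dominated by noise: fall back to the local mean.
            out.append(m)
        else:
            # Keep the part of the deviation attributed to the signal.
            out.append(m + (var - noise) / var * (xi - m))
    return out


def hybrid_postfilter(
    x: Sequence[float],
    second_stage: Callable[[List[float]], List[float]] = lambda t: list(t),
    window: int = 5,
) -> List[float]:
    """Stage 1: Wiener denoising. Stage 2: placeholder for a learned mapping."""
    denoised = wiener_1d(x, window)
    # In the chapter, this stage is a set of trained neural postfilters;
    # here it is any callable, and the default is the identity.
    return second_stage(denoised)
```

In this sketch the second stage receives the Wiener output rather than the raw synthetic parameters, so a learned mapping placed there would only need to model the relationship to natural speech, which is the division of labor the abstract motivates.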