dc.contributor | Manrique Piramanrique, Rubén Francisco | |
dc.contributor | Núñez Castro, Haydemar María | |
dc.contributor | Pacheco Páramo, Diego Felipe | |
dc.contributor | FLAG | |
dc.creator | Tamayo Flórez, Pablo Andrés | |
dc.date.accessioned | 2023-01-30T15:31:59Z | |
dc.date.accessioned | 2023-09-06T23:20:26Z | |
dc.date.available | 2023-01-30T15:31:59Z | |
dc.date.available | 2023-09-06T23:20:26Z | |
dc.date.created | 2023-01-30T15:31:59Z | |
dc.date.issued | 2022-11-23 | |
dc.identifier | http://hdl.handle.net/1992/64313 | |
dc.identifier | instname:Universidad de los Andes | |
dc.identifier | reponame:Repositorio Institucional Séneca | |
dc.identifier | repourl:https://repositorio.uniandes.edu.co/ | |
dc.identifier.uri | https://repositorioslatinoamericanos.uchile.cl/handle/2250/8726422 | |
dc.description.abstract | In this work, a voice anti-spoofing dataset was built from Latin American Spanish samples using voice conversion and text-to-speech algorithms. Anti-spoofing models trained on English samples were then tested on it to observe how they behave on a language other than the one they were trained on. | |
dc.language | eng | |
dc.publisher | Universidad de los Andes | |
dc.publisher | Maestría en Ingeniería de Sistemas y Computación | |
dc.publisher | Facultad de Ingeniería | |
dc.publisher | Departamento de Ingeniería Sistemas y Computación | |
dc.relation | M. Dua, C. Jain, and S. Kumar, "Lstm and cnn based ensemble approach for spoof detection task in automatic speaker verification systems," Journal of Ambient Intelligence and Humanized Computing, 2021. | |
dc.relation | H. Dinkel, Y. Qian, and K. Yu, "Investigating raw wave deep neural networks for end-to-end speaker spoofing detection," IEEE/ACM Transactions on Audio Speech and Language Processing, vol. 26, pp. 2002-2014, 2018. | |
dc.relation | Q. Fu, Z. Teng, J. White, M. Powell, and D. C. Schmidt, "Fastaudio: A learnable audio front-end for spoof speech detection," 2021. | |
dc.relation | T. Arif, A. Javed, M. Alhameed, F. Jeribi, and A. Tahir, "Voice spoofing countermeasure for logical access attacks detection," IEEE Access, vol. 9, pp. 162857-162868, 2021. | |
dc.relation | Y. Xie, Z. Zhang, and Y. Yang, "Siamese network with wav2vec feature for spoofing speech detection," Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, vol. 6, pp. 4700-4704, 2021. | |
dc.relation | A. Gomez-Alanis, A. M. Peinado, J. A. Gonzalez, and A. M. Gomez, "A gated recurrent convolutional neural network for robust spoofing detection," IEEE/ACM Transactions on Audio Speech and Language Processing, vol. 27, pp. 1985-1999, 2019. | |
dc.relation | X. Wang and J. Yamagishi, "chapter-a practical guide to logical access voice presentation attack detection," 2022. | |
dc.relation | A. Consortium, "Asvspoof 2019 evaluation plan," vol. 4, pp. 1-19, 2019. | |
dc.relation | Z. Wu, T. Kinnunen, N. Evans, J. Yamagishi, C. Hanilçi, M. Sahidullah, and A. Sizov, "Asvspoof 2015: The first automatic speaker verification spoofing and countermeasures challenge," Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, vol. 2015-Janua, pp. 2037-2041, 2015. | |
dc.relation | R. Reimao and V. Tzerpos, "For: A dataset for synthetic speech detection," 2019 10th International Conference on Speech Technology and Human-Computer Dialogue, SpeD 2019, 2019. | |
dc.relation | J. Schepens, T. Dijkstra, F. Grootjen, and W. van Heuven, "Cross-language distributions of high frequency and phonetically similar cognates," PloS one, vol. 8, p. e63006, 05 2013. | |
dc.relation | A. Guevara-Rukoz, I. Demirsahin, F. He, S. H. C. Chu, S. Sarin, K. Pipatsrisawat, A. Gutkin, A. Butryna, and O. Kjartansson, "Crowdsourcing latin american spanish for low-resource text-to-speech," LREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings, pp. 6504-6513, 2020. | |
dc.relation | J. Lorenzo-Trueba, J. Yamagishi, T. Toda, D. Saito, F. Villavicencio, T. Kinnunen, and Z. Ling, "The voice conversion challenge 2018: Promoting development of parallel and nonparallel methods," pp. 195-202, 2018. | |
dc.relation | S. S. Sribhashyam, M. S. Salekin, D. Goldgof, G. Zamzmi, M. Last, and Y. Sun, "Pattern recognition in vital signs using spectrograms," Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics, pp. 1133-1138, 2021. | |
dc.relation | L. Wyse, "Audio spectrogram representations for processing with convolutional neural networks," vol. 1, pp. 37-41, 2017. | |
dc.relation | Y. Jia, X. Chen, J. Yu, L. Wang, Y. Xu, S. Liu, and Y. Wang, "Speaker recognition based on characteristic spectrograms and an improved self-organizing feature map neural network," Complex and Intelligent Systems, vol. 7, pp. 1749-1757, 8 2021. | |
dc.relation | M. Zhang, X. Wang, F. Fang, H. Li, and J. Yamagishi, "Joint training framework for text-to-speech and voice conversion using multi-source tacotron and wavenet," Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, vol. 2019-Septe, pp. 1298-1302, 2019. | |
dc.relation | S. Russell and P. Norvig, "Artificial neural networks," 2016. | |
dc.relation | F. Chollet, "What is deep learning?," 2021. | |
dc.relation | J. Krohn, G. Beyleveld, and A. Bassens, "Generative adversarial networks," 2019. | |
dc.relation | J. Krohn, G. Beyleveld, and A. Bassens, Deep Learning Illustrated: A Visual, Interactive Guide to Artificial Intelligence. Addison-Wesley Professional, 1st ed., 2019. | |
dc.relation | J. Ho, A. Jain, and P. Abbeel, "Denoising diffusion probabilistic models," 2020. | |
dc.relation | K. S. Rao and M. K. E, Speech Recognition Using Articulatory and Excitation Source Features. Springer International Publishing, 2017. | |
dc.relation | B. Markovic, J. Galic, and M. Mijić, "Application of teager energy operator on linear and mel scales for whispered speech recognition," Archives of Acoustics, vol. 43, 01 2018. | |
dc.relation | Z. Wu, P. L. D. Leon, C. Demiroglu, A. Khodabakhsh, S. King, Z. H. Ling, D. Saito, B. Stewart, T. Toda, M. Wester, and J. Yamagishi, "Anti-spoofing for text-independent speaker verification: An initial database, comparison of countermeasures, and human performance," IEEE/ACM Transactions on Audio Speech and Language Processing, vol. 24, pp. 768-783, 2016. | |
dc.relation | J. Yamagishi, C. Veaux, and K. MacDonald, "Cstr vctk corpus: English multi-speaker corpus for cstr voice cloning toolkit (version 0.92)," 2019. | |
dc.relation | H. Tak, M. Todisco, X. Wang, J.-W. Jung, J. Yamagishi, and N. Evans, "Automatic speaker verification spoofing and deepfake detection using wav2vec 2.0 and data augmentation," | |
dc.relation | Z. Zhang, X. Yi, and X. Zhao, "Fake speech detection using residual network with transformer encoder," IH and MMSec 2021 - Proceedings of the 2021 ACM Workshop on Information Hiding and Multimedia Security, pp. 13-22, 2021. | |
dc.relation | X. Wang and J. Yamagishi, "A comparative study on recent neural spoofing countermeasures for synthetic speech detection," Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, vol. 6, pp. 4685-4689, 2021. | |
dc.relation | X. Wang and J. Yamagishi, "Investigating self-supervised front ends for speech spoofing countermeasures," 2021. | |
dc.relation | Y. Zhang, F. Jiang, and Z. Duan, "One-class learning towards synthetic voice spoofing detection," IEEE Signal Processing Letters, vol. 28, pp. 937-941, 2021. | |
dc.relation | Y. Ma, Z. Ren, and S. Xu, "Rw-resnet: A novel speech anti-spoofing model using raw waveform," Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, vol. 5, pp. 3696-3700, 2021. | |
dc.relation | Y. Wang, M. Zhang, and Z. Zhu, "Detection of voice transformation disguise based on deep residual net," PervasiveHealth: Pervasive Computing Technologies for Healthcare, pp. 126-130, 2020. | |
dc.relation | A. Cohen, I. Rimon, E. Aflalo, and H. Permuter, "A study on data augmentation in voice anti-spoofing," 2021. | |
dc.relation | T. Kaneko and H. Kameoka, "Cyclegan-vc: Non-parallel voice conversion using cycle-consistent adversarial networks," European Signal Processing Conference, vol. 2018-Septe, pp. 2100-2104, 2018. | |
dc.relation | J. C. Chou and H. Y. Lee, "One-shot voice conversion by separating speaker and content representations with instance normalization," Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, vol. 2019-Septe, pp. 664-668, 2019. | |
dc.relation | C. C. Lo, S. W. Fu, W. C. Huang, X. Wang, J. Yamagishi, Y. Tsao, and H. M. Wang, "Mosnet: Deep learning-based objective assessment for voice conversion," Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, vol. 2019-Septe, pp. 1541-1545, 2019. | |
dc.relation | K. Qian, Y. Zhang, S. Chang, X. Yang, and M. Hasegawa-Johnson, "Autovc: Zero-shot voice style transfer with only autoencoder loss," 2019. | |
dc.relation | H. Kameoka, T. Kaneko, K. Tanaka, and N. Hojo, "Stargan-vc: Nonparallel many-to-many voice conversion using star generative adversarial networks," 2018 IEEE Spoken Language Technology Workshop, SLT 2018 - Proceedings, pp. 266-273, 2019. | |
dc.relation | W. Ping, K. Peng, A. Gibiansky, S. Ark, A. Kannan, S. Narang, J. Raiman, and J. Miller, "Deep voice 3: Scaling text-to-speech with convolutional sequence learning," 6th International Conference on Learning Representations, ICLR 2018 - Conference Track Proceedings, pp. 1-16, 2018. | |
dc.relation | V. Popov, I. Vovk, V. Gogoryan, T. Sadekova, M. Kudinov, and J. Wei, "Diffusion-based voice conversion with fast maximum likelihood sampling scheme," 9 2021. | |
dc.relation | S. Liu, Y. Cao, D. Su, and H. Meng, "Diffsvc: A diffusion probabilistic model for singing voice conversion," 5 2021. | |
dc.relation | K. Akuzawa, K. Onishi, K. Takiguchi, K. Mametani, and K. Mori, "Conditional deep hierarchical variational autoencoder for voice conversion," 12 2021. | |
dc.relation | O. Ronneberger, P. Fischer, and T. Brox, "U-net: Convolutional networks for biomedical image segmentation," 5 2015. | |
dc.relation | X. Zhang, R. Zhao, J. Yan, M. Gao, Y. Qiao, X. Wang, and H. Li, "P2sgrad: Refined gradients for optimizing deep face models," 2019. | |
dc.relation | G. Lavrentyeva, S. Novoselov, A. Tseren, M. Volkova, A. Gorlanov, and A. Kozlov, "Stc antispoofing systems for the asvspoof2019 challenge," vol. 2019-September, pp. 1033-1037, International Speech Communication Association, 2019. | |
dc.relation | A. Kashkin, I. Karpukhin, and S. Shishkin, "Hifi-vc: High quality asr-based voice conversion," 3 2022. | |
dc.rights | Attribution-ShareAlike 4.0 International | |
dc.rights | http://creativecommons.org/licenses/by-sa/4.0/ | |
dc.rights | info:eu-repo/semantics/openAccess | |
dc.rights | http://purl.org/coar/access_right/c_abf2 | |
dc.title | Voice anti-spoofing data-set built from Latin American Spanish accents implementing voice conversion and text-to-speech techniques | |
dc.type | Master's thesis | |