dc.contributorManrique Piramanrique, Rubén Francisco
dc.contributorNúñez Castro, Haydemar María
dc.contributorPacheco Páramo, Diego Felipe
dc.contributorFLAG
dc.creatorTamayo Flórez, Pablo Andrés
dc.date.accessioned2023-01-30T15:31:59Z
dc.date.accessioned2023-09-06T23:20:26Z
dc.date.available2023-01-30T15:31:59Z
dc.date.available2023-09-06T23:20:26Z
dc.date.created2023-01-30T15:31:59Z
dc.date.issued2022-11-23
dc.identifierhttp://hdl.handle.net/1992/64313
dc.identifierinstname:Universidad de los Andes
dc.identifierreponame:Repositorio Institucional Séneca
dc.identifierrepourl:https://repositorio.uniandes.edu.co/
dc.identifier.urihttps://repositorioslatinoamericanos.uchile.cl/handle/2250/8726422
dc.description.abstractIn this work, a voice anti-spoofing dataset was built from Latin American Spanish speech samples using voice conversion and text-to-speech algorithms. Anti-spoofing models trained on English samples were then tested on it to observe how they behave on a language other than the one they were trained on.
dc.languageeng
dc.publisherUniversidad de los Andes
dc.publisherMaestría en Ingeniería de Sistemas y Computación
dc.publisherFacultad de Ingeniería
dc.publisherDepartamento de Ingeniería Sistemas y Computación
dc.relationM. Dua, C. Jain, and S. Kumar, "Lstm and cnn based ensemble approach for spoof detection task in automatic speaker verification systems," Journal of Ambient Intelligence and Humanized Computing, 2021.
dc.relationH. Dinkel, Y. Qian, and K. Yu, "Investigating raw wave deep neural networks for end-to-end speaker spoofing detection," IEEE/ACM Transactions on Audio Speech and Language Processing, vol. 26, pp. 2002-2014, 2018.
dc.relationQ. Fu, Z. Teng, J. White, M. Powell, and D. C. Schmidt, "Fastaudio: A learnable audio front-end for spoof speech detection," 2021.
dc.relationT. Arif, A. Javed, M. Alhameed, F. Jeribi, and A. Tahir, "Voice spoofing countermeasure for logical access attacks detection," IEEE Access, vol. 9, pp. 162857-162868, 2021.
dc.relationY. Xie, Z. Zhang, and Y. Yang, "Siamese network with wav2vec feature for spoofing speech detection," Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, vol. 6, pp. 4700-4704, 2021.
dc.relationA. Gomez-Alanis, A. M. Peinado, J. A. Gonzalez, and A. M. Gomez, "A gated recurrent convolutional neural network for robust spoofing detection," IEEE/ACM Transactions on Audio Speech and Language Processing, vol. 27, pp. 1985-1999, 2019.
dc.relationX. Wang and J. Yamagishi, "A practical guide to logical access voice presentation attack detection," 2022.
dc.relationA. Consortium, "Asvspoof 2019 evaluation plan," vol. 4, pp. 1-19, 2019.
dc.relationZ. Wu, T. Kinnunen, N. Evans, J. Yamagishi, C. Hanilçi, M. Sahidullah, and A. Sizov, "Asvspoof 2015: The first automatic speaker verification spoofing and countermeasures challenge," Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, vol. 2015-Janua, pp. 2037-2041, 2015.
dc.relationR. Reimao and V. Tzerpos, "For: A dataset for synthetic speech detection," 2019 10th International Conference on Speech Technology and Human-Computer Dialogue, SpeD 2019, 2019.
dc.relationJ. Schepens, T. Dijkstra, F. Grootjen, and W. van Heuven, "Cross-language distributions of high frequency and phonetically similar cognates," PloS one, vol. 8, p. e63006, 05 2013.
dc.relationA. Guevara-Rukoz, I. Demirsahin, F. He, S. H. C. Chu, S. Sarin, K. Pipatsrisawat, A. Gutkin, A. Butryna, and O. Kjartansson, "Crowdsourcing latin american spanish for low-resource text-to-speech," LREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings, pp. 6504-6513, 2020.
dc.relationJ. Lorenzo-Trueba, J. Yamagishi, T. Toda, D. Saito, F. Villavicencio, T. Kinnunen, and Z. Ling, "The voice conversion challenge 2018: Promoting development of parallel and nonparallel methods," pp. 195-202, 2018.
dc.relationS. S. Sribhashyam, M. S. Salekin, D. Goldgof, G. Zamzmi, M. Last, and Y. Sun, "Pattern recognition in vital signs using spectrograms," Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics, pp. 1133-1138, 2021.
dc.relationL. Wyse, "Audio spectrogram representations for processing with convolutional neural networks," vol. 1, pp. 37-41, 2017.
dc.relationY. Jia, X. Chen, J. Yu, L. Wang, Y. Xu, S. Liu, and Y. Wang, "Speaker recognition based on characteristic spectrograms and an improved self-organizing feature map neural network," Complex and Intelligent Systems, vol. 7, pp. 1749-1757, 8 2021.
dc.relationM. Zhang, X. Wang, F. Fang, H. Li, and J. Yamagishi, "Joint training framework for text-to-speech and voice conversion using multi-source tacotron and wavenet," Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, vol. 2019-Septe, pp. 1298-1302, 2019.
dc.relationS. Russell and P. Norvig, "Artificial neural networks," 2016.
dc.relationF. Chollet, "What is deep learning?" 2021.
dc.relationJ. Krohn, G. Beyleveld, and A. Bassens, "Generative adversarial networks," 2019.
dc.relationJ. Krohn, G. Beyleveld, and A. Bassens, Deep Learning Illustrated: A Visual, Interactive Guide to Artificial Intelligence. Addison-Wesley Professional, 1st ed., 2019.
dc.relationJ. Ho, A. Jain, and P. Abbeel, "Denoising diffusion probabilistic models," 2020.
dc.relationK. S. Rao and M. K. E, Speech Recognition Using Articulatory and Excitation Source Features. Springer International Publishing, 2017.
dc.relationB. Markovic, J. Galic, and M. Mijić, "Application of teager energy operator on linear and mel scales for whispered speech recognition," Archives of Acoustics, vol. 43, 01 2018.
dc.relationZ. Wu, P. L. D. Leon, C. Demiroglu, A. Khodabakhsh, S. King, Z. H. Ling, D. Saito, B. Stewart, T. Toda, M. Wester, and J. Yamagishi, "Anti-spoofing for text-independent speaker verification: An initial database, comparison of countermeasures, and human performance," IEEE/ACM Transactions on Audio Speech and Language Processing, vol. 24, pp. 768-783, 2016.
dc.relationJ. Yamagishi, C. Veaux, and K. MacDonald, "Cstr vctk corpus: English multi-speaker corpus for cstr voice cloning toolkit (version 0.92)," 2019.
dc.relationH. Tak, M. Todisco, X. Wang, J.-W. Jung, J. Yamagishi, and N. Evans, "Automatic speaker verification spoofing and deepfake detection using wav2vec 2.0 and data augmentation,"
dc.relationZ. Zhang, X. Yi, and X. Zhao, "Fake speech detection using residual network with transformer encoder," IH and MMSec 2021 - Proceedings of the 2021 ACM Workshop on Information Hiding and Multimedia Security, pp. 13-22, 2021.
dc.relationX. Wang and J. Yamagishi, "A comparative study on recent neural spoofing countermeasures for synthetic speech detection," Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, vol. 6, pp. 4685-4689, 2021.
dc.relationX. Wang and J. Yamagishi, "Investigating self-supervised front ends for speech spoofing countermeasures," 2021.
dc.relationY. Zhang, F. Jiang, and Z. Duan, "One-class learning towards synthetic voice spoofing detection," IEEE Signal Processing Letters, vol. 28, pp. 937-941, 2021.
dc.relationY. Ma, Z. Ren, and S. Xu, "Rw-resnet: A novel speech anti-spoofing model using raw waveform," Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, vol. 5, pp. 3696-3700, 2021.
dc.relationY. Wang, M. Zhang, and Z. Zhu, "Detection of voice transformation disguise based on deep residual net," PervasiveHealth: Pervasive Computing Technologies for Healthcare, pp. 126-130, 2020.
dc.relationA. Cohen, I. Rimon, E. Aflalo, and H. Permuter, "A study on data augmentation in voice anti-spoofing," 2021.
dc.relationT. Kaneko and H. Kameoka, "Cyclegan-vc: Non-parallel voice conversion using cycle-consistent adversarial networks," European Signal Processing Conference, vol. 2018-Septe, pp. 2100-2104, 2018.
dc.relationJ. C. Chou and H. Y. Lee, "One-shot voice conversion by separating speaker and content representations with instance normalization," Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, vol. 2019-Septe, pp. 664-668, 2019.
dc.relationC. C. Lo, S. W. Fu, W. C. Huang, X. Wang, J. Yamagishi, Y. Tsao, and H. M. Wang, "Mosnet: Deep learning-based objective assessment for voice conversion," Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, vol. 2019-Septe, pp. 1541-1545, 2019.
dc.relationK. Qian, Y. Zhang, S. Chang, X. Yang, and M. Hasegawa-Johnson, "Autovc: Zero-shot voice style transfer with only autoencoder loss," 2019.
dc.relationH. Kameoka, T. Kaneko, K. Tanaka, and N. Hojo, "Stargan-vc: Nonparallel many-to-many voice conversion using star generative adversarial networks," 2018 IEEE Spoken Language Technology Workshop, SLT 2018 - Proceedings, pp. 266-273, 2019.
dc.relationW. Ping, K. Peng, A. Gibiansky, S. Ark, A. Kannan, S. Narang, J. Raiman, and J. Miller, "Deep voice 3: Scaling text-to-speech with convolutional sequence learning," 6th International Conference on Learning Representations, ICLR 2018 - Conference Track Proceedings, pp. 1-16, 2018.
dc.relationV. Popov, I. Vovk, V. Gogoryan, T. Sadekova, M. Kudinov, and J. Wei, "Diffusion-based voice conversion with fast maximum likelihood sampling scheme," 9 2021.
dc.relationS. Liu, Y. Cao, D. Su, and H. Meng, "Diffsvc: A diffusion probabilistic model for singing voice conversion," 5 2021.
dc.relationK. Akuzawa, K. Onishi, K. Takiguchi, K. Mametani, and K. Mori, "Conditional deep hierarchical variational autoencoder for voice conversion," 12 2021.
dc.relationO. Ronneberger, P. Fischer, and T. Brox, "U-net: Convolutional networks for biomedical image segmentation," 5 2015.
dc.relationX. Zhang, R. Zhao, J. Yan, M. Gao, Y. Qiao, X. Wang, and H. Li, "P2sgrad: Refined gradients for optimizing deep face models," 2019.
dc.relationG. Lavrentyeva, S. Novoselov, A. Tseren, M. Volkova, A. Gorlanov, and A. Kozlov, "Stc antispoofing systems for the asvspoof2019 challenge," vol. 2019-September, pp. 1033-1037, International Speech Communication Association, 2019.
dc.relationA. Kashkin, I. Karpukhin, and S. Shishkin, "Hifi-vc: High quality asr-based voice conversion," 3 2022.
dc.rightsAttribution-ShareAlike 4.0 International
dc.rightshttp://creativecommons.org/licenses/by-sa/4.0/
dc.rightsinfo:eu-repo/semantics/openAccess
dc.rightshttp://purl.org/coar/access_right/c_abf2
dc.titleVoice anti-spoofing data-set built from Latin American Spanish accents implementing voice conversion and text-to-speech techniques
dc.typeMaster's thesis

