Investigating wav2vec2 context representations and the effects of fine-tuning, a case-study of a Finnish model

Using Text Injection to Improve Recognition of Personal Identifiers in Speech
Yochai Blau, Rohan Agrawal, Lior Madmony, Gary Wang, Andrew Rosenberg, Zhehuai Chen, Zorik Gekhman, Genady Beryozkin, Parisa Haghani, Bhuvana Ramabhadran

Speech Recognition: Signal Processing, Acoustic Modeling, Robustness, Adaptation 1

Comparing Self-Supervised Pre-Training and Semi-Supervised Training for Speech Recognition in Languages with Weak Language Models
Léa-Marie Lam-Yee-Mui, Lucas Ondel Yang, Ondřej Klejch

MT4SSL: Boosting Self-Supervised Speech Representation Learning by Integrating Multiple Targets
Ziyang Ma, Zhisheng Zheng, Changli Tang, Yujin Wang, Xie Chen

O-1: Self-training with Oracle and 1-best Hypothesis
Murali Karthick Baskar, Andrew Rosenberg, Bhuvana Ramabhadran, Kartik Audhkhasi

Dual Acoustic Linguistic Self-supervised Representation Learning for Cross-Domain Speech Recognition
Zhao Yang, Dianwen Ng, Chong Zhang, Xiao Fu, Rui Jiang, Wei Xi, Yukun Ma, Chongjia Ni, Eng Siong Chng, Bin Ma, Jizhong Zhao

Automatic Data Augmentation for Domain Adapted Fine-Tuning of Self-Supervised Speech Representations
Salah Zaiem, Titouan Parcollet, Slim Essid

DPHuBERT: Joint Distillation and Pruning of Self-Supervised Speech Models
Yifan Peng, Yui Sudo, Shakeel Muhammad, Shinji Watanabe

Comparing normalizing flows and diffusion models for prosody and acoustic modelling in text-to-speech
Guangyan Zhang, Thomas Merritt, Sam Ribeiro, Biel Tura-Vecino, Kayoko Yanagisawa, Kamil Pokora, Abdelhamid Ezzerg, Sebastian Cygert, Ammar Abbas, Piotr Bilinski, Roberto Barra-Chicote, Daniel Korzekwa, Jaime Lorenzo-Trueba

Explicit Intensity Control for Accented Text-to-speech
Rui Liu, Haolin Zuo, De Hu, Guanglai Gao, Haizhou Li

Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus
Detai Xin, Shinnosuke Takamichi, Ai Morimatsu, Hiroshi Saruwatari

EmoMix: Emotion Mixing via Diffusion Models for Emotional Speech Synthesis
Haobin Tang, Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao

Speech Synthesis with Self-Supervisedly Learnt Prosodic Representations
Zhao-Ci Liu, Zhen-Hua Ling, Ya-Jun Hu, Jia Pan, Jin-Wei Wang, Yun-Di Wu

Emotional Talking Head Generation based on Memory-Sharing and Attention-Augmented Networks
Jianrong Wang, Yaxin Zhao, Li Liu, Tianyi Xu, Qi Li, Sen Li