Text to Speech Fine-tuning Tutorial
🎙️ Custom voice AI (ASR & TTS) for enterprise (you own the model): trelis.com/voice-ai

➡️ Get Life-time Access to the Trelis Scripts (and future improvements): https://Trelis.com/ADVANCED-transcription

🗝️ Get Trelis All Access (Trelis.com/All-Access)
1. Access all SEVEN Trelis Github Repos (-robotics, -vision, -evals, -fine-tuning, -inference, -voice, -time-series)
2. Support via Github Issues & Trelis’ Private Discord
3. Early access to Trelis videos via Discord

➡️ Trelis Runpod Affiliate Link (supports the channel): https://runpod.io?ref=jmfkcdio
➡️ Newsletter: https://blog.Trelis.com
➡️ Trelis Resources/Support/Discord: https://Trelis.com/About

CREDIT: Rohan Sharma for his contribution to this video.

🤝 Are you a talented developer? Work for Trelis: https://trelis.com/jobs/

VIDEO RESOURCES:
- Original StyleTTS2 Fine-tuning Scripts: https://github.com/yl4579/StyleTTS2/
- Slides: https://docs.google.com/presentation/d/16y-fygVOC45Bb9LBvEENQyTC4AAu_sNXMzOPUiddytM/edit?usp=sharing
- Speech-to-Text Fine-tuning Video: https://www.youtube.com/watch?v=anplUNnkM68

TIMESTAMPS:
0:00 Voice-cloning and fine-tuning text-to-speech models
0:40 Video Overview
1:36 Understanding text to speech models
6:29 Text to speech Transformers
8:14 Diffusion networks for text to speech
11:59 Generative Adversarial Networks for Text to Speech
15:05 Controlling style in text to speech models
18:34 StyleTTS2 Text to Speech
23:25 Voice cloning versus fine-tuning
25:44 Dataset preparation tips for voice cloning
28:11 Materials, Code, Scripts
30:06 Dataset preparation for StyleTTS fine-tuning in Colab
46:17 Fine-tuning StyleTTS2 in a Jupyter Notebook
1:00:06 Text to speech inference and performance
1:10:45 Understanding losses
1:11:18 Voice Cloning performance without fine-tuning
1:13:31 Dataset and Fine-tuning tips
1:15:00 Trelis Internships