This is Video 19 of our series on prosody. Systems for many tasks, such as speech recognition, emotion recognition, and intent recognition, are built today by applying machine learning methods to large datasets. The input to these models is usually a set of features computed from the audio signal. In this lecture we discuss the advantages and disadvantages of various types of prosodic features for different purposes.
00:00 Machine Learning Needs Features
00:49 Different Types of Features
01:13 Using Meaningful Features
02:00 Using Midlevel Features
02:58 Using Frame-Level (Low-Level) Features
03:55 Using Filterbank and other Generic Features
04:47 Using Features from Pretrained Models
06:41 Feature Set Choices for Common Tasks
07:10 Summary
Prosody Tutorial: Lecture 19: Features for Machine Learning