
Compressing Neural Networks for Embedded AI: Pruning, Projection, and Quantization

Aug 13, 2025
20:42

This Tech Talk explores how to compress neural network models so they can run efficiently on embedded systems without sacrificing accuracy. Many neural networks are overparameterized, meaning they contain more weights and structure than necessary. This excess can be systematically reduced through three powerful techniques: pruning, projection, and quantization. Using a hands-on MATLAB® example, you'll learn how to compress a trained model that classifies cracked pavement using acceleration data, achieving over 94% reduction in model size while maintaining high accuracy.

Example in video:
- Train Sequence Classification Network for Road Damage Detection Example: http://bit.ly/3UToDAe

Related Resources:
- MATLAB and Simulink for Embedded AI: http://bit.ly/47iRwxe

Get a free product trial: https://goo.gl/ZHFb5u
Learn more about MATLAB: https://goo.gl/8QV7ZZ
Learn more about Simulink: https://goo.gl/nqnbLe
See what's new in MATLAB and Simulink: https://goo.gl/pgGtod

© 2025 The MathWorks, Inc. MATLAB and Simulink are registered trademarks of The MathWorks, Inc. See www.mathworks.com/trademarks for a list of additional trademarks. Other product or brand names may be trademarks or registered trademarks of their respective holders.
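To give a rough feel for two of the techniques named above, here is a minimal NumPy sketch of magnitude pruning followed by per-tensor int8 quantization on a toy weight matrix. This is not the MATLAB workflow from the video; the sparsity level, matrix size, and byte accounting are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64)).astype(np.float32)  # toy fp32 weight matrix

# Pruning: zero out the 80% of weights with the smallest magnitudes
# (80% is an arbitrary sparsity target chosen for this sketch).
threshold = np.quantile(np.abs(W), 0.8)
W_pruned = np.where(np.abs(W) >= threshold, W, 0.0).astype(np.float32)

# Quantization: map the surviving fp32 weights to int8 with one scale
# for the whole tensor (per-tensor symmetric quantization).
scale = np.abs(W_pruned).max() / 127.0
W_int8 = np.round(W_pruned / scale).astype(np.int8)

# Rough storage comparison: dense fp32 vs int8 values for non-zeros only
# (ignores the index overhead a real sparse format would add).
dense_bytes = W.size * 4
sparse_int8_bytes = int((W_int8 != 0).sum())
print(f"dense fp32: {dense_bytes} B, pruned int8 values: {sparse_int8_bytes} B")
```

Projection, the third technique, replaces a weight matrix with a product of two lower-rank factors and is not shown in this sketch.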

