
Apply Second-Order Pruning Algorithms for SOTA Model Compression

Apr 12, 2023
41:41

Second-order pruning methods enable higher sparsity while maintaining accuracy by removing the weights that least affect the loss function. The result is a sparse model with a much smaller file size, lower latency, and higher throughput. For example, second-order pruning can sparsify a ResNet-50 image classification model to 95% while retaining 99% of the baseline accuracy, shrinking the file from the original 90.8 MB to 9.3 MB.

In this video, we walk through the research, production results, and intuition behind second-order pruning algorithms, and show how to apply them for SOTA model compression in your current ML projects.

Speaker: Eldar Kurtić, Research Consultant, Neural Magic

If you have any questions, join us in the Neural Magic Slack community: https://join.slack.com/t/discuss-neuralmagic/shared_invite/zt-q1a1cnvo-YBoICSIw3L1dmQpjBeDurQ
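To build intuition for "removing the weights that least affect the loss," here is a minimal NumPy sketch of a classic second-order saliency criterion in the Optimal Brain Damage style, assuming a diagonal Hessian approximated by the empirical Fisher (mean of squared per-sample gradients). This is an illustrative simplification, not the specific algorithm presented in the talk; the function names and the toy data are invented for the example.

```python
import numpy as np

def obd_saliencies(weights, hessian_diag):
    # OBD-style saliency: s_i = 0.5 * H_ii * w_i^2
    # (estimated increase in loss if weight i is set to zero)
    return 0.5 * hessian_diag * weights ** 2

def prune_by_saliency(weights, hessian_diag, sparsity):
    # Zero out the fraction `sparsity` of weights with the lowest saliency.
    s = obd_saliencies(weights, hessian_diag)
    k = int(sparsity * weights.size)      # number of weights to remove
    idx = np.argsort(s)[:k]               # indices of lowest-saliency weights
    mask = np.ones_like(weights, dtype=bool)
    mask[idx] = False
    return weights * mask, mask

# Toy example: 10 weights, Fisher-style diagonal Hessian estimate.
rng = np.random.default_rng(0)
w = rng.normal(size=10)
grads = rng.normal(size=(32, 10))         # hypothetical per-sample gradients
h_diag = (grads ** 2).mean(axis=0)        # empirical Fisher diagonal
pruned, mask = prune_by_saliency(w, h_diag, sparsity=0.9)
print(f"kept {mask.sum()} of {w.size} weights")
```

Contrast this with magnitude pruning, which ranks by |w_i| alone: the second-order term H_ii lets a small weight survive if the loss is very sensitive to it, which is what allows the higher sparsity levels at a given accuracy.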

