The state of sparsity in deep neural networks

Trevor Gale, Erich Elsen, Sara Hooker
[arXiv] [Google Scholar] [DBLP] [Citeseer]
Read: 04 October 2021

arXiv 1902.09574 cs.LG
2019
Note(s): neural network, sparse model, google

About techniques for learning sparse models. There are three broad routes: train a dense model and then make it sparse, modify how the initial model is trained so that sparsity emerges during training, or use a sparse architecture for the model from the start.

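They reimplement three techniques (variational dropout, l0 regularization, and magnitude pruning) and run a large-scale comparison of them at different sparsity levels, on a Transformer (WMT 2014 English-to-German) and ResNet-50 (ImageNet).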

The main takeaway seems to be that magnitude pruning is the winner: the simplest technique performs comparably to or better than the more complex ones. Also, some of the other techniques need a lot of “tuning hyperparameters” to work well.
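
For my own reference, a minimal sketch of what magnitude pruning does to a single weight tensor: zero out the fraction of weights with the smallest absolute value and keep a mask of the survivors. This is a one-shot simplification in NumPy, not the paper's implementation, and it leaves out any training-time pruning schedule. The function name `magnitude_prune` and its `sparsity` parameter are my own choices for illustration.

```python
import numpy as np


def magnitude_prune(weights, sparsity):
    """Zero the `sparsity` fraction of entries with the smallest |w|.

    Returns (pruned_weights, mask), where mask is True for kept weights.
    Hypothetical helper for illustration; not taken from the paper's code.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")

    k = int(round(sparsity * weights.size))  # number of weights to drop
    mask = np.ones(weights.size, dtype=bool)
    if k > 0:
        magnitudes = np.abs(weights).ravel()
        # argpartition places the k smallest magnitudes in the first k slots
        drop = np.argpartition(magnitudes, k - 1)[:k]
        mask[drop] = False
    mask = mask.reshape(weights.shape)
    return weights * mask, mask


w = np.array([[0.5, -0.1, 2.0],
              [0.05, -3.0, 0.2]])
pruned, mask = magnitude_prune(w, sparsity=0.5)
print(pruned)
print("kept", int(mask.sum()), "of", mask.size, "weights")
```

In a real training setup the mask would typically be reapplied after each weight update so that pruned weights stay at zero; that part is omitted here.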