Neural network

Notes:
Papers:

TensorFlow: Large-scale machine learning on heterogeneous distributed systems [abadi:arxiv:2016]
Fast sparse ConvNets [elsen:arxiv:2019]
Rigging the lottery: Making all tickets winners [evci:arxiv:2021]
Switch transformers: Scaling to trillion parameter models with simple and efficient sparsity [fedus:arxiv:2021]
The state of sparsity in deep neural networks [gale:arxiv:2019]
Sparse GPU kernels for deep learning [gale:arxiv:2020]
ExTensor: An accelerator for sparse tensor algebra [hedge:micro:2019]
In-datacenter performance analysis of a tensor processing unit [jouppi:isca:2017]
Motivation for and evaluation of the first tensor processing unit [jouppi:micro:2018]
The tensor algebra compiler [kjolstad:oopsla:2017]
GShard: Scaling giant models with conditional computation and automatic sharding [lepikhin:arxiv:2020]
SIGMA: A sparse and irregular GEMM accelerator with flexible interconnects for DNN training [qin:hpca:2020]
Outrageously large neural networks: The sparsely-gated mixture-of-experts layer [shazeer:arxiv:2017]
Attention is all you need [vaswani:arxiv:2017]

Notes related to Neural network