Gating network

Notes: mixture of experts
Papers:

Mixture of experts

  • Outrageously large neural networks: The sparsely-gated mixture-of-experts layer [shazeer:arxiv:2017]