A pair of neural networks, an encoder f and a decoder g, trained so that g(f(x)) approximately reconstructs x. The output of f (the code) usually has far fewer dimensions than its input, so f is forced to capture the most salient properties of x. Auto-encoders with this bottleneck are called “undercomplete” and are used for dimensionality reduction.
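
To make the bottleneck idea concrete, here is a minimal sketch of an undercomplete auto-encoder in plain Python. Everything in it is an illustrative assumption rather than part of the note: f and g are single linear maps with no nonlinearity, the input has 2 dimensions and the code has 1, the data lies on the line x2 = 2·x1, and the names `encode`, `decode`, `reconstruction_loss`, and `train` are hypothetical. Real auto-encoders are usually deeper, nonlinear, and built with a library that computes gradients automatically.

```python
import random


def encode(w, x):
    """f: R^2 -> R^1. Compress a 2-D point into a single number (the code)."""
    return w[0] * x[0] + w[1] * x[1]


def decode(v, z):
    """g: R^1 -> R^2. Expand the code back into a 2-D point."""
    return [v[0] * z, v[1] * z]


def reconstruction_loss(w, v, data):
    """Mean squared distance between each x and its reconstruction g(f(x))."""
    total = 0.0
    for x in data:
        r = decode(v, encode(w, x))
        total += (r[0] - x[0]) ** 2 + (r[1] - x[1]) ** 2
    return total / len(data)


def train(data, steps=2000, lr=0.01, seed=0):
    """Fit f and g jointly with stochastic gradient descent on squared error."""
    rng = random.Random(seed)
    w = [rng.uniform(-0.5, 0.5) for _ in range(2)]  # encoder weights
    v = [rng.uniform(-0.5, 0.5) for _ in range(2)]  # decoder weights
    for _ in range(steps):
        x = rng.choice(data)
        z = encode(w, x)
        r = decode(v, z)
        err = [r[0] - x[0], r[1] - x[1]]
        # Hand-derived gradients of L = sum_i (v_i * z - x_i)^2:
        #   dL/dv_i = 2 * err_i * z
        #   dL/dz   = 2 * sum_i err_i * v_i, and dL/dw_j = dL/dz * x_j
        dz = 2 * (err[0] * v[0] + err[1] * v[1])
        for i in range(2):
            v[i] -= lr * 2 * err[i] * z
            w[i] -= lr * dz * x[i]
    return w, v


# Toy data (assumed): 2-D points that all lie on the line x2 = 2 * x1,
# so one number per point is enough to describe them.
data = [[t / 10, 2 * t / 10] for t in range(-10, 11)]

w, v = train(data)
print("loss with all-zero weights:", reconstruction_loss([0.0, 0.0], [0.0, 0.0], data))
print("loss after training:", reconstruction_loss(w, v, data))
```

Because the toy data is one-dimensional at heart, a 1-D code is enough to reconstruct it almost exactly. On data that does not fit the bottleneck, the same setup would keep only the direction that explains the most variation and lose the rest.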

## Notes related to Auto-encoder model

Transformer machine learning model

## Papers related to Auto-encoder model

- Attention is all you need [vaswani:arxiv:2017]