Encoder-Decoder
An architecture in which an encoder compresses the input into an intermediate representation and a decoder generates the output from that representation. In the classic design the representation is a single fixed-length vector; with attention, the decoder can instead consult all of the encoder's states. This structure is used in translation, summarization, and image captioning.
Why It Matters
The encoder-decoder paradigm is foundational: it underlies seq2seq models, transformers, and most modern generation systems.
Example
A machine translation system where the encoder processes an English sentence into a meaning vector and the decoder generates the equivalent French sentence from that vector.
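The translation example can be sketched in a few lines of plain Python. This is a deliberately toy illustration: the word vectors and the nearest-neighbor "decoder" below are hypothetical stand-ins for mappings that a real system would learn with neural networks.

```python
# Toy encoder-decoder translation (hypothetical vectors, illustrative only).
# A real system learns both mappings; here they are hand-written.

# Hypothetical embeddings for a few English words.
EN_VECTORS = {
    "the": [0.1, 0.3],
    "cat": [0.9, 0.2],
    "sleeps": [0.4, 0.8],
}

# Hypothetical French vocabulary living in the same vector space.
FR_VECTORS = {
    "le": [0.1, 0.3],
    "chat": [0.9, 0.2],
    "dort": [0.4, 0.8],
}

def encode(sentence):
    """Encoder: map each English word to its vector (the representation)."""
    return [EN_VECTORS[w] for w in sentence.split()]

def decode(vectors):
    """Decoder: for each vector, emit the closest French word."""
    def closest(v):
        return min(
            FR_VECTORS,
            key=lambda w: sum((a - b) ** 2 for a, b in zip(FR_VECTORS[w], v)),
        )
    return " ".join(closest(v) for v in vectors)

print(decode(encode("the cat sleeps")))  # -> le chat dort
```

The point is the division of labor: the encoder turns input into vectors, and the decoder turns vectors back into output, with the vector space acting as the shared "meaning" representation.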
Think of it like...
Like a telegraph system — one operator encodes a message into Morse code (encoder), sends it over the wire, and another operator decodes it back into words (decoder).
Related Terms
Transformer
A neural network architecture introduced in 2017 that uses self-attention mechanisms to process sequential data in parallel rather than sequentially. Transformers are the foundation of modern LLMs like GPT, Claude, and Gemini.
Sequence-to-Sequence
A model architecture that transforms one sequence into another, where the input and output can be different lengths. It uses an encoder to process input and a decoder to generate output.
Attention Mechanism
A component in neural networks that allows the model to focus on the most relevant parts of the input when producing each part of the output. It assigns different weights to different input elements based on their relevance.
Autoencoder
A neural network that learns to compress data into a lower-dimensional representation (encoding) and then reconstruct it back (decoding). It learns what features are most important for faithful reconstruction.
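A hand-built sketch can show the compress-then-reconstruct idea on data with an obvious redundancy. This is a hypothetical example: a real autoencoder learns the compression from data rather than having it written by hand.

```python
# A hand-written "autoencoder" for inputs whose values repeat in pairs,
# e.g. [a, a, b, b]. Real autoencoders learn this structure instead.

def encode(x):
    """Compress 4 numbers to 2 by averaging each redundant pair."""
    return [(x[0] + x[1]) / 2, (x[2] + x[3]) / 2]

def decode(z):
    """Reconstruct the 4 numbers from the 2-number code."""
    return [z[0], z[0], z[1], z[1]]

x = [3.0, 3.0, 7.0, 7.0]  # redundant input
z = encode(x)             # compressed code: [3.0, 7.0]
x_hat = decode(z)         # reconstruction: [3.0, 3.0, 7.0, 7.0]
```

Because the data really does repeat in pairs, half the numbers carry all the information, so reconstruction is lossless here; a trained autoencoder discovers this kind of structure on its own and degrades gracefully when the data only approximately fits it.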