What?
It’s shorthand for sequence-to-sequence: any model that takes a sequence and transforms it into another (e.g., machine translation: an English sentence in, a French sentence out).
Architecture:
- Traditionally they were encoder-decoder models.
- But now, we’ve begun using decoder-only models in an autoregressive manner: the model predicts one token at a time, and each prediction is appended to the input for the next step. This seems to work quite well.
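The autoregressive, decoder-only loop can be sketched in a few lines. This is a minimal illustration, not a real model: `toy_decoder` is a hypothetical stand-in (it just sums the prefix modulo the vocabulary size) for a trained decoder that would score the next token given everything generated so far.

```python
def toy_decoder(tokens, vocab_size=10):
    # Hypothetical placeholder: a real decoder-only model would attend over
    # the full prefix and return a distribution; we fake a deterministic pick.
    return sum(tokens) % vocab_size

def generate(prompt, steps, eos=None):
    # Autoregressive loop: re-feed the whole sequence each step and append
    # the model's predicted next token.
    seq = list(prompt)
    for _ in range(steps):
        nxt = toy_decoder(seq)
        seq.append(nxt)
        if nxt == eos:  # stop early on an end-of-sequence token, if given
            break
    return seq

print(generate([3, 4], 3))  # → [3, 4, 7, 4, 8]
```

The key point the sketch shows: there is no separate encoder pass; the input sequence and the generated output live in one growing sequence that the decoder conditions on at every step.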