What?

It’s shorthand for sequence-to-sequence: any model that takes a sequence as input and transforms it into another sequence.

Architecture:

  • Traditionally they were encoder-decoder models: an encoder reads the input sequence, and a decoder produces the output sequence.
  • But now, we’ve begun using decoder-only models in an autoregressive manner, and this seems to work quite well.
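The autoregressive idea can be sketched in a few lines: generate one token at a time, feeding the growing sequence back in as context. This is a minimal sketch, not a real model — `toy_next_token` is a hypothetical stand-in lookup so the loop runs without any ML framework.

```python
def toy_next_token(context):
    # Stand-in for a decoder-only model's next-token prediction
    # (a hard-coded lookup, purely for illustration).
    table = {
        (): "the",
        ("the",): "cat",
        ("the", "cat"): "sat",
        ("the", "cat", "sat"): "<eos>",
    }
    return table.get(tuple(context), "<eos>")

def generate(model, max_len=10):
    """Greedy autoregressive loop: repeatedly predict the next
    token and append it to the sequence until <eos>."""
    seq = []
    for _ in range(max_len):
        tok = model(seq)
        if tok == "<eos>":
            break
        seq.append(tok)
    return seq

print(generate(toy_next_token))  # -> ['the', 'cat', 'sat']
```

A real decoder-only transformer differs only in how `model(seq)` is computed; the outer sampling loop is essentially the same.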