What?
It’s about taking a sentence and assigning each word a grammatical category (its part of speech).
E.g.:
The → Determiner (DET)
quick → Adjective (ADJ)
brown → Adjective (ADJ)
fox → Noun (NOUN)
Approaches To POS Tagging:
- Rule-Based Tagging: Uses hand-written rules and dictionaries to assign the tags.
- Probabilistic Tagging: Uses Hidden Markov Models (HMMs), probabilistic finite state machines, etc.
- Deep Learning Tagging: Uses RNNs, LSTMs, etc. to predict the tags.
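As a rough illustration of the rule-based approach, here is a minimal sketch that looks each word up in a dictionary and falls back to a few suffix rules. The lexicon, the suffix rules, the default tag and the name `rule_based_tag` are all made up for this example; a real rule-based tagger would use far larger resources and more careful rules.

```python
import re

# Hypothetical toy lexicon (word -> tag), invented for illustration.
LEXICON = {"the": "DET", "quick": "ADJ", "brown": "ADJ", "fox": "NOUN"}

# Hypothetical fallback rules, tried in order for words missing from the lexicon.
SUFFIX_RULES = [
    (re.compile(r".+ly$"), "ADV"),
    (re.compile(r".+ing$"), "VERB"),
    (re.compile(r".+ed$"), "VERB"),
]


def rule_based_tag(sentence):
    """Tag each whitespace-separated word using the lexicon, then the suffix rules."""
    tagged = []
    for word in sentence.split():
        key = word.lower()
        tag = LEXICON.get(key)
        if tag is None:
            for pattern, guess in SUFFIX_RULES:
                if pattern.match(key):
                    tag = guess
                    break
            else:
                # No rule matched: default to NOUN (an assumption of this sketch).
                tag = "NOUN"
        tagged.append((word, tag))
    return tagged


print(rule_based_tag("The quick brown fox"))
```

The dictionary handles known words, and the rules only guess for unknown ones, which is why this kind of tagger struggles with words that can take more than one tag.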
Note on Probabilistic FSM:
This is a really cool way of thinking about it. Each state is a category, and what you ideally want to fine-tune is the probability of moving from one state to another.
There’s also a dataset that gives, for each word, the probability of it belonging to a category. If we take the category of every word in our sentence and multiply those probabilities together (along with the state-to-state probabilities), the result is the probability of that sentence under that tagging.
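The multiplication described above can be sketched as follows, assuming an HMM-style setup where each step contributes a state-to-state (transition) probability and a word-given-category (emission) probability. All numbers, the `<START>` state and the function name `sequence_probability` are invented for illustration and do not come from any real dataset.

```python
# Hypothetical transition probabilities: P(next tag | previous tag).
TRANSITION = {
    ("<START>", "DET"): 0.5,
    ("DET", "ADJ"): 0.4,
    ("ADJ", "ADJ"): 0.2,
    ("ADJ", "NOUN"): 0.5,
}

# Hypothetical emission probabilities: P(word | tag).
EMISSION = {
    ("DET", "the"): 0.5,
    ("ADJ", "quick"): 0.1,
    ("ADJ", "brown"): 0.1,
    ("NOUN", "fox"): 0.2,
}


def sequence_probability(words, tags):
    """Multiply transition and emission probabilities along one tagging."""
    prob = 1.0
    prev = "<START>"
    for word, tag in zip(words, tags):
        # Unseen pairs get probability 0.0 in this sketch (no smoothing).
        prob *= TRANSITION.get((prev, tag), 0.0)
        prob *= EMISSION.get((tag, word.lower()), 0.0)
        prev = tag
    return prob


words = ["The", "quick", "brown", "fox"]
tags = ["DET", "ADJ", "ADJ", "NOUN"]
print(sequence_probability(words, tags))
```

A tagger built this way would compare the scores of different candidate taggings and keep the highest one; this sketch only scores a single tagging.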
The Parts of Speech:
- Open Class Words: Content words
  - Nouns, verbs, adjectives, adverbs
  - Content-bearing: they refer to features of the world
  - Open because there’s no limit to these words, and new ones are constantly added (e.g. “email”)
- Closed Class Words: Function words (“Syntactic Glue”)
  - Pronouns, prepositions, connectives etc.
  - A limited, mostly fixed set
  - They tie the concepts of a sentence together.
Challenges:
- Ambiguity: “Water the plants” vs “the water on the plants”.
  - Is “water” a verb or a noun? It depends on the context.
- Sparse Data: The training data never covers every word in every context.
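Both challenges show up as soon as you count tags in a tagged corpus. The sketch below uses a tiny hand-tagged corpus invented for this example (including the tag names), with hypothetical helper names `count_tags` and `tag_distribution`.

```python
from collections import Counter, defaultdict

# Hypothetical hand-tagged mini corpus, invented for illustration.
TAGGED_CORPUS = [
    [("water", "VERB"), ("the", "DET"), ("plants", "NOUN")],
    [("the", "DET"), ("water", "NOUN"), ("on", "ADP"),
     ("the", "DET"), ("plants", "NOUN")],
]


def count_tags(corpus):
    """Count how often each (lowercased) word appears with each tag."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        for word, tag in sentence:
            counts[word.lower()][tag] += 1
    return counts


def tag_distribution(counts, word):
    """Relative frequency of each tag for a word; empty dict if the word is unseen."""
    tag_counts = counts.get(word.lower())
    if not tag_counts:
        return {}
    total = sum(tag_counts.values())
    return {tag: n / total for tag, n in tag_counts.items()}


counts = count_tags(TAGGED_CORPUS)
print(tag_distribution(counts, "water"))  # ambiguous: seen as both VERB and NOUN
print(tag_distribution(counts, "email"))  # sparse data: never seen, so no evidence
```

Here “water” is split evenly between two tags, so the word alone cannot decide the tag, and “email” has no counts at all, so the tagger has nothing to go on without some fallback.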