What:

A type of Information Retrieval (IR) that lets you combine search terms with logical operators (AND, OR, NOT). It’s made possible by an inverted index.

Example
• Collection: Shakespeare’s Collected Works
• Boolean query: Brutus AND Caesar AND NOT Calpurnia
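As a rough illustration of the query above, here is a toy sketch using Python sets. The posting lists and document IDs are invented for the example and do not come from a real index of Shakespeare's works.

```python
# Toy posting lists: term -> set of document IDs.
# The IDs are made up purely for illustration.
postings = {
    "brutus": {1, 2, 4, 11},
    "caesar": {1, 2, 4, 5, 6},
    "calpurnia": {2, 31},
}

# Brutus AND Caesar AND NOT Calpurnia:
# intersect the first two lists, then remove anything in the third.
result = (postings["brutus"] & postings["caesar"]) - postings["calpurnia"]

print(sorted(result))  # [1, 4]
```

Document 2 contains all three terms, so the NOT clause removes it.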

How to implement it?

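  1. Start from an inverted index, which maps each term to its posting list (the documents that contain the term).
  2. Break the user’s query into terms and operators (e.g. "scotland AND england").
  3. Preprocess the query terms with the same preprocessing that was done for the indexing, so they match the terms stored in the index.
  4. Look up the posting list of each term and combine them into one result according to the operators:
    1. E.g. for "scotland AND england", we’d find the intersection of the posting lists.
    2. For "scotland OR england", we’d find the union of them.
    3. For "scotland AND NOT england", we’d keep the documents in the first list that are not in the second.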
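The steps above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the `preprocess` pipeline (lowercasing and splitting on non-alphanumeric characters), the helper names, and the sample documents are all assumptions, and the query evaluator only handles flat, well-formed queries read left to right, with no parentheses or operator precedence.

```python
import re
from collections import defaultdict

def preprocess(text):
    """Assumed minimal pipeline: lowercase, split on non-alphanumerics."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    """Map each term to a sorted posting list of document IDs."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in preprocess(text):
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

def intersect(a, b):
    """AND: walk two sorted posting lists, keeping IDs present in both."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out

def union(a, b):
    """OR: IDs present in either list."""
    return sorted(set(a) | set(b))

def difference(a, b):
    """AND NOT: IDs in a that are not in b."""
    excluded = set(b)
    return [x for x in a if x not in excluded]

def search(index, query):
    """Evaluate a flat query such as 'scotland AND england AND NOT wales'."""
    tokens = query.split()

    def postings(word):
        terms = preprocess(word)  # same preprocessing as indexing
        return index.get(terms[0], []) if terms else []

    result = postings(tokens[0])
    i = 1
    while i < len(tokens):
        op = tokens[i].upper()
        if op == "AND" and i + 1 < len(tokens) and tokens[i + 1].upper() == "NOT":
            result = difference(result, postings(tokens[i + 2]))
            i += 3
        elif op == "AND":
            result = intersect(result, postings(tokens[i + 1]))
            i += 2
        elif op == "OR":
            result = union(result, postings(tokens[i + 1]))
            i += 2
        else:
            raise ValueError(f"unexpected token: {tokens[i]}")
    return result

# Hypothetical sample collection.
docs = {
    1: "Scotland and England share a border",
    2: "England won the match",
    3: "Scotland is known for whisky",
    4: "Wales, Scotland and England",
}
index = build_index(docs)

print(search(index, "scotland AND england"))                # [1, 4]
print(search(index, "scotland OR england"))                 # [1, 2, 3, 4]
print(search(index, "scotland AND england AND NOT wales"))  # [1]
```

Keeping the posting lists sorted lets `intersect` finish in a single pass over both lists, which is the usual reason for merging rather than comparing every pair of IDs.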

Limitations:

While powerful, Boolean retrieval only returns the set of documents that satisfy our search query, in no particular order. Ideally, we want them ranked by relevance. That’s why ways of scoring documents against a query were introduced (e.g. the Jaccard Coefficient).
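As a rough sketch of what such scoring could look like, the Jaccard Coefficient of two sets A and B is |A ∩ B| / |A ∪ B|. The snippet below applies it to the set of query terms and the set of terms in each document. The whitespace tokenization, the `rank` helper, and the sample documents are assumptions made for illustration.

```python
def jaccard(a, b):
    """Jaccard Coefficient: size of the intersection over size of the union."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0  # convention chosen here to avoid dividing by zero
    return len(a & b) / len(a | b)

def rank(query, docs):
    """Score every document against the query, highest score first."""
    q_terms = query.lower().split()
    scored = [
        (jaccard(q_terms, text.lower().split()), doc_id)
        for doc_id, text in docs.items()
    ]
    # Sort by descending score, breaking ties by document ID.
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))

# Hypothetical sample collection.
docs = {
    1: "scotland england",
    2: "scotland england wales ireland",
    3: "france germany",
}

print(rank("scotland england", docs))
# [(1.0, 1), (0.5, 2), (0.0, 3)]
```

Unlike the Boolean result set, every document gets a score here, so documents that only partly match the query can still be ordered.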