Inverted Index: A Foundation for Search Engines
Apr 22, 2025

An inverted index is a data structure that search engines use to map each word (term) to the documents that contain it. It inverts the usual document-to-terms relationship: instead of answering "What words are in Document 1?", it answers "In which documents does the word 'Caesar' appear?"

Example: Two Documents

Let's work with two simple documents:
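As a stand-in for the two documents, the sketch below assumes the pair of Shakespeare excerpts commonly used in information retrieval textbooks. The `documents` name and the integer IDs 1 and 2 are illustrative assumptions.

```python
# Hypothetical stand-in documents, keyed by document ID (assumed, not the author's data).
documents = {
    1: "I did enact Julius Caesar: I was killed i' the Capitol; Brutus killed me.",
    2: "So let it be with Caesar. The noble Brutus hath told you Caesar was ambitious.",
}

for doc_id, text in documents.items():
    print(f"Document {doc_id}: {text}")
```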

Step 1: Building the Inverted Index

Tokenization

First, we extract all words and note which document they come from:
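A minimal sketch of this step, assuming two textbook-style example documents and a deliberately simple tokenizer (lowercase, alphabetic runs only). The `tokenize` helper and the `pairs` list are hypothetical names chosen for illustration.

```python
import re

# Hypothetical example documents, keyed by document ID.
documents = {
    1: "I did enact Julius Caesar: I was killed i' the Capitol; Brutus killed me.",
    2: "So let it be with Caesar. The noble Brutus hath told you Caesar was ambitious.",
}


def tokenize(text):
    """Lowercase the text and return its runs of letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


# One (term, doc_id) pair per token occurrence, in reading order.
pairs = [
    (term, doc_id)
    for doc_id, text in documents.items()
    for term in tokenize(text)
]

for term, doc_id in pairs[:5]:
    print(term, doc_id)
```

A real tokenizer would also need to decide how to handle apostrophes, hyphens, and non-ASCII letters; this one simply drops anything that is not a letter from a to z.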

Output

Our inverted index is now complete. Each term maps to its postings list: the list of documents in which it appears.
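One way to build such an index is sketched below, assuming two textbook-style example documents, a simple lowercase alphabetic tokenizer, and postings stored as sorted lists of document IDs. `build_inverted_index` is a hypothetical helper, not a library function.

```python
import re
from collections import defaultdict

# Hypothetical example documents, keyed by document ID.
documents = {
    1: "I did enact Julius Caesar: I was killed i' the Capitol; Brutus killed me.",
    2: "So let it be with Caesar. The noble Brutus hath told you Caesar was ambitious.",
}


def build_inverted_index(docs):
    """Map each term to the sorted list of IDs of documents containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in re.findall(r"[a-z]+", text.lower()):
            postings[term].add(doc_id)  # a set removes duplicate occurrences
    # Sort the terms and each postings list so merges can run in linear time.
    return {term: sorted(ids) for term, ids in sorted(postings.items())}


index = build_inverted_index(documents)
for term, doc_ids in index.items():
    print(f"{term:10} -> {doc_ids}")
```

Keeping each postings list sorted by document ID is a design choice: it lets Boolean queries be answered by walking two lists in step rather than comparing every pair of entries.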

Step 2: Boolean Queries

AND Query
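An AND query intersects two postings lists. The sketch below assumes a small hand-written `index` whose postings are sorted by document ID; the `intersect` helper is a hypothetical name for the standard two-pointer merge.

```python
# Hypothetical postings lists, assumed sorted by document ID.
index = {
    "brutus": [1, 2],
    "caesar": [1, 2],
    "killed": [1],
    "noble": [2],
}


def intersect(p1, p2):
    """Return the document IDs present in both sorted postings lists."""
    result = []
    i = j = 0
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            result.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i += 1  # advance the list with the smaller ID
        else:
            j += 1
    return result


and_result = intersect(index["brutus"], index["caesar"])
print(and_result)
```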

Output:
Both "brutus" and "caesar" appear in documents 1 and 2.

OR Query
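An OR query takes the union of two postings lists. As a minimal sketch, assume a small hand-written `index` with sorted postings and an illustrative query, `killed OR noble`; the `union` helper is a hypothetical name.

```python
# Hypothetical postings lists, assumed sorted by document ID.
index = {
    "brutus": [1, 2],
    "caesar": [1, 2],
    "killed": [1],
    "noble": [2],
}


def union(p1, p2):
    """Return the document IDs present in either sorted postings list."""
    result = []
    i = j = 0
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            result.append(p1[i])  # shared ID is emitted once
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            result.append(p1[i])
            i += 1
        else:
            result.append(p2[j])
            j += 1
    # At most one list still has entries left; append them unchanged.
    result.extend(p1[i:])
    result.extend(p2[j:])
    return result


or_result = union(index["killed"], index["noble"])
print(or_result)
```

With these assumed postings, "killed" appears only in document 1 and "noble" only in document 2, so the query returns both documents.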

Output:

AND NOT Query
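An AND NOT query keeps the documents in the first postings list that are absent from the second. The sketch below assumes a small hand-written `index` with sorted postings and an illustrative query, `brutus AND NOT noble`; the `and_not` helper is a hypothetical name.

```python
# Hypothetical postings lists, assumed sorted by document ID.
index = {
    "brutus": [1, 2],
    "caesar": [1, 2],
    "killed": [1],
    "noble": [2],
}


def and_not(p1, p2):
    """Return the IDs in sorted list p1 that do not occur in sorted list p2."""
    result = []
    j = 0
    for doc_id in p1:
        # Skip entries of p2 that are smaller than the current ID.
        while j < len(p2) and p2[j] < doc_id:
            j += 1
        if j == len(p2) or p2[j] != doc_id:
            result.append(doc_id)
    return result


and_not_result = and_not(index["brutus"], index["noble"])
print(and_not_result)
```

With these assumed postings, "brutus" appears in documents 1 and 2 and "noble" only in document 2, so only document 1 remains.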

Output: