Hashing
Hashing
- A hash function turns a record's key into an address where the record is stored.
- It lets a program jump straight to a record — near-instant lookup.
- The tricky part is handling collisions.
The hash function
- A good hash is fast, deterministic (same key → same address), and spreads keys evenly.
- Simple examples for $N$ slots: modulo (
address ← key MOD N), folding (split the key, add the parts, MOD N), or a string hash (sum character codes, MOD N).
Practice
A hash function:
A hash function maps a key to an address, enabling near-instant direct lookup.
Practice
Using the modulo hash address ← key MOD N with key = 27 and N = 10, what address is produced?
27 MOD 10 = 7 (the remainder when 27 is divided by 10).
Collisions and resolution
- A collision is when two keys hash to the same address. Ways to resolve it:
- Linear probing — try the next slot (wrapping around); simple but clusters.
- Chaining — each slot points to a linked list of records that hashed there.
- Rehashing — use a second hash function.
- Keep the load factor (records ÷ slots) below about 70% for near-O(1) lookups.
Practice
A collision occurs when:
Two keys mapping to the same slot is a collision; it must be resolved by probing, chaining or rehashing.
Practice
Chaining resolves collisions by:
With chaining, each slot holds a linked list of all records that hashed there. (Linear probing moves to the next slot.)
Practice
To keep hash lookups fast, the load factor (records ÷ slots) should be kept:
A lower load factor means fewer collisions, so lookups stay close to O(1).
Searching and inserting
- Search: hash the key, read that slot; if the keys match you're done, else follow the resolution strategy until a match or an empty slot.
- Insert: hash the key, write to that slot — or the next free one if it's taken.
You've got it
Key idea
- a hash function maps a key to an address (fast, deterministic, even spread)
- a collision = two keys → same address; resolve by probing, chaining, or rehashing
- keep the load factor below ~70% for fast lookups
- search/insert: hash the key, then follow the resolution strategy