Hash Tables

What is a Hash Table?

Figure %: The hash table after inserting "Spark"

Let's try another: "Notes". We run "Notes" through the hash function and find that hash("Notes",12) is 3, so we try to insert it at index 3 of the hash table:

Figure %: A hash table collision

What happened? A hash function doesn't guarantee that every input will map to a different output (in fact, as we'll see in the next section, it shouldn't do this). There is always the chance that two inputs will hash to the same output. When that happens, both elements belong at the same place in the array, but a single slot can hold only one of them. This phenomenon is known as a collision.
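To make the collision concrete, here is a minimal sketch in Python. The hash function itself is not shown in this section, so `hash_string` is a hypothetical stand-in that simply looks up the two values quoted in the text ("Steve" and "Notes" both hash to 3 with a table size of 12). The `insert` helper and the `QUOTED_HASHES` table are likewise assumptions made for illustration.

```python
TABLE_SIZE = 12

# Assumption: the real hash function is not reproduced here, so this
# lookup only knows the two hash values quoted in the text.
QUOTED_HASHES = {"Steve": 3, "Notes": 3}


def hash_string(key, table_size):
    """Hypothetical stand-in for hash(key, table_size)."""
    return QUOTED_HASHES[key] % table_size


def insert(table, key):
    """Store key directly in its slot; return False on a collision."""
    index = hash_string(key, len(table))
    if table[index] is not None and table[index] != key:
        return False  # the slot already holds a different key
    table[index] = key
    return True


table = [None] * TABLE_SIZE
print(insert(table, "Steve"))  # True: slot 3 was empty
print(insert(table, "Notes"))  # False: slot 3 already holds "Steve"
```

With one element per slot, the second insert has nowhere to go, which is exactly the situation the collision figure shows.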

There are many algorithms for dealing with collisions, such as linear probing and separate chaining. While each method has its advantages, we will only discuss separate chaining here.

Separate chaining requires a slight modification to the data structure. Instead of storing the data elements directly in the array, we store them in linked lists. Each slot in the array then points to one of these linked lists. When an element hashes to a value, it is added to the linked list at that index in the array. Because a linked list has no limit on length, collisions are no longer a problem. If more than one element hashes to the same value, then all of them are stored in that linked list.
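The modified structure can be sketched in Python as follows. This is an illustration under stated assumptions, not the implementation used for the figures: `toy_hash` (sum of character codes modulo the table size) is a made-up hash function and will not necessarily reproduce the hash values quoted in the text, and the names `Node`, `ChainedHashTable`, and `chain` are hypothetical.

```python
class Node:
    """One link in a chain: a stored key plus a reference to the next node."""

    def __init__(self, key, next_node=None):
        self.key = key
        self.next = next_node


def toy_hash(key, table_size):
    """Illustrative hash only: sum of character codes, modulo the table size."""
    return sum(ord(ch) for ch in key) % table_size


class ChainedHashTable:
    def __init__(self, size=12):
        # Each slot holds the head of a linked list, or None for an empty chain.
        self.slots = [None] * size

    def contains(self, key):
        node = self.slots[toy_hash(key, len(self.slots))]
        while node is not None:
            if node.key == key:
                return True
            node = node.next
        return False

    def insert(self, key):
        if self.contains(key):
            return  # already stored; keep the chain free of duplicates
        index = toy_hash(key, len(self.slots))
        # Prepend to the chain; a collision just makes the chain longer.
        self.slots[index] = Node(key, self.slots[index])

    def chain(self, index):
        """Return the keys stored at one slot, head of the list first."""
        keys, node = [], self.slots[index]
        while node is not None:
            keys.append(node.key)
            node = node.next
        return keys


table = ChainedHashTable()
for word in ("notes", "stone", "Steve"):
    table.insert(word)

# "notes" and "stone" are anagrams, so this toy hash sends them to the same slot.
shared_slot = toy_hash("notes", 12)
print(table.chain(shared_slot))
```

Because the toy hash ignores character order, the two anagrams collide by construction, and both end up in the same linked list instead of competing for a single array slot.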

Let's look at the above example again, this time with our modified data structure:

Figure %: Modified table for separate chaining

Again, let's try adding "Steve", which hashes to 3: