Skip to content

doublethink.co.uk

Tag: #compsci

Modern Hashtables

As it has been a decade since last learning any compsci I've recently been looking over some more advanced and modern hashtable variations that weren't taught much back and came across a blog series by Emmanuel Goossaert that covered these advancements in open addressed hashtables that's worth highlighting:

Robin Hood Linear Probing

A slight variation on linear probing where when dealing with collisions the item to be inserted is updated with an existing item and the existing item's slot is used if the original item to insert would have a further distance from its intended bucket. Thus stealing from the rich (nearer their intended bucket) and giving to the poor (the more displaced). Main article and a follow up specifically about deletions.

Hopscotch Hashing

A variation where the entry to be inserted will be placed within the neighbourhood of the intended bucket, when the neighbourhood is full an item is displaced recursively until all the items have found a place in their respective neighbourhoods. This algorithm is likely to be very friendly to cache performance, similar to linear probing. Main article.

Cuckoo Hashing

A collision resolution that has a list of distinct hashes to check for each key on lookup, and on insertion guarantees value will be inserted into one of them by displacing one of the existing values (which then needs to find a home).

Article and another discussing the ability to do lock-free cuckoo hashes.

None of these are likely to be make it into a general purpose implementation such as std::unordered_map, that will likely used chained collision resolution, but they are interesting to know for specialized applications and where an open addressed hashtable is most suited (disk and shared memory implementations).