JavaScript Hashmap Equivalent

JavaScript Hashmap Equivalent

As made clear in update 3 on this answer, this notation:
var hash = {};
hash[X]

does not actually hash the object X; it actually just converts X to a string (via .toString() if it’s an object, or some other built-in conversions for various primitive types) and then looks that string up, without hashing it, in “hash”. Object equality is also not checked – if two different objects have the same string conversion, they will just overwrite each other.
Given this – are there any efficient implementations of hashmaps in javascript? (For example, the 2nd Google result of javascript hashmap yields an implementation which is O(n) for any operation. Various other results ignore the fact that different objects with equivalent string representations overwrite each other.

Solutions/Answers:

Solution 1:

Why not hash your objects yourself manually, and use the resulting strings as keys for a regular JavaScript dictionary? After all you are in the best position to know what makes your objects unique. That’s what I do.

Example:

var key = function(obj){
  // some unique object-dependent key
  return obj.totallyUniqueEmployeeIdKey; // just an example
};

var dict = {};

dict[key(obj1)] = obj1;
dict[key(obj2)] = obj2;

This way you can control indexing done by JavaScript without heavy lifting of memory allocation, and overflow handling.

Of course, if you truly want the “industrial-grade solution”, you can build a class parameterized by the key function, and with all necessary API of the container, but … we use JavaScript, and trying to be simple and lightweight, so this functional solution is simple and fast.

The key function can be as simple as selecting right attributes of the object, e.g., a key, or a set of keys, which are already unique, a combination of keys, which are unique together, or as complex as using some cryptographic hashes like in DojoX Encoding, or DojoX UUID. While the latter solutions may produce unique keys, personally I try to avoid them at all costs, especially, if I know what makes my objects unique.

Update in 2014: Answered back in 2008 this simple solution still requires more explanations. Let me clarify the idea in a Q&A form.

Your solution doesn’t have a real hash. Where is it???

JavaScript is a high-level language. Its basic primitive
(Object)
includes a hash table to keep properties. This hash table is usually written
in a low-level language for efficiency. Using a simple object with string keys we use an efficiently implemented hash table with no efforts on our part.

How do you know they use hash?

There are three major ways to keep a collection of objects addressable by a key:

  • Unordered. In this case to retrieve an object by its key we have to go over all keys stopping when we find it. On average it will take n/2 comparisons.
  • Ordered.
    • Example #1: a sorted array — doing a binary search we will find our key after ~log2(n) comparisons on average. Much better.
    • Example #2: a tree. Again it’ll be ~log(n) attempts.
  • Hash table. On average it requires a constant time. Compare: O(n) vs. O(log n) vs. O(1). Boom.

Obviously JavaScript objects use hash tables in some form to handle general cases.

Do browser vendors really use hash tables???

Really.

Do they handle collisions?

Yes. See above. If you found a collision on unequal strings, please do not hesitate to file a bug with a vendor.

So what is your idea?

If you want to hash an object, find what makes it unique and use it as a key. Do not try to calculate real hash or emulate hash tables — it is already efficiently handled by the underlying JavaScript object.

Use this key with JavaScript Object to leverage its built-in hash table while steering clear of possible clashes with default properties.

Examples to get you started:

  • If your objects include a unique user name — use it as a key.
  • If it includes a unique customer number — use it as a key.
    • If it includes unique government-issued numbers like an SSN, or a passport number, and your system doesn’t allow duplicates — use it as a key.
  • If a combination of fields is unique — use it as a key.
    • State abbreviation + driver license number makes an excellent key.
    • Country abbreviation + passport number is an excellent key too.
  • Some function on fields, or a whole object, can return a unique value — use it as a key.

I used your suggestion and cached all objects using a user name. But some wise guy is named “toString”, which is a built-in property! What should I do now?

Obviously, if it is even remotely possible that the resulting key will exclusively consists of Latin characters, you should do something about it. For example, add any non-Latin Unicode character you like at the beginning or at the end to un-clash with default properties: “#toString”, “#MarySmith”. If a composite key is used, separate key components using some kind of non-Latin delimiter: “name,city,state”.

In general this is the place, where we have to be creative, and select the easiest keys with given limitations (uniqueness, potential clashes with default properties).

Note: unique keys do not clash by definition, while potential hash clashes will be handled by the underlying Object.

Why don’t you like industrial solutions?

IMHO, the best code is no code at all: it has no errors, requires no maintenance, easy to understand, and executes instantaneously. All “hash tables in JavaScript” I saw were >100 lines of code, and involved multiple objects. Compare it with: dict[key] = value.

Another point: is it even possible to beat a performance of a primordial object written in a low-level language, using JavaScript and the very same primordial objects to implement what is already implemented?

I still want to hash my objects without any keys!

We are in luck: ECMAScript 6 (scheduled for mid 2015 release, give or take 1-2 years after that to become widespread) defines
map and
set.

Judging by the definition they can use object’s address as a key, which makes objects instantly distinct without artificial keys. OTOH, two different, yet identical objects, will be mapped as distinct.

Comparison breakdown from MDN:

Objects are similar to Maps in that both let you set keys to values,
retrieve those values, delete keys, and detect whether something is
stored at a key. Because of this (and because there were no built-in
alternatives), Objects have been used as Maps historically; however,
there are important differences that make using a Map preferable in
certain cases:

  • The keys of an Object are Strings and Symbols, whereas they can be any value for a Map, including functions, objects, and any primitive.
  • The keys in Map are ordered while keys added to object are not. Thus, when iterating over it, a Map object returns keys in order of
    insertion.
  • You can get the size of a Map easily with the size property, while the number of properties in an Object must be determined manually.
  • A Map is an iterable and can thus be directly iterated, whereas iterating over an Object requires obtaining its keys in some fashion
    and iterating over them.
  • An Object has a prototype, so there are default keys in the map that could collide with your keys if you’re not careful. As of ES5 this can
    be bypassed by using map = Object.create(null), but this is seldom
    done.
  • A Map may perform better in scenarios involving frequent addition and removal of key pairs.

Solution 2:

Problem description

JavaScript has no built-in general map type (sometimes called associative array or dictionary) which allows to access arbitrary values by arbitrary keys. JavaScript’s fundamental data structure is the object, a special type of map which only accepts strings as keys and has special semantics like prototypical inheritance, getters and setters and some further voodoo.

When usings objects as maps, you have to remember that the key will be converted to a string value via toString(), which results in mapping 5 and '5' to the same value and all objects which don’t overwrite the toString() method to the value indexed by '[object Object]'. You might also involuntarily access its inherited properties if you don’t check hasOwnProperty().

JavaScript’s built-in array type does not help one bit: JavaScript arrays are not associative arrays, but just objects with a few more special properties. If you want to know why they can’t be used as maps, look here.

Eugene’s Solution

Eugene Lazutkin already described the basic idea of using a custom hash function to generate unique strings which can be used to look up the associated values as properties of a dictionary object. This will most likely be the fastest solution, because objects are internally implemented as hash tables.

  • Note: Hash tables (sometimes called hash maps) are a particular implementation of the map concept using a backing array and lookup via numeric hash values. The runtime environment might use other structures (such as search trees or skip lists) to implement JavaScript objects, but as objects are the fundamental data structure, they should be sufficiently optimised.

In order to get a unique hash value for arbitrary objects, one possibility is to use a global counter and cache the hash value in the object itself (eg in a property named __hash).

A hash function which does this is and works for both primitive values and objects is:

function hash(value) {
    return (typeof value) + ' ' + (value instanceof Object ?
        (value.__hash || (value.__hash = ++arguments.callee.current)) :
        value.toString());
}

hash.current = 0;

This function can be used as described by Eugene. For convenience, we will further wrap it in a Map class.

My Map implementation

The following implementation will additionally store the key-value-pairs in a doubly linked list in order to allow fast iteration over both keys and values. To supply your own hash function, you can overwrite the instance’s hash() method after creation.

// linking the key-value-pairs is optional
// if no argument is provided, linkItems === undefined, i.e. !== false
// --> linking will be enabled
function Map(linkItems) {
    this.current = undefined;
    this.size = 0;

    if(linkItems === false)
        this.disableLinking();
}

Map.noop = function() {
    return this;
};

Map.illegal = function() {
    throw new Error("illegal operation for maps without linking");
};

// map initialisation from existing object
// doesn't add inherited properties if not explicitly instructed to:
// omitting foreignKeys means foreignKeys === undefined, i.e. == false
// --> inherited properties won't be added
Map.from = function(obj, foreignKeys) {
    var map = new Map;

    for(var prop in obj) {
        if(foreignKeys || obj.hasOwnProperty(prop))
            map.put(prop, obj[prop]);
    }

    return map;
};

Map.prototype.disableLinking = function() {
    this.link = Map.noop;
    this.unlink = Map.noop;
    this.disableLinking = Map.noop;
    this.next = Map.illegal;
    this.key = Map.illegal;
    this.value = Map.illegal;
    this.removeAll = Map.illegal;

    return this;
};

// overwrite in Map instance if necessary
Map.prototype.hash = function(value) {
    return (typeof value) + ' ' + (value instanceof Object ?
        (value.__hash || (value.__hash = ++arguments.callee.current)) :
        value.toString());
};

Map.prototype.hash.current = 0;

// --- mapping functions

Map.prototype.get = function(key) {
    var item = this[this.hash(key)];
    return item === undefined ? undefined : item.value;
};

Map.prototype.put = function(key, value) {
    var hash = this.hash(key);

    if(this[hash] === undefined) {
        var item = { key : key, value : value };
        this[hash] = item;

        this.link(item);
        ++this.size;
    }
    else this[hash].value = value;

    return this;
};

Map.prototype.remove = function(key) {
    var hash = this.hash(key);
    var item = this[hash];

    if(item !== undefined) {
        --this.size;
        this.unlink(item);

        delete this[hash];
    }

    return this;
};

// only works if linked
Map.prototype.removeAll = function() {
    while(this.size)
        this.remove(this.key());

    return this;
};

// --- linked list helper functions

Map.prototype.link = function(item) {
    if(this.size == 0) {
        item.prev = item;
        item.next = item;
        this.current = item;
    }
    else {
        item.prev = this.current.prev;
        item.prev.next = item;
        item.next = this.current;
        this.current.prev = item;
    }
};

Map.prototype.unlink = function(item) {
    if(this.size == 0)
        this.current = undefined;
    else {
        item.prev.next = item.next;
        item.next.prev = item.prev;
        if(item === this.current)
            this.current = item.next;
    }
};

// --- iterator functions - only work if map is linked

Map.prototype.next = function() {
    this.current = this.current.next;
};

Map.prototype.key = function() {
    return this.current.key;
};

Map.prototype.value = function() {
    return this.current.value;
};

Example

The following script

var map = new Map;

map.put('spam', 'eggs').
    put('foo', 'bar').
    put('foo', 'baz').
    put({}, 'an object').
    put({}, 'another object').
    put(5, 'five').
    put(5, 'five again').
    put('5', 'another five');

for(var i = 0; i++ < map.size; map.next())
    document.writeln(map.hash(map.key()) + ' : ' + map.value());

generates this output:

string spam : eggs
string foo : baz
object 1 : an object
object 2 : another object
number 5 : five again
string 5 : another five

Further considerations

PEZ suggested to overwrite the toString() method, presumably with our hash function. This is not feasible because it doesn’t work for primitive values (changing toString() for primitives is a very bad idea). If we want toString() to return meaningful values for arbitrary objects, we would have to modify Object.prototype, which some people (myself not included) consider verboten.


Edit: The current version of my Map implementation as well as other JavaScript goodies can be obtained from here.

Solution 3:

I know this question is pretty old, but there are some really great solutions nowadays with external libraries.

JavaScript also has its language provided Map as well.

Solution 4:

Here is an easy and convenient way of using something similar to the java map:

var map= {
        'map_name_1': map_value_1,
        'map_name_2': map_value_2,
        'map_name_3': map_value_3,
        'map_name_4': map_value_4
        }

And to get the value:

alert( map['map_name_1'] );    // fives the value of map_value_1

......  etc  .....

Solution 5:

You can use ES6 WeakMap or Map:

  • WeakMaps are key/value maps in which keys are objects.

  • Map objects are simple key/value maps. Any value (both objects and primitive values) may be used as either a key or a value.

Be aware that neither is widely supported, but you can use ES6 Shim (requires native ES5 or ES5 Shim) to support Map, but not WeakMap (see why).

Solution 6:

According to ECMAScript 2015 (ES6) standard javascript has a Map implementation. More about which could be found here

Basic usage:

var myMap = new Map();
var keyString = "a string",
    keyObj = {},
    keyFunc = function () {};

// setting the values
myMap.set(keyString, "value associated with 'a string'");
myMap.set(keyObj, "value associated with keyObj");
myMap.set(keyFunc, "value associated with keyFunc");

myMap.size; // 3

// getting the values
myMap.get(keyString);    // "value associated with 'a string'"
myMap.get(keyObj);       // "value associated with keyObj"
myMap.get(keyFunc);      // "value associated with keyFunc"