Generate a Hash from string in Javascript

Generate a Hash from string in Javascript

I need to convert strings to some form of hash. Is this possible in JavaScript?
I’m not utilizing a server-side language so I can’t do it that way.

Solutions/Answers:

Solution 1:

String.prototype.hashCode = function() {
  var hash = 0, i, chr;
  if (this.length === 0) return hash;
  for (i = 0; i < this.length; i++) {
    chr   = this.charCodeAt(i);
    hash  = ((hash << 5) - hash) + chr;
    hash |= 0; // Convert to 32bit integer
  }
  return hash;
};

Source:
http://werxltd.com/wp/2010/05/13/javascript-implementation-of-javas-string-hashcode-method/

Solution 2:

EDIT

based on my jsperf tests, the accepted answer is actually faster: http://jsperf.com/hashcodelordvlad

ORIGINAL

if anyone is interested, here is an improved ( faster ) version, which will fail on older browsers who lack the reduce array function.

hashCode = function(s){
  return s.split("").reduce(function(a,b){a=((a<<5)-a)+b.charCodeAt(0);return a&a},0);              
}

Solution 3:

Note: Even with the best 32-bit hash, collisions will occur sooner or later.

The hash collision probablility can be calculated as
1 - e ^ (-k(k-1) / 2N,
aproximated as
k^2 / 2N
(see here).
This may be higher than intuition suggests:
Assuming a 32-bit hash and k=10,000 items, a collision will occur with a probablility of 1.2%.
For 77,163 samples the probability becomes 50%!
(calculator).
I suggest a workaround at the bottom.

In an answer to this question
Which hashing algorithm is best for uniqueness and speed?,
Ian Boyd posted a good in depth analysis.
In short (as I interpret it), he comes to the conclusion that Murmur is best, followed by FNV-1a.
Java’s String.hashCode() algorithm that esmiralha proposed seems to be a variant of DJB2.

  • FNV-1a has a a better distribution than DJB2, but is slower
  • DJB2 is faster than FNV-1a, but tends to yield more collisions
  • MurmurHash3 is better and faster than DJB2 and FNV-1a (but the optimized implementation requires more lines of code than FNV and DJB2)

Some benchmarks with large input strings here: http://jsperf.com/32-bit-hash
When short input strings are hashed, murmur’s performance drops, relative to DJ2B and FNV-1a: http://jsperf.com/32-bit-hash/3

So in general I would recommend murmur3.
See here for a JavaScript implementation:
https://github.com/garycourt/murmurhash-js

If input strings are short and performance is more important than distribution quality, use DJB2 (as proposed by the accepted answer by esmiralha).

If quality and small code size are more important than speed, I use this implementation of FNV-1a (based on this code).

/**
 * Calculate a 32 bit FNV-1a hash
 * Found here: https://gist.github.com/vaiorabbit/5657561
 * Ref.: http://isthe.com/chongo/tech/comp/fnv/
 *
 * @param {string} str the input value
 * @param {boolean} [asString=false] set to true to return the hash value as 
 *     8-digit hex string instead of an integer
 * @param {integer} [seed] optionally pass the hash of the previous chunk
 * @returns {integer | string}
 */
function hashFnv32a(str, asString, seed) {
    /*jshint bitwise:false */
    var i, l,
        hval = (seed === undefined) ? 0x811c9dc5 : seed;

    for (i = 0, l = str.length; i < l; i++) {
        hval ^= str.charCodeAt(i);
        hval += (hval << 1) + (hval << 4) + (hval << 7) + (hval << 8) + (hval << 24);
    }
    if( asString ){
        // Convert to 8 digit hex string
        return ("0000000" + (hval >>> 0).toString(16)).substr(-8);
    }
    return hval >>> 0;
}

Improve Collision Probability

As explained here, we can extend the hash bit size using this trick:

function hash64(str) {
    var h1 = hash32(str);  // returns 32 bit (as 8 byte hex string)
    return h1 + hash32(h1 + str);  // 64 bit (as 16 byte hex string)
}

Use it with care and don’t expect too much though.

Solution 4:

Based on accepted answer in ES6. Smaller, maintainable and works in modern browsers.

function hashCode(str) {
  return str.split('').reduce((prevHash, currVal) =>
    (((prevHash << 5) - prevHash) + currVal.charCodeAt(0))|0, 0);
}

// Test
console.log("hashCode(\"Hello!\"): ", hashCode('Hello!'));

Solution 5:

If it helps anyone, I combined the top two answers into an older-browser-tolerant version, which uses the fast version if reduce is available and falls back to esmiralha’s solution if it’s not.

/**
 * @see http://stackoverflow.com/q/7616461/940217
 * @return {number}
 */
String.prototype.hashCode = function(){
    if (Array.prototype.reduce){
        return this.split("").reduce(function(a,b){a=((a<<5)-a)+b.charCodeAt(0);return a&a},0);              
    } 
    var hash = 0;
    if (this.length === 0) return hash;
    for (var i = 0; i < this.length; i++) {
        var character  = this.charCodeAt(i);
        hash  = ((hash<<5)-hash)+character;
        hash = hash & hash; // Convert to 32bit integer
    }
    return hash;
}

Usage is like:

var hash = new String("some string to be hashed").hashCode();

Solution 6:

This is a refined and better performing variant:

String.prototype.hashCode = function() {
    var hash = 0, i = 0, len = this.length;
    while ( i < len ) {
        hash  = ((hash << 5) - hash + this.charCodeAt(i++)) << 0;
    }
    return hash;
};

This matches Java’s implementation of the standard object.hashCode()

Here is also one that returns only positive hashcodes:

String.prototype.hashcode = function() {
    return (this.hashCode() + 2147483647) + 1;
};

And here is a matching one for Java that only returns positive hashcodes:

public static long hashcode(Object obj) {
    return ((long) obj.hashCode()) + Integer.MAX_VALUE + 1l;
}

Enjoy!