Remove duplicate values from JS array [duplicate]

This question already has an answer here:

Get all unique values in a JavaScript array (remove duplicates)

I have a very simple JavaScript array that may or may not contain duplicates.
var names = ["Mike","Matt","Nancy","Adam","Jenny","Nancy","Carl"];

I need to remove the duplicates and put the unique values in a new array.
I could post all the code I’ve tried, but I don’t think it would help, because none of it works. jQuery solutions are welcome too.
Similar question:

Get all non-unique values (i.e.: duplicate/more than one occurrence) in an array

Solutions/Answers:

Solution 1:

Quick and dirty using jQuery:

var names = ["Mike","Matt","Nancy","Adam","Jenny","Nancy","Carl"];
var uniqueNames = [];
$.each(names, function(i, el){
    if($.inArray(el, uniqueNames) === -1) uniqueNames.push(el);
});

Solution 2:

TL;DR

Using the Set constructor and the spread syntax:

uniq = [...new Set(array)];

“Smart” but naïve way

uniqueArray = a.filter(function(item, pos) {
    return a.indexOf(item) == pos;
})

Basically, we iterate over the array and, for each element, check if the first position of this element in the array is equal to the current position. Obviously, these two positions are different for duplicate elements.

Using the 3rd (“this array”) parameter of the filter callback we can avoid a closure of the array variable:

uniqueArray = a.filter(function(item, pos, self) {
    return self.indexOf(item) == pos;
})

Although concise, this algorithm is not particularly efficient for large arrays (quadratic time).

Hashtables to the rescue

function uniq(a) {
    var seen = {};
    return a.filter(function(item) {
        return seen.hasOwnProperty(item) ? false : (seen[item] = true);
    });
}

This is how it’s usually done. The idea is to place each element in a hashtable and then check for its presence instantly. This gives us linear time, but has at least two drawbacks:

  • since hash keys can only be strings in JavaScript, this code doesn’t distinguish numbers and “numeric strings”. That is, uniq([1,"1"]) will return just [1]
  • for the same reason, all objects will be considered equal: uniq([{foo:1},{foo:2}]) will return just [{foo:1}].

That said, if your arrays contain only primitives and you don’t care about types (e.g. it’s always numbers), this solution is optimal.
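A short sketch of the two drawbacks listed above. The name uniqHash is introduced here only for illustration; its body is the same hashtable logic as the snippet above:

```javascript
// Hashtable-based dedupe, same logic as the snippet above
// (renamed to uniqHash for this illustration only).
function uniqHash(a) {
    var seen = {};
    return a.filter(function(item) {
        return seen.hasOwnProperty(item) ? false : (seen[item] = true);
    });
}

// Object keys are coerced to strings, so 1 and "1" collide:
console.log(uniqHash([1, "1"]));               // [ 1 ]

// Every plain object stringifies to "[object Object]":
console.log(uniqHash([{foo: 1}, {foo: 2}]));   // [ { foo: 1 } ]

// Primitives of a single type behave as expected:
console.log(uniqHash([3, 1, 3, 2, 1]));        // [ 3, 1, 2 ]
```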

The best from two worlds

A universal solution combines both approaches: it uses hash lookups for primitives and linear search for objects.

function uniq(a) {
    var prims = {"boolean":{}, "number":{}, "string":{}}, objs = [];

    return a.filter(function(item) {
        var type = typeof item;
        if(type in prims)
            return prims[type].hasOwnProperty(item) ? false : (prims[type][item] = true);
        else
            return objs.indexOf(item) >= 0 ? false : objs.push(item);
    });
}
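As a rough check of how this behaves on mixed input, here is the same function run on some made-up sample data (the variable names shared and mixed are only for this illustration). Primitives are separated by type, and objects are compared by reference:

```javascript
// Same function as above, repeated so the sketch runs on its own.
function uniq(a) {
    var prims = {"boolean":{}, "number":{}, "string":{}}, objs = [];

    return a.filter(function(item) {
        var type = typeof item;
        if(type in prims)
            return prims[type].hasOwnProperty(item) ? false : (prims[type][item] = true);
        else
            return objs.indexOf(item) >= 0 ? false : objs.push(item);
    });
}

var shared = {foo: 1};
var mixed = uniq([1, "1", 1, true, "true", shared, {foo: 1}, shared]);

// 1 and "1" are both kept, and so are true and "true".
// The second {foo: 1} is a different object, so it is kept too;
// only the repeated reference to `shared` is dropped.
console.log(mixed.length); // 6
```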

sort | uniq

Another option is to sort the array first, and then remove each element equal to the preceding one:

function uniq(a) {
    return a.sort().filter(function(item, pos, ary) {
        return !pos || item != ary[pos - 1];
    })
}

Again, this doesn’t work with objects (because all objects are equal for sort). Additionally, we silently change the original array as a side effect – not good! However, if your input is already sorted, this is the way to go (just remove sort from the above).
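If the side effect is a concern, one possible workaround is to sort a copy instead. A minimal sketch, assuming string input (the default sort compares values as strings); uniqSorted is a made-up name:

```javascript
// Hypothetical variant: copy the input with slice() before sorting,
// so the caller's array is left untouched.
function uniqSorted(a) {
    return a.slice().sort().filter(function(item, pos, ary) {
        // keep the first element and any element that differs from its predecessor
        return !pos || item !== ary[pos - 1];
    });
}

var input = ["b", "a", "c", "a", "b"];
console.log(uniqSorted(input)); // [ 'a', 'b', 'c' ]
console.log(input);             // [ 'b', 'a', 'c', 'a', 'b' ] (unchanged)
```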

Unique by…

Sometimes it’s desired to uniquify a list based on some criteria other than just equality, for example, to filter out objects that are different, but share some property. This can be done elegantly by passing a callback. This “key” callback is applied to each element, and elements with equal “keys” are removed. Since key is expected to return a primitive, hash table will work fine here:

function uniqBy(a, key) {
    var seen = {};
    return a.filter(function(item) {
        var k = key(item);
        return seen.hasOwnProperty(k) ? false : (seen[k] = true);
    })
}

A particularly useful key() is JSON.stringify which will remove objects that are physically different, but “look” the same:

a = [[1,2,3], [4,5,6], [1,2,3]]
b = uniqBy(a, JSON.stringify)
console.log(b) // [[1,2,3], [4,5,6]]
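One caveat worth knowing: JSON.stringify serializes properties in the order they were defined, so two objects with the same fields in a different order get different keys. A small sketch with made-up data (points and byXY are illustrative names only):

```javascript
// Same hashtable-based uniqBy as above, repeated so the sketch runs on its own.
function uniqBy(a, key) {
    var seen = {};
    return a.filter(function(item) {
        var k = key(item);
        return seen.hasOwnProperty(k) ? false : (seen[k] = true);
    });
}

var points = [{x: 1, y: 2}, {y: 2, x: 1}, {x: 1, y: 2}];

// Property order differs, so the serialized keys differ too.
console.log(JSON.stringify(points[0])); // {"x":1,"y":2}
console.log(JSON.stringify(points[1])); // {"y":2,"x":1}
console.log(uniqBy(points, JSON.stringify).length); // 2

// Hypothetical key function that reads the fields in a fixed order.
var byXY = function(p) { return p.x + "," + p.y; };
console.log(uniqBy(points, byXY).length); // 1
```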

If the key is not primitive, you have to resort to the linear search:

function uniqBy(a, key) {
    var index = [];
    return a.filter(function (item) {
        var k = key(item);
        return index.indexOf(k) >= 0 ? false : index.push(k);
    });
}

In ES6 you can use a Set:

function uniqBy(a, key) {
    let seen = new Set();
    return a.filter(item => {
        let k = key(item);
        return seen.has(k) ? false : seen.add(k);
    });
}

or a Map:

function uniqBy(a, key) {
    return [
        ...new Map(
            a.map(x => [key(x), x])
        ).values()
    ]
}

which both also work with non-primitive keys.
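Keep in mind that Set and Map compare object keys by reference, so "works with non-primitive keys" means the key function has to return the same object for items that should count as equal. A sketch with made-up data (alice, bob and tasks are illustrative only):

```javascript
// Set-based uniqBy, as above.
function uniqBy(a, key) {
    let seen = new Set();
    return a.filter(item => {
        let k = key(item);
        return seen.has(k) ? false : seen.add(k);
    });
}

const alice = {name: "Alice"}, bob = {name: "Bob"};
const tasks = [
    {id: 1, owner: alice},
    {id: 2, owner: bob},
    {id: 3, owner: alice},
];

// The key is the very same object for tasks 1 and 3, so task 3 is dropped.
const byOwner = uniqBy(tasks, t => t.owner).map(t => t.id);
console.log(byOwner); // [ 1, 2 ]

// A fresh array is created on every call, so no two keys are ever equal.
const byFreshKey = uniqBy(tasks, t => [t.owner.name]).map(t => t.id);
console.log(byFreshKey); // [ 1, 2, 3 ]
```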

First or last?

When removing objects by a key, you might want to keep either the first of the “equal” objects or the last one.

Use the Set variant above to keep the first, and the Map to keep the last:

function uniqByKeepFirst(a, key) {
    let seen = new Set();
    return a.filter(item => {
        let k = key(item);
        return seen.has(k) ? false : seen.add(k);
    });
}


function uniqByKeepLast(a, key) {
    return [
        ...new Map(
            a.map(x => [key(x), x])
        ).values()
    ]
}

//

data = [
    {a:1, u:1},
    {a:2, u:2},
    {a:3, u:3},
    {a:4, u:1},
    {a:5, u:2},
    {a:6, u:3},
];

console.log(uniqByKeepFirst(data, it => it.u))
console.log(uniqByKeepLast(data, it => it.u))
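For reference, this is roughly what the two calls produce on that data. The sketch repeats both helpers so it runs on its own and prints only the a field; rows is just a local name for the same sample data. Note that a Map keeps each key at the position where it was first inserted, while the stored value is the last one written:

```javascript
function uniqByKeepFirst(a, key) {
    let seen = new Set();
    return a.filter(item => {
        let k = key(item);
        return seen.has(k) ? false : seen.add(k);
    });
}

function uniqByKeepLast(a, key) {
    // later entries overwrite earlier ones that share a key
    return [...new Map(a.map(x => [key(x), x])).values()];
}

const rows = [
    {a: 1, u: 1}, {a: 2, u: 2}, {a: 3, u: 3},
    {a: 4, u: 1}, {a: 5, u: 2}, {a: 6, u: 3},
];

const firsts = uniqByKeepFirst(rows, it => it.u).map(it => it.a);
const lasts = uniqByKeepLast(rows, it => it.u).map(it => it.a);

console.log(firsts); // [ 1, 2, 3 ]
console.log(lasts);  // [ 4, 5, 6 ]
```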

Libraries

Both underscore and Lo-Dash provide uniq methods. Their algorithms are basically similar to the first snippet above and boil down to this:

var result = [];
a.forEach(function(item) {
     if(result.indexOf(item) < 0) {
         result.push(item);
     }
});

This is quadratic, but there are nice additional goodies, like wrapping native indexOf, the ability to uniquify by a key (iteratee in their parlance), and optimizations for already sorted arrays.

If you’re using jQuery and can’t stand anything without a dollar before it, it goes like this:

$.uniqArray = function(a) {
    return $.grep(a, function(item, pos) {
        return $.inArray(item, a) === pos;
    });
};

which is, again, a variation of the first snippet.

Performance

Function calls are expensive in JavaScript, so the above solutions, concise as they are, are not particularly efficient. For maximal performance, replace filter with a loop and get rid of other function calls:

function uniq_fast(a) {
    var seen = {};
    var out = [];
    var len = a.length;
    var j = 0;
    for(var i = 0; i < len; i++) {
         var item = a[i];
         if(seen[item] !== 1) {
               seen[item] = 1;
               out[j++] = item;
         }
    }
    return out;
}

This chunk of ugly code does the same as the snippet #3 above, but an order of magnitude faster (as of 2017 it’s only twice as fast – JS core folks are doing a great job!)

function uniq(a) {
    var seen = {};
    return a.filter(function(item) {
        return seen.hasOwnProperty(item) ? false : (seen[item] = true);
    });
}

function uniq_fast(a) {
    var seen = {};
    var out = [];
    var len = a.length;
    var j = 0;
    for(var i = 0; i < len; i++) {
         var item = a[i];
         if(seen[item] !== 1) {
               seen[item] = 1;
               out[j++] = item;
         }
    }
    return out;
}

/////

var r = [0,1,2,3,4,5,6,7,8,9],
    a = [],
    LEN = 1000,
    LOOPS = 1000;

while(LEN--)
    a = a.concat(r);

var d = new Date();
for(var i = 0; i < LOOPS; i++)
    uniq(a);
document.write('<br>uniq, ms/loop: ' + (new Date() - d)/LOOPS)

var d = new Date();
for(var i = 0; i < LOOPS; i++)
    uniq_fast(a);
document.write('<br>uniq_fast, ms/loop: ' + (new Date() - d)/LOOPS)

ES6

ES6 provides the Set object, which makes things a whole lot easier:

function uniq(a) {
   return Array.from(new Set(a));
}

or

let uniq = a => [...new Set(a)];

Note that, unlike in Python, ES6 sets are iterated in insertion order, so this code preserves the order of the original array.
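A quick sketch of the ordering, plus one behavioural difference from the indexOf-based filter: Set treats NaN as equal to itself, while indexOf uses === and never finds NaN. The sample array is made up:

```javascript
const values = [3, 1, 3, NaN, 2, NaN, 1];

// Set keeps first-seen order and collapses repeated NaN into one.
const viaSet = [...new Set(values)];
console.log(viaSet); // [ 3, 1, NaN, 2 ]

// indexOf(NaN) is always -1, so it never matches the current position
// and the filter version drops every NaN.
const viaIndexOf = values.filter((item, pos, self) => self.indexOf(item) === pos);
console.log(viaIndexOf); // [ 3, 1, 2 ]
```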

However, if you need an array with unique elements, why not use sets right from the beginning?

Generators

A “lazy”, generator-based version of uniq can be built on the same basis:

  • take the next value from the argument
  • if it’s been seen already, skip it
  • otherwise, yield it and add it to the set of already seen values

function* uniqIter(a) {
    let seen = new Set();

    for (let x of a) {
        if (!seen.has(x)) {
            seen.add(x);
            yield x;
        }
    }
}

// example:

function* randomsBelow(limit) {
    while (1)
        yield Math.floor(Math.random() * limit);
}

// note that randomsBelow is endless

count = 20;
limit = 30;

for (let r of uniqIter(randomsBelow(limit))) {
    console.log(r);
    if (--count === 0)
        break
}

// exercise for the reader: what happens if we set `limit` less than `count` and why
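Since uniqIter accepts any iterable, it can also be used on finite inputs and spread into an array. A small usage sketch (the inputs are made up):

```javascript
// Same generator as above, repeated so the sketch runs on its own.
function* uniqIter(a) {
    let seen = new Set();

    for (let x of a) {
        if (!seen.has(x)) {
            seen.add(x);
            yield x;
        }
    }
}

// Strings are iterable character by character.
const letters = [...uniqIter("mississippi")];
console.log(letters); // [ 'm', 'i', 's', 'p' ]

const nums = [...uniqIter([1, 2, 1, 3, 2])];
console.log(nums); // [ 1, 2, 3 ]
```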

Solution 3:

Got tired of seeing all the bad examples with for-loops or jQuery. JavaScript has the perfect tools for this nowadays: sort, map and reduce.

Uniq reduce while keeping existing order

var names = ["Mike","Matt","Nancy","Adam","Jenny","Nancy","Carl"];

var uniq = names.reduce(function(a,b){
    if (a.indexOf(b) < 0 ) a.push(b);
    return a;
  },[]);

console.log(uniq) // [ 'Mike', 'Matt', 'Nancy', 'Adam', 'Jenny', 'Carl' ]

// one liner
return names.reduce(function(a,b){if(a.indexOf(b)<0)a.push(b);return a;},[]);

Faster uniq with sorting

There are probably faster ways but this one is pretty decent.

var uniq = names.slice() // slice makes copy of array before sorting it
  .sort(function(a,b){
    return a < b ? -1 : a > b ? 1 : 0; // a comparator must return a number, not a boolean
  })
  .reduce(function(a,b){
    if (a.slice(-1)[0] !== b) a.push(b); // slice(-1)[0] means last item in array without removing it (like .pop())
    return a;
  },[]); // this empty array becomes the starting value for a

// one liner
return names.slice().sort(function(a,b){return a < b ? -1 : a > b ? 1 : 0}).reduce(function(a,b){if (a.slice(-1)[0] !== b) a.push(b);return a;},[]);

Update 2015: ES6 version:

In ES6 you have Sets and Spread which makes it very easy and performant to remove all duplicates:

var uniq = [ ...new Set(names) ]; // [ 'Mike', 'Matt', 'Nancy', 'Adam', 'Jenny', 'Carl' ]

Sort based on occurrence:

Someone asked about ordering the results based on how many times each name occurs:

var names = ['Mike', 'Matt', 'Nancy', 'Adam', 'Jenny', 'Nancy', 'Carl']

var uniq = names
  .map((name) => {
    return {count: 1, name: name}
  })
  .reduce((a, b) => {
    a[b.name] = (a[b.name] || 0) + b.count
    return a
  }, {})

var sorted = Object.keys(uniq).sort((a, b) => uniq[b] - uniq[a]) // most frequent first

console.log(sorted)
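The same idea can also be sketched with a Map as the counter, which skips the intermediate map step. This is only one possible variant; counts and byCount are made-up names:

```javascript
const names = ['Mike', 'Matt', 'Nancy', 'Adam', 'Jenny', 'Nancy', 'Carl'];

// Count occurrences; a Map keeps its keys in first-insertion order.
const counts = new Map();
for (const name of names) {
    counts.set(name, (counts.get(name) || 0) + 1);
}

// Sort descending by count. Array.prototype.sort is stable in ES2019+,
// so names with equal counts keep their original relative order.
const byCount = [...counts.keys()].sort((a, b) => counts.get(b) - counts.get(a));

console.log(byCount); // [ 'Nancy', 'Mike', 'Matt', 'Adam', 'Jenny', 'Carl' ]
```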

Solution 4:

Vanilla JS: Remove duplicates using an Object like a Set

You can always try putting it into an object, and then iterating through its keys:

function remove_duplicates(arr) {
    var obj = {};
    var ret_arr = [];
    for (var i = 0; i < arr.length; i++) {
        obj[arr[i]] = true;
    }
    for (var key in obj) {
        ret_arr.push(key);
    }
    return ret_arr;
}
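Two things to be aware of with this approach: the results are always strings, because object keys are strings, and integer-like keys are enumerated in ascending numeric order rather than insertion order. A short sketch with made-up inputs:

```javascript
// Same function as above, repeated so the sketch runs on its own.
function remove_duplicates(arr) {
    var obj = {};
    var ret_arr = [];
    for (var i = 0; i < arr.length; i++) {
        obj[arr[i]] = true;
    }
    for (var key in obj) {
        ret_arr.push(key);
    }
    return ret_arr;
}

// Numbers come back as strings, sorted numerically:
console.log(remove_duplicates([3, 1, 2, 1]));    // [ '1', '2', '3' ]

// Non-numeric string keys keep their insertion order:
console.log(remove_duplicates(["b", "a", "b"])); // [ 'b', 'a' ]
```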

Vanilla JS: Remove duplicates by tracking already seen values (order-safe)

Or, for an order-safe version, use an object to store all previously seen values, and check each value against it before adding it to the result array.

function remove_duplicates_safe(arr) {
    var seen = {};
    var ret_arr = [];
    for (var i = 0; i < arr.length; i++) {
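        // hasOwnProperty instead of `in`, so inherited names like "toString" are not treated as already seen
        if (!Object.prototype.hasOwnProperty.call(seen, arr[i])) {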
            ret_arr.push(arr[i]);
            seen[arr[i]] = true;
        }
    }
    return ret_arr;
}

ECMAScript 6: Use the new Set data structure (order-safe)

ECMAScript 6 adds the Set data structure, which stores unique values of any type. Set.prototype.values() returns an iterator over the elements in insertion order.

function remove_duplicates_es6(arr) {
    let s = new Set(arr);
    let it = s.values();
    return Array.from(it);
}

Example usage:

a = ["Mike","Matt","Nancy","Adam","Jenny","Nancy","Carl"];

b = remove_duplicates(a);
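// b (current engines enumerate non-numeric string keys in insertion order;
//    integer-like keys would come first, in ascending order):
// ["Mike", "Matt", "Nancy", "Adam", "Jenny", "Carl"]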

c = remove_duplicates_safe(a);
// c:
// ["Mike", "Matt", "Nancy", "Adam", "Jenny", "Carl"]

d = remove_duplicates_es6(a);
// d:
// ["Mike", "Matt", "Nancy", "Adam", "Jenny", "Carl"]

Solution 5:

Use Underscore.js

It’s a library with a host of functions for manipulating arrays.

It’s the tie to go along with jQuery’s tux, and Backbone.js’s suspenders.

_.uniq

_.uniq(array, [isSorted], [iterator]) Alias: unique
Produces a duplicate-free version of the array, using === to test object
equality. If you know in advance that the array is sorted, passing
true for isSorted will run a much faster algorithm. If you want to
compute unique items based on a transformation, pass an iterator
function.

Example

var names = ["Mike","Matt","Nancy","Adam","Jenny","Nancy","Carl"];

alert(_.uniq(names, false));

Note: Lo-Dash (an underscore competitor) also offers a comparable _.uniq implementation.

Solution 6:

A single line version using array filter and indexOf functions:

arr = arr.filter(function (value, index, array) {
    return array.indexOf(value) == index;
});