How do I check if there are duplicates in a flat list?


Question

For example, given the list ['one', 'two', 'one'], the algorithm should return True, whereas given ['one', 'two', 'three'] it should return False.

1
157
4/2/2018 11:29:26 AM

Accepted Answer

Use set() to remove duplicates if all values are hashable:

>>> your_list = ['one', 'two', 'one']
>>> len(your_list) != len(set(your_list))
True
349
2/20/2016 6:17:04 PM

Recommended for short lists only:

any(thelist.count(x) > 1 for x in thelist)

Do not use on a long list -- it can take time proportional to the square of the number of items in the list!

For longer lists with hashable items (strings, numbers, &c):

def anydup(thelist):
  seen = set()
  for x in thelist:
    if x in seen: return True
    seen.add(x)
  return False

If your items are not hashable (sublists, dicts, etc) it gets hairier, though it may still be possible to get O(N logN) if they're at least comparable. But you need to know or test the characteristics of the items (hashable or not, comparable or not) to get the best performance you can -- O(N) for hashables, O(N log N) for non-hashable comparables, otherwise it's down to O(N squared) and there's nothing one can do about it:-(.


Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow
Icon