What is an efficient way to find the most common element in a Python list?

My list items may not be hashable, so I can't use a dictionary. Also, in case of ties, the item with the lowest index should be returned. Example:

```
>>> most_common(['duck', 'duck', 'goose'])
'duck'
>>> most_common(['goose', 'duck', 'duck', 'goose'])
'goose'
```

With so many solutions proposed, I'm amazed nobody's proposed what I'd consider an obvious one (for non-hashable but comparable elements): [`itertools.groupby`](https://docs.python.org/3/library/itertools.html#itertools.groupby). `itertools` offers fast, reusable functionality, and lets you delegate some tricky logic to well-tested standard library components. Consider for example:

```
import itertools
import operator

def most_common(L):
    # get an iterable of (item, original index) pairs
    SL = sorted((x, i) for i, x in enumerate(L))
    # print('SL:', SL)
    groups = itertools.groupby(SL, key=operator.itemgetter(0))
    # auxiliary function to get the "quality" for an item
    def _auxfun(g):
        item, iterable = g
        count = 0
        min_index = len(L)
        for _, where in iterable:
            count += 1
            min_index = min(min_index, where)
        # print('item %r, count %r, minind %r' % (item, count, min_index))
        return count, -min_index
    # pick the highest-count / earliest item
    return max(groups, key=_auxfun)[0]
```

This could be written more concisely, of course, but I'm aiming for maximal clarity. The two `print` calls can be uncommented to better see the machinery in action; for example, *with* prints uncommented:

```
print(most_common(['goose', 'duck', 'duck', 'goose']))
```

emits:

```
SL: [('duck', 1), ('duck', 2), ('goose', 0), ('goose', 3)]
item 'duck', count 2, minind 1
item 'goose', count 2, minind 0
goose
```

As you see, `SL` is a list of pairs, each pair an item followed by the item's index in the original list (implementing the key condition that, if more than one item shares the highest count, the result must be the earliest-occurring one).
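To see that pairing in isolation (values match the trace above), here's a minimal sketch:

```
L = ['goose', 'duck', 'duck', 'goose']
# sort (item, index) pairs so equal items become adjacent,
# which is exactly what groupby needs
SL = sorted((x, i) for i, x in enumerate(L))
print(SL)  # [('duck', 1), ('duck', 2), ('goose', 0), ('goose', 3)]
```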

`groupby` groups by the item only (via `operator.itemgetter`). The auxiliary function, called once per group during the `max` computation, receives and internally unpacks a group, a tuple with two items `(item, iterable)` where the iterable's items are also two-item tuples, `(item, original index)` [[the items of `SL`]].

Then the auxiliary function uses a loop to determine both the count of entries in the group's iterable *and* the minimum original index; it returns those as a combined "quality" key, with the min index sign-changed so the `max` operation will consider "better" those items that occurred earlier in the original list.
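Since Python compares tuples lexicographically, these `(count, -min_index)` keys can be fed to `max` directly; a tiny sketch using the two quality tuples from the trace above:

```
# counts tie at 2, so the second element decides: 0 > -1,
# i.e. the item with min_index 0 ('goose') wins under max()
candidates = [('duck', (2, -1)), ('goose', (2, 0))]
print(max(candidates, key=lambda g: g[1])[0])  # goose
```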

This code could be much simpler if it worried a *little* less about big-O issues in time and space, e.g....:

```
def most_common(L):
    groups = itertools.groupby(sorted(L))
    def _auxfun(g):
        item, iterable = g
        return len(list(iterable)), -L.index(item)
    return max(groups, key=_auxfun)[0]
```

same basic idea, just expressed more simply and compactly... but, alas, an extra O(N) auxiliary space (to embody the groups' iterables as lists) and O(N squared) time (to get the `L.index` of every item). While premature optimization is the root of all evil in programming, deliberately picking an O(N squared) approach when an O(N log N) one is available just goes too much against the grain of scalability!-)

Finally, for those who prefer "oneliners" to clarity and performance, a bonus 1-liner version with suitably mangled names:-).

```
from itertools import groupby as g
def most_common_oneliner(L):
    return max(g(sorted(L)), key=lambda xv: (len(list(xv[1])), -L.index(xv[0])))[0]
```

A simpler one-liner (note: this requires hashable items and, on a tie, may return any of the tied items rather than the earliest-occurring one):

```
def most_common(lst):
    return max(set(lst), key=lst.count)
```
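For hashable items, `collections.Counter` offers another compact option; the sketch below (the function name is mine) leans on the documented behavior that `most_common` orders elements with equal counts in first-encountered order (Python 3.7+), so ties go to the earliest item:

```
from collections import Counter

def most_common_hashable(lst):
    # most_common(1) returns the single highest-count (item, count) pair;
    # elements with equal counts keep first-seen order
    return Counter(lst).most_common(1)[0][0]

print(most_common_hashable(['goose', 'duck', 'duck', 'goose']))  # goose
```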

Licensed under: CC-BY-SA with attribution

Not affiliated with: Stack Overflow