I need the following function:
Input: a list
Output:
True
if all elements in the input list evaluate as equal to each other using the standard equality operator; False
otherwise.Performance: of course, I prefer not to incur any unnecessary overhead.
I feel it would be best to:
AND
all the resulting Boolean valuesBut I'm not sure what's the most Pythonic way to do that.
EDIT:
Thank you for all the great answers. I rated up several, and it was really hard to choose between @KennyTM and @Ivo van der Wijk solutions.
The lack of short-circuit feature only hurts on a long input (over ~50 elements) that have unequal elements early on. If this occurs often enough (how often depends on how long the lists might be), the short-circuit is required. The best short-circuit algorithm seems to be @KennyTM checkEqual1
. It pays, however, a significant cost for this:
If the long inputs with early unequal elements don't happen (or happen sufficiently rarely), short-circuit isn't required. Then, by far the fastest is @Ivo van der Wijk solution.
General method:
def checkEqual1(iterator):
iterator = iter(iterator)
try:
first = next(iterator)
except StopIteration:
return True
return all(first == rest for rest in iterator)
One-liner:
def checkEqual2(iterator):
return len(set(iterator)) <= 1
Also one-liner:
def checkEqual3(lst):
return lst[1:] == lst[:-1]
The difference between the 3 versions are that:
checkEqual2
the content must be hashable.checkEqual1
and checkEqual2
can use any iterators, but checkEqual3
must take a sequence input, typically concrete containers like a list or tuple.checkEqual1
stops as soon as a difference is found.checkEqual1
contains more Python code, it is less efficient when many of the items are equal in the beginning.checkEqual2
and checkEqual3
always perform O(N) copying operations, they will take longer if most of your input will return False.checkEqual2
and checkEqual3
it's harder to adapt comparison from a == b
to a is b
.timeit
result, for Python 2.7 and (only s1, s4, s7, s9 should return True)
s1 = [1] * 5000
s2 = [1] * 4999 + [2]
s3 = [2] + [1]*4999
s4 = [set([9])] * 5000
s5 = [set([9])] * 4999 + [set([10])]
s6 = [set([10])] + [set([9])] * 4999
s7 = [1,1]
s8 = [1,2]
s9 = []
we get
| checkEqual1 | checkEqual2 | checkEqual3 | checkEqualIvo | checkEqual6502 |
|-----|-------------|-------------|--------------|---------------|----------------|
| s1 | 1.19 msec | 348 usec | 183 usec | 51.6 usec | 121 usec |
| s2 | 1.17 msec | 376 usec | 185 usec | 50.9 usec | 118 usec |
| s3 | 4.17 usec | 348 usec | 120 usec | 264 usec | 61.3 usec |
| | | | | | |
| s4 | 1.73 msec | | 182 usec | 50.5 usec | 121 usec |
| s5 | 1.71 msec | | 181 usec | 50.6 usec | 125 usec |
| s6 | 4.29 usec | | 122 usec | 423 usec | 61.1 usec |
| | | | | | |
| s7 | 3.1 usec | 1.4 usec | 1.24 usec | 0.932 usec | 1.92 usec |
| s8 | 4.07 usec | 1.54 usec | 1.28 usec | 0.997 usec | 1.79 usec |
| s9 | 5.91 usec | 1.25 usec | 0.749 usec | 0.407 usec | 0.386 usec |
Note:
# http://stackoverflow.com/q/3844948/
def checkEqualIvo(lst):
return not lst or lst.count(lst[0]) == len(lst)
# http://stackoverflow.com/q/3844931/
def checkEqual6502(lst):
return not lst or [lst[0]]*len(lst) == lst
A solution faster than using set() that works on sequences (not iterables) is to simply count the first element. This assumes the list is non-empty (but that's trivial to check, and decide yourself what the outcome should be on an empty list)
x.count(x[0]) == len(x)
some simple benchmarks:
>>> timeit.timeit('len(set(s1))<=1', 's1=[1]*5000', number=10000)
1.4383411407470703
>>> timeit.timeit('len(set(s1))<=1', 's1=[1]*4999+[2]', number=10000)
1.4765670299530029
>>> timeit.timeit('s1.count(s1[0])==len(s1)', 's1=[1]*5000', number=10000)
0.26274609565734863
>>> timeit.timeit('s1.count(s1[0])==len(s1)', 's1=[1]*4999+[2]', number=10000)
0.25654196739196777