I'm checking if two strings
b are permutations of each other, and I'm wondering what the ideal way to do this is in Python. From the Zen of Python, "There should be one -- and preferably only one -- obvious way to do it," but I see there are at least two ways:
sorted(a) == sorted(b)
all(a.count(char) == b.count(char) for char in a)
but the first one is slower when (for example) the first char of
a is nowhere in
b, and the second is slower when they are actually permutations.
Is there any better (either in the sense of more Pythonic, or in the sense of faster on average) way to do it? Or should I just choose from these two depending on which situation I expect to be most common?
heuristically you're probably better to split them off based on string size.
returnvalue = false if len(a) == len(b) if len(a) < threshold returnvalue = (sorted(a) == sorted(b)) else returnvalue = naminsmethod(a, b) return returnvalue
If performance is critical, and string size can be large or small then this is what I'd do.
It's pretty common to split things like this based on input size or type. Algorithms have different strengths or weaknesses and it would be foolish to use one where another would be better... In this case Namin's method is O(n), but has a larger constant factor than the O(n log n) sorted method.
Here is a way which is O(n), asymptotically better than the two ways you suggest.
import collections def same_permutation(a, b): d = collections.defaultdict(int) for x in a: d[x] += 1 for x in b: d[x] -= 1 return not any(d.itervalues()) ## same_permutation([1,2,3],[2,3,1]) #. True ## same_permutation([1,2,3],[2,3,1,1]) #. False