# numpy: most efficient frequency counts for unique values in an array

### Question

In `numpy` / `scipy`, is there an efficient way to get frequency counts for unique values in an array?

Something along these lines:

``````
x = array([1, 1, 1, 2, 2, 2, 5, 25, 1, 1])
y = freq_count(x)
print(y)

>> [[1, 5], [2, 3], [5, 1], [25, 1]]
``````

(For the R users out there: I'm basically looking for an equivalent of the `table()` function.)

1
201
8/7/2015 9:51:25 AM

Take a look at `np.bincount`:

http://docs.scipy.org/doc/numpy/reference/generated/numpy.bincount.html

``````
import numpy as np
x = np.array([1, 1, 1, 2, 2, 2, 5, 25, 1, 1])
y = np.bincount(x)       # y[v] is the number of times v occurs in x
ii = np.nonzero(y)[0]    # the values that occur at least once
``````

And then:

``````
list(zip(ii, y[ii]))   # in Python 3, zip returns an iterator, so wrap it in list()
# (value, count) pairs: (1, 5), (2, 3), (5, 1), (25, 1)
``````

or:

``````
np.vstack((ii, y[ii])).T
# array([[ 1,  5],
#        [ 2,  3],
#        [ 5,  1],
#        [25,  1]])
``````

or however you want to combine the counts and the unique values.
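For instance, the two steps can be wrapped into the `freq_count` helper the question asks for. This is only a minimal sketch: the function name comes from the question, not from NumPy, and it assumes the input holds non-negative integers. `np.bincount` raises an error for negative or non-integer input, and it allocates an array of length `max(x) + 1`, so very large values cost memory.

```python
import numpy as np

def freq_count(x):
    """Return rows of (value, count) for a 1-D array of non-negative integers.

    Hypothetical helper named after the question; not part of NumPy.
    """
    x = np.asarray(x)
    counts = np.bincount(x)            # counts[v] = number of occurrences of v
    values = np.nonzero(counts)[0]     # keep only the values that actually occur
    return np.column_stack((values, counts[values]))

table = freq_count([1, 1, 1, 2, 2, 2, 5, 25, 1, 1])
print(table.tolist())  # [[1, 5], [2, 3], [5, 1], [25, 1]]
```

If the data may contain negative numbers, floats, or strings, `np.unique` is the safer choice.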

131
5/24/2012 4:53:05 PM

As of NumPy 1.9, the easiest and fastest method is to use `numpy.unique`, which has a `return_counts` keyword argument:

``````
import numpy as np

x = np.array([1, 1, 1, 2, 2, 2, 5, 25, 1, 1])
unique, counts = np.unique(x, return_counts=True)

print(np.asarray((unique, counts)).T)
``````

Which gives:

``````
[[ 1  5]
 [ 2  3]
 [ 5  1]
 [25  1]]
``````
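Because `np.unique` sorts the values rather than using them as indices, the same call should also work for data that is not made of small non-negative integers. A minimal sketch, using made-up string and negative-integer inputs, that collects the result into a plain dictionary:

```python
import numpy as np

# Illustrative inputs (not from the original question).
words = np.array(["b", "a", "b", "c", "a", "b"])
values, counts = np.unique(words, return_counts=True)

# .tolist() converts NumPy scalars to plain Python objects.
freq = dict(zip(values.tolist(), counts.tolist()))
print(freq)  # {'a': 2, 'b': 3, 'c': 1}

# Negative integers are handled as well.
neg_values, neg_counts = np.unique(np.array([-1, -1, 3]), return_counts=True)
neg_freq = dict(zip(neg_values.tolist(), neg_counts.tolist()))
print(neg_freq)  # {-1: 2, 3: 1}
```

The unique values come back sorted, so the dictionary keys appear in ascending order.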

A quick comparison with `scipy.stats.itemfreq` (which has since been deprecated in SciPy 1.0 and removed in later releases):

``````
In [4]: x = np.random.random_integers(0,100,1e6)

In [5]: %timeit unique, counts = np.unique(x, return_counts=True)
10 loops, best of 3: 31.5 ms per loop

In [6]: %timeit scipy.stats.itemfreq(x)
10 loops, best of 3: 170 ms per loop
``````
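To repeat this kind of measurement outside IPython, something like the following sketch could be used. It is not the original benchmark: it compares `np.unique` against the `np.bincount` approach from the earlier answer, generates the data with `np.random.default_rng`, and uses the standard-library `timeit` module. The function names `with_unique` and `with_bincount` are made up for the example, and the timings will depend on the machine and library versions.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
# One million integers in [0, 100]; the upper bound of integers() is exclusive.
x = rng.integers(0, 101, size=1_000_000)

def with_unique():
    return np.unique(x, return_counts=True)

def with_bincount():
    counts = np.bincount(x)
    values = np.nonzero(counts)[0]
    return values, counts[values]

# Sanity check: both approaches should give the same values and counts.
u_vals, u_counts = with_unique()
b_vals, b_counts = with_bincount()
assert np.array_equal(u_vals, b_vals)
assert np.array_equal(u_counts, b_counts)

for fn in (with_unique, with_bincount):
    # Best of 3 repeats, 10 calls each, reported per call.
    seconds = min(timeit.repeat(fn, number=10, repeat=3)) / 10
    print(f"{fn.__name__}: {seconds * 1e3:.2f} ms per call")
```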