Filtering a grouped DataFrame in pandas


Question

I am creating a groupby object from a pandas DataFrame and want to select all the groups that contain more than one row.

The following doesn't seem to work:

grouped[grouped.size > 1 ]

Also, how can one filter out certain values from a grouped DataFrame? For example, how could I remove all the rows from grouped where the column 'name' has a value 'foo' or 'bar'?

Contrived Example:

import pandas

# Three rows fall in group 'foo' and one row in group 'bar'.
df = pandas.DataFrame({'A': ['foo', 'bar', 'foo', 'foo'],
                       'B': range(4)})
grouped = df.groupby('A')

I need the groupby object after removing the groups that have a group size <= 1.

I tried the following, which didn't work:

grouped[grouped.size() > 1]

I expected:

A
foo 0
    2
    3

I am not sure how indexing/slicing works for the grouped object.

3/10/2015 11:44:10 PM

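As of pandas 0.12 you can call filter on the groupby object. It returns a DataFrame (not a groupby object) containing only the rows of the groups for which the function returns True: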

>>> grouped.filter(lambda x: len(x) > 1)

     A  B
0  foo  0
2  foo  2
3  foo  3
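
For completeness, here is a minimal sketch that applies the filter above to the example from the question and also touches the two follow-up points: getting a groupby object back, and dropping rows by value. The variable names (kept, mask, regrouped, unwanted, without_unwanted) are made up for illustration, and the choice of 'bar' as the unwanted value is an assumption, since the question's 'name' column does not appear in its example.

```python
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'foo'],
                   'B': range(4)})

# Keep only the rows whose group in column 'A' has more than one member.
# filter() returns a plain DataFrame, not a groupby object.
kept = df.groupby('A').filter(lambda g: len(g) > 1)

# Alternative: broadcast each group's size back onto its rows and use
# the result as an ordinary boolean mask on the original frame.
mask = df.groupby('A')['B'].transform('size') > 1
kept_via_mask = df[mask]

# If a groupby object is needed afterwards, group the filtered frame again.
regrouped = kept.groupby('A')

# Removing rows by value is plain boolean indexing, done before grouping.
# 'unwanted' is a hypothetical list; adjust it to the values to exclude.
unwanted = ['bar']
without_unwanted = df[~df['A'].isin(unwanted)]

print(kept)
```

The mask-based variant tends to be faster on frames with many groups because it avoids calling a Python function once per group, though for small data either form should be fine.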
8/15/2013 10:03:39 PM

Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow