Getting a string from a pandas Series or DataFrame in Python?


Question

I have this dataframe in pandas:

import pandas
d = pandas.DataFrame([{"a": 1, "b": 1}, {"c": 2, "b": 4}])
d["name"] = ["Hello", "World"]

I want to select an element based on its string value in the "name" column and then get that value as a string. To select the element:

d[d["name"] == "World"]["name"]
Out:
1    World
Name: name

The problem is that this returns a Series rather than a plain string. Casting to a string won't help -- how can I just get the string "World" out of this? Is this the only way?

d[d["name"] == "World"]["name"].values[0]

thanks.

3/29/2013 2:18:55 PM

Accepted Answer

As @DSM points out, in general there could be many rows with name 'World', so somewhere down the line we'll need to pick one.

One fairly neat way to do this is to use where (and then max):

In [11]: d.name.where(d.name == 'World', np.nan)
Out[11]: 
0      NaN
1    World
Name: name, dtype: object

In [12]: d.name.where(d.name == 'World', np.nan).max()
Out[12]: 'World'

Note: if there is no row with name 'World' this will return NaN.
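
As a rough sketch of the "pick one" step, the snippet below takes the first matching row by position and falls back to a default when nothing matches. The helper name first_match and its default parameter are made up for illustration; they are not part of pandas.

```python
import pandas as pd

d = pd.DataFrame([{"a": 1, "b": 1}, {"c": 2, "b": 4}])
d["name"] = ["Hello", "World"]


def first_match(frame, column, value, default=None):
    """Hypothetical helper: first entry of `column` equal to `value`, else `default`."""
    # Boolean mask plus column label in a single .loc call.
    matches = frame.loc[frame[column] == value, column]
    # .iloc[0] is positional, so it works whatever the index labels are.
    return matches.iloc[0] if not matches.empty else default


result = first_match(d, "name", "World")    # a plain Python string
missing = first_match(d, "name", "Nobody")  # no match, so the default (None)
```

Taking the first match is only one possible policy; if several rows can match, you may prefer to raise an error or return all of them.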

3/29/2013 3:42:25 PM

There's one method that no one has mentioned that might be worth noting. I ran into this while doing multiple criteria checks that returned a single-item Series (basically a unique row result). If you have a single item in a Series and just need that item, or you know the position of the particular item you want, just do this:

d[d["name"] == "World"]["name"].tolist()[0]

for the first (and only) item in a single item Series.

Or this:

d[d["name"] == "World"]["name"].tolist()[index]

where index is the position of the item you are looking for in the resulting list (not its index label in the Series).

If you need the result as a string and the value is not one already, cast it with str().
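
Here is a minimal sketch of that approach with more than one criterion, assuming the frame from the question. The second condition on column "b" is invented for illustration. Note that tolist() is a Series method, so the column has to be selected first.

```python
import pandas as pd

d = pd.DataFrame([{"a": 1, "b": 1}, {"c": 2, "b": 4}])
d["name"] = ["Hello", "World"]

# Combine criteria with &, wrapping each condition in parentheses.
mask = (d["name"] == "World") & (d["b"] == 4)
matches = d.loc[mask, "name"]  # a Series holding the matching names

values = matches.tolist()  # plain Python list
first = values[0]          # raises IndexError if nothing matched
only = matches.item()      # raises ValueError unless exactly one element

# A non-string value needs an explicit cast to become a string.
as_text = str(d.loc[mask, "b"].iloc[0])
```

Series.item() may be the stricter choice when the result is supposed to be unique, since it fails loudly if the Series holds zero or several elements.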


Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow