To extract non-nan values from multiple rows in a pandas dataframe


Question

I am working on several taxi datasets. I have used pandas to concatenate all the datasets into a single dataframe.

My dataframe looks something like this.

                     675                    1039                   # and the remaining 125 taxis
                     longitude  latitude    longitude  latitude
date
2008-02-02 13:31:21  116.56359  40.06489    NaN        NaN
2008-02-02 13:31:51  116.56486  40.06415    NaN        NaN
2008-02-02 13:32:21  116.56855  40.06352    116.58243  39.6313
2008-02-02 13:32:51  116.57127  40.06324    NaN        NaN
2008-02-02 13:33:21  116.57120  40.06328    116.55134  39.6313
2008-02-02 13:33:51  116.57121  40.06329    116.55126  39.6123
2008-02-02 13:34:21  NaN        NaN         116.55134  39.5123

where 675 and 1039 are the taxi ids. In total there are 127 taxis, each with its own longitude and latitude columns.

I have several ways to extract the non-null values for a single row:

df.ix[0,df.columns[np.isnan(df.irow(0))!=1]]
              (or)
df.irow(0)[np.isnan(df.irow(0))!=1]
              (or)
df.irow(0)[np.where(df.irow(0)[df.columns].notnull())[0]]

Any of the above commands will return:

675   longitude    116.56359
      latitude     40.064890 
4549  longitude    116.34642
      latitude      39.96662
Name: 2008-02-02 13:31:21
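
For readers on a recent pandas, where irow and ix are no longer available, the single-row extraction above can be sketched with iloc and dropna. This is only an illustration: the two-taxi frame below is a hypothetical sample that mirrors the layout in the question, not the real 127-taxi data.

```python
import numpy as np
import pandas as pd

# Hypothetical two-taxi sample with (taxi id, coordinate) column pairs.
columns = pd.MultiIndex.from_product([[675, 1039], ["longitude", "latitude"]])
index = pd.DatetimeIndex(
    ["2008-02-02 13:31:21", "2008-02-02 13:32:21"], name="date"
)
df = pd.DataFrame(
    [
        [116.56359, 40.06489, np.nan, np.nan],
        [116.56855, 40.06352, 116.58243, 39.6313],
    ],
    index=index,
    columns=columns,
)

# Select the first row by position, then drop its NaN entries.
first_row = df.iloc[0].dropna()
print(first_row)
```

The result is a Series indexed by (taxi id, coordinate) pairs that keeps only the taxis reporting a position at that timestamp.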

Now I want to extract all the non-null values from the first few rows (say, from row 1 to row 6).

How do I do that?

I could probably loop over the rows, but I want a non-looped way of doing it.

Any help or suggestions are welcome. Thanks in advance! :)

1
5
4/15/2013 1:54:49 PM

df.ix[1:6].dropna(axis=1)

As a heads up, irow will be deprecated in the next release of pandas. New methods, with clearer usage, replace it.

http://pandas.pydata.org/pandas-docs/dev/indexing.html#deprecations
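
On a recent pandas, where ix is also gone, the same idea can be sketched with iloc. The frame below is a hypothetical reconstruction of the two taxis shown in the question. Note that dropna(axis=1) keeps only the columns with no NaN anywhere in the slice; the how="all" variant and the per-taxi selection are assumptions about what "all the non-null values" might mean, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical reconstruction of the sample rows from the question.
columns = pd.MultiIndex.from_product([[675, 1039], ["longitude", "latitude"]])
index = pd.DatetimeIndex(
    [
        "2008-02-02 13:31:21", "2008-02-02 13:31:51", "2008-02-02 13:32:21",
        "2008-02-02 13:32:51", "2008-02-02 13:33:21", "2008-02-02 13:33:51",
        "2008-02-02 13:34:21",
    ],
    name="date",
)
df = pd.DataFrame(
    [
        [116.56359, 40.06489, np.nan, np.nan],
        [116.56486, 40.06415, np.nan, np.nan],
        [116.56855, 40.06352, 116.58243, 39.6313],
        [116.57127, 40.06324, np.nan, np.nan],
        [116.57120, 40.06328, 116.55134, 39.6313],
        [116.57121, 40.06329, 116.55126, 39.6123],
        [np.nan, np.nan, 116.55134, 39.5123],
    ],
    index=index,
    columns=columns,
)

# Positional slice: rows 1 through 5 (the end bound 6 is excluded).
window = df.iloc[1:6]

# Keep only the columns that have no NaN anywhere in the slice.
complete = window.dropna(axis=1)

# Alternative: drop a column only if it is NaN in every row of the slice.
partial = window.dropna(axis=1, how="all")

# Non-null readings for a single taxi within the slice.
taxi_1039 = window[1039].dropna()

print(complete)
print(taxi_1039)
```

Which variant fits depends on whether a taxi with a partial track inside the window should be kept or dropped.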

4
4/15/2013 2:00:40 PM

Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow