Pandas Drop Rows Outside of Time Range


Question

I am trying to go through every row of a DataFrame with a datetime index and remove all rows whose time does not fall within a certain range.

I have been looking for solutions, but none of them separate the date from the time; all I want to do is drop the rows that fall outside a time range.

1/26/2013 6:15:35 PM

Accepted Answer

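You can use the between_time method directly. In pandas 1.4 and later the boundaries are controlled by the inclusive argument (older versions used include_start=False, include_end=False instead):

ts.between_time(datetime.time(18), datetime.time(9), inclusive="neither")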
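
As a rough, self-contained sketch of how between_time behaves, the snippet below builds an hourly series whose values are simply the hour of day (an illustrative choice, not part of the original answer). It assumes pandas 1.4 or later for the inclusive keyword; the names overnight and daytime are made up for the example.

```python
import datetime

import pandas as pd

# Hourly series covering one day; each value is just the hour number.
rng = pd.date_range("2000-01-01", periods=24, freq="60min")
ts = pd.Series(range(24), index=rng)

# Keep rows whose time of day is strictly between 18:00 and 09:00.
# Because the start is later than the end, the window wraps around midnight.
# inclusive="neither" assumes pandas >= 1.4.
overnight = ts.between_time(datetime.time(18), datetime.time(9), inclusive="neither")

# Keep rows between 09:00 and 18:00, endpoints included (the default).
daytime = ts.between_time(datetime.time(9), datetime.time(18))

print(len(overnight), len(daytime))  # 14 10
```

The two selections are complementary here: every row lands in exactly one of them.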

Original answer:

You can use the DatetimeIndex method indexer_between_time. It returns integer positions, so pair it with iloc (the ix indexer used in older pandas has since been removed).

For example, to include those times between 9am and 6pm (inclusive):

ts.iloc[ts.index.indexer_between_time(datetime.time(9), datetime.time(18))]

To do the opposite and keep only those times between 6pm and 9am (exclusive):

ts.iloc[ts.index.indexer_between_time(datetime.time(18), datetime.time(9),
                                      include_start=False, include_end=False)]

Note: indexer_between_time's include_start and include_end arguments both default to True. Setting include_start to False means that datetimes whose time part is exactly start_time (the first argument, in this case 6pm) are not included.
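
To make the boundary behaviour concrete, here is a small sketch on an hourly index. The variable names (both, no_start, neither) are hypothetical, and the positions equal the hours only because the index starts at midnight.

```python
import datetime

import pandas as pd

# Hourly DatetimeIndex for one day, so position i corresponds to hour i.
rng = pd.date_range("2000-01-01", periods=24, freq="60min")

start, end = datetime.time(10), datetime.time(14)

# Defaults: both endpoints are included.
both = rng.indexer_between_time(start, end)

# Exclude rows whose time is exactly the start time (10:00).
no_start = rng.indexer_between_time(start, end, include_start=False)

# Exclude both endpoints (10:00 and 14:00).
neither = rng.indexer_between_time(start, end,
                                   include_start=False, include_end=False)

print(list(both))      # positions 10 through 14
print(list(no_start))  # positions 11 through 14
print(list(neither))   # positions 11 through 13
```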

Example (assuming datetime, numpy as np and pandas as pd have been imported):

In [1]: rng = pd.date_range('1/1/2000', periods=24, freq='H')

In [2]: ts = pd.Series(np.random.randn(len(rng)), index=rng)

In [3]: ts.iloc[ts.index.indexer_between_time(datetime.time(10), datetime.time(14))]
Out[3]: 
2000-01-01 10:00:00    1.312561
2000-01-01 11:00:00   -1.308502
2000-01-01 12:00:00   -0.515339
2000-01-01 13:00:00    1.536540
2000-01-01 14:00:00    0.108617

Note: the same syntax (using iloc) works for a DataFrame:

In [4]: df = pd.DataFrame(ts)

In [5]: df.iloc[df.index.indexer_between_time(datetime.time(10), datetime.time(14))]
Out[5]: 
                            0
2000-01-01 10:00:00  1.312561
2000-01-01 11:00:00 -1.308502
2000-01-01 12:00:00 -0.515339
2000-01-01 13:00:00  1.536540
2000-01-01 14:00:00  0.108617
7/10/2018 6:12:19 PM

You can also do:

import datetime
import numpy as np
import pandas as pd

rng = pd.date_range('1/1/2000', periods=24, freq='H')
ts = pd.Series(np.random.randn(len(rng)), index=rng)
ts.loc[datetime.time(10):datetime.time(14)]
Out[4]: 
2000-01-01 10:00:00   -0.363420
2000-01-01 11:00:00   -0.979251
2000-01-01 12:00:00   -0.896648
2000-01-01 13:00:00   -0.051159
2000-01-01 14:00:00   -0.449192
Freq: H, dtype: float64

A DataFrame works the same way.
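
For the DataFrame case in the question, a minimal sketch might look like the following. The column name value, the 6-hourly sample data, and the 06:00 to 12:00 window are all assumptions chosen for illustration. The data spans two days to show that only the time of day matters, not the date.

```python
import datetime

import pandas as pd

# Two days of 6-hourly rows, so each time of day occurs on both dates.
rng = pd.date_range("2000-01-01", periods=8, freq="360min")
df = pd.DataFrame({"value": range(8)}, index=rng)

start, end = datetime.time(6), datetime.time(12)

# Option 1: keep the rows inside the window (endpoints included by default).
kept = df.between_time(start, end)

# Option 2: explicitly drop the rows outside the window.
outside = df.index.difference(kept.index)  # labels not in the window
dropped = df.drop(outside)

print(kept["value"].tolist())  # [1, 2, 5, 6]
```

Both options leave the same rows; between_time is shorter, while the explicit drop mirrors the wording of the question.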


Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow