python - Using groupby ("1d") and first_valid_index together -


This post shows how to use first_valid_index to find the first occurrence of a value in a DataFrame column. How can I use first_valid_index along with a daily groupby in order to find the first occurrence on each day, for the same example DataFrame shown in the linked post?

This is the groupby code that I need to use:

grouper = pd.TimeGrouper("1D")  # removed in pandas 1.0; pd.Grouper(freq="1D") is the replacement
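For anyone following along on a recent pandas, here is a minimal sketch of the setup. The data is an assumption: it is reconstructed from the output printed further down in this post (hourly rows from 09:00 to 16:00 on two days), not copied from the linked question, and the name rows_per_day is only illustrative.

```python
import numpy as np
import pandas as pd

# Assumed data, reconstructed from the output shown in this post:
# hourly rows from 09:00 to 16:00 on 2014-03-04 and 2014-03-05.
index = pd.DatetimeIndex(
    [pd.Timestamp(2014, 3, day, hour) for day in (4, 5) for hour in range(9, 17)]
)
# The first five hours of the first day are missing; every later row is 1.0.
df = pd.DataFrame({"test_1": [np.nan] * 5 + [1.0] * 11}, index=index)

# pd.Grouper(freq="1D") replaces the removed pd.TimeGrouper("1D").
grouper = pd.Grouper(freq="1D")

# One group per calendar day, keyed by midnight of that day.
rows_per_day = df.groupby(grouper).size()
print(rows_per_day)
```

The group keys are midnight timestamps rather than the hourly timestamps of the rows, which matters later when trying to write a per-day result back into the frame.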

Edit:

When I use the lambda and apply approach, it gives the correct output. However, I am unable to send this output to a new column ['test_output']; it just shows NaT:

df['test_output'] = df.groupby(grouper)['test_1'].apply(lambda x: x.first_valid_index())
df
Out[9]:
                     test_1 test_output
2014-03-04 09:00:00     NaN         NaT
2014-03-04 10:00:00     NaN         NaT
2014-03-04 11:00:00     NaN         NaT
2014-03-04 12:00:00     NaN         NaT
2014-03-04 13:00:00     NaN         NaT
2014-03-04 14:00:00     1.0         NaT
2014-03-04 15:00:00     1.0         NaT
2014-03-04 16:00:00     1.0         NaT
2014-03-05 09:00:00     1.0         NaT

IIUC, you can use first on the groupby object:

In [95]: df.groupby(grouper).first()
Out[95]:
            test_1
2014-03-04     1.0
2014-03-05     1.0

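This should work; the above was generated using the same data as the linked question.

Edit

I think the above is correct, but it's different from calling head(1), for instance, which returns the first row of each day even when its value is NaN: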

In [3]: df.groupby(grouper).head(1)
Out[3]:
                     test_1  test_output
2014-03-04 09:00:00     NaN          NaN
2014-03-05 09:00:00       1            1
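To make the difference concrete, here is a small self-contained sketch. It assumes data reconstructed from the output in this post and uses pd.Grouper(freq="1D") in place of the older TimeGrouper; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Assumed data, reconstructed from the output shown in this post.
index = pd.DatetimeIndex(
    [pd.Timestamp(2014, 3, day, hour) for day in (4, 5) for hour in range(9, 17)]
)
df = pd.DataFrame({"test_1": [np.nan] * 5 + [1.0] * 11}, index=index)
grouper = pd.Grouper(freq="1D")

# first() returns the first non-null value per day, indexed by day.
first_values = df.groupby(grouper).first()

# head(1) returns the first row per day as-is, keeping the original timestamps,
# so the 09:00 row of the first day comes back with its NaN.
head_rows = df.groupby(grouper).head(1)

print(first_values)
print(head_rows)
```

Neither call reports when the first valid value occurred: first() gives the value itself and head(1) gives the first row regardless of validity, which is why first_valid_index is still needed for the timestamp.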

But you can call first_valid_index using a lambda with apply:

In [6]: df.groupby(grouper)['test_1'].apply(lambda x: x.first_valid_index())
Out[6]:
2014-03-04   2014-03-04 14:00:00
2014-03-05   2014-03-05 09:00:00
Name: test_1, dtype: datetime64[ns]

Edit

Adding this as a column is a bit tricky: you're trying to match the original index against the new, daily grouped groupby result, and the two won't align, which is why you get NaT. What you can do is call to_series on the index; the reason we want a Series is so that we can access the date attribute and call map. map will perform a lookup, matching each row's date against the groupby result, and return the first valid date as desired:

In [136]: df['first'] = df.index.to_series().dt.date.map(df.groupby(grouper)['test_1'].apply(lambda x: x.first_valid_index()))
df
Out[136]:
                     test_1  test_output               first
2014-03-04 09:00:00     NaN          NaN 2014-03-04 14:00:00
2014-03-04 10:00:00     NaN          NaN 2014-03-04 14:00:00
2014-03-04 11:00:00     NaN          NaN 2014-03-04 14:00:00
2014-03-04 12:00:00     NaN          NaN 2014-03-04 14:00:00
2014-03-04 13:00:00     NaN          NaN 2014-03-04 14:00:00
2014-03-04 14:00:00     1.0          1.0 2014-03-04 14:00:00
2014-03-04 15:00:00     1.0          1.0 2014-03-04 14:00:00
2014-03-04 16:00:00     1.0          1.0 2014-03-04 14:00:00
2014-03-05 09:00:00     1.0          1.0 2014-03-05 09:00:00
2014-03-05 10:00:00     1.0          1.0 2014-03-05 09:00:00
2014-03-05 11:00:00     1.0          1.0 2014-03-05 09:00:00
2014-03-05 12:00:00     1.0          1.0 2014-03-05 09:00:00
2014-03-05 13:00:00     1.0          1.0 2014-03-05 09:00:00
2014-03-05 14:00:00     1.0          1.0 2014-03-05 09:00:00
2014-03-05 15:00:00     1.0          1.0 2014-03-05 09:00:00
2014-03-05 16:00:00     1.0          1.0 2014-03-05 09:00:00
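As a rough sketch of the same idea on a recent pandas, the version below swaps two things and both are assumptions on my part rather than part of the original answer: pd.Grouper(freq="1D") stands in for TimeGrouper, and the row timestamps are floored to midnight with dt.normalize() instead of converted with dt.date, since plain date objects may not match a DatetimeIndex during the lookup in newer versions. The data is reconstructed from the output above.

```python
import numpy as np
import pandas as pd

# Assumed data, reconstructed from the output shown in this post.
index = pd.DatetimeIndex(
    [pd.Timestamp(2014, 3, day, hour) for day in (4, 5) for hour in range(9, 17)]
)
df = pd.DataFrame({"test_1": [np.nan] * 5 + [1.0] * 11}, index=index)
grouper = pd.Grouper(freq="1D")

# One timestamp per day: the first row of that day where test_1 is not NaN.
first_valid = df.groupby(grouper)["test_1"].apply(lambda s: s.first_valid_index())

# Why direct assignment gives NaT: the daily result is keyed by midnight,
# and no hourly row carries a midnight label, so nothing lines up.
misaligned = first_valid.reindex(df.index)

# Floor each row's timestamp to midnight, then look that day up in the daily result.
day_of_row = df.index.to_series().dt.normalize()
df["first"] = day_of_row.map(first_valid)

print(df)
```

Because day_of_row keeps the original hourly index, the mapped result aligns with df on assignment, and every row receives the first valid timestamp of its own day.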
