python - Splitting a DataFrame based on corresponding NumPy array values
I have a pandas DataFrame that looks like this:

    2007-12-31  50230.62
    2008-01-02  48646.84
    2008-01-03  48748.04
    2008-01-04  46992.22
    2008-01-07  46491.28
    2008-01-08  45347.72
    2008-01-09  45681.68
    2008-01-10  46430.5

where the date column is the index. I also have a NumPy array b of the same length whose elements are -1, 0, or 1. What is the cleanest way to split the DataFrame into 3 DataFrames such that rows with equal corresponding b elements are grouped together? E.g. if b = numpy.array([0, 0, 0, 1, 1, -1, -1, 0]), the DataFrame should be split into:

    x
    2007-12-31  50230.62
    2008-01-02  48646.84
    2008-01-03  48748.04
    2008-01-10  46430.5

    y
    2008-01-04  46992.22
    2008-01-07  46491.28

    z
    2008-01-08  45347.72
    2008-01-09  45681.68
It's easy to do this with groupby in pandas, and you have the option to keep the rows grouped so you're not duplicating the data. You can then assign the groups to variables:
    import io

    import numpy as np
    import pandas as pd

    data = """
    2007-12-31 50230.62
    2008-01-02 48646.84
    2008-01-03 48748.04
    2008-01-04 46992.22
    2008-01-07 46491.28
    2008-01-08 45347.72
    2008-01-09 45681.68
    2008-01-10 46430.5"""

    # Read the whitespace-separated sample; columns are named 0 and 1
    df = pd.read_csv(io.StringIO(data), sep=r'\s+', header=None)
    b = np.array([0, 0, 0, 1, 1, -1, -1, 0])

    # Attach b as a column and group on it
    df['b'] = b
    df_groups = df.groupby('b')

    # Pick out each group by its b value (x, y, z as in the question)
    x = df_groups.get_group(0)
    y = df_groups.get_group(1)
    z = df_groups.get_group(-1)
The group names 0, -1, and 1 are based on the values in b.
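If you would rather not add b as a column, groupby also accepts an array of the same length as the DataFrame. The following is only a sketch of that variant, not part of the answer above: the column names "date" and "value" and the dict name parts are made up for illustration, and the dates are left as plain strings.

```python
import io

import numpy as np
import pandas as pd

data = """2007-12-31 50230.62
2008-01-02 48646.84
2008-01-03 48748.04
2008-01-04 46992.22
2008-01-07 46491.28
2008-01-08 45347.72
2008-01-09 45681.68
2008-01-10 46430.5"""

# "date" and "value" are assumed names; the question does not give any
df = pd.read_csv(io.StringIO(data), sep=r"\s+", header=None,
                 names=["date", "value"], index_col="date")
b = np.array([0, 0, 0, 1, 1, -1, -1, 0])

# Group directly on the array, so df itself is left unchanged
parts = {int(key): group for key, group in df.groupby(b)}

# Map the b values to the names used in the question
x, y, z = parts[0], parts[1], parts[-1]
```

Here parts maps each value of b to its sub-DataFrame, so a value that never occurs in b simply has no entry in the dict.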