python - Splitting a DataFrame based on corresponding NumPy array values
I have a pandas DataFrame that looks like this:
    2007-12-31 50230.62
    2008-01-02 48646.84
    2008-01-03 48748.04
    2008-01-04 46992.22
    2008-01-07 46491.28
    2008-01-08 45347.72
    2008-01-09 45681.68
    2008-01-10 46430.5

where the date column is the index. I also have a NumPy array b of the same length whose elements are -1, 0, or 1. What is the cleanest way to split the DataFrame into 3 DataFrames such that rows whose corresponding b elements are equal are grouped together? E.g., if b = numpy.array([0, 0, 0, 1, 1, -1, -1, 0]), the DataFrame should be split as:
    x
    2007-12-31 50230.62
    2008-01-02 48646.84
    2008-01-03 48748.04
    2008-01-10 46430.5

    y
    2008-01-04 46992.22
    2008-01-07 46491.28

    z
    2008-01-08 45347.72
    2008-01-09 45681.68
This is easy to do with groupby in pandas, and you have the option to keep the rows grouped so you're not duplicating the data. You can assign b as a column and then group by it:
    import io

    import numpy as np
    import pandas as pd

    data = """
    2007-12-31 50230.62
    2008-01-02 48646.84
    2008-01-03 48748.04
    2008-01-04 46992.22
    2008-01-07 46491.28
    2008-01-08 45347.72
    2008-01-09 45681.68
    2008-01-10 46430.5"""

    # Parse the whitespace-separated text; there is no header row
    df = pd.read_csv(io.StringIO(data), delimiter=r'\s+', header=None)
    b = np.array([0, 0, 0, 1, 1, -1, -1, 0])
    df['b'] = b                  # attach the labels as a column
    df_groups = df.groupby('b')  # group rows by label

    x = df_groups.get_group(0)
    y = df_groups.get_group(1)
    z = df_groups.get_group(-1)

The group names 0, 1, and -1 are the values of b.
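If you would rather not add a helper column, one possible variant is to pass the array straight to groupby. The sketch below is only an illustration under some assumptions: the column name "value" is made up here (the question does not name the column), and the dates are placed in the index as the question describes.

```python
import numpy as np
import pandas as pd

# Sample data from the question, with the dates as the index.
# The column name "value" is an assumption for illustration.
dates = pd.to_datetime([
    "2007-12-31", "2008-01-02", "2008-01-03", "2008-01-04",
    "2008-01-07", "2008-01-08", "2008-01-09", "2008-01-10",
])
df = pd.DataFrame(
    {"value": [50230.62, 48646.84, 48748.04, 46992.22,
               46491.28, 45347.72, 45681.68, 46430.5]},
    index=dates,
)
b = np.array([0, 0, 0, 1, 1, -1, -1, 0])

# Group directly by the array (it must have the same length as df),
# so no extra column is added to the DataFrame.
parts = {int(label): group for label, group in df.groupby(b)}

x, y, z = parts[0], parts[1], parts[-1]

# Equivalent boolean-mask form for a single label.
x_mask = df[b == 0]
```

The dictionary keeps one DataFrame per label, so labels other than -1, 0, and 1 would be handled without changes. The boolean mask is handy when only one of the pieces is needed.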