python - Splitting a DataFrame based on corresponding NumPy array values
I have a pandas DataFrame that looks like this:

    2007-12-31    50230.62
    2008-01-02    48646.84
    2008-01-03    48748.04
    2008-01-04    46992.22
    2008-01-07    46491.28
    2008-01-08    45347.72
    2008-01-09    45681.68
    2008-01-10    46430.5

where the date column is the index. I also have a NumPy array b of the same length whose elements are -1, 0, or 1. What is the cleanest way to split the DataFrame into three DataFrames such that rows whose corresponding b elements are equal are grouped together? For example, if b = numpy.array([0, 0, 0, 1, 1, -1, -1, 0]), the DataFrame should be split into:

    x
    2007-12-31    50230.62
    2008-01-02    48646.84
    2008-01-03    48748.04
    2008-01-10    46430.5

    y
    2008-01-04    46992.22
    2008-01-07    46491.28

    z
    2008-01-08    45347.72
    2008-01-09    45681.68
It's easy to do this with groupby in pandas, and you have the option to keep the rows grouped so that you're not duplicating the data. You can assign the array as a column and then group by it:
    import io

    import numpy as np
    import pandas as pd

    data = """2007-12-31    50230.62
    2008-01-02    48646.84
    2008-01-03    48748.04
    2008-01-04    46992.22
    2008-01-07    46491.28
    2008-01-08    45347.72
    2008-01-09    45681.68
    2008-01-10    46430.5"""

    # Parse the whitespace-separated text; columns are named 0 (date) and 1 (value).
    df = pd.read_csv(io.StringIO(data), sep=r'\s+', header=None)
    b = np.array([0, 0, 0, 1, 1, -1, -1, 0])

    # Attach the labels as a column, then group the rows by label.
    df['b'] = b

    df_groups = df.groupby('b')

    # Pull out one DataFrame per label, matching x, y, z in the question.
    x = df_groups.get_group(0)
    y = df_groups.get_group(1)
    z = df_groups.get_group(-1)

The group keys 0, 1, and -1 are the values of b.
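If you would rather not add a column to the DataFrame, groupby also accepts an array of the same length as the frame and matches it to rows by position. The following is a minimal sketch of that variant, not part of the original answer; the column name "value" and the dictionary name groups are made up for illustration.

```python
import numpy as np
import pandas as pd

# Sample data from the question: a date-indexed frame with one value column.
dates = pd.to_datetime([
    "2007-12-31", "2008-01-02", "2008-01-03", "2008-01-04",
    "2008-01-07", "2008-01-08", "2008-01-09", "2008-01-10",
])
values = [50230.62, 48646.84, 48748.04, 46992.22,
          46491.28, 45347.72, 45681.68, 46430.5]
df = pd.DataFrame({"value": values}, index=dates)  # "value" is a hypothetical column name

b = np.array([0, 0, 0, 1, 1, -1, -1, 0])

# Group by the array itself; it is aligned to rows by position,
# so it must have exactly len(df) entries.
groups = {int(key): group for key, group in df.groupby(b)}

# One DataFrame per label, in the order used in the question.
x, y, z = groups[0], groups[1], groups[-1]
```

Keeping the pieces in a dictionary keyed by label avoids hard-coding the three values, which may help if b can take other values later.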