python - Sum across all NaNs in pandas returns zero?
I'm trying to sum across the columns of a pandas DataFrame, and when I have NaNs in every column I'm getting sum = zero; I'd expected sum = NaN based on the docs. Here's what I've got:
In [136]: df = pd.DataFrame()

In [137]: df['a'] = [1,2,np.nan,3]

In [138]: df['b'] = [4,5,np.nan,6]

In [139]: df
Out[139]:
     a    b
0    1    4
1    2    5
2  NaN  NaN
3    3    6

In [140]: df['total'] = df.sum(axis=1)

In [141]: df
Out[141]:
     a    b  total
0    1    4      5
1    2    5      7
2  NaN  NaN      0
3    3    6      9
The pandas.DataFrame.sum docs say "if an entire row/column is NA, the result will be NA", so I don't understand why "total" = 0 and not NaN for index 2. What am I missing?
A solution is to select the cases where the rows are all-NaN, then set the sum to NaN:
df['total'] = df.sum(axis=1)
df.loc[df['a'].isnull() & df['b'].isnull(), 'total'] = np.nan
or
df['total'] = df.sum(axis=1)
df.loc[df[['a','b']].isnull().all(axis=1), 'total'] = np.nan
The latter option is probably more practical, because you can create a list of the columns ['a','b', ... , 'z'] that you may want to sum.
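As a rough sketch of the column-list idea, the snippet below rebuilds the example frame and applies the all-NaN mask to a list of columns. The name `cols` and the extra column `total_min_count` are only illustrative and not part of the original post. The last line shows an alternative that assumes pandas 0.22 or later, where `sum` accepts a `min_count` argument; on older versions only the mask approach applies.

```python
import numpy as np
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({"a": [1, 2, np.nan, 3], "b": [4, 5, np.nan, 6]})

# Hypothetical list of the columns to include in the sum.
cols = ["a", "b"]

# Sum the selected columns, then overwrite rows where every one is NaN.
df["total"] = df[cols].sum(axis=1)
df.loc[df[cols].isnull().all(axis=1), "total"] = np.nan

# Assumption: pandas 0.22 or later, where sum() accepts min_count.
# min_count=1 requires at least one non-NaN value per row, else the result is NaN.
df["total_min_count"] = df[cols].sum(axis=1, min_count=1)

print(df)
```

Both columns should agree here: rows with at least one real value are summed with NaNs skipped, and only the all-NaN row ends up as NaN.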