Python - pandas DataFrame data retrieval
Here is a sample pandas DataFrame:
                icd_code  from_date   paid_amount
    claim_id
    ckey-7724339    719.43    2015-09-26  300.09
    ckey-5008998    722.2     2015-04-21  11.65
    ckey-7896598    722       2015-02-23  17.19
    ckey-7758556    850.9     2014-03-13  414.02
    ckey-7749118    847.0     2012-07-18  4.42
    ckey-10383160   854.00    2015-06-16  751.68
    ckey-10678452   607.84    2015-07-07  11.13
    ckey-10734364   882.2     2015-07-22  5625.00
    ckey-3500566    307.89    2011-08-09  500.00
    ckey-10766667   344.1     2013-12-03  139.41
When I use .loc to retrieve a row, the output is as follows:
    >>> indexed_data.loc['ckey-10766667']
    icd_code            344.1
    from_date      2013-12-03
    paid_amount        139.41
    Name: ckey-10766667, dtype: object

Expected output:

    ckey-10766667   344.1     2013-12-03  139.41
Can anyone point out what's wrong in the above code?
Note: I called data.set_index('claim_id') on the original data set to create the index on 'claim_id'.
Using the code below gave me the expected output:
    >>> indexed_data.loc[['ckey-8369057']]
Passing a single value to .loc returns a DataFrame when multiple rows match that label, and a Series if only one row matches. Passing a list to .loc always returns a DataFrame.
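A minimal sketch of that behaviour, using a small made-up frame. The claim IDs and values here are illustrative only, and the duplicated 'ckey-1' label is an assumption added to show the multi-row case:

```python
import pandas as pd

# Illustrative data; 'ckey-1' appears twice on purpose.
data = pd.DataFrame({
    "claim_id": ["ckey-1", "ckey-1", "ckey-2"],
    "icd_code": ["719.43", "722.2", "344.1"],
    "paid_amount": [300.09, 11.65, 139.41],
})
indexed_data = data.set_index("claim_id")

single_unique = indexed_data.loc["ckey-2"]     # one matching row: Series
single_duplicate = indexed_data.loc["ckey-1"]  # two matching rows: DataFrame
as_list = indexed_data.loc[["ckey-2"]]         # list of labels: DataFrame

for result in (single_unique, single_duplicate, as_list):
    print(type(result).__name__)
```

So the return type of a scalar lookup depends on the data, while the list form is predictable.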
Taking execution time into account, passing a list takes more time than passing a single element, which matters when the statement is inside a loop. Here is what I did to achieve better execution time:
    df = indexed_data.loc[x]
    if type(df).__name__ == 'Series':
        df = df.to_frame().T
The above code makes sure you have a DataFrame at the end of these 3 lines.
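The same idea can be wrapped in a small function for use inside a loop. This is only a sketch: the name loc_as_frame is hypothetical, the sample values are taken from the table in the question, and isinstance is used here in place of the type-name comparison.

```python
import pandas as pd

def loc_as_frame(frame, key):
    """Hypothetical helper: look up `key` with .loc and always return a DataFrame."""
    result = frame.loc[key]
    if isinstance(result, pd.Series):
        # A single matching row comes back as a Series; make it a one-row frame.
        result = result.to_frame().T
    return result

indexed_data = pd.DataFrame(
    {"icd_code": ["344.1", "722.2"], "paid_amount": [139.41, 11.65]},
    index=pd.Index(["ckey-10766667", "ckey-5008998"], name="claim_id"),
)

row = loc_as_frame(indexed_data, "ckey-10766667")
print(row)
```

One thing to watch for: the intermediate Series holds mixed types, so the columns of the resulting one-row frame may come back as object dtype and numeric columns may need converting again.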