python - pandas dataframe data retrieval -


Here is a sample pandas DataFrame:

                   icd_code   from_date  paid_amount
    claim_id
    ckey-7724339     719.43  2015-09-26       300.09
    ckey-5008998      722.2  2015-04-21        11.65
    ckey-7896598        722  2015-02-23        17.19
    ckey-7758556      850.9  2014-03-13       414.02
    ckey-7749118      847.0  2012-07-18         4.42
    ckey-10383160    854.00  2015-06-16       751.68
    ckey-10678452    607.84  2015-07-07        11.13
    ckey-10734364     882.2  2015-07-22      5625.00
    ckey-3500566     307.89  2011-08-09       500.00
    ckey-10766667     344.1  2013-12-03       139.41

When I use .loc to retrieve a row, the output is as follows:

    >>> indexed_data.loc['ckey-10766667']
    icd_code            344.1
    from_date      2013-12-03
    paid_amount        139.41
    Name: ckey-10766667, dtype: object

    ~~~~~~~~ expected output ~~~~~~~~~~
    ckey-10766667    344.1  2013-12-03       139.41

Can anyone point out what is wrong with the code above?

Note: I called data.set_index('claim_id') on the original data set to create the index on 'claim_id'.

Using the code below gave me the expected output:

    >>> indexed_data.loc[['ckey-8369057']]

Passing a single value to .loc returns a DataFrame when multiple rows match that label, and a Series if only one row matches. Passing a list to .loc always returns a DataFrame.
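The three cases can be checked with a small sketch. The data here is hypothetical: the claim ids are shortened, and a duplicated id is added only so that the multiple-row case shows up.

```python
import pandas as pd

# Hypothetical sample data; "ckey-2" is duplicated on purpose for illustration.
data = pd.DataFrame({
    "claim_id": ["ckey-1", "ckey-2", "ckey-2"],
    "icd_code": ["344.1", "722.2", "850.9"],
    "paid_amount": [139.41, 11.65, 414.02],
})
indexed_data = data.set_index("claim_id")

single = indexed_data.loc["ckey-1"]      # one matching row -> Series
multi = indexed_data.loc["ckey-2"]       # several matching rows -> DataFrame
as_list = indexed_data.loc[["ckey-1"]]   # list of labels -> DataFrame

print(type(single).__name__, type(multi).__name__, type(as_list).__name__)
# prints: Series DataFrame DataFrame
```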

Taking execution time into account, passing a list takes more time than passing a single element, which matters when the statement is inside a loop. Here is what I did to achieve better execution time:

    df = indexed_data.loc[x]
    if type(df).__name__ == 'Series':
        df = df.to_frame().T

The above code makes sure you have a DataFrame at the end of these 3 lines.
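The same idea can be wrapped in a small function so it is easy to reuse inside a loop. This is only a sketch: the name loc_as_frame and the two-row sample frame are made up for illustration, and isinstance is swapped in for the type-name comparison.

```python
import pandas as pd

def loc_as_frame(frame, label):
    """Look up one label and always return a DataFrame (hypothetical helper)."""
    result = frame.loc[label]
    if isinstance(result, pd.Series):
        # A single matching row comes back as a Series;
        # turn it into a one-row DataFrame.
        result = result.to_frame().T
    return result

# Hypothetical two-row frame indexed by claim_id.
indexed_data = pd.DataFrame(
    {"icd_code": ["344.1", "722.2"], "paid_amount": [139.41, 11.65]},
    index=pd.Index(["ckey-10766667", "ckey-5008998"], name="claim_id"),
)

row = loc_as_frame(indexed_data, "ckey-10766667")
print(row.shape)  # (1, 2)
```

One thing to keep in mind: a row that goes through to_frame().T ends up with object-dtype columns and loses the index name, so it is not identical to what .loc returns for a list.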

