python - Select rows containing certain values from pandas dataframe
I have a pandas DataFrame whose entries are all strings:
            a       b      c
    1   apple  banana   pear
    2    pear    pear  apple
    3  banana    pear   pear
    4   apple   apple   pear
etc. I want to select the rows that contain a certain string, say, 'banana'. I don't know which column it will appear in each time. Of course, I can write a for loop and iterate over all the rows. But is there an easier or faster way to do this?
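With NumPy, you can run a vectorized search for as many strings as you wish. The function below keeps the rows that contain every one of the search strings: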
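    import numpy as np

    def select_rows(df, search_strings):
        # Map every cell to an integer code into the sorted unique strings
        unq, ids = np.unique(df, return_inverse=True)
        unqids = np.searchsorted(unq, search_strings)
        # Keep rows in which every search string occurs in some column
        return df[((ids.reshape(df.shape) == unqids[:, None, None]).any(-1)).all(0)]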
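To see how the pieces fit together, the intermediate arrays can be inspected on a tiny frame. This is only an illustrative sketch: the frame `demo`, its column names, and the helper `select_rows_checked` are made up here and are not part of the answer. The membership check is an addition that guards against a quirk of `np.searchsorted`: for a string that does not occur in the frame it still returns an insertion position, which can coincide with the code of a different string.

```python
import numpy as np
import pandas as pd

# Hypothetical two-row frame, used only for illustration.
demo = pd.DataFrame({"a": ["apple", "pear"],
                     "b": ["banana", "pear"],
                     "c": ["pear", "apple"]})

# Step 1: factorize every cell. unq holds the sorted distinct strings;
# ids holds, for each cell, the position of its string in unq.
unq, ids = np.unique(demo.to_numpy(), return_inverse=True)
ids = ids.reshape(demo.shape)

# Step 2: look up the integer code of each search string.
codes = np.searchsorted(unq, ["apple", "banana"])

# Step 3: broadcast to shape (n_search, n_rows, n_cols), then reduce:
#   any(-1) -> search string i occurs somewhere in row j
#   all(0)  -> every search string occurs in row j
mask = (ids == codes[:, None, None]).any(-1).all(0)


def select_rows_checked(df, search_strings):
    """Hypothetical variant: an absent search string matches no rows."""
    unq, ids = np.unique(df.to_numpy(), return_inverse=True)
    if not set(search_strings) <= set(unq):
        return df.iloc[0:0]  # empty frame with the same columns
    codes = np.searchsorted(unq, search_strings)
    hit = ids.reshape(df.shape) == codes[:, None, None]
    return df[hit.any(-1).all(0)]
```

If every search string is known to occur somewhere in the frame, the extra check is unnecessary and the shorter function behaves the same way.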
Sample run:
    In [393]: df
    Out[393]:
            a       b      c
    0   apple  banana   pear
    1    pear    pear  apple
    2  banana    pear   pear
    3   apple   apple   pear

    In [394]: select_rows(df,['apple','banana'])
    Out[394]:
           a       b     c
    0  apple  banana  pear

    In [395]: select_rows(df,['apple','pear'])
    Out[395]:
           a       b      c
    0  apple  banana   pear
    1   pear    pear  apple
    3  apple   apple   pear

    In [396]: select_rows(df,['apple','banana','pear'])
    Out[396]:
           a       b     c
    0  apple  banana  pear
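For comparison, the same selections can be written with pandas alone. This is a sketch of a different technique, not the NumPy approach above; the helper name `rows_with_all` is hypothetical and the column names are assumed. It relies on `DataFrame.eq` and `DataFrame.any(axis=1)` and loops over the search strings in Python, so it should be read as a shorter alternative rather than a faster one.

```python
import pandas as pd

# Frame mirroring the sample run (column names are an assumption).
df = pd.DataFrame({"a": ["apple", "pear", "banana", "apple"],
                   "b": ["banana", "pear", "pear", "apple"],
                   "c": ["pear", "apple", "pear", "pear"]})

# Rows where any column equals 'banana', the case asked about in the question.
banana_rows = df[df.eq("banana").any(axis=1)]


def rows_with_all(df, search_strings):
    """Hypothetical helper: keep rows that contain every search string."""
    mask = pd.Series(True, index=df.index)
    for s in search_strings:
        mask &= df.eq(s).any(axis=1)  # row contains s in some column
    return df[mask]


print(banana_rows)
print(rows_with_all(df, ["apple", "pear"]))
```

Because each string is compared directly against the cells, a string that is absent from the frame simply produces an all-False mask and no rows are returned.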