python - pandas DataFrame: check specific columns for the same values


Is there a way to check and sum specific DataFrame columns for the same values?

For example, in the following DataFrame:

    column name 1, 2, 3, 4, 5
    -------------
    a, g, h, t, j
    b, a, o, a, g
    c, j, w, e, q
    d, b, d, q, i

When comparing columns 1 and 2, the number of values that are the same is 2 (a and b).

Thanks

You can use isin and sum to achieve this:

    In [96]:
    import pandas as pd
    import io

    t = """1, 2, 3, 4, 5
    a, g, h, t, j
    b, a, o, a, g
    c, j, w, e, q
    d, b, d, q, i"""
    # A regex separator needs the python parsing engine
    df = pd.read_csv(io.StringIO(t), sep=r',\s+', engine='python')
    df

    Out[96]:
       1  2  3  4  5
    0  a  g  h  t  j
    1  b  a  o  a  g
    2  c  j  w  e  q
    3  d  b  d  q  i

    In [100]:
    df['1'].isin(df['2']).sum()

    Out[100]:
    2

isin produces a boolean Series, and calling sum on a boolean Series converts True and False to 1 and 0 respectively:

    In [101]:
    df['1'].isin(df['2'])

    Out[101]:
    0     True
    1     True
    2    False
    3    False
    Name: 1, dtype: bool
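One thing to be aware of: isin counts the rows of the first column whose value appears anywhere in the second column, so a repeated value is counted once per row. If distinct shared values are wanted instead, a set intersection is one option. The sketch below is only an illustration of the difference; the helper names count_rows_in and count_distinct_shared are made up for this example and are not part of pandas.

```python
import pandas as pd

# Same data as in the question, built directly instead of parsed from text
df = pd.DataFrame({
    "1": ["a", "b", "c", "d"],
    "2": ["g", "a", "j", "b"],
    "3": ["h", "o", "w", "d"],
    "4": ["t", "a", "e", "q"],
    "5": ["j", "g", "q", "i"],
})


def count_rows_in(frame, left, right):
    """Hypothetical helper: rows of `left` whose value occurs anywhere in `right`."""
    return int(frame[left].isin(frame[right]).sum())


def count_distinct_shared(frame, left, right):
    """Hypothetical helper: number of distinct values found in both columns."""
    return len(set(frame[left]) & set(frame[right]))


matches = count_rows_in(df, "1", "2")           # counts rows, duplicates included
distinct = count_distinct_shared(df, "1", "2")  # counts unique shared values
print(matches, distinct)
```

For this dataset both approaches agree, because no value repeats within column 1 or column 2.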

Edit

To check and count the number of values that are present in all the columns of interest, the following should work. Note that for your dataset there are no values present in all columns:

    In [123]:
    # .loc replaces .ix, which has been removed from current pandas
    df.loc[:, :'4'].apply(lambda x: x.isin(df['1'])).all(axis=1).sum()

    Out[123]:
    0

Breaking the above down to show what each step is doing:

    In [124]:
    df.loc[:, :'4'].apply(lambda x: x.isin(df['1']))

    Out[124]:
          1      2      3      4
    0  True  False  False  False
    1  True   True  False   True
    2  True  False  False  False
    3  True   True   True  False

    In [125]:
    df.loc[:, :'4'].apply(lambda x: x.isin(df['1'])).all(axis=1)

    Out[125]:
    0    False
    1    False
    2    False
    3    False
    dtype: bool
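The row-wise mask above asks, for each row, whether every cell in that row holds a value found in column 1. If the goal is rather to list the values that occur in every column of interest regardless of row, intersecting the columns as sets may be closer to what is wanted. This is a sketch under that assumption; the names cols, rows_all, per_column and common are chosen just for this example.

```python
import pandas as pd

# Same data as in the question, built directly instead of parsed from text
df = pd.DataFrame({
    "1": ["a", "b", "c", "d"],
    "2": ["g", "a", "j", "b"],
    "3": ["h", "o", "w", "d"],
    "4": ["t", "a", "e", "q"],
    "5": ["j", "g", "q", "i"],
})

cols = ["1", "2", "3", "4"]  # columns of interest

# Boolean mask: True where a cell's value also occurs somewhere in column 1
mask = df[cols].apply(lambda col: col.isin(df["1"]))

# Rows in which every column of interest holds a value from column 1
rows_all = int(mask.all(axis=1).sum())

# Per-column count of cells whose value occurs in column 1
per_column = mask.sum()

# Values that occur in every column of interest, independent of row position
common = set.intersection(*(set(df[c]) for c in cols))

print(rows_all)
print(per_column)
print(common)
```

The per-column counts can be handy when comparing one reference column against several others in a single pass.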
