performance - Python: Efficient split column in pandas DF


Suppose I have a DataFrame that contains a column of the form

0     a.1
1     a.2
2     b.3
3     4.c

and suppose I want to split the column on '.' and keep the element after the '.'. The naive way is:

for i in range(len(tbl)):
    tbl['column_name'].iloc[i] = tbl['column_name'].iloc[i].split('.', 1)[1]

This works. However, it's slow for large tables. Does anyone have an idea how to speed up the process? I can use new columns in the DataFrame; I'm not restricted to changing the source column (as I reuse it in the example). Thanks!

pandas has string methods that do such things efficiently without explicit loops (which kill performance). In this case, you can use .str.split:

>>> import pandas as pd
>>> df = pd.DataFrame({'a': ['a.1', 'a.2', 'b.3', 'c.4']})
>>> df
     a
0  a.1
1  a.2
2  b.3
3  c.4
>>> df.a.str.split('.').apply(pd.Series)
   0  1
0  a  1
1  a  2
2  b  3
3  c  4
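Since the question only asks for the element after the '.', here is a minimal sketch of how that piece could be pulled out directly. The column name 'suffix' is made up for illustration, and this assumes a reasonably recent pandas in which .str.split accepts the n and expand arguments:

```python
import pandas as pd

df = pd.DataFrame({'a': ['a.1', 'a.2', 'b.3', 'c.4']})

# Split on the first '.' only (n=1), then take the second piece of each list.
# 'suffix' is a hypothetical column name, not something from the question.
df['suffix'] = df['a'].str.split('.', n=1).str[1]

# expand=True returns the pieces as DataFrame columns, so the
# .apply(pd.Series) step is not needed.
parts = df['a'].str.split('.', n=1, expand=True)

print(df)
print(parts)
```

Rows without a '.' should end up as NaN in 'suffix' rather than raising an error, so it may be worth checking for missing values afterwards.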
