performance - Python: Efficient split column in pandas DF -

February 15, 2011

Suppose I have a df that contains a column of the form

0     a.1
1     a.2
2     b.3
3     4.c

and suppose I want to split the column on '.', keeping only the element after the '.'. The naive way is:

for i in range(len(tbl)):
    tbl['column_name'].iloc[i] = tbl['column_name'].iloc[i].split('.', 1)[1]

This works. However, it's slow on large tables. Does anyone have an idea how to speed up the process? I can use new columns in the df, so I'm not restricted to changing the source column (as I reuse it in the example). Thanks!

pandas has string methods that do such things efficiently without explicit loops (which kill performance). In this case, you can use .str.split:

>>> import pandas as pd
>>> df = pd.DataFrame({'a': ['a.1', 'a.2', 'b.3', 'c.4']})
>>> df
     a
0  a.1
1  a.2
2  b.3
3  c.4
>>> df.a.str.split('.').apply(pd.Series)
   0  1
0  a  1
1  a  2
2  b  3
3  c  4
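Since the question only asks for the part after the '.', a variation on the same idea may be enough. The sketch below assumes every value contains at least one '.'; the column name after_dot is made up for illustration. It splits on the first '.' only (matching the split('.', 1) in the question's loop) and picks the second piece with .str[1], which avoids the per-row apply(pd.Series) call.

```python
import pandas as pd

df = pd.DataFrame({'a': ['a.1', 'a.2', 'b.3', 'c.4']})

# Split each string on the first '.' only, then take the second piece.
# 'after_dot' is a hypothetical column name chosen for this example.
df['after_dot'] = df['a'].str.split('.', n=1).str[1]

# Alternatively, expand=True returns the pieces as DataFrame columns
# (labelled 0 and 1) without going through apply(pd.Series).
parts = df['a'].str.split('.', n=1, expand=True)

print(df)
print(parts)
```

Rows that contain no '.' would come out as NaN in after_dot rather than raising an error, so real data may need a check for missing values afterwards.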
