python - Is Pandas 0.16.1 groupby().apply() method applying function more than once to the same group?

January 15, 2010

This question already has an answer here:

"Python pandas groupby object apply method duplicates first group" (1 answer)

I have noticed that in some cases with pandas 0.16.1, the function passed to apply() on a groupby() is applied more than once to one or more of the output groups. Here is a reproduction:

    In [1]: from pandas import DataFrame, Series
            df2 = DataFrame({"a": ["alpha", "alpha", "alpha", "beta", "beta", "beta", "beta", "gamma"]})
            df2["b"] = Series([i for i in range(0, len(df2))])
            df2

    Out[1]:
           a  b
    0  alpha  0
    1  alpha  1
    2  alpha  2
    3   beta  3
    4   beta  4
    5   beta  5
    6   beta  6
    7  gamma  7

    In [2]: def my_func(df):
                print(df.index)

    In [3]: df2.groupby("a").apply(my_func)

    Out[3]:
    Int64Index([0, 1, 2], dtype='int64')
    Int64Index([0, 1, 2], dtype='int64')
    Int64Index([3, 4, 5, 6], dtype='int64')
    Int64Index([7], dtype='int64')

Notice that the [0, 1, 2] index appears twice in the output. This seems to indicate that the function was applied to the alpha group twice.

This is not a huge issue, since it is good practice for these functions to be idempotent in the first place. However, if the functions are costly in terms of runtime (think big regression runs, etc.), it can be more of a problem.

Am I using the API incorrectly and/or misinterpreting the output, or is there a possible issue here?

According to the docs (http://pandas.pydata.org/pandas-docs/dev/generated/pandas.dataframe.apply.html):

In the current implementation, apply calls func twice on the first column/row to decide whether it can take a fast or slow code path.
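If the per-group function is expensive or has side effects, one possible workaround is to iterate over the groupby object directly, which yields each (key, sub-frame) pair exactly once. The following is only a minimal sketch under stated assumptions: `costly_summary` is a hypothetical stand-in for the expensive function, and the `calls` counter exists only to show how often each group is processed. It is not the way pandas itself resolves the double call.

```python
from collections import Counter

import pandas as pd

# Same data as in the question
df2 = pd.DataFrame(
    {"a": ["alpha", "alpha", "alpha", "beta", "beta", "beta", "beta", "gamma"]}
)
df2["b"] = list(range(len(df2)))

calls = Counter()  # counts how many times each group is processed


def costly_summary(group):
    # Hypothetical stand-in for an expensive computation (e.g. a regression)
    calls[group["a"].iloc[0]] += 1
    return group["b"].sum()


# Iterating over the GroupBy object visits every group once,
# so the function runs exactly one time per group.
results = {key: costly_summary(group) for key, group in df2.groupby("a")}
summary = pd.Series(results, name="b_sum")

print(dict(calls))
print(summary)
```

The trade-off is that the results have to be assembled by hand (here into a Series keyed by group name) instead of letting apply() combine them.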
