python - Is the pandas 0.16.1 groupby().apply() method applying a function more than once to the same group?


I have noticed that in some cases, with pandas 0.16.1, a function passed to apply() on a groupby() is applied more than once to one or more of the output groups. Here is a reproduction:

In [1]: from pandas import DataFrame, Series
        # Eight rows in three groups: alpha (3), beta (4), gamma (1)
        df2 = DataFrame({"a": ["alpha", "alpha", "alpha", "beta", "beta", "beta", "beta", "gamma"]})
        df2["b"] = Series([i for i in range(0, len(df2))])
        df2

Out [1]:
       a  b
0  alpha  0
1  alpha  1
2  alpha  2
3   beta  3
4   beta  4
5   beta  5
6   beta  6
7  gamma  7

In [2]: def my_func(df):
            # Print the row labels of the group the function receives
            print(df.index)

In [3]: df2.groupby("a").apply(my_func)

Out [3]:
Int64Index([0, 1, 2], dtype='int64')
Int64Index([0, 1, 2], dtype='int64')
Int64Index([3, 4, 5, 6], dtype='int64')
Int64Index([7], dtype='int64')

Notice that the [0, 1, 2] index appears twice in the output. This seems to indicate that the function was applied to the alpha group twice.
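One way to confirm this without reading printed output is to record every call in a list and inspect it afterwards. The following is a minimal sketch, not part of the original reproduction: the names `calls` and `recording_func` are made up for illustration, and the `[["b"]]` column selection is an assumption added so the snippet also runs cleanly on recent pandas releases. How many times the first group shows up depends on the installed pandas version, so no particular count is claimed here.

```python
import pandas as pd

df2 = pd.DataFrame({"a": ["alpha", "alpha", "alpha", "beta", "beta", "beta", "beta", "gamma"]})
df2["b"] = list(range(len(df2)))

calls = []  # hypothetical log: one entry per invocation of the function


def recording_func(group):
    # Store the row labels of the group that was passed in
    calls.append(group.index.tolist())
    return len(group)


# Selecting [["b"]] keeps the grouping column out of the frames passed to the function
sizes = df2.groupby("a")[["b"]].apply(recording_func)

# Each entry in calls is one invocation; a repeated entry means a repeated call
for labels in calls:
    print(labels)
print("number of calls:", len(calls))
```

If `[0, 1, 2]` is listed twice, the function really was invoked twice on the alpha group rather than the output merely being printed twice.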

This is not a huge issue, since it is good practice for these functions to be idempotent in the first place. However, if the functions are costly in terms of runtime (think big regression runs, etc.), it can become more of a problem.

Am I using the API incorrectly and/or misinterpreting the output, or is there a possible issue here?

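According to the docs (http://pandas.pydata.org/pandas-docs/dev/generated/pandas.dataframe.apply.html):

In the current implementation, apply calls func twice on the first column/row to decide whether it can take a fast or slow code path.

The same probing appears to happen in groupby().apply(), which would explain why the first group (alpha) shows up twice.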
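If the extra call matters because the function is expensive or has side effects, one option is to skip apply() and iterate over the groups directly, which calls the function exactly once per group. This is only a sketch under assumptions: `costly_func` is a hypothetical stand-in for the real work, and collecting the results into a dict is one arbitrary choice among several.

```python
import pandas as pd

df2 = pd.DataFrame({"a": ["alpha", "alpha", "alpha", "beta", "beta", "beta", "beta", "gamma"]})
df2["b"] = list(range(len(df2)))

calls = []  # hypothetical log used to show how often each group is processed


def costly_func(group):
    # Placeholder for an expensive computation (e.g. a regression run)
    calls.append(group["a"].iloc[0])
    return int(group["b"].sum())


# Iterating over a GroupBy yields (group name, sub-DataFrame) pairs,
# so the function runs once per group with no probing call
results = {name: costly_func(group) for name, group in df2.groupby("a")}

print(calls)
print(results)
```

The trade-off is that the results have to be reassembled by hand (for example with `pd.Series(results)`), whereas apply() combines them automatically.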

