python - Create a time series from data


I have a dataframe that contains information on defaults within a loan portfolio and the time after origination at which they occurred. Each 'observation' is a pair representing the time t in days and the amount of the loan default:

df['time_to_default']  # time from origination to default, in days
df['default_amnt']     # loan amount that defaulted

I want to create a series that represents the cumulative amount of defaults at a given time t. (Assume that time_to_default is evenly divisible by t.) I cannot figure out how to create a new dataframe element, assign it an initial value of 0, and then iterate through the series....

It sounds like you need to use groupby and cumsum, since you want a running total:

cum_defaults = df.groupby('time_to_default').default_amnt.sum().cumsum() 
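To see what each step of that chain contributes, it can be split in two. This is only a minimal sketch: the sample values are made up for illustration, and the names per_day and running are hypothetical.

```python
import pandas as pd

# Illustrative data (assumed values): two loans default on day 3.
df = pd.DataFrame({'time_to_default': [1, 3, 3, 6],
                   'default_amnt': [10, 20, 30, 40]})

# Step 1: total amount defaulted on each distinct day.
per_day = df.groupby('time_to_default')['default_amnt'].sum()

# Step 2: running total over the days (groupby returns them sorted ascending).
running = per_day.cumsum()
```

Only days that actually appear in the data show up in the index at this point, so days without a default are still missing.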

You then need to reindex the new series to fill in the missing days:

cum_defaults = cum_defaults.reindex(index=range(min(cum_defaults.index),
                                                max(cum_defaults.index) + 1),
                                    method='ffill')

With example data:

df = pd.DataFrame({'time_to_default': [1, 3, 3, 6],
                   'default_amnt': [10, 20, 30, 40]})

>>> cum_defaults
time_to_default
1     10
2     10
3     60
4     60
5     60
6    100
Name: default_amnt, dtype: int64
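The question also asks for an initial value of 0. One possible way to get that, sketched here under the assumption that the series should start at day 0, is to reindex from 0 instead of from the first default day and fill the leading gap with 0. The name full_range is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'time_to_default': [1, 3, 3, 6],
                   'default_amnt': [10, 20, 30, 40]})

# Sum defaults per day, then take the running total across days.
cum_defaults = df.groupby('time_to_default')['default_amnt'].sum().cumsum()

# Assumption: the series should cover every day from 0 to the last default day.
full_range = range(0, cum_defaults.index.max() + 1)

# Forward-fill days without defaults; days before the first default have
# nothing to fill from, so they come back as NaN and are set to 0.
cum_defaults = (cum_defaults.reindex(full_range, method='ffill')
                            .fillna(0)
                            .astype(int))
```

The astype(int) call is there because the intermediate NaN turns the series into floats; it can be dropped if float amounts are acceptable.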
