python - Create a time series from data
I have a dataframe that contains information on defaults within a loan portfolio, measured from the time origination occurred. Each 'observation' is a pair representing the time t in days and the amount of the loan default:
df['time_to_default']  # time from origination to default
df['default_amnt']     # loan amount that defaulted
I would like to create a series that represents the cumulative amount of defaults at a given time t. (Assume time_to_default is evenly divisible by t.) I cannot figure out how to create a new dataframe element, assign it an initial value of 0, and iterate through the series.
It sounds like you need to use groupby and cumsum, since you want a running total:
cum_defaults = df.groupby('time_to_default').default_amnt.sum().cumsum()
You then need to reindex the new series to fill in the missing days:
cum_defaults = cum_defaults.reindex(index=range(min(cum_defaults.index), max(cum_defaults.index) + 1), method='ffill')
With example data:
import pandas as pd
df = pd.DataFrame({'time_to_default': [1, 3, 3, 6], 'default_amnt': [10, 20, 30, 40]})
>>> cum_defaults
time_to_default
1     10
2     10
3     60
4     60
5     60
6    100
Name: default_amnt, dtype: int64
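The question also mentions starting from an initial value of 0. A minimal sketch of one way to do that, assuming the series should begin at day 0 with no defaults yet: the helper name cumulative_defaults and its start_day parameter are hypothetical and not part of the original answer.

```python
import pandas as pd


def cumulative_defaults(df, start_day=0):
    """Running total of defaulted amounts for every day from start_day on.

    Hypothetical helper: `start_day` is an assumed parameter, not from the answer.
    """
    # Total defaulted amount per day, then a running total across days.
    running = df.groupby("time_to_default")["default_amnt"].sum().cumsum()
    # Every day from start_day up to the last observed default.
    all_days = range(start_day, int(running.index.max()) + 1)
    # Forward-fill days with no defaults; days before the first default become 0.
    filled = running.reindex(index=all_days, method="ffill").fillna(0)
    # fillna can leave floats behind, so restore the original dtype.
    return filled.astype(running.dtype)


df = pd.DataFrame({"time_to_default": [1, 3, 3, 6],
                   "default_amnt": [10, 20, 30, 40]})
cum_defaults = cumulative_defaults(df)
print(cum_defaults)
```

With the example data this gives 0 for day 0 and otherwise the same values as above. No explicit loop is needed, because reindex with method="ffill" carries the last known total forward.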