python - Pandas, Filling between dates with average change between previous rows -
i think best illustrated examples. lets have dataframe such:
295340 299616 2014-11-02 304.904110 157.123288 2014-12-02 597.303413 305.488493 2015-01-02 896.310372 454.614630 2015-02-02 1192.379580 599.588466 2015-02-04 1211.285484 nan 2015-03-02 nan 726.622932
now let's want reindex this, such:
rng = pd.date_range(df.index[0], df.index[-1]) df.reindex(rng) 295340 299616 2014-11-02 304.904110 157.123288 2014-11-03 nan nan 2014-11-04 nan nan 2014-11-05 nan nan ... 2014-11-29 nan nan 2014-11-30 nan nan 2014-12-01 nan nan 2014-12-02 597.303413 305.488493
now if @ 295340, see difference between values (597.30-304.90) = 292.39.
the amount of days between 2 values 31. average increase 9.43 day.
so want such:
295340 299616 2014-11-02 304.904110 157.123288 2014-11-03 314.336345 nan 2014-11-04 323.768581 nan 2014-11-05 333.200816 nan
the way calculated was:
304.904110 + (((597.303413-304.904110) / 31) * n)
where n 1 first row since row 1, 2 after, etc.
i want columns filled way, 299616 same method , such.
any ideas efficient possible? know of ways this, nothing seems efficient , seems there should type of fillna() or works type of finance related problem.
note: columns not spaced out same. each 1 can have numbers anywhere within range of dates, can't assume next number each column @ x date.
you can use dataframe.interpolate
"time"
method after resample
. (it won't give quite numbers gave, because there 30 days between 2 nov , 2 dec, not 31):
>>> dnew = df.resample("1d").interpolate("time") >>> dnew.head(100) 295340 299616 2014-11-02 304.904110 157.123288 2014-11-03 314.650753 162.068795 [...] 2014-11-28 558.316839 285.706466 2014-11-29 568.063483 290.651972 2014-11-30 577.810126 295.597479 2014-12-01 587.556770 300.542986 2014-12-02 597.303413 305.488493 2014-12-03 606.948799 310.299014 [...] 2014-12-30 867.374215 440.183068 2014-12-31 877.019600 444.993589 2015-01-01 886.664986 449.804109 2015-01-02 896.310372 454.614630 [...] 2015-02-01 1182.828960 594.911891 2015-02-02 1192.379580 599.588466 [...]
the downside here it'll extrapolate using last value @ end:
[...] 2015-01-31 1173.278341 590.235315 2015-02-01 1182.828960 594.911891 2015-02-02 1192.379580 599.588466 2015-02-03 1201.832532 604.125411 2015-02-04 1211.285484 608.662356 2015-02-05 1211.285484 613.199302 2015-02-06 1211.285484 617.736247 [...]
so you'd have decide how want handle that.
Comments
Post a Comment