Conditional group by join in R -
i new r , rather flumoxed following problem. have 2 vectors of dates (the vectors not aligned, nor of same length).
i want find each date in first vector next date in second vector.
veca <- as.date(c('1951-07-01', '1953-01-01', '1957-04-01', '1958-12-01', '1963-06-01', '1965-05-01')) vecb <- as.date(c('1952-01-12', '1952-02-01', '1954-03-01', '1958-08-01', '1959-03-01', '1964-03-01', '1966-05-01'))
in sql write this, cannot find tips in how in r.
select veca.date, min(vecb.date) veca inner join vecb on veca.date < vecb.date group veca.date
the output should this:
start end 1951-07-01 1952-01-12 1953-01-01 1954-03-01 1957-04-01 1958-08-01 1958-12-01 1959-03-01 1963-06-01 1964-03-01 1965-05-01 1966-05-01
here's possible solution using data.table
rolling joins
library(data.table) dt1 <- as.data.table(veca) ## convert `data.table` object dt2 <- as.data.table(vecb) ## convert `data.table` object setkey(dt2) # key in order perform binary join res <- dt2[dt1, vecb, roll = -inf, = .eachi] # run inner join while selecting closest date setnames(res, c("start", "end")) res # start end # 1: 1951-07-01 1952-01-12 # 2: 1953-01-01 1954-03-01 # 3: 1957-04-01 1958-08-01 # 4: 1958-12-01 1959-03-01 # 5: 1963-06-01 1964-03-01 # 6: 1965-05-01 1966-05-01
alternatively, can do:
data.table(veca=vecb, vecb, key="veca")[dt1, roll=-inf]
Comments
Post a Comment