R - Why does summarise on grouped data result in only an overall summary in dplyr?
Suppose I have the following data:

    dfx <- data.frame(
      group = c(rep('a', 8), rep('b', 15), rep('c', 6)),
      sex = sample(c("m", "f"), size = 29, replace = TRUE),
      age = runif(n = 29, min = 18, max = 54)
    )

With the old plyr I can create a little table summarizing my data with the following code:

    require(plyr)
    ddply(dfx, .(group, sex), summarize,
          mean = round(mean(age), 2),
          sd = round(sd(age), 2))

The output looks like this:

      group sex  mean    sd
    1     a   f 49.68  5.68
    2     a   m 32.21  6.27
    3     b   f 31.87  9.80
    4     b   m 37.54  9.73
    5     c   f 40.61 15.21
    6     c   m 36.33 11.33

I'm trying to move my code to dplyr and the %>% operator. My code takes the data frame, groups it by group and sex, and then summarises it. That is:

    dfx %>%
      group_by(group, sex) %>%
      summarise(mean = round(mean(age), 2),
                sd = round(sd(age), 2))

But my output is:

       mean   sd
    1 35.56 9.92

What am I doing wrong?

Thanks!
The problem here is that you are loading dplyr first and then plyr, so plyr's function summarise is masking dplyr's function summarise. plyr's summarise ignores the grouping set by group_by, which is why you get a single overall row. When that happens you get this warning:

    require(plyr)
    Loading required package: plyr
    ------------------------------------------------------------------------------------------
    You have loaded plyr after dplyr - this is likely to cause problems.
    If you need functions from both plyr and dplyr, please load plyr first, then dplyr:
    library(plyr); library(dplyr)
    ------------------------------------------------------------------------------------------

    Attaching package: ‘plyr’

    The following objects are masked from ‘package:dplyr’:

        arrange, desc, failwith, id, mutate, summarise, summarize

So in order for your code to work, either detach plyr:

    detach(package:plyr)

or restart R and load plyr first and then dplyr (or load only dplyr):

    library(dplyr)
    dfx %>%
      group_by(group, sex) %>%
      summarise(mean = round(mean(age), 2),
                sd = round(sd(age), 2))

    Source: local data frame [6 x 4]
    Groups: group

      group sex  mean    sd
    1     a   f 41.51  8.24
    2     a   m 32.23 11.85
    3     b   f 38.79 11.93
    4     b   m 31.00  7.92
    5     c   f 24.97  7.46
    6     c   m 36.17  9.11

(The numbers differ from those in the question because dfx is generated randomly.)

Or you can explicitly call dplyr's summarise in your code, so the right function is called no matter how you load the packages:

    dfx %>%
      group_by(group, sex) %>%
      dplyr::summarise(mean = round(mean(age), 2),
                       sd = round(sd(age), 2))
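To make the masking mechanism concrete, here is a minimal base-R sketch that needs neither plyr nor dplyr. It is only an illustration: the two lists stand in for two packages, and the names pkg_first, pkg_second and summarise_demo are made up for this example. Whatever is attached last sits earliest on the search path, so a bare call picks up that definition.

```r
# Base-R sketch of masking. The two lists stand in for two packages;
# pkg_first, pkg_second and summarise_demo are hypothetical names.
pkg_first  <- list(summarise_demo = function() "first")
pkg_second <- list(summarise_demo = function() "second")

attach(pkg_first,  name = "pkg_first")
attach(pkg_second, name = "pkg_second")  # attached last, so searched first

# Search-path entries that define the name, in lookup order
lookup_order <- find("summarise_demo")

# A bare call resolves to the most recently attached definition
bare_call <- summarise_demo()

# Explicit lookup, similar in spirit to writing dplyr::summarise
explicit_call <- get("summarise_demo", pos = "pkg_first")()

# Detaching the masking entry restores the earlier definition
detach("pkg_second", character.only = TRUE)
after_detach <- summarise_demo()

detach("pkg_first", character.only = TRUE)  # clean up the search path
```

With the real packages attached, the same check should work on the actual function: find("summarise") lists the attached packages that define it in lookup order, and environmentName(environment(summarise)) reports which package a bare summarise call resolves to.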