R - plyr package: writing the same function over multiple columns
I want to apply the same function to multiple columns using ddply, but so far I have been writing each one out by hand in a single call. Is there a better way of doing this?

Here's a simple version of the data:

    library(plyr)
    data <- data.frame(type = as.integer(runif(20, 1, 3)),
                       a_mean_weight = runif(20, 1, 100),
                       b_mean_weight = runif(20, 1, 10))

I want to find the sum of the columns a_mean_weight and b_mean_weight, which I am doing like this:

    ddply(data, .(type), summarise,
          mean_a = sum(a_mean_weight),
          mean_b = sum(b_mean_weight))

But in my current data I have more than 8 "*_mean_weight" columns, and I'm tired of writing them out 8 times:

    ddply(data, .(type), summarise,
          mean_a = sum(a_mean_weight), mean_b = sum(b_mean_weight),
          mean_c = sum(c_mean_weight), mean_d = sum(d_mean_weight),
          mean_e = sum(e_mean_weight), mean_f = sum(f_mean_weight),
          mean_g = sum(g_mean_weight), mean_h = sum(h_mean_weight))

Is there a better way to write this? Thanks for the help!
The plyr-centred approach is to use colwise, e.g.

    ddply(data, .(type), colwise(sum))
      type a_mean_weight b_mean_weight
    1    1      319.8977      60.80317
    2    2      621.6745      37.05863

You can pass column names to the .cols argument if you want only a subset of the columns.
You can also use numcolwise or catcolwise to act on numeric or categorical columns only.
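As a rough sketch of the column-subset option, the snippet below selects every column whose name ends in "_mean_weight" and hands those names to colwise. The set.seed call, the weight_cols variable, and the grep pattern are illustrative assumptions, not part of the original post.

```r
library(plyr)

set.seed(1)  # assumption: fixed seed so the example is reproducible
data <- data.frame(type = as.integer(runif(20, 1, 3)),
                   a_mean_weight = runif(20, 1, 100),
                   b_mean_weight = runif(20, 1, 10))

# Hypothetical helper variable: names of all "*_mean_weight" columns
weight_cols <- grep("_mean_weight$", names(data), value = TRUE)

# Sum only the selected columns within each type
sums <- ddply(data, .(type), colwise(sum, weight_cols))
sums
```

Because the columns are picked by name pattern, the same call should keep working when more "*_mean_weight" columns are added to the data.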
Note that you could use sapply in place of the basic use of colwise:

    ddply(data, .(type), sapply, FUN = 'mean')

The idiomatic data.table approach is to use lapply(.SD, FUN), e.g.

    library(data.table)
    dt <- data.table(data)
    dt[, lapply(.SD, sum), by = type]
       type a_mean_weight b_mean_weight
    1:    2      621.6745      37.05863
    2:    1      319.8977      60.80317
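If the table also holds columns that should not be summed, one option is to restrict .SD with the .SDcols argument. This is a minimal sketch under that assumption; the set.seed call, the weight_cols variable, and the grep pattern are made up for illustration.

```r
library(data.table)

set.seed(1)  # assumption: fixed seed so the example is reproducible
data <- data.frame(type = as.integer(runif(20, 1, 3)),
                   a_mean_weight = runif(20, 1, 100),
                   b_mean_weight = runif(20, 1, 10))
dt <- as.data.table(data)

# Hypothetical helper variable: names of all "*_mean_weight" columns
weight_cols <- grep("_mean_weight$", names(dt), value = TRUE)

# .SDcols limits .SD to the chosen columns before lapply runs
sums_dt <- dt[, lapply(.SD, sum), by = type, .SDcols = weight_cols]
sums_dt
```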