
Transforming subsets of data in R with by, ddply and data.table

Transforming data sets with R is usually the starting point of my data analysis work. Here is a scenario which comes up from time to time: transform subsets of a data frame, based on context given in one or a combination of columns.

As an example I use a data set which shows sales figures by product for a number of years:

    df <- data.frame(Product=gl(3,10,labels=c("A","B", "C")),
                     Year=factor(rep(2002:2011,3)),
                     Sales=1:30)
    head(df)
    ##   Product Year Sales
    ## 1       A 2002     1
    ## 2       A 2003     2
    ## 3       A 2004     3
    ## 4       A 2005     4
    ## 5       A 2006     5
    ## 6       A 2007     6

I am interested in absolute and relative sales developments by product over time.
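To make the goal concrete, here is a minimal sketch of one possible reading of "absolute and relative sales developments": the year-on-year change in sales within each product, in units and as a fraction of the previous year. This interpretation, the column names AbsChange and RelChange, and the use of base R's ave() are assumptions made for illustration only. The sketch also assumes the rows are already ordered by year within each product, as they are in df.

```r
# Rebuild the example data so the sketch runs on its own
df <- data.frame(Product = gl(3, 10, labels = c("A", "B", "C")),
                 Year = factor(rep(2002:2011, 3)),
                 Sales = 1:30)

# Work on a numeric copy so the relative changes are stored as doubles
sales <- as.numeric(df$Sales)

# Absolute change: difference to the previous year within each product
# (NA for the first year of each product, which has no predecessor)
df$AbsChange <- ave(sales, df$Product,
                    FUN = function(x) c(NA, diff(x)))

# Relative change: absolute change divided by the previous year's sales
df$RelChange <- ave(sales, df$Product,
                    FUN = function(x) c(NA, diff(x) / head(x, -1)))

head(df, 3)
```

ave() applies the function to each product's block of sales and returns a vector of the same length as the input, so the result can be attached to the data frame as a new column.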

Say it in R with "by", "apply" and friends

R is a language, as Luis Apiolaza pointed out in his recent post. This is absolutely true, and learning a programming language is not much different from learning a foreign language. It takes time and a lot of practice to become proficient in it. I started using R when I moved to the UK, and I wonder whether I have a better understanding of English or R by now.