avcontentteam — Published On August 3, 2015 and Last Modified On September 16th, 2015


R offers a large number of packages for data analysis. Apart from providing a capable environment for statistical analysis, the next best thing about R is the sustained support it gets from developers and data scientists all over the world. The count of packages downloadable from CRAN currently stands close to 7,000!

Beyond popular packages such as caret, ggplot2, dplyr, and lattice, there are many more libraries that go unnoticed but prove very handy at certain stages of analysis. So we created a list of useful R packages for data analysis.

To make the guide more useful, we did two more things:

  1. Mapped each of these libraries to the stage at which it is generally used: Pre-Modeling, Modeling, or Post-Modeling.
  2. Created a handy infographic of the most commonly used libraries. Analysts can print it out and keep it handy for reference. The graphic is displayed below:

[Infographic: useful libraries in R for data science, analytics, and data mining]

Here is a complete guide to powerful R packages, categorized by the stage of the data analysis process at which they are used. Download Here.
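As a rough illustration of the stage-based grouping, here is a minimal sketch that stores a few packages per stage and reports which of them are missing from the local library. The variable names (`stages`, `missing_by_stage`) and the assignment of packages to stages are assumptions made for this example; the guide itself is the reference for the actual categorization.

```r
# Illustrative stage-to-package mapping. The stage assignments here are
# assumptions for this sketch, not copied from the infographic.
stages <- list(
  "Pre-Modeling"  = c("dplyr", "data.table", "ggplot2"),
  "Modeling"      = c("caret", "randomForest", "survival"),
  "Post-Modeling" = c("pROC", "timeROC")
)

# Names of all packages installed in the current library paths.
installed <- rownames(installed.packages())

# For each stage, keep only the packages that are not installed locally.
missing_by_stage <- lapply(stages, function(pkgs) setdiff(pkgs, installed))

print(missing_by_stage)
```

Any names reported as missing can then be passed to install.packages() one stage at a time, which keeps a failed install in one stage from blocking the others.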


7 thoughts on "List of useful packages (libraries) for Data Analysis in R"

MIke S says: August 10, 2015 at 5:44 am
Thanks for the pdf, it's very thorough.
Marco says: September 01, 2015 at 7:00 pm
Thanks for another great reference! I would also suggest slam and quanteda for text mining in R.
Balaji Subudhi says: September 08, 2015 at 8:25 am
Run the code below to install all of the above libraries in a single go.

    ## This should detect and install missing packages before loading them - hopefully!
    ## Package names are case-sensitive; some of these may not be available from CRAN.
    list.of.packages <- c("dplyr", "plyr", "data.table", "missForest", "missMDA",
                          "outliers", "evir", "features", "RRF", "FactoMineR", "CCP",
                          "ggplot2", "googleVis", "rCharts", "car", "randomForest",
                          "rminer", "CORElearn", "caret", "bigrf", "cba", "Rankcluster",
                          "forecast", "ltsa", "survival", "BaSTA", "lsmeans",
                          "comparison", "regtest", "ACD", "binomTools", "Daim",
                          "clusteval", "sigclust", "pROC", "timeROC", "Rcpp", "parallel",
                          "XML", "httr", "rjson", "jsonlite", "shiny", "rmarkdown", "tm",
                          "openNLP", "sqldf", "RODBC", "rmongodb")
    ## Keep only the packages that are not installed yet, then install them.
    new.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[, "Package"])]
    if (length(new.packages)) install.packages(new.packages)
    ## Load every package in the list.
    lapply(list.of.packages, function(x) { library(x, character.only = TRUE) })
Gokul says: October 20, 2015 at 6:02 am
@Balaji Subudhi. Thanks Balaji, I installed everything in a single shot with the help of your code.
priya says: June 08, 2016 at 4:56 pm
Thanks for sharing.
Steve says: June 17, 2016 at 12:42 am
Some packages I also found useful: zoo (an advanced time series class, if needed); imputeTS (for time series missing value imputation); mlr (a meta package for classification and regression); VIM (for imputation of multivariate data); randomForest and e1071 (SVMs), though I would rather use these two via interface packages like mlr, caret, or rminer.
Meenakshi says: May 21, 2018 at 7:32 pm
Thanks, great help Balaji. I need a bit more information. How do I recall packages? Does it include parametric and non-parametric statistical tests? And correlation and regression?
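On the questions above, a minimal sketch, assuming "recall" means loading an already installed package in a new session. It uses parallel, which appears in the list and ships with R, and then calls a few functions from the base stats package (attached by default) that cover parametric and non-parametric tests, correlation, and regression. The sample vectors `x` and `y` are made-up values for illustration only.

```r
# Load (attach) a package that is already installed. "parallel" ships
# with R, so no prior install.packages() call is needed.
library(parallel)

# Confirm the package is attached to the search path.
is_attached <- "package:parallel" %in% search()
print(is_attached)

# Made-up sample data for illustration.
x <- c(5.1, 4.9, 6.2, 5.8, 6.0)
y <- c(4.8, 5.2, 5.9, 6.1, 5.5)

# Base "stats" functions, available without installing anything extra.
t_result <- t.test(x, y)       # parametric: Welch two-sample t-test
w_result <- wilcox.test(x, y)  # non-parametric: Wilcoxon rank-sum test
r        <- cor(x, y)          # Pearson correlation coefficient
fit      <- lm(y ~ x)          # simple linear regression

print(t_result$p.value)
print(w_result$p.value)
print(r)
print(coef(fit))
```

A package has to be installed only once, but library() must be called again in every new R session before the package's functions can be used.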
