Manish Barnwal

...just another human

Tutorial on dplyr: a package for data manipulation in R

R is one of the most widely used tools in data science, and it has no shortage of packages for specific use cases. Three packages, I feel, can get most of your work done: ggplot2, dplyr, and data.table.

  • ggplot2: used for visualization. It implements the grammar of graphics. This package ...

Diving into H2O with R

Do you know the pain of training advanced machine learning algorithms like Random Forest on huge datasets? When a factor column has far too many levels? When the model takes so long to train that you went to your ...

How to install a particular version of a package in R

I recently tried installing the caret package in R using

install.packages('caret', dependencies = TRUE)

Normally this installation works, and I go on to use the functions the package provides. But when I tried loading the package using

library(caret)

I got the following error.

Error in loadNamespace(j ...

When an R package is not available across the cluster

When deploying R code across a cluster, a task often fails because a particular package is not available on all nodes of the cluster. We then wait for someone to install the package on every node, which may take days. Do we ...