Manish Barnwal

...just another human

An illustrated introduction to adversarial validation part 1

You have probably heard of cross-validation, a common technique used in the data science process to guard against overfitting and, quite often, to tune model parameters. Overfitting is when the model does well on training data but fails drastically on test data. The reason could be one of the following:

  1. The ...

The curse of bias and variance [draft]

Statistics is the field of study where we try to draw conclusions about a population from a sample. Why do we talk about a sample? Why can't we draw conclusions about the population directly from the population itself? Let me illustrate this with an example.

Let us say we want ...

Visualization in ML is under-rated

Visualization is one of the most important pillars of data science. Everyone wants to learn machine learning, but if you explain to them the little tasks that make up the overall workflow, it turns them off. Everyone just wants to do the cool stuff. They want to build ...

The filmy Secret Santa

Ever heard of the Secret Santa game? If not, you may not appreciate this article. Wait till you join the corporate world. But if you have, Secret Santa may ring a bell.

Dialogues define our cinema. Ever imagined this twist: what if Secret Santa gets introduced to Bollywood? Presenting, Secret ...

Coldplay-Fix you | Ghar aa jaao

Last month, on a mundane weekend, I was practicing a song, Fix You by Coldplay, when my friend Dipu came into the room. He stood there for some time, trying to understand what I was trying to play. I am not that good at the guitar yet. Chances are you may want ...

Improve runtime of Random Forest in R

There are two ways to write the code that trains a random forest model in R. Both are listed below.

A common way of writing the command to train a random forest model looks something like this.

rfModel <- randomForest(Survived~. , data = trainSample[, -c(6, 8 ...

How to install a package of a particular version in R

I recently tried installing the caret package in R using

install.packages('caret', dependencies = TRUE)

Normally this kind of installation works, and I carry on using the functions that come with the package. When I tried including the package using


I got the following error.

Error in loadNamespace(j ...

Don’t introduce that bias in your child

A thought crossed my mind yesterday, and just when it was about to get lost in the cloud of thoughts scattered in my mind… I caught hold of it. I felt I had gathered the most I could from this thought, but I wanted to write about it so that ...

Shell commands come in handy for a data scientist

I am no expert in shell commands. I have been using them for quite some time and thought I would attempt to list the most common ones. I am writing these mostly from the perspective of a data science guy. Let us get started.

I will use the file- ...