# How to use Git and GitHub

I took the course How to Use Git and GitHub some time last year. This post combines my course notes with other tutorials I completed while learning Git. I will talk about the most frequently used commands. If you are already confident of your git ...

# An illustrated introduction to adversarial validation part 1

You have probably heard of cross-validation, a common technique in the data science process used to guard against overfitting and, often, to tune model parameters. Overfitting is when a model does well on training data but fails drastically on test data. The reason could be one of the following:

1. The ...

# The curse of bias and variance [draft]

Statistics is the field of study where we try to draw conclusions about a population from a sample. Why do we talk about a sample? Why can't we draw conclusions about the population directly from the population? Let me illustrate this with an example.

Let us say we want ...

# Visualization in ML is underrated

Visualization is one of the most important pillars of data science. Everyone wants to learn machine learning, but if you explain to them the little tasks that make up the overall workflow, it turns them off. Everyone just wants to do the cool stuff. They want to build ...

# The filmy Secret Santa

Ever heard of the Secret Santa game? If not, you may not appreciate this article. Wait till you join the corporate world. But if you have, Secret Santa may ring a bell.

Dialogues define our cinema. Ever imagined this twist: what if Secret Santa were introduced to Bollywood? Presenting, Secret ...

# Coldplay - Fix You | Ghar aa jaao

Last month, on a mundane weekend, I was practicing the song Fix You by Coldplay when my friend Dipu came into the room. He stood there for some time, trying to understand what I was playing. I am not that good on the guitar yet. Chances are you may want ...

# Random Forest explained intuitively

The Random Forest algorithm has always fascinated me. I like how easily it can be explained to anyone. One quick example, I use very ...

# Improve runtime of Random Forest in R

There are two ways to write the code that trains a random forest model in R. Both are listed below.

The usual way of writing the command to train a random forest model looks something like this.

rfModel <- randomForest(Survived~. , data = trainSample[, -c(6, 8 ...

# How to install a package of a particular version in R

I recently tried installing the caret package in R using

install.packages('caret', dependencies=T)


Normally this installation works and I can go on to use the package's functions. But when I tried loading the package using

library(caret)


I got the following error.

Error in loadNamespace(j ...

# Don’t introduce that bias in your child

A thought crossed my mind yesterday, and just when it was about to get lost in the cloud of thoughts scattered in my mind… I caught hold of it. I felt I had gathered the most I could from this thought, but I wanted to write about it so that ...