Increasing Classification Accuracy with Ensemble Learners

Complex classification problems arise in many industrial settings, from investment timing and drug discovery to fraud detection and recommendation systems, and different approaches are applied to each of them. In the search for a viable solution, various algorithms are tried, and sometimes no single one is clearly better than the rest. In such instances, one can elect to keep them all and build the final classifier by combining the individual classifiers. Combining different classifiers to improve predictive accuracy is often referred to as ensemble learning.

Ensemble learning is a machine learning concept that combines multiple learning algorithms, each solving the same original task, to obtain a combined model with more accurate and reliable estimates or decisions than can be obtained from any single model.

Ensemble learning architecture

How then is the combination achieved?

The quick answer is that I can take the average of their outputs, or apply one of the many other ways of making the most of my sub-classifiers. This is the idea behind ensemble learning: divide the decision among several classifiers to arrive at a more accurate or representative decision - the divide-and-conquer technique of the machine learning world.
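To make the two simplest combination rules concrete, here is a minimal hand-rolled sketch of majority voting over predicted labels and averaging of probability estimates. The data and function names are illustrative, not from any particular library:

```python
from collections import Counter

def majority_vote(predictions):
    """Combine label predictions from several classifiers by majority vote."""
    combined = []
    for labels in zip(*predictions):  # one tuple of labels per sample
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined

def average_probabilities(prob_lists):
    """Combine probability estimates from several classifiers by averaging."""
    return [sum(p) / len(p) for p in zip(*prob_lists)]

# Three classifiers' label predictions for four samples
preds = [
    ["spam", "ham", "spam", "ham"],
    ["spam", "spam", "spam", "ham"],
    ["ham",  "ham",  "spam", "spam"],
]
print(majority_vote(preds))  # → ['spam', 'ham', 'spam', 'ham']
```

Each sample's final label is whatever most sub-classifiers voted for; for probabilistic outputs, averaging plays the same role.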

Ensemble machine learning methods use multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone.

Ensemble techniques are based on algorithms, some of which are simple and computationally cheap, while others are quite complex and computationally intensive. In any production environment, both accuracy and computation time are important, so in most instances data scientists need to make a compromise between the two. One way to achieve this compromise is to combine many weak learners and derive a confidence index from them, which is essential when implementing such a system for a real-time application with very high accuracy. The figure below shows the basic architecture of such a framework.
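The claim that many weak learners yield high confidence can be checked with a short calculation. Assuming each learner is correct independently with the same probability (a strong assumption in practice, since real learners' errors are correlated), the accuracy of a majority vote follows a binomial sum:

```python
from math import comb

def ensemble_accuracy(n_learners, p_correct):
    """Probability that a majority vote of n independent learners,
    each correct with probability p_correct, gives the right answer."""
    k_majority = n_learners // 2 + 1
    return sum(
        comb(n_learners, k) * p_correct**k * (1 - p_correct)**(n_learners - k)
        for k in range(k_majority, n_learners + 1)
    )

# A single weak learner at 60% accuracy vs. ensembles of 11 and 101
for n in (1, 11, 101):
    print(n, round(ensemble_accuracy(n, 0.6), 3))
```

The ensemble accuracy climbs well above the 0.6 of any individual learner as the ensemble grows, which is exactly the confidence gain the combination is meant to buy.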

Why use ensemble methods?

Under what circumstances does one need to use ensemble methods, you may ask? Although one could argue many reasons, here I outline some of the most general ones: improved predictive accuracy over any single constituent model, reduced variance and hence less overfitting, and greater robustness when no single algorithm clearly fits the problem.

Ensemble Approaches [Next Post]

My next post will explore some examples of increasing classification accuracy using the main ensemble learning approaches: bagging, boosting, and stacking, all of which leverage a crowd of experts. The objective is to improve accuracy by decreasing variance (bagging), decreasing bias (boosting), or improving predictions (stacking).
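As a small preview of bagging, the sketch below trains many "weak" one-dimensional threshold stumps on bootstrap resamples of a toy dataset and combines them by majority vote. The dataset, the stump learner, and the number of rounds are all made up for illustration:

```python
import random

def train_stump(sample):
    """Pick the threshold t (from the sampled x values) that best
    separates the sample when predicting label 1 for x >= t."""
    best_t, best_acc = sample[0][0], -1.0
    for t, _ in sample:
        acc = sum((x >= t) == bool(y) for x, y in sample) / len(sample)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

def bagging_predict(stumps, x):
    """Majority vote over the ensemble of stumps."""
    votes = sum(x >= t for t in stumps)
    return int(2 * votes > len(stumps))

random.seed(0)
data = [(x / 10, int(x / 10 >= 0.5)) for x in range(10)]  # true threshold: 0.5
stumps = []
for _ in range(25):                               # 25 bootstrap rounds
    sample = [random.choice(data) for _ in data]  # resample with replacement
    stumps.append(train_stump(sample))

print(bagging_predict(stumps, 0.9), bagging_predict(stumps, 0.1))
```

Each stump sees a slightly different resampling of the data, so its threshold varies; voting across them smooths out that variance, which is the point of bagging.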

Written on July 27, 2017
Tags: #machine learning #data science #ensemble learning #bagging #boosting #stacking