
CatBoost vs. LightGBM vs. XGBoost

Which is the best algorithm?

Kay Jan Wong · Published in TDS Archive · 5 min read · May 5, 2022


Photo by Tingey Injury Law Firm on Unsplash

CatBoost (Category Boosting), LightGBM (Light Gradient Boosted Machine), and XGBoost (eXtreme Gradient Boosting) are all gradient boosting algorithms. Before diving into their similarities and differences in characteristics and performance, we must first understand ensemble learning and how it relates to gradient boosting.
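All three libraries expose a similar scikit-learn-style interface, which makes them easy to compare side by side. Here is a minimal sketch, not from the original article, assuming the catboost, lightgbm, and xgboost packages are installed:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from catboost import CatBoostClassifier
from lightgbm import LGBMClassifier
from xgboost import XGBClassifier

# Toy binary classification dataset, for illustration only
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# All three libraries offer a scikit-learn-style fit/predict interface
models = {
    "CatBoost": CatBoostClassifier(verbose=0, random_state=42),
    "LightGBM": LGBMClassifier(random_state=42),
    "XGBoost": XGBClassifier(random_state=42),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(name, model.score(X_test, y_test))
```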

Table of Contents

  1. Ensemble Learning
  2. CatBoost vs. LightGBM vs. XGBoost Characteristics
  3. Improving Accuracy, Speed, and Controlling Overfitting
  4. Performance Comparison

Ensemble Learning

Ensemble learning is a technique that combines the predictions of multiple models to produce a prediction that is more stable and generalizes better. The idea is that averaging out the individual mistakes of different models reduces the risk of overfitting while maintaining strong predictive performance.
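As a rough illustration of this averaging idea, here is a sketch that trains several scikit-learn decision trees on bootstrap samples and averages their predictions. Note this is a bagging-style ensemble, used here only to illustrate the principle; gradient boosting, by contrast, builds trees sequentially, each correcting the errors of the previous ones.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=500, noise=10.0, random_state=0)
rng = np.random.default_rng(0)

# Train several trees, each on a different bootstrap sample of the data
trees = []
for _ in range(10):
    idx = rng.integers(0, len(X), size=len(X))  # sample with replacement
    trees.append(DecisionTreeRegressor().fit(X[idx], y[idx]))

# The ensemble prediction averages out each tree's individual mistakes,
# giving a more stable result than any single tree
ensemble_pred = np.mean([t.predict(X) for t in trees], axis=0)
```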

In regression, the overall prediction is typically the mean of the individual tree predictions. In classification, the overall prediction comes from a vote in which class probabilities are averaged across all trees, and the class with the highest average probability becomes the final predicted class.
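Concretely, the two aggregation rules might look like this; the per-tree outputs below are hypothetical values invented for illustration:

```python
import numpy as np

# Regression: each row is one tree's predictions for the same two samples
tree_preds = np.array([[2.1, 3.0], [1.9, 3.4], [2.0, 3.2]])
regression_pred = tree_preds.mean(axis=0)  # mean across trees

# Classification: each tree outputs class probabilities per sample;
# shape is (n_trees, n_samples, n_classes)
tree_probas = np.array([
    [[0.7, 0.3], [0.4, 0.6]],
    [[0.6, 0.4], [0.5, 0.5]],
    [[0.8, 0.2], [0.3, 0.7]],
])
avg_probas = tree_probas.mean(axis=0)        # average probabilities across trees
predicted_class = avg_probas.argmax(axis=1)  # highest average probability wins
```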
