CatBoost Part 2: Building and Using Trees
StatQuest with Josh Starmer

Published on Mar 5, 2023

Just like we saw in CatBoost Part 1, Ordered Target Encoding, we're going to use the training data one row at a time to build trees and calculate their output values. This is part of CatBoost's determined effort to avoid leakage like there is no tomorrow. We'll also learn how CatBoost makes predictions once the trees are made.
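
For anyone who wants to play with the row-at-a-time idea, here is a minimal sketch, not CatBoost's actual implementation. It assumes that each row's leaf output is the average of the residuals from the rows that came before it in the same leaf, and that a row with no earlier rows in its leaf gets 0. The function names and the toy numbers are made up for illustration. A cosine similarity helper is included because that is how the video scores a candidate threshold.

```python
import math


def ordered_leaf_outputs(leaf_ids, residuals):
    """For each row, average the residuals of EARLIER rows in the same leaf.

    Assumption: a row with no earlier rows in its leaf gets an output of 0.
    """
    sums, counts = {}, {}
    outputs = []
    for leaf, residual in zip(leaf_ids, residuals):
        n = counts.get(leaf, 0)
        # Only rows seen so far contribute, so a row never sees its own target.
        outputs.append(sums[leaf] / n if n else 0.0)
        sums[leaf] = sums.get(leaf, 0.0) + residual
        counts[leaf] = n + 1
    return outputs


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length lists of numbers."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Toy data (hypothetical): which leaf each row lands in, and its residual.
leaf_ids = [0, 1, 0, 1]
residuals = [1.0, 2.0, 3.0, 4.0]

outputs = ordered_leaf_outputs(leaf_ids, residuals)
score = cosine_similarity(outputs, residuals)
print(outputs, score)
```

A higher score means the leaf outputs point in a direction closer to the residuals, so under these assumptions you would compare the score across candidate thresholds and keep the best one.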

NOTE: This StatQuest is based on the original CatBoost manuscript... https://arxiv.org/abs/1706.09516
...and an example provided in the CatBoost documentation...
https://catboost.ai/en/docs/concepts/...

English
This video has been dubbed using an artificial voice via https://aloud.area120.google.com to increase accessibility. You can change the audio track language in the Settings menu.

Spanish
This video has been dubbed into Spanish using an artificial voice via https://aloud.area120.google.com to increase accessibility. You can change the audio track language in the Settings menu.

Portuguese
This video has been dubbed into Portuguese using an artificial voice via https://aloud.area120.google.com to improve its accessibility. You can change the audio language in the Settings menu.


If you'd like to support StatQuest, please consider...
Patreon:   / statquest  
...or...
YouTube Membership:    / @statquest  

...buying my book, a study guide, a t-shirt or hoodie, or a song from the StatQuest store...
https://statquest.org/statquest-store/

...or just donating to StatQuest!
https://www.paypal.me/statquest

Lastly, if you want to keep up with me as I research and create new StatQuests, follow me on Twitter:
  / joshuastarmer  

0:00 Awesome song and introduction
1:10 Building the first tree
6:05 Quantifying the effectiveness of the first threshold
6:56 Testing a second threshold
9:05 Building the second tree
10:21 The main idea of how CatBoost works
12:15 Making predictions
13:02 Symmetric Decision Trees
14:56 Summary of the main ideas

Corrections:
2:05 Red should have gone into bin 0 instead of bin 1.
7:23 I should have said that the cosine similarity was 0.71.

#StatQuest #CatBoost #DubbedWithAloud
