Complete Kaggle Data Science Project | Machine Learning
The ML Mine

Published on Apr 28, 2024

Here is a complete end-to-end Kaggle project implemented in Python, using machine learning models for the famous Titanic problem.

If you are a beginner and want to build a data science portfolio, feel free to follow along with the code! You don't need to understand everything that goes on under the hood of the algorithms. For a beginner, learning to implement them should be enough.

We start from scratch and cover all the steps involved in building a machine learning model. The code is written in a Kaggle notebook. The project is divided into 7 major sections:
1) Understand the data
2) Understand the distribution
3) Feature engineering
4) Data pre-processing
5) Building ML models
6) Hyperparameter tuning
7) Prediction on test data and submission
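
For readers who want to see how steps 3 to 5 can fit together before watching, here is a minimal sketch. It is not the code from the notebook: the tiny DataFrame is made-up sample data with Titanic-style column names, the FamilySize feature is a hypothetical example of feature engineering, and the choice of imputers, scaler and Logistic Regression is an assumption made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Made-up sample with Titanic-style columns (NOT the real Kaggle data).
train = pd.DataFrame({
    "Pclass": [3, 1, 3, 1, 3, 2, 3, 1],
    "Sex": ["male", "female", "female", "female", "male", "male", "male", "female"],
    "Age": [22.0, 38.0, 26.0, 35.0, np.nan, 54.0, 2.0, 27.0],
    "SibSp": [1, 1, 0, 1, 0, 0, 3, 0],
    "Parch": [0, 0, 0, 0, 0, 0, 1, 2],
    "Fare": [7.25, 71.28, 7.92, 53.10, 8.05, 51.86, 21.07, 11.13],
    "Embarked": ["S", "C", "S", "S", "S", np.nan, "S", "C"],
    "Survived": [0, 1, 1, 1, 0, 0, 0, 1],
})

# Feature engineering (hypothetical feature): family size on board.
train["FamilySize"] = train["SibSp"] + train["Parch"] + 1

numeric_cols = ["Pclass", "Age", "Fare", "FamilySize"]
categorical_cols = ["Sex", "Embarked"]

# Pre-processing: fill missing values, scale numbers, one-hot encode categories.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])
categorical_steps = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("onehot", OneHotEncoder(handle_unknown="ignore")),
])
preprocess = ColumnTransformer([
    ("num", numeric_steps, numeric_cols),
    ("cat", categorical_steps, categorical_cols),
])

# Model building: pre-processing and classifier chained in one pipeline.
model = Pipeline([
    ("preprocess", preprocess),
    ("clf", LogisticRegression(max_iter=1000)),
])

X = train[numeric_cols + categorical_cols]
y = train["Survived"]
model.fit(X, y)
predictions = model.predict(X)
print(predictions)
```

Putting the pre-processing inside the pipeline means the same transformations are applied automatically when you later call predict on the test data.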

Python libraries used in the project: NumPy, pandas, seaborn, scikit-learn (sklearn)
Machine learning models trained: Naive Bayes, Logistic Regression, KNeighborsClassifier, Decision Tree, Random Forest classifier, Support Vector Classification (SVC)
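
As a rough illustration of comparing these model families and then tuning one of them, here is a minimal sketch. It is not the notebook's code: the data is synthetic (generated with make_classification as a stand-in for the preprocessed Titanic features), GaussianNB is assumed as the Naive Bayes variant, and the parameter grid is a hypothetical example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): replace with your preprocessed features.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)

models = {
    "Naive Bayes": GaussianNB(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "KNeighborsClassifier": KNeighborsClassifier(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "SVC": SVC(),
}

# Mean 5-fold cross-validation accuracy for each model.
scores = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in models.items()
}
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")

# Hyperparameter tuning for one model with a small, hypothetical grid.
param_grid = {"n_estimators": [50, 100], "max_depth": [3, 5, None]}
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

Cross-validation gives a fairer comparison than a single train/validation split, which matters on a small dataset like Titanic.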

Kaggle notebook: https://www.kaggle.com/code/riteshyad...
Instagram: the_ml_mine

Timestamps
00:00 Introduction
00:18 Project overview (Kaggle)
02:38 Start Notebook
04:39 Understanding the dataset
07:03 Understanding the distribution
20:50 Feature engineering
41:12 Data pre-processing
52:35 Encoder (OneHotEncoder)
56:00 Building ML models
58:55 Model Hyperparameter tuning
01:04:55 Answering the main question
01:05:30 Feature importance
01:11:20 Predictions on test data
01:22:15 Submitting the notebook
01:25:35 Outro

Credits:
Kaggle logo: Databuff, CC BY-SA 3.0 https://creativecommons.org/licenses/..., via Wikimedia Commons
