Build In Public

Tweet what you learn. Get paid for it.

Best posts win prizes worth ₹10,000 every week. Rules, templates, and this week's reads inside.

₹10k / week
for the best #BuildInPublic posts


WEEK 1 WHAT'S IN THERE TASK 1 TASK 2 TASK 3
Day 0
Getting started: setting up your data science environment using Anaconda, Jupyter Notebooks, Google Colab, and Kaggle.

Jupyter Notebook Complete Beginner Guide
Windows / Mac / Linux

Day 1
Basics of Python and understanding ML.

Beginner Tutorial (up to 30 mins)
Moving Ahead (30 min – 1:30 hr)
What is ML?

Day 2
Basics of Python continued & NumPy overview.

Python continued (1:30 hr to end)
NumPy Video
NumPy Notebook

Day 3
Gaining an overview of Pandas.

Pandas Overview
Kaggle Micro-course
Pandas Notebook

Day 4
Matplotlib and common ML problems.

Intro to Matplotlib
Matplotlib Notebook
Common ML Problems

Day 5
Seaborn and Descriptive Statistics.

Seaborn Overview
Data Types in Stats
Central Tendencies & Normal Distribution


WEEK 2 WHAT'S IN THERE TASK 1 TASK 2 TASK 3
Day 1
Hey, excited for Week 2? Often the data we deal with can have various issues like missing values, categorical values and outliers. Today we will learn about basic techniques to deal with such issues!

Introduction to Feature Engineering

Outlier Analysis

Handling Missing Values



Practical Handling Missing Values
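
To make today's topic concrete, here is a minimal sketch of two of the basic techniques mentioned above: filling missing values with the median and flagging outliers with the interquartile range (IQR) rule. The function names, the `k=1.5` threshold, and the sample `ages` array are assumptions made up for illustration; the linked resources may use different methods.

```python
import numpy as np

def impute_median(values):
    """Replace NaN entries with the median of the observed entries."""
    values = np.asarray(values, dtype=float)
    median = np.nanmedian(values)  # median that ignores NaN
    return np.where(np.isnan(values), median, values)

def iqr_outlier_mask(values, k=1.5):
    """Flag points lying more than k * IQR outside the quartiles."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)

# Hypothetical sample: one missing value and one suspicious entry.
ages = np.array([22.0, 25.0, np.nan, 24.0, 23.0, 95.0])
filled = impute_median(ages)
outliers = iqr_outlier_mask(filled)
print(filled)
print(outliers)
```

Whether to drop, cap, or keep a flagged point depends on the dataset, so treat the mask as a prompt to investigate rather than an automatic filter.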

Day 2
Today is light on tech: we'll explore ML basics, supervised vs. unsupervised learning, and how to handle categorical data. Optional math refresher included!

Handling Categorical Variables

Supervised and Unsupervised Learning
All Feature Transformations

(Optional) Linear Algebra Refresher (Watch Chapters 1, 2, 3, 4, 9, 14)
Introduction to Process Mining
REQUIRED FOR PROJECT

Complete this task from Celonis Academy & learn about Process Intelligence Fundamentals

Open Task
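
For the categorical-data part of today's content, one common approach is one-hot encoding, where each category becomes its own 0/1 column. The sketch below is a from-scratch illustration with a made-up `cities` list; the helper name `one_hot` is hypothetical, and in practice you would more likely use a library routine such as `pandas.get_dummies`.

```python
import numpy as np

def one_hot(labels):
    """Return the sorted category list and a 0/1 matrix with one column per category."""
    categories = sorted(set(labels))
    index = {category: i for i, category in enumerate(categories)}
    encoded = np.zeros((len(labels), len(categories)), dtype=int)
    for row, label in enumerate(labels):
        encoded[row, index[label]] = 1
    return categories, encoded

# Hypothetical categorical column.
cities = ["Delhi", "Mumbai", "Delhi", "Guwahati"]
categories, encoded = one_hot(cities)
print(categories)
print(encoded)
```

One-hot encoding avoids implying an order between categories, at the cost of adding one column per distinct value.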
Day 3
Today we dive into the basics of ML with Linear Regression, Cost Function, and Gradient Descent: simple concepts with powerful impact!
Linear Regression with One Variable

Loss Function and Gradient Descent Explained
Linear Regression Blog

Loss Function Blog
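
As a rough illustration of how today's three ideas fit together, the sketch below fits a line y = w*x + b by gradient descent on the mean squared error cost. The learning rate, step count, and toy data are assumptions chosen so the example converges; the linked videos and blogs may use different notation.

```python
import numpy as np

def fit_line(x, y, lr=0.05, steps=2000):
    """Fit y ~ w * x + b by batch gradient descent on the MSE cost."""
    w, b = 0.0, 0.0
    n = len(x)
    for _ in range(steps):
        error = (w * x + b) - y
        # Gradients of J(w, b) = (1 / (2n)) * sum(error ** 2)
        grad_w = (error @ x) / n
        grad_b = error.sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy data generated from y = 2x + 1 with no noise.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0
w, b = fit_line(x, y)
print(w, b)
```

With a learning rate that is too large the updates diverge, and with one that is too small they converge slowly, so it is worth experimenting with `lr`.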

Day 4
Time to level up: today we tackle Linear Regression with multiple features and get hands-on with Scikit-learn, plus a sneak peek into Logistic Regression!

Linear Regression with Multiple Variables (Videos 21 - 24)
Linear Regression with Scikit-learn
Logistic Regression Blog

Day 5
Today we meet our first classification model, Logistic Regression. Learn it inside out!

Logistic Regression (Videos 31 - 36)
Logistic Regression with Scikit-learn

Logistic Regression from scratch


WEEK 3 WHAT'S IN THERE TASK 1 TASK 2 TASK 3
Day 1
Since we have covered two basic ML models, let us take a break and learn about Overfitting, Underfitting, and the Bias-Variance Tradeoff. These concepts help you judge whether your model is too simple or too complex for your data, and so how well it is likely to generalize beyond the training set. This will be followed by Regularization.

Bias-Variance Video

Blog

Overfitting and Regularisation (Videos 37 - 41)

L1 L2 Regularization
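
To see what regularization does to a model, here is a small sketch of L2 (ridge) regularization using the closed-form solution w = (XᵀX + αI)⁻¹Xᵀy. The synthetic data, the `alpha` values, and the helper name `ridge_fit` are assumptions for illustration, and the sketch omits the intercept term for brevity.

```python
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form L2-regularized least squares (no intercept)."""
    n_features = X.shape[1]
    A = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(A, X.T @ y)

# Synthetic data: 30 samples, 5 features, two of which are irrelevant.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))
true_w = np.array([3.0, -2.0, 0.0, 0.0, 1.0])
y = X @ true_w + rng.normal(scale=0.5, size=30)

w_plain = ridge_fit(X, y, alpha=0.0)   # ordinary least squares
w_ridge = ridge_fit(X, y, alpha=10.0)  # penalized weights

print(np.linalg.norm(w_plain), np.linalg.norm(w_ridge))
```

The penalized weights have a smaller norm: a larger `alpha` pushes the model toward simpler solutions, trading a little bias for lower variance.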


Day 2
Today we will give you an introduction to some evaluation metrics and parameters.

AUC - ROC curve

AUC - ROC curve Blog
Confusion Matrix

Confusion Matrix Blog
Evaluation Metrics

Evaluation Metrics Video
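
As a quick illustration of today's metrics, the sketch below counts the four cells of a binary confusion matrix and derives accuracy, precision, and recall from them. The labels and predictions are made-up values, and the helper name `confusion_counts` is hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

# Hypothetical ground truth and model predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)  # of predicted positives, how many were right
recall = tp / (tp + fn)     # of actual positives, how many were found
print(tp, tn, fp, fn, accuracy, precision, recall)
```

On imbalanced data, accuracy alone can look good while recall is poor, which is one reason to report several metrics together.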
Day 3
Today we take a deep dive into K-Nearest Neighbours (KNN): why it matters and how to implement it.

KNN Video

KNN Blog
KNN implementation
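
Before opening the implementation link, it may help to see how small the core of KNN is. This sketch classifies a point by majority vote among its `k` closest training points using Euclidean distance. The toy data, `k=3`, and the function name `knn_predict` are assumptions for illustration.

```python
from collections import Counter

import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Predict the label of x by majority vote among its k nearest neighbours."""
    distances = np.linalg.norm(X_train - x, axis=1)  # Euclidean distance to each point
    nearest = np.argsort(distances)[:k]              # indices of the k closest points
    votes = Counter(y_train[nearest].tolist())
    return votes.most_common(1)[0][0]

# Two well-separated toy clusters.
X_train = np.array([[1.0, 2.0], [1.2, 1.8], [0.8, 2.2],
                    [5.0, 6.0], [5.2, 5.8], [4.8, 6.2]])
y_train = np.array([0, 0, 0, 1, 1, 1])

label_a = knn_predict(X_train, y_train, np.array([1.1, 2.1]))
label_b = knn_predict(X_train, y_train, np.array([5.0, 5.9]))
print(label_a, label_b)
```

Because KNN relies on distances, features on very different scales usually need to be standardized first.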
Fundamentals of Process Mining
REQUIRED FOR PROJECT

Complete this task during Days 3–5. Building on the Introduction to Process Mining course, this learning path covers the key concepts and workflows used in Celonis. These fundamentals are required for the upcoming project and will help you get the most out of the hands-on activities.

Day 4
Today we look at the Naive Bayes algorithm and its Gaussian variant. Naive Bayes is a supervised learning algorithm based on Bayes' theorem and is used for solving classification problems.

Multinomial Naive Bayes Classifier
Gaussian Naive Bayes

Naive Bayes implementation
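
To show how Bayes' theorem turns into a classifier, here is a minimal from-scratch sketch of Gaussian Naive Bayes: each class gets a prior plus a per-feature mean and variance, and prediction picks the class with the highest log posterior. The class name `TinyGaussianNB`, the small variance floor, and the toy data are assumptions for illustration, not the implementation from the linked material.

```python
import numpy as np

class TinyGaussianNB:
    """Gaussian Naive Bayes with independent per-feature normal likelihoods."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.priors_ = np.array([np.mean(y == c) for c in self.classes_])
        self.means_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        # Small floor keeps the variance strictly positive.
        self.vars_ = np.array([X[y == c].var(axis=0) + 1e-9 for c in self.classes_])
        return self

    def predict(self, X):
        log_posteriors = []
        for prior, mean, var in zip(self.priors_, self.means_, self.vars_):
            # log P(c) + sum over features of log N(x_j | mean_j, var_j)
            log_lik = -0.5 * (np.log(2 * np.pi * var) + (X - mean) ** 2 / var).sum(axis=1)
            log_posteriors.append(np.log(prior) + log_lik)
        return self.classes_[np.argmax(np.array(log_posteriors), axis=0)]

# Two well-separated toy clusters.
X = np.array([[1.0, 2.0], [1.2, 1.8], [0.8, 2.2],
              [5.0, 6.0], [5.2, 5.8], [4.8, 6.2]])
y = np.array([0, 0, 0, 1, 1, 1])

model = TinyGaussianNB().fit(X, y)
pred = model.predict(np.array([[1.1, 2.1], [5.1, 5.9]]))
print(pred)
```

The "naive" part is the assumption that features are independent given the class, which is what lets the log likelihoods simply be summed.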
Day 5
Let's look at Support Vector Machines (SVMs): why they matter and how to implement them.

SVM 1

SVM 2

SVM 3

SVM Implementation


WEEK 4 WHAT'S IN THERE TASK 1 TASK 2 TASK 3
Day 1
A model's performance can often be improved substantially by tuning its hyperparameters, and it is equally important to estimate how accurate the model is on unseen data. For this, we can use grid search and cross-validation.

Cross Validation

Code Implementation
What is hyperparameter tuning?

Stream Processing Fundamentals
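
The two ideas from today's introduction can be combined in a few lines: for each candidate hyperparameter value, estimate the error with k-fold cross-validation, then keep the value with the lowest average error. The sketch below does this for the `alpha` of a ridge regression. The synthetic data, the candidate grid, and the helper names are assumptions for illustration; in practice scikit-learn's `GridSearchCV` covers the same workflow.

```python
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form L2-regularized least squares (no intercept)."""
    A = X.T @ X + alpha * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ y)

def cv_mse(X, y, alpha, k=5):
    """Average validation MSE over k contiguous folds."""
    all_idx = np.arange(len(y))
    errors = []
    for val_idx in np.array_split(all_idx, k):
        train_idx = np.setdiff1d(all_idx, val_idx)
        w = ridge_fit(X[train_idx], y[train_idx], alpha)
        errors.append(np.mean((X[val_idx] @ w - y[val_idx]) ** 2))
    return float(np.mean(errors))

def grid_search(X, y, alphas, k=5):
    """Score every candidate alpha and return the best one with all scores."""
    scores = {alpha: cv_mse(X, y, alpha, k) for alpha in alphas}
    best = min(scores, key=scores.get)
    return best, scores

# Synthetic regression data.
rng = np.random.default_rng(42)
X = rng.normal(size=(50, 4))
y = X @ np.array([1.5, -1.0, 0.0, 2.0]) + rng.normal(scale=0.3, size=50)

alphas = [0.01, 0.1, 1.0, 10.0]
best_alpha, scores = grid_search(X, y, alphas)
print(best_alpha, scores)
```

Each fold is used exactly once for validation, so every sample contributes to the error estimate without ever being scored by a model that trained on it.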

Day 2
Today we learn about Decision Trees and Random Forests, which form the foundation for many advanced machine learning algorithms.

Decision Trees (Videos 46 - 49)
Random Forest (Videos 52 - 53)

Decision Trees Notebook

Random Forest implementation

Day 3
Let's explore what boosting is and some of its variations.

Gradient Boosting (Videos 59 - 61)

XGBoost

XGBoost continued

Catboost 1

Catboost 2
Day 4
Let's look at some more variations of boosting algorithms and how they can be used for specialized tasks.

AdaBoost

Kaggle Intermediate microcourse

LightGBM

AdaBoost implementation

Model Drift

Solving Issue of Model Drift

Day 5
Today, let us go into the foundations of Neural Networks.

Neural Networks (1-2)
Neural Networks (3-4)




Coming soon!

Week 1 Quiz
Deadline: June 7, 2026 11:59 PM IST
WHAT'S IN THERE
Kaggle Microcourses
Python OOP Full Course
How to handle imbalanced classes in a dataset
Finished today's content? Tweet what you learned. #SummerAnalytics2026 #BuildInPublic @cnaiitg
Post Now →