Data Science Interview Notes (Full-Stack Roadmap)

How to use this site

  • 🟪 1-minute Summary = skim mode
  • 🟦 Core Notes = must-know
  • 🟨 Interview Triggers = what interviewers really test
  • 🟥 Common Mistakes = traps
  • 🟩 Mini Example = quick application

0) Start Here (Read First)


A) Statistics Foundations

A1: Statistics Basics

A2: Probability

A3: Random Variables & Distributions


B) Statistical Inference & Hypothesis Testing

B1: Hypothesis Testing Core (Master Template)

B2: Test Families (each page follows the B1 master template)


C) EDA & Data Preparation

C1: EDA Workflow

C2: Data Cleaning Modules


D) Machine Learning Core

D1: Unsupervised Learning (Clustering)

D2: Supervised Learning (Prediction)


E) Model Evaluation & Model Selection

E1: Regression Evaluation

E2: Classification Evaluation

E3: Model Selection Workflows


F) Generalization, Regularization, and Fit

F1: Fit & Generalization

F2: Regularization (Linear Models)


G) Feature Engineering & Non-Linear Modeling


H) Imbalanced Data Toolkit


I) Ensemble Methods

I1: Ensemble Overview

I2: Bagging & Forests

I3: Boosting Family


Suggested Reading Paths

Path 1 — Interview Sprint (fast catch-up)

  1. A1 → A3 (Stats basics + probability + distributions)
  2. B1 (Hypothesis testing steps)
  3. C1 + C2 (EDA workflow + cleaning)
  4. D2 (Linear regression + logistic regression + trees)
  5. E2 (Confusion matrix + ROC/AUC)
  6. F1 + F2 (Overfitting/underfitting + regularization)
  7. E3 (Cross-validation + grid search)
  8. I1 → I3 (Ensembles: overview, bagging, boosting)

Path 2 — Deep Study (build mastery)

Follow pillars A → I in order, and on every page work through the 🟩 Mini Example and the 🟨 Interview Triggers.