Data Science Interview Notes (Full-Stack Roadmap)
How to use this site
- 🟪 1-minute Summary = skim mode
- 🟦 Core Notes = must-know
- 🟨 Interview Triggers = what interviewers really test
- 🟥 Common Mistakes = traps
- 🟩 Mini Example = quick application
0) Start Here (Read First)
A) Statistics Foundations
A1: Statistics Basics
- Descriptive vs Inferential Statistics
- Mean / Median / Mode
- Range / IQR / Variance / Standard Deviation
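🟩 Mini Example (A1): a minimal sketch of the descriptive measures listed above, using only Python's standard `statistics` module. The sample values are made up for illustration, and the variance and standard deviation shown are the sample versions (dividing by n - 1), which is an assumption about what the A1 pages cover.

```python
import statistics

# Made-up sample, used only for illustration
data = [2, 4, 4, 5, 7, 9, 11]

# Measures of central tendency
mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

# Measures of spread
data_range = max(data) - min(data)
q1, q2, q3 = statistics.quantiles(data, n=4)  # quartile cut points
iqr = q3 - q1
sample_var = statistics.variance(data)  # divides by n - 1
sample_std = statistics.stdev(data)     # square root of the sample variance

print(mean, median, mode)
print(data_range, iqr, sample_var, round(sample_std, 3))
```

Different tools use different quartile conventions, so the IQR from another library may differ slightly on small samples.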
A2: Probability
A3: Random Variables & Distributions
- Random Variables (Discrete vs Continuous)
- Distributions Overview
- Normal Distribution
- Standard Normal Distribution (Z-score)
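🟩 Mini Example (A3): a small sketch of the z-score idea, assuming a normally distributed variable with known mean and standard deviation. The numbers (mean 70, standard deviation 10, observation 85) are hypothetical.

```python
from statistics import NormalDist

# Hypothetical population parameters and one observation
mu = 70.0
sigma = 10.0
x = 85.0

# Z-score: how many standard deviations x lies from the mean
z = (x - mu) / sigma

# Probability that a standard normal variable falls at or below z
p_below = NormalDist(mu=0.0, sigma=1.0).cdf(z)

print(z, round(p_below, 4))
```

The same probability can be obtained without standardizing by calling `NormalDist(mu, sigma).cdf(x)`; standardizing first is what lets one Z-table serve every normal distribution.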
B) Statistical Inference & Hypothesis Testing
B1: Hypothesis Testing Core (Master Template)
B2: Test Families (Each page repeats the same framework)
C) EDA & Data Preparation
C1: EDA Workflow
C2: Data Cleaning Modules
D) Machine Learning Core
D1: Unsupervised Learning (Clustering)
D2: Supervised Learning (Prediction)
- Supervised Learning Overview
- Linear Regression
- Logistic Regression
- K-Nearest Neighbors (KNN)
- Decision Tree
E) Model Evaluation & Model Selection
E1: Regression Evaluation
E2: Classification Evaluation
- Confusion Matrix
- Classification Report (Precision/Recall/F1)
- Accuracy
- Precision
- Recall
- FPR (False Positive Rate)
- ROC Curve + AUC
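🟩 Mini Example (E2): a sketch of how the metrics listed above fall out of the four confusion-matrix counts. The labels and predictions are made up, the positive class is assumed to be 1, and the helper name `confusion_counts` is hypothetical rather than taken from any library.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, TN, FN) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


# Made-up ground truth and predictions, used only for illustration
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)

accuracy = (tp + tn) / (tp + fp + tn + fn)  # share of all predictions that are correct
precision = tp / (tp + fp)                  # of predicted positives, how many are real
recall = tp / (tp + fn)                     # of real positives, how many were found
fpr = fp / (fp + tn)                        # of real negatives, how many were flagged
f1 = 2 * precision * recall / (precision + recall)

print(tp, fp, tn, fn)
print(accuracy, precision, recall, round(fpr, 3), f1)
```

Sweeping a decision threshold and plotting recall against FPR at each setting traces out the ROC curve; this sketch evaluates a single threshold only.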
E3: Model Selection Workflows
F) Generalization, Regularization, and Fit
F1: Fit & Generalization
F2: Regularization (Linear Models)
G) Feature Engineering & Non-Linear Modeling
H) Imbalanced Data Toolkit
I) Ensemble Methods
I1: Ensemble Overview
I2: Bagging & Forests
I3: Boosting Family
Suggested Reading Paths
Path 1 — Interview Sprint (fast catch-up)
- A1 → A3 (Stats + distributions)
- B1 (Hypothesis testing steps)
- C1 + C2 (EDA workflow + cleaning)
- D2 (Linear Regression + Logistic Regression + Decision Tree)
- E2 (Confusion matrix + ROC/AUC)
- F1 + F2 (Overfitting/Underfitting + regularization)
- E3 (Cross-validation + GridSearch)
- I (Ensembles)
Path 2 — Deep Study (build mastery)
Follow pillars A → I in order, and work through the Mini Example and Interview Triggers on every page.