🟪 1-Minute Summary

Hierarchical clustering builds a tree (dendrogram) showing how data points group together at different similarity levels. Two types: Agglomerative (bottom-up: merge) and Divisive (top-down: split). Advantages: no need to specify K beforehand, and the dendrogram visualizes structure at every level. Disadvantage: slow for large datasets (naive agglomerative is O(n³) time, O(n²) memory). Cut the dendrogram at the desired height to get flat clusters.


🟦 Core Notes (Must-Know)

What is Hierarchical Clustering?

Hierarchical clustering builds a nested hierarchy of clusters instead of a single flat partition. At the bottom of the hierarchy every point is its own cluster; at the top all points form one cluster; every level in between is a valid clustering at a different granularity. The merge (or split) history is recorded as a binary tree called a dendrogram, which can be cut at any height to recover a flat clustering.
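The recorded merge history is just a small matrix. A minimal sketch of what scipy's linkage returns for three 1-D points (scipy is assumed available; the points are made up for illustration):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

X = np.array([[0.0], [0.3], [4.0]])  # two nearby points, one far away
Z = linkage(X, method='single')

# Each row of Z is one merge: [cluster_i, cluster_j, merge_distance, new_size]
# Row 0 merges points 0 and 1 at distance 0.3; row 1 merges that pair with
# point 2 at distance 3.7 (the gap between 0.3 and 4.0)
print(Z)
```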

Agglomerative vs Divisive

  • Agglomerative (bottom-up): start with every point as its own cluster, then repeatedly merge the two closest clusters until only one remains. This is the common variant and the one implemented in scipy and scikit-learn.
  • Divisive (top-down): start with all points in one cluster and recursively split. Conceptually the mirror image, but rarely used in practice because finding the best split at each step is expensive.
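The agglomerative variant can be run in a few lines with scikit-learn (assumed installed; the two-blob data is made up for illustration):

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Two well-separated blobs of three points each
X = np.array([[0, 0], [0.2, 0], [0, 0.2],
              [4, 4], [4.2, 4], [4, 4.2]])

# Bottom-up: every point starts alone; the closest clusters merge until 2 remain
model = AgglomerativeClustering(n_clusters=2, linkage='ward')
labels = model.fit_predict(X)
print(labels)  # first three points share one label, last three the other
```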

How to Read a Dendrogram

  • Leaves on the x-axis are the individual data points.
  • The y-axis is the distance (dissimilarity) at which two clusters merge.
  • A horizontal cut at height h produces one cluster per vertical line it crosses.
  • Tall vertical segments mean the clusters below them are well separated; cutting through the tallest gap is a common heuristic for choosing the number of clusters.
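Cutting the tree at a height translates directly to scipy's fcluster with the 'distance' criterion; a small sketch on made-up 1-D data:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

X = np.array([[0.0], [0.1], [5.0], [5.1], [5.2]])
Z = linkage(X, method='single')

# All within-group merges happen at height 0.1; the final merge at 4.9
# joins the two groups. Cutting at height 1.0 crosses only that tall
# vertical line, so we get two clusters.
labels = fcluster(Z, t=1.0, criterion='distance')
print(labels)
```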

Linkage Methods

The linkage method defines the distance between two clusters (not just between two points), and therefore which pair of clusters is merged next:

  • Single linkage: minimum distance between any pair of points across the two clusters. Can follow elongated shapes but suffers from the chaining effect.
  • Complete linkage: maximum pairwise distance. Produces compact clusters but is sensitive to outliers.
  • Average linkage: mean pairwise distance; a compromise between single and complete.
  • Ward linkage: merges the pair that minimizes the increase in total within-cluster variance. Assumes Euclidean distances; tends to produce compact, evenly sized clusters.
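The effect of the linkage choice shows up directly in merge heights. A sketch comparing the final merge distance of each method on made-up 1-D points:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

X = np.array([[0.0], [1.0], [2.0], [10.0]])

# The last row of the linkage matrix holds the final merge distance.
# Single takes the closest cross-cluster pair, complete the farthest,
# average the mean, so single <= average <= complete.
for method in ['single', 'complete', 'average', 'ward']:
    Z = linkage(X, method=method)
    print(method, Z[-1, 2])
```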

When to Use

  • Small-to-medium datasets (up to a few thousand points).
  • When the number of clusters is unknown in advance, or the hierarchy itself is the goal (e.g., taxonomies, phylogenetics, document organization).
  • When a dendrogram is a useful way to communicate structure.
  • For large n, prefer K-Means, or cluster a sample / K-Means centroids instead.


🟨 Interview Triggers (What Interviewers Actually Test)

Common Interview Questions

  1. “What’s the advantage of hierarchical clustering over K-Means?”

    • [Answer: Don’t need to specify K, shows clustering at all levels]
  2. “How do you decide where to cut the dendrogram?”

    • [Answer: Cut where the merge distance jumps the most (the tallest vertical gap in the dendrogram), use a domain-driven distance threshold, or cut to a desired number of clusters; scipy's fcluster supports a distance threshold or a target cluster count]
  3. “What’s the time complexity of hierarchical clustering?”

    • [Answer: O(n³) time and O(n²) memory for naive agglomerative clustering; optimized algorithms such as SLINK (single linkage) reach O(n²) time - still slow for large datasets]
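The largest-gap heuristic from question 2 above can be sketched directly on scipy's linkage matrix (data is made up; cutting halfway into the gap is one reasonable choice, not the only one):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

X = np.array([[0.0], [0.2], [0.4], [6.0], [6.2]])
Z = linkage(X, method='single')

# Find the biggest jump between consecutive merge distances and cut
# halfway into that gap
gaps = np.diff(Z[:, 2])
cut = Z[np.argmax(gaps) + 1, 2] - gaps.max() / 2
labels = fcluster(Z, t=cut, criterion='distance')
print(labels)  # two clusters: {0, 0.2, 0.4} and {6.0, 6.2}
```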

🟥 Common Mistakes (Traps to Avoid)

Mistake 1: Using hierarchical clustering on very large datasets

Agglomerative clustering needs all pairwise distances: O(n²) memory and up to O(n³) time. Beyond roughly 10,000 points this becomes impractical on a single machine. Workarounds: cluster a random sample, or run K-Means first and cluster the resulting centroids.
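The memory cost alone can be estimated with back-of-envelope arithmetic (the helper name here is made up for illustration):

```python
# A condensed pairwise-distance matrix for n points holds n*(n-1)/2 float64s
def distance_matrix_gib(n):
    pairs = n * (n - 1) // 2
    return pairs * 8 / 2**30  # 8 bytes per float64, converted to GiB

print(f"{distance_matrix_gib(10_000):.2f} GiB")    # well under 1 GiB
print(f"{distance_matrix_gib(1_000_000):.0f} GiB") # thousands of GiB
```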

Mistake 2: Not choosing appropriate linkage method

The linkage method changes the result dramatically: single linkage chains through noise, Ward assumes Euclidean distances, and library defaults differ (scipy's linkage defaults to single; scikit-learn's AgglomerativeClustering defaults to ward). Choose the linkage to match the cluster shapes you expect, and compare more than one.
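The chaining effect is easy to reproduce on an evenly spaced line of points, where single linkage absorbs the whole chain at the neighbour spacing while complete linkage needs the full diameter (a sketch on made-up data):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

# Seven points in a line, each 0.5 from its neighbour
X = np.arange(0, 3.5, 0.5).reshape(-1, 1)

Z_single = linkage(X, 'single')
Z_complete = linkage(X, 'complete')

# Single linkage finishes all merges at height 0.5 (chaining);
# complete linkage only finishes at the chain's diameter, 3.0
print(Z_single[-1, 2], Z_complete[-1, 2])
```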


🟩 Mini Example (Quick Application)

Scenario

Cluster four 2D points that form two tight pairs, then plot the dendrogram.

Solution

from scipy.cluster.hierarchy import dendrogram, linkage
import matplotlib.pyplot as plt
import numpy as np

X = np.array([[1, 1], [1.5, 1], [5, 5], [5.5, 5]])  # two tight pairs
Z = linkage(X, method='ward')  # merge history: [cluster_i, cluster_j, distance, size]
dendrogram(Z)                  # two low merges, then one tall final merge
plt.ylabel('Merge distance')
plt.show()

