Union-Find Pattern: The Cleanest Way to Track Connected Components
Understand disjoint set union, path compression, and union by rank so you can solve connectivity problems efficiently.
Union-Find Pattern: The Cleanest Way to Track Connected Components
Union-Find, also called Disjoint Set Union, is one of those data structures that feels obscure until you hit the exact kind of problem it solves. Then it feels almost unfairly effective.
It is built for one job: efficiently tracking which items belong to the same connected group.
What Problems It Solves Well
Use Union-Find when you repeatedly need to:
- connect two nodes
- check whether two nodes are already connected
- count connected components
- detect cycles in an undirected graph
This appears in networking, clustering, account merging, grid connectivity, and minimum spanning tree algorithms.
The Core Operations
find(x): return the representative of the set containingxunion(a, b): merge the sets containingaandb
If two nodes have the same representative, they are in the same component.
Basic Implementation
class UnionFind:
def __init__(self, n: int):
self.parent = list(range(n))
self.rank = [0] * n
def find(self, x: int) -> int:
if self.parent[x] != x:
self.parent[x] = self.find(self.parent[x])
return self.parent[x]
def union(self, a: int, b: int) -> bool:
root_a = self.find(a)
root_b = self.find(b)
if root_a == root_b:
return False
if self.rank[root_a] < self.rank[root_b]:
root_a, root_b = root_b, root_a
self.parent[root_b] = root_a
if self.rank[root_a] == self.rank[root_b]:
self.rank[root_a] += 1
return TrueTwo optimizations matter:
- path compression in
find - union by rank in
union
Together, they make operations extremely fast in practice.
Example: Detect Cycle in an Undirected Graph
For each edge (u, v):
- if
find(u) == find(v), adding the edge forms a cycle - otherwise, union them
That is much simpler than repeatedly traversing the graph from scratch.
Where It Appears in Interviews
- Number of Connected Components in an Undirected Graph
- Redundant Connection
- Accounts Merge
- Kruskal's algorithm for minimum spanning tree
Common Mistakes
- Forgetting that Union-Find is best suited for undirected connectivity
- Implementing
findwithout path compression - Using Union-Find when you actually need full traversal details or shortest paths
- Confusing node count with component count
Union-Find tells you whether things are connected. It does not tell you the path between them.
Complexity
With path compression and union by rank, each operation is almost constant time for practical purposes.
| Operation | Amortized Time |
|---|---|
find | near O(1) |
union | near O(1) |
More formally, it is O(alpha(n)), where alpha is the inverse Ackermann function. For normal input sizes, that is tiny.
Mental Model
Think of each component as a tree with a representative root. find climbs to the root, and union connects two roots. Path compression flattens the tree over time, making future lookups faster.
Practice These Next
- Number of Connected Components in an Undirected Graph
- Redundant Connection
- Accounts Merge
- Min Cost to Connect All Points
Union-Find is a great reminder that the right data structure can make a graph problem feel dramatically simpler.