codebrew.ai

Union-Find Pattern: The Cleanest Way to Track Connected Components

Understand disjoint set union, path compression, and union by rank so you can solve connectivity problems efficiently.

union-find
graphs
connectivity
data-structures

Union-Find Pattern: The Cleanest Way to Track Connected Components

Union-Find, also called Disjoint Set Union, is one of those data structures that feels obscure until you hit the exact kind of problem it solves. Then it feels almost unfairly effective.

It is built for one job: efficiently tracking which items belong to the same connected group.

What Problems It Solves Well

Use Union-Find when you repeatedly need to:

  • connect two nodes
  • check whether two nodes are already connected
  • count connected components
  • detect cycles in an undirected graph

This appears in networking, clustering, account merging, grid connectivity, and minimum spanning tree algorithms.

The Core Operations

  • find(x): return the representative of the set containing x
  • union(a, b): merge the sets containing a and b

If two nodes have the same representative, they are in the same component.

Basic Implementation

python
class UnionFind:
    def __init__(self, n: int):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x: int) -> int:
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]

    def union(self, a: int, b: int) -> bool:
        root_a = self.find(a)
        root_b = self.find(b)

        if root_a == root_b:
            return False

        if self.rank[root_a] < self.rank[root_b]:
            root_a, root_b = root_b, root_a

        self.parent[root_b] = root_a
        if self.rank[root_a] == self.rank[root_b]:
            self.rank[root_a] += 1

        return True

Two optimizations matter:

  • path compression in find
  • union by rank in union

Together, they make operations extremely fast in practice.

Example: Detect Cycle in an Undirected Graph

For each edge (u, v):

  • if find(u) == find(v), adding the edge forms a cycle
  • otherwise, union them

That is much simpler than repeatedly traversing the graph from scratch.

Where It Appears in Interviews

  • Number of Connected Components in an Undirected Graph
  • Redundant Connection
  • Accounts Merge
  • Kruskal's algorithm for minimum spanning tree

Common Mistakes

  • Forgetting that Union-Find is best suited for undirected connectivity
  • Implementing find without path compression
  • Using Union-Find when you actually need full traversal details or shortest paths
  • Confusing node count with component count

Union-Find tells you whether things are connected. It does not tell you the path between them.

Complexity

With path compression and union by rank, each operation is almost constant time for practical purposes.

OperationAmortized Time
findnear O(1)
unionnear O(1)

More formally, it is O(alpha(n)), where alpha is the inverse Ackermann function. For normal input sizes, that is tiny.

Mental Model

Think of each component as a tree with a representative root. find climbs to the root, and union connects two roots. Path compression flattens the tree over time, making future lookups faster.

Practice These Next

  • Number of Connected Components in an Undirected Graph
  • Redundant Connection
  • Accounts Merge
  • Min Cost to Connect All Points

Union-Find is a great reminder that the right data structure can make a graph problem feel dramatically simpler.