Union Find (Disjoint Set)

Back to DSA & Algorithms/00 - Roadmap Overview

What is Union-Find?

Union-Find (Disjoint Set Union / DSU) tracks which elements belong to which group. It supports two operations:

Find(x): Which group does x belong to? (Returns the "representative" of x's group)
Union(x, y): Merge the groups containing x and y

The Mental Model

Imagine students in a classroom. Initially, everyone is in their own group. When two students become friends, their groups merge. Union-Find efficiently tracks "Are A and B in the same friend group?"

Why Not Just Use DFS/BFS?

Union-Find shines when edges are added dynamically (one at a time). DFS/BFS would need to re-traverse the graph each time. Union-Find handles each new edge in nearly O(1).

Implementation (With Both Optimizations)

java
class UnionFind {
    int[] parent;
    int[] rank;
    int components;

    public UnionFind(int n) {
        parent = new int[n];
        rank = new int[n];
        components = n;         // Number of separate groups
        for (int i = 0; i < n; i++) {
            parent[i] = i;      // Each element is its own parent (self-loop)
        }
    }

    /** Find the representative (root) of x's group. */
    public int find(int x) {
        if (parent[x] != x) {
            parent[x] = find(parent[x]);  // Path compression!
        }
        return parent[x];
    }

    /** Merge groups of x and y. Returns false if already connected. */
    public boolean union(int x, int y) {
        int rootX = find(x);
        int rootY = find(y);

        if (rootX == rootY) {
            return false;  // Already in the same group
        }

        // Union by rank: attach shorter tree under taller tree
        if (rank[rootX] < rank[rootY]) {
            int temp = rootX;
            rootX = rootY;
            rootY = temp;       // Ensure rootX is taller
        }

        parent[rootY] = rootX;

        if (rank[rootX] == rank[rootY]) {
            rank[rootX]++;
        }

        components--;
        return true;
    }

    /** Are x and y in the same group? */
    public boolean connected(int x, int y) {
        return find(x) == find(y);
    }
}

The Two Optimizations (Critical for Performance)

1. Path Compression (in find): When finding the root of x, make every node on the path point directly to the root. Flattens the tree for future lookups.

Before find(4):       After find(4):
    1                     1
    |                   / | \
    2                  2  3  4
    |
    3
    |
    4

2. Union by Rank (in union): Always attach the shorter tree under the taller tree. Keeps trees balanced.

Without rank:          With rank:
  1                      1
  |                    / |
  2                   2  3
  |                   |
  3                   4
  |
  4
  (degenerate)         (balanced)

With both optimizations: nearly O(1) per operation (amortized O(α(n)), where α is the inverse Ackermann function — practically constant).

When to Use Union-Find vs BFS/DFS

Scenario	Best Approach
Static graph, traverse components	DFS/BFS
Dynamically adding edges	Union-Find
Detect cycle in undirected graph	Union-Find
Count connected components	Either
Need shortest path	BFS (Union-Find can't do this)
Kruskal's MST	Union-Find

Common Applications

Redundant Connection (Cycle Detection)

java
// Find the edge that creates a cycle in an undirected graph
public int[] findRedundantConnection(int[][] edges) {
    int n = edges.length;
    UnionFind uf = new UnionFind(n + 1);  // 1-indexed nodes

    for (int[] edge : edges) {
        if (!uf.union(edge[0], edge[1])) {
            return edge;  // This edge connects already-connected nodes → cycle!
        }
    }

    return new int[0];
}

// Why Union-Find is perfect here:
// Process edges one by one. If both endpoints are already in the same group,
// this edge creates a cycle. O(n) with path compression.

Number of Connected Components

java
public int countComponents(int n, int[][] edges) {
    UnionFind uf = new UnionFind(n);
    for (int[] edge : edges) {
        uf.union(edge[0], edge[1]);
    }
    return uf.components;
}

Accounts Merge

java
// Merge accounts that share an email
public List<List<String>> accountsMerge(List<List<String>> accounts) {
    int n = accounts.size();
    UnionFind uf = new UnionFind(n);
    Map<String, Integer> emailToId = new HashMap<>();  // email → account index

    // Union accounts that share emails
    for (int i = 0; i < n; i++) {
        List<String> account = accounts.get(i);
        for (int j = 1; j < account.size(); j++) {
            String email = account.get(j);
            if (emailToId.containsKey(email)) {
                uf.union(i, emailToId.get(email));
            }
            emailToId.put(email, i);
        }
    }

    // Group emails by root account
    Map<Integer, Set<String>> groups = new HashMap<>();
    for (Map.Entry<String, Integer> entry : emailToId.entrySet()) {
        int root = uf.find(entry.getValue());
        groups.computeIfAbsent(root, k -> new TreeSet<>()).add(entry.getKey());
    }

    // Build result
    List<List<String>> result = new ArrayList<>();
    for (Map.Entry<Integer, Set<String>> entry : groups.entrySet()) {
        List<String> merged = new ArrayList<>();
        merged.add(accounts.get(entry.getKey()).get(0));  // Account name
        merged.addAll(entry.getValue());                  // Sorted emails (TreeSet)
        result.add(merged);
    }

    return result;
}

Problems

200. Number of Islands - Medium
323. Number of Connected Components - Medium
684. Redundant Connection - Medium
721. Accounts Merge - Medium
261. Graph Valid Tree - Medium
128. Longest Consecutive Sequence - Medium

Things to Be Aware Of

1. Always Use Both Optimizations

java
// Path compression alone: O(log n) amortized
// Union by rank alone: O(log n) amortized
// Both together: O(α(n)) ≈ O(1) amortized

2. Union-Find Only Works for Undirected Graphs

java
// For directed graphs, use DFS-based cycle detection instead.

3. 0-indexed vs 1-indexed

java
// Some problems use 1-indexed nodes.
// Allocate new UnionFind(n + 1) to handle indices 1..n