TopicsGraphKruskal's Algorithm: Finding Minimum Spanning Trees
📖

Kruskal's Algorithm: Finding Minimum Spanning Trees

Graph
~20 min read
5 practice problems

Kruskal's Algorithm: Finding Minimum Spanning Trees

Introduction

Kruskal's Algorithm is a greedy algorithm used to find the Minimum Spanning Tree (MST) of a connected, undirected, weighted graph. An MST is a subset of the edges that connects all the vertices together, without any cycles, and with the minimum possible total edge weight.

Unlike Prim's algorithm, which grows the MST from a single starting vertex, Kruskal's algorithm builds the MST by adding edges one by one, starting from the smallest weight edge, provided adding that edge does not form a cycle.

Key Concepts

1. The Graph Structure

  • Vertices (Nodes): Represent entities (e.g., cities, computers).
  • Edges: Represent connections with associated weights (e.g., cost, distance).
  • Weighted Graph: Every edge has a numerical value representing the cost of traversing it.

2. Spanning Tree

A spanning tree of a graph is a subgraph that is a tree and includes all the vertices of the original graph. A graph can have many spanning trees.

3. Minimum Spanning Tree (MST)

Among all possible spanning trees of a graph, the MST is the one with the minimum total edge weight.

The Core Mechanism: Union-Find (Disjoint Set Union - DSU)

Kruskal's algorithm relies heavily on the Union-Find data structure to efficiently detect cycles. As we add edges, we need to know if two vertices are already connected.

How Union-Find Works

  1. Initialization: Each vertex is its own parent (its own set).
  2. Find(x): Finds the root parent of a vertex. To optimize, we use Path Compression, which flattens the structure of the tree, making future queries faster.
  3. Union(x, y): Merges the sets containing vertices x and y. To optimize, we use Union by Rank, which attaches the smaller tree under the root of the larger tree to keep the tree flat.

Algorithm Steps

  1. Sort Edges: Sort all edges in the graph in non-decreasing order of their weight.
  2. Initialize: Create an empty set for the MST and initialize the Union-Find structure for all vertices.
  3. Iterate: Iterate through the sorted edges:
  • Pick the edge with the smallest weight.
  • Check if adding this edge creates a cycle using the Union-Find structure.
  • If no cycle is detected (i.e., the vertices belong to different sets), add the edge to the MST and union the two sets.
  1. Repeat: Repeat step 3 until all vertices are connected (i.e., MST has V-1 edges).

Complexity Analysis

Time Complexity

  • Sorting Edges: O(E log E), where E is the number of edges. If E < V, this becomes O(E log V).
  • Union-Find Operations: O(E * α(V)), where α(V) is the Inverse Ackermann function. This is effectively O(1) for all practical purposes.
  • Total Time Complexity: O(E log E).

Space Complexity

  • Storage: O(V + E) to store the graph and the Union-Find arrays.
  • MST Storage: O(V) to store the result.

Implementation (TypeScript)

interface Edge {
  source: number;
  destination: number;
  weight: number;
}

class UnionFind {
  parent: number[];
  rank: number[];

  constructor(size: number) {
    this.parent = Array.from({ length: size }, (_, i) => i);
    this.rank = new Array(size).fill(0);
  }

  find(x: number): number {
    if (this.parent[x] !== x) {
      // Path Compression
      this.parent[x] = this.find(this.parent[x]);
    }
    return this.parent[x];
  }

  union(x: number, y: number): boolean {
    const rootX = this.find(x);
    const rootY = this.find(y);

    if (rootX === rootY) {
      return false; // Already in the same set, cycle detected
    }

    // Union by Rank
    if (this.rank[rootX] < this.rank[rootY]) {
      this.parent[rootX] = rootY;
    } else if (this.rank[rootX] > this.rank[rootY]) {
      this.parent[rootY] = rootX;
    } else {
      this.parent[rootY] = rootX;
      this.rank[rootX]++;
    }
    return true;
  }
}

function kruskalsAlgorithm(edges: Edge[], vertices: number): Edge[] {
  // 1. Sort edges by weight
  edges.sort((a, b) => a.weight - b.weight);

  const mst: Edge[] = [];
  const uf = new UnionFind(vertices);

  // 2. Iterate through sorted edges
  for (const edge of edges) {
    // 3. Check if adding edge creates a cycle
    if (uf.union(edge.source, edge.destination)) {
      mst.push(edge);
      // MST should have V-1 edges
      if (mst.length === vertices - 1) {
        break;
      }
    }
  }

  return mst;
}

// Example Usage
const graphEdges: Edge[] = [
  { source: 0, destination: 1, weight: 4 },
  { source: 0, destination: 2, weight: 4 },
  { source: 1, destination: 2, weight: 2 },
  { source: 1, destination: 3, weight: 6 },
  { source: 2, destination: 3, weight: 3 },
];

const mst = kruskalsAlgorithm(graphEdges, 4);
console.log("Minimum Spanning Tree Edges:", mst);

Common Mistakes to Avoid

  1. Ignoring Disconnected Graphs: Kruskal's algorithm assumes the graph is connected. If the graph is disconnected, the algorithm will produce a Minimum Spanning Forest (a collection of MSTs for each component) rather than a single tree.
  2. Incorrect Union-Find Logic: Forgetting to implement path compression or union by rank will result in a Time Complexity of O(V^2), which is inefficient for large graphs.
  3. Sorting Direction: Ensure you sort edges in ascending order (smallest weight first). Sorting descending order will not work for finding the minimum spanning tree.

Real-World Applications

  • Network Design: Finding the cheapest way to connect a set of computers or cities with cables/roads.
  • Cluster Analysis: Grouping data points into clusters based on similarity (minimum distance).
  • Circuit Design: Minimizing the length of wires needed to connect components.
  • Image Segmentation: Grouping pixels into regions based on color similarity.

Comparison with Prim's Algorithm

| Feature | Kruskal's Algorithm | Prim's Algorithm |

| :--- | :--- | :--- |

| Approach | Greedy (Edge-based) | Greedy (Vertex-based) |

| Data Structure | Union-Find (Disjoint Set) | Priority Queue (Min-Heap) |

| Best For | Sparse graphs (fewer edges) | Dense graphs (more edges) |

| Implementation | Simpler to implement | Slightly more complex |

| Cycle Detection | Built-in via Union-Find | Built-in via Priority Queue |

Summary

Kruskal's Algorithm is a fundamental algorithm for graph theory. By leveraging the Union-Find data structure with path compression and union by rank, it efficiently finds the Minimum Spanning Tree. It is particularly useful for sparse graphs and scenarios where you need to understand the connectivity of a network based on cost.