并查集(Union-Find)

定义 (Definition)

并查集是一种数据结构,用于处理不相交集合(Disjoint Sets)的合并和查找操作。它支持以下两种操作:

  1. 合并操作(Union):将两个不同的集合合并为一个集合
  2. 查找操作(Find):确定元素属于哪个集合

并查集常用于网络连通性问题、最小生成树算法(如 Kruskal 算法)等。

The Union-Find data structure is used to manage a collection of disjoint sets and supports the following operations:

  1. Union Operation: Merges two different sets into one set
  2. Find Operation: Determines which set a particular element is in

Union-Find is commonly used in network connectivity problems and minimum spanning tree algorithms (e.g., Kruskal’s algorithm).

操作 (Operations)

初始化 (Initialization)

每个元素自成一个集合,初始时,每个元素的父节点是其自身。

Initially, each element forms its own set, and each element’s parent is itself.

def initialize(n):
    parent = [i for i in range(n)]
    rank = [0] * n
    return parent, rank

查找操作 (Find Operation)

查找操作使用路径压缩(Path Compression)来加速后续的查找操作。路径压缩将查找路径上的所有节点直接连接到根节点,从而降低树的高度。

The find operation uses path compression to speed up subsequent find operations. Path compression flattens the structure of the tree whenever find is called, making all nodes in the path point directly to the root.

def find(parent, x):
    if parent[x] != x:
        parent[x] = find(parent, parent[x])
    return parent[x]

合并操作 (Union Operation)

合并操作使用按秩合并(Union by Rank)来保持树的平衡。按秩合并将秩较小的树连接到秩较大的树上,从而避免树的高度增加。

The union operation uses union by rank to keep the tree balanced. Union by rank attaches the smaller tree under the root of the larger tree to prevent the tree from becoming too tall.

def union(parent, rank, x, y):
    rootX = find(parent, x)
    rootY = find(parent, y)
    
    if rootX != rootY:
        if rank[rootX] > rank[rootY]:
            parent[rootY] = rootX
        elif rank[rootX] < rank[rootY]:
            parent[rootX] = rootY
        else:
            parent[rootY] = rootX
            rank[rootX] += 1

复杂度 (Complexity)

对于并查集的所有操作,时间复杂度接近常数时间,可以认为是 ,其中 是反阿克曼函数,增长极其缓慢。在实际应用中, 的值不会超过 4。

The time complexity for all operations in Union-Find is nearly constant time, which can be considered as , where is the inverse Ackermann function, which grows extremely slowly. In practical applications, the value of does not exceed 4.

常见用法 (Common Usage)

  • 网络连通性:判断图中两个节点是否连通

  • 最小生成树:在 Kruskal 算法中用于检测环

  • 动态连通性:在动态连通性问题中处理在线查询

  • Network Connectivity: Determine if two nodes in a graph are connected

  • Minimum Spanning Tree: Used in Kruskal’s algorithm to detect cycles

  • Dynamic Connectivity: Handle online queries in dynamic connectivity problems

示例 (Example)

# 初始化并查集
parent, rank = initialize(5)
 
# 合并操作
union(parent, rank, 0, 1)
union(parent, rank, 1, 2)
union(parent, rank, 3, 4)
 
# 查找操作
print(find(parent, 0))  # 输出:0
print(find(parent, 1))  # 输出:0
print(find(parent, 2))  # 输出:0
print(find(parent, 3))  # 输出:3
print(find(parent, 4))  # 输出:3
 
# 检查连通性
print(find(parent, 0) == find(parent, 2))  # 输出:True
print(find(parent, 0) == find(parent, 3))  # 输出:False

注意点 (Points to Note)

  • 路径压缩和按秩合并是提升并查集性能的关键

  • 在实现并查集时,务必同时实现路径压缩和按秩合并,以确保最佳性能

  • 并查集适用于需要频繁合并和查找操作的问题

  • Path compression and union by rank are key to optimizing Union-Find performance

  • Ensure both path compression and union by rank are implemented for optimal performance

  • Union-Find is ideal for problems requiring frequent union and find operations