有向无环图 (Directed Acyclic Graphs, DAGs)

定义 (Definition)

有向无环图(DAG) 是一种特殊的有向图,其中不存在任何有向环。这意味着不存在一个顶点可以通过有向边返回到自身。

A Directed Acyclic Graph (DAG) is a special type of directed graph with no directed cycles. This means there is no way to start at a vertex and follow a sequence of directed edges that eventually loops back to the starting vertex.

形式化定义 (Formal Definition)

一个有向图 被称为有向无环图,当且仅当不存在顶点序列 使得

A directed graph is called a DAG if and only if there is no sequence of vertices such that and .

性质 (Properties)

  1. 拓扑排序 (Topological Sorting):DAG 可以进行拓扑排序,这是将其顶点排序的一种方式,使得对于每一条有向边 ,顶点 在顶点 之前。

    Topological Sorting: A DAG can be topologically sorted, which is an ordering of its vertices such that for every directed edge , vertex comes before .

  2. 源和汇 (Sources and Sinks):在 DAG 中,至少存在一个源(没有入边的顶点)和一个汇(没有出边的顶点)。

    Sources and Sinks: In a DAG, there is at least one source (a vertex with no incoming edges) and one sink (a vertex with no outgoing edges).

  3. 最长路径 (Longest Path):在 DAG 中,可以有效地找到从源到汇的最长路径,这是许多优化问题的基础。

    Longest Path: In a DAG, the longest path from a source to a sink can be found efficiently, which is fundamental for many optimization problems.

应用 (Applications)

  1. 任务调度 (Task Scheduling):DAG 常用于表示任务调度问题,其中顶点表示任务,边表示任务之间的依赖关系。

    Task Scheduling: DAGs are often used to represent task scheduling problems, where vertices represent tasks and edges represent dependencies between tasks.

  2. 表达式求值 (Expression Evaluation):在编译器中,DAG 用于表达式求值,以消除公共子表达式。

    Expression Evaluation: In compilers, DAGs are used for expression evaluation to eliminate common subexpressions.

  3. 版本控制 (Version Control):在版本控制系统中,DAG 表示提交历史,边表示提交之间的关系。

    Version Control: In version control systems, DAGs represent the history of commits, with edges representing relationships between commits.

基本算法 (Basic Algorithms)

拓扑排序 | Topological Sorting

定义 | Definition

拓扑排序是一种线性排序,用于有向无环图 (DAG) 的顶点排序,使得对于图中的每一条有向边 ,顶点 在顶点 之前出现在排序中。

Topological sorting is a linear ordering of vertices in a directed acyclic graph (DAG) such that for every directed edge , vertex comes before vertex in the ordering.

应用 | Applications

  1. 任务调度 | Task Scheduling
    • 在有依赖关系的任务之间确定执行顺序。
    • Determine the execution order of tasks with dependencies.
  2. 编译器构建 | Compiler Construction
    • 确定模块或指令的编译顺序。
    • Determine the compilation order of modules or instructions.
  3. 项目管理 | Project Management
    • 项目中任务的优先级排序。
    • Prioritize tasks in project management.

算法 | Algorithms

基于深度优先搜索的算法 | DFS-based Algorithm
过程 | Procedure
  1. 初始化 | Initialization
    • 创建一个空栈来存储拓扑排序结果。
    • 标记所有顶点为未访问。
    • Create an empty stack to store the topological sorting result.
    • Mark all vertices as unvisited.
  2. 递归访问 | Recursive Visit
    • 对每个未访问的顶点调用递归函数。
    • 在递归函数中,标记当前顶点为已访问。
    • 递归访问所有未访问的邻接顶点。
    • 当前顶点的所有邻接顶点访问完成后,将当前顶点入栈。
    • Call the recursive function for every unvisited vertex.
    • In the recursive function, mark the current vertex as visited.
    • Recursively visit all unvisited adjacent vertices.
    • After all adjacent vertices are visited, push the current vertex onto the stack.
  3. 构建排序结果 | Build the Result
    • 栈中的元素按出栈顺序即为拓扑排序结果。
    • The elements in the stack, when popped, give the topological order.
伪代码 | Pseudocode
def topological_sort_dfs(graph):
    def dfs(node, visited, stack):
        visited.add(node)
        for neighbor in graph[node]:
            if neighbor not in visited:
                dfs(neighbor, visited, stack)
        stack.append(node)
 
    visited = set()
    stack = []
 
    for vertex in graph:
        if vertex not in visited:
            dfs(vertex, visited, stack)
 
    stack.reverse()  # Reverse the stack to get the topological order
    return stack
示例 | Example

假设有以下图结构:

Suppose we have the following graph structure:

    A --> B --> C
    |     |
    v     v
    D --> E

调用 topological_sort_dfs(graph) 的结果可能是:['A', 'D', 'B', 'E', 'C']

基于入度的算法 | Kahn’s Algorithm
过程 | Procedure
  1. 计算入度 | Calculate In-Degree
    • 计算每个顶点的入度(进入该顶点的边的数量)。
    • Compute the in-degree (number of incoming edges) for each vertex.
  2. 初始化队列 | Initialize the Queue
    • 将所有入度为 0 的顶点加入队列。
    • Add all vertices with in-degree 0 to the queue.
  3. 逐步处理队列 | Process the Queue
    • 从队列中取出一个顶点,将其加入拓扑排序结果中。
    • 对每个取出的顶点,减少其所有邻接顶点的入度。
    • 如果某个邻接顶点的入度减少到 0,将其加入队列。
    • Repeat until the queue is empty.
    • Remove a vertex from the queue and add it to the topological order.
    • Decrease the in-degree of all adjacent vertices of the removed vertex.
    • If the in-degree of an adjacent vertex becomes 0, add it to the queue.
    • Repeat until the queue is empty.
  4. 检查环 | Check for Cycles
    • 如果拓扑排序结果中的顶点数小于图中的顶点数,说明图中存在环。
    • If the number of vertices in the topological order is less than the number of vertices in the graph, there is a cycle.
伪代码 | Pseudocode
from collections import deque, defaultdict
 
def topological_sort_kahn(graph):
    in_degree = {u: 0 for u in graph}  # Initialize in-degrees of all vertices to 0
    for u in graph:
        for v in graph[u]:
            in_degree[v] += 1  # Calculate in-degrees
 
    queue = deque([u for u in graph if in_degree[u] == 0])  # Enqueue vertices with in-degree 0
    topo_order = []
 
    while queue:
        u = queue.popleft()
        topo_order.append(u)
        for v in graph[u]:
            in_degree[v] -= 1  # Decrease in-degree by 1 for all adjacent vertices
            if in_degree[v] == 0:
                queue.append(v)  # Enqueue if in-degree becomes 0
 
    if len(topo_order) == len(graph):
        return topo_order
    else:
        raise Exception("Graph has at least one cycle")
 
# Example usage
graph = {
    'A': ['B', 'D'],
    'B': ['C', 'E'],
    'C': [],
    'D': ['E'],
    'E': []
}
 
print(topological_sort_kahn(graph))  # Output: ['A', 'D', 'B', 'E', 'C']

比较 | Comparison

特性DFS-based 拓扑排序Kahn’s Algorithm
数据结构栈 (递归实现)队列
实现复杂度较高较低
检查环隐式检查显式检查
适用性较适用于递归场景较适用于非递归场景

最长路径 (Longest Path) 算法

在 DAG 中查找从源到汇的最长路径,可以使用拓扑排序,然后对每个顶点进行松弛操作。

Finding the longest path from a source to a sink in a DAG can be done using topological sorting followed by relaxing each vertex.

def find_longest_path(graph, start):
    topo_order = dfs_topological_sort(graph)
    distances = {vertex: float('-inf') for vertex in graph}
    distances[start] = 0
 
    for vertex in topo_order:
        for neighbor, weight in graph[vertex].items():
            if distances[neighbor] < distances[vertex] + weight:
                distances[neighbor] = distances[vertex] + weight
    
    return distances

总结 (Summary)

有向无环图(DAG)在计算机科学中有着广泛的应用,如任务调度、表达式求值和版本控制等。理解 DAG 的性质和算法对于解决许多实际问题至关重要,尤其是在处理依赖关系和优化问题时。

Directed Acyclic Graphs (DAGs) have wide applications in computer science, such as task scheduling, expression evaluation, and version control. Understanding the properties and algorithms of DAGs is crucial for solving many real-world problems, especially when dealing with dependencies and optimization problems.