堆 | Heap

定义 | Definition

堆是一种特殊的 完全二叉树,在这棵树中,每个节点的值都大于或等于(最大堆)或小于或等于(最小堆)其子节点的值。

A heap is a special complete binary tree in which the value of each node is greater than or equal to (max heap) or less than or equal to (min heap) the values of its children.

堆的性质 | Properties of Heap

  • 完全二叉树 | Complete Binary Tree: 堆是一种完全二叉树。

  • 堆序性 | Heap Order Property: 最大堆中的每个节点的值大于或等于其子节点的值;最小堆中的每个节点的值小于或等于其子节点的值。

  • 节点位置 | Node Position: 对于一个索引为 的节点,其父节点索引为 ,左子节点索引为 ,右子节点索引为

  • Complete Binary Tree: A heap is a complete binary tree.

  • Heap Order Property: In a max heap, each node’s value is greater than or equal to the values of its children; in a min heap, each node’s value is less than or equal to the values of its children.

  • Node Position: For a node at index , its parent index is , its left child index is , and its right child index is .

堆操作 | Heap Operations

插入 | Insertion

将新元素添加到堆的末尾,然后向上调整堆以保持堆的性质。

Add the new element to the end of the heap and then adjust the heap upwards to maintain the heap property.

def heap_insert(heap, element):
    heap.append(element)
    i = len(heap) - 1
    while i > 0:
        parent = (i - 1) // 2
        if heap[i] <= heap[parent]:  # 对于最小堆 | For min-heap
            break
        heap[i], heap[parent] = heap[parent], heap[i]
        i = parent

删除最大/最小元素 | Deletion of Maximum/Minimum Element

移除堆顶元素,用堆的最后一个元素替换它,然后向下调整堆以保持堆的性质。

Remove the root element, replace it with the last element in the heap, and then adjust the heap downwards to maintain the heap property.

def heap_remove(heap):
    if len(heap) == 0:
        return None
    root = heap[0]
    heap[0] = heap.pop()
    heapify_down(heap, 0)
    return root
 
def heapify_down(heap, i):
    size = len(heap)
    while True:
        left = 2 * i + 1
        right = 2 * i + 2
        largest = i
        if left < size and heap[left] > heap[largest]:
            largest = left
        if right < size and heap[right] > heap[largest]:
            largest = right
        if largest == i:
            break
        heap[i], heap[largest] = heap[largest], heap[i]
        i = largest

构建堆的复杂度分析 | Complexity Analysis of Building a Heap

构建堆的操作通常是指将一个无序数组转化为一个堆,这个过程称为 堆化。堆化的时间复杂度分析如下:

The operation of building a heap usually refers to converting an unordered array into a heap, a process known as heapification. The time complexity analysis of heapification is as follows:

构建堆算法 | Build Heap Algorithm

假设我们有一个包含 个元素的数组,要将其转化为堆,我们从最后一个非叶节点开始,逐步向上对每个节点进行 下沉操作(heapify),直到根节点。

Assume we have an array with elements, and we want to convert it into a heap. We start from the last non-leaf node and perform heapify on each node, moving upwards until the root node.

def build_heap(array):
    n = len(array)
    for i in range(n // 2 - 1, -1, -1):
        heapify(array, n, i)

在此算法中,每次调用 heapify 的时间复杂度取决于树的高度。对于节点 ,它的高度为 ,最坏情况下的时间复杂度为

In this algorithm, each heapify call’s time complexity depends on the height of the tree. For a node , its height is , and the worst-case time complexity is .

总时间复杂度 | Total Time Complexity

通过分析每层节点数与对应高度的关系,整个堆化过程的总时间复杂度可以表示为:

By analyzing the relationship between the number of nodes at each level and the corresponding height, the total time complexity of the heapification process can be expressed as:

具体来说,虽然每次 heapify 的单次操作可能需要 的时间,但由于很多节点位于树的底部,其高度较小,因此整个过程的摊还时间复杂度为

Specifically, while a single heapify operation might take time, many nodes are located at the bottom of the tree with smaller heights, resulting in an overall amortized time complexity of for the entire process.

堆排序 | Heap Sort

堆排序 利用堆这种数据结构来实现排序,其时间复杂度为

Heap sort utilizes the heap data structure to perform sorting, with a time complexity of .

算法步骤 | Algorithm Steps

  1. 构建最大堆 | Build a max heap

  2. 重复以下步骤,直到堆大小为 1:

    • 交换堆顶元素和堆的最后一个元素
    • 减少堆的大小
    • 调整堆以维持最大堆性质
  3. Build a max heap

  4. Repeat the following steps until the heap size is 1:

    • Swap the root element with the last element of the heap
    • Reduce the heap size
    • Adjust the heap to maintain the max heap property
def heap_sort(array):
    def heapify(array, n, i):
        largest = i
        left = 2 * i + 1
        right = 2 * i + 2
        if left < n and array[left] > array[largest]:
            largest = left
        if right < n and array[right] > array[largest]:
            largest = right
        if largest != i:
            array[i], array[largest] = array[largest], array[i]
            heapify(array, n, largest)
 
    n = len(array)
    for i in range(n // 2 - 1, -1, -1):
        heapify(array, n, i)
    for i in range(n - 1, 0, -1):
        array[i], array[0] = array[0], array[i]
        heapify(array, i, 0)

进阶知识 | Advanced Knowledge

二叉堆与斐波那契堆 | Binary Heap vs Fibonacci Heap

  • 二叉堆 | Binary Heap: 插入和删除操作的时间复杂度为 ,适合简单的优先队列实现。
  • 斐波那契堆 | Fibonacci Heap: 提供更快的摊还时间复杂度,插入操作为 ,删除操作为 ,适合于需要频繁合并操作的场景。

斐波那契堆

  • Binary Heap: The time complexity for insertion and deletion operations is , suitable for simple priority queue implementations.
  • Fibonacci Heap: Offers faster amortized time complexity, with insertion operations at and deletion operations at , suitable for scenarios requiring frequent merge operations.

堆在图算法中的应用 | Applications of Heap in Graph Algorithms

堆常用于图算法中,例如 Dijkstra 算法和 Prim 算法,用于高效地管理和访问优先级队列。

Heaps are often used in graph algorithms, such as Dijkstra’s algorithm and Prim’s algorithm, for efficiently managing and accessing the priority queue.

堆在 Dijkstra 算法中的作用 Role of Heaps in Dijkstra’s Algorithm

def dijkstra(graph, start):
    queue = [(0, start)]
    distances = {vertex: float('infinity') for vertex in graph}
    distances[start] = 0
    while queue:
        current_distance, current_vertex = heapq.heappop(queue)
        if current_distance > distances[current_vertex]:
            continue
        for neighbor, weight in graph[current_vertex].items():
            distance = current_distance + weight
            if distance < distances[neighbor]:
                distances[neighbor] = distance
                heapq.heappush(queue, (distance, neighbor))
    return distances

常见问题 | Common Issues

  • 上浮和下沉操作中的错误 | Errors in Upward and Downward Adjustments: 在插入和删除操作中,上浮和下沉操作容易出错,应确保正确地比较和交换节点。

  • 边界条件处理 | Handling Boundary Conditions: 在处理堆顶元素或最后一个元素时,需特别注意边界条件。

  • Errors in Upward and Downward Adjustments: In insertion and deletion operations, errors in upward and downward adjustments are common, ensure correct comparison and swapping of nodes.

  • Handling Boundary Conditions: Special attention is needed when handling the root or the last element of the heap.