有限自动机 (Finite Automata)

正则语言 可以通过有限自动机识别。有限自动机有两种主要类型:

Regular languages can be recognized by finite automata. There are two main types of finite automata:

  • 确定性有限自动机 (Deterministic Finite Automaton, DFA): 对于每一个状态和输入符号,有且仅有一个转移。

    For each state and input symbol, there is exactly one transition.

  • 非确定性有限自动机 (Non-deterministic Finite Automaton, NFA): 对于每一个状态和输入符号,可以有多个转移。

    For each state and input symbol, there can be multiple transitions.

确定性有限自动机 (Deterministic Finite Automaton, DFA)

定义 (Definition)

一个确定性有限自动机 (DFA) 是一个五元组 ,其中:

A Deterministic Finite Automaton (DFA) is a 5-tuple , where:

  • 是状态的有限集合 (a finite set of states)
  • 是输入字母表 (the input alphabet)
  • 是状态转移函数 (the state transition function)
  • 是初始状态 (the initial state)
  • 是接受状态的集合 (the set of accept states)

性质 (Properties)

  • 唯一性 (Uniqueness): 对于每一个状态和输入符号, 函数仅有一个确定的转移状态。

    For each state and input symbol, the function has exactly one transition.

  • 接受 (Acceptance): 一个字符串 被 DFA 接受当且仅当从初始状态 出发,经过 中每个符号的转移,最后到达一个接受状态

    A string is accepted by the DFA if and only if, starting from the initial state , it reads through each symbol of and ends up in an accept state .

判断 (Decision)

  • 成员性 (Membership): 可以在线性时间内(相对于字符串长度)判断一个字符串是否被 DFA 接受。

    Membership can be decided in linear time relative to the length of the string.

操作 (Operations)

  • 并 (Union): 可以通过构建一个新的 DFA 来表示两个 DFA 的语言的并。

    The union of two DFAs can be represented by constructing a new DFA.

  • 交 (Intersection): 可以通过构建一个新的 DFA 来表示两个 DFA 的语言的交。

    The intersection of two DFAs can be represented by constructing a new DFA.

  • 补 (Complement): 通过将所有非接受状态转换为接受状态,接受状态转换为非接受状态来构建补语言的 DFA。

    The complement of a DFA can be constructed by switching the accept and non-accept states.

示例 (Example)

假设 DFA 其转移函数 定义如下:

Let DFA with transition function defined as:

Current StateInputNext State
0
1
0
1

这个 DFA 接受所有包含奇数个 的字符串。

This DFA accepts all strings with an odd number of s.

非确定性有限自动机 (Non-deterministic Finite Automaton, NFA)

定义 (Definition)

一个非确定性有限自动机 (NFA) 是一个五元组 ,其中:

A Non-deterministic Finite Automaton (NFA) is a 5-tuple , where:

  • 是状态的有限集合 (a finite set of states)
  • 是输入字母表 (the input alphabet)
  • 是状态转移函数 (the state transition function)
  • 是初始状态 (the initial state)
  • 是接受状态的集合 (the set of accept states)

NFA 允许在转移函数中使用 -转移,即可以在不消耗输入符号的情况下转移到其他状态。

NFAs allow the use of -transitions in the transition function, which means they can transition to other states without consuming an input symbol.

NFA 也允许在转移函数中返回多个状态,或者转移至空集。

NFAs also allow the transition function to return multiple states or transition to the empty set.

性质 (Properties)

  • 非确定性 (Non-determinism): 对于每一个状态和输入符号, 函数可以返回多个转移状态。

    For each state and input symbol, the function can return multiple transition states.

  • 接受 (Acceptance): 一个字符串 被 NFA 接受当且仅当存在至少一个从初始状态 出发,通过 中每个符号的转移,最后到达一个接受状态 的路径。

    A string is accepted by the NFA if and only if there exists at least one path from the initial state through each symbol of that ends in an accept state .

判断 (Decision)

  • 成员性 (Membership): 可以在非线性时间内(相对于字符串长度)判断一个字符串是否被 NFA 接受,但可以通过转换为 DFA 来解决。

    Membership can be decided in non-linear time relative to the length of the string but can be solved by converting the NFA to a DFA.

操作 (Operations)

  • 并 (Union): 可以通过构建一个新的 NFA 来表示两个 NFA 的语言的并。

    The union of two NFAs can be represented by constructing a new NFA.

  • 交 (Intersection): 通过转换为 DFA 后再进行交运算来表示两个 NFA 的语言的交。

    The intersection of two NFAs can be represented by converting them to DFAs first and then performing the intersection.

  • 补 (Complement): 通过转换为 DFA 后再进行补运算来构建补语言的 NFA。

    The complement of an NFA can be constructed by converting it to a DFA first and then performing the complement.

示例 (Example)

假设 NFA 其转移函数 定义如下:

Let NFA with transition function defined as:

Current StateInputNext State
0
1
1
0
1

这个 NFA 接受所有以 结尾的字符串。

This NFA accepts all strings that end with .

DFA 和 NFA 之间的转换 (Conversion between DFA and NFA)

NFA 到 DFA 的转换 (NFA to DFA Conversion)

-闭包

在将 NFA 转换为 DFA 时,需要计算每个状态的 -闭包,即从当前状态出发,通过 -转移可以到达的所有状态的集合。

When converting an NFA to a DFA, it is necessary to compute the -closure of each state, which is the set of all states that can be reached from the current state through -transitions.

例如,对于一个 NFA,其状态 -闭包为 ,表示从 出发,通过 -转移可以到达

For example, for an NFA, the -closure of state is , indicating that from , one can reach and through -transitions.

子集构造法 (Subset Construction)

通过子集构造法 (subset construction) 或称为幂集构造法 (powerset construction),可以将 NFA 转换为等价的 DFA。这个算法的主要步骤包括:

An NFA can be converted to an equivalent DFA using the subset construction method, also known as the powerset construction. The main steps of this algorithm include:

  1. 初始化 (Initialization): 定义 DFA 的初始状态为 NFA 初始状态的 -闭包。

    Define the initial state of the DFA as the -closure of the initial state of the NFA.

  2. 状态转换 (State Transitions): 对于 DFA 的每一个新状态,计算其在每个输入符号下的 -闭包,并将其作为新的状态。

    For each new state of the DFA, compute its -closure for each input symbol and treat it as a new state.

  3. 接受状态 (Accept States): 如果 DFA 的一个状态包含 NFA 的一个接受状态,则该状态为 DFA 的接受状态。

    If a state of the DFA contains an accept state of the NFA, then that state is an accept state of the DFA.

示例 (Example)

假设有一个 NFA,其转移表如下:

Consider an NFA with the following transition table:

Current StateInputNext State
0
1
1
0
1

通过子集构造法,将其转换为等价的 DFA,其转移表如下:

Using the subset construction, convert it to an equivalent DFA with the following transition table:

DFA StateNFA States01

DFA 和 NFA 与正则表达式的等价性 (Equivalence of DFA, NFA, and Regular Expressions)

  • DFA 到 正则表达式 (DFA to Regular Expression): 通过 状态消除法,可以将 DFA 转换为等价的正则表达式。

    A DFA can be converted to an equivalent regular expression using the state elimination method.

  • NFA 到 正则表达式 (NFA to Regular Expression): 先将 NFA 转换为 DFA,然后将 DFA 转换为正则表达式。

    An NFA can be converted to an equivalent regular expression by first converting the NFA to a DFA and then converting the DFA to a regular expression.

  • 正则表达式 到 NFA (Regular Expression to NFA): 通过 Thompson 构造法,可以将正则表达式转换为等价的 NFA。

    A regular expression can be converted to an equivalent NFA using the Thompson construction method.

状态消除法 (State Elimination Method)

状态消除法是将 DFA 转换为等价的正则表达式的一种方法。这个方法的主要步骤包括重复以下步骤:

The state elimination method is a technique to convert a DFA to an equivalent regular expression. The main steps of this method include:

  1. 消除非接受状态 (Eliminate Non-Accept States):

    • 从 DFA 中删除非接受状态。

    • Remove the non-accept states from the DFA.

  2. 合并状态 (Merge States):

    • 对于每一对状态 ,找到一个状态 ,使得从 的路径等价于从 再到 的路径。

    • 用新的状态替换

    • For each pair of states , find a state such that the path from to is equivalent to the path from to and then to .

    • Replace and with the new state.

  3. 构建正则表达式 (Build Regular Expression):

    • 对于每一对状态 ,找到从 的路径的正则表达式。

    • 用这些正则表达式替换合并的状态。

    • For each pair of states , find the regular expression for the path from to .

    • Replace the merged states with these regular expressions.

示例 (Example)

假设 DFA 其转移函数 定义如下: Suppose DFA , and its transition function is defined as follows:

  • 状态集合 The set of states

  • 输入符号集 The input symbol set

  • 初始状态 Initial state

  • 接受状态集合 The set of accepting states

转移函数 定义如下: The transition function is defined as follows:

Current StateInputNext State

通过状态消除法,将其转换为等价的正则表达式。 Convert it into an equivalent regular expression using the state elimination method.

步骤 1:消除状态 Step 1: Eliminate state
  • 首先,我们找到所有包含状态 的转移,并进行替换: First, we identify all transitions involving state and replace them:

  • 我们可以将其表示为: This can be expressed as:

  • 因此,我们添加一条直接从 的边,其标签为 ,以及一条自环在 ,其标签为 。 Therefore, add a direct edge from to with the label and a self-loop at with the label .

步骤 2:消除状态 Step 2: Eliminate state
  • 找到所有包含状态 的转移,并进行替换: Identify all transitions involving state and replace them:

  • 我们可以将其表示为: This can be expressed as:

  • 因此,我们添加一条直接从 的边,其标签为 ,以及一条直接从 的边,其标签为 。 Therefore, add a direct edge from to with the label and a direct edge from to with the label .

步骤 3:消除状态 Step 3: Eliminate state
  • 找到所有包含状态 的转移,并进行替换: Identify all transitions involving state and replace them:

  • 我们可以将其表示为: This can be expressed as:

  • 因此,我们添加一条直接从 的边,其标签为 ,以及一条自环在 ,其标签为 。 Therefore, add a direct edge from to with the label and a self-loop at with the label .

步骤 4:消除状态 Step 4: Eliminate state
  • 找到所有包含状态 的转移,并进行替换: Identify all transitions involving state and replace them:

  • 我们可以将其表示为:

最终结果 Final Result

由于只剩下一个状态 ,我们可以通过到达该状态的正则表达式来表示整个 DFA 的语言。

现在有两条边,一条是自环,标签为 ,另一条是从 ,标签为

因此,整个 DFA 的语言可以表示为正则表达式

Thompson 构造法 (Thompson Construction Method)

Thompson 构造法是将正则表达式转换为等价的 NFA 的一种方法。其主要步骤包括:

The Thompson construction method is a technique to convert a regular expression to an equivalent NFA. The main steps of this method include:

  1. 基本构造 (Basic Construction):

    • 对于每个字符 ,构建一个 NFA,包含两个状态 ,以及从 的转移

    • 对于空串 ,构建一个 NFA,包含两个状态 ,以及从 -转移。

    • For each character , construct an NFA with two states and and a transition from to .

    • For the empty string , construct an NFA with two states and and an -transition from to .

  2. 连接 (Concatenation):

    • 对于两个 NFA ,将 的初始状态与 的接受状态连接。

    • For two NFAs and , connect the initial state of to the accept state of .

  3. 选择 (Alternation):

    • 对于两个 NFA ,构建一个新的 NFA,包含一个新的初始状态和接受状态,以及 -转移连接

    For two NFAs and , construct a new NFA with a new initial and accept state and -transitions connecting and .

  4. 闭包 (Kleene Closure):

    • 对于一个 NFA ,构建一个新的 NFA,包含一个新的初始状态和接受状态,以及 -转移连接新的初始状态和接受状态,以及从新的初始状态到 的接受状态的 -转移。

    For an NFA , construct a new NFA with a new initial and accept state, -transitions connecting the new initial and accept states, and -transitions from the new initial state to the accept states of .

  5. 最终构造 (Final Construction):

    • 通过递归应用基本构造、连接、选择和闭包操作,将正则表达式转换为等价的 NFA。

    By recursively applying the basic construction, concatenation, alternation, and Kleene closure operations, convert the regular expression to an equivalent NFA.

示例 (Example)

假设有一个正则表达式 。通过 Thompson 构造法,将其转换为等价的 NFA。

Consider a regular expression . Using the Thompson construction method, convert it to an equivalent NFA.

转换后的 NFA 接受由一个或多个 开始,并以 结尾的字符串。

The resulting NFA accepts strings that start with one or more or and end with .

总结 (Summary)

DFA 和 NFA 是用于识别正则语言的两种类型的有限自动机。虽然 DFA 和 NFA 在表达能力上是等价的,但它们在定义和实现上有所不同。通过子集构造法,可以将 NFA 转换为等价的 DFA,并且正则表达式、DFA 和 NFA 之间可以相互转换,用于描述相同的正则语言。