形式语言基本概念(Fundamental Concepts of Formal Languages)

语言 (Language)

定义 (Definition): 在计算机科学中,语言是由有限或无限个字符串(符号序列)组成的集合,这些字符串从某个特定的字母表中选取。

In computer science, a language is a set of strings (sequences of symbols) that are chosen from a specific alphabet, which may be finite or infinite.

集合 (Set)

定义 (Definition): 集合是对象的无序集合,这些对象称为集合的元素。用大写字母表示集合,例如:

A set is an unordered collection of objects, known as elements of the set. Sets are typically denoted by uppercase letters, such as .

示例 (Example):

字母表 (Alphabet)

定义 (Definition): 字母表是一个有限的符号集合,用符号 表示。字母表中的每个符号称为字母。

An alphabet is a finite set of symbols, typically denoted by the symbol . Each symbol in the alphabet is called a letter.

示例 (Example):

形式语言 (Formal Language)

定义 (Definition): 形式语言是从某个字母表 中选取的一些字符串的集合。形式语言用来描述语法结构和计算模型。

A formal language is a set of strings of symbols that are chosen from an alphabet . Formal languages are used to describe syntactic structures and computational models.

示例 (Example): 如果 ,则 表示由 生成的所有可能字符串的集合,包括空串 。例如:

If , then denotes the set of all possible strings generated by , including the empty string . For example:

基本标识方法 (Basic Notation Methods)

联接 (Concatenation)

两个字符串 的联接表示为 ,是将 中的符号依次连接形成的新字符串。

The concatenation of two strings and , denoted by , is a new string formed by appending the symbols of to the symbols of in order.

示例 (Example):,则

If and , then .

Kleene 星 (Kleene Star)

对于字母表 表示 上所有可能字符串(包括空串 )的集合。

For an alphabet , denotes the set of all possible strings over , including the empty string .

长度 (Length)

字符串 的长度表示为 ,表示 中符号的个数。

The length of a string , denoted by , represents the number of symbols in .

示例 (Example):,则

If , then .

子串 (Substring)

字符串 是字符串 的子串,若存在字符串 使得

A string is a substring of a string if there exist strings and such that .

逆序 (Reversal)

字符串 的逆序表示为 ,是将 中的符号按逆序排列的新字符串。

The reversal of a string , denoted by , is a new string formed by reversing the order of the symbols in .

示例 (Example):,则

If , then .

重要概念 (Important Concepts)

空语言 (Empty Language)

空语言表示为 ,它是一个不包含任何字符串的集合。

The empty language, denoted by , is a set that contains no strings.

空串 (Empty String)

空串表示为 ,它是长度为零的字符串。

The empty string, denoted by , is a string of length zero.

幂集 (Power Set)

集合 的幂集表示为 ,它是由 的所有子集组成的集合。

The power set of a set , denoted by , is the set of all subsets of .

示例 (Example):,则

If , then .

语言运算 (Operations on Languages)

并 (Union)

两个语言 的并表示为 ,它是所有属于 的字符串的集合。

The union of two languages and , denoted by , is the set of all strings that belong to or .

交 (Intersection)

两个语言 的交表示为 ,它是所有属于 的字符串的集合。

The intersection of two languages and , denoted by , is the set of all strings that belong to both and .

差 (Difference)

两个语言 的差表示为 ,它是属于 但不属于 的字符串的集合。

The difference of two languages and , denoted by , is the set of all strings that belong to but not to .

连接 (Concatenation)

两个语言 的连接表示为 ,它是所有形如 的字符串的集合,其中

The concatenation of two languages and , denoted by , is the set of all strings of the form where and .

Kleene 闭包 (Kleene Closure)

语言 的 Kleene 闭包表示为 ,它是所有 的字符串的零个或多个连接构成的集合。

The Kleene closure of a language , denoted by , is the set of all strings formed by zero or more concatenations of strings from .

示例 (Example):,则

If , then

前缀 (Prefix)

定义 (Definition): 字符串 是字符串 的前缀,若存在字符串 使得

A string is a prefix of a string if there exists a string such that .

示例 (Example):,则 都是 的前缀。

If , then and are prefixes of .

后缀 (Suffix)

定义 (Definition): 字符串 是字符串 的后缀,若存在字符串 使得

A string is a suffix of a string if there exists a string such that .

示例 (Example):,则 都是 的后缀。

If , then and are suffixes of .

子序列 (Subsequence)

定义 (Definition): 字符串 是字符串 的子序列,若可以通过从 中删除一些字符(可以是零个)得到 ,但不改变剩余字符的顺序。

A string is a subsequence of a string if can be obtained by deleting some (possibly zero) characters from without changing the order of the remaining characters.

示例 (Example):,则 的子序列。

If , then is a subsequence of .

形式语言的属性 (Properties of Formal Languages)

封闭性 (Closure Properties)

形式语言在某些运算下是封闭的,即运算的结果仍然是形式语言。这些运算包括并、交、差、连接和 Kleene 闭包。

Formal languages are closed under certain operations, meaning the result of these operations is still a formal language. These operations include union, intersection, difference, concatenation, and Kleene closure.

封闭性运算 (Closure Operations):

  • 并 (Union): 如果 是形式语言,则 也是形式语言。

    If and are formal languages, then is also a formal language.

  • 交 (Intersection): 如果 是形式语言,则 也是形式语言。

    If and are formal languages, then is also a formal language.

  • 差 (Difference): 如果 是形式语言,则 也是形式语言。

    If and are formal languages, then is also a formal language.

  • 连接 (Concatenation): 如果 是形式语言,则 也是形式语言。

    If and are formal languages, then is also a formal language.

  • Kleene 闭包 (Kleene Closure): 如果 是形式语言,则 也是形式语言。

    If is a formal language, then is also a formal language.

逆 (Reversal)

定义 (Definition): 语言 的逆表示为 ,它包含 中所有字符串的逆序字符串。

The reversal of a language , denoted by , consists of the reversals of all strings in .

示例 (Example):,则

If , then .

形式语法 (Formal Grammar)

定义 (Definition): 形式语法是描述语言结构的一种方法,由一个四元组 组成,其中:

A formal grammar is a method for describing the structure of languages, consisting of a four-tuple , where:

  • 是一组非终结符号 (a set of non-terminal symbols)
  • 是一组终结符号 (a set of terminal symbols)
  • 是一组产生式规则 (a set of production rules)
  • 是开始符号 (the start symbol)

例子 (Example)

假设有一个形式语法 。这个语法生成的语言是

Consider a formal grammar . The language generated by this grammar is .