形式语言基本概念(Fundamental Concepts of Formal Languages)
语言 (Language)
定义 (Definition): 在计算机科学中,语言是由有限或无限个字符串(符号序列)组成的集合,这些字符串从某个特定的字母表中选取。
In computer science, a language is a set of strings (sequences of symbols) that are chosen from a specific alphabet, which may be finite or infinite.
集合 (Set)
定义 (Definition): 集合是对象的无序集合,这些对象称为集合的元素。用大写字母表示集合,例如:。
A set is an unordered collection of objects, known as elements of the set. Sets are typically denoted by uppercase letters, such as .
示例 (Example):
字母表 (Alphabet)
定义 (Definition): 字母表是一个有限的符号集合,用符号 表示。字母表中的每个符号称为字母。
An alphabet is a finite set of symbols, typically denoted by the symbol . Each symbol in the alphabet is called a letter.
示例 (Example):
形式语言 (Formal Language)
定义 (Definition): 形式语言是从某个字母表 中选取的一些字符串的集合。形式语言用来描述语法结构和计算模型。
A formal language is a set of strings of symbols that are chosen from an alphabet . Formal languages are used to describe syntactic structures and computational models.
示例 (Example): 如果 ,则 表示由 生成的所有可能字符串的集合,包括空串 。例如:
If , then denotes the set of all possible strings generated by , including the empty string . For example:
基本标识方法 (Basic Notation Methods)
联接 (Concatenation)
两个字符串 和 的联接表示为 ,是将 和 中的符号依次连接形成的新字符串。
The concatenation of two strings and , denoted by , is a new string formed by appending the symbols of to the symbols of in order.
示例 (Example): 若 且 ,则 。
If and , then .
Kleene 星 (Kleene Star)
对于字母表 , 表示 上所有可能字符串(包括空串 )的集合。
For an alphabet , denotes the set of all possible strings over , including the empty string .
长度 (Length)
字符串 的长度表示为 ,表示 中符号的个数。
The length of a string , denoted by , represents the number of symbols in .
示例 (Example): 若 ,则 。
If , then .
子串 (Substring)
字符串 是字符串 的子串,若存在字符串 和 使得 。
A string is a substring of a string if there exist strings and such that .
逆序 (Reversal)
字符串 的逆序表示为 ,是将 中的符号按逆序排列的新字符串。
The reversal of a string , denoted by , is a new string formed by reversing the order of the symbols in .
示例 (Example): 若 ,则 。
If , then .
重要概念 (Important Concepts)
空语言 (Empty Language)
空语言表示为 ,它是一个不包含任何字符串的集合。
The empty language, denoted by , is a set that contains no strings.
空串 (Empty String)
空串表示为 ,它是长度为零的字符串。
The empty string, denoted by , is a string of length zero.
幂集 (Power Set)
集合 的幂集表示为 ,它是由 的所有子集组成的集合。
The power set of a set , denoted by , is the set of all subsets of .
示例 (Example): 若 ,则 。
If , then .
语言运算 (Operations on Languages)
并 (Union)
两个语言 和 的并表示为 ,它是所有属于 或 的字符串的集合。
The union of two languages and , denoted by , is the set of all strings that belong to or .
交 (Intersection)
两个语言 和 的交表示为 ,它是所有属于 和 的字符串的集合。
The intersection of two languages and , denoted by , is the set of all strings that belong to both and .
差 (Difference)
两个语言 和 的差表示为 ,它是属于 但不属于 的字符串的集合。
The difference of two languages and , denoted by , is the set of all strings that belong to but not to .
连接 (Concatenation)
两个语言 和 的连接表示为 ,它是所有形如 的字符串的集合,其中 且 。
The concatenation of two languages and , denoted by , is the set of all strings of the form where and .
Kleene 闭包 (Kleene Closure)
语言 的 Kleene 闭包表示为 ,它是所有 的字符串的零个或多个连接构成的集合。
The Kleene closure of a language , denoted by , is the set of all strings formed by zero or more concatenations of strings from .
示例 (Example): 若 ,则 。
If , then
前缀 (Prefix)
定义 (Definition): 字符串 是字符串 的前缀,若存在字符串 使得 。
A string is a prefix of a string if there exists a string such that .
示例 (Example): 若 ,则 和 都是 的前缀。
If , then and are prefixes of .
后缀 (Suffix)
定义 (Definition): 字符串 是字符串 的后缀,若存在字符串 使得 。
A string is a suffix of a string if there exists a string such that .
示例 (Example): 若 ,则 和 都是 的后缀。
If , then and are suffixes of .
子序列 (Subsequence)
定义 (Definition): 字符串 是字符串 的子序列,若可以通过从 中删除一些字符(可以是零个)得到 ,但不改变剩余字符的顺序。
A string is a subsequence of a string if can be obtained by deleting some (possibly zero) characters from without changing the order of the remaining characters.
示例 (Example): 若 ,则 是 的子序列。
If , then is a subsequence of .
形式语言的属性 (Properties of Formal Languages)
封闭性 (Closure Properties)
形式语言在某些运算下是封闭的,即运算的结果仍然是形式语言。这些运算包括并、交、差、连接和 Kleene 闭包。
Formal languages are closed under certain operations, meaning the result of these operations is still a formal language. These operations include union, intersection, difference, concatenation, and Kleene closure.
封闭性运算 (Closure Operations):
-
并 (Union): 如果 和 是形式语言,则 也是形式语言。
If and are formal languages, then is also a formal language.
-
交 (Intersection): 如果 和 是形式语言,则 也是形式语言。
If and are formal languages, then is also a formal language.
-
差 (Difference): 如果 和 是形式语言,则 也是形式语言。
If and are formal languages, then is also a formal language.
-
连接 (Concatenation): 如果 和 是形式语言,则 也是形式语言。
If and are formal languages, then is also a formal language.
-
Kleene 闭包 (Kleene Closure): 如果 是形式语言,则 也是形式语言。
If is a formal language, then is also a formal language.
逆 (Reversal)
定义 (Definition): 语言 的逆表示为 ,它包含 中所有字符串的逆序字符串。
The reversal of a language , denoted by , consists of the reversals of all strings in .
示例 (Example): 若 ,则 。
If , then .
形式语法 (Formal Grammar)
定义 (Definition): 形式语法是描述语言结构的一种方法,由一个四元组 组成,其中:
A formal grammar is a method for describing the structure of languages, consisting of a four-tuple , where:
- 是一组非终结符号 (a set of non-terminal symbols)
- 是一组终结符号 (a set of terminal symbols)
- 是一组产生式规则 (a set of production rules)
- 是开始符号 (the start symbol)
例子 (Example)
假设有一个形式语法 。这个语法生成的语言是 。
Consider a formal grammar . The language generated by this grammar is .