# 《Engineering a Compiler, Second Edition》的笔记-第85页

- 章节名：Ch2.4
- 页码：第85页 2014-10-09 13:52:24

SECTION REVIEW Given a regular expression, we can derive a minimal DFA to recognize the language specified by the RE using the following steps: (1) apply Thompson's construction to build an NFA for the RE; (2) use the subset construction to derive a DFA that simulates the behavior of the RE; and (3) use Hopcroft's algorithm to identify equivalent states in the DFA and construct a minimal DFA. This trio of constructions produces an efficient recognizer for any language that can be specified with an RE. Both the subset construction and the DFA minimization algorithm are fixed-point computations. They are characterized by repeated applica-tion of a monotone function to some set; the properties of the domain play an important role in reasoning about the termination and complex-ity of these algorithms. We will see more fixed-point computations in later chapters ... This section discusses three implementation strategies for converting a dfa into executable code: a table-driven scanner, a direct-coded scanner, and a hand-coded scanner. All of these scanners operate in the same manner, by simulating the dfa. They repeatedly read the next character in the input and simulate the dfa transition caused by that character. This process stops when the dfa recognizes a word. As described in the previous section, that occurs when the current state, s, has no outbound transition on the current input character. If s is an accepting state, the scanner recognizes the word and returns a lex- eme and its syntactic category to the calling procedure. If s is a nonaccepting state, the scanner must determine whether or not it passed through an accept- ing state on the way to s. If the scanner did encounter an accepting state, it should roll back its internal state and its input stream to that point and report success. If it did not, it should report the failure. These three implementation strategies, table driven, direct coded, and hand coded, differ in the details of their runtime costs. However, they all have the same asymptotic complexity—constant cost per character, plus the cost of roll back. The differences in the efficiency of well-implemented scanners change the constant costs per character but not the asymptotic complexity of scanning.

9人阅读

## 说明 · · · · · ·

表示其中内容是对原文的摘抄