RednaxelaFX对《The Compiler Design Handbook》的笔记(5)

RednaxelaFX
RednaxelaFX (Script Ahead, Code Behind)

读过 The Compiler Design Handbook

The Compiler Design Handbook
  • 书名: The Compiler Design Handbook
  • 作者: Srikant, Y. N. (EDT)/ Shankar, Priti (EDT)
  • 副标题: Optimizations and Machine Code Generation, Second Edition
  • 页数: 784
  • 出版社: CRC Press
  • 出版年: 2007-12-7
  • 第10页 Chapter 10 Dynamic Compilation

    (页码是10-1) 这章是IBM T.J. Watson的Evelyn Duesterwald写的,关于动态编译器的概述。 它将dynamic binary translator与JIT compiler统一到同一个框架里来介绍,将两者的异同都介绍到了,这点很不错。 可以作为JIT编译器的入门文章读读。

    2014-11-24 09:06:57 回应
  • 第16页 16.2.3.1 Maril
    Maril contains the following three sections: * Declaration: The declaration section describes structural information, such as register files, memory, and abstract hardware resources. Abstract hardware resources include pipeline stages and data buses. They are useful for specifying reservation tables, which are essentially temporal mapping relationships between instructions and architecture resources. A reservation table captures the resource usage of an instruction at every cycle starting with the fetch. It is often a one-to-many mapping since there may be multiple alternative paths for an instruction. Besides hardware structures, the declaration section also contains information such as the range of immediate operands or relative branch offset. This is necessary for the correctness of generated code. * Runtime model: The runtime model section specifies the conventions of the architecture, mostly the function calling conventions. However, it is not intended to be a general framework for specifying these, but instead, is a parameter system for configuring the Marion compiler. Specifying the function calling conventions is a complicated subject and falls beyond the scope of this chapter. Interested readers can refer to [5] for more information. * Instruction: The instruction section describes the instruction set. Each instruction is specified in five parts. The first part is the instruction mnemonics and operands, which specify the assembly format of instructions. The optional second part declares data type constraints of the operation for code selection use. The third part describes an expression that can be used by the Marion code generator as the tree-pattern of the instruction. Only one assignment can occur in the expression since the tree-pattern can have only one root node (and an assignment operator is always a root). As a result, this part cannot specify instruction behaviors such as side effects on machine flags and post-increment memory addressing modes because they all involve additional assignments. This limitation can be worked around for code generation purposes since code generators require simplified semantics specification anyway. But for simulators, the Maril description seems insufficient. The fourth part of the instruction description is a triple of (cost, latency, slot). The cost is used to distinguish actual instructions from the so-called dummy instructions, which are used for type conversion. The latency is the number of cycles before the result of the instruction can be used by other operations. The slot specifies the delay slot count. Instruction encoding information is not provided in Maril. An example of definition integer "Add" instruction is given below [7]. %instr Add r, r, r (int) {$3 = $1+$2;} {IF; ID; EX; MEM; WB} (1,1,0)
    引自 16.2.3.1 Maril

    这个Maril的ADL感觉跟HotSpot C2的ADL颇有相似之处。两者都使用简化的模型来描述具体架构的行为,而且两者都带有tree-pattern,也都因此而使描述能力受到限制——无法描述多赋值的行为(例如add-with-overflow、divmod等)。

    2014-11-24 09:55:28 回应
  • 第16页 Chapter 16 Architecture Description Languages for Retargetable Compilation

    实际页码是16-1。 这一整章是两位作者之前一篇论文的修整版: Architecture Description Languages for Retargetable Compilation http://pdf.aminer.org/000/000/248/architecture_description_languages_for_retargetable_compilation.pdf 内容大体一样。

    2014-11-25 02:51:42 回应
  • 第17页 17.4.3 Hard Coded Bottom-Up Code-Generator Generators

    实际页码为17-22。

    17.4.3 Hard Coded Bottom-Up Code-Generator Generators Hard coded code-generator generators are exemplified in the work of Franser et al. [20] and Emmelmann et al. [17]. They mirror their input specification in the same way that recursive descent parsers mirror the structure of the underlying LL(1) grammar. Examples of such tools are BEG [17] and iburg [20]. Code generators generated by such tools are easy to understand and debug, as the underlying logic is simple. The code generator that is output typically works in two passes on the subject tree. In a first bottom-up, left-to-right pass it labels each node with the set of nonterminals that match at the node. Then in a second top-down pass, it visits each node, performing appropriate semantic actions, such as generating code. The transition table used in the techniques described earlier in this section are thus encoded in the flow of control of the code generator, with cost computations being performed dynamically. [17]: BEG: a generator for efficient back ends http://dl.acm.org/citation.cfm?id=74838 [20]: Engineering a simple, efficient code-generator generator http://dl.acm.org/citation.cfm?id=151640.151642
    引自 17.4.3 Hard Coded Bottom-Up Code-Generator Generators

    这里就跟iburg联系上了。 Cross-link: 《A Retargetable C Compiler》的笔记-第373页 http://book.douban.com/annotation/30843267/

    2014-11-25 15:02:12 回应
  • 第17页 17.6 Related Issues

    实际页码为17-32。 要让后端能给DAG形IR生成好代码真是件头疼的事。

    An important issue is the generator of code for a directed acyclic graph (DAG) where shared nodes represent common subexpressions. The selection of optimal code for DAGs has been shown to be intractable [5], but there are heuristics that can be employed to generate code [6]. The labeling phase of a bottom-up tree parser can be modified to work with DAGs. One possibility is that the code generator could, in the top-down phase, perform code generation in the normal way but count visits for each node. For the first visit it could evaluate the shared subtree into a register and keep track of the register assigned. On subsequent visits to the node it could reuse the value stored in the register. However, assigning common subexpressions to registers is not always a good solution, especially when addressing modes in registers provide free computations associated with offsets and immediate operands. One solution to this problem involves adding a DAG operator to the intermediate language [10]. [5]: Code generation for expressions with common subexpressions http://dl.acm.org/citation.cfm?id=322001 [6]: Compilers: Principles, techniques, and tools [10]: Discussion: Code generator specification techniques http://link.springer.com/chapter/10.1007%2F978-1-4471-3501-2_4
    引自 17.6 Related Issues

    另一篇论文要马克一下的是: Optimal code selection in DAGs, M. Anton Ertl, 1999 http://dl.acm.org/citation.cfm?id=292562

    2014-11-25 15:28:56 回应