Excerpts from Computer Systems: A Programmer's Perspective (《深入理解计算机系统》)

  • The biggest speedup you’ll ever get with a program will be when you first get it working. —John K. Ousterhout
    —— from page 474
  • Even the best compilers, however, can be thwarted by optimization blockers—aspects of the program’s behavior that depend strongly on the execution environment. Programmers must assist the compiler by writing code that can be optimized readily.
    —— from page 475
  • The first step in optimizing a program is to eliminate unnecessary work, making the code perform its intended task as efficiently as possible. This includes eliminating unnecessary function calls, conditional tests, and memory references. These optimizations do not depend on any specific properties of the target machine.
    —— from page 475
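    A minimal C sketch of the "unnecessary memory references" point (function and variable names are mine, not the book's): the first loop reads and writes *dest on every iteration, while the second accumulates in a local variable and touches memory only once at the end.

        #include <stddef.h>

        /* Unnecessary memory references: *dest is read and written
           on every iteration. */
        void sum_slow(const double *a, size_t n, double *dest) {
            *dest = 0.0;
            for (size_t i = 0; i < n; i++)
                *dest = *dest + a[i];
        }

        /* Accumulate in a local variable; write memory once. */
        void sum_fast(const double *a, size_t n, double *dest) {
            double acc = 0.0;
            for (size_t i = 0; i < n; i++)
                acc += a[i];
            *dest = acc;
        }

    The compiler cannot safely make this change on its own, because dest might point into a; the aliasing excerpts below return to this point.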
  • With the understanding of processor operation, we can take a second step in program optimization, exploiting the capability of processors to provide instruction-level parallelism, executing multiple instructions simultaneously.
    —— from page 475
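    A sketch of instruction-level parallelism via multiple accumulators, in the spirit of transformations the book develops later in the chapter (names are mine). With one accumulator, every multiply must wait for the previous one; with two, the chains are independent and can execute simultaneously. Note that reassociating floating-point operations can change results in the last bits.

        #include <stddef.h>

        /* One chain: limited by multiply latency. */
        double prod1(const double *a, size_t n) {
            double acc = 1.0;
            for (size_t i = 0; i < n; i++)
                acc *= a[i];
            return acc;
        }

        /* Two independent chains: exploits instruction-level parallelism. */
        double prod2(const double *a, size_t n) {
            double acc0 = 1.0, acc1 = 1.0;
            size_t i;
            for (i = 0; i + 1 < n; i += 2) {
                acc0 *= a[i];
                acc1 *= a[i + 1];
            }
            for (; i < n; i++)   /* leftover element when n is odd */
                acc0 *= a[i];
            return acc0 * acc1;
        }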
  • One useful strategy is to do only as much rewriting of a program as is required to get it to the point where the compiler can then generate efficient code. By this means, we avoid compromising the readability, modularity, and portability of the code as much as if we had to work with a compiler of only minimal capabilities.
    —— from page 476
  • The case where two pointers may designate the same memory location is known as memory aliasing. In performing only safe optimizations, the compiler must assume that different pointers may be aliased.
    —— from page 477
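    The book illustrates aliasing with a pair of functions along these lines (reconstructed from memory, so treat the details as approximate). If xp and yp point to the same location, the two versions compute different results, so a compiler performing only safe optimizations cannot turn the first into the second.

        /* With xp == yp: *xp ends up 4 times its original value. */
        void twiddle1(long *xp, long *yp) {
            *xp += *yp;
            *xp += *yp;
        }

        /* With xp == yp: *xp ends up only 3 times its original value. */
        void twiddle2(long *xp, long *yp) {
            *xp += 2 * *yp;
        }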
  • This leads to one of the major optimization blockers, aspects of programs that can severely limit the opportunities for a compiler to generate optimized code. If a compiler cannot determine whether or not two pointers may be aliased, it must assume that either case is possible, limiting the set of possible optimizations.
    —— from page 478
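    One standard remedy, not necessarily discussed in this passage, is C99's restrict qualifier, which lets the programmer promise that the pointers do not alias (a sketch; violating the promise is undefined behavior):

        #include <stddef.h>

        /* restrict tells the compiler a and b never overlap, so it may,
           for example, keep *b in a register across the whole loop. */
        void scale(double *restrict a, const double *restrict b, size_t n) {
            for (size_t i = 0; i < n; i++)
                a[i] *= *b;
        }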
  • Most compilers do not try to determine whether a function is free of side effects and hence is a candidate for optimizations. Instead, the compiler assumes the worst case and leaves function calls intact.
    —— from page 479
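    A minimal sketch of why this is the safe assumption (the counter idiom is a classic illustration; names are mine): replacing f() + f() + f() + f() with 4 * f() would be valid only if f had no side effects.

        long counter = 0;

        long f(void) {
            return counter++;   /* side effect: mutates global state */
        }

        long func1(void) {
            return f() + f() + f() + f();   /* 0 + 1 + 2 + 3 = 6, from counter == 0 */
        }

        long func2(void) {
            return 4 * f();                 /* 4 * 0 = 0, from counter == 0 */
        }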
  • Modern compilers employ sophisticated algorithms to determine what values are computed in a program and how they are used. They can then exploit opportunities to simplify expressions, to use a single computation in several different places, and to reduce the number of times a given computation must be performed.
    —— from page 476
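    For example (a sketch, not the book's code), the four index expressions below share the subexpression i*n + j; an optimizing compiler can compute it once and derive the other indices by adding or subtracting n or 1, instead of performing four multiplications.

        /* Sum of the four neighbors of element (i, j) in an n-by-n grid. */
        long neighbor_sum(const long *grid, long n, long i, long j) {
            return grid[(i - 1) * n + j]      /* up    */
                 + grid[(i + 1) * n + j]      /* down  */
                 + grid[i * n + j - 1]        /* left  */
                 + grid[i * n + j + 1];       /* right */
        }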
  • This optimization is an instance of a general class of optimizations known as code motion. These involve identifying a computation that is performed multiple times (e.g., within a loop), but such that the result of the computation will not change. We can therefore move the computation to an earlier section of the code that does not get evaluated as often.
    —— from page 487
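    A standard instance, in the spirit of the book's lower1/lower2 example: strlen(s) is loop-invariant, but calling it in the loop test makes the conversion quadratic. Moving the call ahead of the loop is code motion. The compiler will not do this by itself, since it would have to prove that the loop body never changes the string's length.

        #include <string.h>
        #include <ctype.h>

        /* strlen runs on every iteration: O(n^2) overall. */
        void lower_slow(char *s) {
            for (size_t i = 0; i < strlen(s); i++)
                s[i] = (char) tolower((unsigned char) s[i]);
        }

        /* Code motion: hoist the loop-invariant call. O(n) overall. */
        void lower_fast(char *s) {
            size_t len = strlen(s);
            for (size_t i = 0; i < len; i++)
                s[i] = (char) tolower((unsigned char) s[i]);
        }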
  • As we have seen, procedure calls can incur overhead and also block most forms of program optimization.
    —— from page 490
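    A sketch in the spirit of the book's vector example (the struct layout and names here are mine): the first version pays for a bounds-checked procedure call per element; the second fetches the underlying array once and indexes it directly, trading a little abstraction for speed.

        #include <stddef.h>

        typedef struct {
            size_t len;
            double *data;
        } vec;

        /* Bounds-checked accessor: safe, but one call per element. */
        int get_vec_element(const vec *v, size_t i, double *dest) {
            if (i >= v->len)
                return 0;
            *dest = v->data[i];
            return 1;
        }

        double sum_abstract(const vec *v) {
            double acc = 0.0, x;
            for (size_t i = 0; i < v->len; i++)
                if (get_vec_element(v, i, &x))
                    acc += x;
            return acc;
        }

        /* Direct access: no per-element procedure call. */
        double sum_direct(const vec *v) {
            const double *data = v->data;
            double acc = 0.0;
            for (size_t i = 0; i < v->len; i++)
                acc += data[i];
            return acc;
        }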
  • Unlike in IA32, where %ebp has special use as a frame pointer, its 64-bit counterpart %rbp can be used to hold arbitrary data.
    —— from page 492
  • This is one of the remarkable feats of modern microprocessors: they employ complex and exotic microarchitectures, in which multiple instructions can be executed in parallel, while presenting an operational view of simple sequential instruction execution.
    —— from page 496
  • The Intel Core i7 processor design, which is often referred to by its project code name “Nehalem”, is described in the industry as being superscalar, which means it can perform multiple operations on every clock cycle, and out-of-order, meaning that the order in which instructions execute need not correspond to their ordering in the machine-level program.
    —— from page 497
  • The overall design of Nehalem has two main parts: the instruction control unit (ICU), which is responsible for reading a sequence of instructions from memory and generating from these a set of primitive operations to perform on program data, and the execution unit (EU), which then executes these operations.
    —— from page 497
  • Using a technique known as speculative execution, the processor begins fetching and decoding instructions where it predicts the branch will go, and even begins executing these operations before it has been determined whether or not the branch prediction was correct.
    —— from page 498
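    Speculation pays off only when the prediction is usually right. A common illustration (not from this passage): with random data, the branch below is mispredicted about half the time, and each miss discards speculatively executed work; the branch-free version turns the comparison into data, leaving nothing to mispredict.

        #include <stddef.h>

        /* Data-dependent branch: hard to predict on random input. */
        long count_pos_branchy(const long *a, size_t n) {
            long cnt = 0;
            for (size_t i = 0; i < n; i++)
                if (a[i] > 0)
                    cnt++;
            return cnt;
        }

        /* Branch-free: the comparison result (0 or 1) is ordinary data.
           Compilers often achieve the same effect with conditional moves. */
        long count_pos_branchless(const long *a, size_t n) {
            long cnt = 0;
            for (size_t i = 0; i < n; i++)
                cnt += (a[i] > 0);
            return cnt;
        }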
  • With speculative execution, the operations are evaluated, but the final results are not stored in the program registers or data memory until the processor can be certain that these instructions should actually have been executed.
    —— from page 499
  • The most common mechanism for controlling the communication of operands among the execution units is called register renaming.
    —— from page 500
  • Latencies increase as the word sizes increase (e.g., from single to double precision), for more complex data types (e.g., from integer to floating point), and for more complex operations (e.g., from addition to multiplication).
    —— from page 501
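    A rough way to observe this (my sketch, not the book's): each loop below is a single dependency chain, so its running time per element approximates the latency of one operation. Timing them contrasts integer addition, integer multiplication, and floating-point multiplication; inspect the generated assembly, since an aggressive compiler may transform the loops.

        /* Latency-bound chains: each iteration depends on the previous one. */
        long chain_iadd(long x, long n) {
            for (long i = 0; i < n; i++)
                x += i;          /* integer add latency */
            return x;
        }

        long chain_imul(long x, long n) {
            for (long i = 0; i < n; i++)
                x *= 3;          /* integer multiply latency */
            return x;
        }

        double chain_fmul(double x, long n) {
            for (long i = 0; i < n; i++)
                x *= 1.000001;   /* floating-point multiply latency */
            return x;
        }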
  • The long latency and issue times of division make it a comparatively costly operation.
    —— from page 501
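    One practical consequence for inner loops (a sketch; names are mine): when dividing many values by the same divisor, computing the reciprocal once replaces n divisions with n multiplications, at the cost of possibly differing in the last bits of the floating-point results.

        #include <stddef.h>

        /* One long-latency divide per element. */
        void normalize_div(double *a, size_t n, double divisor) {
            for (size_t i = 0; i < n; i++)
                a[i] /= divisor;
        }

        /* One divide total; the loop itself does only multiplies. */
        void normalize_mul(double *a, size_t n, double divisor) {
            double r = 1.0 / divisor;
            for (size_t i = 0; i < n; i++)
                a[i] *= r;
        }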