The HPC clusters also tend to have long-running jobs that keep the server fully utilized, even for weeks at a time, while the utilization of servers in WSCs ranges between 10% and 50% and varies every day.
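(2) Memory addressing. The 80x86 (see Figure A-2 in the appendix) does not require alignment, but accesses are usually somewhat faster if operands are aligned.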
(See Figure A.5 on page A-8.) The 80x86 does not require alignment, but accesses are generally faster if operands are aligned.
(See Figure A.5 in Appendix A.)
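As a rough illustration of what "aligned" means in the excerpt above, the sketch below uses the usual definition: an access of `size` bytes at byte address `address` is aligned when `address` is a multiple of `size`. The function names `is_aligned` and `align_up` are hypothetical helpers chosen for this note, not something taken from the book.

```python
def is_aligned(address: int, size: int) -> bool:
    """Return True if an access of `size` bytes at `address` is aligned.

    Assumes the common definition: the address is a multiple of the size.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    return address % size == 0


def align_up(address: int, size: int) -> int:
    """Round `address` up to the next multiple of `size` (hypothetical helper)."""
    if size <= 0:
        raise ValueError("size must be positive")
    return (address + size - 1) // size * size


if __name__ == "__main__":
    # A 4-byte word at address 8 is aligned; at address 6 it is not.
    print(is_aligned(8, 4))
    print(is_aligned(6, 4))
    # The next 4-byte-aligned address at or after 6.
    print(align_up(6, 4))
```

On the 80x86 both accesses would be legal; the excerpt only says that the aligned one is generally faster.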
A difficult decision is whether to make the cache hit time fast, to keep pace with the high clock rate of processors, or to make the cache large to reduce the gap between the processor accesses and main memory accesses. Adding another level of cache between the original cache and memory simplifies the decision (see Figure 2.3). The first-level cache can be small enough to match a fast clock cycle time, yet the second-level (or third-level) cache can be large enough to capture many accesses that would go to main memory. The focus on misses in second-level caches leads to larger blocks, bigger capacity, and higher associativity. Multilevel caches are more power efficient than a single aggregate cache.
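The trade-off in this excerpt can be made concrete with the standard average memory access time (AMAT) formula for a two-level hierarchy: AMAT = hit time L1 + miss rate L1 × (hit time L2 + local miss rate L2 × miss penalty L2). The sketch below is only an illustration; every number in it (a 1-cycle L1 hit, 4% L1 miss rate, 10-cycle L2 hit, 50% local L2 miss rate, 200-cycle memory penalty) is an assumed value picked for the example, not a figure from the book.

```python
def amat_single_level(hit_time: float, miss_rate: float, miss_penalty: float) -> float:
    """AMAT in clock cycles for one cache level in front of main memory."""
    return hit_time + miss_rate * miss_penalty


def amat_two_level(hit_l1: float, miss_rate_l1: float,
                   hit_l2: float, local_miss_rate_l2: float,
                   miss_penalty_l2: float) -> float:
    """AMAT in clock cycles for an L1 cache backed by an L2 cache.

    The L1 miss penalty is the time to access L2, plus the memory
    penalty for the fraction of those accesses that also miss in L2.
    """
    miss_penalty_l1 = hit_l2 + local_miss_rate_l2 * miss_penalty_l2
    return hit_l1 + miss_rate_l1 * miss_penalty_l1


if __name__ == "__main__":
    # Assumed, illustrative parameters (not from the source).
    single = amat_single_level(hit_time=1, miss_rate=0.04, miss_penalty=200)
    two = amat_two_level(hit_l1=1, miss_rate_l1=0.04,
                         hit_l2=10, local_miss_rate_l2=0.5,
                         miss_penalty_l2=200)
    print(f"single-level AMAT: {single:.1f} cycles")
    print(f"two-level AMAT:    {two:.1f} cycles")
```

With these assumed inputs the small, fast L1 keeps its 1-cycle hit time, while the L2 absorbs half of the L1 misses and brings the average down from about 9 cycles to about 5.4 cycles.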