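David A. Patterson is a professor of computer science at the University of California, Berkeley, a member of the National Academy of Engineering and the National Academy of Sciences, and a Fellow of the IEEE and ACM. For his teaching he received the Distinguished Teaching Award from the University of California, the Karlstrom Award from ACM, and the Mulligan Education Medal and Undergraduate Teaching Award from IEEE. For his contributions to RISC he received the IEEE Technical Achievement Award and the ACM Eckert-Mauchly Award, and his contributions to RAID earned him the IEEE Johnson Information Storage Award. He shared the IEEE John von Neumann Medal and the NEC C&C Prize with John L. Hennessy. Patterson is also a Fellow of the American Academy of Arts and Sciences and of the Computer History Museum, and has been inducted into the Silicon Valley Engineering Hall of Fame. He served on the Information Technology Advisory Committee to the U.S. President, as chair of the Computer Science Division in the Department of Electrical Engineering and Computer Sciences at UC Berkeley, as chair of the Computing Research Association (CRA), and as president of ACM. This record led to Distinguished Service Awards from ACM and CRA.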
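John L. Hennessy is the tenth president of Stanford University, where he has taught in the departments of electrical engineering and computer science since 1977. Hennessy is a Fellow of the IEEE and ACM; a member of the National Academy of Engineering, the National Academy of Sciences, and the American Philosophical Society; and a Fellow of the American Academy of Arts and Sciences. His many awards include the 2001 Eckert-Mauchly Award for his contributions to RISC technology, the 2001 Seymour Cray Computer Engineering Award, and the 2000 IEEE John von Neumann Medal, which he shared with David Patterson. He has also received seven honorary doctorates.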
Contents
Preface
About the Author
CHAPTERS
1 Computer Abstractions and Technology
1.1 Introduction
1.2 Eight Great Ideas in Computer Architecture
1.3 Below Your Program
1.4 Under the Covers
1.5 Technologies for Building Processors and Memory
1.6 Performance
1.7 The Power Wall
1.8 The Sea Change: The Switch from Uniprocessors to Multiprocessors
1.9 Real Stuff: Benchmarking the Intel Core i7
1.10 Fallacies and Pitfalls
1.11 Concluding Remarks
1.12 Historical Perspective and Further Reading
1.13 Exercises
2 Instructions: Language of the Computer
2.1 Introduction
2.2 Operations of the Computer Hardware
2.3 Operands of the Computer Hardware
2.4 Signed and Unsigned Numbers
2.5 Representing Instructions in the Computer
2.6 Logical Operations
2.7 Instructions for Making Decisions
2.8 Supporting Procedures in Computer Hardware
2.9 MIPS Addressing for 32-Bit Immediates and Addresses
2.10 Parallelism and Instructions: Synchronization
2.11 Translating and Starting a Program
2.12 A C Sort Example to Put It All Together
2.13 Advanced Material: Compiling C
2.14 Real Stuff: ARMv7 (32-bit) Instructions
2.15 Real Stuff: x86 Instructions
2.16 Real Stuff: ARMv8 (64-bit) Instructions
2.17 Fallacies and Pitfalls
2.18 Concluding Remarks
2.19 Historical Perspective and Further Reading
2.20 Exercises
3 Arithmetic for Computers
3.1 Introduction
3.2 Addition and Subtraction
3.3 Multiplication
3.4 Division
3.5 Floating Point
3.6 Parallelism and Computer Arithmetic: Subword Parallelism
3.7 Real Stuff: Streaming SIMD Extensions and Advanced Vector Extensions in x86
3.8 Going Faster: Subword Parallelism and Matrix Multiply
3.9 Fallacies and Pitfalls
3.10 Concluding Remarks
3.11 Historical Perspective and Further Reading
3.12 Exercises
4 The Processor
4.1 Introduction
4.2 Logic Design Conventions
4.3 Building a Datapath
4.4 A Simple Implementation Scheme
4.5 An Overview of Pipelining
4.6 Pipelined Datapath and Control
4.7 Data Hazards: Forwarding versus Stalling
4.8 Control Hazards
4.9 Exceptions
4.10 Parallelism via Instructions
4.11 Real Stuff: The ARM Cortex-A8 and Intel Core i7 Pipelines
4.12 Going Faster: Instruction-Level Parallelism and Matrix Multiply
4.13 Advanced Topic: An Introduction to Digital Design Using a Hardware Design Language to Describe and Model a Pipeline and More Pipelining Illustrations
4.14 Fallacies and Pitfalls
4.15 Concluding Remarks
4.16 Historical Perspective and Further Reading
4.17 Exercises
5 Large and Fast: Exploiting Memory Hierarchy
5.1 Introduction
5.2 Memory Technologies
5.3 The Basics of Caches
5.4 Measuring and Improving Cache Performance
5.5 Dependable Memory Hierarchy
5.6 Virtual Machines
5.7 Virtual Memory
5.8 A Common Framework for Memory Hierarchy
5.9 Using a Finite-State Machine to Control a Simple Cache
5.10 Parallelism and Memory Hierarchies: Cache Coherence
5.11 Parallelism and Memory Hierarchy: Redundant Arrays of Inexpensive Disks
5.12 Advanced Material: Implementing Cache Controllers
5.13 Real Stuff: The ARM Cortex-A8 and Intel Core i7 Memory Hierarchies
5.14 Going Faster: Cache Blocking and Matrix Multiply
5.15 Fallacies and Pitfalls
5.16 Concluding Remarks
5.17 Historical Perspective and Further Reading
5.18 Exercises
6 Parallel Processors from Client to Cloud
6.1 Introduction
6.2 The Difficulty of Creating Parallel Processing Programs
6.3 SISD, MIMD, SIMD, SPMD, and Vector
6.4 Hardware Multithreading
6.5 Multicore and Other Shared Memory Multiprocessors
6.6 Introduction to Graphics Processing Units
6.7 Clusters, Warehouse Scale Computers, and Other Message-Passing Multiprocessors
6.8 Introduction to Multiprocessor Network Topologies
6.9 Communicating to the Outside World: Cluster Networking
6.10 Multiprocessor Benchmarks and Performance Models
6.11 Real Stuff: Benchmarking Intel Core i7 versus NVIDIA Tesla GPU
6.12 Going Faster: Multiple Processors and Matrix Multiply
6.13 Fallacies and Pitfalls
6.14 Concluding Remarks
6.15 Historical Perspective and Further Reading
6.16 Exercises
APPENDICES
A Assemblers, Linkers, and the SPIM Simulator
A.1 Introduction
A.2 Assemblers
A.3 Linkers
A.4 Loading
A.5 Memory Usage
A.6 Procedure Call Convention
A.7 Exceptions and Interrupts
A.8 Input and Output
A.9 SPIM
A.10 MIPS R2000 Assembly Language
A.11 Concluding Remarks
A.12 Exercises
B TH-2 High Performance Computing System
B.1 Introduction
B.2 Compute Node
B.3 The Frontend Processors
B.4 The Interconnect
B.5 The Software Stack
B.6 LINPACK Benchmark Run (HPL)
B.7 Concluding Remarks
F Networks-on-Chip
F.1 Introduction
F.2 Communication Centric Design
F.3 The Design Space Exploration of NoCs
F.4 Router Micro-architecture
F.5 Performance Metric
F.6 Concluding Remarks
Index
Civilization advances by extending the number of important operations which we can perform without thinking about them.
Alfred North Whitehead, An Introduction to Mathematics, 1911
While programmers could ignore the advice and rely on computer architects, compiler writers, and silicon engineers to make their programs run faster without change, that era is over.
...
While the goal of many researchers is to make it possible for programmers to be unaware of the underlying parallel nature of the hardware they are programming, it will take many years to realize this vision.
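Chapter 4, page 211, line 2 of the Chinese edition reads “必须考虑复制时存储指令后紧跟着的是装载指令的情况” (roughly, "we must consider the case, when copying, where a store instruction is immediately followed by a load instruction"). The original is "However, consider loads immediately followed by stores, useful when performing memory-to-memory copies in the MIPS architecture.", which should be translated as “但是应当考虑到,在MIPS架构中... ("However, it should be considered that, in the MIPS architecture...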
0 helpful Franklif1 2021-06-05 17:59:14
@2021-03-14 18:58:18 @2021-06-05 17:59:14 @2021-03-14 18:58:18 @2021-06-05 17:59:14 @2021-03-14 18:58:18
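1 helpful dumlamb 2023-12-31 13:43:25 Shanghai
This book works as an introduction to computer architecture. It explains the fundamentals in great detail and ties them to RTL design examples, so you understand from the bottom up how computer hardware is implemented and come away with a deep appreciation of the hardware/software interface. There are MIPS, ARM, and RISC-V editions, and any of them will do. Follow it with the same authors' Computer Architecture: A Quantitative Approach and you can more or less claim to understand computers thoroughly.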
0 helpful tears 2020-12-18 07:05:00
Arguably a must-read for anyone starting out in computer architecture.
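21 helpful 曾思未知 2020-06-15 06:36:35
This was the textbook for the computer architecture course in my first semester of graduate school in the United States (the fourth edition at the time). In the first class, the instructor said the semester project would be built up step by step over ten assignments: design a 32-bit MIPS CPU and, at the end, implement its instructions on an FPGA. I did not even know what an ISA was, so naturally I was overwhelmed. After week upon week of study and one assignment after another, when the design finally worked at the end of the semester, I felt my thinking in science and engineering had changed qualitatively (admittedly also because I had not learned much as an undergraduate in China). In later courses, even when I did not understand many of the concepts, I no longer panicked; I felt that "I will understand it eventually, and the project will get done in the end." I have now worked in Silicon Valley for more than four years, and I have always been grateful to the professor who taught that course.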
1 helpful 猫成一团 2015-08-05 11:39:57
The new edition is easier on the eyes.
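1 helpful BLACKJACK 2021-11-05 17:42:25
Of the six chapters I read four; the CPU implementation ends there. I did not expect to stay ahead of the lectures the whole way. I skipped cache and multicore. The parts I skipped are actually very useful: I would like to learn them and enter something like the Loongson Cup (龙芯杯), put my data-compression trick to use, and recruit two teammates. The stupid part is that I recently wrecked two relationships. Getting along with people really is hard. Then I think of the opening passages of The Godfather and suddenly feel that sometimes I do all right. What an idiot, right? Much of what I want to do runs on sheer enthusiasm. I just read the story of a physics PhD in his eighties, and I think that path is a good one. If I can, I hope to help people live longer so that the old can chase their former dreams once more. Back to the book: foreign authors really are tiresome. Integers, instructions, things everyone already knows get page after page, with no mathematical formulation. Setting the control signals takes several sheets, while interrupts and the like are brushed over. Everyone knows you are paid by the word, but do not write the simple parts at length and skip past the complicated ones. Ugly. Still, I have to thank you for teaching me CPU design, and I think the Real Stuff sections really are very good; they deepened my understanding of processors a great deal.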
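0 helpful psxf 2021-06-27 23:59:31
I miss the days of my computer organization and OS courses... Just imagining building an R3000 PlayStation2 myself feels great. A pity MIPS has declined.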
1 helpful z 2021-04-28 12:17:37
Good.