A Philosophy of Software Design
Preface
- The most fundamental problem in computer science is problem decomposition: how to take a complex problem and divide it up into pieces that can be solved independently.
1 Introduction
### 1.1 How to use this book
- The best way to use this book is in conjunction with code reviews.
- One of the best ways to improve your design skills is to learn to recognize red flags: signs that a piece of code is probably more complicated than it needs to be.
- When applying the ideas from this book, it’s important to use moderation and discretion.
- This means that the greatest limitation in writing software is our ability to understand the systems we are creating.
- There are two general approaches to fighting complexity, both of which will be discussed in this book.
- The first approach is to eliminate complexity by making code simpler and more obvious.
- The second approach to complexity is to encapsulate it, so that programmers can work on a system without being exposed to all of its complexity at once.
- Agile development, in which the initial design focuses on a small subset of the overall functionality.
- The waterfall model, in which a project is divided into discrete phases such as requirements definition, design, coding, testing, and maintenance.
- If software developers should always be thinking about design issues, and reducing complexity is the most important element of software design, then software developers should always be thinking about complexity.
2 The Nature of Complexity
### 2.1 Complexity defined
- Complexity is anything related to the structure of a software system that makes it hard to understand and modify the system.
- If a software system is hard to understand and modify, then it is complicated; if it is easy to understand and modify, then it is simple.
- The overall complexity of a system (C) is determined by the complexity of each part p (c_p) weighted by the fraction of time developers spend working on that part (t_p).
### 2.2 Symptoms of complexity
- Complexity manifests itself in three general ways, which are described in the paragraphs below.
  - Change amplification: The first symptom of complexity is that a seemingly simple change requires code modifications in many different places.
  - One of the goals of good design is to reduce the amount of code that is affected by each design decision, so design changes don’t require very many code modifications.
  - Cognitive load: The second symptom of complexity is cognitive load, which refers to how much a developer needs to know in order to complete a task.
  - Sometimes an approach that requires more lines of code is actually simpler, because it reduces cognitive load.
  - Unknown unknowns: The third symptom of complexity is that it is not obvious which pieces of code must be modified to complete a task, or what information a developer must have to carry out the task successfully.
  - Of the three manifestations of complexity, unknown unknowns are the worst.
- One of the most important goals of good design is for a system to be obvious.
### 2.3 Causes of complexity
- Complexity is caused by two things: dependencies and obscurity.
  - A dependency exists when a given piece of code cannot be understood and modified in isolation; the code relates in some way to other code, and the other code must be considered and/or modified if the given code is changed.
  - However, one of the goals of software design is to reduce the number of dependencies and to make the dependencies that remain as simple and obvious as possible.
  - Dependencies lead to change amplification and a high cognitive load.
  - Obscurity occurs when important information is not obvious.
  - Obscurity is often associated with dependencies, where it is not obvious that a dependency exists.
  - Inconsistency is also a major contributor to obscurity.
  - The best way to reduce obscurity is by simplifying the system design.
  - Obscurity creates unknown unknowns, and also contributes to cognitive load.
### 2.4 Complexity is incremental
- Complexity isn’t caused by a single catastrophic error; it accumulates in lots of small chunks.
- In order to slow the growth of complexity, you must adopt a “zero tolerance” philosophy.
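The weighted-sum definition in section 2.1 can be written compactly. The symbols follow the notes: $c_p$ is the complexity of part $p$ and $t_p$ is the fraction of time developers spend working on that part.

$$
C = \sum_{p} c_p \, t_p
$$

Read this way, a complicated part that is almost never touched (small $t_p$) adds little to $C$, while modest complexity in code that everyone works on every day adds a lot.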
3 Working Code Isn’t Enough
### 3.1 Tactical programming
- In the tactical approach, your main focus is to get something working, such as a new feature or a bug fix.
- The problem with tactical programming is that it is short-sighted.
- The tactical tornado is a prolific programmer who pumps out code far faster than others but works in a totally tactical fashion.
### 3.2 Strategic programming
- The first step towards becoming a good software designer is to realize that working code isn’t enough.
- Rather than taking the fastest path to finish your current project, you must invest time to improve the design of the system.
### 3.3 How much to invest?
- The ideal design tends to emerge in bits and pieces, as you get experience with the system.
- Thus, the best approach is to make lots of small investments on a continual basis.
### 3.4 Startups and investment
- The best way to lower development costs is to hire great engineers: they don’t cost much more than mediocre engineers but have tremendously higher productivity.
- Facebook changed its motto to “Move fast with solid infrastructure” to encourage its engineers to invest more in good design.
4 Modules Should Be Deep
### 4.1 Modular design
- In an ideal world, each module would be completely independent of the others: a developer could work in any of the modules without knowing anything about any of the other modules.
- A module is any unit of code that has an interface and an implementation.
- In order to manage dependencies, we think of each module in two parts: an interface and an implementation.
  - Typically, the interface describes what the module does but not how it does it.
- The best modules are those whose interfaces are much simpler than their implementations.
  - First, a simple interface minimizes the complexity that a module imposes on the rest of the system.
  - Second, if a module is modified in a way that does not change its interface, then no other module will be affected by the modification.
### 4.2 What’s in an interface?
- The interface to a module contains two kinds of information: formal and informal.
  - The formal parts of an interface are specified explicitly in the code, and some of these can be checked for correctness by the programming language.
  - The informal parts of an interface include its high-level behavior, such as the fact that a function deletes the file named by one of its arguments.
  - The informal aspects of an interface can only be described using comments, and the programming language cannot ensure that the description is complete or accurate.
- One of the benefits of a clearly specified interface is that it indicates exactly what developers need to know in order to use the associated module.
### 4.3 Abstractions
- An abstraction is a simplified view of an entity, which omits unimportant details.
- The more unimportant details that are omitted from an abstraction, the better.
- The key to designing abstractions is to understand what is important, and to look for designs that minimize the amount of information that is important.
### 4.4 Deep modules
- The best modules are those that provide powerful functionality yet have simple interfaces.
  - The best modules are deep: they have a lot of functionality hidden behind a simple interface.
  - A deep module is a good abstraction because only a small fraction of its internal complexity is visible to its users.
- The benefit provided by a module is its functionality. The cost of a module (in terms of system complexity) is its interface.
### 4.5 Shallow modules
- A shallow module is one whose interface is complicated relative to the functionality it provides.
### 4.6 Classitis
- Classitis may result in classes that are individually simple, but it increases the complexity of the overall system.
### 4.7 Examples: Java and Unix I/O
- Interfaces should be designed to make the common case as simple as possible.
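The contrast between deep and shallow modules may be easier to see in code. The sketch below is illustrative only: `WordIndex` and `AttributeTable` are hypothetical classes invented for this note. The first hides tokenizing, case folding, and an inverted index behind two methods; the second exposes a long, specific method name that does one line of work.

```python
import re
from collections import defaultdict


class WordIndex:
    """Deep module (hypothetical): two methods hide tokenizing,
    case folding, and the inverted-index data structure."""

    def __init__(self):
        # Maps each normalized word to the ids of documents containing it.
        self._postings = defaultdict(set)

    def add(self, doc_id, text):
        """Make the words of `text` searchable under `doc_id`."""
        for word in self._tokenize(text):
            self._postings[word].add(doc_id)

    def search(self, query):
        """Return the ids of documents that contain every word in `query`."""
        words = self._tokenize(query)
        if not words:
            return set()
        result = set(self._postings.get(words[0], set()))
        for word in words[1:]:
            result &= self._postings.get(word, set())
        return result

    @staticmethod
    def _tokenize(text):
        # Hidden design decision: words are lowercased runs of letters/digits.
        return re.findall(r"[a-z0-9]+", text.lower())


class AttributeTable:
    """Shallow module (hypothetical): the method saves callers nothing."""

    def __init__(self):
        self.data = {}

    def add_null_value_for_attribute(self, attribute):
        # One line of functionality behind a long, specific interface.
        self.data[attribute] = None
```

Callers of `WordIndex` can change nothing if the tokenizer or the index structure is replaced, whereas callers of `AttributeTable` must learn a new name to do what a plain dictionary assignment already does.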
5 Information Hiding (and Leakage)
### 5.1 Information hiding
- The most important technique for achieving deep modules is information hiding.
  - The basic idea is that each module should encapsulate a few pieces of knowledge, which represent design decisions.
  - The knowledge is embedded in the module’s implementation but does not appear in its interface, so it is not visible to other modules.
- If you can hide more information, you should also be able to simplify the module’s interface, and this makes the module deeper.
- The best form of information hiding is when information is totally hidden within a module, so that it is irrelevant and invisible to users of the module.
### 5.2 Information leakage
- The opposite of information hiding is information leakage. Information leakage occurs when a design decision is reflected in multiple modules.
- One of the best skills you can learn as a software designer is a high level of sensitivity to information leakage.
### 5.3 Temporal decomposition
- In temporal decomposition, the structure of a system corresponds to the time order in which operations will occur.
- When designing modules, focus on the knowledge that’s needed to perform each task, not the order in which tasks occur.
### 5.5 Example: too many classes
- Information hiding can often be improved by making a class slightly larger.
### 5.7 Example: defaults in HTTP responses
- Whenever possible, classes should “do the right thing” without being explicitly asked.
- If the API for a commonly used feature forces users to learn about other features that are rarely used, this increases the cognitive load on users who don’t need the rarely used features.
### 5.8 Information hiding within a class
- If you can reduce the number of places where a variable is used, you will eliminate dependencies within the class and reduce its complexity.
- Try to design the private methods within a class so that each method encapsulates some information or capability and hides it from the rest of the class.
### 5.9 Taking it too far
- As a software designer, your goal should be to minimize the amount of information needed outside a module.
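The point from section 5.7 about classes that “do the right thing” can be sketched as follows. This is a hypothetical `HttpResponse` class, not the code discussed in the book; the `HTTP/1.1` default, the reason-phrase table, and the method names are assumptions made for illustration.

```python
# Hypothetical reason phrases; a real server would carry a fuller table.
_REASONS = {200: "OK", 404: "Not Found"}


class HttpResponse:
    """Hypothetical response builder that hides the wire format."""

    def __init__(self, body, status=200):
        self._body = body
        self._status = status
        self._headers = {}

    def set_header(self, name, value):
        """Override or add a header; most callers never need this."""
        self._headers[name] = value

    def render(self):
        """Return the full response text, filling in sensible defaults."""
        # Defaults are supplied here, so callers who don't care never have
        # to learn about the protocol version or Content-Length.
        headers = {"Content-Length": str(len(self._body.encode("utf-8")))}
        headers.update(self._headers)
        reason = _REASONS.get(self._status, "")
        lines = [f"HTTP/1.1 {self._status} {reason}".rstrip()]
        lines += [f"{name}: {value}" for name, value in headers.items()]
        return "\r\n".join(lines) + "\r\n\r\n" + self._body
```

The common case is one line, `HttpResponse("hi").render()`, and the rarely used header override stays out of the way of callers who do not need it.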
6 General-Purpose Modules are Deeper
### 6.1 Make classes somewhat general-purpose
- In my experience, the sweet spot is to implement new modules in a somewhat general-purpose fashion.
  - The phrase “somewhat general-purpose” means that the module’s functionality should reflect your current needs, but its interface should not.
  - The word “somewhat” is important: don’t get carried away and build something so general-purpose that it is difficult to use for your current needs.
  - The most important (and perhaps surprising) benefit of the general-purpose approach is that it results in simpler and deeper interfaces than a special-purpose approach.
### 6.2 Example: storing text for an editor
- One of the goals in class design is to allow each class to be developed independently, but the specialized approach tied the user interface and text classes together.
### 6.4 Generality leads to better information hiding
- One of the most important elements of software design is determining who needs to know what, and when.
### 6.5 Questions to ask yourself
- If a method is designed for one particular use, such as the backspace method, that is a red flag that it may be too special-purpose.
- If you have to write a lot of additional code to use a class for your current purpose, that’s a red flag that the interface doesn’t provide the right functionality.
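A minimal sketch of the editor example, assuming a text class that stores its content as a single string (a simplification chosen for brevity; the names `Text`, `insert`, `delete`, and the `backspace` helper are hypothetical). The text class knows nothing about keys or cursors; the user-interface code builds backspace out of the general-purpose `delete`.

```python
class Text:
    """Hypothetical general-purpose text store for an editor."""

    def __init__(self, content=""):
        self._content = content

    def insert(self, position, new_text):
        """Insert `new_text` so that it starts at index `position`."""
        self._content = self._content[:position] + new_text + self._content[position:]

    def delete(self, start, end):
        """Remove the characters in the half-open range [start, end)."""
        self._content = self._content[:start] + self._content[end:]

    def __str__(self):
        return self._content


def backspace(text, cursor):
    """UI-level operation: delete the character before the cursor.

    Returns the new cursor position. Lives outside Text so that the
    text class does not need to know which keys the editor supports.
    """
    if cursor == 0:
        return 0
    text.delete(cursor - 1, cursor)
    return cursor - 1
```

A forward-delete key or a "cut selection" command could be written the same way without adding any method to `Text`.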
7 Different Layer, Different Abstraction
- In a well-designed system, each layer provides a different abstraction from the layers above and below it; if you follow a single operation as it moves up and down through layers by invoking methods, the abstractions change with each method call.
- If a system contains adjacent layers with similar abstractions, this is a red flag that suggests a problem with the class decomposition.
### 7.1 Pass-through methods
- A pass-through method is one that does nothing except pass its arguments to another method, usually with the same API as the pass-through method.
  - Pass-through methods make classes shallower.
  - Pass-through methods also create dependencies between classes.
### 7.2 When is interface duplication OK?
- It is fine for several methods to have the same signature as long as each of them provides useful and distinct functionality.
- A dispatcher is a method that uses its arguments to select one of several other methods to invoke; then it passes most or all of its arguments to the chosen method.
### 7.3 Decorators
- A decorator object takes an existing object and extends its functionality; it provides an API similar or identical to the underlying object, and its methods invoke the methods of the underlying object.
- The motivation for decorators is to separate special-purpose extensions of a class from a more generic core.
- Sometimes decorators make sense, but there is usually a better alternative.
### 7.4 Interface versus implementation
- Another application of the “different layer, different abstraction” rule is that the interface of a class should normally be different from its implementation: the representations used internally should be different from the abstractions that appear in the interface.
### 7.5 Pass-through variables
- Pass-through variables add complexity because they force all of the intermediate methods to be aware of their existence, even though the methods have no use for the variables.
- One approach is to see if there is already an object shared between the topmost and bottommost methods.
- Another approach is to store the information in a global variable.
- The solution I use most often is to introduce a context object.
- A context stores all of the application’s global state (anything that would otherwise be a pass-through variable or global variable).
- To reduce the number of methods that must be aware of it, a reference to the context can be saved in most of the system’s major objects.
- Contexts may also create thread-safety issues; the best way to avoid problems is for variables in a context to be immutable.
- Each piece of design infrastructure added to a system, such as an interface, argument, function, class, or definition, adds complexity, since developers must learn about this element.
- The “different layer, different abstraction” rule is just an application of this idea: if different layers have the same abstraction, such as pass-through methods or decorators, then there’s a good chance that they haven’t provided enough benefit to compensate for the additional infrastructure they represent.
### 7.6 Conclusion
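The context-object approach from section 7.5 might look like the sketch below. Everything here is hypothetical (`Context`, `Server`, `Connection`, and the two fields); the only ideas taken from the notes are that the context holds state that would otherwise be passed through, that major objects keep a reference to it, and that its variables are immutable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """Hypothetical application-wide state; frozen so that threads
    sharing it cannot modify it."""
    cert_path: str
    request_timeout_s: float


class Connection:
    def __init__(self, context):
        self._context = context

    def open(self):
        # The bottommost method reads what it needs from the context.
        return f"opening socket with certificate {self._context.cert_path}"


class Server:
    def __init__(self, context):
        # The reference is saved once at construction time, so the methods
        # in between need no extra certificate parameter.
        self._context = context

    def handle_request(self):
        return Connection(self._context).open()
```

Adding a new piece of global state then means adding one field to `Context` rather than a new parameter to every method on the call path.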
8 Pull Complexity Downwards
- Most modules have more users than developers, so it is better for the developers to suffer than the users.
- As a module developer, you should strive to make life as easy as possible for the users of your module, even if that means extra work for you.
- It is more important for a module to have a simple interface than a simple implementation.
- Thus, you should avoid configuration parameters as much as possible.
- Ideally, each module should solve a problem completely; configuration parameters result in an incomplete solution, which adds to system complexity.
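One way to pull a configuration parameter downwards is to have the module work out a reasonable value itself. The sketch below assumes a hypothetical `RetryTimer` that derives a retry timeout from observed response times; the factor of 3 and the 20-sample window are arbitrary choices made for illustration, not recommendations.

```python
class RetryTimer:
    """Hypothetical helper that picks its own retry timeout instead of
    asking every user to configure one."""

    def __init__(self, initial_timeout_s=1.0):
        self._timeout_s = initial_timeout_s
        self._samples = []

    def record_response_time(self, seconds):
        """Note how long a successful request took."""
        self._samples.append(seconds)
        # Assumed policy: retry after 3x the slowest of the last 20 responses.
        recent = self._samples[-20:]
        self._timeout_s = 3.0 * max(recent)

    def timeout(self):
        """Return the current retry timeout in seconds."""
        return self._timeout_s
```

The developer of `RetryTimer` takes on the extra work of measuring, and users get a timeout that adapts to their environment without reading any documentation about tuning.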
9 Better Together Or Better Apart?
- When deciding whether to combine or separate, the goal is to reduce the complexity of the system as a whole and improve its modularity.
- Here are a few indications that two pieces of code are related:
  - They share information.
  - They are used together.
  - They overlap conceptually.
  - It is hard to understand one of the pieces of code without looking at the other.
### 9.3 Bring together to eliminate duplication
- One approach is to factor the repeated code out into a separate method and replace the repeated code snippets with calls to the method.
- If the snippet is only one or two lines long, there may not be much benefit in replacing it with a method call.
### 9.4 Separate general-purpose and special-purpose code
- In general, the lower layers of a system tend to be more general-purpose and the upper layers more special-purpose.
### 9.8 Splitting and joining methods
- However, length by itself is rarely a good reason for splitting up a method.
- When designing methods, the most important goal is to provide clean and simple abstractions. Each method should do one thing and do it completely.
10 Define Errors Out Of Existence
- The key overall lesson from this chapter is to reduce the number of places where exceptions must be handled; in many cases the semantics of operations can be modified so that the normal behavior handles all situations and there is no exceptional condition to report.
### 10.1 Why exceptions add complexity
- A recent study found that more than 90% of catastrophic failures in distributed data-intensive systems were caused by incorrect error handling.
### 10.2 Too many exceptions
- Classes with lots of exceptions have complex interfaces, and they are shallower than classes with fewer exceptions.
### 10.4 Example: file deletion in Windows
- In Unix, if a file is open when it is deleted, Unix does not delete the file immediately. Instead, it marks the file for deletion, then the delete operation returns successfully.
### 10.5 Example: Java substring method
- Overall, the best way to reduce bugs is to make software simpler.
### 10.10 Taking it too far
- The best way to do this is by redefining semantics to eliminate error conditions.
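The substring example from section 10.5 can be sketched in a few lines. This is a hypothetical Python function, not the Java method itself: it defines the result for every pair of indexes, so out-of-range arguments are no longer an error that callers must handle.

```python
def substring(text, begin, end):
    """Return the characters of `text` with index >= begin and < end.

    Indexes outside the string are clamped, so every call has a
    well-defined result: the overlap (possibly empty) between the
    requested range and the string.
    """
    # Clamp explicitly: a negative index in a Python slice would otherwise
    # count from the end of the string, which is not the intent here.
    begin = max(begin, 0)
    end = min(end, len(text))
    if begin >= end:
        return ""
    return text[begin:end]
```

With these semantics a caller can ask for "the first 80 characters" of a short string without first checking its length.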
11 Design it Twice
- Try to pick approaches that are radically different from each other; you’ll learn more that way.
- After you have roughed out the designs for the alternatives, make a list of the pros and cons of each one.
- Unfortunately, I often see smart people who insist on implementing the first idea that comes to mind, and this causes them to underperform their true potential (it also makes them frustrating to work with).
12 Why Write Comments? The Four Excuses
- Documentation also plays an important role in abstraction; without comments, you can’t hide complexity.
- Finally, the process of writing comments, if done correctly, will actually improve a system’s design.
### 12.1 Good code is self-documenting
- If users must read the code of a method in order to use it, then there is no abstraction: all of the complexity of the method is exposed.
### 12.2 I don’t have time to write comments
- Thus, if you allow documentation to be de-prioritized, you’ll end up with no documentation.
- Good comments make a huge difference in the maintainability of software, so the effort spent on them will pay for itself quickly.
### 12.3 Comments get out of date and become misleading
- Code reviews provide a great mechanism for detecting and fixing stale comments.
### 12.5 Benefits of well-written comments
- The overall idea behind comments is to capture information that was in the mind of the designer but couldn’t be represented in the code.
13 Comments Should Describe Things that Aren’t Obvious from the Code
- Comments should describe things that aren’t obvious from the code.
- The idea of an abstraction is to provide a simple way of thinking about something, but code is so detailed that it can be hard to see the abstraction just from reading the code.
- Developers should be able to understand the abstraction provided by a module without reading any code other than its externally visible declarations.
### 13.1 Pick conventions
- Most comments fall into one of the following categories:
  - Interface: a comment block that immediately precedes the declaration of a module such as a class, data structure, function, or method.
  - Data structure member: a comment next to the declaration of a field in a data structure, such as an instance variable or static variable for a class.
  - Implementation comment: a comment inside the code of a method or function, which describes how the code works internally.
  - Cross-module comment: a comment describing dependencies that cross module boundaries.
- Every class should have an interface comment, every class variable should have a comment, and every method should have an interface comment.
### 13.2 Don’t repeat the code
- The most common reason is that the comments repeat the code: all of the information in the comment can easily be deduced from the code next to the comment.
- Another common mistake is to use the same words in the comment that appear in the name of the entity being documented.
- A first step towards writing good comments is to use different words in the comment from those in the name of the entity being described.
### 13.3 Lower-level comments add precision
- Comments augment the code by providing information at a different level of detail.
  - Some comments provide information at a lower, more detailed, level than the code; these comments add precision by clarifying the exact meaning of the code.
  - Other comments provide information at a higher, more abstract, level than the code; these comments offer intuition.
- When documenting a variable, think nouns, not verbs. In other words, focus on what the variable represents, not how it is manipulated.
### 13.4 Higher-level comments enhance intuition
- But, great software designers can also step back from the details and think about a system at a higher level.
  - This means deciding which aspects of the system are most important, and being able to ignore the low-level details and think about the system only in terms of its most fundamental characteristics.
  - Comments of the form “how we get here” are very useful for helping people to understand code.
- If you want code that presents good abstractions, you must document those abstractions with comments.
### 13.5 Interface documentation
- Interface comments provide information that someone needs to know in order to use a class or method; they define the abstraction.
- If interface comments must also describe the implementation, then the class or method is shallow.
- The interface comment for a method includes both higher-level information for abstraction and lower-level details for precision.
- The main goal of implementation comments is to help readers understand what the code is doing.
### 13.6 Implementation comments: what and why, not how
- In addition to describing what the code is doing, implementation comments are also useful to explain why.
- However, if the variable is used over a large span of code, then you should consider adding a comment to describe the variable.
- When documenting variables, focus on what the variable represents, not how it is manipulated in the code.
### 13.8 Conclusion
- When writing comments, try to put yourself in the mindset of the reader and ask yourself what are the key things he or she will need to know.
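The comment categories above can be illustrated on one small class. `TokenBucket` is a hypothetical rate limiter written for this note; what matters is the placement and content of the comments: an interface comment that defines the abstraction, member comments that describe what each variable represents (nouns, not verbs), and an implementation comment that explains why rather than how.

```python
class TokenBucket:
    """Limits how often an action may run (hypothetical example).

    Callers ask for permission with `try_acquire`. The bucket says yes
    at most `capacity` times in a burst and, on average, `refill_per_s`
    times per second after that. Callers never need to know how tokens
    are stored or replenished.
    """

    def __init__(self, capacity, refill_per_s):
        # Maximum number of tokens the bucket can hold; always >= 1.
        self._capacity = capacity
        # Tokens added per second of elapsed time; may be fractional.
        self._refill_per_s = refill_per_s
        # Tokens currently available; between 0 and _capacity inclusive.
        self._tokens = float(capacity)
        # Time of the most recent refill, in seconds on the caller's clock.
        self._last_refill_s = 0.0

    def try_acquire(self, now_s):
        """Consume one token if one is available.

        `now_s` is the current time in seconds on any monotonic clock;
        it must not decrease between calls. Returns True if the action
        may proceed, False if the caller should wait.
        """
        # Refill before deciding, so that a long idle period is credited
        # to the caller; otherwise the first call after idling would fail.
        elapsed_s = now_s - self._last_refill_s
        self._tokens = min(self._capacity, self._tokens + elapsed_s * self._refill_per_s)
        self._last_refill_s = now_s
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False
```

None of the comments restates the code next to it; each adds units, bounds, or reasoning that a reader could not get from the statements alone.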
14 Choosing Names
- Good names are a form of documentation: they make code easier to understand.
### 14.1 Example: bad names cause bugs
- It took six months, but I eventually found and fixed the bug.
### 14.2 Create an image
- When choosing a name, the goal is to create an image in the mind of the reader about the nature of the thing being named.
- Names are a form of abstraction: they provide a simplified way of thinking about a more complex underlying entity.
- The best names are those that focus attention on what is most important about the underlying entity while omitting details that are less important.
### 14.3 Names should be precise
- Good names have two properties: precision and consistency.
- For example, it’s fine to use generic names like i and j as loop iteration variables, as long as the loops only span a few lines of code.
- If you find it difficult to come up with a name for a particular variable that is precise, intuitive, and not too long, this is a red flag.
- If it’s hard to find a simple name for a variable or method that creates a clear image of the underlying object, that’s a hint that the underlying object may not have a clean design.
- If you use names such as i and j for loop variables, always use i in outermost loops and j for nested loops.
### 14.4 Use names consistently
- For each of these common usages, pick a name to use for that purpose, and use the same name everywhere.
- Consistency has three requirements:
  - first, always use the common name for the given purpose;
  - second, never use the common name for anything other than the given purpose;
  - third, make sure that the purpose is narrow enough that all variables with the name have the same behavior.
### 14.5 A different opinion: Go style guide
- Readability must be determined by readers, not writers.
- The greater the distance between a name’s declaration and its uses, the longer the name should be.
15 Write The Comments First
- The best time to write comments is at the beginning of the process, as you write the code.
### 15.2 Write the comments first
- The comments-first approach has three benefits.
  - First, it produces better comments.
  - The second, and most important, benefit of writing the comments at the beginning is that it improves the system design.
  - The third and final benefit of writing comments early is that it makes comment-writing more fun.
### 15.3 Comments are a design tool
- Comments provide the only way to fully capture abstractions, and good abstractions are fundamental to good system design.
- If a method or variable requires a long comment, it is a red flag that you don’t have a good abstraction.
- Comments are only a good indicator of complexity if they are complete and clear.
### 15.4 Early comments are fun comments
### 15.5 Are early comments expensive?
- Writing the comments first will mean that the abstractions will be more stable before you start writing code.
- If you haven’t ever tried writing the comments first, give it a try. Stick with it long enough to get used to it.
16 Modifying Existing Code
- The design of a mature system is determined more by changes made during the system’s evolution than by any initial conception.
### 16.1 Stay strategic
- The tactical approach very quickly leads to a messy system design.
- If you want to maintain a clean design for a system, you must take a strategic approach when modifying existing code.
- Ideally, when you have finished with each change, the system will have the structure it would have had if you had designed it from the start with that change in mind.
### 16.2 Maintaining comments: keep the comments near the code
- The best way to ensure that comments get updated is to position them close to the code they describe.
- Spread them out, pushing each comment down to the narrowest scope that includes all of the code referred to by the comment.
### 16.3 Comments belong in the code, not the commit log
- A common mistake when modifying code is to put detailed information about the change in the commit message for the source code repository, but then not to document it in the code.
### 16.4 Maintaining comments: avoid duplication
- Instead, try to document each design decision exactly once.
- If information is already documented someplace outside your program, don’t repeat the documentation inside the program; just reference the external documentation.
17 Consistency
- Consistency creates cognitive leverage: once you have learned how something is done in one place, you can use that knowledge to immediately understand other places that use the same approach.
### 17.1 Examples of consistency
### 17.2 Ensuring consistency
- The best way to enforce conventions is to write a tool that checks for violations, and make sure that code cannot be committed to the repository unless it passes the checker.
- The most important convention of all is that every developer should follow the old adage “When in Rome, do as the Romans do.”
- Don’t change existing conventions.
- Your new idea may indeed be better, but the value of consistency over inconsistency is almost always greater than the value of one approach over another.
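A convention checker of the kind mentioned in section 17.2 does not have to be elaborate. The sketch below is a hypothetical example: the three rules (no tabs, no trailing whitespace, a maximum line length) and the function name `find_violations` are assumptions chosen for illustration, and a real project would substitute its own conventions.

```python
def find_violations(source, max_line_length=100):
    """Return a list of (line_number, message) pairs, one for each
    place where `source` breaks one of the assumed conventions."""
    violations = []
    for number, line in enumerate(source.splitlines(), start=1):
        if "\t" in line:
            violations.append((number, "tab character; indent with spaces"))
        if line != line.rstrip():
            violations.append((number, "trailing whitespace"))
        if len(line) > max_line_length:
            violations.append((number, "line too long"))
    return violations
```

To block commits, a script could run this over each changed file from a repository hook and exit with a non-zero status whenever the returned list is not empty.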
18 Code Should be Obvious
### 18.1 Things that make code more obvious
- Judicious use of white space. The way code is formatted can impact how easy it is to understand.
- Blank lines are also useful to separate major blocks of code within a method.
- This approach works particularly well if the first line after each blank line is a comment describing the next block of code: the blank lines make the comments more visible.
### 18.2 Things that make code less obvious
- If the meaning and behavior of code cannot be understood with a quick reading, it is a red flag.
- Software should be designed for ease of reading, not ease of writing.
19 Software Trends
### 19.1 Object-oriented programming and inheritance
- Thus, implementation inheritance should be used with caution.
- One of the key elements of object-oriented programming is inheritance.
### 19.2 Agile development
- One of the most important elements of agile development is the notion that development should be incremental and iterative.
- The increments of development should be abstractions, not features.
### 19.5 Design patterns
- Design patterns represent an alternative to design: rather than designing a new mechanism from scratch, just apply a well-known design pattern.
20 Designing for Performance
### 20.1 How to think about performance
- If you try to optimize every statement for maximum speed, it will slow down development and create a lot of unnecessary complexity.
- The key is to develop an awareness of which operations are fundamentally expensive.
- The best way to learn which things are expensive is to run micro-benchmarks (small programs that measure the cost of a single operation in isolation).
- For example, when storing a large collection of objects that will be looked up using a key value, you could use either a hash table or an ordered map. Both are commonly available in library packages, and both are simple and clean to use.
- In general, simpler code tends to run faster than complex code.
- Deep classes are more efficient than shallow ones, because they get more work done for each method call.
### 20.2 Measure before modifying
- Before making any changes, measure the system’s existing behavior.
  - First, the measurements will identify the places where performance tuning will have the biggest impact.
  - The second purpose of the measurements is to provide a baseline, so that you can re-measure performance after making your changes to ensure that performance actually improved.
### 20.3 Design around the critical path
- One of the most important things that happens in this process is to remove special cases from the critical path.
- When redesigning for performance, try to minimize the number of special cases you must check.
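A micro-benchmark in the sense of section 20.1 can be only a few lines long. The sketch below uses Python's standard `timeit` module to time one operation in isolation, a membership test on a list versus on a hash-based set. The notes mention hash tables and ordered maps; the list is swapped in here only because Python has no built-in ordered map, and the helper names are hypothetical.

```python
import timeit


def time_lookup(container, key, repetitions=1000):
    """Return the average seconds taken by one `key in container` test."""
    total_s = timeit.timeit(lambda: key in container, number=repetitions)
    return total_s / repetitions


def compare_lookups(size=10_000):
    """Time a lookup of an absent key in a list and in a set of `size` items."""
    keys = list(range(size))
    missing = -1  # absent key: the list must scan every element
    return {
        "list": time_lookup(keys, missing),
        "set": time_lookup(set(keys), missing),
    }
```

Absolute numbers depend on the machine and the interpreter, so results like these are best used to compare alternatives and to record a baseline before a change, as section 20.2 suggests.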
Summary of Design Principles
Summary of Red Flags