Software Carpentry

[Agans 2002]: David J. Agans: Debugging. American Management Association, 2002, 0814471684.
Its first sentence says, “This book tells you how to find out what's wrong with stuff, quick,” and that's exactly what it does. In fifteen (very) short chapters, the author presents nine simple rules to help you track down and fix problems in software, hardware, or anything else. His war stories are entertaining (although I think one or two are urban myths), and his advice is eminently practical.
[Andrews & Whittaker 2006]: Mike Andrews and James A. Whittaker: How to Break Web Software. Addison-Wesley, 2006, 0321369440.
This practical companion to [Whittaker 2003] catalogs things you can do to break web-based applications.
[Beck & Cunningham 1989]: Kent Beck and Ward Cunningham: "A Laboratory for Teaching Object-Oriented Thinking", SIGPLAN Notices, vol. 24, no. 10, 1989.
The first description of CRC cards.
[Boehm 1988]: Barry Boehm: "A Spiral Model of Software Development and Enhancement", IEEE Computer, 1988.
Boehm's landmark description of spiral software development.
[Brand 1995]: Stewart Brand: How Buildings Learn. Penguin USA, 1995, 0140139966.
This beautiful, thought-provoking book starts with the observation that most architects spend their time re-working or extending existing buildings, rather than creating new ones from scratch. Of course, if Brand had written “program” instead of “building”, and “programmer” where he'd written “architect”, everything he said would have been true of computing as well. A lot of software engineering books try to convey the same message about allowing for change, but few do it so successfully. By presenting examples ranging from the MIT Media Lab to a one-room extension to a house, Brand encourages us to see patterns in the way buildings change (or, to adopt Brand's metaphor, the way buildings learn from their environment and from use). In turn, he uses those insights to argue that since buildings are always going to be modified, they should be designed to accommodate unanticipated change.
[Brooks 1995]: Frederick P. Brooks: The Mythical Man Month: Essays on Software Engineering. Addison-Wesley, 1995, 0201835959.
The classic text in software engineering, most famous for its discussion of how adding people to a project that's late will only make it later.
[Castro 2002]: Elizabeth Castro: HTML for the World Wide Web. Peachpit Press, 2002, 0321130073.
A clean, clear, comprehensive guide to creating HTML for the web, with good coverage of Cascading Style Sheets (CSS).
[Castro 2000]: Elizabeth Castro: XML for the World Wide Web. Peachpit Press, 2000, 0201710986.
Like other books in Peachpit's Visual Quickstart series, this one is beautifully designed, and easy to read without ever being condescending. Its 16 chapters and 4 appendices are organized into 1- and 2-page explanations of particular topics, from writing non-empty elements to namespaces, schemas, and XML transformation. Throughout, Castro strikes a perfect balance between “what”, “why”, and “how”, and provides a surprising amount of detail without ever overwhelming the reader.
[Chase & Simon 1973]: W.G. Chase and H.A. Simon: "Perception in chess", Cognitive Psychology, vol. 4, pp. 55-81, 1973.
The original paper comparing the performance of novice and master chess players when confronted with actual and random positions.
[Collins-Sussman et al 2004]: Ben Collins-Sussman, Brian W. Fitzpatrick, and C. Michael Pilato: Version Control with Subversion. O'Reilly, 2004, 0596004486.
A good tutorial and reference guide for Subversion, which is also available on-line.
[Doar 2005]: Matt Doar: Practical Development Environments. O'Reilly, 2005, 0596007965.
Matt Doar has produced a practical guide to what should be in every team's toolbox, how competing entries stack up, and how they ought to be used. This book covers everything from configuration management tools like CVS and Subversion, to build tools (make, GNU's Autotools, Ant, Jam, and SCons), various testing aids, bug tracking systems, documentation generators, and we're still only at the halfway mark. He names names, provides links, and treats free and commercial offerings on equal terms. My copy currently has 28 folded-down corners, which is 28 more than most books get.
[Eick et al 2001]: Stephen G. Eick, Todd L. Graves, Alan F. Karr, J.S. Marron, and Audris Mockus: "Does Code Decay? Assessing the Evidence from Change Management Data", IEEE Transactions on Software Engineering, vol. 27, no. 1, 2001.
Analyzes the evolution of several million lines of telephone switching software over fifteen years to show that code quality, comprehensibility, and maintainability decline over time.
[Fagan 1986]: Michael E. Fagan: "Advances in Software Inspections", IEEE Transactions on Software Engineering, vol. 12, no. 7, 1986.
Empirical data showing that code reviews are the most effective way known to find bugs.
[Fehily 2006]: Chris Fehily: Python. Peachpit Press, 2006, 0321423135.
A gentle introduction to Python, beautifully typeset, with lots of helpful examples.
[Fehily 2003]: Chris Fehily: SQL. Peachpit Press, 2003, 0321118030.
This very readable book describes the subset of SQL that covers most real-world needs. While the book moves a little slowly in some places, the examples are exceptionally clear.
[Feldman 1979]: Stuart I. Feldman: "Make—A Program for Maintaining Computer Programs", Software: Practice and Experience, vol. 9, no. 4, pp. 255-265, 1979.
The original description of Make. Last time I checked, Stu Feldman was a vice president at IBM, which shows you just how far a good tool can take you…
[Feathers 2005]: Michael C. Feathers: Working Effectively with Legacy Code. Prentice-Hall PTR, 2005, 0131177052.
Most programmers spend most of their time fixing bugs, porting to new platforms, and adding new features: in short, changing existing code. If that code is exercised by unit tests, then changes can be made quickly and safely; if it isn't, they can't, so your first job when you inherit legacy code should be to write some tests. That's where this book comes in. Want to know three different ways to inject a test into a C++ class without changing the code? They're here. Want to know which classes or methods to focus testing on? Read his discussion of pinch points. Need to break inter-class dependencies in Java so that you can test one module without having to configure the entire application? That's in here too, along with dozens of other useful bits of information. Everything is illustrated with small examples, all of them clearly explained and to the point. There are lots of simple diagrams, and a short glossary; all that's missing is hype.
[Fogel 2005]: Karl Fogel: Producing Open Source Software. O'Reilly, 2005, 0596007590.
A community is more than just a bunch of people. It's a shared set of values, and rules for how to behave. By this standard, the open source community isn't just what some programmers choose to do with their time, and why; it's also how they do it. This book is an excellent guide to that “how”. Every page offers practical advice; every point is made clearly and concisely, and clearly draws upon the author's extensive personal experience. Want to know how to earn commit privileges on a project? It's here. Do you and other project members have irreconcilable differences? Fogel explains when and how to fork, and what the pros and cons are. Want to get your project more attention? Want to take something closed, and open it up? It's all here, and much more.
[Fowler 1999]: Martin Fowler: Refactoring. Addison-Wesley Professional, 1999, 0201485672.
Like architects, most programmers spend most of their time renovating, rather than creating something completely new on a blank sheet of paper. This book presents and analyzes patterns that come up again and again when programs are being reorganized. Some of these are well-known, such as placing common code in a utility method. Others, such as replacing temporary objects with queries, or replacing constructors with factory methods, are subtler, but no less important. Each entry includes a section on motivation, the mechanics of actually carrying out the transformation, and an example in Java.
[Friedl 2002]: Jeffrey E. F. Friedl: Mastering Regular Expressions. O'Reilly, 2002, 0596002890.
The definitive programmer's guide to regular expressions.
[Gamma et al 1995]: Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides: Design Patterns. Addison-Wesley, 1995, 0201633612.
The book that started the software design patterns movement. Much of the discussion has been superseded by more recent books, and the use of C++ and Smalltalk for examples feels a little dated, but it is still a landmark in programming.
[Glass 2002]: Robert L. Glass: Facts and Fallacies of Software Engineering. Addison-Wesley Professional, 2002, 0321117425.
I really wish someone had given me something like this book when I took my first programming job. If nothing else, it would have been a better way to start thinking about the profession I had stumbled into than the “everybody knows” factoids that I soaked up at coffee time. Some of what he says is well-known: good programmers are up to N times better than bad ones (his value for N is 28), reusable components are three times harder to build than non-reusable ones, and so on. Other facts aren't part of the zeitgeist, though they should be. For example, most of us know that maintenance consumes 40-80% of software costs, but did you know that roughly 60% of that is enhancements, rather than bug fixes? Or that if more than 20-25% of a component has to be modified, it is more efficient to re-write it from scratch? Best of all, Glass backs up every statement he makes with copious references to the primary literature; if you still disagree with him, you'd better be sure you have as much evidence for your point of view as he has for his.
[Goerzen 2004]: John Goerzen: Foundations of Python Network Programming. APress, 2004, 1590593715.
This book looks at how to handle several common protocols, including HTTP, SMTP, and FTP. Goerzen doesn't delve deeply into their internals, but instead concentrates on how to build clients that use them. His approach is to build solutions to complex problems one step at a time, explaining each addition or modification along the way. He occasionally assumes more background knowledge than most readers of this book are likely to have, but only occasionally, and makes up for it by providing both clear code, and clear explanations of why this particular function has to do things in a particular order, or why that one really ought to be multithreaded.
[Good 2005]: Nathan A. Good: Regular Expression Recipes. APress, 2005, 159059441X.
A great how-to for regular expressions, with examples in many different languages.
[Gunderloy 2004]: Mike Gunderloy: Coder to Developer. Sybex, 2004, 078214327X.
This practical, readable book is subtitled “Tools and Strategies for Delivering Your Software”, and that's exactly what it's about. Project planning, source code control, unit testing, logging, and build management are all there. Importantly, so are newer topics, like building plugins for your IDE, code generation, and things you can do to protect your intellectual property. Everything is clearly explained, and illustrated with well-chosen examples. While the focus is definitely on .NET, Gunderloy covers a wide range of other technologies, both proprietary and open source. I'm already using two new tools based on references from this book, and plan to make the chapter on “Working with Small Teams” required reading for my students.
[Hammond 1994]: Nick Hammond: "Software Carpentry --- A Tool-Based Approach to Monte Carlo Radiation Transport", Proc. 8th Int'l Conference on Radiation Shielding, 1994.
A prior use of the phrase “software carpentry”.
[Harold 2004]: Elliotte Rusty Harold: Effective XML. Addison-Wesley, 2004, 0321150406.
This book explains which of XML's many features should be used when: Item 12 tells you to store metadata in attributes, and then spends six pages explaining why, while Item 24 analyzes the strengths and weaknesses of various schema languages, and Item 38 covers character set encodings. It's more than most developers will ever want to know, but when you need it, you really need it.
[Hock 2004]: Roger R. Hock: Forty Studies that Changed Psychology. Prentice Hall, 2004, 0131147293.
In forty short chapters, Hock describes the turning points in our understanding of how our minds work. The book isn't just about psychology; you'll also learn a lot about how science gets done, and about the scientists who do it.
[Humphrey 1996]: Watts S. Humphrey: Introduction to the Personal Software Process. Addison-Wesley, 1996, 0201548097.
A methodology for improving programmers' productivity by having them record and track just about everything they do. The idea has a lot of merit, but in practice, the cost of record keeping can outweigh the benefits.
[Hunt & Thomas 1999]: Andrew Hunt and David Thomas: The Pragmatic Programmer. Addison-Wesley, 1999, 020161622X.
This book is about those things that make up the difference between typing in code that compiles, and writing software that reliably does what it's supposed to. Topics range from gathering requirements through design, to the mechanics of coding, testing, and delivering a finished product. The second section, for example, covers “The Evils of Duplication”, “Orthogonality”, “Reversibility”, “Tracer Bullets”, “Prototypes and Post-It Notes”, and “Domain Languages”, and illuminates each with plenty of examples and short exercises.
[Johnson 2000]: Jeff Johnson: GUI Bloopers. Morgan Kaufmann, 2000, 1558605827.
Most books on GUI design are long on well-meaning aesthetic principles, but short on examples of what it means to put those principles into practice. In contrast, GUI Bloopers presents case study after case study: what's wrong with this dialog? What should its creators have done instead? And, most importantly, why? The net effect is to teach all of the same principles that other books try to, but in a grounded, understandable way.
[Kernighan & Pike 1984]: Brian W. Kernighan and Rob Pike: The Unix Programming Environment. Prentice Hall, 1984, 013937681X.
I have long believed that this book is the real secret to Unix's success. It doesn't just show readers how to use Unix—it explains why the operating system is built that way, and how its “lots of little tools” philosophy keeps simple tasks simple, while making hard ones doable.
[Kernighan & Ritchie 1998]: Brian W. Kernighan and Dennis Ritchie: The C Programming Language. Prentice Hall PTR, 1998, 0131103628.
The classic description of the one programming language every serious programmer absolutely, positively has to learn.
[Knuth 1998]: Donald E. Knuth: The Art of Computer Programming. Addison-Wesley, 1998, 0201485419.
The lifework of the man who invented many of the basic concepts of algorithm analysis, these massive tomes are like Everest: awe-inspiring, but not for the weak of heart. Most readers will find [Sedgewick 2001] much more approachable.
[Langtangen 2004]: Hans P. Langtangen: Python Scripting for Computational Science. Springer-Verlag, 2004, 3540435085.
The book's aim is to show scientists and engineers with little formal training in programming how Python can make their lives better. Regular expressions, numerical arrays, persistence, the basics of GUI and web programming, interfacing to C, C++, and Fortran: it's all here, along with hundreds of short example programs. Some readers may be intimidated by the book's weight, and the dense page layout, but what really made me blink was that I didn't find a single typo or error. It's a great achievement, and a great resource for anyone doing scientific programming.
[Lutz & Ascher 2003]: Mark Lutz and David Ascher: Learning Python. O'Reilly, 2003, 0596002815.
This is not only the best introduction to Python on the market, it is one of the best introductions to any programming language that I have ever read. Lutz and Ascher cover the entire core of the language, and enough of its advanced features and libraries to give readers a feeling for just how powerful Python is. In keeping with the spirit of the language itself, their writing is clear, their explanations lucid, and their examples well chosen.
[Margolis & Fisher 2002]: Jane Margolis and Allan Fisher: Unlocking the Clubhouse. MIT Press, 2002, 0262133989.
This book describes a project at Carnegie Mellon University that tried to figure out why so few women become programmers, and what can be done to correct the imbalance. Its first six chapters describe the many small ways in which we all, male and female, are conditioned to believe that computers are “boy's things”. Sometimes it's as simple as putting the computer in the boy's room, because “he's the one who uses it most”. Later on, the “who needs a social life?” atmosphere of undergraduate computer labs drives many women away (and many men, too). The last two chapters describe what the authors have done to remedy the situation at high schools and university. This work shows that by being conscious of the many things that turn women off computing, and by viewing computer science from different angles, we can attract a broader cross-section of society, which can only make our discipline a better place to be. The results are impressive: female undergraduate enrolment at CMU rose by more than a factor of four during their work, while the proportion of women dropping out decreased significantly.
[Martelli 2005]: Alex Martelli, Anna Ravenscroft, and David Ascher: Python Cookbook. O'Reilly, 2005, 0596007973.
A useful reference for every serious Python programmer, this book is a collection of tips and tricks, some very simple, others so complex that they require careful line-by-line reading. The book's companion web site is updated regularly.
[Mason 2005]: Mike Mason: Pragmatic Version Control Using Subversion. Pragmatic Bookshelf, 2005, 0974514063.
Yet another book from the folks at Pragmatic, this one is everything you'll ever need to know about Subversion, which is on its way to becoming the version control system of choice for open source development.
[McConnell 2004]: Steve McConnell: Code Complete. Microsoft Press, 2004, 0735619670.
This classic is a handbook of do's and don'ts for working programmers. It covers everything from how to avoid common mistakes in C to how to set up a testing framework, how to organize multi-platform builds, and how to coordinate the members of a team. In short, it is everything I wished someone had told me before I started my first full-time programming job.
[McConnell 1996]: Steve McConnell: Rapid Development. Microsoft Press, 1996, 1556159005.
This book describes what it takes to develop robust code quickly, what mistakes are often made in the name of rapid development, and how to identify and analyze potential risks. It includes a list of 25 best practices, and discusses things that most other books leave out (like recovering from disasters and dealing with impossible demands). Unlike most “how to do it better” books, it doesn't try to sell any particular practice or style, which adds even more weight to McConnell's carefully balanced opinions.
[Pilgrim 2004]: Mark Pilgrim: Dive Into Python. APress, 2004, 1590593561.
A good introduction to Python, which is also available on-line at Dive Into Python.
[Prechelt 2000]: Lutz Prechelt: "An Empirical Comparison of Seven Programming Languages", IEEE Computer, vol. 33, no. 10, pp. 23-29, 2000.
Some hard data on the relative effectiveness of C, C++, Java, Perl, Python, Rexx, and Tcl.
[Ray & Ray 2003]: Deborah S. Ray and Eric J. Ray: Unix. Peachpit Press, 2003, 0321170105.
A gentle introduction to Unix, with many examples.
[Robinson 2005]: Evan Robinson: "Why Crunch Mode Doesn't Work: 6 Lessons", (viewed 2006-02-26).
An incisive summary of the effect of fatigue on human productivity, the conclusion of which is that crunch mode winds up making projects later.
[Rosen 2005]: Lawrence Rosen: Open Source Licensing: Software Freedom and Intellectual Property Law. Prentice Hall PTR, 2005, 0131487876.
If you're involved in open source software in any way, shape, or form, then this book is a useful read. Its author is intimately familiar with the field; here, he lays out a general background for discussion of intellectual property, and the history of free/open source software, then discusses what various popular licenses actually mean. The book closes with chapters on topics such as how to choose a license, litigation, and standards. The writing is clear—exceptionally so by legal standards—and he takes time to explain terms and assumptions that most software developers won't have encountered before. What's more, he doesn't seem to have any particular axes to grind: the book is US-centric, but his treatment of the various options open to today's developers is very even-handed.
[Royce 1970]: W. W. Royce: "Managing the Development of Large Software Systems", Proceedings of IEEE WESCON, 1970.
The original description of the waterfall model of software development.
[Schneier 2003]: Bruce Schneier: Beyond Fear. Springer, 2003, 0387026207.
A thought-provoking look at how we are encouraged to think about security, and how much security is actually desirable. For example, he explains why security systems must not just work well, but fail well, and why secrecy often undermines security instead of enhancing it.
[Schneier 2005]: Bruce Schneier: Secrets and Lies. Wiley, 2005, 0471453803.
Having written the standard book on cryptography, Schneier now argues that technology alone can't solve most real security problems. The book covers systems and threats, the technologies used to protect and intercept data, and strategies for proper implementation of security systems. Rather than blind faith in prevention, Schneier advocates swift detection and response to an attack, while maintaining firewalls and other gateways to keep out the amateurs.
[Sedgewick 2001]: Robert Sedgewick: Algorithms in C, Parts 1-5. Addison-Wesley Professional, 2001, 0201756080.
Far too many programmers still think and code as if resizeable vectors and string-to-pointer hash tables were the only data structures ever invented. These books are a guide to all the other conceptual tools that working programmers ought to have at their fingertips, from sorting and searching algorithms to different kinds of trees and graphs. The analysis isn't as deep as that in Knuth's monumental The Art of Computer Programming, but that makes these books far more accessible. And while the author's use of C may seem old-fashioned in an age of Java and C#, it does ensure that nothing magical is hidden inside an overloaded operator or virtual method call.
[Skoudis 2004]: Ed Skoudis: Malware. Prentice-Hall, 2004, 0131014056.
This 647-page tome is a survey of harmful software, from viruses and worms through Trojan horses, root kits, and even malicious microcode. Each threat is described and analyzed in detail, and the author gives plenty of examples to show exactly how the attack works, and how to block (or at least detect) it. The writing is straightforward, and the case studies in Chapter 10 are funny without being too cute.
[Spinellis 2006]: Diomidis Spinellis: Code Quality. Addison-Wesley, 2006, 0321166078.
A companion to the same author's earlier [Spinellis 2003], this book concentrates on what distinguishes good code from bad. The first one was great; this one is even better.
[Spinellis 2003]: Diomidis Spinellis: Code Reading. Addison-Wesley, 2003, 0201799405.
The book's preface says it best: “The reading of code is likely to be one of the most common activities of a computing professional, yet it is seldom taught as a subject or formally used as a method for learning how to design and program.” Spinellis isn't the first person to make this point, but he is the first person I know of to do something about it. In this book, he walks through hundreds of examples of C, C++, Java, and Perl, drawn from dozens of Open Source projects such as Apache, NetBSD, and Cocoon. Each example illustrates a point about how programs are actually built. How do people represent multi-dimensional tables in C? How do people avoid nonreentrant code in signal handlers? How do they create packages in Java? How can you recognize that a data structure is a graph? A hashtable? That it might contain a race condition? And on, and on, real-world issue after real-world issue, each one analyzed and cross-referenced. There's also a section on additional documentation sources, and a chapter on tools that can help you make sense of whatever you've just inherited.
[Steele 1999]: Guy L. Steele Jr.: "Growing a Language", Journal of Higher-Order and Symbolic Computation, vol. 12, no. 3, pp. 221-236, 1999.
The best (and wittiest) discussion ever published of how programming languages ought to evolve.
[Spolsky 2004]: Joel Spolsky: Joel on Software. APress, 2004, 1590593898.
Joel on Software collects some of the witty, insightful articles Spolsky has blogged over the past few years. His observations on hiring programmers, measuring how well a development team is doing its job, the API wars, and other topics are always entertaining and informative. Over the course of forty-five short chapters, he ranges from the specific to the general and back again, tossing out pithy observations on the commoditization of the operating system, why you need to hire more testers, and why NIH (the not-invented-here syndrome) isn't necessarily a bad thing.
[Thompson & Chase 2005]: Herbert H. Thompson and Scott G. Chase: The Software Vulnerability Guide. Charles River Media, 2005, 1584503580.
My current favorite guide to computer security for programmers, this book walks through each major family of security holes in turn: faulty permission models, bad passwords, macros, dynamic linking and loading, buffer overflow, format strings and various injection attacks, temporary files, spoofing, and more.
[Ullman & Liyanage 2004]: Larry Ullman and Marc Liyanage: C Programming. Peachpit Press, 2004, 0321287630.
A gentle introduction to C, with many examples.
[Whittaker 2003]: James A. Whittaker: How to Break Software. Addison-Wesley, 2003, 0201796198.
A slim catalog of things testers can do to break software.
[Whittaker & Thompson 2004]: James A. Whittaker and Herbert H. Thompson: How to Break Software Security. Addison-Wesley, 2004, 0321194330.
This practical companion to [Whittaker 2003] catalogs things you can do to test (and break) security measures in programs.
[Williams & Kessler 2003]: Laurie Williams and Robert Kessler: Pair Programming Illuminated. Addison-Wesley, 2003, 0201745763.
A combination of an instruction manual, a summary of the authors' empirical studies of pair programming's effectiveness, and advocacy, this book is the reference guide for anyone who wants to introduce pair programming into their development team.
[Wilson 2005]: Greg Wilson: Data Crunching. Pragmatic Bookshelf, 2005, 0974514071.
Every day, all around the world, programmers have to recycle legacy data, translate from one vendor's proprietary format into another's, check that configuration files are internally consistent, and search through web logs to see how many people have downloaded the latest release of their product. It may not be glamorous, but knowing how to do it efficiently is essential to being a good programmer. This book describes the most useful data crunching techniques, explains when you should use them, and shows how they will make your life easier.
[Zeller 2006]: Andreas Zeller: Why Programs Fail: A Guide to Systematic Debugging. Morgan Kaufmann, 2006, 1558608664.
This well-written, copiously-illustrated book from the creator of DDD (a graphical front end for the GNU debugger) is a survey of current and next-generation debugging tools. Some are old friends, like bug trackers and symbolic debuggers. Others are new: there's a detailed look at the pros and cons of replay debugging, an automatic divide-and-conquer tool that can strip test cases down to their essentials, and a whole chapter on how dependency analysis and program slicing can be used to isolate faults. If, ten years from now, debuggers have taken a much-needed leap forward, much of the credit will go to this book.