Development, Debugging and Optimization

Conrad Huang

April 16, 2008

Portions Copyright © 2005-06 Python Software Foundation.

Introduction

  • There's more to building a house than nailing boards together
    • Have to make sure the pipes are put in before the drywall goes up
    • Satisfy building code regulations
    • Make sure everyone on the team is productive (not just busy)
  • This lecture covers the equivalent topics for small-team software development
    • 12×12: up to a dozen people, working for up to a year
    • All of these ideas apply to people working on their own for two weeks or more

Design vs. Agility

  • Two camps currently dominate the debate about software development
  • Big Design Up Front (BDUF): measure twice, cut once
    • Think through users' needs, design, and possible problems before starting to code
  • Agile: lots of small steps, with continuous testing and refactoring
    • “No battle plan ever survives contact with the enemy.” (Helmuth von Moltke)
  • Both are responses to Boehm's Curve
      [Boehm's Curve]

      Figure 26.1: Boehm's Curve

    • BDUF: prevent problems from happening at all
      • The cheapest bug to fix is one that doesn't exist
    • Agile: catch problems while you're still at the low-cost end of the curve
  • Differences in practice are much less than the differences in rhetoric

Project Lifecycle

  • Very few individuals or teams stick to textbook rules
    • Teams always adapt processes to local needs and personalities
    • Remember: reality matters more than rulebooks
  • No matter what the official process is, most well-run medium-sized projects follow a similar path

[Project Lifecycle]

Figure 26.2: Project Lifecycle

Step 0: Vision

  • A vision statement is a one- or two-sentence summary of the project
    • Also called an elevator pitch
    • Helps keep everyone pointed in the same direction
    • A good way for project members to introduce themselves at conferences and trade shows
  • Exercise: have everyone on the team replace the bits in italics with words of their own
    • Do this independently, then compare answers
  • Part                   Boilerplate and example (read the right column downward as one pitch)
    Problem statement      The problem of only being able to simulate invasion percolation on regular 2D grids
    Target market          affects scientists who work with composite materials,
    Impact                 who currently have to extrapolate from regular models.
    Solution               Our solution, a set of enhancements to InvPerc,
    Key technical feature  handles any structure that can be represented as non-overlapping regions.
    Competition            Unlike PI2D and other simulators,
    Differentiator         it can read standard CAD files as well as IP2-format grid files.
    Table 26.1: Vision Statement Template
  • Just as important for solo projects!

Step 1: Gathering Requirements

  • Single biggest cause of project failure is failing to get the requirements right
    • Boehm's Curve again: building the wrong thing is the most expensive mistake you can make
  • Start by asking what problem the software is supposed to solve
    • What do you want to be able to do that you can't right now?
    • What does the existing software do that you don't want it to?
    • What does it make you do that you don't want to?
  • Organize requirements as point-form list
    • Give each one a unique name
    • And keep the list under version control

What Requirements Are and Aren't

  • Good requirements are complete and unambiguous
    • “The system will reformat data files as they are submitted” is neither
  • Instead:
    • Only users who have logged in by providing a valid user name and password can upload files
    • The system must allow users to upload files via a secure web form
    • The system must accept files up to 16MB in size
    • The system must accept files in PDB and RJCS-1 format
    • The system must convert files to RJCS-2 format before storing them
    • The system must present users with an error message page if an uploaded file cannot be parsed
    • etc.
  • A contract amongst the various stakeholders
    • Overly formal for two-person research prototypes
    • But essential for distributed teams

Step 2: From Requirements to Features

  • Figure out what features you need
    • What do you have to build in order to accomplish XYZ?
    • How will you tell that it's working?
    • Yet another point-form list…
  • Relationship between requirements and features can be very complex
    • One feature can (help) satisfy many requirements
    • One requirement may require many features
  • Traceability once again:
    • Why does each feature exist?
    • How is each requirement being satisfied?
    • Who said so? When?

Waterfalls And Why Not

[The Waterfall Model]

Figure 26.3: The Waterfall Model

  • Pause for a moment…
  • This looks like the start of the waterfall model [Royce 1970]
    • Describes development as flowing through several distinct phases
    • Requirements analysis to design to implementation to testing to maintenance
  • But:
    • If different people are responsible for different phases, then no one has to deal with the consequences of their mistakes
    • Whoever is responsible for testing has to make up all the lost time from the previous phases
    • Time lag: it can take a long time for changes in requirements to filter through to the finished product
    • No one actually ever works this way in real life anyway

The Spiral Model

[The Spiral Model]

Figure 26.4: The Spiral Model

  • The spiral model [Boehm 1988] wraps this around itself
    • Go through the waterfall cycle over and over again, each time on a larger scale
    • Royce actually advocated doing this too, but most people have forgotten that
  • Key ideas:
    • The code teaches you about the problem
    • Customers can only find out what they actually want by playing with a working system
  • But Boehm still envisaged:
    • Cycles lasting from six months to two years
    • And division of labor

Enter the Extremists

  • Extreme Programming (XP) arose in the 1990s to cope with:
    • Ever-changing requirements
    • Internet time
      • Six-month iterations were longer than the lifespan of the average dot-com
    • Web-based delivery: it's possible to “ship” a new version whenever you want
  • Basic ideas:
    • Work in very short iterations: days or weeks, not months
    • Write the tests before the code they are meant to exercise
    • Program in pairs, so every line gets a second pair of eyes
    • Refactor continuously, and integrate everyone's work at least daily

Pitfalls

  • Requires a lot of self-discipline to stop it degenerating into pure hackery
  • Funding agencies are understandably reluctant to fund a project whose deliverables will be made up along the way
    • On the other hand, this is a good description of research…
  • Not as well suited to large projects or teams
    • But this is changing as web-based collaboration tools improve

Step 3: Analysis & Estimation

  • Next step is analysis & estimation (A&E)
    • How can each feature be implemented?
    • And how long will it take?
  • Where possible, investigate two or more options
    • Plan A: only solve three quarters of the problem, but can be implemented in a week
    • Plan B: does everything and more, but will take three months
  • Write throw-away code to become familiar with new libraries and tools
    • Keep it under version control
    • But do not let it find its way into the application

Step 4: Prioritization

[Ranking Features]

Figure 26.5: Ranking Features

  • Now it's time to prioritize
    • Which features are most cost-effective to develop?
    • There's never time to do them all
  • Usual way to do this is to build a 3×3 grid
    • Rank each feature low-medium-high on importance and effort
    • More honest than the false precision of a 1-10 scale
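
The ranking-and-filtering step can be sketched in a few lines of Python; the feature names and rankings below are invented examples, and the rule of keeping only features on or above the grid's diagonal is made explicit:

```python
# Rank each feature low/medium/high on importance and effort, then
# keep only those on or above the diagonal (importance >= effort).
RANK = {"low": 0, "medium": 1, "high": 2}

def worth_doing(features):
    """Return the features whose importance rank is >= their effort rank."""
    return [name for name, (importance, effort) in features.items()
            if RANK[importance] >= RANK[effort]]

features = {
    "read CAD files":  ("high", "medium"),
    "real-time plots": ("low", "high"),
    "batch mode":      ("medium", "low"),
}

print(sorted(worth_doing(features)))  # ['batch mode', 'read CAD files']
```

The coarse three-level scale keeps the comparison honest: the code only ever asks "is this feature's importance at least as high as its cost?"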

Step 5: Scheduling

  • Can now draw up a schedule
    • Throw out everything below the diagonal of the priority matrix
  • Only big choice remaining is whether to do big items first, or little ones
    • Remember to take dependencies into account
  • End result is a list of who's doing what, when
    • Schedule people at 80% of capacity to allow for sick time, interruptions, etc.
  • Yes, it contains a lot of guesswork…
    • …but it's better than nothing…
    • …and estimates improve with practice

Step 6: Development

  • Now it's time to test and code
    • Remember to do them in this order
  • Expect to refine design during early stages of construction
    • If you're still refining the design a week before you're due to ship, something has gone wrong
  • Take time to refactor old code while adding new stuff
    • Your skills (and coding style) improve over time
      • Or the person working on the feature in Version 3.2 knows something the Version 3.1 author didn't
    • The problem changes over time
      • A good solution to last year's requirements may not be a good solution to this year's
  • Record day-to-day activities in the team's shared tools (wiki, issue tracker, etc.)

Tracking Progress

  • Make sure the schedule is always up to date
    • Every developer writes a few bullet points every week
    • Doing this at 9:00 a.m. Monday works better than asking for it at 4:45 on Friday
  • Describe tasks in terms of verifiable deliverables
    • Things that other people can inspect or test
  • Always mark tasks as “done” or “not done”, rather than “X% complete”
    • If you allow percentages, then many tasks will be 90% done for 90% of the lifetime of the project
    • Instead, break tasks down into subtasks that are at most a few days long, and either are or are not completed

Step 7: Finishing

  • Stop adding new features three-quarters of the way through the project
    • No matter how much testing you do as you go along, you'll need time to fix things at the end
  • Shift resources into integration testing and documentation
  • If you're only starting to build the installer now, you've left it too late
    • Installation and upgrade code can be as complex as the application itself
    • Design them, and budget time for them, when writing the A&E
  • Do not ask for a “big push”
    • People can only be productive for 40 hours a week [Robinson 2005]
    • Any more than that, and the mistakes they make will actually cost you time overall

After the Party's Over

  • Always do a post mortem after the project finishes
    • What went right (that you want to do again)?
    • What went wrong (that you want to avoid next time)?
    • Often helps to bring in an outsider to facilitate
      • Feedback is only as useful as it is honest
  • Update the A&Es to reflect what was actually built
    • Forces team members to examine what they got wrong (so that they can improve)
    • Provides a starting point for the next round of development

Development Summary

  • BDUF and XP are diametrically opposed, but both improve productivity
  • So either the way most people develop software is the worst possible…
  • …or what really matters is having a process—any process—so that you have some rules to play by…
  • …and something to improve

Debugging

  • You're going to spend half your professional life debugging
    • So you should learn how to do it systematically
  • Talk about some simple rules
  • Then two common debugging tools

Agans' Rules

  • Many people make debugging harder than it needs to be by:
    • Not going about it systematically
    • Becoming impatient
    • Using inadequate tools
  • Agans' Rules [Agans 2002] describe how to apply the scientific method to debugging
    • Observe a failure
    • Invent a hypothesis explaining the cause
    • Test the hypothesis by running an experiment (i.e., a test)
    • Repeat until the bug has been found

Rule 0: Get It Right the First Time

  • The simplest bugs to fix are the ones that don't exist
  • Design, reflect, discuss, then code
    • “A week of hard work can sometimes save you an hour of thought.”
  • Design and build your code with testing and debugging in mind
    • Minimize the amount of “spooky action at a distance”
    • Minimize the number of things programmers have to keep track of at any one time
    • Train yourself to do things right, so that you'll code well even when you're tired, stressed, and facing a deadline
  • “Everyone knows that debugging is twice as hard as writing a program in the first place. So if you're as clever as you can be when you write it, how will you ever debug it?” (Brian Kernighan)

Rule 1: What Is It Supposed to Do?

  • First step is knowing what the problem is
    • “It doesn't work” isn't good enough
    • What exactly is going wrong?
    • How do you know?
    • You will learn a lot by following execution in a debugger and trying to anticipate what the program is going to do next
  • Requires you to know how the software is supposed to behave
    • Is this case covered by the specification?
    • If not:
      • Do you have enough knowledge to extrapolate?
      • Do you have the right to do so?
  • Try not to let what you want to see influence what you actually observe

Rule 2: Is It Plugged In?

  • Are you actually exercising the problem that you think you are?
    • Are you giving it the right test data?
    • Is it configured the way you think it is?
    • Is it the version you think it is?
    • Has the feature actually been implemented yet?
    • Why are you sure?
      • Maybe the reason you can't isolate the problem is that it's not there
  • Another argument in favor of automatic regression tests
    • Guaranteed to rerun the test the same way each time
  • Also a good argument against automatic regression tests
    • If the test is wrong, it will generate the same misleading result each time

Rule 3: Make It Fail

  • You can only debug things when they go wrong
  • So find a test case that makes the code fail every time
    • Then try to find a simpler one
    • Or start with a trivially simple test case that passes, then add complexity until it fails
  • Each experiment becomes a test case
    • So that you can re-run all of them with a single command
    • How else are you going to know that the bug has actually been fixed?
  • Use the scientific method
    • Formulate a hypothesis, make a prediction, conduct an experiment, repeat
    • Remember, it's computer science, not computer flip-a-coin
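
As a sketch of turning each experiment into a rerunnable test, here is a minimal unittest file; `parse_header` is a hypothetical function standing in for whatever code is under investigation:

```python
import unittest

def parse_header(line):
    """Hypothetical function under investigation: split 'key: value'."""
    key, _, value = line.partition(":")
    return key.strip(), value.strip()

class TestParseHeader(unittest.TestCase):
    # Start with a trivially simple case that passes...
    def test_simple(self):
        self.assertEqual(parse_header("name: value"), ("name", "value"))

    # ...then keep every experiment as a test, so a single command
    # (python -m unittest) reruns all of them after each change, and
    # you know the bug is actually fixed -- and stays fixed.
    def test_missing_colon(self):
        self.assertEqual(parse_header("no colon here"), ("no colon here", ""))
```
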

Rule 4: Divide and Conquer

  • The smaller the gap between cause and effect, the easier the relationship is to see
  • So once you have a test that makes the system fail, use it to isolate the faulty subsystem
    • Examine the input of the code that's failing
    • If that's wrong, look at the preceding code's input, and so on
  • Use assert to check things that ought to be right
    • “Fail early, fail often”
    • A good way to stop yourself from introducing new bugs as you fix old ones
  • When you do fix the bug, see whether you can add assertions to prevent it reappearing
    • If you made the mistake once, odds are that you, or someone, will make it again
  • Another argument against duplicated code
    • Few things are as frustrating as fixing a bug, only to have it crop up again elsewhere
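
A sketch of assertion-guarded code (`rescale` is a hypothetical example): the asserts document what the function assumes and what it guarantees, and make it fail early if a later "fix" breaks either:

```python
def rescale(values, lower, upper):
    """Hypothetical example: map values linearly onto [lower, upper]."""
    # Check what the caller must guarantee ("fail early")...
    assert values, "need at least one value"
    assert lower < upper, "output range is empty"
    lo, hi = min(values), max(values)
    assert lo < hi, "cannot rescale a constant sequence"
    scale = (upper - lower) / (hi - lo)
    result = [lower + (v - lo) * scale for v in values]
    # ...and what this function promises in return, so a bad change
    # fails right here instead of three modules downstream.
    assert min(result) == lower and max(result) == upper
    return result

print(rescale([1, 2, 3], 0.0, 1.0))  # [0.0, 0.5, 1.0]
```
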

Rule 5: One Change at a Time, For a Reason

  • Replacing random chunks of code is unlikely to do much good
    • If you got it wrong the first time, what makes you think you'll get it right the second? Or the ninth?
    • So always have a hypothesis before making a change
  • Every time you make a change, re-run all of your tests immediately
    • The more things you change at once, the harder it is to know what's responsible for what
    • And the harder it is to keep track of what you've done, and what effect it had
    • Changes can also often uncover (or introduce) new bugs

Rule 6: Write It Down

  • Science works because scientists keep records
    • “Did left followed by right with an odd number of lines cause the crash? Or was it right followed by left? Or was I using an even number of lines?”
  • Records particularly useful when getting help
    • People are more likely to listen when you can explain clearly what you did

Rule 7: Be Humble

  • If you can't find it in 15 minutes, ask for help
    • Just explaining the problem aloud is often enough
    • “Never debug standing up.” (Gerald Weinberg)
  • Don't keep telling yourself why it should work: if it doesn't, it doesn't
    • Never debug while grinding your teeth, either…
  • Keep track of your mistakes
    • Just as runners keep track of their time for the 100 meter sprint
    • “You cannot manage what you cannot measure.” (Bill Hewlett)
  • And read [Zeller 2006] to learn more

Common Debugging Tools

  • Print statement
    • Easy to use, but…
  • Symbolic debugger
    • Very powerful, but…

What's Wrong with Print Statements

  • Many people still debug by adding print statements to their programs
  • It's error-prone
    • Adding print statements is a good way to add typos
    • Particularly when you have to modify the block structure of your program
  • And time-consuming
    • All that typing…
    • And (if you're using Java, C++, or Fortran) all that recompiling…
  • And can be misleading
    • Moves things around in memory, changes execution timing, etc.
    • Common for bugs to hide when print statements are added, and reappear when they're removed
  • But can be extremely effective
    • May be added using the same tools as programming
    • Can collect lots of data in a single run

Symbolic Debuggers

  • A debugger is a program that runs another program on your behalf
    • Sometimes called a symbolic debugger because it shows you the source code you wrote, rather than raw machine code
  • While the target program (or debuggee) is running, the debugger can:
    • Pause, resume, or restart the target
    • Display or change values
    • Watch for calls to particular functions, changes to particular variables, etc.
  • Do not need to modify the source of the target program!
    • Depending on your language, you may need to compile it with different flags
  • And yes, the debugger modifies the target's layout in memory, and execution speed…
    • …but a lot less than print statements…
    • …with a lot less effort from you
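
For Python programs, the standard symbolic debugger is pdb. A sketch of both ways to use it; the script name, line number, and variables here are invented examples:

```python
# Command-line session (shown as comments):
#
#   $ python -m pdb simulate.py grid.dat   # run the target under pdb
#   (Pdb) break 42                         # pause when line 42 is reached
#   (Pdb) continue                         # run until the breakpoint fires
#   (Pdb) p grid[x][y]                     # display a value
#   (Pdb) step                             # execute one more line
#   (Pdb) where                            # show the call stack
#
# Or drop into the debugger from the code itself, right where the
# suspicious case occurs:
import pdb

def update(grid, x, y):
    if grid[x][y] < 0:      # the case we want to inspect
        pdb.set_trace()     # pauses here with a (Pdb) prompt
    grid[x][y] += 1
```

Note that neither style requires editing the rest of the program, which is the point of the preceding slide.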

Debugging Summary

  • Debugging is not a black art
  • Like medical diagnosis, it's a skill that can be studied and improved
  • You're going to spend a lot of time doing it: you might as well learn how to do it well

Optimization

  • Does my program work properly?
    • Think about optimization during design
    • Get your program to work before optimizing
  • How do I make my program run faster?
    • Where is my program spending all its time?
    • What can I do about it?
    • Is it worth your time?
  • First rule of optimization: “Measure, measure, measure.”
    • Don't guess!
    • Performance bottlenecks are often in unexpected parts of the code
    • It's not just how slow a particular function is, but also how many times that function is called
    • If you speed up code that accounts for 10% of the run time by a factor of ten, you cut total run time by only 9%; speed up code that accounts for 50% of the run time by a factor of two, and you cut total run time by 25%
    • Moral: optimize the right section of code
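
The arithmetic behind that moral is easy to check; this helper is just an illustration, not part of any library:

```python
def time_saved(fraction, local_speedup):
    """Fraction of total run time eliminated when `fraction` of the
    run time is sped up by a factor of `local_speedup`."""
    return fraction - fraction / local_speedup

# A tenfold improvement to a 10% hot spot saves only 9% overall...
print(round(time_saved(0.10, 10), 2))   # 0.09
# ...while a twofold improvement to a 50% hot spot saves 25%.
print(round(time_saved(0.50, 2), 2))    # 0.25
```
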

Execution Profile

  • The execution profile of a program is a description of its run-time behavior
    • Different inputs generate different profiles
    • Sections of code that consume more computation time than others are known as “hot spots”
  • Use a profiler to identify hot spots for optimization
    • A profiler collects statistics on the execution profiles, e.g.,
      • counts the number of times a function is called
      • tracks how long the calls take
    • Data collection makes the profiled program run slower than normal, sometimes a lot slower
  • Python 2.5 has three different profilers
    • profile
    • cProfile
    • hotshot
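
A minimal profiling session using cProfile's object interface (the functions being profiled are invented examples; the API shown is the one still present in current Python):

```python
import cProfile
import pstats

def hot():
    """Invented hot spot: called many times, does real work."""
    return sum(i * i for i in range(100000))

def main():
    for _ in range(20):
        hot()

profiler = cProfile.Profile()
profiler.enable()
main()
profiler.disable()

# The report lists call counts and times per function, so the hot
# spot is measured rather than guessed at.
pstats.Stats(profiler).sort_stats("cumulative").print_stats(5)
```

The same data can be collected without editing the program at all by running `python -m cProfile script.py` from the command line.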

Speeding Things Up

  • Replace hot spots with faster code
    • If a collection of data will be searched repeatedly, instead of a linear list, use a sorted list or a dictionary
    • There is usually a tradeoff between bookkeeping overhead and search speed when using “faster” data structures, e.g.,
      • Keeping a list sorted
      • Managing a dictionary
    • Which data structure is “best” depends on the data
  • Restructure the entire program
    • Take a different approach to solving your problem
    • Throw hardware at “embarrassingly parallel” problems
      • If you have access to a computing cluster, and the problem can be partitioned into multiple jobs easily, use one CPU per job to improve performance
      • Embarrassingly parallel tasks include:
        • Processing multiple independent data sets
        • Repeating simulations with different initial conditions
      • Write a shell script to:
        • Partition big job into a bunch of little jobs
        • Run the jobs, either directly or by submitting them to a batch job queue
        • Wait for all jobs to complete
        • Collate results
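
The search-speed tradeoff above can be sketched directly; all three containers below hold the same data, but the per-lookup costs differ:

```python
from bisect import bisect_left

data = list(range(100000))

# Linear search over a plain list: O(n) per lookup.
def in_list(items, x):
    return x in items

# Binary search over a sorted list: O(log n) per lookup, but you
# pay bookkeeping to keep the list sorted as it changes.
def in_sorted(items, x):
    i = bisect_left(items, x)
    return i < len(items) and items[i] == x

# Set (or dictionary): O(1) average per lookup, at the cost of the
# extra memory for the hash table.
lookup = set(data)

print(in_list(data, 99999), in_sorted(data, 99999), 99999 in lookup)
```

Which container is "best" depends on how often the data changes versus how often it is searched, which is why the measurement rule applies here too.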
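
The shell-script recipe above can also be written in Python with the standard multiprocessing module; `simulate` here is a hypothetical stand-in for one independent piece of work:

```python
from multiprocessing import Pool

def simulate(seed):
    """Hypothetical independent job: a tiny deterministic 'simulation'."""
    total = 0
    for i in range(1000):
        total += (seed * i) % 7
    return seed, total

if __name__ == "__main__":
    conditions = range(8)                       # partition the big job
    with Pool() as pool:                        # run: one worker per CPU by default
        results = pool.map(simulate, conditions)  # wait for all jobs to complete
    print(dict(results))                        # collate the results
```

Because the jobs share no state, adding CPUs scales the throughput almost linearly, which is exactly what "embarrassingly parallel" means.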

Optimization Summary

  • Measure, Measure, Measure
  • Use efficient data structures when possible
  • Take advantage of multiple processors