Evolution of a Project

You work in a lab that writes software applications and distributes them for free. All you ask in return is that users register if they find the software useful.

In the Beginning...

You want to:

  • Create a web form for users to register, with the data being saved to a tab-separated-value (TSV) file, one record per line.
  • Collect the e-mail addresses into a mailing list for announcements such as new versions of applications.
  • Create a time plot to see how the registration rate trends.
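
A minimal sketch of these first tasks follows. The column layout (name, e-mail, registration date) and the sample records are assumptions made for illustration, and the counts per month stand in for the data behind a time plot rather than drawing one.

```python
import csv
import io
from collections import Counter

# Hypothetical layout: name, e-mail, registration date (YYYY-MM-DD).
SAMPLE_TSV = (
    "Ada\tada@example.org\t2024-01-15\n"
    "Bob\tbob@example.com\t2024-01-20\n"
    "Cy\tcy@example.net\t2024-02-03\n"
)

def read_registrations(stream):
    """Return one list of fields per non-empty line of the TSV stream."""
    return [row for row in csv.reader(stream, delimiter="\t") if row]

def mailing_list(rows):
    """Collect the unique e-mail addresses, sorted."""
    return sorted({row[1] for row in rows})

def registrations_per_month(rows):
    """Count registrations per 'YYYY-MM' prefix of the date field."""
    return Counter(row[2][:7] for row in rows)

rows = read_registrations(io.StringIO(SAMPLE_TSV))
emails = mailing_list(rows)
per_month = registrations_per_month(rows)
print(emails)
print(dict(per_month))
```

Note that every function here reaches into the rows by position, which works only as long as the file layout never changes.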

Then You Want to Know More...

Detailed information, such as what types of computers your users have, may help you prioritize development projects.

  • Change registration to collect more data.
  • Create a pie chart to see the distribution of computer types.
  • Create a multi-line time plot to see the changing popularity of different types of computers.

Then Your Funding Agency Wants to Know More...

Your funding agency wants to know how many of your users are also funded by them.

  • Change registration to collect even more data.
  • Generate a report of funding sources, broken down by user e-mail domain.

Of Course...

  • With every additional requirement, you must keep all the programs working, both new and old.
  • There will be more changes coming.

The Brute Force Solution

Approach

Write a script for every task.

Benefits

It is a simple workflow.

Issues

For every change, all scripts must be updated to accommodate the latest design. The more scripts you have, the bigger this task gets.
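
To illustrate, here is a small sketch that assumes a hypothetical format change in which a computer-type column is inserted before the date. A script that reads fields by position keeps running on the new records but quietly returns the wrong field, so every such script has to be found and fixed.

```python
import csv
import io

# Hypothetical version 1 layout: name, e-mail, date.
V1_LINE = "Ada\tada@example.org\t2024-01-15\n"
# Hypothetical version 2 layout: name, e-mail, computer type, date.
V2_LINE = "Bob\tbob@example.com\tLinux\t2024-02-03\n"

def registration_date(line):
    """Brute-force style: the column position is hard-coded in the script."""
    row = next(csv.reader(io.StringIO(line), delimiter="\t"))
    return row[2]

old_result = registration_date(V1_LINE)  # the date, as intended
new_result = registration_date(V2_LINE)  # the computer type, not the date
print(old_result, new_result)
```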

The Modular Solution

Approach

Divide and conquer:

  1. Identify subtasks that can be implemented independently (e.g., as Python modules). In our registration example, we might choose:
    • Data management, including version handling
    • Plot generation, for different media
  2. Design interfaces for each subtask.
    • Subtasks only use interfaces to access other subtasks.
    • Interfaces provide an abstraction of what the subtask does, not how it is done.
    • In fact, an interface may be implemented in any of several ways. Other subtasks cannot tell which one is being used, nor should they care.
  3. Write a script for each task by using interfaces of subtasks.
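
As one possible sketch of the data management subtask, the class below hides both the TSV parsing and the version handling behind a few methods. The class name, the method names, and the two record layouts are all assumptions chosen for illustration; a real design would pick its own.

```python
import csv
import io
from datetime import date

class RegistrationData:
    """Hypothetical data-management interface.

    Callers ask for what they need (dates, e-mails, computer types).
    How a line is parsed, and which format version it uses, stays in here.
    """

    def __init__(self, stream):
        self._records = [self._parse(row)
                         for row in csv.reader(stream, delimiter="\t") if row]

    @staticmethod
    def _parse(row):
        # Assumed layouts: v1 = name, e-mail, date;
        # v2 = name, e-mail, computer type, date.
        if len(row) == 3:
            name, email, when = row
            computer = None
        elif len(row) == 4:
            name, email, computer, when = row
        else:
            raise ValueError(f"unrecognized record: {row!r}")
        return {"name": name, "email": email, "computer": computer,
                "date": date.fromisoformat(when)}

    def emails(self):
        """Unique e-mail addresses, sorted."""
        return sorted({r["email"] for r in self._records})

    def dates(self):
        """Registration dates, in file order."""
        return [r["date"] for r in self._records]

    def computer_types(self):
        """Computer types for the records that carry one."""
        return [r["computer"] for r in self._records
                if r["computer"] is not None]

# A file that mixes old and new records (sample data, not real users).
SAMPLE = ("Ada\tada@example.org\t2024-01-15\n"
          "Bob\tbob@example.com\tLinux\t2024-02-03\n")
data = RegistrationData(io.StringIO(SAMPLE))
print(data.emails(), data.dates(), data.computer_types())
```

A task script built on this class never indexes into a row, so adding a third layout would mean changing only `_parse`.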

Benefits

  • Subtasks may be shared by multiple tasks without duplicating code.
  • Each subtask is self-contained and may be tested independently (unit testing).
  • Each task script can be written using subtask interfaces (e.g., "the date of the registration records"), without having to deal with subtask implementation details (e.g., "the TSV file is turned into a list of tuples").
  • When a change only affects a subtask, only the subtask needs to be updated. As long as the subtask interface is maintained, no other subtask even knows that anything has changed.
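
The sketch below illustrates the last two points under assumed names: two hypothetical implementations expose the same `dates()` method, and the task code produces the same result with either one. The closing comparison doubles as a very small unit test.

```python
from collections import Counter
from datetime import date

class TupleStore:
    """One hypothetical implementation: records kept as (e-mail, date) tuples."""
    def __init__(self, records):
        self._records = list(records)

    def dates(self):
        return [when for _email, when in self._records]

class DictStore:
    """Another hypothetical implementation: records kept as dictionaries."""
    def __init__(self, records):
        self._records = [{"email": e, "date": d} for e, d in records]

    def dates(self):
        return [r["date"] for r in self._records]

def registrations_per_month(store):
    """Task code: relies only on the dates() interface."""
    return Counter((d.year, d.month) for d in store.dates())

# Sample data, not real users.
SAMPLE = [("ada@example.org", date(2024, 1, 15)),
          ("bob@example.com", date(2024, 1, 20)),
          ("cy@example.net", date(2024, 2, 3))]

counts_a = registrations_per_month(TupleStore(SAMPLE))
counts_b = registrations_per_month(DictStore(SAMPLE))
assert counts_a == counts_b  # the task cannot tell the two stores apart
print(dict(counts_a))
```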

Issues

  • To get the most benefit, you need to start with a modular approach.
  • It is possible to switch over from a brute-force approach (by refactoring existing code), but it is more work.
  • There is more initial effort in the design.