30 Dec 2016
We Spend Most of Our Time on Maintenance
Look at the budget spent on your software projects. Most of it goes towards maintenance. The Mythical Man-Month by Fred Brooks states that over 90% of the costs of a typical system arise in the maintenance phase, and that any successful piece of software will inevitably be maintained. Facts and Fallacies of Software Engineering by Robert L. Glass reports that maintenance typically consumes 40% to 80% (averaging 60%) of software costs.
From our own experience and the literature, we can conclude that maintenance is perhaps the most important part of developing software. In this article we'll explore why Haskell shines in maintenance.
The Five Bases of Maintenance
Based on the article Software Maintenance by Chris Newton, I'm going to write about five bases for doing software maintenance:
- Readability: The source code is comprehensible, and describes the domain well to the reader.
- Testability: The code is friendly to being tested, via unit tests, integration tests, property tests, code review, static analysis, etc.
- Preservation of knowledge: Teams working on the software retain the knowledge of the design and functioning of the system over time.
- Modifiability: The ease with which we can fix, update, refactor, adapt, and generally mechanically change the code.
- Correctness: The software is constructed in a self-consistent way, by using means of combination that rule out erroneous cases that maintainers shouldn't have to deal with.
We'll see below what Haskell brings to the table for each of these bases.
Readability
The source code is comprehensible, and describes the domain well to the reader.
Reduce state: Developers must hold a "state of the world" in their head when understanding imperative and object-oriented source code. In Haskell, which is a pure functional language, developers only have to look at the inputs to a function, making it far easier to consider a portion of code and to approach working on it.
Narrowing the problem space: A rich type system like Haskell's guides less-experienced developers, or newcomers to the project, to the right places, because the domain can be modeled in types, which formally narrow down the problem. Developers can literally define problems away, turning their attention to the real problems of your business's domain.
Coupling where it counts: Haskell's type system supports modeling the cases of a problem, coupling each case (such as: logged in/logged out) with the values associated with that case (such as: user session id/no session id). Developers have fewer variables to hold in their heads, and can concentrate on your business logic instead.
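To make the logged in/logged out example concrete, here is a minimal sketch. The names SessionId, Session and greeting are hypothetical and chosen only for illustration.

```haskell
-- Hypothetical types for illustration; none of these names come from a
-- real library.
newtype SessionId = SessionId String
  deriving (Eq, Show)

-- A session id exists only in the LoggedIn case, so there is no
-- "logged out but still holding a session id" state to defend against.
data Session
  = LoggedOut
  | LoggedIn SessionId
  deriving (Eq, Show)

-- Pattern matching gives access to the session id only where it exists.
greeting :: Session -> String
greeting LoggedOut = "Please log in"
greeting (LoggedIn (SessionId sid)) = "Welcome back, session " ++ sid
```

A design with a separate boolean flag and a nullable id would allow four combinations, two of them meaningless. The sum type allows exactly the two that make sense.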
Encapsulation: As in object-oriented languages (Java, C++, Ruby, Python), encapsulation in Haskell allows developers to hide irrelevant details when exposing the interfaces between modules, leaving other developers fewer details to worry about.
Testability
The code is friendly to being tested, via unit tests, integration tests, property tests, code review, static analysis, etc.
Explicit inputs: Haskell programs are easy to write tests for, because they are composed of pure functions. A pure function either needs no particular conditions set up before your developers run it, or takes those conditions as explicitly defined inputs.
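As a small sketch of what an explicit input looks like, consider a check for whether a token has expired. The function name and the use of plain Ints for seconds are assumptions made to keep the example short.

```haskell
-- Hypothetical example: times are plain Ints (seconds) to keep the sketch
-- small. The current time is an explicit argument rather than something
-- read from the system clock, so the result depends only on the inputs.
isExpired :: Int -> Int -> Int -> Bool
isExpired now issuedAt lifetime = now - issuedAt >= lifetime
```

A test can call isExpired 100 0 60 or isExpired 30 0 60 directly, with no clock to fake and no environment to prepare.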
Mock the world: With excellent support for embedded domain-specific languages (DSLs), Haskell empowers developers to write programs in an imperative fashion which can then be interpreted as a real-world program (interacting with file I/O, using time, etc.) or as a mock program which does nothing to the real world but compute a result. This is valuable for testing the business logic of the software without having to set up a whole real environment just to do so.
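One way to sketch this idea, using only the base library, is a tiny hand-rolled DSL with two interpreters. Program, ask, say, greet, runIO and runPure are all hypothetical names; real projects often reach for a library-based approach instead, so treat this as an illustration of the principle rather than a recommended design.

```haskell
-- A tiny embedded DSL: a program is either finished, waiting for a line
-- of input, or emitting a line of output before continuing.
data Program a
  = Done a
  | ReadLine (String -> Program a)
  | WriteLine String (Program a)

instance Functor Program where
  fmap f (Done a)        = Done (f a)
  fmap f (ReadLine k)    = ReadLine (fmap f . k)
  fmap f (WriteLine s p) = WriteLine s (fmap f p)

instance Applicative Program where
  pure = Done
  pf <*> pa = pf >>= \f -> fmap f pa

instance Monad Program where
  Done a        >>= f = f a
  ReadLine k    >>= f = ReadLine (\s -> k s >>= f)
  WriteLine s p >>= f = WriteLine s (p >>= f)

ask :: Program String
ask = ReadLine Done

say :: String -> Program ()
say s = WriteLine s (Done ())

-- Business logic written once, in imperative style.
greet :: Program Int
greet = do
  say "What is your name?"
  name <- ask
  say ("Hello, " ++ name)
  pure (length name)

-- Real interpreter: talks to the terminal.
runIO :: Program a -> IO a
runIO (Done a)        = pure a
runIO (ReadLine k)    = getLine >>= runIO . k
runIO (WriteLine s p) = putStrLn s >> runIO p

-- Mock interpreter: consumes canned input lines and collects the output.
-- When the canned input runs out, reads yield the empty string.
runPure :: [String] -> Program a -> (a, [String])
runPure _      (Done a)        = (a, [])
runPure (i:is) (ReadLine k)    = runPure is (k i)
runPure []     (ReadLine k)    = runPure [] (k "")
runPure is     (WriteLine s p) = let (a, out) = runPure is p in (a, s : out)
```

In production, runIO greet interacts with the terminal. In a test, runPure ["Ada"] greet returns the result together with the list of lines the program would have printed, without touching the real world.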
Automatically test properties: Haskell's type system supports generating thousands of valid inputs to a function with very little effort, in order to test that every output of the function is correct. Inputs for anything from parsers to financial calculations to state machine transformations can be generated and tested.
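Here is a minimal sketch of a property test using the QuickCheck library, which is a separate package and needs to be installed. The run-length encoder under test (encode, decode) and the helper checkRoundTrip are hypothetical examples, not part of QuickCheck.

```haskell
import Test.QuickCheck

-- Hypothetical functions under test: run-length encoding and its inverse.
encode :: String -> [(Char, Int)]
encode [] = []
encode (c:cs) =
  let (same, rest) = span (== c) cs
  in (c, 1 + length same) : encode rest

decode :: [(Char, Int)] -> String
decode = concatMap (\(c, n) -> replicate n c)

-- Property: decoding an encoded string gives back the original.
-- QuickCheck works out from the type that it must generate Strings.
prop_roundTrip :: String -> Bool
prop_roundTrip s = decode (encode s) == s

-- Check the property against 5000 randomly generated strings.
checkRoundTrip :: IO Result
checkRoundTrip =
  quickCheckWithResult stdArgs { maxSuccess = 5000 } prop_roundTrip
```

In GHCi, quickCheck prop_roundTrip runs the same property with the default number of cases. If the property fails, QuickCheck reports a shrunk counterexample.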
Static analysis: It may go without saying, but Haskell's static type system eliminates whole classes of bugs and helps maintain invariants while the software changes, giving continuous feedback to the developer. A level up from Java, C++, or C#, Haskell's purity and rich type system are able to check a far greater region of source code, and with greater precision.
Taking testing seriously: Haskell has a large number of testing libraries, covering standard unit testing (like JUnit or RSpec), web framework-based testing, property-based testing (like QuickCheck) and other randomly generated testing, documentation testing, concurrency testing, and mock testing.
Preservation of knowledge
Teams working on the software retain the knowledge of the design and functioning of the system over time.
Model the domain precisely: Because Haskell's rich type system lets your developers model the domain precisely and in a complete way, it's easier for the same developers to return months or a year from now, or new developers to arrive, and gain a good grasp of what's happening in the system.
Modifiability
The ease with which we can fix, update, refactor, adapt, and generally mechanically change the code.
Automatic memory management: Haskell is high-level with automatically managed memory, like Python or Ruby. Unlike C or C++, it does not suffer from the memory corruption or leaks that arise when developers make changes to your system and mistakenly mismanage memory by hand.
Automate completeness: As mentioned in the readability section, Haskell allows developers to define data types as a set of cases that model the business domain logic, from simple things like results (success/fail/other) to finite state machines. Along with this comes the ability for the compiler to statically determine when a case is missing and tell your developers, who then need to go and handle it. This is extraordinarily useful when changing and extending a system.
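A short sketch of the success/fail/other case, assuming GHC with the -Wincomplete-patterns warning enabled (it is also part of -Wall). PaymentResult and describe are hypothetical names.

```haskell
{-# OPTIONS_GHC -Wincomplete-patterns #-}

-- Hypothetical result type for illustration.
data PaymentResult
  = Succeeded
  | Failed String   -- reason for the failure
  | Pending
  deriving (Eq, Show)

-- Every case is handled explicitly, with no catch-all pattern. If a new
-- constructor (say, Refunded) is added later, GHC warns that this match
-- is non-exhaustive and points at this function.
describe :: PaymentResult -> String
describe Succeeded       = "Payment succeeded"
describe (Failed reason) = "Payment failed: " ++ reason
describe Pending         = "Payment is pending"
```

Avoiding a wildcard pattern is the design choice that matters here: a catch-all would silently absorb any newly added case. Teams that want the warning to block a build can additionally compile with -Werror.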
Break up the problem: Haskell's pure functions only depend on their parameters, and so any expression can be easily factored out into its own function. Breaking a problem down into smaller pieces lets maintainers deal with one piece at a time, taking fewer things into account.
Encapsulate: Because encapsulation lets developers hide irrelevant details when exposing the interfaces between Haskell modules, they can change the underlying implementation of a module without its consumers having to change.
Decouple orthogonal concepts: In Haskell, unlike in popular object-oriented languages like Java or C++, data and behavior are not coupled together. A photograph is a photograph, and a printer knows how to print it; the photograph does not contain printing inside it. The data is the photograph, and the behavior is printing a photograph. Because these two are decoupled, developers can simply define the data that counts and freely add more behaviors later, without getting lost in object hierarchies and inheritance issues.
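The photograph example might look like the following sketch. The record fields and both functions are hypothetical, and printLabel merely builds a label string to stand in for real printing.

```haskell
-- Hypothetical data type: just the data, no methods attached.
data Photograph = Photograph
  { title  :: String
  , width  :: Int
  , height :: Int
  } deriving (Eq, Show)

-- Behaviors are ordinary functions, defined separately from the data...
printLabel :: Photograph -> String
printLabel p =
  title p ++ " (" ++ show (width p) ++ "x" ++ show (height p) ++ ")"

-- ...and new ones can be added later, even in another module, without
-- touching the definition of Photograph.
isLandscape :: Photograph -> Bool
isLandscape p = width p > height p
```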
Correctness
The software is constructed in a self-consistent way, by using means of combination that rule out erroneous cases that maintainers shouldn't have to deal with.
Correct combination: In Python, a whole new version of the language, Python 3, had to be implemented to properly handle Unicode text in a backwards-incompatible way. This broke lots of existing Python code and many large projects have still not upgraded. In Haskell, text and binary data are unmixable data types. They cannot be mistakenly combined, as they can be in Python 2 and many other languages. This throws a whole class of encoding issues out of the window, which leaves your developers less to worry about.
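As a sketch of how this separation looks in practice, the example below assumes the widely used text and bytestring packages, whose Text and ByteString types play the roles of text and binary data. The helper names toBytes and fromBytes are hypothetical.

```haskell
import qualified Data.ByteString as BS
import qualified Data.Text as T
import Data.Text.Encoding (decodeUtf8', encodeUtf8)

-- Text is Unicode text; ByteString is raw bytes. An expression that
-- appends one to the other is rejected by the type checker.

-- Moving from text to bytes requires naming an encoding explicitly.
toBytes :: T.Text -> BS.ByteString
toBytes = encodeUtf8

-- Decoding can fail on invalid input, and the return type says so.
fromBytes :: BS.ByteString -> Maybe T.Text
fromBytes bytes = either (const Nothing) Just (decodeUtf8' bytes)
```

Because every crossing between the two types goes through a function like these, the places where encoding decisions are made are visible in the code and easy to find during maintenance.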
Avoid multiple writers: In concurrent code, developers have to be very careful when more than one thread changes the same data. Imperative languages tend to allow any thread to change anything, so it's frighteningly easy to make mistakes. In Haskell, data structures are immutable, and a mutable "box" has to be created to share data between threads, ruling out a plethora of potential bugs.
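A minimal sketch of such a mutable box, using MVar from the base library. The helper countConcurrently is a hypothetical name, and MVar is only one of several options Haskell offers for sharing state between threads.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
import Control.Monad (forM_, replicateM, replicateM_)

-- Hypothetical helper: start `workers` threads that each add 1 to a
-- shared counter `times` times, then return the final count.
countConcurrently :: Int -> Int -> IO Int
countConcurrently workers times = do
  counter <- newMVar (0 :: Int)            -- the explicit mutable "box"
  dones <- replicateM workers newEmptyMVar -- one completion signal per worker
  forM_ dones $ \done -> forkIO $ do
    -- modifyMVar_ takes the value out, applies the update, and puts it
    -- back, so only one thread updates the counter at a time.
    replicateM_ times (modifyMVar_ counter (\n -> pure $! n + 1))
    putMVar done ()
  mapM_ takeMVar dones                     -- wait for every worker to finish
  readMVar counter
```

Only the value inside the box can be changed by more than one thread, and the type MVar Int makes that visible wherever the counter is passed around. Everything else in the program remains immutable.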
Summary
Maintenance is our biggest activity when developing successful software. Five bases make maintenance work better, and on each of them Haskell shines:
- Readability: Haskell's purity and type system lend themselves perfectly to comprehensible code.
- Testability: Haskell code is inherently more testable, due to being pure, safely statically typed, and coming with a variety of testing packages.
- Preservation of knowledge: A rich type system like Haskell's can model the domain so well that developers have to remember less, and educate each other less, saving time.
- Modifiability: Haskell's strong types, completeness analysis and purity ensure that when you break something, you know it sooner.
- Correctness: Developers can work within a consistent model of your domain, removing whole classes of irrelevant problems. Concurrent code is easier to maintain too.
All in all, Haskell shines in maintenance. While it has other novel features, it is for this reason that developers and companies are increasingly switching to it.