Notes on Working Effectively with Legacy Code

These are my notes on Michael Feathers’ excellent book Working Effectively with Legacy Code.

Definition of Legacy Code

Legacy code is code without tests. Why such a strange definition?

The biggest problem in adding a feature to unfamiliar, poorly-organized code is not in ensuring your new feature works, but in ensuring that the existing features still work.

A unit test serves as a fast, localized regression test. You could have a giant system test but it would be neither fast nor localized. It would take ages to run and, if it went wrong, you wouldn’t be able to pinpoint the bug. A unit test, on the other hand, runs fast and, when it fails, tells you exactly which function had the problem.

This regression testing alone gives you a safety net within which to make changes, whether adding features or refactoring the design.

But that’s not all. Testing forces you to break unnecessary dependencies between different parts of the system. In other words, it forces you to minimize the interface between modules.

How come? A large interface will lead to a correspondingly large pain when trying to set up a test instance. For example, if your calculator’s arithmetic evaluation function reads directly from the GUI and writes directly to it, you’re going to have a tough time sending in the input or getting back the output. You would have to figure out how set up the GUI in a test environment and programmatically send in inputs (so that you don’t have to click on buttons yourself) and programmatically get outputs. This will naturally encourage you to break it up so that the evaluation function takes in a simple string (better yet, an abstract syntax tree) and returns a number. That way you can programmatically send in your test input AST and compare against the expected output number. In short, what you wanted to test was the arithmetic evaluation, but what you were forced to test in the large-interface code was all kinds of code before and after your target function. So, testing probably makes you deal with your target code as a first-class item instead of something buried deep within other code. (Why is being buried deep in the context of other code such a bad thing?)

Question: How are you “breaking dependencies” when you minimize code before and after your target code?

Much more to come.

Testing Code Smells

Test: Nested functions are a pain to test!

Take this nested definition of n-queens:

let rec queens_positions n =
  let rec helper m = match m with
    | 0 -> [[]]
    | m -> let rest = helper n (m-1)
           in
           let len = if rest = [] then 0 else rest |> hd |> length
           in
           map (fun x -> len + 1, x) (range [] n)
           |> fun xs -> apply_over_lists (fun x y -> x::y) xs rest []
           |> filter (fun p -> match p with | [] -> false | pos::others -> for_all (is_safe pos) others)
  in
  helper n |> map (map snd)
;;

I kept getting the output of queens_positions 2 as [[1; 2]] when it should have been []. Clearly, the problem lay somewhere in the helper function. Because this was a recursive function, I looked at queens_positions 0 and queens_positions 1, both of which worked as expected. And when I walked through the code for queens_positions 2 using the known values for the recursive calls, I got the right answer. But when I asked the interpreter, it gave the wrong answer.

What was happening was that helper was giving different values for the recursive calls. I suspect it was because of the captured variable n. In any case, I couldn’t test it for different inputs because it was hidden within queens_positions. I had to rely on guess about the inputs and outputs of helper. Pain in the ass.

Contrast that to:

let rec helper n m = match m with
  | 0 -> [[]]
  | m -> let rest = helper n (m-1)
         in
         let len = if rest = [] then 0 else rest |> hd |> length
         in
         map (fun x -> len + 1, x) (range [] n)
         |> (fun xs -> apply_over_lists (fun x y -> x::y) xs rest [])
         |> filter (fun p -> match p with | [] -> false | pos::others -> for_all (is_safe pos) others);;

let rec queens_positions n = helper n n |> map (map snd);;

Here, helper is a first-class function.

Created: July 28, 2019

Last modified: August 24, 2019

Status: in-progress notes

Tags: notes, legacy