How to Keep your Essay Clean

Small Changes, Big Mess

To get a solid grasp of a subject, it’s not enough just to write down your thoughts about it. You have to organize them well. That, I feel, is what consumes a lot of time and is thus precisely what I don’t do. I have over a hundred thousand words in just three draft essays that cover most of the important ideas I’ve figured out in the last couple of years. However, it’s all a giant mess. I’ve repeated ideas over and over without even realizing it, I’ve outgrown some of them, and I’ve routinely scratched my head over problems I had already solved a year ago. This is basically a smelly code base and I need to clean it up.

What do I want from my essays? Getting ideas across to other people is secondary to me. What I really want is to be able to use these ideas whenever I need them in my own life. Currently, I fail to apply my own techniques because I don’t know which one to use in a given situation. I believe that organizing my ideas will give me a clear idea of what to do and reviewing flashcards based on these ideas will help me memorize all of them.

Disorder, Duplication, and Dilapidation

The main problem with organizing my essays, though, is that I get new ideas at random times, like when I’m working on a project and realize that I’m slowed down because I don’t have a REPL in which I can experiment with the language. I usually can’t afford the time to fit each new idea into its rightful place in my essay (and thus in my mental model). I always underestimate how long organizing my thoughts will take, and it ends up taking hours. So I instinctively shrink from such a time-consuming task and simply append the new idea to the end of my essay. The result is a bunch of sections that each make sense in isolation but don’t flow together.

Similarly, if I come up with a better version of an idea, I may not be able to afford the time to go back and update all my old prose. So, I end up with conflicting explanations within the same essay. Worse, I sometimes end up with duplicate explanations because I don’t even realize I’ve repeated myself. (And why does that happen? Because I haven’t organized my ideas so that they are easy to search through. It’s a vicious cycle.)

Lastly, and this one is a bit of a fantasy, I’d like to be forced to rigorously test my ideas. Too often, I read a book and jot down a simplistic idea like “Oral people recognize only what they can recall. Literate people recognize what they can refer to (given that they have organized that information well).” This is fine in the moment because I kind of understand what I mean and need to move forward with the text while I have some momentum. What is not fine is relaxing in the belief that I’ve cracked the problem entirely. I haven’t. I have yet to flesh out the categories beneath “recognize” and “recall” and “refer to” and so on. I need to make fine distinctions between oral speech and TV and radio and books and tweets. There has to be some mechanism that tells me how far I am from actually cracking a problem and forces me to test each branch of my causal model.

So, I need a solution that would allow me to add new ideas with ease, while preserving the structure of my model, updating any obsolete ideas, and thoroughly testing each new idea.

Test upon each Change

I suspect that running some quick, incremental tests upon each new addition will keep your essay clean without imposing much overhead when you make a small change. This is basically the idea behind continuous integration.

For example, if one of the tests you run on each release is a refactoring check, where you make sure your code is well-organized and free of duplication, then each release will ensure that your code is still well-organized and free of duplication. If another thing you check is whether the new code has been well-tested, each release will again ensure that your entire code base is well-tested.

Here are some tests I think will help clean up my essays if I run them every time I commit a new idea. Let’s see how it works.

Structure Checklist

Check whether you’ve clustered the relevant factors for the same technique and the same output dimension. Remove any duplicates. Organize techniques for easy lookup using the output dimension and input factors.

(Note that this kind of style and structure analysis is called “linting” in the programming community.)

This will solve the problem of duplicating ideas due to the difficulty of searching for existing ideas because the only type of change I will be able to make is to add a new relevant factor for a known algorithm and output dimension or add a new algorithm or output dimension. So, there will be just one, easy-to-find home for each exemplar or factor or algorithm or output dimension I come up with. If the exemplar or factor is new, I will put it in its rightful place, and if it’s old, I won’t change anything.
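
To make the "one home per idea" rule concrete, here is a minimal Python sketch, not a description of any tool I actually use. It assumes ideas can be stored as (output dimension, technique) pairs, each with a set of relevant factors; the class name `IdeaIndex`, its methods, and the sample entries are all hypothetical.

```python
from collections import defaultdict


class IdeaIndex:
    """Hypothetical index: one home per (output dimension, technique) pair."""

    def __init__(self):
        # Maps (output_dimension, technique) -> set of relevant factors.
        self._factors = defaultdict(set)

    @staticmethod
    def _normalize(text):
        # Crude normalization so trivial variants count as the same idea.
        return " ".join(text.lower().split())

    def add(self, output_dimension, technique, factor):
        """Add a factor; return True if it was new, False if a duplicate."""
        key = (self._normalize(output_dimension), self._normalize(technique))
        normalized = self._normalize(factor)
        if normalized in self._factors[key]:
            return False  # Old idea: change nothing.
        self._factors[key].add(normalized)
        return True  # New idea: filed in its one home.

    def lookup(self, output_dimension, technique):
        """Return the sorted factors recorded for this pair."""
        key = (self._normalize(output_dimension), self._normalize(technique))
        return sorted(self._factors.get(key, set()))


if __name__ == "__main__":
    index = IdeaIndex()
    print(index.add("speed of experimenting", "use a REPL", "language has a REPL"))
    print(index.add("Speed of experimenting", "Use a REPL", "Language has a  REPL"))
    print(index.lookup("speed of experimenting", "use a REPL"))
```

The string normalization only catches trivial duplicates (case and spacing); recognizing that two differently worded paragraphs express the same idea would still be a manual judgment.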

Testing Checklist

Check whether you’ve given a positive exemplar and close negative exemplar for something you claim is a relevant factor. If your output isn’t binary, give a positive exemplar for each output category. Also check if you’ve got a sufficient set of factors (somehow; haven’t yet figured this out).

(This is called “equivalence partitioning” and “boundary testing” in the programming community.)

This will solve the problem of rigorously testing my ideas, i.e., not accepting something as a causal factor when it has not been toggled in isolation.
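
As a rough sketch of the exemplar check, assuming each claimed factor comes with a dictionary mapping output categories to lists of exemplars: the function name `missing_exemplars` and the data layout are hypothetical, and the sample claim reuses the oral/literate note from earlier purely for illustration. The binary case uses the categories "positive" and "negative"; a non-binary output just passes a longer list of categories.

```python
def missing_exemplars(exemplars_by_category, categories=("positive", "negative")):
    """Return the output categories that have no exemplar yet.

    Hypothetical layout: exemplars_by_category maps a category name
    to a list of exemplar descriptions.
    """
    return [c for c in categories if not exemplars_by_category.get(c)]


def untested_factors(claims, categories=("positive", "negative")):
    """Map each under-tested factor to the categories it still lacks."""
    report = {}
    for factor, exemplars in claims.items():
        gaps = missing_exemplars(exemplars, categories)
        if gaps:
            report[factor] = gaps
    return report


if __name__ == "__main__":
    claims = {
        "information is organized for reference": {
            "positive": ["indexed notes: idea found again"],
            "negative": [],  # No close negative exemplar yet.
        },
    }
    print(untested_factors(claims))
```

This only checks that exemplars exist. It can't tell whether the negative exemplar is actually "close" (differing in just that one factor), and it says nothing about whether the set of factors is sufficient.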

Flashcard Checklist

You should have written a flashcard for every input-output configuration you’ve added.

This will help me remember my ideas when I’m away from my computer or when I need solutions quickly.
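
The flashcard check amounts to a set difference. Here is a minimal sketch, assuming each input-output configuration is an (input, output) tuple and each flashcard records the configuration it covers; both representations and the name `missing_flashcards` are hypothetical.

```python
def missing_flashcards(configurations, flashcards):
    """Return the configurations that no flashcard covers, in original order.

    Hypothetical layout: configurations is a list of (input, output) tuples;
    each flashcard is a dict with a "configuration" key holding such a tuple.
    """
    covered = {card["configuration"] for card in flashcards}
    return [config for config in configurations if config not in covered]


if __name__ == "__main__":
    configurations = [
        ("REPL available", "fast experimentation"),
        ("no REPL", "slow experimentation"),
    ]
    flashcards = [
        {
            "configuration": ("REPL available", "fast experimentation"),
            "front": "What does a REPL give you?",
            "back": "Fast experimentation with the language.",
        },
    ]
    print(missing_flashcards(configurations, flashcards))
```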

Automate the Checklist

I’ve automated this by modifying the Git commit template so that I’m forced to think about the above dimensions every time I commit.
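
As an illustration only (this is not the exact template), here is a small Python sketch that generates a commit template from the three checklists above. The wording of the prompts, the function names, and the file path are assumptions. Git can be pointed at such a file through its `commit.template` configuration setting, and with Git's default cleanup behavior, lines starting with `#` are stripped from the final commit message, so the prompts act as reminders rather than committed text.

```python
from pathlib import Path

# Hypothetical prompts paraphrasing the three checklists.
CHECKLIST = [
    "Structure: clustered with its technique and output dimension? No duplicates?",
    "Testing: positive exemplar and close negative exemplar for each factor?",
    "Flashcards: one flashcard for every input-output configuration added?",
]


def build_template(checklist):
    """Return template text: a blank subject area, then '#' reminder lines."""
    lines = ["", ""]  # Leave room for the subject line and a blank separator.
    lines += [f"# {item}" for item in checklist]
    return "\n".join(lines) + "\n"


def write_template(path, checklist=CHECKLIST):
    """Write the template to the given (hypothetical) path."""
    Path(path).write_text(build_template(checklist), encoding="utf-8")


if __name__ == "__main__":
    # Print rather than write, so running the sketch has no side effects.
    print(build_template(CHECKLIST), end="")
```

Because comment lines are only reminders, this relies on discipline; a stricter setup could reject commits that skip the checklist, but that goes beyond a template.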

Observation: [2019-01-20 Sun] It’s been fantastically successful! I’ve got a controlled experiment and flashcards for every new idea I’ve added. That’s in contrast to the zero controlled experiments and zero flashcards I had for years. (The only wrinkle is the “clustered” part because it’s usually not clear to me whether or not I’ve inserted the new idea in the right place. I need to clarify that.)

Created: May 19, 2018

Last modified: September 28, 2019

Status: in-progress

Tags: clean, writing