Exposing the Right Details: A Tale of Two Audiences
A few days ago I had a discussion on Twitter with @jacobian (Jacob Kaplan-Moss) about the supposed implicit failure of auto-generated documentation tools in Ruby: RDoc, YARD, and the like. The originating quote was from Jacob, based on his talk at GitHub’s recent CodeConf (transcribed by @titanous):
Jonathan, Jacob and I had a nice discussion on Twitter following that post that led to some startling discoveries about the way others view, or misinterpret, API docs. I didn’t get to see his talk, so I can only go by what was said on Twitter, but I’m fairly sure I got the gist of it (git it?). I should point out that I think it’s a huge step in the right direction that there are people who are interested in talking about documentation failures at all, whether they are right or wrong. That said, I think the thoughts expressed by the anti-auto-generated-doc-tool crowd are a little misguided, and here’s why:
Guide Mode vs. Reference Mode
The very simple answer is that there is not just one form of documentation. In fact, there are at least two. The form of documentation Jacob is pushing is "guide" style documentation: high-level docs that describe the overall behaviour of a system. Generally, guides are read from "cover to cover", so to speak, in a linear fashion. They tell a story, and ultimately show you what the tool is capable of. Indeed, making good looking, readable (and "usable") documentation is a great way to go about writing high-level guides for your tools, but there is a fairly big flaw with overusing this technique, and that flaw has to do with the information guides don’t cover.
This is where API docs come in. API docs describe what each component in the system is responsible for in a comprehensive manner. Their purpose is not to answer when or why each component is used, but rather how it is used. API docs are not read like guides, and in fact, API docs are rarely ever read in full. They act as references for users of "code", not users of "libraries" or "tools". API docs answer the question: "what methods are in class X?", not "what does library Foo do?"
These are two very distinct types of documentation intended for two very distinct audiences. You have people who need to know what to use your tool for, and you have people who need to know how to use your tool. Neither form of documentation captures both of these details, and these two audiences are not often interested in both topics. You can think of these users as being in guide mode or reference mode (respectively) depending on what information they are looking for. I’ll be using these two terms from here on out to describe both of our audiences.
Jacob is certainly making a valid point that we need more (and better) high level guides, but the problem is that he doesn’t seem to believe there is any value to reference documentation. The way he sees it, these auto generated doc tools are just creating poor guides. Well he’s right, they are. The problem is they’re not making guides, they’re making API docs. Without acknowledging this distinction, it’s easy to confuse the problem and assume that the solution is just to write more prose. The problem is that writing more prose helps you with audience A ("guide mode" users), but does absolutely nothing for audience B ("reference mode" users). You have to help them both.
The real answer is that it’s not about choosing one or the other form; ultimately, we need both guides and API docs to create proper documentation for both of our audiences. My goal is to remind the budding documenteurs out there that your library or tool likely has more than one kind of audience, but before we start talking solutions, let’s look at what we really lose by ignoring either type of user.
A hundred years ago (pre-internet, in case you forgot), when someone wanted to find out when Abraham Lincoln was born, they didn’t buy a Lincoln biography from the book store. Most likely, they went to a library and looked it up in an encyclopedia (of course assuming they had no interest in being social and just asking somebody else). They, quite literally, didn’t want to hear his life story, but just one small piece of data. They were in "reference mode". This analogy summarizes the different types of users we deal with in code as well.
Just as with somebody searching through an encyclopedia for specific information, there are many users of code that are only interested in specific details of your library / framework / tool. Reading through an entire guide, just like reading through an entire biography, is neither efficient nor easy if all you need is one sentence. And while Lincoln’s birthday is likely on page 1 of any of his biographies, the readers only looking to learn the specific details of his assassination likely didn’t get much information until much later on in the book. Similarly, a guide might be very effective at providing the obvious details up front, but it is much less efficient when a user needs more obscure information. More importantly, and unlike anything relating to Lincoln’s life, code has many seemingly obscure facets that many different people want to know about. If a biography omitted something obscure like Abraham Lincoln’s favourite food, not too many readers would care. On the other hand, omitting something such as the mixins used by a class might seem trivial to the author, but it can mean the loss of vital information for a user who is attempting to subclass a major component in your library.
The question becomes: if we are deciding what our users need to know, how do we know what they need to know? It sounds like an insane question, but from my own experience I’ve found that the person writing the documentation is often the last person who knows what the user needs to know. Forcing your users to rely solely on guides, where you control the information they can pull out, can spell really bad news for the documentability of your tool / library.
The Wrong Details
There is something innately beautiful about designing software with OOP. The goal is to describe modular, re-usable components; components that are meant to be repurposed, extended, subclassed, formed in arbitrary composition relationships, and so on. Ultimately, every highly re-usable component you write is in itself an API that is meant for the whole world to see. Yes, even those internal classes that you might think are completely irrelevant to the overall functionality of your system; you never know when someone will find a way to re-use that code. I’ve always looked at the code I write in this respect. The idea that you’re designing components for the whole world to use leads to better design, and by extension, better quality; oddly enough it also tends to yield better documentation! On the flipside, hiding those components by omitting them from documentation robs your users of functionality. Even if the functionality of the overall system is nothing like the components that make it up, each component provides a unique functionality to the users of your tool / library.
But if we write everything to be re-used, how do we document this potentially enormously complex body of code? Will guides really capture all of the right details? It’s fairly clear from existing projects that guides don’t capture these critical OO details that many developers rely on. Jacob brought up the Python templating framework Jinja as his example of beautiful documentation. It certainly has beautiful guide-based docs, and has some great information about how to use the library, but let’s say I put my hacker hat on and I wanted to extend or subclass a certain arbitrary core class: how would it be done? What dependency classes would I expect to come across? Is that information guaranteed to be present in the guides? Admittedly, I don’t know enough about Jinja to claim whether it does a good or bad job at this, but I can only guess that there are a number of cases where the existing high-level docs wouldn’t give me what I want (say, for example, giving me all inherited methods of a class I might be extending).
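To make these "reference mode" details concrete, here is a minimal Ruby sketch of the kind of information a guide rarely lists: the ancestor chain (including mixins) and the methods a subclass inherits. The names `Auditable`, `Component` and `Widget` are hypothetical and not taken from Jinja or any other library mentioned here; only Ruby's built-in reflection is used.

```ruby
# Hypothetical mixin and classes, invented purely for illustration.
module Auditable
  def audit_log
    "audited #{self.class.name}"
  end
end

class Component
  def render
    "<component>"
  end
end

class Widget < Component
  include Auditable

  def resize(width, height)
    [width, height]
  end
end

# Ancestor chain: the class itself, then its mixins, then its superclasses,
# stopping before Object so the core Ruby ancestors are left out.
def ancestor_chain(klass)
  klass.ancestors.take_while { |mod| mod != Object }
end

# Public methods available on instances but defined further up the chain,
# excluding everything that every Ruby object already has.
def inherited_methods(klass)
  (klass.public_instance_methods(true) -
   klass.public_instance_methods(false) -
   Object.public_instance_methods(true)).sort
end

p ancestor_chain(Widget)    # [Widget, Auditable, Component]
p inherited_methods(Widget) # [:audit_log, :render]
```

Someone subclassing `Widget` would want both of these lists in front of them, and neither one fits naturally into a guide.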
Managing Complexity, or, "It Doesn’t Scale"
In the end, writing guides manually works out great for small black box tools. In fact, in this sense, Jacob’s views mesh very well with Tom Preston-Werner’s thoughts on "README driven development". Unsurprisingly, I had similar misgivings about RDD. The problem with writing READMEs and guides is that it assumes your solution is simple enough to define in natural language prose. As we will see, not all of the information people look for in documentation is effectively communicated in natural language.
Once you start developing a larger, more open-ended white box system (say a framework like Rails) where every OO component is fair game for re-use, each individual component starts to matter as much as (or more than) the overall functionality. In traditional black-box ORM libraries, we wouldn’t care how the library code hit the database internally. However, you don’t need to look any further than ActiveRecord in Rails to see that an inordinate number of developers have found value in digging into its guts and repurposing its internals. AR is hardly a black box to most plugin developers. Adapting the library to work with different backends has been the goal of too many plugins to count. The most important part? When these plugins were created, there were no Rails guides devoted to this topic. Simply put, the Rails maintainers didn’t expect their framework to be used in this fashion. Certainly they could not have predicted the popularity of Redis or MongoDB. And even if they had, writing a guide on "extending AR" is a huge beast. Where do you begin? Explaining every feature of AR’s internals with fancy prose is just way too much effort. API docs, on the other hand, are fairly cheap to put together, and at least users have a decent overview of logical components. I guess that explains why Rails still does not have guides on extending AR, even though it is commonly extended.
Just Go "Read The Source"!
Now before you say, "read the source", let’s clarify one thing: this is exactly why we have API docs. API docs give an overview of source code so that we don’t have to dig through implementation details. Just like guides, source code also exposes the wrong details. Code focuses on method bodies and algorithms, but that’s not what we’re looking for. We want to see the exposed methods (preferably only the "public" ones), inheritance chain, inherited methods, parameter and return types. It’s very inefficient to look for these details by digging through source files (especially when it comes to things like inheritance chains and inherited methods). This is where automated tooling comes into play. Tools like YARD or RDoc do the digging for us and compile that information in one place, offering a much more informative overview. Personally, I’ll go with the exact words Jacob used to describe his CodeConf talk: "if it’s not documented, I won’t use it". A logical extension is, "if it’s not documented, I won’t dig through your source code to find it".
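As a rough illustration of what such tools work from, the sketch below annotates a class with YARD-style tags (`@param`, `@return`, `@raise`). The `Account` class and its methods are hypothetical; the tags are the part a tool like YARD would turn into parameter and return type listings, alongside the method and inheritance details it discovers on its own.

```ruby
# Hypothetical class, used only to show YARD-style annotations.
class Account
  # @return [Integer] the current balance in cents
  attr_reader :balance

  # @param balance [Integer] the opening balance in cents
  def initialize(balance = 0)
    @balance = balance
  end

  # Withdraws money from the account.
  #
  # @param amount [Integer] the amount to withdraw, in cents
  # @return [Integer] the balance remaining after the withdrawal
  # @raise [ArgumentError] if the amount exceeds the current balance
  def withdraw(amount)
    raise ArgumentError, "insufficient funds" if amount > @balance

    @balance -= amount
  end
end

account = Account.new(1_000)
p account.withdraw(250) # 750
```

A reader in reference mode gets the signature, types and failure case from the tags alone, without having to read the method body.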
So where does that leave guides? Clearly they can’t solve all of our documentation problems. But why? What’s the problem with all that writing? More writing is supposed to always be good, no?
It’s Not Just About Prose, Bros
To understand why more is not always better, we have to understand who is reading our docs. Again, while new users will likely look to guides for high level overviews that answer the "what" question, users who are more comfortable with the codebase will come back to the docs looking for how. These people aren’t interested in another spiel about "what" (presumably they already read a lot of it), they are in reference mode. They want efficient access to information. For these users, less prose is better.
Of course, the reason we don’t often use words in traffic signals is mostly due to the economy of space, but it’s also fairly well known that symbols, shapes, and colours often communicate short bits of information much more effectively than text or speech. For example, "context" tends to be a fairly ambiguous term used to describe any state encapsulation. Jinja’s library code uses a Context class to perform this type of state encapsulation, but for what? We could read the prose, but just reading the method list tends to give us a pretty good idea of what it does. At least in my experience, reading a list of methods in a class can often give me a very good idea of what the responsibility of a class is, often more quickly, and occasionally even more clearly, than the prose that accompanies it.
Here’s a good example. Can you guess what kind of class is described by the following diagram?
I’ll give you some time to guess! Here’s a puppy:
Did you say Active Record (or some kind of "model" class)? If so, you’d be right. You probably got that from the fact that there are insert and update methods alongside properties. Heck, you could probably even figure out what the class name is: Person. Actually, you probably already knew this if you’ve ever picked up Martin Fowler’s Patterns of Enterprise Application Architecture book, because that image is right from his site describing, you guessed it, the Active Record pattern. You probably even know exactly how those insert and update methods work (and what they do); and I didn’t give you a single word of prose describing that class. Cool, huh?
This kind of trick certainly doesn’t work for every class, but I’ll bet it would work out for a surprising number of them. In fact, as a fun exercise, try this out! Pick a project you don’t know very well, load up a random file (don’t peek at the title!), try to avoid looking at the class name, and figure out what the class does just by looking at the methods. Describe, in your own words, what the class does, and what it should be named. Reply on Twitter if you got some surprising results!
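If you want to try this exercise in Ruby without peeking at the source, a minimal sketch using the language's built-in reflection might look like the following. The `Mystery` class is a hypothetical stand-in for whatever class you pick.

```ruby
# Hypothetical class standing in for "a class you don't know".
class Mystery
  def initialize
    @items = []
  end

  def push(item)
    @items.push(item)
    self
  end

  def pop
    @items.pop
  end

  def peek
    @items.last
  end

  def empty?
    @items.empty?
  end
end

# List only the public methods the class defines itself,
# ignoring anything inherited from its ancestors.
def method_list(klass)
  klass.public_instance_methods(false).sort
end

p method_list(Mystery) # [:empty?, :peek, :pop, :push]
```

Even with the name hidden, that four-item list strongly suggests a stack, and no prose was needed to get there.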
Clearly we don’t always need natural language to understand a concept. It often helps, and I’m certainly not arguing that we should be writing less documentation, but we as the prose-writers have to realize that auto-generated portions like method lists, class hierarchies, and other data dumps often give us as much information as we need, if not more. Continue to document your classes, modules, and methods, but remember that in API docs, they are meant to accompany (and explain) the real meat: that auto-generated goodness we see in our documentation tools. API docs are not meant to tell the user "what" the library does; at this point they’ve likely read the relevant guides and already know. API docs are where we tell them "how".
Hopefully I was able to clarify the distinction between the types of documentation your users are looking for. Some users want guides, but others want references. You shouldn’t mix these up. More importantly, you shouldn’t assume that API docs are guides, or that guides are a replacement for API docs. They’re not. They are capturing different details for different people. If you take anything away from this, it should be:
1. Know Your Audience
Not all information is created equal to all users. Your new users will likely look to guides to get started. But, and this applies to larger white box systems, users will eventually come back for more details in reference mode. They shouldn’t have to dig through source code for this information. You should have some second set of documentation for this separate audience. Make sure that you write for each respective audience in each respective mode. It’s understandable that you don’t need to bore users with information about inheritance chains in high level guides. However, users looking for reference material are likely looking for this kind of data. Your guides should be focused on the "what", and your API docs should be focused on the "how". You can assume the users of your API docs know how your system is structured; if not, you should refer them to your high level guides for this overview.
2. You Likely Need Both Forms of Documentation
Again, if your system is a small black box tool, this probably does not apply to you as significantly. However, remember that we write modular software with re-use in mind. Don’t intentionally rob your users of this re-use. Realize that even if your tool is a tiny black box, if it has a large user base, you will inevitably have users who want to decompose and re-purpose the most obscure of components. Make sure you support these users too, especially since they are the ones helping you push your tool forward.
3. It’s Not Just About Looking Good
Avoid the temptation to merge these two forms together in a single guide style document. Rocco does this, and Jinja’s docs, as we saw, try to do this too. It’s fine to describe classes and methods in a fair amount of detail in guides, but don’t try to convince yourself that this is sufficient. Remember, to put that information in your guides, you had to throw away a lot of information that might have cluttered your guide and made it less readable. This is lost information that some users might be relying on.
It’s not supposed to be easy. As you may have noticed from reading this, writing good documentation is littered with edge cases: users who need obscure information, and authors who have no idea what users need to know in the first place. It’s a complicated iterative process. Ultimately, you need to intimately know who you’re writing for, and find out what they want to know. What works for your project may not include any API docs, or might not even require any guides at all! One thing is clear though: putting a blanket statement over RDoc-style documentation tools and saying that they never capture the right details is wrong. Write more guides if that’s what your users want, but don’t throw out your API docs unless you can really prove they have no value. In the end, good documentation is not just about how easy it is to read through, or how pretty the page is. Good documentation is about how much value your users gain from what they read, in whatever form it is presented.
Who Is Doing It Right?
If you want a model for good documentation, look at Apple’s Cocoa docs. I’ve always been a fan of their documentation, and YARD’s templates are heavily inspired by their Obj-C docs. They maintain both guides and references for many different topics. Their guides show high level details and are full of the prose we come to expect from good technical documentation, but they don’t skimp on the details when it comes to the API. Take a look at NSString. It’s fairly light on the prose, but very in depth on the details of the class. Note that relevant guides are linked right at the top of their API docs, and the guides link back to these API docs. Apple understands that it’s not about having one form of documentation or the other, it’s about having both. This is a model we should adopt in our respective communities, be it Ruby, Python, or other.