YARD Object Oriented Diffing

By Loren Segal on June 06, 2010

Yesterday I tweeted a bunch about a new command in the upcoming YARD 0.6 release called yard diff, which lets you perform object-aware diffs across two versions of a project/library/gem. I wanted to summarize exactly what’s going on, and why I wrote it, in a short article on the subject.

Object Diffing Use Case

While writing docs for 0.6 I realized that there were many new methods added in this version that apply to the API. For a plugin developer, this is great news, but for a developer trying to maintain backwards compatibility with older versions of YARD (if that’s your thing), knowing what’s new and what’s been around since version X (and what’s gone) is important information.

To be fair, it’s not that big of a deal with YARD, since it’s easy to get someone to upgrade the gem. However, we’ve seen this problem a lot with Ruby 1.8.6 vs. 1.8.7. The 1.8.7 core library introduced a ton of new methods: things like improved enumeration support, Symbol#to_proc, and a bunch of Array methods. Problem is, if you’re targeting 1.8.6, you have to be very careful about which methods you choose; the Ruby docs don’t tell you which version introduced them.

Testing is great, but that’s a really slow way to find out you have a bug. A better way would be to immediately know if the method you’re about to use is going to break under a version of Ruby (or library) you’re supporting. Prevention versus detection, and all that jazz.

Using the @since Tag

So in YARD 0.6 I’m going through every method added in the last year of releases and tagging each one with a @since tag. I’m embarrassed to admit that, although YARD has supported this tag since the first alpha in 2007, I’ve never used it. I guess it’s not that strange; new projects don’t really have a need for such a tag. But now that YARD is turning 0.6, it’s matured quite a lot, to the point where there are multiple sets of APIs that differ slightly per major release. Now’s the right time to start adding the tags.
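For anyone who hasn’t seen the tag before, here is a minimal sketch of how it sits in a docstring. The Greeter class and the version number are made up for illustration; they are not taken from YARD’s own source.

```ruby
# Hypothetical class, used only to show where the tag goes.
class Greeter
  # Builds a greeting for the given name.
  #
  # @param [String] name the name to greet
  # @return [String] the greeting text
  # @since 0.6.0
  def greet(name)
    "Hello, #{name}!"
  end
end

puts Greeter.new.greet("YARD")
```

The tag is just another line in the comment block above the method, so adding it to existing docs doesn’t touch any code.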

But I haven’t been keeping track of these methods, so how do I find them?

Finding Added Methods, Modules and Classes

I quickly wrote the yard diff tool to solve just that problem. It performs a diff on two versions of (in my case) the yard library but at an OOP level instead of a file level. This means I don’t have to dig through file diffs to find added objects manually.
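To make the idea of an object-level diff concrete, here is a rough sketch that treats each version as a plain list of object paths and compares them as sets. This is not how yard diff is implemented internally; the object_diff helper and the Foo paths are assumptions invented for the example.

```ruby
require 'set'

# Hypothetical helper (not YARD's implementation): compares two lists of
# object paths and reports which ones were added or removed.
def object_diff(old_paths, new_paths)
  old_set = Set.new(old_paths)
  new_set = Set.new(new_paths)
  {
    :added   => (new_set - old_set).to_a.sort,
    :removed => (old_set - new_set).to_a.sort
  }
end

# Made-up object paths for two versions of an imaginary library.
old_version = ["Foo", "Foo#bar", "Foo#legacy"]
new_version = ["Foo", "Foo#bar", "Foo#baz", "Foo.create"]

diff = object_diff(old_version, new_version)

puts "Added objects:"
diff[:added].each { |path| puts "  #{path}" }
puts "Removed objects:"
diff[:removed].each { |path| puts "  #{path}" }
```

Because the comparison works on names like Foo#baz rather than on lines of text, moving a method to a different file doesn’t show up as a change at all.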

The other cool thing about this tool is that you don’t need to have the library installed (you can point at a .yardoc or even a .gem file). Heck, you don’t even have to have the library on your system. If you don’t, the tool will automatically fetch it from the canonical rubygems.org source. Let’s see what the diff looks like for a library like rack (1.1.0 to 1.2.1):

yard (0.6-master)$ yard diff --verbose rack-1.1.0 rack-1.2.1
[info]: Searching for .yardoc db at rack-1.1.0/.yardoc
[info]: Searching for .yardoc db at rack-1.1.0
[info]: Searching for installed gem rack-1.1.0
[info]: Found rack-1.1.0
[info]: Searching for .yardoc db at rack-1.2.1/.yardoc
[info]: Searching for .yardoc db at rack-1.2.1
[info]: Searching for installed gem rack-1.2.1
[info]: Searching for local gem file rack-1.2.1.gem
[info]: Searching for remote gem file http://rubygems.org/downloads/rack-1.2.1.gem
[info]: Expanding rack-1.2.1.gem to /var/folders/iM/iMnHTcihFnSx5AEP308nO++++TM/-Tmp-/rack-1.2.1.gem...
[info]: Cleaning up /var/folders/iM/iMnHTcihFnSx5AEP308nO++++TM/-Tmp-/rack-1.2.1.gem...
[info]: Found rack-1.2.1
Added objects:

  Rack::ETag#digest_body
  Rack::Handler::WEBrick.shutdown
  Rack::Lint#verify_content_length
  Rack::Recursive#_call
  Rack::Request#options?
  Rack::Request#trace?
  Rack::Utils#rfc2822
  Rack::Utils::ESCAPE_HTML
  Rack::Utils::ESCAPE_HTML_PATTERN

Removed objects:

  ActiveModel (...)
  ActiveSupport (...)
  Array (...)
  Benchmark (...)
  BigDecimal (...)
  BlankSlate (...)
  Builder (...)
  ...

yard (0.6-master)$

As you can see from the verbose output, the tool searches for the gems in various places, including our locally installed gems as well as online. As you can also see, a few new methods were added to Rack since 1.1.0, and a bunch of things were removed (passing -a shows all the methods inside those removed “things”).

If we have a library that depends on rack for one of these methods, we know we can’t support 1.1.0 unless we change our code. Similarly, if our code uses one of those removed methods/modules, we won’t be able to use the new library until we update our codebase. That’s important for our library’s documentation (and gem version requirements). This is also an output that could very well be included in our version ChangeLogs. A library could theoretically run a small script that prepends this diff to each ChangeLog before a release. That would be a pretty good idea.

$ yard diff lastversion.gem . | cat - ChangeLog > /tmp/c && mv /tmp/c ChangeLog
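If you would rather do the prepending from a Ruby script (say, inside a release task), a rough equivalent might look like the sketch below. The prepend_to_file helper is a made-up name, and the diff text is hard-coded here; in a real script it would be whatever yard diff printed.

```ruby
require 'tmpdir'

# Hypothetical helper: writes new_text in front of whatever the file
# already contains (or creates the file if it does not exist yet).
def prepend_to_file(path, new_text)
  existing = File.exist?(path) ? File.read(path) : ""
  File.open(path, "w") { |file| file.write(new_text + existing) }
end

# Demonstrate on a throwaway ChangeLog so no real file is touched.
result = Dir.mktmpdir do |dir|
  changelog = File.join(dir, "ChangeLog")
  File.open(changelog, "w") { |file| file.write("Older entries\n") }

  # Stand-in for captured diff output (made-up object path).
  prepend_to_file(changelog, "Added objects:\n\n  Foo#baz\n\n")
  File.read(changelog)
end

puts result
```

Reading the old contents before reopening the file for writing matters here, for the same reason the shell version goes through a temporary file instead of redirecting straight onto ChangeLog.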

Conclusion

That’s one of the many cool little feature additions in YARD 0.6. With the new pluggable yard command tool, it should be easier for developers to write custom YARD commands that perform neat little auditing tasks like diffing, linting, testing, etc., all in one command. Although the ability to diff is nothing new (YARD has been able to do this for many versions), the integration should make it a lot easier to use. And because it ships out of the box with YARD 0.6, developers get better tools to document with, and no excuse not to improve their docs.

Questions? Comments? Follow me on Twitter (@lsegal) or email me.