YARD Milkshakes (0.2.3) Released

By Loren Segal on June 06, 2009

Today marks the first release of YARD in almost a year. Although technically speaking the new release isn’t that much more featureful than 0.2.2, this release marks a pretty huge milestone in YARD’s development. It turned out to be extremely difficult (for me) to keep up development with a full school schedule. YARD sets out to accomplish a heck of a lot more than one person can handle on a part-time basis, but the fact that it’s starting to come together is really what is worth celebrating here.

New Parser, Stronger Code

That said, there are still some significant improvements in the 0.2.3 release. Plenty of stability and performance fixes were added, and a completely new parser for Ruby 1.9 was implemented based on the new Ripper parser that comes with the 1.9 standard library. Those who have been using YARD know that it has its own homegrown parser built from the ground up. While I’m impressed that I was even able to write a Ruby parser that works 80% of the time, it was quite lacking in both speed and robustness. Although Natalie Weizenbaum (of Haml fame) contributed some great refactoring and quality improvement patches to the parser, it still has some pretty obvious limitations.

The new 1.9-only parser will be the face of YARD going forward, especially as people start migrating to Ruby 1.9 themselves. The 1.8.x parser is still supported and maintained as far as bug fixes go, though new functionality should probably not be expected.
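To give a feel for what Ripper provides, here is a minimal sketch using the standard library’s `Ripper.sexp` and `Ripper.lex` calls. It is not YARD’s parser: the `method_names` helper and the sample source are made up for illustration, and the sketch only shows that both the syntax tree and the comments can be read from the same source string.

```ruby
require 'ripper'

# Sample source to parse (illustrative only, not taken from YARD).
SOURCE = [
  "# Greets the caller.",
  "def hello(name)",
  "  name",
  "end"
].join("\n") + "\n"

# Hypothetical helper: walk the nested arrays returned by Ripper.sexp
# and collect the name of every :def node.
def method_names(node, found = [])
  return found unless node.is_a?(Array)
  # A :def node looks like [:def, [:@ident, "hello", [line, col]], ...]
  found << node[1][1] if node[0] == :def && node[1].is_a?(Array)
  node.each { |child| method_names(child, found) }
  found
end

# Ripper.sexp returns the syntax tree as nested arrays.
tree = Ripper.sexp(SOURCE)

# Ripper.lex returns tokens; element 1 is the token type, element 2 the text.
comments = Ripper.lex(SOURCE)
                 .select { |tok| tok[1] == :on_comment }
                 .map { |tok| tok[2].strip }

puts method_names(tree).inspect
puts comments.inspect
```

A documentation tool needs both pieces: the tree says what was defined, and the comment tokens carry the text that describes it.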

If you’re not yet convinced that Ruby 1.9 is worth the trouble for your project, consider these benchmark results of the new ripper parser parsing the YARD codebase twice:

yard (master)$ ruby -v
ruby 1.9.1p0 (2009-01-30 revision 21907) [i386-darwin9.6.0]
yard (master)$ ruby benchmarks/ripper_parser.rb 
Rehearsal -------------------------------------------------
rip-parser      1.930000   0.020000   1.950000 (  2.004864)
yard-parser     4.700000   0.020000   4.720000 (  4.712828)
---------------------------------------- total: 6.670000sec

                    user     system      total        real
rip-parser      1.760000   0.000000   1.760000 (  1.763015)
yard-parser     4.720000   0.020000   4.740000 (  4.755574)

The comparison here is between the new parser used when you run YARD in Ruby 1.9 versus my old pure-Ruby implementation. The difference is significant: a speed improvement of nearly 2.7x. Keep in mind that both of these are running in 1.9, so we’re not even comparing the speed benefit you get from using YARV over MRI. That result is actually:

yard (master)$ ruby -v
ruby 1.8.7 (2009-01-28 patchlevel 99) [i686-darwin9.6.0]
yard (master)$ ruby benchmarks/ripper_parser.rb 
Rehearsal -------------------------------------------------
yard-parser     7.820000   0.030000   7.850000 (  7.859455)
---------------------------------------- total: 7.850000sec

                    user     system      total        real
yard-parser     8.070000   0.020000   8.090000 (  8.114952)

Note that I had to modify the benchmark script to ignore the ripper parser here.
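The output format above is what the standard library’s `Benchmark.bmbm` prints (a rehearsal pass followed by the real one). The actual `benchmarks/ripper_parser.rb` script is not reproduced here, so what follows is only a sketch of how such a script could skip the Ripper case automatically when Ripper is unavailable. The `regex_parse` method and the synthetic `SOURCES` are hypothetical stand-ins, not YARD code.

```ruby
require 'benchmark'

# Ripper ships with Ruby 1.9 and later; on 1.8 the require fails.
begin
  require 'ripper'
  HAVE_RIPPER = true
rescue LoadError
  HAVE_RIPPER = false
end

# Hypothetical stand-in for a hand-written, pure-Ruby parser.
def regex_parse(source)
  source.scan(/^\s*def\s+(\w+)/).flatten
end

# Synthetic input so the sketch runs anywhere; a real benchmark
# would read the project's .rb files instead.
SOURCES = (1..200).map { |i| "def method_#{i}(x)\n  x + #{i}\nend\n" }

# Runs a rehearsal pass and a measured pass, returning one
# Benchmark::Tms object per report.
def run_benchmark(sources)
  Benchmark.bmbm(12) do |bench|
    if HAVE_RIPPER
      bench.report('rip-parser') { sources.each { |src| Ripper.sexp(src) } }
    end
    bench.report('regex-parser') { sources.each { |src| regex_parse(src) } }
  end
end

results = run_benchmark(SOURCES)
```

Guarding the `require` means the same script can run under both interpreters without hand edits; the timings it prints say nothing about YARD itself, since the inputs and the second parser are placeholders.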

So, in Ruby 1.8.x, the fastest YARD can parse its own codebase twice is roughly 8 seconds. Compare this to running YARD in Ruby 1.9.1 (previous results), which does the same work over 4 times faster. This becomes significant on large projects like Rails, which takes 11 seconds to parse all of its code in 1.9, but roughly 65 seconds under Ruby 1.8.7. 65 seconds is a long time to wait to parse code, and that doesn’t even count HTML generation, which takes even longer.
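For anyone who wants to check the ratios, the arithmetic can be reproduced from the “real” column of the measured (non-rehearsal) runs and the Rails timings quoted above. The constant and method names below are just labels chosen for this sketch.

```ruby
# Wall-clock ("real") seconds from the measured runs shown earlier.
RIPPER_ON_19 = 1.763015  # new Ripper-based parser, Ruby 1.9.1
LEGACY_ON_19 = 4.755574  # old pure-Ruby parser, Ruby 1.9.1
LEGACY_ON_18 = 8.114952  # old pure-Ruby parser, Ruby 1.8.7

# Rails parse times quoted in the text, in seconds.
RAILS_ON_19 = 11.0
RAILS_ON_18 = 65.0

# How many times faster the `fast` run is than the `slow` run.
def speedup(slow, fast)
  (slow / fast).round(2)
end

puts speedup(LEGACY_ON_19, RIPPER_ON_19)  # new parser vs. old, both on 1.9: about 2.7
puts speedup(LEGACY_ON_18, LEGACY_ON_19)  # old parser, 1.8 vs. 1.9: about 1.7
puts speedup(LEGACY_ON_18, RIPPER_ON_19)  # 1.8 vs. 1.9 with the new parser: about 4.6
puts speedup(RAILS_ON_18, RAILS_ON_19)    # Rails, 1.8 vs. 1.9: about 5.9
```

The second figure isolates the interpreter change alone, and the third combines it with the parser change.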

Plenty of New Documentation

I’ll be the first to admit that, for a documentation tool, YARD is surely lacking in documentation. I take this seriously, but at the same time, I’m only one person and there’s a lot to write about. I made it a point this time around to spend this entire past week brushing up the documentation to make YARD easier to jump into. YARD now has a great Getting Started guide which should cover the really basic aspects of how to get around. More importantly, as a developer tool, there is also a hefty Technical Overview guide covering all of the major components of the project and how a developer could take advantage of them. In this release alone I’ve written roughly 7,948 new words of documentation, and I’ve barely scratched the surface. YARD has tremendous potential and tons of use cases, so writing about it all will take time, but it will happen.

Looking Forward

So in the end, this release was very important. The parser was something that was on the table for a long time, dating back to my first roadmap to 1.0 post last year. Now that it’s complete, we can start looking ahead. As more of the groundwork is properly laid out, it becomes easier to move forward. The next release will be focusing on drastically improving the quality of the HTML templates, so expect bigger changes in the coming months.

By the way, if you didn’t notice, YARD got a new site face-lift, so check out the site.

Questions? Comments? Follow me on Twitter (@lsegal) or email me.