Archive for February, 2008

Practical YARD Examples 0001: Generating class diagrams with YARD & Graphviz

By Loren Segal on February 29th, 2008 at 2:15 AM

So, recently I was looking at refactoring the current internal structure of YARD to improve the developer API and general stability, and I figured I would draw out the class diagram as a first step in figuring out what needed to stay and what needed to go. Problem was, having been away from the project for so long, I completely forgot what YARD looked like on the inside. I did not want to dig through the source; I wanted to see the project from afar to keep my design from being influenced by the current layout (in terms of implementation details). I really needed to visualize the project from a high level.

Using the raw data file that yardoc generates when it is run, I mocked up a quick tool called yard-graph which allowed me to visualize my project using Graphviz. Note that this is the same functionality that I described any developer could take advantage of in YARD’s very first release. Doing things like this is what YARD is all about. The results aren’t quite perfect due to both the incompleteness of the graph tool and the limitations of Graphviz, like how hierarchical mode is extremely strict about hierarchy and likes to make extremely wide graphs, but they’re highly suitable for my needs; actually, I would call them quite excellent for my needs. Take a look!

Click for the full diagram

As an exercise, I also implemented the ability to show module dependencies (via include) and the public class API. This is what that looks like:

Click for the full class diagram. This one's a doozy.

For those interested, this code lives in the diagram branch on GitHub (yeah, I got a free invite and decided to try it out; I forgot / have no idea who invited me– please email me so I can thank you, by the way). After a rake install of the gem, you can run (in your root project directory, or maybe lib/ to ignore tests):

yard-graph --dependencies --empty-mixins --full | dot -T pdf -o diagram.pdf

This requires dot (part of Graphviz) and generates the diagrams with all of the options shown above: --dependencies for module dependencies and --full for class/instance methods, constants and attributes. The --empty-mixins flag shows empty modules as nodes instead of subgraphs (the delimiting boxes), since by default Graphviz ignores empty subgraphs. If you use modules as mixins and not just as a way to namespace classes, you’ll want this on or you won’t see your modules.
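To give a feel for what a tool like this hands to Graphviz, here is a minimal sketch of turning class relationships into DOT text. It is not yard-graph's code: the two hashes, the class names in them, and the to_dot helper are all made up for illustration. Inheritance becomes a solid edge and a mixin becomes a dashed edge.

```ruby
# Hypothetical input data, NOT YARD's raw data format:
# class name => superclass name, and class name => included modules.
SUPERCLASSES = {
  "Shape"  => "Object",
  "Circle" => "Shape",
  "Square" => "Shape"
}.freeze

MIXINS = {
  "Circle" => ["Comparable"]
}.freeze

# Build a Graphviz DOT description: solid edges for inheritance,
# dashed edges for module inclusion.
def to_dot(superclasses, mixins)
  lines = ["digraph classes {", "  rankdir=BT;", "  node [shape=box];"]
  superclasses.each do |klass, parent|
    lines << %(  "#{klass}" -> "#{parent}";)
  end
  mixins.each do |klass, modules|
    modules.each do |mod|
      lines << %(  "#{klass}" -> "#{mod}" [style=dashed];)
    end
  end
  lines << "}"
  lines.join("\n")
end

puts to_dot(SUPERCLASSES, MIXINS)
```

Saved to a file (say sketch.rb, a name chosen here), the output can be piped through dot the same way as above: ruby sketch.rb | dot -T pdf -o sketch.pdf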

I also highly suggest using a vector format for output like pdf as in the above example. Bitmap files can get really big. If you want proof, check out the full class diagram of the rails source I generated and imagine that as a bitmap.

Hopefully I can get some more practical uses of YARD up in the next coming weeks for people to see how awesome it is!

YARD can parse Rails

By Loren Segal on February 25th, 2008 at 3:59 AM

With just a few tweaks, I just successfully ran YARD over actionpack, activesupport, and activerecord with just one error (repeated in a few spots) in the template generation process. I’m starting to realize that the parser isn’t as bad as I thought– there was one noticeable parser issue, which will be fixed in the coming weeks. I am still, however, looking closely at the ruby2ruby and ruby_parser projects– ruby_parser is useless to yard until it starts supporting line numbers, but I’m waiting anxiously to see where Ryan Davis is going to get with it.

You can check out the docs over here but remember they’re not supposed to look nice yet.

And for those interested, the Subversion repository for YARD is at

svn co http://soen.ca/svn/projects/ruby/yard/trunk

No Dice? No Problem.

By Loren Segal on February 23rd, 2008 at 7:09 PM

This was my day: Star Wars Monopoly

I invited a friend (who doesn’t happen to be very geeky) over to play some Star Wars Monopoly (okay, maybe she is) this afternoon because she had recently realized she had the box lying around her house. We opened up the box and started setting up the game. Got the pieces? Check. Got all the money? Check. Properties? Check. Everything seemed to be going smoothly until everything was set up and we were ready to roll the dice… the dice!

Now, I don’t play boardgames often, so it’s not like I can just go to my trivial pursuit box and steal them for a bit. The only other games I have in my house were downstairs hidden behind years of legos and children’s toys (legos are not children’s toys). So, like any geek, the first thing I thought was, "Hmm, what are good ways to generate a random number from 1 to 6 without having to get up?". I looked at my mac…

Requirements Phase

Now first it’s important that we list some base requirements for our Dice application. It was very important that we not have to get up, and since the computer was across the room, this limited our input device. The monitor was also far away, so if the output was text-based, it would have to be displayed appropriately large to be seen across the room.

Design & Implementation

Given the requirements, it seemed that the best choice would be to have the output be spoken out loud by the machine. This would avoid any issues with displaying output from far, and Macs have this stuff built in. Now we were able to choose a language best suited for the task. I thought quickly about doing this in Ruby but remembered that this is really something that AppleScript was built for. Interfacing with the speech module is dead easy, and so is our code:

The AppleScript Dice code

Now we need a way to trigger our dice. Voice recognition also comes built-in on a mac but I couldn’t figure out how to take advantage of it quickly enough so I abandoned that. I tried to think about other input devices that worked from far and settled on the Apple Remote. Now, the remote by default comes with about zero functionality, but I remembered that many people have hacked it up nicely to do pretty much everything under the sun, so I did a little googling and came up with an application called Remote Buddy, which pretty much does just that. Instead of figuring out how to interface specifically with the Script Editor application, I just moved my mouse over the run button (in photo above) and hooked up the play button on the remote to perform a left click:

The Remote Buddy Configuration 

And there we have it. A dice machine. Keep in mind, this all sounds complex but it pretty much happened in under 5 minutes. I go to great lengths to play monopoly.

Update: February 24th 2008: I did a little more googling today while I had more time and came up with the complete-speech-recognition version of my dice app. Turns out the code isn’t that much more complex. Note that we wrap our code in a try because the SpeechRecognitionServer likes to time out.

repeat while true
	try
		tell application "SpeechRecognitionServer"
			set resp to listen for {"roll", "quit"}
			if resp is "quit" then
				return
			else
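				-- pick a whole number from 1 to 6, each face equally likely
				-- (scaling a 0..1 random number and rounding makes 1 and 6 come up half as often)
				set x to (random number from 1 to 6)
				set y to (random number from 1 to 6)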
				say "You rolled " & x & ", and " & y & ", the total is, " & (x + y)
			end if
		end tell
	on error errmsg
	end try
end repeat

Now all you need to do is tweak your speech recognition settings to not use the "Esc" key to begin speaking. I also turned off the command prefix:

Speech settings
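For comparison, since Ruby was briefly on the table, here is a rough sketch of the same dice roll in Ruby. The helper names roll_die and roll_message and the --speak switch are made up for this example. It leans on the say command that ships with Mac OS X, and only calls it when asked to, so the script still runs on other systems.

```ruby
# Roll one six-sided die; rand(1..6) picks each face with equal probability.
def roll_die
  rand(1..6)
end

# Build the sentence to announce for a pair of rolls.
def roll_message(x, y)
  "You rolled #{x}, and #{y}, the total is, #{x + y}"
end

x = roll_die
y = roll_die
message = roll_message(x, y)
puts message

# Speak the result only when run with --speak (a flag invented for this sketch).
# `say` is the command-line speech tool on a Mac.
system("say", message) if ARGV.include?("--speak")
```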

The YARD Roadmap to 1.0

By Loren Segal on February 22nd, 2008 at 4:45 AM

It’s been almost a year now since the last release, but today I released a new version of my Ruby documentation tool, YARD 0.2.1. This release gives the tool a surprising amount of robustness for the little code I changed and should make it easier for people to play around with, since it can currently parse the following gems without exploding: merb-core, merb-more, datamapper and obviously yard. You can actually see them here, but don’t judge them for aesthetic merit– the templates were hacked together to show off the introspective power of the documentation, not the visual results. In fact, the only docs worth looking at are the yard docs, because without the Yardoc formatting, a lot of the point is lost. I believe it works with active_record too but I haven’t tried it recently. You can install YARD with a simple:

sudo gem install yard

To run this tool against your Ruby code just type in yardoc in your root directory; a doc directory will be created with an index.html file you can open to see the results, provided it didn’t crash in the process. Note: It still is horribly buggy.
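For a rough idea of what Yardoc-style formatting looks like in source, here is a small sketch. The method is invented for the example, and the tag layout follows what later YARD releases accept (@param and @return with bracketed types); the exact syntax understood by 0.2.1 may differ.

```ruby
# Converts a temperature from Celsius to Fahrenheit.
#
# @param [Numeric] celsius the temperature in degrees Celsius
# @return [Float] the temperature in degrees Fahrenheit
def celsius_to_fahrenheit(celsius)
  celsius * 9.0 / 5.0 + 32.0
end

puts celsius_to_fahrenheit(100)
```

The point of the tags is that the parameter and return types become structured data instead of free-form prose in a comment.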

For those who don’t know (and haven’t clicked the above link to YARD’s homepage), YARD is a Ruby documentation tool meant to fix all the problems with RDoc. There are plenty, and I outlined them all on the YARD site, so read up. After you read up, close that page, because it’s horribly out of date.

Where I’ve been

This gem update was a long time coming. A long time thinking, and a long time pondering over what YARD would be and where it would go. I was very disappointed with the state of the code (read: parser) when I gave up, and I also got a little sidetracked over last summer with an internship at Apple, so I kind of abandoned it knowing I would eventually come back. Between having people actually ask me about my progress, and Merb’s docs using a format very similar to the one I created, I think it’s becoming increasingly clear that the time is either now or very soon.

And so, I had a chance to look at the yard source again recently and realized it actually works a lot better than I thought it did. I applied a few patches, and here we are.

But there is still plenty of work to be done.

I’ve actually known how I wanted to get to 1.0 for a while now. This week I bumped into a colleague of mine on the same flight as me to Seattle (well, he was going to Frisco for GDC, me to Redmond for a MS interview, more on that later) and we got to chatting about a bunch of projects we’ve been working on and ended up discussing how we plan our milestones for projects. I actually laid out my entire roadmap to him right there, so I figure I’ll make it more official and put it out on my blog for those interested in developments.

The Roadmap

I plan on dividing my releases in '0.x.0 - 0.x.9' segments to target one specific design feature all the way up to 1.0. There will be a number of iterations for each design feature, but for the most part each feature will be able to be developed independently/concurrently. That is to say, I plan on seeing through development of 0.3.x potentially before even finishing 0.2.x. As for what each exact release will be, that will depend on how development goes. I can’t get that detailed just yet.

0.2.0 - 0.2.9: Parser fixes / complete parser rewrite

The parser is horrid and needs to be fixed. On the other hand, every parser is horrid, so I’ll live with the horridness. What I can’t live with is the bugginess. My only wish is to define the grammar formally using some parser generator, so that I don’t have to look at the ugly part of the code. Ragel is currently my top choice, but Adrian Thurston personally recommended I look at Island parsing. There’s a lot of reading to do, and a lot of time to do it in. This code will need to change as Ruby does (going from 1.8 to 2.0) and there is a lot of testing that has to happen to get it working right. This iteration will probably take a long time. We can potentially use the Ruby lex file to do the lexing, but I don’t know much about hacking with that. Help.

0.3.0 - 0.3.9: Fixing the developer API

YARD prides itself on being an extensible / modular piece of code that could easily support the documentation of various Ruby DSL’s through plugins. The code is not as elegant as it could be, but there is a general framework in place that can be whipped up into something much more usable. Once this is done, YARD can start talking to framework developers to write YARD plugins and use Yardoc formatting. Merb would be a great place to start since they’re already going in that direction.

0.4.0 - 0.4.9: Revamping the templates

YARD was prototyped with a basic Javadoc-style template just to show off some of the inheritance features it has over RDoc doc templates. It’s not meant to be pretty right now. That needs to change eventually, but should be easy once the API is nailed down. YARD can potentially provide a number of templates for different peoples’ tastes, from Javadoc (for JRuby guys) to a more Ruby feel, and everything in between. Output will not be limited to XHTML of course, and can be done in plaintext, man files, etc.

0.5.0 - 0.5.9: Bring it to the community / design review & feature requests

As much as I have my say in software I make, other people have their own. It’s important to hear them out. This will be a good time to get the name out and start getting users using the software and evangelizing the benefits of YARD / structured documentation. I’ll be able to see what people can and can’t live with from a high level, and make some changes before it’s too late. Documentation, tutorials, howtos and basic written word can be fleshed out at this point, to give people better ways to access YARD’s internals.

0.6.0 - 0.6.9: Extend on API with optimization focus on raw data storage, consider database adapters.

One of the other goals of YARD is to provide a raw format for all the information that YARD collects about your source code. This enables developers to perform analysis on documentation through auditing tools, or simply provide a way to serialize the documentation data to another format that may not directly be for human consumption: YAML, XML, etc. This would also mean providing a way for developers to pull the contents from the raw data file to a database, which could enable developers to write interactive documentation applications for their frameworks/code (what caboo.se is currently attempting to do– but manually, or even RailsLodge). This could even mean allowing YARD to store data directly to a database of choice using various SQL adapters.
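As a sketch of what "serialize the documentation data to another format" could mean in practice, here is a round trip through YAML using Ruby's standard library. The doc_entry hash is purely an assumption made up for this example; it is not YARD's actual raw data layout.

```ruby
require "yaml"

# Hypothetical record for one documented method. YARD's real raw data
# format is not described here, so this structure is an assumption.
doc_entry = {
  "path"      => "String#reverse",
  "type"      => "method",
  "docstring" => "Returns a new string with the characters in reverse order.",
  "tags"      => [{ "name" => "return", "types" => ["String"] }]
}

# Serialize to YAML for tools that are not aimed at human readers...
yaml = YAML.dump(doc_entry)
puts yaml

# ...and load it back, e.g. in an auditing tool that flags undocumented code.
loaded = YAML.safe_load(yaml)
undocumented = loaded["docstring"].to_s.strip.empty?
puts "#{loaded["path"]} undocumented? #{undocumented}"
```

The same hash could just as easily be written to XML or inserted as a database row; YAML is used only because it ships with Ruby.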

0.7.0 - 0.7.9: Make YARD work with everyone’s code, up to 2.0

By this point the parser should be far superior to the current 0.2.1 state. Now it should be parsing Ruby’s source tree, Rails, Merb, etc. without issues. At this point we can pre-emptively start adding compatibility for Ruby1.9.x since this will need changes to the parser. Better now than after Ruby1.9.x is finalized.

0.8.0 - 0.8.9: Integrate YARD with Gems/Ruby

Currently when you release a gem you can have it generate RDoc. We’re going to want to change that– well, not change, just allow Yardoc generation. This will require some cross-patching, and a little convincing the gems guys that YARD is awesome. It should be at this point. Also, we want to have YARD generating Yardoc files for Ruby’s source tree when it installs at this point so that you can yri String, for instance, or link to the String class from your application’s documentation.

This is where we would need to start converting Ruby’s RDoc format to Yardoc in preparation for the 1.0. This will take a while and extend all the way through the 0.9.x run as well.

0.9.0 - 0.9.9: Run a round of stability patches and general bugfixes, feature-set should be stable.

Have another iteration of community feedback, last minute minor changes to the source for small fixes, minor feature additions/removals, but nothing too grand. YARD should already be awesome and integrated at this point, right? We should also be continuing to convert Ruby docs here.

1.0!

Matz pulls YARD into Ruby as a standard module replacing RDoc. I can dream, right?

Help me dammit!

This project needs developers. Rewriting a parser isn’t easy, and that’s just one of the many steps involved. If anyone finds any shred of motivation to help in any way they can, pleasepleaseplease contact me [l s e g a l (a t) s o e n . c a]. I’m not usually the type to ask for help but it would be extremely healthy for this (or any) project not to hinge on just one person. If there’s any set of features listed above that you think you can tackle and you actually want to see this get done, get in touch. Stuff like template design, coding, writing documentation, testing, or even just blogging about the project / making a screencast (when there’s something to screencast, of course), it all helps.

Your SCM may be decentralized, but your project isn’t

By Loren Segal on February 22nd, 2008 at 12:18 AM

Or, why your project doesn’t need Git

Geeks love new things. If geeks were frogs, gadgets would be lily pads. Here’s a diagram:

Figure 1: Frogs vs. Lily Pads

Frogs (er, geeks) love to jump from lily pad to lily pad as soon as they see a new one. This is why I’m not all that surprised that people (at least in the Ruby community) are quickly ditching the Subversion ship that’s gotten so much attention over the last few years. While surprised I am not, I am slightly disappointed in people for leapfrogging to a new technology so quickly before it’s proven to really be as cool as it sounds.

Important note: I’ve only been using Git for about a week now, so many of the technical details I provide about Git may (read: will) be wrong. Please correct me. However, none of what I’m about to say has anything to do with the technical aspects of git. Having seen the way git is used in certain projects makes me wonder what people think they’re really getting from distributed source control management. Also note that I’m not discussing Linux development here– I know nothing about it, nor do I care. The fact that git works great for Linus doesn’t mean it works for every other open source project.

Git is a technical masterpiece; but not all technical masterpieces are useful to you.

Git has some really nice features. Branching is effortless and hidden from the user. Merging is nice, conceptually, though resolving conflicts seems like a pain compared to Subversion. The speed/size optimizations are a must, and I sure hope the svn guys get their act together.

Note however that none of what I just credited Git for has anything to do with its decentralizededness (dictionary, please). All of this could be implemented in Subversion without changing your workflow.

Programming is not quite like editing Wikipedia

The supposed advantage to decentralized SCM’s is that anyone can contribute code just by running a git clone and then making and sharing changes. Everybody has "their own branch" that they could develop on. But the truth is that anyone who tells you this is simply giving you a false sense of reality.

In real life, you don’t just download the repository, make a change and get a guarantee that your work will be merged back into the main branch for everyone to use (you made the change using SCM so you could share it, after all). In real life, projects are well guarded from the outside world and have a few gatekeepers known as maintainers. These people are usually the project owners/creators. In real life, you deal with having to convince these maintainers that your code deserves to be in the main branch.

The only advantage to git is that once you deal with all the politics, you can theoretically have the maintainers merge your Git branch with theirs really easily– though in reality most projects still would rather take patches using ticketing systems. In reality, workflow trumps technology.

Your workflow is the ultimate bottleneck.

To really understand why decentralized SCM is a complete waste for 90% of your projects, you must first step back and look at how you work. Let’s describe your average open source project:

  1. Most open source projects are small. Not everything is Gnome/wine/Linux.
  2. Most projects I see using git have about 10-12 active developers, with about 3-5 active committers. It can easily be fewer.
  3. More specifically, these projects have far fewer developers than users. Most of the people who download the source only do so to compile it– never to edit.
  4. Sometimes there is only one main committer with one or two backups. Watch project timelines, you’ll see only a handful of names– the rest will be your odd patch.
  5. 99.99% of all open source projects inevitably make one official release for each set of changesets.
  6. Such a release is usually hosted in one centralized location with maybe a few mirrors strictly for distribution’s sake.

Is anyone coming to a scary realization here? As decentralized as you attempt to make your project, you will always run into a single point of failure: your workflow.

I’m currently watching merb-core development as an example of one of these projects and I’ve noticed that the workflow is essentially equivalent to one with a centralized repository. Someone will submit a patch and have it committed by one of three main committers. If your patch doesn’t make it to the main branch, you’re simply out of luck. Sure, you could use your patch locally, but you could do this with any body of source code whatsoever, .git, .svn or .tar.gz. This is really no different from Subversion to anyone outside the core development team.

If your project has only one or two active committers or falls under any of the above categories, do yourself a favour and don’t waste your time installing git on your server. You won’t be benefiting from its features because your project and workflow will not have changed.

So who really benefits from distributed source control?

Core developers do. The truth is, git isn’t as great as a DVCS as it is as a private whiteboard for "pre-commits" to the main repository. Git can make it a lot easier to pass around changes before finalizing them, which would mean fewer broken builds on the main repository. That’s a good thing, and almost worth a two-tier setup (see diagrams below). But really, this is nothing Subversion cannot do with almost as little effort.

Why Subversion can do what Git does

This is what a git development workflow normally looks like:

 Figure 2: How git development works

The outer repository in this diagram is bundled with a release server (web server, most likely) and ticketing system for patches. The "git" blocks are machines with individual branches for each core developer (abstracted from their physical machines in case they use github or something). I didn’t draw all the connecting lines, but enough to show where the bottleneck lies. More importantly, it shows that you’re not really using git as a decentralized development platform (sorry).

Now let’s try this setup with Subversion:

 Figure 3: The same scenario with Subversion

Notice that the Tier 1 workflow does not change. Instead, imagine a subversion repository (it doesn’t have to be the same physical repository) where each developer has their own branch and "publishes" their changes by committing to that branch. This development workflow is 100% equivalent to using, say, github, to share changes. Literally– it’s exactly the same. No, really, it is. In this scenario, Joe can merge Larry’s changes by simply– merging them into his branch. When a final release is made, the few maintainers will merge the code that they got from other branches back into trunk, potentially tagging the changeset.

In fact, not only is this workflow exactly the same as Git’s, but it has a side effect that nearly makes it more powerful than using Git: in an optimistic development environment, the maintainers could give out write access for branches to people from the outside world. I could get "Loren’s branch", and start developing my changes in my own little sandbox. This would be similar to git, but the visibility of my code would be much higher in that the core developers would be able to keep tabs on changes that non-core developers are making without having me ping them about it *. I would no longer be a second-class citizen with my own git repository far off at some URI in the public world (see git diagram), but instead I would be developing in the same location where the core team is. I have no clue why people think this is a bad thing.

* To be fair, git could do what I just described, but the developers would need to manage links to all the outsider repositories / track them all with relatively complex, currently-non-free-or-very-private software (github being one). The infrastructure for doing this in Subversion is built-in and implicit.

In summary, group me with all the other people who are skeptical of distributed source control management, please.

If only all hackers were this awesome.

By Loren Segal on February 16th, 2008 at 10:01 PM

These are fake but oh so cool:

He has way more on his youtube page.

Very Flat Merb Projects, New in 0.9.0

By Loren Segal on February 15th, 2008 at 6:41 PM

Man oh man am I happy. I recently discovered Merb, a more lightweight thread-safe alternative to Ruby on Rails. I might devote an entire post to this framework because there’s a lot to be said, but I want to get this off my chest.

I need a mini-framework

I’ve been looking at all these different frameworks to find something that can run more efficiently on my tiny little ghetto server, because I don’t have the resources to run these memory hogging rails apps for my relatively tiny websites. I really need a small framework for the kind of apps I make. You know, pull out a quick blog, wiki, or other, with no DB access at all. What about Camping? No, that’s a little too small. I settled on Merb. Merb is considerably bulkier than Camping, and for a bit I was wondering if even it was too much, until now…

The Merb guys have just released a developer version 0.9.0 which has some neat changes, but what definitely caught my eye was (from the last few lines of the post) the --very-flat option for the application template generator. No need to explain, I’ll just show you how awesome it is:

titanium:merb jinx$ merb-gen blog --very-flat
Fri, 15 Feb 2008 22:57:23 GMT ~ Not Using Sessions
RubiGen::Scripts::Generate
      create  /blog.rb
      create  /README.txt

Now, if we load up the one file created for our project:

Merb::Router.prepare do |r|
  r.match('/').to(:controller => 'blog', :action =>'index')
end
 
class Blog < Merb::Controller
  def index
    "hi"
  end
end
 
Merb::Config.use { |c|
  c[:framework]           = {}
  c[:session_store]       = 'none'
  c[:exception_details]   = true
}

BAM!

Imagine that? That’s all there is to our app. That’s the entire source code– program logic, configuration and support files. It’s fully executable as its own application with the command merb -I blog. Microframework indeed. Granted, it’s pulling all of the 'merb-core' gem, but that’s pretty small. This code is super portable (in the put-it-on-a-usb-key sense, not the platform-to-platform sense) and really quick to develop with. Step aside, Camping.
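If the routing line looks like magic, here is a toy sketch of the general idea: a table mapping a path to a controller class and an action name. This is not how Merb implements its router. ToyRouter, ToyBlog and their methods are invented for illustration and need nothing but plain Ruby.

```ruby
# A toy router: maps a request path to a controller class and action name.
# It mimics the shape of the route above, not Merb's implementation.
class ToyRouter
  def initialize
    @routes = {}
  end

  # Register a path with the controller and action that should handle it.
  def match(path, controller:, action:)
    @routes[path] = [controller, action]
  end

  # Instantiate the controller and call the action; nil when no route matches.
  def dispatch(path)
    controller, action = @routes[path]
    return nil unless controller
    controller.new.public_send(action)
  end
end

# Stand-in for the Blog controller, with no framework behind it.
class ToyBlog
  def index
    "hi"
  end
end

router = ToyRouter.new
router.match("/", controller: ToyBlog, action: :index)
puts router.dispatch("/")
```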

Granted, I don’t think I’ll use this method of development, for the same reason I won’t use Camping, but it’s nice to know I can drop down to the really simple level if I need to just prototype one quick "one-button" app and still have room to grow it out. The fact that Merb can actually do this is what’s most mindblowing. It’s a true testament to the modularity and extensibility of the framework’s design. Getting Rails to run without ActiveRecord is a pain enough, let alone pulling out everything but routing and controllers.

Lighting 101

By Loren Segal on February 14th, 2008 at 12:14 AM

I just learned about a really cool photography lighting site called Strobist. It’s basically hundreds of tips, tutorials, and DIY’s to get professional results with the least gear possible, as well as the least $$$. This is great for the starving student thing I got going on. That’s also a reason I usually only shoot nature landscapes. I found this really informative "basics of lighting" video I thought I’d share, it’s one of the first links in Strobist’s Lighting 101 article. Enjoy.

Now I have a reason to go find my SpeedLite that’s hiding somewhere in my room.

Come Get the F$#@!^ing Blocks

By Loren Segal on February 13th, 2008 at 9:13 PM

Layout Disaster

By Loren Segal on February 08th, 2008 at 4:38 PM

You may have heard of graceful degradation- this website degrades quite differently.
