Writing a translator in a day, with Dhaka
Jun. 15th, 2009 12:21 pm
I suppose you could generously call this a compiler, but it's so simple that I won't even try to make that case.
For a project I'm working on, I need the ability to write queries. I have objects (files) with tags, and each tag has an optional value associated with it. So, I want to be able to write something like this:
unheard and (bitrate=192 or bitrate='variable')
This would return all the files with the tag "unheard" (any value, or none), and with the "bitrate" tag (either value 192, or value 'variable').
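To pin down those semantics, here's a rough hand-rolled sketch in plain Ruby. It is not Dhaka's API and not the code from the project: the method names (`tokenize`, `parse`, `matches?`), the nested-array tree, and the rule that `and` binds tighter than `or` are all assumptions made up for illustration.

```ruby
# Hypothetical sketch of the query semantics; not Dhaka and not the real project code.

# Split a query into tokens: parentheses, '=', quoted strings, and bare words.
def tokenize(query)
  query.scan(/\(|\)|=|'[^']*'|[^\s()=']+/)
end

# Grammar assumed for this sketch:
#   expr := term ('or' term)*
#   term := atom ('and' atom)*
#   atom := '(' expr ')' | tag | tag '=' value
def parse_expr(tokens)
  node = parse_term(tokens)
  while tokens.first == 'or'
    tokens.shift
    node = [:or, node, parse_term(tokens)]
  end
  node
end

def parse_term(tokens)
  node = parse_atom(tokens)
  while tokens.first == 'and'
    tokens.shift
    node = [:and, node, parse_atom(tokens)]
  end
  node
end

def parse_atom(tokens)
  token = tokens.shift
  raise ArgumentError, 'unexpected end of query' if token.nil?
  if token == '('
    node = parse_expr(tokens)
    raise ArgumentError, "expected ')'" unless tokens.shift == ')'
    return node
  end
  raise ArgumentError, "unexpected token #{token}" if [')', '=', 'and', 'or'].include?(token)
  # A bare tag matches any value (or none).
  return [:tag, token] unless tokens.first == '='
  tokens.shift # consume '='
  value = tokens.shift
  raise ArgumentError, 'missing value' if value.nil?
  [:tag_value, token, value.delete("'")]
end

# Parse a whole query string into a nested-array tree.
def parse(query)
  tokens = tokenize(query)
  tree = parse_expr(tokens)
  raise ArgumentError, "unexpected token #{tokens.first}" unless tokens.empty?
  tree
end

# tags is a Hash of tag name => value (nil when the tag has no value).
def matches?(tree, tags)
  case tree[0]
  when :and then matches?(tree[1], tags) && matches?(tree[2], tags)
  when :or then matches?(tree[1], tags) || matches?(tree[2], tags)
  when :tag then tags.key?(tree[1])
  when :tag_value then tags.key?(tree[1]) && tags[tree[1]].to_s == tree[2]
  end
end

tree = parse("unheard and (bitrate=192 or bitrate='variable')")
matches?(tree, { 'unheard' => nil, 'bitrate' => 192 }) # => true
matches?(tree, { 'bitrate' => 'variable' })            # => false, no "unheard" tag
```

Values are compared as strings here, so `192` in the query matches a stored integer 192; a real implementation might want typed values instead.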
I was originally going to hack something stupid together, but I figured, it's a lazy Sunday, I was eventually going to have to rewrite any hacked-up crap I did by hand anyway, and maybe whatever Ruby has wouldn't be as horrible as when I did this in 2006 with lex and yacc.
Oh man, was I right. The best parser generator I found for Ruby, Dhaka, was vastly easier than lex/yacc, and I liked using it so much that I think I want to write this post just to go through the process again.
So, what does Dhaka do better than lex/yacc?
First off, it's all Ruby. No weird code generation step, no different file formats, it just slots neatly into your program.
Second, because it's Ruby, you don't need to care about malloc or free. Doing anything in C is a huge pain because of this, and parsers are even worse because you're trying to build a large structure (the parse tree) across function calls.
But the main benefit is that it builds the parse tree for you, and even draws a diagram of it. That's like a whole step that you don't have to deal with, and another step that becomes so much easier to debug.