Having gotten the basic REPL working, I needed to begin building the parsing stuff. For that, I decided on porting my idiotically simple recursive-descent parser library to Rust because that seemed like a thing I might do, and I’m nothing if not me.
The original Elision parser used ANTLR. (I would link to this, but I think it predates the history of Elision on GitHub, which starts with the handover to ORNL.) The ANTLR-based parser worked well for a while, but the files soon became numerous and very large, and bootstrap parsing was taking too long.
Logan Lamb and I each built a parser (a friendly competition fueled by lack of communication) to replace the ANTLR parser. Logan used a recursive descent parser library (I think it was the Parboiled parser, seen here), and I… wrote a tiny class in Scala that used a two-buffer approach (it would fill one buffer while the other was parsed) and allowed for rapid parsing of left-linear grammars (as seen here).
I then had a different, embedded project that required faster parsing of files in C, so I ported the basics to C99. You can still find this as SPSPS, along with a JSON parser. Again, it turned out to be faster than the alternatives. Good design? Bad alternatives? You decide.
The C99 version handles ASCII-encoded files. I had started to rewrite it to handle UTF-8 encoded files… but got busy with other things. Now that I’m starting on a parser for relision, I’ve ported the library to Rust (the main types are Parser, Loc, and ParserError). The Parser provides a set of simple methods to “peek” at the character stream and “consume” characters from it. It’s a bit complicated by the error handling, but not that much! The Scala code used exceptions. The C99 code used an error field. The Rust code uses a custom Result.
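use std::io;
use relision::parser;

// Parse a run of decimal digits from the stream as a u64.
fn parse_unsigned_integer<R: io::Read>(parser: &mut parser::Parser<R>) -> parser::Result<u64> {
    // Consume characters for as long as they are decimal digits.
    let result = parser.take_while(|ch| ch.is_digit(10))?;
    // Convert the digits; a failure (no digits, overflow) becomes a parser error.
    match result.parse::<u64>() {
        Ok(number) => Ok(number),
        Err(err) => Err(parser.error(err.to_string())),
    }
}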
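To make the peek/consume idea and the custom Result a little more concrete, here is a minimal, self-contained sketch. It is not the relision code: the field layout, the method bodies, and the name `ParseResult` (used here only to avoid shadowing the standard `Result` outside a module) are all assumptions for illustration, and it reads the whole input up front rather than buffering.

```rust
use std::fmt;
use std::io::{self, Read};

/// Hypothetical source location (1-based line and column).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Loc {
    pub line: usize,
    pub column: usize,
}

/// Hypothetical error type: a message plus where it happened.
#[derive(Debug, PartialEq)]
pub struct ParserError {
    pub loc: Loc,
    pub message: String,
}

impl fmt::Display for ParserError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}:{}: {}", self.loc.line, self.loc.column, self.message)
    }
}

/// The "custom Result": every parsing method returns this alias.
pub type ParseResult<T> = Result<T, ParserError>;

/// Hypothetical parser over an in-memory list of characters.
pub struct Parser {
    chars: Vec<char>,
    pos: usize,
    loc: Loc,
}

impl Parser {
    /// Read the entire stream as UTF-8 (a simplification for this sketch).
    pub fn new<R: Read>(mut reader: R) -> io::Result<Self> {
        let mut text = String::new();
        reader.read_to_string(&mut text)?;
        Ok(Parser { chars: text.chars().collect(), pos: 0, loc: Loc { line: 1, column: 1 } })
    }

    /// Build an error tagged with the current location.
    pub fn error(&self, message: impl Into<String>) -> ParserError {
        ParserError { loc: self.loc, message: message.into() }
    }

    /// Look at the next character without consuming it.
    pub fn peek(&self) -> Option<char> {
        self.chars.get(self.pos).copied()
    }

    /// Consume the next character, updating the location.
    pub fn consume(&mut self) -> ParseResult<char> {
        let ch = self.peek().ok_or_else(|| self.error("unexpected end of input"))?;
        self.pos += 1;
        if ch == '\n' {
            self.loc.line += 1;
            self.loc.column = 1;
        } else {
            self.loc.column += 1;
        }
        Ok(ch)
    }

    /// Consume characters for as long as `test` accepts them.
    pub fn take_while(&mut self, test: impl Fn(char) -> bool) -> ParseResult<String> {
        let mut out = String::new();
        while let Some(ch) = self.peek() {
            if !test(ch) {
                break;
            }
            out.push(self.consume()?);
        }
        Ok(out)
    }
}

/// The unsigned-integer example, rewritten against the sketch above.
pub fn parse_unsigned_integer(parser: &mut Parser) -> ParseResult<u64> {
    let digits = parser.take_while(|ch| ch.is_ascii_digit())?;
    digits.parse::<u64>().map_err(|err| parser.error(err.to_string()))
}
```

The point of the alias is that `?` works everywhere: a failure deep inside `take_while` propagates up with its location attached, which is roughly what the exceptions did in Scala and what the error field did (by hand) in C99.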
At this point it seems I can start designing a parser, so that’s probably a good next step.