Actuarius - A Fast Markdown Processor Written In Scala

Jan 6, 2011 17:28 · 799 words · 4 minutes read Scala Markdown Actuarius

Update (7th August 2017): This and other articles are kept for reference. Please note that Actuarius is no longer maintained by me but is now a part of the Lift Web Framework. The web frontends for comparison were too much effort to maintain, so regrettably the respective links are dead.

I just finished my newest project, Actuarius, a Markdown processor written in Scala using parser combinators. If you are not interested in the why and how, just jump straight ahead to the project page, where you can get a more verbose intro to Markdown, an overview of the design decisions, scaladoc, sources, binaries and a little AJAX webapp with which you can test Actuarius and its competitors.
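For readers who have not met parser combinators before, here is a minimal hand-rolled sketch of the idea in plain Scala. It is not Actuarius code and does not use Scala's parser combinator library. All names (`MiniCombinators`, `literal`, `until`, `seq`, `or`, `many`, `render`) are made up for illustration, and it only understands `*emphasis*` inside a single line of text.

```scala
// Illustrative sketch only: a parser is a function from the input to an
// optional pair of (result, remaining input). None of this is Actuarius API.
object MiniCombinators {
  type Parser[A] = String => Option[(A, String)]

  // Matches a fixed literal at the start of the input.
  def literal(s: String): Parser[String] =
    in => if (in.startsWith(s)) Some((s, in.drop(s.length))) else None

  // Consumes one or more characters up to (not including) the stop character.
  def until(stop: Char): Parser[String] = in => {
    val text = in.takeWhile(_ != stop)
    if (text.isEmpty) None else Some((text, in.drop(text.length)))
  }

  // Runs p, then q on what p left over, and pairs the two results.
  def seq[A, B](p: Parser[A], q: Parser[B]): Parser[(A, B)] =
    in => p(in).flatMap { case (a, rest) =>
      q(rest).map { case (b, rest2) => ((a, b), rest2) }
    }

  // Tries p first and falls back to q if p fails.
  def or[A](p: Parser[A], q: Parser[A]): Parser[A] =
    in => p(in).orElse(q(in))

  // Transforms the result of a successful parse.
  def map[A, B](p: Parser[A])(f: A => B): Parser[B] =
    in => p(in).map { case (a, rest) => (f(a), rest) }

  // Applies p zero or more times, stopping as soon as p fails
  // or stops consuming input.
  def many[A](p: Parser[A]): Parser[List[A]] = in => {
    @annotation.tailrec
    def loop(acc: List[A], rest: String): (List[A], String) = p(rest) match {
      case Some((a, next)) if next.length < rest.length => loop(a :: acc, next)
      case _ => (acc, rest)
    }
    val (items, rest) = loop(Nil, in)
    Some((items.reverse, rest))
  }

  // "*text*" becomes "<em>text</em>".
  val emphasis: Parser[String] =
    map(seq(seq(literal("*"), until('*')), literal("*"))) {
      case ((_, text), _) => "<em>" + text + "</em>"
    }

  // Anything up to the next asterisk is passed through unchanged.
  val plain: Parser[String] = until('*')

  // A line is any sequence of emphasized and plain fragments.
  val inline: Parser[String] = map(many(or(emphasis, plain)))(_.mkString)

  // Succeeds only if the whole input was consumed.
  def render(s: String): Option[String] =
    inline(s).collect { case (html, "") => html }
}
```

With these definitions, `MiniCombinators.render("some of it may be *emphasized*")` yields `Some("some of it may be <em>emphasized</em>")`, while an unclosed `*` yields `None`. The appeal of the approach is that small parsers like `literal` and `until` are combined into larger ones with ordinary functions, so the grammar stays readable as code.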

Why Markdown Processing?

I needed a way to write my blog posts more comfortably. I started out with plain HTML snippets for each article, but that was much too cumbersome to write. I had a look at a number of Wiki-style languages, but IMHO they were still too much effort when writing mostly plain text with some code examples. I also still wanted the possibility to just embed plain HTML, as no Wiki language offers the full feature set of HTML. Markdown has just the right balance for me: the syntax is super simple, yet you can embed HTML without any escape sequences. Here is a small example of how my blog posts now look under the hood:

## A level 2 section header ##

And then a paragraph with plain text, some
of it may be *emphasized*

    //here is some very smart code:
    def main(args:Array[String]) = println("Hello World!")
    
Then another paragraph where I make a smart
comment citing embedded code: `println` is so useful!

Then a list with smart points:

* this 
* is a
* list

<div class="foo">
    And if necessary, verbatim HTML.
</div>

Since I switched to Markdown, writing has become much easier and smoother. I can only recommend Markdown to anyone looking for an easy-to-use text markup language, as the syntax is much more natural than, for example, the one used at Wikipedia. If you are interested in more details about Markdown, I recommend you visit the original Markdown page.

Why another Markdown Processor?

Markdown implementations exist in just about every language. There are already two implementations you can use on the JVM (my blog is written in Scala, so I was obviously looking for either a Java or a Scala implementation): one written in Java (PegDown) and one in Scala (Knockoff). Yet I think Actuarius is still a useful alternative. Some of its advantages over the two competitors:

  • No additional dependencies: It is written in plain Scala, so you do not need any other libs if you write in Scala yourself (Java apps need the Scala standard library to use Actuarius).

  • Fast: Actuarius beats Knockoff by a factor of two and PegDown by a factor of four on 'average' input (tested on my blog posts and some sample input; I will provide the complete test suite and results in a different article shortly).

  • Better handling of recursively nested elements than Knockoff: Knockoff runs into problems if you nest too many lists and/or blocks. The example Markdown on my Actuarius project page renders correctly in the original Markdown and Actuarius but fails in Knockoff.

  • Better handling of escapes than PegDown: PegDown does not escape special HTML characters like <, >, &, " and ' in normal text paragraphs, only in code blocks.
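To make the escaping point concrete, here is a small sketch of what escaping those five characters in normal text could look like in Scala. The helper `HtmlEscape.escape` is a hypothetical name used for illustration; it is not taken from Actuarius or PegDown, and the choice of `&#39;` for the single quote is just one common convention.

```scala
// Illustrative helper, not part of the Actuarius or PegDown API.
object HtmlEscape {
  // Replaces the five characters that are special in HTML with entities
  // and leaves every other character untouched.
  def escape(text: String): String = text.map {
    case '<'  => "&lt;"
    case '>'  => "&gt;"
    case '&'  => "&amp;"
    case '"'  => "&quot;"
    case '\'' => "&#39;"
    case c    => c.toString
  }.mkString
}
```

For example, `HtmlEscape.escape("1 < 2 & 3 > 2")` returns `1 &lt; 2 &amp; 3 &gt; 2`. A processor that skips this step for ordinary paragraphs passes such characters straight into the generated page, where the browser may interpret them as markup.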

There are of course also disadvantages of Actuarius in comparison to the other two:

  • PegDown offers a lot of fancy and useful extensions right out of the box, like beautifying quotation marks, a special table syntax, filtering plain HTML and lots more. Actuarius just offers the original plain Markdown syntax, the possibility to modify the way the resulting HTML tags are rendered, and a switch to disable verbatim HTML (mostly for security reasons, to prevent XSS).

  • Knockoff is much more flexible in the way its output is converted. You are not limited to (X)HTML output, but can easily write to LaTeX, PDF and other formats as well, or provide your own renderer for the parsed document.

  • Both other engines are (probably) more mature, as they have been around much longer. I do not know of any bugs in Actuarius at the moment, but as I am the sole user this is hardly surprising. (Yet both Knockoff and PegDown have their quirks, too, as mentioned above.)

So it is definitely worth checking out all implementations, as they all have their pros and cons. I have built a page where you can preview all three engines to check how they render input differently from each other. The best way to get to know them is simply to test them.