Writing with Pleasure Using Markdown and pandoc
I write a lot, and I never know where or when my ideas will strike. In order to at least get things out of my head, I like to write in a format that I can use from just about anywhere, so I rely on Markdown. It’s plain text, which means that I can write it on my phone, my tablet, my laptop, with or without an internet connection. It’s also lightweight HTML, which means that I can add some structure to my writing, like lists, headings, and links, without distracting myself from what I’m trying to write.
A friend, Mike Bowler, once told me that he hated writing in a word processor, because the moment he started formatting what he was writing, he stopped writing and became fixated on the formatting. I found the same thing when I worked in Word, Pages, and other full-fledged word processors. I even have this problem in HTML, but I have to get the formatting right. I’ve noticed that this problem mostly goes away when I write in Markdown. Even when I do a little formatting, I can do it quickly, unobtrusively, and then get back to the thing I’m actually writing. This makes my job much easier.
I have also found it easier to write in Markdown without worrying about where I’m going to publish what I’m writing, in part due to pandoc, the Swiss Army knife of text processing. Although pandoc is not suitable for all users, I think that even non-technical users can enjoy the benefits of writing whatever, whenever, without worrying about producing “the right format” for how you will publish your writing. With pandoc, I can write in Markdown, then decide later whether I need RTF, HTML, TeX, or any of a variety of formats. I can decide later when to publish an article as a blog post, a whitepaper, or even as a chapter in a book! The combination of Markdown and pandoc gives me options.
The Simplest Case: Publishing to a Blog
If you publish to a blog, then you probably already have a system in place for publishing new posts. Whatever you do, working with Markdown and pandoc will feel similar, and will probably not appear to give you much benefit over your current system. I felt the same way, which explains why I started using pandoc only relatively recently. My old system worked well enough, which meant that I had little reason to change, but the similarity of the new system also made it relatively easy to change.
I tend to compose blog posts at my laptop, where I mostly use Atom. (I work in vim, too, but I remain at best an intermediate vim user.) I use an Atom plugin to let me preview my Markdown documents as HTML. The plugin, called markdown-preview-plus, lets me use pandoc as my Markdown processor, meaning that I get the same results in the preview window as when I publish to my jekyll blog, which also uses pandoc to convert Markdown to HTML. I don’t have to worry about formatting something one way while writing, then getting something unexpected when I publish, which normally causes me to scramble and fix something at the last moment. When I publish something, it just looks right.
Since I travel quite extensively to work, I also find myself in airports and coffee shops when I have the urge to write something. In this situation, if I find it inconvenient to pull out my laptop (or if its battery is dead), then I can pull out my Logitech K480 keyboard and Samsung Galaxy NotePro and write, using an app called JotterPad. This app lets me compose in Markdown, get a quick preview (not with pandoc, but it’s good enough), then move the composition onto my laptop (directly or through Dropbox), where I can give it a final check before publishing it. This just works, and I feel no extra friction when I want to write using my tablet compared to using my “full-fledged” system on the laptop. I can write anywhere and get consistent results, which makes it easier for me to just write.
An Interesting Case: Rescuing Old Articles
I’m in the process of moving away from Wordpress towards jekyll for my blog at blog.jbrains.ca. This means moving a bunch of articles from older formats into my current system. I dreaded the prospect of doing this, because it meant reverse-engineering the Markdown source for 100 HTML articles. Bleh.
Nope. I didn’t have to worry, thanks to pandoc!
The first wonderful discovery I made about pandoc was that it can convert HTML back to Markdown with at least 95% accuracy. Yes, it misses things like embedded video, but it does 95% of this terribly tedious work, and for that, I love it. (I love it as much as one can love software, anyway.) Now, when I have 30 free minutes, I can rescue a few old blog posts, by doing this:
- Log in to my Wordpress blog. Edit an article.
- Load my new blog project into Atom and start a new document.
- Copy the blog post’s title and publication date into the new document in Atom.
- Add a category or two to the new document in Atom.
- Select the entire Wordpress blog post content and copy it.
- Run this command, which replaces the HTML on my clipboard with (almost) equivalent Markdown:
pandoc --no-wrap --read=html --write=markdown <(pbpaste) | pbcopy
- Paste the Markdown on my clipboard into the document in Atom.
- Save, preview, publish.
- Redirect the old blog URL to the new one.
Granted, you need some technical chops to do the last step, and the pandoc command looks intimidating, but at least it’s exactly the same command every time, so you can just copy/paste it in your Terminal/Command Prompt. (If you work on Windows, then find a power user friend to teach you the exact command. I don’t do Windows, so I don’t know it.)
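If you have saved a batch of old posts as HTML files, the same conversion could run over a whole folder instead of going through the clipboard. What follows is only a sketch under assumptions: the function name rescue_posts, the PANDOC variable, and the folder layout are made up for illustration. It also uses --wrap=none, which newer pandoc releases appear to accept in place of --no-wrap; check pandoc --help for your version.

```shell
#!/bin/bash
# Hypothetical helper: convert every .html file in one folder into a
# .markdown file in another folder. Not part of pandoc itself.

# Override PANDOC to point at a specific pandoc binary if needed.
PANDOC="${PANDOC:-pandoc}"

rescue_posts() {
  local source_dir="$1" target_dir="$2" html_file name
  mkdir -p "$target_dir"
  for html_file in "$source_dir"/*.html; do
    # When nothing matches, the glob stays literal, so skip it.
    [ -e "$html_file" ] || continue
    name="$(basename "$html_file" .html)"
    # Assumption: --wrap=none is the newer spelling of --no-wrap.
    "$PANDOC" --wrap=none --read=html --write=markdown \
      -o "$target_dir/$name.markdown" "$html_file" || return 1
  done
}

# Example call (both paths are hypothetical):
#   rescue_posts "$HOME/old-posts" "$HOME/blog/_drafts"
```

You would still need to add the title, date, and categories by hand, and check each result for the things pandoc misses, such as embedded video.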
The important thing is this: pandoc is helping me finally, after years, get away from Wordpress. If you have a love/hate relationship with Wordpress, then you understand why I might want to get away from it. So far, I’m still on my honeymoon with jekyll. :)
Whatever You Want, pandoc Can Do It
I really hate most modern software, because it lets me down constantly, but not pandoc. I would even recommend it to non-technical users. Most non-technical users have two primary problems with software:
- Installing it.
- Dealing with problems when it fails, because every failure looks equally disastrous. It either works or it doesn’t.
Granted, installing pandoc isn’t pretty to non-technical users, even though highly-technical users probably love how simple it is. If you need someone to install pandoc for you, then I think it’s worth the effort. One thing you won’t have to worry about: pandoc is solid. It just works. Once you learn how to ask it to process a document, it just does it, well and quickly. You won’t have to fiddle with it. It just works.
This makes me trust pandoc. I don’t trust most software any more, so when I find something that I can trust, I really enjoy it. Now, if I need to convert some text from one format to another, I feel confident that pandoc will do it for me. This means that I don’t have to worry about “Will I be able to republish these blog posts into a book?” It’ll just work. I can take 20 blog posts, paste some connective tissue around them, and then publish it as a Leanpub book and charge $7. No, it’s not the same as publishing to Amazon, but it’s really good!
The Bottom Line
Markdown lets me focus on the writing. pandoc gives me confidence that I can publish it anywhere, anytime, with minimal effort. I’ve written 50,000 words since I started using this pair of tools. I love it.
If you want to get started, read on.
The First Detail: Running pandoc
Once you have installed it, running pandoc can be as simple as this:
$ pandoc --read=markdown --write=html -o results.html my_awesome_post.markdown
If you don’t specify -o, then pandoc just writes the resulting HTML to standard output, so you can do whatever you want with it.
Of course, pandoc has many, many options. You’ll have to ask it for all the details: pandoc --help.
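To make “decide the format later” concrete, here is a minimal sketch of a wrapper that turns one Markdown file into HTML, RTF, or TeX. The function name convert_post and the PANDOC variable are hypothetical; html, rtf, and latex are pandoc output format names, and the wrapper deliberately refuses anything else.

```shell
#!/bin/bash
# Hypothetical wrapper: convert a Markdown post to one of a few formats,
# naming the output file after the input file.

# Override PANDOC to point at a specific pandoc binary if needed.
PANDOC="${PANDOC:-pandoc}"

convert_post() {
  local post="$1" format="$2" extension
  case "$format" in
    html)  extension=html ;;
    rtf)   extension=rtf ;;
    latex) extension=tex ;;
    *)
      echo "unsupported format: $format" >&2
      return 1
      ;;
  esac
  # ${post%.*} strips the file extension, e.g. ".markdown".
  "$PANDOC" --read=markdown --write="$format" \
    -o "${post%.*}.$extension" "$post"
}

# Example calls:
#   convert_post my_awesome_post.markdown html    # writes my_awesome_post.html
#   convert_post my_awesome_post.markdown latex   # writes my_awesome_post.tex
```

By default pandoc produces a document fragment; if you want a complete file with its own header, adding --standalone to the command is probably what you need.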
The Gory Details: Setting up Atom with pandoc
Install both Atom and pandoc, then read on.
Install markdown-preview-plus
Press Cmd+Shift+P, then start typing “Install packages”.
Search for markdown-preview-plus, then press Install.
After the package installs, choose Settings.
You might find it useful to change two settings:
- Pandoc Options: Path. Set this to wherever you installed pandoc. In my case, $HOME/.cabal/bin/pandoc. If you don’t know, then open a Terminal window/command prompt and type which pandoc.
- Pandoc Options: Commandline Arguments. I like to have my blog’s CSS available to the preview, so I choose --css /Users/jbrains/Workspaces/jekyll-jbrains.ca/php/css/main.css. Great, no?
Now you can close the Settings tab, open any file containing Markdown, and press Ctrl+Shift+M to get the preview window. If, for some reason, that doesn’t work, then do this from Atom’s menu: Packages > Markdown Preview Plus > Toggle Preview.
Now you can use pandoc from within Atom.
More Gory Details: Using pandoc with jekyll
Unless you speak bash, you should probably skip this section.
To use pandoc with jekyll requires creating a “relocatable binary”. Fortunately, the installation instructions for pandoc make that relatively clear. Once you have a relocatable binary, you can include it in your blog and make it available to your blog’s “production environment” (or server, if you prefer). I publish to OpenShift.
Ship pandoc with your jekyll blog
Since I want to schedule future posts to be published, I want my production environment (OpenShift) to rebuild my blog every so often. This means that I need a way to rebuild my blog in production, then a way to schedule the “rebuild” command. If you plan to publish each post as you go, then you don’t have to worry about this part at all: instead, just run jekyll build whenever you want to republish.
Tell jekyll to use pandoc
I added the following to my configuration file (the default name is _config.yml).
# Bundle pandoc for production. See .openshift/action_hooks/deploy
markdown: pandoc
pandoc:
  format: html5
  extensions: [smart, mathjax]
Rebuild!
I put pandoc in the folder libs at the root of my blog project. On OpenShift, this corresponds to $OPENSHIFT_REPO_DIR/libs/, so that I know how to use pandoc to build my site on OpenShift.
I created an action hook script at .openshift/action_hooks/deploy that rebuilds my blog using jekyll.
#!/bin/bash
echo "$(date) Regenerating blog." >> $OPENSHIFT_LOG_DIR/regenerating-blog.log
export LC_CTYPE=en_CA.UTF-8
export LANG=en_CA.UTF-8
export LD_LIBRARY_PATH=/opt/rh/mysql55/root/usr/lib64:/opt/rh/ror40/root/usr/lib64:/opt/rh/ruby200/root/usr/lib64
export PATH=$OPENSHIFT_REPO_DIR/libs:~/.gem/bin:/opt/rh/ruby200/root/usr/bin:$PATH
gem install bundle
cd $OPENSHIFT_REPO_DIR && bundle install && bundle exec jekyll build --config _config.yml
I found out the hard way that I have to set the locale explicitly by setting LC_CTYPE and LANG. I did not enjoy that little surprise.
I put $OPENSHIFT_REPO_DIR/libs/ on the system path so that bundle exec jekyll would find pandoc and be able to run it.
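Because the build only works when jekyll can find pandoc on the PATH, a guard like the following could make the hook fail early with a clear message instead of partway through a build. This is a sketch, not part of the script above; the function name require_pandoc is hypothetical.

```shell
#!/bin/bash
# Hypothetical guard: stop with a readable error when pandoc cannot be
# found, rather than letting the jekyll build fail later.

require_pandoc() {
  # command -v succeeds only when the shell can resolve the name.
  if ! command -v pandoc >/dev/null 2>&1; then
    echo "pandoc not found on PATH: $PATH" >&2
    return 1
  fi
}

# In the deploy hook, this could run just before the build:
#   require_pandoc || exit 1
```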
Rebuild Every Hour!
I added the Cron cartridge to my OpenShift application and added this little script to .openshift/cron/hourly:
#!/bin/bash
echo "$(date) Regenerating blog." >> $OPENSHIFT_LOG_DIR/regenerating-blog.log
cd $OPENSHIFT_REPO_DIR && bundle install && bundle exec jekyll build
I now realize that I should change the deploy script to just invoke this one, but my bash skills aren’t excellent, so I’ll wait until I need to learn how to do that, then I’ll do it.
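One way to remove the duplication between the deploy hook and the cron script is to move the shared steps into a function that both scripts source. This is a minimal sketch, assuming a hypothetical file libs/regenerate-blog.sh and a hypothetical function name regenerate_blog. The environment setup is abbreviated to the locale and PATH lines discussed above; the remaining exports from the deploy hook would need to move here too.

```shell
#!/bin/bash
# Hypothetical shared file: $OPENSHIFT_REPO_DIR/libs/regenerate-blog.sh
# Both the deploy hook and the hourly cron script could source this file
# and then call regenerate_blog.

regenerate_blog() {
  echo "$(date) Regenerating blog." >> "$OPENSHIFT_LOG_DIR/regenerating-blog.log"

  # Set the locale explicitly, as the deploy hook does.
  export LC_CTYPE=en_CA.UTF-8
  export LANG=en_CA.UTF-8

  # Put the bundled pandoc on the PATH so that jekyll can find it.
  export PATH="$OPENSHIFT_REPO_DIR/libs:$PATH"

  cd "$OPENSHIFT_REPO_DIR" && bundle install && bundle exec jekyll build --config _config.yml
}

# Each script would then shrink to something like:
#   . "$OPENSHIFT_REPO_DIR/libs/regenerate-blog.sh"
#   regenerate_blog
```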
Now, I can write posts with a date in the future and my production environment will publish them within an hour of the date I specify. Nice, no?
References
John Gruber, “Markdown: Syntax”. A complete reference of the basics of Markdown. You’ll probably need to use this for a few weeks while you get used to how to write in Markdown.
Adam Pritchard, “Markdown Cheatsheet”. You can never have too many good references for writing in Markdown.
pandoc.org. Get started here with pandoc. The installation page looks pretty straightforward, especially for Windows.
atom.io. A fine text editor, especially if you speak JavaScript and like to hack, but even if you don’t.
Markdown Preview Plus for Atom. The plugin I use to get a WYSIWYGish view of what I’m composing in Atom.
leanpub.com. When you want to publish what you’ve written and charge money, Leanpub provides a low-friction way to make that happen.
jekyllrb.com. When you want to publish what you’ve written to the web, Jekyll provides a very simple way to make that happen, and without Wordpress.
blog.jbrains.ca. A blog I publish with jekyll that uses pandoc to produce HTML from my posts composed in Markdown.