A Plain Text Workflow for Academic Writing with Atom

[Update: Since this post has a bigger audience than I’d anticipated, I’ve fixed some typos and added some details about Atom, along with short sections at the end on using Zotero with the Better BibTeX extension and fixing errors.]

Why Plain Text?

I recently decided to switch to plain text for my academic writing for a number of reasons. I wrote my first book and articles toward the second book in Scrivener and MS Word, but when I started the new book manuscript in earnest I found myself with new needs: seamless cross-platform compatibility (I got fed up with Apple’s hardware offerings, and Scrivener is between versions on Windows and macOS); the customizability and file organization I came to rely on in Scrivener; and, now that I may need to change citation styles depending on how the book and pieces are published, a citation management system, which is very difficult to integrate with Scrivener.

I got some great tips on Twitter from colleagues about how to start, and I especially recommend Dennis Tenen and Grant Wythoff’s guide to plain text at The Programming Historian. But after gathering those guides and resources (linked in the text and also gathered at the bottom here), there was still a bit to do to set things up and figure it out.

I thought this post would be helpful to colleagues who are interested in plain text (and share some combination of the needs I list above), but who’d like to see what the editor and workflow might look like before committing to a change.

Tools

First, one needs a programming editor. I’d intended to start with Sublime Text, which I’d used for programming in the past, but Moacir de Sá Pereira pointed me to the very similar Atom, which is free and open-source and, like Sublime, is cross-platform and has a robust ecosystem of packages for academic writers.

Half the setup work was just looking for the right packages for an academic writing (rather than coding) workflow. In Atom, I’ve installed, through the “install packages” tab in the settings,

  • wordcount (a running word count at the bottom of the window),
  • language-markdown (for syntax highlighting),
  • file-icons (highlights different file types in the file tree on the left),
  • date-time (for quickly date-stamping journal entries or blog posts), and
  • autocomplete-bibtex (more on this in a moment).
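The same packages can also be installed from the command line with apm, the package manager that ships with Atom. This is a minimal sketch rather than what I did: the function name install_writing_packages is hypothetical, the package names are the ones in the list above (with the BibTeX package under the name autocomplete-bibtex, as it is called elsewhere in this post), and it assumes apm is on your PATH.

```shell
# Hypothetical helper: install the packages listed above in one go.
install_writing_packages() {
  # apm ships with Atom; stop early if it is not available.
  if ! command -v apm >/dev/null 2>&1; then
    echo "apm not found; install the packages from Atom's settings instead" >&2
    return 1
  fi
  apm install wordcount language-markdown file-icons date-time autocomplete-bibtex
}
```

Calling install_writing_packages once on each machine is one way to keep the setup the same across platforms.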

Helpful settings for me included:

  • soft wrap (wrapping lines isn’t something coders automatically want),
  • autosave enabled (beautifully, saves every time you click away from a document; this option is inside the autosave package settings),
  • autocomplete delayed (as I only use it with autocomplete-bibtex),
  • Palatino font, and
  • “remove empty panes” disabled, which preserves my Scrivener-inspired layout of document panes across sessions or document swap-outs.

It looks like this:


Atom with an open writing project

From left to right: the folder structure for the project, a main document window for draft text, a pane on the upper right for reference (notes on sources or outline notes), and one on the lower right for to-do lists, project-wide searches, or double-checking the bibliography file. All this can be customized however one likes (from arrangement to color to snippets), which is the main advantage of using a programming editor instead of a word processing program. I’d be shocked if one couldn’t arrive just as quickly at the setup of one’s dreams in Sublime, vim, emacs, or one of many other editors. It was surprisingly easy to incorporate all the features I had used in Scrivener, and then some.

(A programming editor like Atom also makes it very easy to work across several files mainly from the keyboard. Shortcuts let you navigate between panes, open a file by name, or arrow around the file tree to rename, move, or duplicate files. And a command palette, cmd-shift-P, does a fuzzy search for any command whose keyboard shortcut you don’t know.)

In addition to the programming editor, one also needs a citation manager that can export to .bib format, such as Zotero with the Better BibTeX extension (more on these below), as well as the command-line document conversion tool that I use at the end of the workflow, Pandoc. A LaTeX suite, such as MiKTeX, makes it possible to typeset documents in .pdf with Pandoc, or to customize .pdf styling after having Pandoc convert to .tex.

Drafting

Markdown

I write in the center pane, using the Markdown syntax, which is like a simplified HTML, but uses punctuation marks and spacing to denote formatting. It’s quite intuitive: surrounding a word or phrase with asterisks puts it in italics, headers are lines that begin with a number of #, followed by a space, and links, images, lists, and more can be added. (More in this Markdown Cheatsheet.) Markdown documents are all plain text files, which makes them extremely flexible and portable. I’ll show an example of the syntax in a moment.

Footnotes

Footnotes are included in the version of Markdown that Pandoc translates, and they’re great. There are two kinds, and they both use a caret and brackets: an ^[Inline footnote] can come in the middle of a paragraph, while a [^Labeled] footnote refers to a longer footnote whose text is in its own paragraph(s). It’s all numbered automatically at the end. All the details for footnotes are in Pandoc’s footnote documentation.

Citation

This part is both the most powerful and the most complicated aspect of the workflow. First, you gather sources in a citation manager and sync it to a .bib file. In Markdown generally, you get each source’s unique citation key from your reference manager, and the Markdown syntax is [@Warner2013, 55]. It works like parenthetical documentation that gets styled appropriately later.

Atom’s autocomplete-bibtex package lets one specify the .bib file for the project, and then it suggests keys as you type. Markdown’s citation format allows you to add words such as “see” or page numbers or ranges inside the brackets, include a bunch of citations inside a reference footnote, or suppress an author name mentioned in the sentence (using -@) as appropriate. See Pandoc Citation Documentation for more.

Markdown Example

# Sample Article Title

A footnote can be written as part of the paragraph.^[This is my inline footnote] And my paragraph continues.

Long footnotes can use a unique label for digressions about *Aesthetic Theory*.[^Adornonote]

Citations just use the key and page number [@Warner2013, 55]. If you're doing a footnote citation style, it mixes fine with inline and labeled footnotes.

<!--HTML-style comments work, too, for notes to self, using cmd-'/' -->

[^Adornonote]: In [-@Adorno1970], Theodor Adorno describes

So this is what things look like for writing. I write each section of each chapter in a separate document in my own organizational scheme, but that’s very flexible. I could use version control (Atom is designed to work very well with git), but for now my institutional Box account syncs between machines across platforms and offers some basic file history.

Processing

When I’m ready to share my work as a .pdf or .docx, I have to leave Atom and head to Pandoc on the command line. Because I work with several .md files at a time, I start by combining the files in a folder with cat * > fulldraft.md, which makes the basic file I’ll work with. To that fulldraft.md file, I’ll add metadata (in the similar-to-Markdown YAML format) to the beginning:

---
title: 'Test Markdown Document'
author: Scott Selisker
bibliography: library.bib
csl: chicago-note-bibliography-with-ibid.csl
---

There are many fields one can populate here, such as the abstract or even the whole .bib file, and Pandoc will read them and feed them into the final document (see Pandoc YAML metadata documentation). The last two lines here tell Pandoc to use my citation manager’s file library.bib as the bibliography, and the CSL (Citation Style Language) style sheet “chicago-note-bibliography-with-ibid.csl” to define the citation style. One downloads style sheets from a repository linked through CitationStyles.org. Literature scholars can, for instance, find different MLA editions, different variants and editions of Chicago, and a few journals’ house citation styles.
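One way to script the combining and metadata steps is sketched below, under a few assumptions: the function name build_fulldraft is hypothetical, the section files sit in the current folder with a .md extension, and their filenames sort in the order they should appear. It differs from a plain cat * in that it picks up only Markdown files (so library.bib and the .csl file are left out) and skips a fulldraft.md left over from an earlier run.

```shell
# Hypothetical helper: rebuild fulldraft.md from the .md files in this folder.
build_fulldraft() {
  out=fulldraft.md
  tmp=$(mktemp)

  # Start with the YAML metadata block shown above.
  cat > "$tmp" <<'EOF'
---
title: 'Test Markdown Document'
author: Scott Selisker
bibliography: library.bib
csl: chicago-note-bibliography-with-ibid.csl
---

EOF

  # Append each section in filename order, skipping the previous output.
  for f in *.md; do
    if [ ! -f "$f" ] || [ "$f" = "$out" ]; then
      continue
    fi
    cat "$f" >> "$tmp"
    printf '\n' >> "$tmp"   # extra newline so one section cannot run into the next
  done

  mv "$tmp" "$out"
}
```

Running build_fulldraft in the project folder regenerates fulldraft.md from scratch each time, so the metadata block never gets added twice.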

Next, I’m ready to have Pandoc do its jobs. Delightfully, it converts from plain text to a format of my choosing (.pdf, .docx, .html, .tex, .odt, etc.), turns Markdown into styling in that file, and turns my citation tags into fully formatted references with the bibliography specified above. All in this command:

pandoc -s -o seliskerdraft.pdf --filter=pandoc-citeproc fulldraft.md

Translation: Pandoc, make a (-s)tandalone (-o)utput file called seliskerdraft.pdf, using the pandoc-citeproc filter (the package for bibliography functionality) and do it with the input file fulldraft.md. And here it is:

Output as .pdf

This looks great, but Pandoc can do much, much more. If I want a .docx file to share instead, I type:

pandoc -s -o seliskerdraft.docx --filter=pandoc-citeproc fulldraft.md

Output as .docx

Now, to really live the dream, let’s put it in MLA format in one command. We can override the citation style sheet (that YAML line from above) by specifying it on the command line, to spit out this article in MLA style rather than Chicago:

pandoc -s -o seliskerdraftmla.pdf --filter=pandoc-citeproc --csl=modern-language-association-7th-edition-with-url.csl fulldraft.md

Output as .pdf in MLA format
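The Pandoc commands above can be wrapped in a small script so that one call produces both formats. This is only a sketch: the function names run and build_drafts and the DRY_RUN variable are hypothetical, and the flags are exactly the ones used above. On Pandoc 2.11 and later, citation processing is built in, so there the flag to use is --citeproc in place of --filter=pandoc-citeproc.

```shell
# Run a command, or only print it when DRY_RUN is set to a non-empty value.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Build fulldraft.md as .pdf and .docx. An optional first argument names a
# .csl file that overrides the style given in the YAML metadata.
build_drafts() {
  csl=${1:-}
  for ext in pdf docx; do
    if [ -n "$csl" ]; then
      run pandoc -s -o "seliskerdraft.$ext" --filter=pandoc-citeproc --csl="$csl" fulldraft.md
    else
      run pandoc -s -o "seliskerdraft.$ext" --filter=pandoc-citeproc fulldraft.md
    fi
  done
}
```

Setting DRY_RUN=1 before calling build_drafts prints the commands without running them, which is a cheap way to check the flags before a long .pdf build. Calling build_drafts modern-language-association-7th-edition-with-url.csl produces the MLA versions.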

I hope folks find this useful. Happy writing.

More: Zotero and Better BibTeX for Zotero

I’m new to citation management, so I was still shopping around for a reference manager as I finished the first version of this post. I was hoping to be able to use the academic-community-based Zotero all along, and now that Chris Forster has told me about the Better BibTeX for Zotero extension, I’m happy to say it does everything I need it to. I also think it’s the best option for this workflow. It let me simplify the default citation key assignment to [auth][year], and it automatically adds letters to the end of the citation keys of the fiercely productive and the very commonly surnamed. (After changing the key format, one has to select-all in the Zotero library, right-click, and select “Refresh BibTeX key.”) Better BibTeX for Zotero also allows for automatic export, so the library.bib file stays updated in the writing project folder, and Zotero’s library also syncs via the cloud. Zotero’s extension for Chrome is also in my experience the best at grabbing correct and complete bibliography entries from publisher webpages, journal pages, library search pages, and elsewhere.

Using the simplified citation keys means that I’ll likely just type out the full citation keys when I’m citing something in my writing, and I can double-check them by looking at an open Zotero window. But for longer or harder-to-remember key formats, there are Atom packages for Zotero-citations (by the author of Better BibTeX for Zotero), the aforementioned autocomplete-bibtex, and Zotero-picker, each of which uses a different approach to streamlining the lookup process.
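Since I type keys by hand, a quick check that every key in a draft actually exists in the .bib file can catch typos before Pandoc does. Here is a minimal sketch, assuming keys made of letters, digits, and a few punctuation marks, and .bib entries that open with a line like @book{Warner2013, as in a typical BibTeX export. The function name missing_keys is hypothetical, and the simple pattern will also pick up other uses of @ in the draft, such as e-mail addresses.

```shell
# Hypothetical helper: print citation keys used in a draft but absent
# from a .bib file.  Usage: missing_keys fulldraft.md library.bib
missing_keys() {
  draft=$1
  bib=$2

  # Keys cited in the draft: whatever follows an @, minus trailing punctuation.
  cited=$(grep -o '@[A-Za-z0-9_][A-Za-z0-9_:.-]*' "$draft" \
    | sed 's/^@//; s/[.:-]*$//' | sort -u)

  # Keys defined in the .bib file: the text between "{" and the first comma
  # on each entry's opening line.
  defined=$(grep -o '^@[A-Za-z]*{[^,]*' "$bib" | sed 's/^[^{]*{//' | sort -u)

  # Report each cited key that has no exact match among the defined keys.
  for key in $cited; do
    printf '%s\n' "$defined" | grep -qxF -- "$key" || printf '%s\n' "$key"
  done
}
```

An empty result from missing_keys fulldraft.md library.bib means every key in the draft was found in the library file.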

Small Errors

I’ve noticed one vaguely field-specific hitch in the otherwise near-magical Pandoc: footnote formatting when a quotation ends a sentence, which is common in literary studies and may depend on one’s writing style. Given the combination ...end of my quote" [@Citation]., Pandoc renders parenthetical styles correctly, and footnoted styles correctly except that the sequence ". appears before the footnote superscript. This is easy to account for in the .md file if you know you’re using a footnote style, or to find-and-replace in a .docx or in a .tex file on the way to a .pdf. It might also be the result of a user error of some kind. At the same time, I think it stands as a fair example of the small sort of thing one is likely to have to fix manually on occasion when using these powerful and customizable open-source tools, whose free-as-in-beer and -as-in-speech nature makes them well worth using and supporting.
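For the .md route, the find-and-replace can be done with sed before handing the file to Pandoc. This is a sketch under assumptions, not a tested fix for every case: the function name move_period_inside_quote is hypothetical, it only handles straight double quotes followed by a space and a bracketed citation, and it should only be used when the output will be in a footnote style, since it moves the period ahead of the citation. I have not checked how every Pandoc version spaces the resulting note.

```shell
# Hypothetical helper: rewrite   quote" [@Key].   as   quote." [@Key]
# Reads Markdown on standard input and writes the result to standard output.
move_period_inside_quote() {
  # Match a closing quote, a space, a bracketed citation containing @, and
  # a period; put the period back inside the quotation mark.
  sed -E 's/" (\[[^]]*@[^]]*\])\./." \1/g'
}
```

For example, move_period_inside_quote < fulldraft.md > fulldraft-notes.md writes a footnote-ready copy and leaves the original file untouched.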

Additional Resources

A few links I found useful in figuring this out: I’ll mention again Tenen and Wythoff’s Sustainable Authorship in Plain Text Using Pandoc and Markdown. For those who want to do version control or work seriously with data and charts in a setup like this, I recommend Kieran Healy’s The Plain Person’s Guide to Plain Text Social Science, and Moacir de Sá Pereira’s guides to Atom for his JavaScripting English Major course: The Atom Programming Environment and Atom Help, both of which address version control.

It took me a while to figure out that Markdown formatting questions are all answered in the Pandoc Manual. There are many Markdown flavors, but the thing that matters most is Pandoc’s implementation of Markdown, which is also the best one for academic writers’ needs.

For Atom, there’s also this discussion of Atom packages for academic writers, which helped me get started choosing packages for myself.

And, finally, for those starting with citation management, I was delighted to find the reasonably functional text2bib, an online tool that does its level best to convert a copied-and-pasted bibliography into .bib format for easy importing into a reference manager. Its interface lets you double-check the entries one at a time or download the .bib file and correct it in the reference manager.