Classic Logic

Automatic Markdown Table Generation

Because I had nothing better to do on a Saturday night, I decided to create a list of every book I’ve read, to post on my personal blog. I know what you’re thinking.

I wanted to include a sequence number, the title of the book, and its author. The title would also double as a link to the book’s Amazon page. Because this is structured data, a table made more sense than an ordered (or unordered) list of books.

Posts on my personal blog are written in markdown, and I soon discovered that the markdown table syntax, though less verbose than the equivalent HTML, quickly became unwieldy.

For instance, here is what two rows would look like:

|1|[The Kite Runner](http://a.co/3WRFolt){:target='_blank'}|Khaled Hosseini|
|2|[The Testament](http://a.co/1sOjiAj){:target='_blank'}|John Grisham|
  • The | separates the columns. Note that there are only 3 columns of information here – the sequence number, title, and author.
  • The title is also a hyperlink that opens in a new tab.

Though at its heart there are only three pieces of information (title, author, and link), there is much more I need to type so that the whole thing renders properly.

The sequence number should not be something I need to provide. It’s just a run of increasing numbers, but AFAIK there’s no way to generate it automatically; I have to manually type in “1”, “2”, etc. at the start of every row, as I’ve done above. If I decide to add, remove, or re-order books, the entire numbering sequence is thrown off.

Adding {:target='_blank'} to every line (so that links open in a new tab) was repetitive.

This way of writing tables scales poorly. It’s verbose, and by the time I wrote 5 rows of content, I was officially annoyed. There has to be a better way.

I searched the internet, and there was apparently no way to automatically number table rows in Jekyll or markdown (the underlying technologies of my personal blog). Or maybe I didn’t search thoroughly enough.

I did come across a technique that involved HTML and jQuery, but I was not tempted to sprinkle HTML in my markdown file.

Writing a plugin for Jekyll or markdown seemed like the logical thing to do, but I barely knew Ruby.

The Solution

After some contemplation, I decided to script my own solution.

  • A text file stores one book per line: the title, author, and link to put in the table.
  • A script reads this text file and outputs the markdown table code for its contents.
  • I paste this table code into my markdown post.

The input file for this program is formatted as shown below:

The Kite Runner | Khaled Hosseini | http://a.co/3WRFolt
The Testament | John Grisham |http://a.co/1sOjiAj

The title, author, and link are separated by |; spaces would not work as a separator because titles and author names themselves contain spaces. The script trims the leading and trailing whitespace of each field.
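As a small illustration of that splitting and trimming step, here is a minimal sketch. The helper name parse_line is hypothetical (it is not part of the script); it only shows how one input line breaks down into its three fields.

```python
def parse_line(line):
    """Split one 'title | author | link' line into three trimmed fields."""
    # str.split('|') yields the raw fields; str.strip() then drops the
    # surrounding whitespace, including the trailing newline.
    # parse_line is a hypothetical helper name used only for illustration.
    title, author, link = (field.strip() for field in line.split('|'))
    return title, author, link


print(parse_line("The Testament | John Grisham |http://a.co/1sOjiAj\n"))
# prints ('The Testament', 'John Grisham', 'http://a.co/1sOjiAj')
```

Note that the uneven spacing around the second | in the sample line makes no difference to the result.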

It’s much easier to write the table contents this way than the monstrosity that’s the markdown table syntax.

Now, here’s the script, written in Python:

# tablegen.py
# python 3.6
# https://github.com/arjunkrishnababu96/py-md-table-generator

import argparse
from string import Template

def main():
  parser = argparse.ArgumentParser()
  parser.add_argument("file", help="path to input file")
  args = parser.parse_args()

  t = Template("|$sno|[$title]($link){:target='_blank'}|$author|")

  with open(args.file) as f:
    for i, line in enumerate(f, 1):
      title, auth, link = line.split('|')
      s = t.substitute(sno=i,
                      title=title.strip(),
                      link=link.strip(),
                      author=auth.strip())
      print(s)

if __name__ == '__main__':
  main()
  1. Takes the input file name as a command-line argument.
  2. Creates a template string for the markdown table row, with $-prefixed placeholders for the moving parts (line 13).
  3. Reads the input file line by line, using enumerate to count the lines from 1 (line 16).
  4. Obtains the title, author, and link by splitting the input line on | (line 17).
  5. Substitutes the auto-generated sequence number, title, link, and author into the placeholders of the template string (lines 18 - 21).
  6. Prints the generated string to the standard output (line 22).
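One thing the steps above imply: a blank line or a line with a missing | in the input file makes the three-way unpacking fail with a ValueError. The following is a hedged sketch of a more forgiving variant, not the script from the repo. The function name generate_rows and the choice to skip blank lines are my assumptions.

```python
from string import Template

# Same row template as in the script above.
ROW = Template("|$sno|[$title]($link){:target='_blank'}|$author|")


def generate_rows(lines):
    """Yield one markdown table row per non-blank input line.

    generate_rows is a hypothetical name; skipping blank lines is an
    assumption about what a friendlier version might do.
    """
    sno = 0
    for lineno, line in enumerate(lines, 1):
        if not line.strip():
            continue  # blank lines do not consume a sequence number
        fields = [field.strip() for field in line.split('|')]
        if len(fields) != 3:
            # Report the offending input line instead of a bare unpacking error.
            raise ValueError(f"line {lineno}: expected 3 fields, got {len(fields)}")
        title, author, link = fields
        sno += 1
        yield ROW.substitute(sno=sno, title=title, link=link, author=author)


sample = [
    "The Kite Runner | Khaled Hosseini | http://a.co/3WRFolt\n",
    "\n",
    "The Testament | John Grisham |http://a.co/1sOjiAj\n",
]
for row in generate_rows(sample):
    print(row)
```

The sequence number is tracked separately from the input line number, so a skipped blank line does not leave a gap in the numbering.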

Assuming that our input file above is called read_books.txt, the output of the program would be:

$ python tablegen.py read_books.txt
|1|[The Kite Runner](http://a.co/3WRFolt){:target='_blank'}|Khaled Hosseini|
|2|[The Testament](http://a.co/1sOjiAj){:target='_blank'}|John Grisham|

Now, all I have to do is paste this output into my original markdown post.

Pros

  1. You only have to provide the 3 pieces of information.
  2. Because the numbering is done by the script, you can add, remove, and re-order books arbitrarily (and re-run the script).

Cons

  1. Copy-pasting the output may get annoying eventually.
  2. If you lose your original text file of input data, you’re screwed. This can be overcome by maintaining backups of the input file, keeping it under version control, and so on.

This is probably not the best solution to the issue, but it works well in my case. It’s definitely better than manually building up the table in markdown, especially if the table would grow or you need to add, remove, or re-order books arbitrarily.

At the moment it’s not a “general” application because the template string is hard-coded into the application itself.
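One possible way to generalize it, sketched here under my own assumptions rather than taken from the repo, is to pass the template string and the field names in as parameters. The function name render and its parameters are hypothetical.

```python
from string import Template


def render(lines, template, fields):
    """Yield one output row per input line.

    template -- a string.Template pattern; $sno is always available
    fields   -- placeholder names, in the order they appear on each line
    (render and both parameter names are hypothetical, for illustration.)
    """
    row = Template(template)
    for sno, line in enumerate(lines, 1):
        values = [value.strip() for value in line.split('|')]
        # Pair each field name with the value at the same position.
        yield row.substitute(dict(zip(fields, values)), sno=sno)


books = [
    "The Kite Runner | Khaled Hosseini | http://a.co/3WRFolt\n",
    "The Testament | John Grisham |http://a.co/1sOjiAj\n",
]
for row in render(books, "$sno. $title by $author", ["title", "author", "link"]):
    print(row)
# prints:
# 1. The Kite Runner by Khaled Hosseini
# 2. The Testament by John Grisham
```

The template and field list could then come from extra command-line options via argparse, which would leave the input file format unchanged.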

The code is hosted on my GitHub repo under the MIT License. Feel free to play with it and tailor it to your own needs. I’m curious to know what people would come up with.

Happy hacking!

PS: You can find the list of books here.