Getting data from Org Mode tables

Posted: - Modified: | emacs, org

Org Mode is an amazingly powerful package for Emacs. I’ve been learning a lot about how to use its support for plain-text tables and spreadsheet calculations.

Using table data in Emacs Lisp with the :var argument

For example, I wanted to be able to define my abbreviations in an Org Mode table in my config. I remembered coming across this technique a few weeks ago, but I couldn’t find the webpage with the code. It turned out to be simple to write from scratch. Here’s the plain text I added to my config.

#+NAME: abbrev
| Base  | Expansion                             |
|-------+---------------------------------------|
| bc    | because                               |
| wo    | without                               |
| wi    | with                                  |
| ex    | For example,                          |
| qm    | sacha@sachachua.com                   |
| qe    | http://sachachua.com/dotemacs         |
| qw    | http://sachachua.com/                 |
| qb    | http://sachachua.com/blog/            |
| qc    | http://sachachua.com/blog/emacs-chat/ |

#+begin_src emacs-lisp :exports code :var data=abbrev
(mapc (lambda (x) (define-global-abbrev (car x) (cadr x))) data)
#+end_src

The :var data=abbrev argument to the Emacs Lisp source block is where all the magic happens. Here, it takes the data from the table named “abbrev” (which I set using #+NAME: before the table) and makes it available to the code. Emacs evaluates that data when the code is tangled (or exported) to my configuration. The code that’s in my Sacha.el looks like this:

(let ((data (quote (("bc" "because")
                    ("wo" "without")
                    ("wi" "with")
                    ("ex" "For example,")
                    ("email" "sacha@sachachua.com")
                    ("dote" "http://sachachua.com/dotemacs")
                    ("web" "http://sachachua.com/")
                    ("blog" "http://sachachua.com/blog/")
                    ("ec" "http://sachachua.com/blog/emacs-chat/")))))
  (mapc (lambda (x) (define-global-abbrev (car x) (cadr x))) data) )

Looking up data with org-lookup-first, org-lookup-last, and org-lookup-all

You can do more complex things with Org tables, too. Inspired by Eric Boyd’s talk on his Epic Quest of Awesome (which he based on Steve Kamb‘s), I started putting together my own. I made a list of little achievements, guessed at the years, and assigned arbitrary experience points.

The achievements table had rows like this:

Approximate date Category XP Description ID
2014 Life 50 Became a Canadian citizen – link L_CAN
2014 Programming 20 Used NodeJS and AngularJS for a client project – link P_NOD
2014 Programming 5 Pulled information out of Evernote

I wanted to summarize the points by year: points gained, total points, level (according to a lookup table based on D&D experience points), and description. The lookup table was structured like this:

#+TBLNAME: levels
| Total XP | Level | Adjective             |
|----------+-------+-----------------------|
|        0 |     1 | trained-initiate      |
|     1000 |     2 | experienced           |
|     2250 |     3 | savvy                 |
|     3750 |     4 | veteran               |
|     5500 |     5 | unusually experienced |

Now for the summary table. I created rows for different years, and then I used Org Mode to fill in the rest. (Org Mode! Wow.)

| Year | Points gained | Cumulative points | Level | Adjective        |
|------+---------------+-------------------+-------+------------------|
| 1997 |             0 |                 0 |     1 | trained-initiate |
| 1998 |            10 |                10 |     1 | trained-initiate |
| 1999 |            50 |                60 |     1 | trained-initiate |
| 2000 |            50 |               110 |     1 | trained-initiate |
| 2001 |           100 |               210 |     1 | trained-initiate |
| 2002 |            60 |               270 |     1 | trained-initiate |
| 2003 |           245 |               515 |     1 | trained-initiate |
| 2004 |           115 |               630 |     1 | trained-initiate |
| 2005 |           140 |               770 |     1 | trained-initiate |
| 2006 |            60 |               830 |     1 | trained-initiate |
| 2007 |           270 |              1100 |     2 | experienced      |
| 2008 |           290 |              1390 |     2 | experienced      |
| 2009 |           205 |              1595 |     2 | experienced      |
| 2010 |           215 |              1810 |     2 | experienced      |
| 2011 |           115 |              1925 |     2 | experienced      |
| 2012 |           355 |              2280 |     3 | savvy            |
| 2013 |           290 |              2570 |     3 | savvy            |
| 2014 |           350 |              2920 |     3 | savvy            |
| 2015 |            45 |              2965 |     3 | savvy            |
#+TBLFM: $2='(calc-eval (format "vsum(%s)" (vconcat (org-lookup-all $1 '(remote(accomplishments,@2$1..@>$1)) '(remote(accomplishments,@2$3..@>$3))))))::$3=vsum(@2$2..@+0$2)::$4='(org-lookup-last $3 '(remote(levels,@2$1..@>$1)) '(remote(levels,@2$2..@>$2)) '>=);N::$5='(org-lookup-last $3 '(remote(levels,@2$1..@>$1)) '(remote(levels,@2$3..@>$3)) '>=);L

The TBLFM (table formula) line is very long, so let me break it down.

Points gained:

(calc-eval
 (format "vsum(%s)"
         (vconcat
          (org-lookup-all
           $1
           '(remote(accomplishments,@2$1..@>$1))
           '(remote(accomplishments,@2$3..@>$3))))))

This uses org-lookup-all to look up the value of the first column ($1) in the accomplishments table, from the second row to the last row @2..@>, looking in the first column ($1). It returns the values from the third column of the matching rows ($3). This is then passed through calc’s vsum function to calculate the sum.

Cumulative points: vsum(@2$2..@+0$2) is the sum of the second column $2 from the second row @2 to the current row @+0.

Level: This uses org-lookup-last to find the last value where the operator function returns true. In this case, testing the level from column $3 against each of the values in the levels table’s column $1 while the given level is greater than or equal to the value from levels. When it finds the last matching row, it returns the $2 second column from it. ;N means treat everything as a number.

org-lookup-first is like org-lookup-last, but it returns the first matching row.

Adjective: This one works like Level does, but it returns the value from column $3 instead. I found that it converted the return values to 0 if I used ;N, so I used ;L instead.

Passing data to R or other external programs

Of course, you’re not limited to things that Emacs can do. I wanted to summarize the data in graphs, so here’s what I did.

#+RESULTS: category_analysis

#+name: category_analysis
#+begin_src R :var data=accomplishments :exports both :results graphics :file quest_category.png :height 300
library(plyr)
library(ggplot2)
categories <- ddply(data, c("Category"), summarize, Points=sum(XP))
cat_sorted <- transform(categories, Category=reorder(Category, Points))
plot <- ggplot(data = cat_sorted, aes(x = Category, y = Points))
plot <- plot + geom_bar(stat="identity")
plot <- plot + geom_text(aes(label = Points, x = Category, y = Points + 10, hjust=0))
plot <- plot + scale_y_continuous(expand=c(0,70))
plot <- plot + coord_flip()
print(plot)
#+end_src

I like including source code in published pages for fellow geeks, but having the results come first gives people more context for the source block. So I named the source block using the #+name: directive and defined a #+RESULTS: directive before it. The source block used the :var argument to bring the data in from the accomplishments table. With R blocks, the data becomes available as a data frame that you can then do interesting things with. I used the :file argument to save the output to quest_category.png.

Those are a few ways that you can get data out of Org Mode tables and into Emacs Lisp, other Org Mode tables, or external programs. As I learn more about Org Mode, I find myself using it for more of the things that I used to use Microsoft Excel for – tracking, analyzing, and even graphing. I found it a little difficult to piece together what I needed to do from the manuals and examples on the Web, so I hope this explanation will help you (and that it’ll help me when I forget the syntax, as I’m sure I will). If you come up with something really neat that uses Org Mode tables, tell me what you’ve figured out!

You can comment with Disqus or you can e-mail me at sacha@sachachua.com.