Getting R and ggplot2 to work in Emacs Org Mode Babel blocks; also, tracking the number of TODOs

I started tracking the number of tasks I had in Org Mode so that I could find out if my TODO list tended to shrink or grow. It was easy to write a function in Emacs Lisp to count the number of tasks in different states and summarize them in a table.

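(defun sacha/org-count-tasks-by-status ()
  "Return an Org table row counting agenda tasks by TODO state.
When called interactively, insert the row at point instead."
  (interactive)
  (let ((counts (make-hash-table :test 'equal))
        (today (format-time-string "%Y-%m-%d"))
        values total output)
    ;; Tally the TODO keyword of every heading in the agenda files.
    (org-map-entries
     (lambda ()
       (let ((status (elt (org-heading-components) 2)))
         (when status
           (puthash status (1+ (gethash status counts 0)) counts))))
     nil
     'agenda)
    ;; Look up each state in a fixed column order; missing states count as 0.
    (setq values (mapcar (lambda (x) (gethash x counts 0))
                         '("DONE" "STARTED" "TODO" "WAITING" "DELEGATED" "CANCELLED" "SOMEDAY")))
    (setq total (apply '+ values))
    ;; Row layout: date, one column per state, total, percentage done.
    (setq output
          (concat "| " today " | "
                  (mapconcat 'number-to-string values " | ")
                  " | "
                  (number-to-string total)
                  " | "
                  (number-to-string
                   ;; Avoid dividing by zero when there are no tasks yet.
                   (if (zerop total)
                       0
                     (round (/ (* 100.0 (car values)) total))))
                  "% |"))
    (if (called-interactively-p 'any)
        (insert output)
      output)))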
(sacha/org-count-tasks-by-status)

I ran this code over several days. Here are my results as of 2014-05-01:

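|       Date | DONE | START. | TODO | WAIT. | DELEG. | CANC. | SOMEDAY | Total | % done | + done | +canc. | + total | + t - d - c | Note                       |
|------------+------+--------+------+-------+--------+-------+---------+-------+--------+--------+--------+---------+-------------+----------------------------|
| 2014-04-16 | 1104 |      1 |  403 |     3 |      1 |   104 |      35 |  1651 |    67% |        |        |         |             |                            |
| 2014-04-17 | 1257 |      0 |  114 |     4 |      1 |   171 |     107 |  1654 |    76% |    153 |     67 |       3 |        -217 | Lots of trimming           |
| 2014-04-18 | 1292 |      0 |   74 |     4 |      5 |   183 |     100 |  1658 |    78% |     35 |     12 |       4 |         -43 | A little bit more trimming |
| 2014-04-20 | 1305 |      0 |   80 |     4 |      5 |   183 |     100 |  1677 |    78% |     13 |      0 |      19 |           6 |                            |
| 2014-04-21 | 1311 |      1 |   78 |     4 |      4 |   184 |      99 |  1681 |    78% |      6 |      1 |       4 |          -3 |                            |
| 2014-04-22 | 1313 |      2 |   75 |     4 |      4 |   184 |      99 |  1681 |    78% |      2 |      0 |       0 |          -2 |                            |
| 2014-04-23 | 1369 |      4 |   66 |     4 |      5 |   186 |     101 |  1735 |    79% |     56 |      2 |      54 |          -4 | Added sharing/index.org    |
| 2014-04-24 | 1371 |      3 |   69 |     4 |      5 |   186 |     101 |  1739 |    79% |      2 |      0 |       4 |           2 |                            |
| 2014-04-25 | 1379 |      3 |   60 |     3 |      5 |   189 |     103 |  1742 |    79% |      8 |      3 |       3 |          -8 |                            |
| 2014-04-26 | 1384 |      3 |   65 |     3 |      5 |   192 |     103 |  1755 |    79% |      5 |      3 |      13 |           5 |                            |
| 2014-04-27 | 1389 |      2 |   66 |     3 |      5 |   192 |     103 |  1760 |    79% |      5 |      0 |       5 |           0 |                            |
| 2014-04-28 | 1396 |      3 |   67 |     3 |      5 |   192 |     103 |  1769 |    79% |      7 |      0 |       9 |           2 |                            |
| 2014-04-29 | 1396 |      3 |   67 |     3 |      5 |   192 |     103 |  1769 |    79% |      0 |      0 |       0 |           0 |                            |
| 2014-04-30 | 1404 |      4 |   70 |     4 |      5 |   192 |     103 |  1782 |    79% |      8 |      0 |      13 |           5 |                            |
| 2014-05-01 | 1413 |      4 |   80 |     3 |      4 |   193 |     103 |  1800 |    79% |      9 |      1 |      18 |           8 |                            |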

Here’s the source for that table:

#+NAME: burndown
#+RESULTS:
|       Date | DONE | START. | TODO | WAIT. | DELEG. | CANC. | SOMEDAY | Total | % done | + done | +canc. | + total | + t - d - c | Note                       |
|------------+------+--------+------+-------+--------+-------+---------+-------+--------+--------+--------+---------+-------------+----------------------------|
| 2014-04-16 | 1104 |      1 |  403 |     3 |      1 |   104 |      35 |  1651 |    67% |        |        |         |             |                            |
| 2014-04-17 | 1257 |      0 |  114 |     4 |      1 |   171 |     107 |  1654 |    76% |    153 |     67 |       3 |        -217 | Lots of trimming           |
| 2014-04-18 | 1292 |      0 |   74 |     4 |      5 |   183 |     100 |  1658 |    78% |     35 |     12 |       4 |         -43 | A little bit more trimming |
| 2014-04-20 | 1305 |      0 |   80 |     4 |      5 |   183 |     100 |  1677 |    78% |     13 |      0 |      19 |           6 |                            |
| 2014-04-21 | 1311 |      1 |   78 |     4 |      4 |   184 |      99 |  1681 |    78% |      6 |      1 |       4 |          -3 |                            |
| 2014-04-22 | 1313 |      2 |   75 |     4 |      4 |   184 |      99 |  1681 |    78% |      2 |      0 |       0 |          -2 |                            |
| 2014-04-23 | 1369 |      4 |   66 |     4 |      5 |   186 |     101 |  1735 |    79% |     56 |      2 |      54 |          -4 | Added sharing/index.org    |
| 2014-04-24 | 1371 |      3 |   69 |     4 |      5 |   186 |     101 |  1739 |    79% |      2 |      0 |       4 |           2 |                            |
| 2014-04-25 | 1379 |      3 |   60 |     3 |      5 |   189 |     103 |  1742 |    79% |      8 |      3 |       3 |          -8 |                            |
| 2014-04-26 | 1384 |      3 |   65 |     3 |      5 |   192 |     103 |  1755 |    79% |      5 |      3 |      13 |           5 |                            |
| 2014-04-27 | 1389 |      2 |   66 |     3 |      5 |   192 |     103 |  1760 |    79% |      5 |      0 |       5 |           0 |                            |
| 2014-04-28 | 1396 |      3 |   67 |     3 |      5 |   192 |     103 |  1769 |    79% |      7 |      0 |       9 |           2 |                            |
| 2014-04-29 | 1396 |      3 |   67 |     3 |      5 |   192 |     103 |  1769 |    79% |      0 |      0 |       0 |           0 |                            |
| 2014-04-30 | 1404 |      4 |   70 |     4 |      5 |   192 |     103 |  1782 |    79% |      8 |      0 |      13 |           5 |                            |
| 2014-05-01 | 1413 |      4 |   80 |     3 |      4 |   193 |     103 |  1800 |    79% |      9 |      1 |      18 |           8 |                            |
#+TBLFM: @3$11..@>$11=$2-@-1$2::@3$13..@>$13=$9-@-1$9::@3$14..@>$14=$13-$11-($7-@-1$7)::@3$12..@>$12=$7-@-1$7
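
If Org's table formula syntax is unfamiliar: the #+TBLFM line fills each "+" column by subtracting the previous row's value from the current row's. Here is a minimal R sketch of the same arithmetic, using the first three rows of the table. The data frame, the helper delta(), and the column names (CANC, plus_done, net_open and so on) are made up for this illustration and are not part of the Org setup.

```r
# First three rows of the burndown table (values copied from the table above).
burndown <- data.frame(
  Date  = c("2014-04-16", "2014-04-17", "2014-04-18"),
  DONE  = c(1104, 1257, 1292),
  CANC  = c(104, 171, 183),
  Total = c(1651, 1654, 1658)
)

# diff() returns each value minus the one before it. The first row has no
# previous row, so pad it with NA (the Org table leaves those cells blank).
delta <- function(x) c(NA, diff(x))

burndown$plus_done  <- delta(burndown$DONE)   # "+ done"
burndown$plus_canc  <- delta(burndown$CANC)   # "+canc."
burndown$plus_total <- delta(burndown$Total)  # "+ total"

# "+ t - d - c": change in total minus newly done minus newly cancelled.
burndown$net_open <- burndown$plus_total - burndown$plus_done - burndown$plus_canc

print(burndown)
```

A negative value in the last column means the pile of open tasks shrank that day, and a positive one means it grew.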

I wanted to graph this with Gnuplot, but it turns out that Gnuplot is difficult to integrate with Emacs on Microsoft Windows. I gave up after half an hour of poking at it, since search results indicated there were long-standing problems with how Gnuplot got input from Emacs. Besides, I’d been meaning to learn more R anyway, and R is more powerful when it comes to statistics and data visualization.

Getting R to work with Org Mode babel blocks in Emacs on Windows was a challenge. Here are some of the things I ran into.

The first step was easy: Add R to the list of languages I could evaluate in a source block (I already had dot and ditaa from previous experiments).

(org-babel-do-load-languages
 'org-babel-load-languages
 '((dot . t)
   (ditaa . t) 
   (R . t)))

But my code didn’t execute at all, even when I was trying something that printed out results instead of drawing images. I got a little lost trying to dig into org-babel-execute:R with edebug, eventually ending up in comint.el. The real solution was even easier. I had incorrectly set inferior-R-program-name to the path of R in my configuration, which made M-x R work but which meant that Emacs was looking in the wrong place for the options to pass to R (which Org Babel relied on). The fix was to leave inferior-R-program-name at its default value (Rterm) and make sure that my system path included both the bin directory and the bin\x64 directory.
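
One way to double-check the path part is from R itself. This is only a sketch: it assumes you run it in an R console started from an ordinary command prompt, and Emacs may see a different path depending on how it was launched.

```r
# Sys.which() returns the full path of each executable found on the system
# path, or an empty string for any that cannot be found.
found <- Sys.which(c("R", "Rterm"))
print(found)

# List the individual path entries to eyeball whether both the bin and
# bin\x64 directories are there.
path_entries <- strsplit(Sys.getenv("PATH"), .Platform$path.sep, fixed = TRUE)[[1]]
print(path_entries)
```

An empty string next to Rterm would suggest that the directory containing it is missing from the path.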

Then I had to pick up the basics of R again. It took me a little time to figure out that I needed to parse the columns I pulled in from Org, using strptime to convert the date column and as.numeric to convert the numbers. Eventually, I got it to plot some results with the regular plot command.

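# `data` is the burndown table; this assumes Org Babel passes it in,
# for example through a :var header argument on the source block.
dates <- strptime(as.character(data$Date), "%Y-%m-%d")
tasks_done <- as.numeric(data$DONE)
# "CANC." is the cancelled column from the table header.
tasks_uncancelled <- as.numeric(data$Total) - as.numeric(data$CANC.)
df <- data.frame(dates, tasks_done, tasks_uncancelled)
# Draw the uncancelled total first so the y-axis covers both series.
plot(x=dates, y=tasks_uncancelled, ylim=c(0,max(tasks_uncancelled)))
lines(x=dates, y=tasks_uncancelled, col="blue", type="o")
lines(x=dates, y=tasks_done, col="green", type="o")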

r-plot

I wanted prettier graphs, though. I installed the ggplot2 package and started figuring it out, but no matter what I did, I ended up with a blank white image instead of my graph. If I used M-x R instead of evaluating the src block, the code worked. Weird! Eventually I found out that adding print(...) around my ggplot made it display the image correctly. (A ggplot object is only drawn when it gets printed. The R console automatically prints the result of whatever you type, which is presumably why the same code worked in M-x R.) Yay! Now I had what I wanted.

library(ggplot2)
# Convert the columns passed in from the Org table.
dates <- strptime(as.character(data$Date), "%Y-%m-%d")
tasks_done <- as.numeric(data$DONE)
tasks_uncancelled <- as.numeric(data$Total) - as.numeric(data$CANC.)
df <- data.frame(dates, tasks_done, tasks_uncancelled)
# Green line: done tasks. Blue line: all tasks except cancelled ones.
p <- ggplot(data=df, aes(x=dates, y=tasks_done, ymin=0)) +
  geom_line(color="#009900") + geom_point() +
  geom_line(aes(y=tasks_uncancelled), color="blue") +
  geom_point(aes(y=tasks_uncancelled))
# The plot has to be printed explicitly to show up in the image.
print(p)

r-graph

The blue line represents the total number of tasks (except for the cancelled ones), and the green line represents tasks that are done.

Here’s something that looks a little more like a burn down chart, since it shows just the number of things to be done:

library(ggplot2)
dates <- strptime(as.character(data$Date), "%Y-%m-%d")
# Remaining = everything that is neither done nor cancelled.
tasks_remaining <- as.numeric(data$Total) - as.numeric(data$CANC.) - as.numeric(data$DONE)
df <- data.frame(dates, tasks_remaining)
p <- ggplot(data=df, aes(x=dates, y=tasks_remaining, ymin=0)) +
  geom_line(color="#009900") + geom_point()
print(p)

r-graph-2

The drastic decline there is me realizing that I had lots of tasks that were no longer relevant, not me being super-productive. =)

As it turns out, I tend to add new tasks at about the rate that I finish them (or slightly more). I think this is okay. It means I’m working on things that have next steps, and next steps, and steps beyond that. If I add more tasks, that gives me more variety to choose from. Besides, I have a lot of repetitive tasks, so those never get marked as DONE over here.
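
The table bears this out once the big trim is set aside. Here is a rough R sketch that compares the 2014-04-18 and 2014-05-01 rows. The variable names are invented for the example, and it treats every increase in Total as a newly added task, which overstates things a little because the 2014-04-23 jump appears to come from adding sharing/index.org to the agenda.

```r
# Counts copied from the table: the first row after the trimming
# (2014-04-18) and the latest row (2014-05-01).
start <- c(done = 1292, cancelled = 183, total = 1658)
end   <- c(done = 1413, cancelled = 193, total = 1800)

change <- end - start
added    <- change[["total"]]                         # tasks that appeared
resolved <- change[["done"]] + change[["cancelled"]]  # tasks finished or cancelled

cat("added:", added, "resolved:", resolved, "net growth:", added - resolved, "\n")
```

Over those two weeks that works out to 142 tasks added against 131 resolved, so the list grew by 11.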

Anyway, cool! Now that I’ve gotten R to work on my system, you’ll probably see it in even more of these blog posts. =D Hooray for Org Babel and R!

Update 2014-05-09: Stephen suggested http://blogs.neuwirth.priv.at/software/2012/03/28/r-and-emacs-with-org-mode/ for more tips on setting up Org Mode with R and Emacs Speaks Statistics (ESS).

Quantified Awesome: Adding calendar heatmaps to categories

It’s amazing how little tweaks give you a whole new sense of the data. I’ve been using Cal-HeatMap to look at my blogging history. I figured I’d build it into Quantified Awesome to make it even easier to analyze how I spend my time. 1.9 hours later, here’s what I have. All totals are reported for the past 12-month period by default (as of this writing, July 19 2012 to July 19 2013, including the day’s activities), but it adjusts depending on the filter settings.

Here’s me working on the Quantified Awesome system:

image

Instead of just a table of log entries or a summary of numbers, I can see the gaps and sprints in my activity.

Here’s the one for Discretionary – Productive:

image

Pretty consistent, actually.

and Discretionary – Play:

image

February must’ve been when I had a new video game to tinker around with. Plenty of opportunities to relax.

Here’s my Business – Earn graph:

image

and Business – Build:

image

I’ve been biking pretty regularly, mostly on Tuesdays and Thursdays…

image

In contrast, I take the subway only if it’s winter or really rainy, if I’m going somewhere far or steeply uphill, or if my bike is flat (as it was yesterday).

image

Neato. I should definitely do this for groceries too, now that I’ve loaded my grocery receipts into Quantified Awesome! (No public link yet for that data, sorry. =) ) I also want to figure out how to speed things up enough so that I can do quartile analysis and then use that to colour the scale…
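
As a note on what the quartile step might look like, here is a small R sketch. The hours are invented sample values, not real Quantified Awesome data, and how the resulting boundaries would feed into the heatmap's colour scale is left open.

```r
# Hypothetical hours logged per day for one category.
hours <- c(0, 0.5, 1, 1.5, 2, 2.5, 3, 4, 6)

# Quartile boundaries: minimum, 25%, 50%, 75% and maximum.
breaks <- quantile(hours, probs = seq(0, 1, by = 0.25))

# Put each day into a bucket from 1 (lowest quarter) to 4 (highest quarter).
bucket <- cut(hours, breaks = breaks, include.lowest = TRUE, labels = FALSE)

print(data.frame(hours, bucket))
```

One caveat: cut() stops with an error if two boundaries are equal, which can happen when most days are zero, so real data may need the duplicate boundaries removed first.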

Calendar heatmaps for the win!

Mohiomap: A visual way to browse your Evernote notebook

Evernote is a great tool for taking notes, but sometimes searching and browsing those notes can get unwieldy if you have thousands of items. For example, searching my notebooks for “evernote” gets me more than 130 results, which look a little like this in Evernote’s desktop application:

image

This is great if I can narrow things down with notebooks, keywords, and tags, but wouldn’t it be nice to be able to explore better?

Christian Hirsch (who has been working on quite a few visual interfaces to wikis and knowledgebases) reached out to me about Mohiomap, which links up with your Evernote notebook and lets you see it as an interactive map.

image

You can click on notes to navigate further and to see a preview in the left sidebar.

image

You can expand items without closing the previous ones, so it’s a handy tool for exploration. I like the way that they indicate the number of other entries with both a thicker line and a larger circle – the thicker lines are easier to follow when you’re starting from a node.

The trick with new tools is to figure out how you want to fit them into your workflow. Right now, Mohiomap is a visualization and search tool. What new questions can I ask with this interface? How can I use it to learn more?

  • Use Mohiomap to find related notes: I like the way it displays links to related notes. The notes are determined using the Evernote API, which seems to take the note source and tags into account. Related notes are difficult to find using the desktop application, so this might be a good way to explore when I’m writing blog posts.
  • Use Mohiomap when searching for something that will have hits in multiple notebooks, if I want to group by notebook: Mohio’s search interface organizes the first layer of results by notebook. If I used notebooks more, then this might be a good way to browse through my search results. I tend to use tags, though. Oh well!
  • Use Mohiomap to encourage myself to tag more, and to fix my tags. Mohiomap shows tags that are connected with each other, so that might be a way to identify overlapping tags. This is slightly less useful with a small result set (30 notes don’t have much overlap), but maybe it will become more useful later. It also lets you draw lines from notes to tags in order to add a tag to a note, and maybe this will evolve into more tagging features.

It looks like the first use (browsing through related notes) might be the most relevant for me. Let’s see how well Evernote’s recommendation algorithm works!

Other thoughts: Plus points for making the back button work and keeping graphs individually bookmarkable. =) I’d love to be able to add more search results, like viewing 50 or 100 at a time – or viewing a graph of the tags in my entire Evernote knowledgebase, which would be nifty. Dynamic force-directed networks can be disconcerting because of the motion. It might be great to have different views of it in addition to the current interface – maybe something more constrained like the way FreeMind or thebrain.com work?

UserVoice appears to be the place for suggestions related to Mohiomap. Looking forward to seeing this grow, and any other apps that visualize your data!

Visualization resources

One of my coworkers asked me if I knew interesting examples of visualizations. I mentioned quite a few sites and she found them super-helpful (like, give-Sacha-a-hug helpful! =) ). Just in case you find these handy: (no hugs required)

Flowing Data is one of my favourite blogs for data graphics inspiration. Data Visualization is cool, too.

IBM Many Eyes
This collaborative visualization project makes coming up with charts and graphs so much easier. Lots of data sets and lots of examples to explore, too. Note: don’t upload private data.

image
Protovis has a graphing library and a gallery of pretty examples. I’d love to play around with graphs like this. RaphaelJS has a few examples, too. Graphing libraries generally do.

Hans Rosling shows you can do play-by-play commentary for statistics and have people on the edge of their seats.

OKCupid visualizations are fascinating. It turns out that one can get all sorts of insights out of a massive online dating database. The blog posts are cleverly written and often include practical tips, like this one on profile picture attractiveness, camera types, flash, depth of field, and time of day. They have mind-boggling data. You may not want to open the blog posts in a school or work context, though.

What are your favourite sources for visualization inspiration?

Learning from my mood data

One of the unexpected benefits of switching my phone plan to something that includes unlimited international texting is that I can participate in nifty things like Experimonth, which is a month-long study about moods. I get regular text messages prompting me to rate my happiness on a scale of 1-10, and it graphs it for me. I can probably come up with similar graphs using KeepTrack and a bit of spreadsheet magic, but the convenience and the social data make this fun and interesting.

Here’s how my mood data stacks up so far:

image

I stay on a fairly even keel, with awesome happy experiences possibly any day of the week. Hmm, maybe I should track text notes too, so I can get a better handle on what causes the 10s or the 6s. It might also be interesting to combine the happiness ratings with my time analyses to see if there are any correlations.

Here are the results they’ve collected so far:

Visualization of my blog categories

This visualizes how often I blogged something with a tag in a given year, sorted by all-time popularity. There are more categories, but I skipped them. The height of each block represents how many blog posts I wrote in that category, while the different blocks represent the years, ending with 2010 at the far right. The graph reflects changing interests and recurring themes.

image

This visualizes some of the things I’ve been writing about in 2010. We’re only a month in, so the last line is pretty small, and in some cases (n < 4) not even visible.

image

Sparkline bar graphs created with Sparklines for Excel. Initial categories table created with the following SQL incantation:

select p.post_date, p.post_title, terms.name
from wp_posts p
  inner join wp_term_relationships tr on p.id=tr.object_id
  inner join wp_term_taxonomy tt on tr.term_taxonomy_id=tt.term_taxonomy_id
  inner join wp_terms terms on tt.term_id=terms.term_id
into outfile '/tmp/categories.csv';

then imported and tweaked in Microsoft Excel.
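
The counting step could also be done outside Excel. Here is a minimal R sketch of one way to go from that export to posts per category per year. It is not what was actually used here: the sample rows are invented, and it assumes the file has MySQL's default tab-separated layout with no header row (despite the .csv name).

```r
# Hypothetical rows in the export layout: post_date, post_title, category name.
sample_export <- paste(
  "2009-03-01 08:00:00\tPost A\temacs",
  "2009-07-15 08:00:00\tPost B\tsketches",
  "2010-01-05 08:00:00\tPost C\temacs",
  "2010-01-20 08:00:00\tPost D\temacs",
  sep = "\n"
)

# For the real export, replace text = ... with file = "/tmp/categories.csv".
posts <- read.delim(text = sample_export, header = FALSE,
                    col.names = c("post_date", "post_title", "category"),
                    stringsAsFactors = FALSE)

# The year is the first four characters of the timestamp.
posts$year <- substr(posts$post_date, 1, 4)

# Rows are categories, columns are years, cells are post counts.
counts <- table(posts$category, posts$year)

# Sort categories by all-time total, most popular first.
counts <- counts[order(rowSums(counts), decreasing = TRUE), , drop = FALSE]
print(counts)
```

Each row of the result is one category's counts by year, which is the series a sparkline bar graph would draw.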