Category Archives: geek

Gradually evolving my data entry interfaces

I’m interested in Quantified Self as a way to make better decisions through data. When I come up with a question I want to explore, I usually start off tracking things on paper or in a spreadsheet. This means I can get started quickly, and I can check whether the question is useful enough to invest in further.


I track my clothes to make it easier to simplify my wardrobe, and to guide my purchases.

I started by writing down dates and clothing combinations on an index card in the morning. Since my index card was unlined and my brain is pretty fuzzy early in the day, I occasionally had problems with incorrect dates or items not matching up. Eventually, I built a small Rails application (Quantified Awesome) to keep track of the clothes for me. Adding pictures made it easier to select the right item. Over time, I added little conveniences like the ability to display or sort by the last time I wore something.

I often find myself backdating entries, so maybe tracking my clothes isn’t as easy or as fun as it could be. I wonder if making it more prescriptive (“Pick one of these three outfits, or select what you’re going to wear”) would help, or maybe integrating it more into my morning routine.


I track my time to guide my activity decisions and remind me of how I used the time.

I used apps on my phone to track time for a few months. I started with Time Recording and a few categories, adding more as I went along. When the number of categories got to be a bit unwieldy, I moved on to Tap Log so that I could organize the buttons into a menu. Since it didn’t have the built-in time analysis I liked about Time Recording, I added time analysis tools to Quantified Awesome. After I added other features to Quantified Awesome, I shifted to using it as my time tracking and analysis tool.

For a while, I tracked time by bringing up the Quantified Awesome web interface on my phone and typing in a substring of a category name. Then I decided to look into building Emacs integration so that I could automatically clock in from my to-do list. To speed up time tracking on my phone, I used Tasker to create a menu of my most common time categories. Since fiddling with Tasker on my phone was time-consuming and a little annoying, I eventually shifted to using Tasker together with HTML and JavaScript. That way, I could edit my HTML file in Emacs, copy it onto my phone through Dropbox, and get my handy menu of buttons. Using Tasker also allowed me to code extra behaviour such as turning off WiFi when I go for a walk.

My next step is probably to build more time visualizations so I can see the shifts from day to day, week to week.


I track groceries so that I can make better decisions at the supermarket and so that I can get a sense of the balance and patterns of our consumption.

I started by typing in my receipts manually, but it was a little boring. I paid a virtual assistant to enter the data from my scanned receipts. This worked out to be better than the receipt scanning companies that were out there, since I could get line-item detail in a spreadsheet shared in Dropbox. I periodically reviewed the data, fixing errors and analyzing totals.

After some time doing this and quite a few errors in the data, I decided to build my own interface for entering data more reliably. Now that I’ve built my neat interactive interface, I find it faster (and more fun!) to enter the data myself than to scan it and send it over. I’ve been digging into visualizing the data with D3 too.

Here’s a quick demo:

My next step is probably to build a grocery list interface for it. We’re currently using OurGroceries because it syncs well between my husband’s phone and mine, but I should be able to use either straight AJAX or WebSockets to get the synchronization part working.

So those are a few examples of how I slowly improve my tracking systems, rounding off rough edges and making things a little bit simpler for myself. Web programming is super helpful for me. Backend tools like Ruby on Rails allow me to build my own tracking tools, and front-end tools like JavaScript allow me to create personalized interfaces and visualizations.

I tend to code the next step of improvements only when something annoys me enough for me to do something about it or when a question makes me curious enough to want to investigate it. I’ve been deliberately working on my personal projects more often, though, and that might lead to more of these little improvements. We’ll see!

Exploring our grocery numbers

Analyzing my grocery data is more challenging than analyzing my time data. There’s a lot more data cleanup needed. I have to figure out obscure line items on old receipts and catch typos in both names and numbers. Then there’s figuring out how much I want to combine different items and how much I want to keep them separate.

For example, milk has different receipt item names depending on the item (size, brand, type) and the store. If I want to know how much we’ve spent on milk, I’ll use the total for all of them. But if I want to get a sense of the price history, it makes sense to track each receipt item type separately. I do this by keeping the receipt name (fixing typos as I review my data) and mapping these receipt names to a friendly name I set for myself. This way, the line “HOMO 4LI” on my receipt gets turned into “Milk” in my report. Come to think of it, maybe I should change it to “Milk, 4 L, Homogenized”…
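
As a rough sketch of that two-level mapping (in Python rather than the Rails code behind Quantified Awesome, and with made-up receipt lines), the receipt name stays on each line item and a separate lookup supplies the friendly name. Only the “HOMO 4LI” to “Milk” pair comes from this post; every other name, price, and function here is hypothetical.

```python
from collections import defaultdict

# Hypothetical mapping from receipt item names to friendly names.
# Only "HOMO 4LI" -> "Milk" comes from the post; the rest is made up.
FRIENDLY_NAMES = {
    "HOMO 4LI": "Milk",
    "2% MILK 2L": "Milk",
    "LRG EGGS 18": "Eggs",
}

# Hypothetical line items: (date, receipt name, total in cents).
LINE_ITEMS = [
    ("2015-06-01", "HOMO 4LI", 497),
    ("2015-06-08", "2% MILK 2L", 329),
    ("2015-06-08", "LRG EGGS 18", 427),
    ("2015-06-15", "HOMO 4LI", 497),
]


def totals_by_friendly_name(items):
    """Combine receipt items that share a friendly name (the spending question)."""
    totals = defaultdict(int)
    for _date, receipt_name, cents in items:
        # Fall back to the raw receipt name if no mapping exists yet.
        totals[FRIENDLY_NAMES.get(receipt_name, receipt_name)] += cents
    return dict(totals)


def price_history(items, receipt_name):
    """Keep one receipt item type separate (the price history question)."""
    return [(date, cents) for date, name, cents in items if name == receipt_name]


spending = totals_by_friendly_name(LINE_ITEMS)
milk_history = price_history(LINE_ITEMS, "HOMO 4LI")
```

Keeping amounts in cents avoids rounding surprises when adding them up.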

Categories are handy for reporting too. Because of the ad-hoc way I created receipt item mappings and assigned them to categories, I ended up with inconsistent categorization. Some types of toilet paper were in the Supplies category, and some types were in the Other category. I manually reviewed the category assignments and I think I’ve gotten them sorted out.

Anyway, analyzing my data from 2013-07-01 to 2015-07-01, I see that we spend an average of $80 per week on groceries, which sounds about right. Some of the receipts are missing and there are almost certainly other little errors in the data, but this should give me the overall picture.

I’m still trying to figure out a good way to visualize the data in order to answer the questions I’m curious about, so here are my notes along the way. In each chart, the X axis is the date, the Y axis is the total cost on that day, and the color shows how the price compares to the average price (lower than average = blue, higher = orange).


[Screenshot: grocery analysis chart]

Milk consumption is pretty straightforward. Every week, we use around 0.6 bags of milk (~2.4L) – more when J- and her friends are over (teenagers!). The price of milk has stayed at $4.97 per 4L, except for the time we bought a slightly more expensive type of milk (~Oct 2013) and the time in June 2014 when a smaller size was on sale, so we picked up one of those instead.



We used to buy extra-large eggs, but the supermarket close to us stopped carrying 18-packs of those, so we switched to 18-packs of large eggs instead.

Extra-large eggs

[Screenshot: grocery analysis chart]

Large eggs

[Screenshot: grocery analysis chart]

The price of large eggs is stable at $4.27 for 18. We use ~11 eggs a week.

Things we buy when they’re on sale

Canned tomatoes

We stock up on canned tomatoes when they go on sale, since they’re easy to store.

[Screenshot: grocery analysis chart]


We probably use ~3 cans a month. The sale price has drifted up from $0.88 to $0.97, while the regular price is a little bit over $1.50.


[Screenshot: grocery analysis chart]

We haven’t bought butter at full price in two years. The sale price for unsalted butter tends to be between $2.77 and $3.33, while the regular price is $6+.



I like strawberries, but I stopped buying them for a long time because they seemed like such an indulgence and the sweetness tended to be hit-or-miss. This year, I gave myself permission to splurge on strawberries in season.

[Screenshot: grocery analysis chart]


We seem to go through banana phases. When we hit banana overload, we stop for a while.

[Screenshot: grocery analysis chart]


The colours here are just due to floating point imprecision. Bananas have actually stayed the same price for the past two years ($1.26/kg).
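
One way to keep that kind of rounding noise from changing the colour is to compare against the average with a small tolerance. This is only a sketch in Python, not the D3 code behind the charts. The function name, the half-cent tolerance, and the "neutral" label are all assumptions.

```python
def price_colour(unit_price, average_price, tolerance=0.005):
    """Classify a price against the average, ignoring sub-cent differences.

    Colour names follow the charts: lower than average = blue,
    higher = orange. "neutral" is an assumed label for "same price".
    """
    difference = unit_price - average_price
    if abs(difference) <= tolerance:
        return "neutral"
    return "blue" if difference < 0 else "orange"


# Averaging many identical floating-point prices can leave the result
# a hair away from the price itself, so compare with a tolerance.
prices = [1.26] * 10
average = sum(prices) / len(prices)
colours = {price_colour(p, average) for p in prices}
```

With a tolerance like this, a price that never changes gets a single colour instead of flickering between blue and orange.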


We often get gala apples:

[Screenshot: grocery analysis chart]

We like picking up ambrosia apples on the rare occasions they’re available. Last winter was a good one for ambrosia apple availability.

[Screenshot: grocery analysis chart]


Whole chickens

[Screenshot: grocery analysis chart]

Lots of whole chickens lately, because of the rotisserie.

Chicken quarters

[Screenshot: grocery analysis chart]

Our main protein, although we also buy a fair bit of beef and pork, and chicken drumsticks/thighs when they’re on sale.

There’s more I haven’t explored yet, but I figured I’d put together these little observations along the way. =)



Emacs Hangout June 2015

Times may be off by a little bit, sorry!

Boo, I accidentally browsed in the Hangouts window before copying the text chat, so no copy of the text chat this time… =|

Finding missing dates in PostgreSQL

My analytics numbers were way off from what I expected them to be. When I did a day-by-day comparison of my numbers and the reference set of numbers, I realized that a few weeks of data were missing from the year of data I was analyzing – a couple of days here, two weeks there, and so on. I manually identified the missing dates so that I could backfill the data. Since this was the second time I ran into that problem, though, I realized I needed a better way to catch this error and identify gaps.

Initially, I verified the number of days in my PostgreSQL database table with a SQL statement along the lines of:

SELECT year, month, COUNT(*) AS num_days FROM
(SELECT date_part('year', day_ts) AS year,
 date_part('month', day_ts) AS month,
 day_ts FROM (SELECT DISTINCT day_ts FROM table_with_data) AS temp) AS temp2
GROUP BY year, month
ORDER BY year, month

I checked each row to see if it matched the number of days in the month.
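
That row-by-row check can also be automated. Here is a minimal sketch in Python, assuming the distinct days have already been pulled out of the table. `short_months` is a hypothetical helper, and the dates are made up.

```python
import calendar
from collections import Counter
from datetime import date


def short_months(days):
    """Return {(year, month): (found, expected)} for months with too few distinct days."""
    counts = Counter((d.year, d.month) for d in set(days))
    return {
        (year, month): (found, calendar.monthrange(year, month)[1])
        for (year, month), found in sorted(counts.items())
        if found < calendar.monthrange(year, month)[1]
    }


# Hypothetical data: all of January 2015, but February is missing the 10th.
days = [date(2015, 1, d) for d in range(1, 32)]
days += [date(2015, 2, d) for d in range(1, 29) if d != 10]
gaps = short_months(days)
```

One limitation of counting: a month with no rows at all never shows up in the counts, so it can’t be flagged this way.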

It turns out there’s an even better way to look for missing dates. PostgreSQL has a generate_series function, so you can do something like this:

SELECT missing_date
FROM generate_series('2015-01-01'::date, CURRENT_DATE - INTERVAL '1 day', INTERVAL '1 day') missing_date
WHERE missing_date NOT IN (SELECT DISTINCT day_ts FROM table_with_data)
ORDER BY missing_date

Neat, huh?
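
The same idea (generate every date in the range, then keep the ones with no data) can be sketched outside the database too. This Python version is only an illustration with made-up dates. `missing_dates` is a hypothetical helper, not part of the query above.

```python
from datetime import date, timedelta


def missing_dates(present, start, end):
    """Return every date from start to end (inclusive) that is not in present."""
    present = set(present)
    total_days = (end - start).days + 1
    candidates = (start + timedelta(days=i) for i in range(total_days))
    return [d for d in candidates if d not in present]


# Hypothetical data: a week with two gaps.
have = [date(2015, 1, d) for d in (1, 2, 4, 5, 7)]
gaps = missing_dates(have, date(2015, 1, 1), date(2015, 1, 7))
```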

Using your own Emacs Lisp functions in Org Mode table calculations: easier dosage totals

UPDATE 2015-06-17: In the comments below, Will points out that if you use proper dates ([yyyy-mm-dd] instead of yyyy-mm-dd), Org will do the date arithmetic for you. Neato! Here’s what Will said:

Hi Sacha. Did you know you can do date arithmetic directly on org’s inactive or active timestamps? It can even give you an answer in fractional days if the time of day is different in the two timestamps:

| Start                  | End                    | Interval |
| [2015-06-16 Tue]       | [2015-06-23 Tue]       |        7 |
| <2015-06-13 Sat>       | <2015-06-15 Mon>       |        2 |
| [2015-06-10 Wed 20:00] | [2015-06-17 Wed 08:00] |      6.5 |
#+TBLFM: $3=$2 - $1 

Here’s my previous convoluted way of doing things… =)

I recently wrote about calculating how many doses you need to buy using an Org Mode table. On reflection, it’s easier and more flexible to do that calculation using an Emacs Lisp function instead of writing a function that processes and outputs entire tables.

First, we define a function that calculates the number of days between two dates, including the dates given. I put this in my Emacs config.

(defun my/org-days-between (start end)
  "Number of days between START and END.
This includes START and END."
  (1+ (- (calendar-absolute-from-gregorian (org-date-to-gregorian end))
         (calendar-absolute-from-gregorian (org-date-to-gregorian start)))))

Here’s the revised table. I moved the “Needed” column to the left of the medication type because this makes it much easier to read and confirm.

| Needed | Type         | Per day |      Start |        End | Stock |
|     30 | Medication A |       2 | 2015-06-16 | 2015-06-30 |     0 |
|      2 | Medication B |     0.1 | 2015-06-16 | 2015-06-30 |   0.2 |
#+TBLFM: @2$1..@>$1='(ceiling (- (* (my/org-days-between $4 $5) (string-to-number $3)) (string-to-number $6)))

C-c C-c on the #+TBLFM: line updates the values in column 1.

@2$1..@>$1 means the cells from the second row (@2) to the last row (@>) in the first column ($1). The ' prefix tells Org to evaluate the following expression as Emacs Lisp, substituting the values as specified ($4 is the fourth column’s value, etc.).

The table formula calculates the value of the first column (Needed) based on how many you need per day, the dates given (inclusive), and how much you already have in stock. It rounds numbers up by using the ceiling function.
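
To double-check the arithmetic outside Org, here is the same calculation as a small Python sketch. The function names are made up for illustration, and the inputs are the two rows from the table above.

```python
import math
from datetime import date


def days_between_inclusive(start, end):
    """Number of days from start to end, counting both endpoints."""
    return (end - start).days + 1


def doses_needed(per_day, start, end, stock):
    """Doses to buy: daily dose times days, minus stock, rounded up."""
    return math.ceil(per_day * days_between_inclusive(start, end) - stock)


start, end = date(2015, 6, 16), date(2015, 6, 30)
needed_a = doses_needed(2, start, end, 0)      # Medication A row
needed_b = doses_needed(0.1, start, end, 0.2)  # Medication B row
```

June 16 to June 30 inclusive is 15 days, so Medication A needs 2 × 15 = 30 doses, and Medication B needs 0.1 × 15 − 0.2 = 1.3, which rounds up to 2.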

Because this equation uses the values from each row, the start and end date must be filled in for all rows. To quickly duplicate values downwards, set org-table-copy-increment to nil, then use S-return (shift-return) in the table cell you want to copy. Keep typing S-return to copy more.

This treats the calculation inputs as strings, so I used string-to-number to convert some of them to numbers for multiplication and subtraction. If you are only dealing with numbers, you can have Org convert them automatically by using the ;N flag, like this:

| Needed | Type         | Per day | Days | Stock |
|      6 | Medication A |       2 |    3 |     0 |
|      1 | Medication B |     0.1 |    3 |   0.2 |
#+TBLFM: @2$1..@>$1='(ceiling (- (* $3 $4) $5));N

Providing values to functions in org-capture-templates

Over at the Emacs StackExchange, Raam Dev asked how to define functions for org-capture-templates that could take arguments. For example, it would be useful to have a function that creates a Ledger entry for the specified account. Functions used in org-capture-templates can’t take any arguments, but you can use property lists instead. Here’s the answer I posted.

You can specify your own properties in the property list for the template, and then you can access those properties with plist-get and org-capture-plist. Here’s a brief example:


(defun my/expense-template ()
  (format "Hello world %s" (plist-get org-capture-plist :account)))
(setq org-capture-templates '(("x" "Test entry 1" plain
                               (file "~/tmp/test.txt")
                               (function my/expense-template)
                               :account "Account:Bank")
                              ("y" "Test entry 2" plain
                               (file "~/tmp/test.txt")
                               (function my/expense-template)
                               :account "Account:AnotherBank")))

I hope that helps!