Category Archives: development

Exploring neighbourhood libraries and other notes from the Toronto Public Library Hackathon

UPDATE 2015-11-18: I figured out how to make this entirely client-side, so you don’t have to run a separate server. First, install either Tampermonkey (Chrome) or Greasemonkey (Firefox). Then install the user script insert-visualize-link.user.js , and the Visualize link should appear next to the library branch options on Toronto Public Library search result pages. See the Github repository for more details.

Yay! My neighbourhood library visualization won at the Toronto Public Library hackathon. It added a Visualize link to the search results page which mapped the number of search results by branch. For example, here’s a visualization of a search that shows items matching “Avengers comics“.


It’s a handy way to see which branches you might want to go to so that you can browse through what’s there in person.

Librarians could also use it to help them plan their selections, since it’s easy to see the distribution across branches. For example, here’s the visualization for books in Tagalog.


The collections roughly match up with Wellbeing Toronto‘s data on Tagalog as the home language, although there are some areas that could probably use collections of their own.

tagalog census

Incidentally, I was delighted to learn that Von Totanes had done a detailed analysis of the library’s Filipino collections in the chapter he wrote in Filipinos in Canada: Disturbing Invisibility (Coloma, McElhinny, and Tungohan, 1992). Von sent me the chapter after I mentioned the hackathon on Facebook; yay people bumping into other people online!

Personally, I’m looking forward to using this visualization to see things like which branches have new videos. Videos released in the past year can only be borrowed in person – you can’t request them online – so it’s good to check branches regularly to see if they’re there. It would be even better if the library search engine had a filter for “On the shelf right now”, but in the meantime, this visualization tool gives me a good idea of our chances of picking up something new to watch while we’re folding laundry. =)


More notes will probably follow, but here are a few quick drawings:

2015-11-15b Tech - Exploring neighbourhood libraries -- index card #tpl #hackathon

2015-11-15b Tech – Exploring neighbourhood libraries – index card #tpl #hackathon

2015-11-15c Kaizen and the Toronto Public Library hackathon -- index card #tpl #hackathon #kaizen #improvement

2015-11-15c Kaizen and the Toronto Public Library hackathon – index card #tpl #hackathon #kaizen #improvement

2015-11-15d Other reflections from TPL hackathon -- index card #tpl #hackathon #introversion #prototyping #presenting

2015-11-15d Other reflections from TPL hackathon – index card #tpl #hackathon #introversion #prototyping #presenting

2015-11-15e Ideas for following up on TPL hackathon -- index card #prototyping #tpl #hackathon

2015-11-15e Ideas for following up on TPL hackathon – index card #prototyping #tpl #hackathon


The code works by extracting the branch names and totals on the left side of search pages and combining those with the locations of the branches (KML). I don’t really need the server component, so I’m thinking of rewriting the script so that it runs entirely client-side – maybe as a Chrome extension or as a user script. That way, other people can play with the idea without running their own server (and without my having to keep a server around), and we can try it out without waiting for the library to integrate it into their website. That said, it would be totally awesome to get it into the interface of the Toronto Public Library! We’ll just have to see if it can happen. =) Happy to chat with library geeks to get this sorted out.

It was fun working on this. W- decided to join me at the last minute, so it turned into a fun weekend of hanging out with my husband at the library. I wanted to keep my weekend flexible and low-key, so I decided not to go through the team matchmaking thing. W- found some comfy chairs in the corner of the area, I plugged in the long extension cord I brought, and we settled in.

I learned a lot from the hackathon mentors. In particular, I picked up some excellent search and RSS tips from Alan Harnum. You can’t search with a blank query, but he showed me how you can start with a text string, narrow the results using the facets on the left side, and then remove the text string from the query in order to end up with a search that uses only the facets. He also showed me that the RSS feed had extra information that wasn’t in the HTML source and that it could be paginated with URL parameters. Most of the RSS feeds I’d explored in the past were nonpaginated subsets of the information presented on the websites, so it was great to learn about the possibilities I had overlooked.

The faceted search was exactly what I needed to list recent videos even if I didn’t know what they were called, so I started thinking of fun tools that would make hunting for popular new videos easier. (There have been quite a few times when I’ve gone to a library at opening time so that I could snag a video that was marked as available the night before!) In addition to checking the specific item’s branch details to see where it was on the shelf and which copies were out on loan, I was also curious about whether we were checking the right library, or if other libraries were getting more new videos than our neighbourhood library was.

W- was curious about the Z39.50 protocol that lets you query a library catalogue. I showed him the little bits I’d figured out last week using yaz-client from the yaz package, and he started digging into the protocol reference. He figured out how to get it to output XML (format xml) and how to search by different attributes. I’m looking forward to reading his notes on that.

Me, I figured that there might be something interesting in the visualization of new videos and other items. I hadn’t played around a lot with geographic visualization, so it was a good excuse to pick up some skills. First, I needed to get the data into the right shape.

Step 1: Extract the data and test that I was reading it correctly

I usually find it easier to start with the data rather than visualizations. I like writing small data transformation functions and tests, since they don’t involve complex external libraries. (If you miss something important when coding a visualization, often nothing happens!)

I wrote a function to extract information from the branch CSV on the hackathon data page, using fast-csv to read it as an array of objects. I tested that with jasmine-node. Tiny, quick accomplishment.

Then I worked on extracting the branch result count from the search results page. This was just a matter of finding the right section, extracting the text, and converting the numbers. I saved a sample results page to my project and used cheerio to parse it. I decided not to hook it up to live search results until I figured out the visualization aspect. No sense in hitting the library website repeatedly or dealing with network delays.

Step 2: Make a simple map that shows library branches

I started with the Google Maps earthquake tutorial. The data I’d extracted had addresses but not coordinates. I tried using the Google geocoder, but with my rapid tests, I ran into rate limits pretty early. Then it occurred to me that with their interest in open data, the library was the sort of place that would probably have a file with branch coordinates in terms of latitude and longitude. The hackathon data page didn’t list any obvious matches, but a search for Toronto Public Library KML (an extension I remembered from W-‘s explorations with GPS and OpenStreetMap) turned up the file I wanted. I wrote a test to make sure this worked as I expected.

Step 3: Combine the data

At first I tried to combine the data on the client side, making one request for the branch information and another request for the results information. It got a bit confusing, though – I need to get the hang of using require in a from-scratch webpage. I decided the easiest way to try my idea out was to just make the server combine the data and return the GeoJSON that the tutorial showed how to visualize. That way, my client-side HTML and JS could stay simple.

Step 4: Fiddle with the visualization options

Decisions, decisions… Red was too negative. Blue and green were hard to see. W- suggested orange, and that worked out well with Google Maps’ colours. Logarithmic scale or linear scale? Based on a maximum? After experimenting with a bunch of options, I decided to go with a linear scale (calculated on the server), since it made sense for the marker for a branch with a thousand items to be significantly bigger than a branch with five hundred items. I played with this a bit until I came up with maximum and minimum sizes that made sense to me.

Step 5: Hook it up to live search data

I needed to pass the URL of the search results, and I knew I wanted to be able to call the visualization from the search results page itself. I used TamperMonkey to inject some Javascript into the Toronto Public Library webpage. The library website didn’t use JQuery, so I looked up the plain-vanilla Javascript way of selecting and modifying elements.

  .parentNode.querySelector('h3').innerHTML =
  'Library Branch <a target="_blank" style="color: white; ' +
  'text-decoration: underline" ' +
  'href="http://localhost:9000/viz.html?url=' +
  encodeURIComponent(location.href) + '">(Visualize)</a>';

Step 6: Tweak the interface

I wanted to display information on hover and filter search results on click. Most of the tutorials I saw focused on how to add event listeners to individual markers, but I eventually found an example that showed how to add a listener to and get the information from the event object. I also found out that you could add a title attribute and get a simple tooltip to display, which was great for confirming that I had the data all lined up properly.

Step 7: Cache the results

Testing with live data was a bit inconvenient because of occasional timeouts from the library website, so I decided to cache search results to the filesystem. I didn’t bother writing code for checking last modification time, since I knew it was just for demos and testing.

Step 8: Prettify the hover

The tooltip provided by title was a little bare, so I decided to spend some time figuring out how to make better information displays before taking screenshots for the presentation. I found an example that showed how to create and move an InfoWindow based on the event’s location instead of relying on marker information, so I used that to show the information with better formatting.

Step 9: Make the presentation

Here’s how I usually plan short presentations:

  1. Figure out the key message and the flow.
  2. Pick a target words-per-minute rate and come up with a word budget.
  3. Draft the script, checking it against my word budget.
  4. Read the script out loud a few times, checking for time, tone, and hard-to-say phrases.
  5. Annotate the script with notes on visual aids.
  6. Make visuals, take screenshots, etc.
  7. Record and edit short videos, splitting them up in Camtasia Studio by using markers so that I can control the pace of the video.
  8. Copy the script (or keywords) into the presenter’s notes.
  9. Test the script for time and flow, and revise as needed.

I considered two options for the flow. I could start with the personal use case (looking for new videos) and then expand from there, tying it into the library’s wider goals. That would be close to how I developed it. Or I could start with one of the hackathon challenges, establish that connection with the library’s goals, and then toss in my personal use case as a possibly amusing conclusion. After chatting about it with W- on the subway ride home from the library, I decided to start with the second approach. I figured that would make it easier for people to connect the dots in terms of relevance.

I used ~140wpm as my target, minus a bit of a buffer for demos and other things that could come up, so roughly 350 words for 3 minutes. I ran through the presentation a few times at home, clocking in at about 2:30. I tend to speak more quickly when I’m nervous, so I rehearsed with a slightly slower pace. That way, I could get a sense of what the pace should sound like. During the actual presentation, though, I was a teensy bit over time – there was a bit of unexpected applause. Also, even though I remembered to slow down, I didn’t breathe as well as I probalby should’ve; I still tend to breathe a little shallowly when I’m on stage. Maybe I should pick a lower WPM for presentations and add explicit breathing reminders. =)

I normally try to start with less material and then add details to fit the time. That way, I can easily adjust if I need to compress my talk, since I’ve added details in terms of priority. I initially had a hard time concisely expressing the problem statement and tying together the three examples I wanted to use, though. It took me a few tries to get things to fit into my word budget and flow in a way that made me happy.

Anyway, once I sorted out the script, I came up with some ideas for the visuals. I didn’t want a lot of words on the screen, since it’s hard to read and listen at the same time. Doodles work well for me. I sketched a few images and created a simple sequence. I took screenshots for the key parts I wanted to demonstrate, just in case I didn’t get around to doing a live demo or recording video. That way, I didn’t have to worry about scrambling to finish my presentation. I could start with something simple but presentable, and then I could add more frills if I had time.

Once the static slides were in place, I recorded and edited videos demonstrating the capabilities. Video is a nice way to give people a more real sense of how something works without risking as many technical issues as a live demo would.

I had started with just my regular resolution (1366×768 on my laptop) and a regular browser window, but the resulting video was not as sharp as it could have been. Since the presentation template had 4:3 aspect ratio, I redid the video with 1024×768 resolution and a full-screen browser in order to minimize the need for resizing.

I sped up boring parts of the video and added markers where I wanted to split it into slides. Camtasia Studio rendered the video into separate files based on my markers. I added the videos to individual slides, setting them to play automatically. I like the approach of splitting up videos onto separate slides because it allows me to narrate at my own pace instead of speeding up or slowing down to match the animation.

I copied the segments of my script to the presenter notes for each slide, and I used Presenter View to run through it a few more times so that I could check whether the pace worked and whether the visuals made sense. Seemed all right, yay!

Just in time, too. I had a quick lunch and headed off to the library for the conclusion of the hackathon.

There wsa a bit of time before the presentations started. I talked to Alan again to show him what I’d made, hear about what he had been working on, and pick his brain to figure out which terms might resonate with the internal jargon of the library – little things, like what they call the people who decide what kinds of books should be in which libraries, or what they call the things that libraries lend. (Items? Resources? Items.) Based on his feedback, I edited my script to change “library administrators” to “selection committees”. I don’t know if it made a difference, but it was a good excuse to learn more about the language people used.

I tested that the presentation displayed fine on the big screen, too. It turned out that the display was capable of widescreen input at a higher resolution than what I’d set, but 1024×768 was pretty safe and didn’t look too fuzzy, so I left it as it was. I used my presentation remote to flip through the slides while confirming that things looked okay from the back of the room (colours, size, important information not getting cut off by people’s heads, etc.). The hover text was a bit small, but it gave the general idea.

And then it was presentation time. I was third, which was great because once I finished, I could focus on other people’s presentations and learn from their ideas. Based on W-‘s cellphone video, it looks like I remembered to use the microphone so that the library could record, and I remembered to look up from my presenter notes and gesture from time to time (hard when you’re hidden behind the podium, but we do what we can!). I stayed pretty close to my script, but I hope I kept the script conversational enough that it sounded more like me instead of a book. I didn’t have the mental bandwidth to keep an eye on the timer in the center of the presenter view, but fortunately the time worked out reasonably well. I concluded just as the organizer was getting up to nudge me along, and I’d managed to get to all the points I wanted to along the way. Whew!

Anyway, that’s a quick braindump of the project and what it was like to hack it together. I’ll probably write some more about following up on ideas and about other people’s presentations, but I wanted to get this post out there while the experience was fresh in my head. It was fun. I hope the Toronto Public Library will take the hackathon ideas forward, and I hope they’ll get enough out of the hackathon that they’ll organize another one! =)

Thinking about problem-solving and sequencing

We’ve been helping J- with her culminating project in Grade 11 programming class, a text-based blackjack game. She was getting stuck because she wasn’t comfortable enough with Java or programming. She didn’t know where to start or what to do. I didn’t write the program for her (and that was never on the table, anyway), but I offered to guide her through the process. More experienced learners can do that kind of planning on their own, but when you’re new to a subject (and especially if you’re under time pressure), it helps to have that kind of scaffolding.

The first thing I did was to properly set up Eclipse on her computer. She had already tried setting it up, but it kept quitting with a cryptic error message. I realized that she didn’t have the Java software development kit installed. Once we added that, Eclipse started working.

Then I set up a sequence of tiny, tiny steps that she could implement with a little guidance. This was not the same sequence of events that would be in the final game: betting, shuffling, dealing, scoring, and so on. Instead, I focused on the steps she could build on bit by bit. It went something like this:

  1. Display a random number.
  2. Initialize the deck array based on the face array and suit array J- had already set up. This would contain strings like “ace of hearts”.
  3. Print the deck array.
  4. Set up a parallel array of card values, and print that as well. This would contain integers such as 1 for ace.
  5. Shuffle the deck by swapping each card with a random card, swapping the same card’s value as well.
  6. Write a function that takes an array of card values and returns the value of the hand.
  7. Modify the function to take aces into account, since aces could be either 1 or 11.
  8. Handle the initial dealing of two cards each by keeping track of the top card and updating a status array.
  9. Display the cards for a specified hand based on the status array.
  10. Update the hand value function to take the status array into account.
  11. Deal cards to the player as requested.
  12. Check if the player has lost.
  13. Follow the rules for dealing additional cards to the dealer.
  14. Check if the dealer has lost.
  15. Figure out who won that round.
  16. Read the bet.
  17. Update the bet after the player has won or lost the round.
  18. Check if the player has doubled their money or lost all their money.
  19. Add more error-checking.
  20. Add more flexibility.

2015-06-11a Blackjack implementation sequence -- index card #sequencing #problem-solving

2015-06-11a Blackjack implementation sequence – index card #sequencing #problem-solving

This sequence meant that she could write and test the less-interactive parts first (preparing the deck, shuffling the cards, calculating the score) before slowing down her compile-run-test cycle with input. It also let her work on the simplest parts first without writing a lot of redundant code and with just a little prompting from me.

Instead of this zigzag path, we could have followed the chronological flow of the program, especially if I introduced her to the practice of stubbing functions until you’re ready to work on them. This shuffled order felt a bit better in terms of demonstrable progress, and she seems to have been able to follow along with the construction process. In retrospect, the chronological flow might have been easier for her to learn and apply to other projects, though, since breaking down and shuffling the order of tasks is a skill that programming newbies probably need a while to develop.

Anyway, helping J- with her project got me thinking about how I work my way through programming challenges on my personal projects and in my consulting. I tend to try to figure things out by myself instead of asking mentors or the Internet. I also tend to write things in a bottom-up instead of top-down order: starting with small things I can write and test quickly, and then gradually building more elaborate processes around them.

2015-06-09e What do I mean by sequencing -- index card #learning #problem-solving #sequencing

2015-06-09e What do I mean by sequencing – index card #learning #problem-solving #sequencing

I think of the process of breaking down tasks and organizing them into a useful order as “sequencing”, which is part of the general domain of problem-solving. There’s probably an official term for it, but until I find it, that’s the term makes sense to me. When I was teaching undergraduate computer science, I noticed that students often struggled with the following aspects:

  • Imagining a good goal, if they were doing a self-directed project (not too big, not too small, etc.)
  • Fleshing out their goal into specifications
  • Identifying possible chunks along the path
  • Breaking those chunks down into smaller chunks as needed
  • Figuring out how to bridge those chunks together
  • Prioritizing chunks
  • Re-evaluating paths, progress, and goals as they learned more
  • Managing motivation
  • Finding resources
  • Translating or adapting those resources to what they needed to do
  • Figuring out what they could start with (what they already knew, or that first hand-hold on a chunk)

2015-06-09h Sequencing challenges -- index card #learning #problem-solving #sequencing #challenges

2015-06-09h Sequencing challenges – index card #learning #problem-solving #sequencing #challenges

I sometimes have a hard time with these aspects too, especially when I’m learning a new toolkit or language. I think I’m getting better at picking ridiculously tiny steps and celebrating that progress instead of getting frustrated by blocks. When I can’t figure tiny steps out, I know that going through tutorials and reading other people’s code will help me build my vocabulary. That way, I can get better at finding resources and planning chunks. I have a better idea of what’s out there, what things are called, and maybe even what’s probably harder and what’s probably easier.

2015-06-09f How do I develop my own sequencing skills -- index card #learning #problem-solving #sequencing

2015-06-09f How do I develop my own sequencing skills – index card #learning #problem-solving #sequencing

Programming gives me plenty of opportunities to develop my sequencing skills. By writing notes with embedded code snippets and TODOs, I can break large chunks down into smaller chunks that fit in my working memory. Sometimes I read programming documentation or source code without any particular project in mind, just collecting ideas and expanding my imagination. If I focus on writing tiny chunks that start off as “just good enough” rather than making them as elaborate as I can, that lets me move on to the rest of the program and get feedback faster. Saving snippets of unused or evolving code (either manually in my notes or automatically via a version contro system) lets me reuse them elsewhere.

I imagine that getting even better at sequencing might involve:

  • Sharing intermediate or neighbouring products to get feedback faster and create more value
  • Figuring out (and remembering!) sequences for getting into various areas such as Angular or D3
  • Becoming fluent with many styles of sequencing, like working top-down instead of just bottom-up
  • Building reliable automated tests along the way, even as my understanding of the project evolves

What about helping other people get better at sequencing?

2015-06-09g How can I help other people develop sequencing skills -- index card #learning #problem-solving #sequencing

2015-06-09g How can I help other people develop sequencing skills – index card #learning #problem-solving #sequencing

  1. If I share a pre-defined sequence, that helps people with less experience follow a possibly more efficient path.
  2. If I customize the sequence for someone I’m helping, that’s even more useful.
  3. Talking about how we’re customizing the sequence exposes sequencing as a skill.
  4. Guiding someone through the breakdown of a small chunk (from a task to pseudocode, for example) can help them get the hang of turning things into specifics.
  5. Guiding someone through the breakdown of a larger chunk into smaller chunks helps them think of larger chunks as doable.
  6. Once someone has identified a few chunks, I can help with prioritizing them based on effort, relationship between chunks, etc.
  7. In preparation for independent learning, it’s good to practice the skill of figuring out those prioritization factors: small experiments, research, and so on.
  8. And at some point (woohoo!), I can help someone reflect on and tweak a plan they developed.

I’m not yet at the point of being able to do even step 1 well, but maybe someday (with lots of practice)!

Since this is a super-useful skill and not everyone’s into programming, it might be interesting to look around for non-programming situations that can help develop sequencing skills.

2015-06-09i Non-programming examples of sequencing -- index card #learning #problem-solving #sequencing

2015-06-09i Non-programming examples of sequencing – index card #learning #problem-solving #sequencing

On the passive side, it’s interesting to learn about how things work or how they’re made. Better yet, you can actively practise sequencing with math problems, video games with quests… Writing is like this too, if you think of key points and use tools such as outlines or index cards. Making things involves imagining what you want and figuring out lots of small steps along the way. Life is filled with little opportunities for everyday problem-solving.

I’d love to learn more about this. Chunk decomposition? Problem-solving order? Problem decomposition? Step-wise refinement? I’m more comfortable with bottom-up problem-solving, but top-down solving has its merits (focus, for one), so maybe I can practice that too. Using both approaches can be helpful. (Ooh, there’s a Wikipedia article on top-down and bottom-up design…) Time to check out research into the psychology of problem-solving!

Thinking about adaptive menus for tracking

I’ve been thinking about building more tools for myself. Some of the ideas I’ve been playing around with are around simplifying activity tracking further, possibly getting it to the point where it suggests things for me to do when my brain is fuzzy.

My current tracking system has two tiers. For my most-common activities, I use a custom menu that I can open from my phone’s home screen. I created the menu using Tasker. It’s easy to configure menu items to update my Quantified Awesome activity records as well as run other logic on my phone. For example:

  • “Eat dinner” creates an activity record for “Dinner”, then starts MyFitnessPal so that I can log the meal
  • “Walk home” creates an activity record for “Walk – Other”, then starts step-by-step navigation of the walk back to my house
  • “Sleep” creates an activity record, then launches Sleep as Android
  • “Ni No Kuni” creates an activity record, prompts me for what I want to do after an hour of playing, opens a page with tips for the game, and then reminds me of my plans after that hour passes

One advantage of using something on my phone is that I don’t have to wait for the initial web page from Quantified Awesome to load. My keyboard occasionally takes a while to come up, too, so the menu-based interface gets around that as well. Also, as I get the hang of using Tasker, I can set up more intelligent processes. The menu has a link to open the web version, so if I want to track something less frequent, I can always use the web interface.

In the web interface, I usually type a substring to identify the category I want to track. For example, “kitch” results in an activity record for “Clean the kitchen”. I use this interface if I need to backdate entries (ex: -5m), too.

In addition to the two interfaces above, I’ve been thinking about taking advantage of the predictability of my schedule.

Research into adaptive menus turns up quite a few design ideas and considerations. Since I’m building this for myself, I can get around many of the challenges of adaptive interfaces, such as privacy and predictability. I’m curious about the following options:

  • Text-based input with minimal cues, as part of a more powerful command line
  • Context-sensitive menu with a handful of items (3-4, with a link to more): I can probably suggest candidate activities based on the past two activity records. That might mean a little bit of latency as I check, though. It also means that the menu will keep shifting, so I’ll need to read it and find the item I want to click on.
    • For everyday use?
    • For really fuzzy moments?
  • Static menu of frequent items, but adaptive highlighting (ex: green or bold, or fading out other things slightly)
    • Abrupt onset, others fading in over 500ms
    • Colour?
    • Weight?
  • Split menu: frequent items on top, everything else below
    • Static
    • Adaptive
  • Hierarchy of menus: speedy, but lots of tapping
  • Cut off menu: show only the activities that come after the one I’ve just tracked

Hmm… It might be interesting to play around with different menu options. It would be good to experiment with NFC as well, especially early in the morning. =)


De-dupe and link: Using the Flickr API to neaten up my archive and link sketches to blog posts

I’ve been thinking about how to manage the relationships between my blog posts and my Flickr sketches. Here’s the flow of information:

2015-01-06 Figuring out information flow -- index card

2015.01.06 Figuring out information flow – index card

I scan my sketches or draw them on the computer, and then I upload these sketches to Flickr using photoSync, which synchronizes folders with albums. I include these sketches in my outlines and blog posts, and I update my index of blog posts every month. I recently added a tweak to make it possible for people to go from a blog post to its index entry, so it should be easier to see a post in context. I’ve been thinking about keeping an additional info index to manage blog posts and sketches, including unpublished ones. We’ll see how well that works. Lastly, I want to link my Flickr posts to my blog posts so that people can see the context of the sketch.

My higher goal is to be able to easily see the open ideas that I haven’t summarized or linked to yet. There’s no shortage of new ideas, but it might be interesting to revisit old ones that had a chance to simmer a bit. I wrote a little about this in Learning from artists: Making studies of ideas. Let me flesh out what I want this archive to be like.

2015-01-05 Thinking about my archive -- index card
2015.01.05 Thinking about my archive

When I pull on an idea, I’d like to be able to see other open topics attached to it. I also want to be able to see open topics that might jog my memory.

How about the technical details? How can I organize my data so that I can get what I want from it?

2015-01-05 Figuring out the technical details of this idea or visual archive I want -- index card
2015.01.05 Figuring out the technical details of this idea or visual archive I want – index card

Because blog posts link to sketches and other blog posts, I can model this as a directed graph. When I initially drew this, I thought I might be able to get away with an acyclic graph (no loops). However, since I habitually link to future posts (the time traveller’s problem!), I can’t make that simplifying assumption. In addition, a single item might be linked from multiple things, so it’s not a simple tree (and therefore I can’t use an outline). I’ll probably start by extracting all the link information from my blog posts and then figuring out some kind of Org Mode-based way to update the graph.

2015-01-07 Mapping the connections in my blog -- index card
2015.01.07 Mapping the connections in my blog – index card

To get one step closer to being able to see open thoughts and relationships, I decided that my sketches on Flickr:

  • should not have duplicates despite my past mess-ups, so that:
    • I can have an accurate count
    • it’s easier for me to categorize
    • people get less confused
  • should have hi-res versions if possible, despite the IFTTT recipe I tried that imported blog posts but unfortunately picked up the low-res thumbnails instead of the hi-res links
  • should link to the blog posts they’re mentioned in, so that:
    • people can read more details if they come across a sketch in a search
    • I can keep track of which sketches haven’t been blogged yet

I couldn’t escape doing a bit of manual cleaning up, but I knew I could automate most of the fiddly bits. I installed node-flickrapi and cheerio (for HTML parsing), and started playing.

Removing duplicates

Most of the duplicates had resulted from the Great Renaming, when I added tags in the form of #tag1 #tag2 etc. to selected filenames. It turns out that adding these tags en-masse using Emacs’ writable Dired mode broke photoSync’s ability to recognize the renamed files. As a result, I had files like this:

  • 2013-05-17 How I set up Autodesk Sketchbook Pro for sketchnoting.png
  • 2013-05-17 How I set up Autodesk Sketchbook Pro for sketchnoting #tech #autodesk-sketchbook-pro #drawing.png

This is neatly resolved by the following Javascript:

exports.trimTitle = function(str) {
    return str.replace(/ --.*$/g, '').replace(/#[^ ]+/g, '').replace(/[- _]/g, '');

and a comparison function that compared the titles and IDs of two photos:

exports.keepNewPhoto = function(oldPhoto, newPhoto) {
    if (newPhoto.title.length > oldPhoto.title.length)
        return true;
    if (newPhoto.title.length < oldPhoto.title.length)
        return false;
    if ( < 
        return true;
    return false;

So then this code can process the photos:

exports.processPhoto = function(p, flickr) {
    var trimmed = exports.trimTitle(p.title);
    if (trimmed && hash[trimmed] && != hash[trimmed].id) {
        // We keep the one with the longer title or the newer date
        if (exports.keepNewPhoto(hash[trimmed], p)) {
            exports.possiblyDeletePhoto(hash[trimmed], flickr);
            hash[trimmed] = p;
        else if ( != hash[trimmed].id) {
            exports.possiblyDeletePhoto(p, flickr);
    } else {
        hash[trimmed] = p;

You can see the code on Gist: duplicate_checker.js.

High-resolution versions

I couldn’t easily automate this, but fortunately, the IFTTT script had only imported twenty images or so, clearly marked by a description that said: “via sacha chua :: living an awesome life…”. I searched for each image, deleting the low-res entry if a high-resolution image was already in the system and replacing the low-res entry if that was the only one there.

Linking to blog posts

This was the trickiest part, but also the most fun. I took advantage of the fact that WordPress transforms uploaded filenames in a mostly consistent way. I’d previously added a bulk view that displayed any number of blog posts with very little additional markup, and I modified the relevant code in my theme to make parsing easier.

See this on Gist:

 * Adds "Blogged" links to Flickr for images that don't yet have "Blogged" in their description.
 * Command-line argument: URL to retrieve and parse

var secret = require('./secret');
var flickrOptions = secret.flickrOptions;
var Flickr = require("flickrapi");
var fs = require('fs');
var request = require('request');
var cheerio = require('cheerio');
var imageData = {};
var $;

function setDescriptionsFromURL(url) {
  request(url, function(error, response, body) {
    // Parse the images
    $ = cheerio.load(body);
    $('article').each(function() {
      var prettyLink = $(this).find("h2 a").attr("href");
      if (!prettyLink.match(/weekly/i) && !prettyLink.match(/monthly/i)) {
        collectLinks($(this), prettyLink, imageData);

function updateFlickrPhotos() {
    Flickr.authenticate(flickrOptions, function(error, flickr) {
        {user_id: flickrOptions.user_id,
         per_page: 500,
         extras: 'description',
         text: ' -blogged'}, function(err, result) {
           processPage(result, flickr);
           for (var i = 2 ; i <; i++) {
               {user_id: flickrOptions.user_id, per_page: 500, page: i,
                extras: 'description', text: ' -blogged'},
               function(err, result) {
                 processPage(err, result, flickr);

function collectLinks(article, prettyLink, imageData) {
  var results = [];
  article.find(".body a").each(function() {
    var link = $(this);
    if (link.attr('href')) {
      if (link.attr('href').match(/sachachua/)
          || !link.attr('href').match(/^http/)) {
        imageData[exports.trimTitle(link.attr('href'))] = prettyLink;
      } else if (link.attr('href').match(/ {
        imageData[exports.trimTitle(link.text())] = prettyLink;
  return results;

exports.trimTitle = function(str) {
  return str.replace(/^.*\//, '').replace(/^wpid-/g, '').replace(/[^A-Za-z0-9]/g, '').replace(/png$/, '').replace(/[0-9]$/, '');

function processPage(result, flickr) {
  if (!result) return;
  for (var i = 0; i <; i++) {
    var p =[i];
    var trimmed = exports.trimTitle(p.title);
    var noTags = trimmed.replace(/#.*/g, '');
    var withTags = trimmed.replace(/#/g, '');
    var found = imageData[noTags] || imageData[withTags];
    if (found) {
      var description = p.description._content;
      if (description.match(found)) continue;
      if (description) {
        description += " - ";
      description += '<a href="' + found + '">Blogged</a>';
      console.log("Updating " + p.title + " with " + description);
         description: description},
        function(err, res) {
          if (err) { console.log(err, res); }
        } );


And now sketches like 2013-11-11 How to think about a book while reading it are now properly linked to their blog posts. Yay! Again, this script won’t get everything, but it gets a decent number automatically sorted out.

Next steps:

  • Run the image extraction and set description scripts monthly as part of my indexing process
  • Check my list of blogged images to see if they’re matched up with Flickr sketches, so that I can identify images mysteriously missing from my sketchbook archive or not correctly linked

Yay code!

First steps towards Javascript testing

I know, I know, it’s about time I got the hang of this. Better late than never, right? =) Anyway, I spent some time going through tutorials for QUnit and Jasmine. For QUnit, I followed this Smashing Magazine tutorial on Javascript unit testing. I modified the code a little bit to add the Z timezone to the test data, since my tests initially didn’t pass.


<!DOCTYPE html>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <title>Refactored date examples</title>
    <link rel="stylesheet" href="" />
    <script src=""></script>
    <script src="prettydate.js"></script>
     test("prettydate.format", function() {
       function date(then, expected) {
         equal(prettyDate.format('2008/01/28 22:25:00Z', then), expected);
       date("2008/01/28 22:24:30Z", "just now");
       date("2008/01/28 22:23:30Z", "1 minute ago");
       date("2008/01/28 21:23:30Z", "1 hour ago");
       date("2008/01/27 22:23:30Z", "Yesterday");
       date("2008/01/26 22:23:30Z", "2 days ago");
       date("2007/01/26 22:23:30Z", undefined);
     function domtest(name, now, first, second) {
       test(name, function() {
         var links = document.getElementById('qunit-fixture').getElementsByTagName('a');
         equal(links[0].innerHTML, 'January 28th, 2008');
         equal(links[2].innerHTML, 'January 27th, 2008');
         equal(links[0].innerHTML, first);
         equal(links[2].innerHTML, second);

     domtest("prettyDate.update", '2008-01-28T22:25:00Z', '2 hours ago', 'Yesterday');
     domtest("prettyDate.update, one day later", '2008-01-29T22:25:00Z', 'Yesterday', '2 days ago');
  <div id="qunit"></div>
  <div id="qunit-fixture">
      <li class="entry" id="post57">
        <p>blah blah blah…</p>
        <small class="extra">
          Posted <span class="time"><a href="/2008/01/blah/57/" title="2008-01-28T20:24:17Z">January 28th, 2008</a></span>
          by <span class="author"><a href="/john/">John Resig</a></span>
      <li class="entry" id="post57">
        <p>blah blah blah…</p>
        <small class="extra">
          Posted <span class="time"><a href="/2008/01/blah/57/" title="2008-01-27T22:24:17Z">January 27th, 2008</a></span>
          by <span class="author"><a href="/john/">John Resig</a></span>

For practice, I converted the QUnit tests to Jasmine. The first part of the test was easy, but I wanted a clean way to do the HTML fixture-based tests for prettydate.update too. Jasmine-JQuery gives you a handy way to have HTML fixtures. Here’s what my code ended up as:


describe("PrettyDate.format", function() {
    function checkDate(name, then, expected) {
        it(name, function() {
            expect(prettyDate.format('2008/01/28 22:25:00Z', then)).toEqual(expected);
    checkDate("should display recent times", '2008/01/28 22:24:30Z', 'just now');
    checkDate("should display times within a minute", '2008/01/28 22:23:30Z', '1 minute ago');
    checkDate("should display times within an hour", '2008/01/28 21:23:30Z', '1 hour ago');
    checkDate("should display times within a day", '2008/01/27 22:23:30Z', 'Yesterday');
    checkDate("should display times within two days", '2008/01/26 22:23:30Z', '2 days ago');
describe("PrettyDate.update", function() {
    function domtest(name, now, first, second) {
       it(name, function() {
           var links = document.getElementById('qunit-fixture').getElementsByTagName('a');
           expect(links[0].innerHTML).toEqual('January 28th, 2008');
           expect(links[2].innerHTML).toEqual('January 27th, 2008');
    domtest("prettyDate.update", '2008-01-28T22:25:00Z', '2 hours ago', 'Yesterday');
    domtest("prettyDate.update, one day later", '2008-01-29T22:25:00Z', 'Yesterday', '2 days ago');


  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  <title>Jasmine Spec Runner v2.0.2</title>

  <link rel="shortcut icon" type="image/png" href="lib/jasmine-2.0.2/jasmine_favicon.png">
  <link rel="stylesheet" type="text/css" href="lib/jasmine-2.0.2/jasmine.css">

  <script type="text/javascript" src=""></script>
  <script type="text/javascript" src="lib/jasmine-2.0.2/jasmine.js"></script>
  <script type="text/javascript" src="lib/jasmine-2.0.2/jasmine-html.js"></script>
  <script type="text/javascript" src="lib/jasmine-2.0.2/boot.js"></script>
  <script type="text/javascript" src="jasmine-jquery.js"></script>

  <!-- include source files here... -->
  <script type="text/javascript" src="prettydate.js"></script>

  <!-- include spec files here... -->
  <script type="text/javascript" src="spec/DateSpec.js"></script>



I’m looking forward to learning how to use Jasmine to test Angular applications, since behaviour-driven testing seems to be common practice there. Little steps! =)

Upgrading from Rails 3 to Rails 4; thank goodness for Emacs and rspec

I spent some time working on upgrading Quantified Awesome from Rails 3 to Rails 4. I was so glad that I had invested the time into writing enough RSpec and Cucumber tests to cover all the code, since that flushed out a lot of the differences between versions: deprecated methods, strong parameters, and so on.

rspec-mode was really helpful while testing upgrade-related changes. I quickly got into the habit of using C-c , m (rspec-verify-matching) to run the spec file related to the current file. If I needed to test specific things, I headed over to the spec file and used C-c , s (rspec-verify-single). Because RSpec had also changed a little bit, I needed to change the way rspec-verify-single worked.

(defun sacha/rspec-verify-single ()
  "Runs the specified example at the point of the current buffer."
     (rspec-spec-file-for (buffer-file-name))
               (number-to-string (line-number-at-pos))))

(sacha/package-install 'rspec-mode)
(use-package rspec-mode
    (setq rspec-command-options "--fail-fast --color")
    (fset 'rspec-verify-single 'sacha/rspec-verify-single)))

C-c , c (rspec-verify-continue) was also a handy function, especially with an .rspec containing the --color --fail-fast options. rspec-verify-continue starts from the current test and runs other tests following it, so you don’t have to re-run the tests that worked until you’re ready for everything.

I should probably get back to setting up Guard so that the tests are automatically re-run whenever I save files, but this is a good start. Also, yay, I’m back to all the tests working!

Test coverage didn’t mean I could avoid all the problems, though. I hadn’t properly frozen the versions in my Gemfile or checked the asset pipeline. When I deployed to my webserver, I ran into problems with incompatible changes between Rails 4.0 and 4.1, and Bootstrap 2 and Bootstrap 3. Whoops! Now I’m trying to figure out how to get formtastic-bootstrap to play nicely with Bootstrap 3 and Rails 4 and the latest Formtastic… There are some git repositories that claim to do this correctly, but they don’t seem to work.


Fine, I’ll switch to simple_form, since that seems to be sort of okay with Bootstrap 3. I ended up using this simple_form_bootstrap3 initializer along with

# You can wrap a collection of radio/check boxes in a pre-defined tag, defaulting to none.
config.collection_wrapper_tag = :div

# You can define the class to use on all collection wrappers. Defaulting to none.
config.collection_wrapper_class = 'collection'

and this in my style.css.sass:

    @extend .form-group
    @extend .col-sm-2
  .control-group > .form-control, .form-group > .form-control, .form-group > .collection
    @extend .col-sm-10
    @extend .col-sm-offset-2
    text-align: left
    font-weight: normal
    @extend .col-sm-offset-2
    font-weight: normal

which is totally a hack, but it sort-of-semi-works for now.

More changes to come, since I’ve got to get it sorted out for Rails 4.1 too! Mrph.