Categories: geek » development

RSS - Atom - Subscribe via email

De-dupe and link: Using the Flickr API to neaten up my archive and link sketches to blog posts

Posted: - Modified: | development, geek

I've been thinking about how to manage the relationships between my blog posts and my Flickr sketches. Here's the flow of information:

2015-01-06 Figuring out information flow -- index card

I scan my sketches or draw them on the computer, and then I upload these sketches to Flickr using photoSync, which synchronizes folders with albums. I include these sketches in my outlines and blog posts, and I update my index of blog posts every month. I recently added a tweak to make it possible for people to go from a blog post to its index entry, so it should be easier to see a post in context. I've been thinking about keeping an additional info index to manage blog posts and sketches, including unpublished ones. We'll see how well that works. Lastly, I want to link my Flickr posts to my blog posts so that people can see the context of the sketch.

My higher goal is to be able to easily see the open ideas that I haven't summarized or linked to yet. There's no shortage of new ideas, but it might be interesting to revisit old ones that had a chance to simmer a bit. I wrote a little about this in Learning from artists: Making studies of ideas. Let me flesh out what I want this archive to be like.

2015-01-05 Thinking about my archive -- index card
2015.01.05 Thinking about my archive

When I pull on an idea, I'd like to be able to see other open topics attached to it. I also want to be able to see open topics that might jog my memory.

How about the technical details? How can I organize my data so that I can get what I want from it?

2015-01-05 Figuring out the technical details of this idea or visual archive I want -- index card
2015.01.05 Figuring out the technical details of this idea or visual archive I want – index card

Because blog posts link to sketches and other blog posts, I can model this as a directed graph. When I initially drew this, I thought I might be able to get away with an acyclic graph (no loops). However, since I habitually link to future posts (the time traveller's problem!), I can't make that simplifying assumption. In addition, a single item might be linked from multiple things, so it's not a simple tree (and therefore I can't use an outline). I'll probably start by extracting all the link information from my blog posts and then figuring out some kind of Org Mode-based way to update the graph.

2015-01-07 Mapping the connections in my blog -- index card

To get one step closer to being able to see open thoughts and relationships, I decided that my sketches on Flickr:

  • should not have duplicates despite my past mess-ups, so that:
    • I can have an accurate count
    • it's easier for me to categorize
    • people get less confused
  • should have hi-res versions if possible, despite the IFTTT recipe I tried that imported blog posts but unfortunately picked up the low-res thumbnails instead of the hi-res links
  • should link to the blog posts they're mentioned in, so that:
    • people can read more details if they come across a sketch in a search
    • I can keep track of which sketches haven't been blogged yet

I couldn't escape doing a bit of manual cleaning up, but I knew I could automate most of the fiddly bits. I installed node-flickrapi and cheerio (for HTML parsing), and started playing.

Removing duplicates

Most of the duplicates had resulted from the Great Renaming, when I added tags in the form of #tag1 #tag2 etc. to selected filenames. It turns out that adding these tags en-masse using Emacs' writable Dired mode broke photoSync's ability to recognize the renamed files. As a result, I had files like this:

  • 2013-05-17 How I set up Autodesk Sketchbook Pro for sketchnoting.png
  • 2013-05-17 How I set up Autodesk Sketchbook Pro for sketchnoting #tech #autodesk-sketchbook-pro #drawing.png

This is neatly resolved by the following Javascript:

exports.trimTitle = function(str) {
    return str.replace(/ --.*$/g, '').replace(/#[^ ]+/g, '').replace(/[- _]/g, '');
};

and a comparison function that compared the titles and IDs of two photos:

exports.keepNewPhoto = function(oldPhoto, newPhoto) {
    if (newPhoto.title.length > oldPhoto.title.length)
        return true;
    if (newPhoto.title.length < oldPhoto.title.length)
        return false;
    if (newPhoto.id < oldPhoto.id) 
        return true;
    return false;
};

So then this code can process the photos:

exports.processPhoto = function(p, flickr) {
    var trimmed = exports.trimTitle(p.title);
    if (trimmed && hash[trimmed] && p.id != hash[trimmed].id) {
        // We keep the one with the longer title or the newer date
        if (exports.keepNewPhoto(hash[trimmed], p)) {
            exports.possiblyDeletePhoto(hash[trimmed], flickr);
            hash[trimmed] = p;
        }
        else if (p.id != hash[trimmed].id) {
            exports.possiblyDeletePhoto(p, flickr);
        }
    } else {
        hash[trimmed] = p;
    }
};

You can see the code on Gist: duplicate_checker.js.

High-resolution versions

I couldn't easily automate this, but fortunately, the IFTTT script had only imported twenty images or so, clearly marked by a description that said: "via sacha chua :: living an awesome life…". I searched for each image, deleting the low-res entry if a high-resolution image was already in the system and replacing the low-res entry if that was the only one there.

Linking to blog posts

This was the trickiest part, but also the most fun. I took advantage of the fact that WordPress transforms uploaded filenames in a mostly consistent way. I'd previously added a bulk view that displayed any number of blog posts with very little additional markup, and I modified the relevant code in my theme to make parsing easier.

See this on Gist:

/**
 * Adds "Blogged" links to Flickr for images that don't yet have "Blogged" in their description.
 * Command-line argument: URL to retrieve and parse
 */

var secret = require('./secret');
var flickrOptions = secret.flickrOptions;
var Flickr = require("flickrapi");
var fs = require('fs');
var request = require('request');
var cheerio = require('cheerio');
var imageData = {};
var $;

function setDescriptionsFromURL(url) {
  request(url, function(error, response, body) {
    // Parse the images
    $ = cheerio.load(body);
    $('article').each(function() {
      var prettyLink = $(this).find("h2 a").attr("href");
      if (!prettyLink.match(/weekly/i) && !prettyLink.match(/monthly/i)) {
        collectLinks($(this), prettyLink, imageData);
      }
    });
    updateFlickrPhotos();
  });
}

function updateFlickrPhotos() {
    Flickr.authenticate(flickrOptions, function(error, flickr) {
      flickr.photos.search(
        {user_id: flickrOptions.user_id,
         per_page: 500,
         extras: 'description',
         text: ' -blogged'}, function(err, result) {
           processPage(result, flickr);
           for (var i = 2 ; i < result.photos.pages; i++) {
             flickr.photos.search(
               {user_id: flickrOptions.user_id, per_page: 500, page: i,
                extras: 'description', text: ' -blogged'},
               function(err, result) {
                 processPage(err, result, flickr);
               });
           }
         });
    });
}

function collectLinks(article, prettyLink, imageData) {
  var results = [];
  article.find(".body a").each(function() {
    var link = $(this);
    if (link.attr('href')) {
      if (link.attr('href').match(/sachachua/)
          || !link.attr('href').match(/^http/)) {
        imageData[exports.trimTitle(link.attr('href'))] = prettyLink;
      } else if (link.attr('href').match(/flickr.com/)) {
        imageData[exports.trimTitle(link.text())] = prettyLink;
      }
    }
  });
  return results;
}

exports.trimTitle = function(str) {
  return str.replace(/^.*\//, '').replace(/^wpid-/g, '').replace(/[^A-Za-z0-9]/g, '').replace(/png$/, '').replace(/[0-9]$/, '');
};

function processPage(result, flickr) {
  if (!result) return;
  for (var i = 0; i < result.photos.photo.length; i++) {
    var p = result.photos.photo[i];
    var trimmed = exports.trimTitle(p.title);
    var noTags = trimmed.replace(/#.*/g, '');
    var withTags = trimmed.replace(/#/g, '');
    var found = imageData[noTags] || imageData[withTags];
    if (found) {
      var description = p.description._content;
      if (description.match(found)) continue;
      if (description) {
        description += " - ";
      }
      description += '<a href="' + found + '">Blogged</a>';
      console.log("Updating " + p.title + " with " + description);
      flickr.photos.setMeta(
        {photo_id: p.id,
         description: description},
        function(err, res) {
          if (err) { console.log(err, res); }
        } );
    }
  }
}

setDescriptionsFromURL(process.argv[2]);

And now sketches like

are now properly linked to their blog posts. Yay! Again, this script won't get everything, but it gets a decent number automatically sorted out.

Next steps:

  • Run the image extraction and set description scripts monthly as part of my indexing process
  • Check my list of blogged images to see if they're matched up with Flickr sketches, so that I can identify images mysteriously missing from my sketchbook archive or not correctly linked

Yay code!

View or add comments (Disqus), or e-mail me at sacha@sachachua.com

First steps towards Javascript testing

Posted: - Modified: | development, geek

I know, I know, it's about time I got the hang of this. Better late than never, right? =) Anyway, I spent some time going through tutorials for QUnit and Jasmine. For QUnit, I followed this Smashing Magazine tutorial on Javascript unit testing. I modified the code a little bit to add the Z timezone to the test data, since my tests initially didn't pass.

test.html

<!DOCTYPE html>
<html>
<head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <title>Refactored date examples</title>
    <link rel="stylesheet" href="http://code.jquery.com/qunit/qunit-1.15.0.css" />
    <script src="http://code.jquery.com/qunit/qunit-1.15.0.js"></script>
    <script src="prettydate.js"></script>
    <script>
     test("prettydate.format", function() {
       function date(then, expected) {
         equal(prettyDate.format('2008/01/28 22:25:00Z', then), expected);
       }
       date("2008/01/28 22:24:30Z", "just now");
       date("2008/01/28 22:23:30Z", "1 minute ago");
       date("2008/01/28 21:23:30Z", "1 hour ago");
       date("2008/01/27 22:23:30Z", "Yesterday");
       date("2008/01/26 22:23:30Z", "2 days ago");
       date("2007/01/26 22:23:30Z", undefined);
     });
     function domtest(name, now, first, second) {
       test(name, function() {
         var links = document.getElementById('qunit-fixture').getElementsByTagName('a');
         equal(links[0].innerHTML, 'January 28th, 2008');
         equal(links[2].innerHTML, 'January 27th, 2008');
         prettyDate.update(now);
         equal(links[0].innerHTML, first);
         equal(links[2].innerHTML, second);
       });
     }

     domtest("prettyDate.update", '2008-01-28T22:25:00Z', '2 hours ago', 'Yesterday');
     domtest("prettyDate.update, one day later", '2008-01-29T22:25:00Z', 'Yesterday', '2 days ago');
    </script>
</head>
<body>
  <div id="qunit"></div>
  <div id="qunit-fixture">
    <ul>
      <li class="entry" id="post57">
        <p>blah blah blah…</p>
        <small class="extra">
          Posted <span class="time"><a href="/2008/01/blah/57/" title="2008-01-28T20:24:17Z">January 28th, 2008</a></span>
          by <span class="author"><a href="/john/">John Resig</a></span>
        </small>
      </li>
      <li class="entry" id="post57">
        <p>blah blah blah…</p>
        <small class="extra">
          Posted <span class="time"><a href="/2008/01/blah/57/" title="2008-01-27T22:24:17Z">January 27th, 2008</a></span>
          by <span class="author"><a href="/john/">John Resig</a></span>
        </small>
      </li>
    </ul>
  </div>
</body>
</html>

For practice, I converted the QUnit tests to Jasmine. The first part of the test was easy, but I wanted a clean way to do the HTML fixture-based tests for prettydate.update too. Jasmine-JQuery gives you a handy way to have HTML fixtures. Here's what my code ended up as:

spec/DateSpec.js

describe("PrettyDate.format", function() {
    function checkDate(name, then, expected) {
        it(name, function() {
            expect(prettyDate.format('2008/01/28 22:25:00Z', then)).toEqual(expected);
        });
    }
    checkDate("should display recent times", '2008/01/28 22:24:30Z', 'just now');
    checkDate("should display times within a minute", '2008/01/28 22:23:30Z', '1 minute ago');
    checkDate("should display times within an hour", '2008/01/28 21:23:30Z', '1 hour ago');
    checkDate("should display times within a day", '2008/01/27 22:23:30Z', 'Yesterday');
    checkDate("should display times within two days", '2008/01/26 22:23:30Z', '2 days ago');
});
describe("PrettyDate.update", function() {
    function domtest(name, now, first, second) {
       it(name, function() {
           loadFixtures('test_fixture.html');
           var links = document.getElementById('qunit-fixture').getElementsByTagName('a');
           expect(links[0].innerHTML).toEqual('January 28th, 2008');
           expect(links[2].innerHTML).toEqual('January 27th, 2008');
           prettyDate.update(now);
           expect(links[0].innerHTML).toEqual(first);
           expect(links[2].innerHTML).toEqual(second);
       });
    }
    domtest("prettyDate.update", '2008-01-28T22:25:00Z', '2 hours ago', 'Yesterday');
    domtest("prettyDate.update, one day later", '2008-01-29T22:25:00Z', 'Yesterday', '2 days ago');
});

jasmine.html

<!DOCTYPE HTML>
<html>
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  <title>Jasmine Spec Runner v2.0.2</title>

  <link rel="shortcut icon" type="image/png" href="lib/jasmine-2.0.2/jasmine_favicon.png">
  <link rel="stylesheet" type="text/css" href="lib/jasmine-2.0.2/jasmine.css">

  <script type="text/javascript" src="https://code.jquery.com/jquery-2.1.1.min.js"></script>
  <script type="text/javascript" src="lib/jasmine-2.0.2/jasmine.js"></script>
  <script type="text/javascript" src="lib/jasmine-2.0.2/jasmine-html.js"></script>
  <script type="text/javascript" src="lib/jasmine-2.0.2/boot.js"></script>
  <script type="text/javascript" src="jasmine-jquery.js"></script>

  <!-- include source files here... -->
  <script type="text/javascript" src="prettydate.js"></script>

  <!-- include spec files here... -->
  <script type="text/javascript" src="spec/DateSpec.js"></script>

</head>

<body>
</body>
</html>

I'm looking forward to learning how to use Jasmine to test Angular applications, since behaviour-driven testing seems to be common practice there. Little steps! =)

View or add comments (Disqus), or e-mail me at sacha@sachachua.com

Upgrading from Rails 3 to Rails 4; thank goodness for Emacs and rspec

Posted: - Modified: | development, emacs, rails

I spent some time working on upgrading Quantified Awesome from Rails 3 to Rails 4. I was so glad that I had invested the time into writing enough RSpec and Cucumber tests to cover all the code, since that flushed out a lot of the differences between versions: deprecated methods, strong parameters, and so on.

rspec-mode was really helpful while testing upgrade-related changes. I quickly got into the habit of using C-c , m (rspec-verify-matching) to run the spec file related to the current file. If I needed to test specific things, I headed over to the spec file and used C-c , s (rspec-verify-single). Because RSpec had also changed a little bit, I needed to change the way rspec-verify-single worked.

(defun sacha/rspec-verify-single ()
  "Runs the specified example at the point of the current buffer."
  (interactive)
  (rspec-run-single-file
   (concat
     (rspec-spec-file-for (buffer-file-name))
     ":"
     (save-restriction
               (widen)
               (number-to-string (line-number-at-pos))))
   (rspec-core-options)))

(sacha/package-install 'rspec-mode)
(use-package rspec-mode
  :config
  (progn
    (setq rspec-command-options "--fail-fast --color")
    (fset 'rspec-verify-single 'sacha/rspec-verify-single)))

C-c , c (rspec-verify-continue) was also a handy function, especially with an .rspec containing the --color --fail-fast options. rspec-verify-continue starts from the current test and runs other tests following it, so you don't have to re-run the tests that worked until you're ready for everything.

I should probably get back to setting up Guard so that the tests are automatically re-run whenever I save files, but this is a good start. Also, yay, I'm back to all the tests working!

Test coverage didn't mean I could avoid all the problems, though. I hadn't properly frozen the versions in my Gemfile or checked the asset pipeline. When I deployed to my webserver, I ran into problems with incompatible changes between Rails 4.0 and 4.1, and Bootstrap 2 and Bootstrap 3. Whoops! Now I'm trying to figure out how to get formtastic-bootstrap to play nicely with Bootstrap 3 and Rails 4 and the latest Formtastic… There are some git repositories that claim to do this correctly, but they don't seem to work.

Grr.

Fine, I'll switch to simple_form, since that seems to be sort of okay with Bootstrap 3. I ended up using this simple_form_bootstrap3 initializer along with

# You can wrap a collection of radio/check boxes in a pre-defined tag, defaulting to none.
config.collection_wrapper_tag = :div

# You can define the class to use on all collection wrappers. Defaulting to none.
config.collection_wrapper_class = 'collection'

and this in my style.css.sass:

.form-horizontal
  .control-group
    @extend .form-group
  .control-label
    @extend .col-sm-2
  .control-group > .form-control, .form-group > .form-control, .form-group > .collection
    @extend .col-sm-10
  .help-block
    @extend .col-sm-offset-2
  .control-label.boolean
    text-align: left
    font-weight: normal
    @extend .col-sm-offset-2
  label.radio
    font-weight: normal

which is totally a hack, but it sort-of-semi-works for now.

More changes to come, since I've got to get it sorted out for Rails 4.1 too! Mrph.

View or add comments (Disqus), or e-mail me at sacha@sachachua.com

Reflecting on my growth as a programmer

| career, development

One of the things I realized from dealing with that programming issue is that I don't have a mature development workflow for front-end work yet. On previous projects, I focused mostly on back-end development. I had somewhat gotten the hang of test-driven development and code coverage when using Rails before, and I set up an issue tracker for my previous teams. For my main consulting engagement, I shifted to working on mostly HTML, Javascript, and CSS. I'd handled a bit of that before, but we usually worked with designers who did most of the heavy lifting (and the cross-browser fiddliness). Over the past two years,  I picked up more JQuery and Angular, fought with Internet Explorer 7 and then 8, and explored Chrome's developer tools a bit more.

I didn't have things quite set up the way I think other people have. I felt mildly guilty about installing programs that were not available from the client's internal software site, although Emacs was definitely worth the twinge of unease. Even the version control I used was ad-hoc, using Git on my computer to manage snippets for copying and pasting. I still haven't mastered the Javascript debugger in Google Chrome, much less the tools available for other browsers. (Hence all the grief Internet Explorer gave me.) I didn't have a test framework set up, so I often committed regression errors and other mistakes. I haven't yet internalized all the cool development tools in Emacs, like Smartparens and Magit. (I'm slowly getting the hang of multiple-cursors-mode, though!) In terms of workflow maturity, I felt more like someone a few years out of university (or maybe even someone in their final year of classes!), and definitely like someone cobbling things together instead of picking up practices from a well-running team.

My main consulting engagement is coming to a close, but I'm looking forward to learning more about the craft of software development. I have a few personal projects to practise on, and it might be easier to Do The Right Thing when you're less worried about potentially wasting the client's time. I'm looking forward to familiarizing myself with more of the nifty features in Emacs. I'm also looking forward to immersing myself in the right ways to do things with popular frameworks, including testing and deployment.

I'd like to become a good programmer someday. What would that be like? For the particular way that I work–a generalist pulling together different things quickly–a good programmer might be one who has a neatly organized library of snippets, and who writes modular code with simple tests that exercise different functionality. Using a debugger, the good programmer would be able to dig into other people's code, figuring out even things that aren't documented. That programmer would also be able to quickly prototype and build well-designed interfaces. Things don't have to win awards, but the interface shouldn't get in the way. An even better programmer would have the ability to coordinate other programmers, improving people's results by helping them work on both a tactical and strategic level.

Someday!

2014-09-15 Reflecting on my growth as a programmer

2014-09-15 Reflecting on my growth as a programmer

View or add comments (Disqus), or e-mail me at sacha@sachachua.com

My path for learning AngularJS

Posted: - Modified: | development, geek

I'd been meaning to learn AngularJS for a while, and rapidly prototyping a data-binding-heavy Javascript application was the perfect excuse. The phonecat tutorial on the AngularJS site was a little too heavy-weight for me, although it would probably have been useful for learning how to Do Things Right. Simpler, from-scratch tutorials like AngularJS in 30 minutes and ng-newsletter were a little more useful for me. After I got the hang of setting things up and using a controller, I browsed through the AngularJS documentation and looked for different modules as I needed them.

Here's the rough order I learned things in:

  1. Binding data with \{\{\}\}
  2. Retrieving data with $http (since I already had JSON handy from the NodeJS site I created)
  3. Iterating over data with ng-repeat
  4. Adding ng-click events
  5. Using ng-class
  6. Dependency injection
  7. Figuring out routing with ui-router
  8. Dividing things into multiple routers
  9. $interval and $timeout
  10. State change functions
  11. Resources (although I didn't end up really using these)
  12. Directives

I still have to learn about filters, nested views, testing, proper file organization, and all sorts of other goodness. But yeah, AngularJS feels pretty good for my brain… =) Yay!

View or add comments (Disqus), or e-mail me at sacha@sachachua.com

Programming and creativity

Posted: - Modified: | design, development

My client had been bringing me a constant stream of little technical challenges to solve. I pulled together different tools to make things happen: AutoHotkey, NodeJS, shared files, optical mark recognition, and so on. He said it was fun watching me figuring things out. It got me thinking about how programming can involve many different types of creativity. If you can tell the different types apart, you might be able to focus on improving some of those aspects.

2014-09-10 Programming - What does it mean to be creative?

2014-09-10 Programming – What does it mean to be creative?

Here's a rough first pass:

Hmm…

View or add comments (Disqus), or e-mail me at sacha@sachachua.com

Planning my code/development learning

Posted: - Modified: | development, experiment, geek, learning

One of the best things about programming is that as you learn more, the possibilities increase dramatically. Each new thing you learn can be combined with so many other things for even more awesomeness. I’m getting ready for my idea-of-the-month experiment, and I’m thinking about the kinds of building blocks I’d like to learn more about and use.

Android development – I can build small apps for myself
Data visualization library like D3 – web-based graphs
Some kind of Javascript/CSS/Rails front-end toolkit that makes it easier for me to prototype rich user interfaces
Evernote API, so that I can improve my workflow
AutoHotkey – finely-tuned timesavers
Rails 4 – prototyping
WordPress backend – building things
WordPress theming – design
Twitter API, for analysis
Meetup API, for analysis
Google Contacts, Google Calendar, IMAP headers – for analysis
OCR, speech recognition – so I can convert, even at 80% accuracy
Arduino, sensors, and motors – for interfacing with the physical world
Emacs LISP – for personal productivity
PhoneGap – cross-platform mobile apps?
Dropbox API – tools, analysis
Twilio or some other text API – communication

For March, I’d love to dig into Emacs, D3, and maybe Evernote. That way, I can prepare for the Emacs conference, visualize my Quantified Self stuff, and dig into my new brain backup system. :)

View or add comments (Disqus), or e-mail me at sacha@sachachua.com