Category Archives: wordpress

Filtering WordPress posts after a certain date

I wanted to make it easy to link people to a chronological view of my weekly reviews after becoming a parent, so I added this code to the functions.php in my custom WordPress theme.

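function sacha_tweak_query($query) {
    // Only adjust the main query on the public site, not admin screens or secondary loops
    if (is_admin() || !$query->is_main_query()) {
        return;
    }
    // !empty() avoids undefined-index notices when the parameter is missing
    if (!empty($_REQUEST['after']) && preg_match('/^[-0-9]+$/', $_REQUEST['after'])) {
        $query->set('date_query', array('column' => 'post_date', 'after' => $_REQUEST['after'], 'inclusive' => true));
    }
    if (!empty($_REQUEST['bulk'])) {
        $query->set('posts_per_page', -1);
    }
}

add_action('pre_get_posts', 'sacha_tweak_query');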

It checks for HTTP query variables of the form after=2016-02-22 and bulk=1. If it sees an “after” filter, it updates the query to show only posts on or after that date (the date query is inclusive). Bulk gets you all the entries on one page. (… Please use this wisely. =) )

Using the pre_get_posts action lets me make the functionality available across all the archive pages (tag, category, date) without adding special code to each of them. Neat!
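
As a rough sketch of how links with these parameters could be put together, here is a small JavaScript helper. The name buildArchiveLink is made up for illustration and is not part of WordPress or the theme. It repeats the same digits-and-dashes check before adding the parameters to any archive URL.

```javascript
// Sketch: build a link to an archive page filtered by "after" and "bulk".
// buildArchiveLink is a hypothetical helper, not part of WordPress or the theme.
function buildArchiveLink(archiveUrl, afterDate, bulk) {
  // Mirror the server-side check: only digits and dashes are accepted.
  if (!/^[-0-9]+$/.test(afterDate)) {
    throw new Error("Expected a date like 2016-02-22, got: " + afterDate);
  }
  var url = new URL(archiveUrl);
  url.searchParams.set("after", afterDate);
  if (bulk) {
    url.searchParams.set("bulk", "1");
  }
  return url.toString();
}

console.log(buildArchiveLink("http://sachachua.com/blog/category/drawing/", "2016-02-22", true));
// prints "http://sachachua.com/blog/category/drawing/?after=2016-02-22&bulk=1"
```

Because the filter hooks into the query itself, the same helper should work for tag, category, and date archive URLs alike.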

Fixed paragraph breaks in WordPress, no more wall of text

While trying out the “after” filter I just added to my blog, I noticed that my paragraph breaks were missing. I hadn’t noticed it for a while because I’ve been building up my weekly and monthly reviews from sketches instead of blog posts. How embarrassing!

(Then A- woke up and it was time for lunch, so I was a bit frazzled. But W- stepped in and took care of her, hooray!)

I saw the paragraph breaks in WordPress’ visual editor, but not the exported HTML, which just kept whitespace in between the paragraphs instead of breaking them up with tags. It happened even when I created a new post through the web interface, so it wasn’t org2blog’s fault.

I checked if the paragraph issue happened on a new install. It didn’t.

I checked if the paragraph issue happened with all the plugins deactivated. It didn’t. Aha! (Note to self: I really should set up a dev environment again…)

I turned the plugins on one by one, and I narrowed it down to the NextGen Gallery plugin. It worked after I updated that.

Anyway, things should be readable again. Hooray!

Visualizing the internal citation network of my blog

I’m curious about the internal citations within my blog. Which thoughts have been developed over a long chain of posts? Which posts do I often link to? Where are there big jumps in time? Where have I combined threads?

2014-12-03 Internal citation network

I’ll probably need to build my own data extractor so that it can:

  • ignore weekly and monthly reviews, since I link to everything in those,
  • reconcile short and long permalinks, redirection, and sneak previews,
  • and maybe even index my sketches and look at follow-ups

and I’ll probably want to create something that I could eventually plot as an SVG or imagemap using Graphviz, or maybe analyze using Gephi.

It would be super-interesting to create some kind of output that I could fold into my blog outline or into individual posts. I would need a static dump for that one, I think.

How would I build something like this? One time, I used Ruby to analyze the text of my blog. That might work. I might be able to pull out all the link hrefs, do lookups…
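
Pulling out the link hrefs could look something like this minimal JavaScript sketch. It is only an illustration: extractInternalLinks is a made-up name, the regular expression assumes double-quoted href attributes, and the prefix argument is whatever URL prefix counts as an internal link.

```javascript
// Sketch: collect the distinct internal links found in one post's HTML.
// extractInternalLinks is a hypothetical helper; it assumes href="..." with double quotes.
function extractInternalLinks(html, prefix) {
  var links = [];
  var pattern = /href="([^"]+)"/gi;
  var match;
  while ((match = pattern.exec(html)) !== null) {
    var href = match[1];
    // Keep links that start with the prefix, skipping repeats.
    if (href.indexOf(prefix) === 0 && links.indexOf(href) === -1) {
      links.push(href);
    }
  }
  return links;
}
```

Running this over each post and pairing the post's own permalink with every returned link would give a list of from/to edges. A real extractor would still need to deal with short permalinks and redirections.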

As of Dec 3, 2014, there are 87 posts in 2014 that link to previous posts, out of 259 non-review posts (so roughly 34%). I used this SQL query to get that:

SELECT post_title FROM wp_posts WHERE post_content LIKE '%<a href="http://sachachua.com/blog/20%' AND post_date >= '2014-01-01' AND post_title NOT LIKE '%review:%' AND post_status='publish';

Hmm. I might even be able to do some preliminary explorations with Emacs and text processing instead of writing a script to analyze this, if I focus on 2014 and ignore the special cases (short permalinks, redirection, and sneak previews), just to see what the data looks like. Rough technical notes:

perl -i -p -e 's/href/\nhref/gi' 2014-manip.html
grep http://sachachua.com/blog/20 2013-manip.html > list-2013
perl -i -p -e 's/(<\/a>(<\/h2>)?).*/$1/gi' list-2013

(defun sacha/misc-clean-up-reviews ()
  (interactive)
  (while (re-search-forward "\\(Monthly\\|Weekly\\) review: .*</h2>" nil t)
    (let ((start (line-beginning-position)))
      (re-search-forward "</h2>")
      (delete-region start (line-beginning-position))
      (goto-char (line-beginning-position)))))

(defun sacha/org-tabulate-links ()
  (interactive)
  (goto-char (point-min))
  (let (main-link edges nodes)
    (while (not (eobp))
      (if (looking-at "^href=\"\\(.*?\\)\".*?</a></h2>")
          ;; Main entry
          (progn
            (setq nodes (cons (match-string 1) nodes))
            (setq main-link (match-string 1)))
        (if (looking-at "^href=\"\\(.*?\\)\"")
            (setq edges (cons (concat 
                               main-link  ;; from
                               "\t"
                               (match-string 1)   ;; to 
                               ) edges))))
      (forward-line 1))
    (kill-new (mapconcat 'identity edges "\n"))))

Ooooh. Pretty. Gephi visualization of the edge list formed by links, using the Yifan Hu layout. That big thread in the middle, that’s the one about taskmasters and choice and productivity, which is indeed the core theme running through this year of the experiment. The cluster on the right is a year in review. We see lots of little links too.

Internal links for entries posted in 2014
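
For the Graphviz route, a tab-separated edge list (one "from", a tab, then "to" per line, as copied by sacha/org-tabulate-links) could be converted to DOT with a sketch like the one below. The function names and the graph name "blog" are made up for illustration.

```javascript
// Sketch: turn a tab-separated edge list into Graphviz DOT text.
// dotQuote and edgeListToDot are hypothetical helper names.
function dotQuote(id) {
  // Inside a quoted DOT ID, embedded double quotes need a backslash.
  return '"' + id.replace(/"/g, '\\"') + '"';
}

function edgeListToDot(edgeList) {
  var lines = ["digraph blog {"];
  edgeList.split("\n").forEach(function (line) {
    var parts = line.split("\t");
    if (parts.length !== 2 || !parts[0] || !parts[1]) {
      return; // skip blank or malformed lines
    }
    lines.push("  " + dotQuote(parts[0]) + " -> " + dotQuote(parts[1]) + ";");
  });
  lines.push("}");
  return lines.join("\n");
}
```

The resulting text can be saved to a file and rendered with Graphviz, for example with dot -Tsvg.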

Now I’m curious about what happens when we add the posts and links from 2013 and 2012, too. I’ve colour-coded this by year. It ties together posts on sketchnoting, blogging, choice, learning, writing, plans… Neat.

Internal links for entries posted from 2012 to 2014, colour-coded by year

What does this say? It says that even though I write about lots of different things, there are connections between the different topics, and the biggest tangle in the middle has more than 320 nodes. I have lots of blog posts that build on an idea for three or four posts, sometimes spanning several years, even if I don’t think about it in advance. There are 90 such clumps, and it might be good to revisit some of these 2- and 3-post chains to see if I can link them up even further.

Also, it could be interesting to do this analysis with tags, not just year. Hmm… Also, I should dust off my data structures and algorithms, and make my own connected-component analyzer so that I can get a list of the clumps of topics. Good ideas to save for another day!
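
A connected-component analyzer along those lines might look like the following sketch. It assumes a tab-separated edge list (one "from", a tab, then "to" per line), treats links as undirected, and uses a made-up function name, findClumps. It runs a breadth-first search from every post that has not been visited yet.

```javascript
// Sketch: group posts into connected clumps, treating links as undirected.
// findClumps is a hypothetical helper; it expects "from<TAB>to" lines.
function findClumps(edgeList) {
  var neighbours = new Map();
  function connect(a, b) {
    if (!neighbours.has(a)) {
      neighbours.set(a, []);
    }
    neighbours.get(a).push(b);
  }
  edgeList.split("\n").forEach(function (line) {
    var parts = line.split("\t");
    if (parts.length !== 2 || !parts[0] || !parts[1]) {
      return; // skip blank or malformed lines
    }
    connect(parts[0], parts[1]);
    connect(parts[1], parts[0]);
  });

  var seen = new Set();
  var clumps = [];
  neighbours.forEach(function (unused, start) {
    if (seen.has(start)) {
      return;
    }
    // Breadth-first search from an unvisited post collects its whole clump.
    var clump = [];
    var queue = [start];
    seen.add(start);
    while (queue.length > 0) {
      var node = queue.shift();
      clump.push(node);
      neighbours.get(node).forEach(function (next) {
        if (!seen.has(next)) {
          seen.add(next);
          queue.push(next);
        }
      });
    }
    clumps.push(clump);
  });
  // Largest clumps first.
  clumps.sort(function (a, b) { return b.length - a.length; });
  return clumps;
}
```

The length of the returned array is the number of clumps, and filtering it for clumps of two or three posts would list the short chains worth revisiting.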

Thoughts in context: Connecting posts to my blog post index

I’ve been thinking about how to improve the inter-post organization of my blog so that I can write more effectively and so that other people can find things faster. I often link to posts I’ve previously written, but I rarely update old posts to link forward to other related posts. There are quite a few internal linking plugins for WordPress, but they seem more slanted towards SEO and keywords than I’d like.

I wanted to come up with another approach that could take advantage of the big outline of my blog posts at http://sachachua.com/index that I update every month. I’ve found this to be pretty handy for organizing things into finer categories instead of going back and updating lots of posts in WordPress. I can search this easily on my computer by using helm-swoop, and I can move things around using Org Mode.

It got me thinking: Is there a way I can make it easier to connect posts to the index so that if people find an old post useful, they can explore related posts?

So here’s what I came up with: a small “See in index” link at the end of the post.

Index link on old posts

Not all the posts are indexed yet, but for those that are, clicking on that link will open up the blog index and scroll it to the right post, highlighting the match, so people can see what else is in the neighbourhood.

Matching link

To make this work, I added the following HTML code to my blog index:

<script type="text/javascript">
// from http://www.jquerybyexample.net/2012/06/get-url-parameters-using-jquery.html
function getURLParameter(sParam)
{
    var sPageURL = window.location.search.substring(1);
    var sURLVariables = sPageURL.split('&');
    for (var i = 0; i < sURLVariables.length; i++)
    {
        var sParameterName = sURLVariables[i].split('=');
        if (sParameterName[0] == sParam)
        {
            // Rejoin on "=" so a value that itself contains "=" is not cut short
            return sParameterName.slice(1).join('=');
        }
    }
}

function sachaScrollToBlog() {
  if (getURLParameter("url")) {
    var link = $('a[href="' + getURLParameter("url") + '"]');
    if (link.length > 0) { 
      $('html, body').scrollTop(link.offset().top - 100);
      link.addClass("highlighted");
    } else {
      alert("Sorry, could not find post in index.");
    }
  }
}

$(document).ready(sachaScrollToBlog);
</script>
<style type="text/css">
a.highlighted { background-color: yellow; padding: 10px }
</style>

Then I added this to the post.php and single.php for my WordPress theme:

<?php
if (get_the_date('Y') >= '2008' && get_the_date('Y-m') < date('Y-m', time() - 60 * 60 * 24 * 7 * 2)) {
  print '<a href="http://pages.sachachua.com/sharing/blog.html?url=' . get_permalink() . '">See in index</a>';
}
?>

The date condition is there to minimize frustration. I’ve indexed posts published in 2008 or later, and I usually post an updated index within the first half of the month, so the link only shows up on a post once the month after it is about two weeks old. I might spend some time indexing older posts; if I do, I’ll update the starting year.
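
To see how the cutoff behaves, here is a rough JavaScript sketch of the same condition. The names yearMonth and shouldShowIndexLink are made up for illustration, and the sketch works in UTC, whereas the PHP version follows the site's own time settings.

```javascript
// Sketch: the same date condition as the PHP snippet, using UTC dates.
// yearMonth and shouldShowIndexLink are hypothetical helper names.
function yearMonth(date) {
  // "YYYY-MM" strings compare chronologically when compared as text.
  return date.toISOString().slice(0, 7);
}

function shouldShowIndexLink(postDate, now) {
  var TWO_WEEKS_MS = 60 * 60 * 24 * 7 * 2 * 1000;
  var cutoff = new Date(now.getTime() - TWO_WEEKS_MS);
  // Indexed from 2008 onward, and only for months before the cutoff's month.
  return postDate.getUTCFullYear() >= 2008 && yearMonth(postDate) < yearMonth(cutoff);
}
```

Under these assumptions, a post from 20 November 2014 has no link on 8 December 2014, because the cutoff still falls in November, but it has one by 16 December 2014.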

It would be interesting to refine my writing workflow so that my blog index is always up to date. That way, even new posts will have this magic indexing power. It might be difficult to get that squared up with my scheduling, though, since I sometimes write a few weeks in advance. Anyway, neat, huh? This should make the archives marginally more useful. =) Good for me too!

Publishing WordPress thumbnail images using Emacs and Org2Blog

I often include large images in my blog posts since I use sketches as another way to think out loud. I’d gotten used to using the WordPress web interface to drag and drop them into the relevant section of the page. I wrote most text in Emacs/Org Mode/Org2Blog because of the better outlining and writing tools, and then I used sacha/org-copy-region-as-html (which you can grab from my Emacs configuration) to copy the HTML markup and paste it into WordPress. Of course, I use Emacs for source-code heavy posts that make the most of its syntax formatting support.

Someone asked me recently about how to post and update blog posts with images through Org2blog, and if I had any recommendations for workflow. I’d dropped Windows Live Writer since it was flaking out on me and the WordPress web interface had improved a lot, but before recommending just using WordPress to add images, I was curious about whether I could improve my blogging workflow by digging into Org Mode and Org2Blog further.

It turns out (like it usually does in the Emacs world) that someone had already solved the problem, and I just didn’t have the updated version. Although the upstream version of Org2Blog didn’t yet have the thumbnail code, searching for “org2blog wordpress thumbnail” led me to cpbotha’s GitHub issue and pull request. Punchagan’s version had some changes that were a little bit ahead of cpbotha’s, so I dusted off my ancient org2blog repository, cloned it onto my computer, and issued the following commands:

git remote add upstream https://github.com/punchagan/org2blog
git pull upstream master
git remote add cpbotha https://github.com/cpbotha/org2blog.git
git pull cpbotha image-thumbnail

and tested it out on a blog post I’d already drafted in Org. It took me a little while to remember that the file URLs didn’t like ~, so I specified a relative path to the image instead. But then it all worked, yay! A quick git push later, and my GitHub repository was up to date again.

So now I’m back to running a Git version of org2blog instead of the one that I had installed using the built-in packaging system. The way I make it work is that I have this near the beginning of my Emacs configuration:

;; This sets up the load path so that we can override it
(package-initialize nil)
;; Override the packages with the git version of Org and other packages
(add-to-list 'load-path "~/elisp/org-mode/lisp")
(add-to-list 'load-path "~/elisp/org-mode/contrib/lisp")
(add-to-list 'load-path "~/code/org2blog")
(add-to-list 'load-path "~/Dropbox/2014/presentations/org-reveal")
;; Load the rest of the packages
(package-initialize t)
(setq package-enable-at-startup nil)

This allows me to mostly use the packages and to satisfy dependencies, but override some of the load paths as needed.

Hope that helps someone else!

More notes on managing a large blog archive: 17 things I do to handle 10+ years of blog posts

I’ve been thinking a lot about how to manage a large archive to encourage discovery and serendipity, and to make it easier to fish out articles so that I can send them to people. I started in 2001-ish and have more than 6,500 posts. There’s not a lot of information on how to manage a large archive. Most blogging-related advice focuses on helping people get started and get going. Few people have a large personal archive yet. I love coming across other bloggers who have been at this for more than ten years, because information architecture is fascinating. Here’s what I do, in case it gives you any ideas.

  1. I set up Google Chrome quick searches for my blog, categories, and tags. This means I can quickly dig up blog posts if I remember roughly where they are. (Gear > Settings > Search > Manage Search Engines):
    • Blog (b): https://www.google.ca/search?q=site%3Asachachua.com+%s
    • Blog category (bc): http://sachachua.com/blog/category/%s
    • Blog tag (bt): http://sachachua.com/blog/tag/%s
  2. I create pages with additional notes and lists of content. I use either Display Posts Shortcode or WP Views, depending on what I need. See the Emacs page as an example.
  3. I’ve started using Organize Series to set up trails through my content. It’s more convenient than manually defining links, and it allows people to page through the posts in order too. Read my notes to find examples. I’m also working on maps, outlines, and overviews.
  4. I’ve also started packaging resources into PDFs and e-books. It makes sense to organize things in a more convenient form.
  5. I converted all the categories with fewer than ten entries to tags. Categories can get unwieldy when you create them organically, so I use categories for main topics and tags for other keywords that might graduate to become categories someday. I think I used Categories to Tags Converter or Taxonomy Converter for this. Hah! Similar Posts reminded me that I used Term Management Tools. Awesome.
  6. I manually maintain a more detailed categorical index at sach.ac/index. This makes it easier for me to see when many blog posts are piling up in a category, and to organize them more logically.
  7. I set up short URLs for frequently-mentioned posts. The Redirection plugin does a decent job at this. For example, people often ask me about the tools I use to draw, and it’s great to just be able to type in http://sach.ac/sketchtools as an answer.
  8. I post weekly and monthly reviews. The weekly review includes links to that week’s blog posts, and the monthly review includes a categorized list. I’ve also set up daily, weekly, and monthly subscriptions based on the RSS feeds. This is probably overkill (more choices = lower subscriptions), but I want to give people options for how frequently they want updates. The weekly and monthly reviews are also helpful for me in terms of quickly getting a sense of the passage of time.
  9. I use Similar Posts to recommend other things people might be interested in. There are a number of similar plugins, so try different ones to see which one you like the most. I tried nRelate and the one from Zemanta, but I wasn’t happy with the way those looked, so I’m back to plain text.
  10. I show recent comments. People often comment on really old posts, and this is a great way for other people to discover them.
  11. I use post titles in my next/previous navigation, and I labelled them “Older” and “Newer”. I think they’re more interesting than
  12. I customized my theme pages to make it easier to skim through posts or get them in bulk. For example, http://sachachua.com/blog/2014/02 lists all the posts for February. http://sachachua.com/blog/2014/?bulk=1 puts all the posts together so that I can copy and paste it into a Microsoft Word file. http://sachachua.com/blog/2014/?org=1 puts it in a special list form so that I can paste it into Org Mode in Emacs. You can also pass the number of posts to a category page: http://sachachua.com/blog/category/drawing/?posts_per_page=-1 displays all the posts instead of paginating them. These tweaks make it easier for me to copy information, too.
  13. I give people the option to browse oldest posts first. Sometimes people prefer starting from the beginning, so I’ve added a link that switches the current view around.
  14. I have an “On this day” widget. Sometimes I notice interesting things in it. I used to put it at the end of a post, but I moved it to the sidebar to make the main column cleaner.
  15. For fun, I have a link that goes to a random post. I used to display random post titles in the sidebar, which might be an interesting approach to return to.
  16. I back up to many different places. I mirror my site as a development environment. I back up the database and the files to another web server and to my computer, and I duplicate the disk image with Linode too. I should set up incremental backups so that it’s easier to go back in time, just in case.
  17. I rated my posts and archived my favourite ones as a PDF so that I’ll still have them even if I mess up my database. Besides, it was a good excuse to read ten years of posts again.

Hope that gives you some ideas for things to experiment with! I’m working on organizing more blog posts into trails and e-books. I’m also getting better at planning what I want to write about and learn. If you’re curious about any of the techniques I use or you want to bounce around ideas, feel free to e-mail me or set up a chat.

Do you have a large blog? How do you manage it?