Category Archives: wordpress

Summarizing my WordPress posts using XSLT; 2008 as a PDF

It’s the time of the year for annual updates. I was thinking of reviewing all the blog posts I’d written this year. My weekly and monthly posts are incomplete, though, and I want to make sure I cover everything. I also know a few people who are slowly working their way through my archives. So I thought I’d export all of my posts from 2008 into something that people can read with fewer clicks.

If you want to skip past all the geek details, you can get the files here: 2008 blog (4.6 MB, 307 pages(!)), 2008 mostly nongeek entries (3.8 MB, 195 pages).

After some tinkering around with wptex and other modules that are supposed to make this easier, I gave up and decided to do it myself. I toyed with the idea of writing a short Ruby program that either parsed the XML or read the database, but I eventually ended up taking it as an excuse to learn XSLT, a language for transforming XML. WordPress can export posts and comments as XML. After I scrubbed my WordPress of spam and raised my PHP execution times, I downloaded the XML file and started figuring out how to get it into the form I wanted: a document organized by month, with a table of contents listing all the posts.

Here’s the main stylesheet I used:

 <xsl:stylesheet version="1.0"
                 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                 xmlns:content="http://purl.org/rss/1.0/modules/content/"
                 xmlns:wp="http://wordpress.org/export/1.0/">
   <xsl:output method="html"/>
   <xsl:template match="/">
     <html><body>
       <h0>January 2008</h0>
       <xsl:apply-templates select="/rss/channel/item[contains(pubDate, 'Jan 2008') and wp:status='publish']"/>
       <h0>February 2008</h0>
       <xsl:apply-templates select="/rss/channel/item[contains(pubDate, 'Feb 2008') and wp:status='publish']"/>
       <h0>March 2008</h0>
       <xsl:apply-templates select="/rss/channel/item[contains(pubDate, 'Mar 2008') and wp:status='publish']"/>
       <h0>April 2008</h0>
       <xsl:apply-templates select="/rss/channel/item[contains(pubDate, 'Apr 2008') and wp:status='publish']"/>
       <h0>May 2008</h0>
       <xsl:apply-templates select="/rss/channel/item[contains(pubDate, 'May 2008') and wp:status='publish']"/>
       <h0>June 2008</h0>
       <xsl:apply-templates select="/rss/channel/item[contains(pubDate, 'Jun 2008') and wp:status='publish']"/>
       <h0>July 2008</h0>
       <xsl:apply-templates select="/rss/channel/item[contains(pubDate, 'Jul 2008') and wp:status='publish']"/>
       <h0>August 2008</h0>
       <xsl:apply-templates select="/rss/channel/item[contains(pubDate, 'Aug 2008') and wp:status='publish']"/>
       <h0>September 2008</h0>
       <xsl:apply-templates select="/rss/channel/item[contains(pubDate, 'Sep 2008') and wp:status='publish']"/>
       <h0>October 2008</h0>
       <xsl:apply-templates select="/rss/channel/item[contains(pubDate, 'Oct 2008') and wp:status='publish']"/>
       <h0>November 2008</h0>
       <xsl:apply-templates select="/rss/channel/item[contains(pubDate, 'Nov 2008') and wp:status='publish']"/>
       <h0>December 2008</h0>
       <xsl:apply-templates select="/rss/channel/item[contains(pubDate, 'Dec 2008') and wp:status='publish']"/>
   </body></html>
   </xsl:template>
   <xsl:template match="//item">
     <h1><a>
       <xsl:attribute name="href">
         <xsl:value-of select="link"/>
       </xsl:attribute>
       <xsl:value-of select="title"/></a></h1>
     <div class="link"><xsl:value-of select="link"/></div>
     <div class="date"><xsl:value-of select="pubDate"/></div>
     <div class="content">
       <xsl:value-of select="content:encoded" disable-output-escaping="yes" />
     </div>
   </xsl:template>
 </xsl:stylesheet>
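
If you would rather script the selection than write XSLT, here is a rough Python sketch of the same filter: keep only items whose status is publish and group them by the month that appears in pubDate. It is not what I actually ran. The sample export, the `published_by_month` function, and the post titles are made up for illustration; the wp namespace URI is copied from the stylesheet above, and your export may declare a different version.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the stylesheet above (assumption: your export uses the same one).
NS = {"wp": "http://wordpress.org/export/1.0/"}

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Tiny made-up export, just enough structure to exercise the filter.
SAMPLE = """<rss xmlns:wp="http://wordpress.org/export/1.0/">
  <channel>
    <item>
      <title>First post</title>
      <pubDate>Tue, 01 Jan 2008 12:00:00 +0000</pubDate>
      <wp:status>publish</wp:status>
    </item>
    <item>
      <title>Unfinished draft</title>
      <pubDate>Wed, 02 Jan 2008 12:00:00 +0000</pubDate>
      <wp:status>draft</wp:status>
    </item>
    <item>
      <title>Second post</title>
      <pubDate>Fri, 01 Feb 2008 09:30:00 +0000</pubDate>
      <wp:status>publish</wp:status>
    </item>
  </channel>
</rss>"""


def published_by_month(xml_text, year="2008"):
    """Return {month abbreviation: [titles]} for published items in the given year."""
    root = ET.fromstring(xml_text)
    grouped = {month: [] for month in MONTHS}
    for item in root.findall("./channel/item"):
        # Same test as wp:status='publish' in the XPath predicate.
        if item.findtext("wp:status", default="", namespaces=NS) != "publish":
            continue
        pub_date = item.findtext("pubDate", default="")
        for month in MONTHS:
            # Same substring test as contains(pubDate, 'Jan 2008').
            if f"{month} {year}" in pub_date:
                grouped[month].append(item.findtext("title", default=""))
    return grouped


if __name__ == "__main__":
    for month, titles in published_by_month(SAMPLE).items():
        if titles:
            print(month, titles)
```

The substring match on pubDate is as crude as the XPath version, but it means the month grouping needs no date parsing at all.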

For the non-geek version, I replaced the template with:

   <xsl:template match="//item">
     <xsl:if test="not(category[@nicename='emacs']) and not(category[@nicename='drupal']) and not(category[@nicename='geek'])">
     <h1><a>
       <xsl:attribute name="href">
         <xsl:value-of select="link"/>
       </xsl:attribute>
       <xsl:value-of select="title"/></a></h1>
     <div class="link"><xsl:value-of select="link"/></div>
     <div class="date"><xsl:value-of select="pubDate"/></div>
     <div class="content">
       <xsl:value-of select="content:encoded" disable-output-escaping="yes" />
     </div>
     </xsl:if>
   </xsl:template>
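
The same exclusion can be sketched in Python, assuming each post carries category elements with a nicename attribute as in the test above. The sample data and the `is_nongeek` and `nongeek_titles` helpers are hypothetical names for illustration only.

```python
import xml.etree.ElementTree as ET

# Category nicenames treated as "geek", matching the xsl:if test above.
GEEK_NICENAMES = {"emacs", "drupal", "geek"}

# Made-up items for illustration.
SAMPLE = """<rss><channel>
  <item><title>Emacs tip</title><category nicename="emacs">Emacs</category></item>
  <item><title>Gardening</title><category nicename="life">Life</category></item>
  <item><title>Uncategorized</title></item>
</channel></rss>"""


def is_nongeek(item):
    """True when none of the item's categories has a geek nicename."""
    nicenames = {category.get("nicename") for category in item.findall("category")}
    return nicenames.isdisjoint(GEEK_NICENAMES)


def nongeek_titles(xml_text):
    """Return titles of items that pass the non-geek filter, in document order."""
    root = ET.fromstring(xml_text)
    return [item.findtext("title", default="")
            for item in root.findall("./channel/item")
            if is_nongeek(item)]


if __name__ == "__main__":
    print(nongeek_titles(SAMPLE))
```

Note that a post with no categories at all passes the filter, which is also how the three not(...) tests behave.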

I didn’t want to figure out how to demote all the headings in my blog posts inside XSLT (I have a few), so I used <h0> for the month headings, one level above the <h1> post titles. I used xsltproc to transform the XML file I got from WordPress. Then I bumped every heading down a level (<h0> to <h1>, <h1> to <h2>, and so on) with the following bit of Emacs Lisp:

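 (defun sacha/demote-all-headings ()
   "Demote every plain <hN> tag after point by one level (<h0> becomes <h1>, and so on)."
   (interactive)
   ;; Match opening and closing tags without attributes; group 1 is the level digit.
   (while (re-search-forward "</?h\\([0-6]\\)>" nil t)
     ;; Replace only the digit (subexpression 1) with the next level.
     (replace-match (number-to-string (1+ (string-to-number (match-string 1)))) nil t nil 1)))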
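
If Emacs isn’t handy, the heading shift can also be done with a regular expression in a script. This is a minimal Python sketch, not the code I ran; the `demote_headings` name is made up. It assumes headings run from <h0> to <h5>, so nothing is pushed past <h6>, and it also catches opening tags that carry attributes.

```python
import re

# "<h" or "</h", then a level digit 0-5, then whitespace or ">".
# The lookahead keeps tags such as <hr> or <html> from matching.
HEADING_TAG = re.compile(r"(</?h)([0-5])(?=[\s>])")


def demote_headings(html):
    """Shift every heading tag one level down: h0 -> h1, h1 -> h2, and so on."""
    return HEADING_TAG.sub(
        lambda match: match.group(1) + str(int(match.group(2)) + 1),
        html,
    )


if __name__ == "__main__":
    print(demote_headings('<h0>January 2008</h0><h1>Title</h1><h2 class="x">Sub</h2>'))
```

Because the substitution is a single pass over the text, a tag that has just become <h1> is not bumped again.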

It’s all held together with bubblegum and string, really.

2008 blog (4.6 MB, 307 pages(!)), 2008 mostly nongeek entries (3.8 MB, 195 pages)

I haven’t looked at these files much yet – I just scrolled through them quickly. No, don’t worry, I’m not going to send my 2008 update as 307 pages in the mail. ;) But it’s there so that we can flip through it or you can borrow the code, and someday I’ll even figure out how to format the output neatly and everything.

Next step: I need to read all of that and highlight a couple of things that made my year.

(307 pages! Wow.)

I’m so sorry!

I got caught up in IBM’s Innovation Jam, and I hadn’t realized that my blog was somewhat broken.

Oops.

I’ve disabled a great number of third-party things, including the Twitter Tools feed that made everything go haywire.

Which widgets would you like back?

WordPress and lifestreaming – check out my draft firehose interface

Inspired by WordCamp Toronto (and the Flutter plugin in particular), I decided to spend some time figuring out if I could use WordPress as a tumblelog/lifestream without overwhelming people and while still making my regular blog posts easy to find. I also wanted to bring in some of the weekly and daily planning that I do. Here’s what I have so far:

Draft firehose interface

It’s currently running off categories of posts that are excluded from the default RSS feed and from index.php. I’m half-tempted to make it run off files instead, because I can very easily rsync those from my computer… and that will probably end up involving Emacs. ;) That would be pretty sweet, wouldn’t it?

… or a blosxom instance that feeds RSS into WordPress…

… or an Org/Planner export that feeds RSS into WordPress…

Oh, the possibilities.

What do you think? I’m planning to offer several interfaces to my blog. Firehose might become the default interface (there’ll be a main-post summary version for people who like scanning and a main-post full version for people who hate clicking). There could be a traditional reverse-chronological everything view and an almost-everything view (which excludes tidbits). There could also be an explore view full of random posts and “On This Day” goodness. And maybe another view for people coming in from search engines…

What do you think? What would make it easier for you to browse?

Notes from WordCamp

wordcamptoronto on Twitter
#wpto08, #wcto08, which one?
Joseph Thornley
search.twitter.com
sociology + technology
RSS changed it from pastime to productivity tool
Magazine analogy – doesn’t make sense to keep physically checking the newsstand
Asked audience who has developed plugins, nice interaction
Check out category enhancements
wpdiso? profile plugin
live-conference.ca
phug.ca

Matt Mullenweg
If it takes you more than five minutes [to upgrade], you’re doing it wrong, as the lolcats would say (good idea for another presentation: bring in lolcats picture)
2 Wikipedias a month posted on wordpress.com
5 billion spam comments caught, 99.925% accuracy
camp vs conference, open source vs closed source
kudos to Davao WordCamp for being awesomest, mentioned karaoke sound system, pool, lumpia, super-passionate people, awesome shirt
Release cycles, time-based, 2 months dev 1 month cool-down, 1 month testing – reminds me of what Mark Shuttleworth said re cadence
Top 10 WordPress plugins
Looking into better multimodal support

Other notes
WordPress help desk
Role scoper
Flashpress
WordPress developer’s toolbox, Drupal version also
Flutter
Comicpress
Theme test drive
WordPress e-commerce
Contact manager

Conversations
Himy – misses Emacs Planner PIM bliki
Brian Anderson, Mireille Massue, Elena Yunusov – storytelling
Mireille – SecondLife, presentations, storytelling, visual thinking – introduced by Tania Samsonova
Stuart Dykstra – SecondLife, virtual culture

New blog design

Well, it’s still really the Networker-10 theme underneath, but I’ve stripped away a lot of the CSS that made my site look heavy, moved things around, added some quick links along the top, and finally got around to making sure wp-cache worked. The site should be nice and zippy again. Check it out at sachachua.com!

Changed to excerpts on the main page – What do you think?

Following the advice in Debbie Weil’s “The Corporate Blogging Book”, I’ve changed the main page of my blog to show excerpts instead of full posts. That way, you won’t get intimidated by huge blocks of text. What do you think about it? Good idea? Bad idea? Should I go back to full posts?

The Corporate Blogging Book: Absolutely Everything You Need to Know to Get It Right
by Debbie Weil