More notes on managing a large blog archive: 17 things I do to handle 10+ years of blog posts

I’ve been thinking a lot about how to manage a large archive to encourage discovery and serendipity, and to make it easier to fish out articles so that I can send them to people. I started in 2001-ish and have more than 6,500 posts. There’s not a lot of information on how to manage a large archive. Most blogging-related advice focuses on helping people get started and get going. Few people have a large personal archive yet. I love coming across other bloggers who have been at this for more than ten years, because information architecture is fascinating. Here’s what I do, in case it gives you any ideas.

  1. I set up Google Chrome quick searches for my blog, categories, and tags (Gear > Settings > Search > Manage Search Engines). This means I can quickly dig up blog posts if I remember roughly where they are:
    • Blog (b): https://www.google.ca/search?q=site%3Asachachua.com+%s
    • Blog category (bc): http://sachachua.com/blog/category/%s
    • Blog tag (bt): http://sachachua.com/blog/tag/%s
  2. I create pages with additional notes and lists of content. I use either Display Posts Shortcode or WP Views, depending on what I need. See the Emacs page as an example.
  3. I’ve started using Organize Series to set up trails through my content. It’s more convenient than manually defining links, and it allows people to page through the posts in order too. Read my notes to find examples. I’m also working on maps, outlines, and overviews.
  4. I’ve also started packaging resources into PDFs and e-books. It makes sense to organize things in a more convenient form.
  5. I converted all the categories with fewer than ten entries to tags. Categories can get unwieldy when you create them organically, so I use categories for main topics and tags for other keywords that might graduate to become categories someday. I think I used Categories to Tags Converter or Taxonomy Converter for this. Hah! Similar Posts reminded me that I used Term Management Tools. Awesome.
  6. I manually maintain a more detailed categorical index at sach.ac/index. This makes it easier for me to see when many blog posts are piling up in a category, and to organize them more logically.
  7. I set up short URLs for frequently-mentioned posts. The Redirection plugin does a decent job at this. For example, people often ask me about the tools I use to draw, and it’s great to just be able to type in http://sach.ac/sketchtools as an answer.
  8. I post weekly and monthly reviews. The weekly review includes links to that week’s blog posts, and the monthly review includes a categorized list. I’ve also set up daily, weekly, and monthly subscriptions based on the RSS feeds. This is probably overkill (more choices = lower subscriptions), but I want to give people options for how frequently they want updates. The weekly and monthly reviews are also helpful for me in terms of quickly getting a sense of the passage of time.
  9. I use Similar Posts to recommend other things people might be interested in. There are a number of similar plugins, so try different ones to see which one you like the most. I tried nRelate and the one from Zemanta, but I wasn’t happy with the way those looked, so I’m back to plain text.
  10. I show recent comments. People often comment on really old posts, and this is a great way for other people to discover them.
  11. I use post titles in my next/previous navigation, and I labelled them “Older” and “Newer”. I think they’re more interesting than
  12. I customized my theme pages to make it easier to skim through posts or get them in bulk. For example, http://sachachua.com/blog/2014/02 lists all the posts for February. http://sachachua.com/blog/2014/?bulk=1 puts all the posts together so that I can copy and paste them into a Microsoft Word file. http://sachachua.com/blog/2014/?org=1 puts them in a special list form so that I can paste them into Org Mode in Emacs. You can also pass the number of posts to a category page: http://sachachua.com/blog/category/drawing/?posts_per_page=-1 displays all the posts instead of paginating them. These tweaks make it easier for me to copy information, too.
  13. I give people the option to browse oldest posts first. Sometimes people prefer starting from the beginning, so I’ve added a link that switches the current view around.
  14. I have an “On this day” widget. Sometimes I notice interesting things in it. I used to put it at the end of a post, but I moved it to the sidebar to make the main column cleaner.
  15. For fun, I have a link that goes to a random post. I used to display random post titles in the sidebar, which might be an interesting approach to return to.
  16. I back up to many different places. I mirror my site as a development environment. I back up the database and the files to another web server and to my computer, and I duplicate the disk image with Linode too. I should set up incremental backups so that it’s easier to go back in time, just in case.
  17. I rated my posts and archived my favourite ones as a PDF so that I’ll still have them even if I mess up my database. Besides, it was a good excuse to read ten years of posts again.
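
For anyone curious how the quick searches in item 1 behave, here is a rough Python sketch of the substitution. The three templates are copied from the list above; the `expand` helper and the choice of `quote_plus` for encoding are assumptions for illustration, not how Chrome itself is implemented (a browser may encode spaces differently in a path than in a query string).

```python
from urllib.parse import quote_plus

# Keyword -> URL template, copied from item 1.
# "%s" marks where the typed search text goes.
QUICK_SEARCHES = {
    "b": "https://www.google.ca/search?q=site%3Asachachua.com+%s",
    "bc": "http://sachachua.com/blog/category/%s",
    "bt": "http://sachachua.com/blog/tag/%s",
}


def expand(keyword: str, terms: str) -> str:
    """Fill the template's %s with URL-encoded terms (hypothetical helper)."""
    template = QUICK_SEARCHES[keyword]
    # Use str.replace rather than Python's % operator: the first template
    # already contains a literal percent escape (%3A) that % formatting
    # would misread as a format specifier.
    return template.replace("%s", quote_plus(terms))


print(expand("b", "emacs org"))
print(expand("bc", "drawing"))
```

Typing `b emacs org` in the address bar would then correspond to a site-restricted Google search, and `bc drawing` to the drawing category page.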
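
The rule in item 5 (categories with fewer than ten entries become tags) can be sketched in a few lines of Python. This only illustrates the threshold; it is not what Term Management Tools does internally, and the `split_taxonomy` function and the sample data are made up for the example.

```python
from collections import Counter

MIN_CATEGORY_SIZE = 10  # threshold from item 5


def split_taxonomy(post_categories):
    """Decide which categories to keep and which to convert to tags.

    post_categories: one list of category names per post.
    Returns (keep_as_categories, convert_to_tags), each sorted by name.
    """
    counts = Counter(name for cats in post_categories for name in cats)
    keep = sorted(n for n, c in counts.items() if c >= MIN_CATEGORY_SIZE)
    convert = sorted(n for n, c in counts.items() if c < MIN_CATEGORY_SIZE)
    return keep, convert


# Made-up data: 15 posts in "emacs", 3 in "drawing", 2 in "cooking".
posts = [["emacs"]] * 12 + [["emacs", "drawing"]] * 3 + [["cooking"]] * 2
keep, convert = split_taxonomy(posts)
print("keep as categories:", keep)
print("convert to tags:", convert)
```

In practice the counts would come from the blog's database or an export, and it is worth reviewing the list by hand before converting anything, since a small category might still be a main topic.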

Hope that gives you some ideas for things to experiment with! I’m working on organizing more blog posts into trails and e-books. I’m also getting better at planning what I want to write about and learn. If you’re curious about any of the techniques I use or you want to bounce around ideas, feel free to e-mail me at sacha@sachachua.com or set up a chat.

Do you have a large blog? How do you manage it?

  • I use Visual Studio Online for my backups; I used to use GitHub. The backup process I have basically depends on having a Git repository. The bonus is that I get historical backups along the way: https://trajano.net/2015/04/backup-wordpress-blog-to-github/

    Disqus, assuming it works correctly (it may work better on yours because you have more comments), will provide “also on this site”.

    As for my blog, it’s definitely not as large as yours :) I’ve lost a few blogs along the way but it’s fine for the most part. The Wayback Machine https://web.archive.org/web/*/trajano.net is a very useful tool to see how your site grew over time.

    Although I do like your sidebar widgets, they do get a bit overwhelming. I prefer to have a tag cloud and a FEW top-level categories. The categories are pretty much like your top-level ones, which have a lot of posts (i.e. more than 100). I wonder if you can code the tag cloud so that it shows the cloud specific to the category, so that your “learning” category would have its own cloud.

    Personally, as much as I have a bit of distaste for how WordPress is set up, it’s still the best one I’ve found, because it is kept up to date with the latest blog and social technologies, such as srcset for responsive images. In addition, you can just plop in a bit of WordPress PHP to make it do what you want.

    I may still switch to jekyll over time if I get really bored.

    • I’m fine with the number of categories I have for now, since I like keeping a wide variety of notes. It might be fun to periodically use the batch categories manager to make my post taxonomy more specific by adding tags and things like that – someday! The blog index I’ve been building is a different approach, and I like that too. It’s useful to be able to manually order the entries and to add cross-references. I could probably use tags instead of categories for some of the smaller groupings, but it’s low-priority, and I do like the hierarchical grouping that might be less obvious in a tag cloud.

      I suspect that I and other people find posts mostly through search instead of clicking around in categories. My categories are more for, say, being able to send people a link to all my Emacs or Org notes so they don’t have to wade through everything else, or for letting people subscribe to things they like. People also probably go sideways to find other posts in a category. Hmm, I should make it easy to get an automatic index view, if I haven’t yet…

  • Oh, one other thing, and you probably noticed it too: the mundane tasks of managing users and comments are gone with Disqus, i.e. backup, password resets, and preventing spam user accounts.

    • Oh, didn’t know that was an issue for other people. I think I’ve had user registration disabled from the beginning. :) I mostly use Disqus because of the notification system, since that makes it easier for people to see replies to their comments. I still have the comments mirrored in WordPress and backed up with the rest of my site. Better to trust your own backup system! :)

      • True that, Disqus isn’t as bugfree as they would like you to believe. The one nice part about it compared to others is the “sync” capability. So your blog has a copy of the comments regardless.

  • I would get rid of the Meta block and have a shortcut to /wp-admin instead; it saves space.

    • No one actually needs a link to log in, since I use browser shortcuts and Emacs integration anyway. ;) It was mostly there for predictable on-page links to the entries feed, although I guess most people who care about stuff like that can just use the URL from the autodiscovery tags in the header.

      • Oh it’s more just to free up some small space. I took it out on mine because I’m the only contributor :)