Maintaining a manual topical index for my blog using Emacs

I’ve been blogging for almost ten years. I started with notes from my university classes and snippets of open source code, and became comfortable enough to share decisions I’m puzzling through and things I’m learning about life. There’s a lot of stuff in my archive, and I want to be able to review things again.

Categories would probably make this easier, but I use categories liberally and sometimes inconsistently. I use them like tags, quick keywords that I add so that people might explore a category and bump into other posts. I probably should split it out so that I assign posts to one category and leave everything else as tags. Someday.

In the meantime, it’s easy enough to maintain a manual topical index of my blog posts, and it’s a good opportunity to review what I’ve been writing as well.

I use Emacs Org Mode to manage a large text file divided into headings. Every month, I copy a list of titles into my topical index. I hacked Org-friendly output into my WordPress theme – you can see April’s blog posts as an example (sachachua.com/blog/2012/04/?org=1). I manually organize the list items under different headings, splitting off new headings when I can see a pattern. Working with two windows viewing the same buffer makes it easy to move information around, and org-refile is handy too. I use a checklist structure so that Org can automatically update the number of posts under each heading (C-u M-x org-update-statistics-cookies). When I’m happy with the structure, I use org-publish-current-file to publish it using the settings I’ve configured. The files are in my public Dropbox folder, so they’re automatically published to the Web. It takes me about 10 minutes to add a month of posts to my index and publish the page.
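
For illustration, here is a minimal Ruby sketch of what Org-friendly checklist output could look like. It is not the code in the WordPress theme: the org_checklist helper and the layout of the post data are assumptions made for this example. Each post becomes a checkbox item with an Org link, and the [/] cookie on the heading is the part that org-update-statistics-cookies fills in.

```ruby
# Hypothetical helper: turn post titles and URLs into Org checklist items.
# The hash layout (:title, :url) is an assumption for illustration only.
def org_checklist(posts)
  posts.map { |post| "- [X] [[#{post[:url]}][#{post[:title]}]]" }.join("\n")
end

posts = [
  { :title => "Mapping my blog archives", :url => "http://sachachua.com/blog/p/23214" },
  { :title => "Reviewing my archives",    :url => "http://sachachua.com/blog/p/23064" }
]

# Org replaces [/] with done/total counts when the statistics cookies are updated.
puts "** Blogging [/]"
puts org_checklist(posts)
```

Titles that contain square brackets would need escaping before they can be used as Org link descriptions; this sketch skips that.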

I like seeing how much I’ve written about different topics, and it encourages me to write and organize more posts. The index might be handy for other people, too!


Short URL: http://sachachua.com/blog/p/23343

Sketchnotes: Designing content so that it works – Carl Friesen (#torontob2b)

Designing Content So It Works

Carl Friesen, Global Reach Communications

Like these? Check out my other sketchnotes, visual book notes/reviews, and visual metaphors.

Here’s the text from the sketchnotes to improve people’s ability to search for it:

Designing content so that it works
Carl Friesen, Global Reach Communications

Website for e-book on content design showyourexpertise.com

1 2 3 4 5

stories

The Trend
Client wants customized solution
Show that you understand their world

1. Trend & historic causes
2. current situation
3. Thoughts on developments, reasons
4. Recommendations

The How-To
1.
2.
3.

Example: trustees, communication process

must be:
Relevant + Realistic
not necessarily what you do, but what clients will find helpful

Helpful!
- process with steps or
- a list of success factors

1. outcome
2. supplies/equipment
3. steps
4. avoiding pitfalls/problems

The How-to-Work-With
How to get good results from working with you

cannot be self-serving
include info on saving money
1. wild success experience
2. factors
3. advice

The Case Study
Leading-edge thought & sound implementation
Trans-Canada highway story
Wildlife protection

Not about showing how clever you are!

Must have learning points THEY can use
Must be a story
Tell with the client credibility

1. Initial situation
2. Steps
3. Problems & solutions
4. Lessons learned

The Survey
Shows that you stay in touch
must be what your audience cares about

More useful with a trend

Distribute appropriately
Level of detail
Consider limited distribution

The Opinion
informed opinion, thought leadership
at no charge

Long form
-situation
-views on good & bad aspects
-recommendations

The Review
-New product/service
-What’s different
-Discuss good/bad

The Comment

Notes by Sacha Chua, @sachac, LivingAnAwesomeLife.com


Short URL: http://sachachua.com/blog/p/23429

Mapping my blog archives

I was thinking about information management and how I could get a better sense of what’s in my blog archive. I’ve written a lot over the years, enough that I’m surprised by what I find in here.

Topical index of blog posts from 2008-2012

I’ll add 2007 and earlier posts over the next week. I’m also looking forward to revisiting the map of things I want to learn, and consciously planning what to write.

To create this index, I used the Org-compatible output that I built into my WordPress theme (it outputs post titles in list format). I copied and pasted the list into an Org file, temporarily changed all the list items to headings, and used org-refile to move items under categories as needed. Afterwards, I converted the link headings back into list items and used org-export to export the HTML. The process was fairly easy, but it took me about four hours to process slightly over three years of blog posts.


Short URL: http://sachachua.com/blog/p/23214

Meetup sketchnotes: The Publishing Side of WordPress, Andy McIlwain

The Publishing Side of WordPress

At today’s WordPress Toronto meetup, Andy McIlwain shared tips on brainstorming, scheduling, and sharing blog posts in WordPress. The lively discussion brought out lots of other tips, too.

The key thing I took away from the talk was that Evernote is awesome and that I should definitely look into it more. I’m also looking forward to checking out Content Rules for more writing tips and Plinky.com for blog post ideas.

After the talk, I had a fascinating conversation with Robin McRae and Ann Brocklehurst about information architecture and personal knowledge management. Lots to think about. Glad I went!

Check out Andy’s blog post below for slides and full notes. Looking forward to the next meetup!

Related links:


Short URL: http://sachachua.com/blog/p/23211

Emacs, artbollocks-mode.el, and writing more clearly

Analyzing the text of my blog showed me that I use some phrases way too much. Fortunately, Emacs can shame me into writing better, thanks to artbollocks-mode.el.

Art Bollocks Mode monitors your writing and highlights words or patterns you may want to reconsider. It can detect repeated words which sometimes slip past proof-reading. It has a list of common passive verbs, making it easier for you to rewrite the sentences to use the active voice. It detects weasel words like "many" and "surprisingly". It even comes with jargon catchers for art critics ("postmodern", "ironic", and so forth) – hence artbollocks-mode.el.

Whenever you use a phrase that matches its patterns, Emacs highlights it, turning it an ugly orange-on-white and underlining it for emphasis. You can still go ahead and write it, but at least the words jump out. Like this: it’s really pretty obvious…

image

I want to use it to write clearer notes and blog posts, so here’s how I’ve tweaked my configuration. Many of the items below are words and phrases I want to use less. Others are part of work jargon that I’m trying my best to keep out of my regular use.

(require 'artbollocks-mode)
;; Avoid these phrases
(setq weasel-words-regex
      (concat "\\b" (regexp-opt
                     '("one of the"
                       "should"
                       "just"
                       "sort of"
                       "a lot"
                       "probably"
                       "maybe"
                       "perhaps"
                       "I think"
                       "really"
                       "pretty"
                       "nice"
                       "action"
                       "utilize"
                       "leverage") t) "\\b"))
;; Fix a bug in the regular expression to catch repeated words
(setq lexical-illusions-regex "\\b\\(\\w+\\)\\W+\\(\\1\\)\\b")
;; Don't show the art critic words, at least not until I figure
;; out my own jargon
(setq artbollocks nil)
;; Make sure keywords are case-insensitive
(defadvice search-for-keyword (around sacha activate)
  "Match in a case-insensitive way."
  (let ((case-fold-search t))
    ad-do-it))

(Isn’t regexp-opt so cool?)
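
The same two checks can be sketched outside Emacs. The following Ruby is only an analogue of the configuration above, not part of artbollocks-mode.el; the constant and method names are made up for this example, and the word list is a shortened copy of the one above. Regexp.union plays a role similar to regexp-opt by building one pattern that matches any of the phrases.

```ruby
# Shortened word list for illustration; names here are hypothetical.
WEASEL_WORDS = ["one of the", "sort of", "a lot", "probably", "maybe", "really", "pretty"]

# One case-insensitive pattern that matches any listed phrase on word boundaries.
WEASEL_REGEX = /\b(?:#{Regexp.union(WEASEL_WORDS).source})\b/i

# Same idea as lexical-illusions-regex: a word followed by itself.
REPEATED_WORD_REGEX = /\b(\w+)\W+\1\b/i

# Returns every weasel phrase found, in order of appearance.
def weasel_words(text)
  text.scan(WEASEL_REGEX)
end

# Returns the first word of each repeated pair.
def repeated_words(text)
  text.scan(REPEATED_WORD_REGEX).flatten
end

p weasel_words("It's really pretty obvious, maybe.")
p repeated_words("I went to the the store")
```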

artbollocks-mode.el also includes some basic readability statistics like the Flesch reading ease and Flesch-Kincaid grade level. When I analyzed my blog contents without source code blocks (all the Emacs Lisp code snippets were throwing off my numbers!), it turned out that my blog hovers around 65 in terms of Flesch reading ease, or around the same as Reader’s Digest (as reported by Wikipedia). The Flesch-Kincaid grade level for my posts in 2011 was around 8.4.
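
For reference, both scores come from the same three counts: words, sentences, and syllables. Here is a small Ruby sketch of the standard published formulas. It is not how artbollocks-mode.el computes them internally, the method names are made up, and the counts in the usage lines are invented inputs rather than figures from my blog.

```ruby
# Flesch reading ease: higher scores mean easier text.
def flesch_reading_ease(words, sentences, syllables)
  206.835 - 1.015 * (words.to_f / sentences) - 84.6 * (syllables.to_f / words)
end

# Flesch-Kincaid grade level: approximate US school grade.
def flesch_kincaid_grade(words, sentences, syllables)
  0.39 * (words.to_f / sentences) + 11.8 * (syllables.to_f / words) - 15.59
end

# Invented example: 100 words, 5 sentences, 150 syllables.
puts flesch_reading_ease(100, 5, 150).round(1)   # 59.6
puts flesch_kincaid_grade(100, 5, 150).round(1)  # 9.9
```

Longer sentences and longer words push the reading ease down and the grade level up, which is why code snippets full of long identifiers skew the numbers.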

I’d use artbollocks-mode.el’s tools for calculating word count and readability, except that Emacs ends up including source code blocks because Art Bollocks doesn’t know about Org Mode. I might be able to work around that by defining more advice or creating my own functions that extract the relevant text into a temporary buffer before determining the text statistics. I can leave that for another day, though.

I’ll experiment with making it part of org-capture-mode for now. If I find that getting editing feedback distracts me too much from writing, I’ll remove it from the hook and toggle it when I’m ready. Here’s the code to turn it on automatically for org-capture:

(add-hook 'org-capture-mode-hook 'artbollocks-mode)

Thanks to dotemax for tweeting about writegood and artbollocks-mode.el. Onward and upward!


Short URL: http://sachachua.com/blog/p/23072

Reviewing my archives

I’ve been reviewing my posts from 2011. I remember some of the highlights, but other posts trigger memories that had slipped past. This is good. This is the archive working as it should, giving me paths back into things I’ve learned and forgotten.

So I’ve been going back further in time. It turns out that Calibre makes it easy to convert HTML to other formats, including the MOBI format that Kindle uses. I had modified my WordPress archive templates to give me a bulk view that’s useful for copying posts into my archives. I added header tags for the different months, and copied the resulting HTML files (ex: 2011). Then I converted the files and loaded them onto my Kindle.

I’ve been going through my 2010 archive, and I’m surprised by which things resonate with me after this time. I flip past pages and pages about collaboration and presentations and all of these tips. The posts I linger over are the weekly reviews, where one-line mentions expand into memories. Sometimes I come across things I’ve written about life, and they flesh out the memories more. Not often enough. I’m going to write about life more.

Many posts are about looking ahead: things I want to do, sketches of success so I get a better idea of where I’m going. Other posts are about lessons learned. Work-wise, I’m pretty okay at remembering and applying what I’ve learned, or knowing where in my notes I can find things again – configuration snippets, techniques, and so on.

Life is fuzzier. I read my blog posts and wonder how I grew into or out of interests, how my days changed. This is probably what I should write more about: things I’m learning in life, and echoes of the present.

I write about things I’m learning at work because I think that might be useful to other people. I write for myself, too. What will I want to remember five years from now, ten, fifteen, more?

A cat, circling around, settling into my lap as my computer balances on the edge of my knees. This feeling of being in between spheres.


Short URL: http://sachachua.com/blog/p/23064

Blog analysis for 2011: 173,363 words so far; also, using the Rails console to work with WordPress

How many posts did I publish per month, not including this or future posts? (The geek appendix below explains how I got to the point of being able to run code snippets like this one.)

posts = WpBlogPost.published.posts.year(2011)
posts.count(:id, :group => 'month(post_date)').sort { |a,b| a[0].to_i <=> b[0].to_i }

Result: [["1", 32], ["2", 34], ["3", 33], ["4", 33], ["5", 34], ["6", 39], ["7", 33], ["8", 33], ["9", 31], ["10", 33], ["11", 31], ["12", 8]]
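
The grouping itself does not depend on the database. As a rough sketch, here is the same per-month count in plain Ruby, assuming the post dates are already loaded as Date objects; posts_per_month and the sample dates are made up for illustration.

```ruby
require 'date'

# Count how many dates fall in each month and return [month, count] pairs
# sorted by month number.
def posts_per_month(dates)
  counts = Hash.new(0)
  dates.each { |date| counts[date.month] += 1 }
  counts.sort
end

# Hypothetical sample dates standing in for post_date values.
sample = [Date.new(2011, 1, 5), Date.new(2011, 1, 20), Date.new(2011, 2, 3)]
p posts_per_month(sample)
```

Because the months are integers here, the result sorts numerically without the to_i conversion that the SQL version needs for its string keys.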

This is a straightforward SQL query to write, but ActiveRecord and scopes make it more fun, and I can easily slice the data in different ways. Because I’ve connected Rails with my WordPress data, I can use all sorts of other gems. For example, Lingua::EN::Readability can give me text statistics. It’s not a gem, but it’s easy to install with the provided install.rb. Code tends to throw off my word count, so let’s get rid of HTML tags and anything in pre tags, then calculate some text statistics:

include ActionView::Helpers::SanitizeHelper
require 'lingua/en/readability'
# Needs lots of memory =)
post_bodies = posts.map { |x| strip_tags(x.post_content.gsub(/<pre.+?<\/pre>/m, '')) }
all_text = post_bodies.join("\n").downcase
report = Lingua::EN::Readability.new(all_text)

Number of words in 2011: 173,363
Flesch reading ease: 65.3
Gunning Fog index: 11.0
Flesch-Kincaid grade level: 8.4

According to this, my writing should be readable by high school seniors, although they’ll probably have to be geeks in order to be interested in the first place.

The Readability library has other handy functions, like occurrences for finding out how frequently a word shows up in your text.

I 4375 #4 – It’s a personal blog, after all
you 1926 #9 – Not so bad
my 1555
time 933
people 897
work 710
W- 200
presentations 190
J- 133
Drupal 111
Rails 97
Emacs 77
zucchini 23 Oh, the summer of all that zucchini…
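
Counting occurrences is simple enough to sketch in plain Ruby. This is an illustration of the idea rather than the Readability library's implementation; word_counts and top_words are hypothetical names, and the tokenizer is deliberately crude (lowercase letters with an optional apostrophe part).

```ruby
# Build a hash of word => number of occurrences.
def word_counts(text)
  counts = Hash.new(0)
  text.downcase.scan(/[a-z]+(?:'[a-z]+)?/) { |word| counts[word] += 1 }
  counts
end

# Return the n most frequent words as [word, count] pairs,
# breaking ties alphabetically.
def top_words(text, n)
  word_counts(text).sort_by { |word, count| [-count, word] }.first(n)
end

p top_words("I can write. I can learn. You can too.", 2)
```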

I want to get better at clear, specific descriptions. That means avoiding adjectives like ‘nice’ and hedging words like ‘really’.

really 227 Hmm, I can cut down on this
maybe 211 This one too
probably 211 Down with hedging!
awesome 88 I overuse this, but it’s a fun word
nice 15 The war on generic adjectives continues.

Let’s look at feelings:

happy / happiness / wonderful 107
busy 33
worried / anxious / worry 30
tired 20
excited / exciting 21
delighted 4
suck 4
sad 2

I recently used the N-Gram gem to analyze the text of Homestar reviews looking for recurring phrases. I suspected that one of the contractors we were considering had salted his reviews, and unusual recurring phrases or spikes in frequency might be a tip-off. I can use the same technique to identify any pet phrases of mine.

csv = FasterCSV.open('ngrams.csv', 'w')
n_gram = NGram.new(all_text, :n => [2, 3])
# Write each n-gram with its count, sorted by ascending frequency.
# Rows are passed as arrays so that each header gets its own CSV row.
csv << ["NGRAM 2"]
n_gram.ngrams_of_all_data[2].sort { |a,b| a[1] <=> b[1] }.each { |a| csv << a }
csv << ["NGRAM 3"]
n_gram.ngrams_of_all_data[3].sort { |a,b| a[1] <=> b[1] }.each { |a| csv << a }
csv.close

The ten most common 3-word phrases on my blog tend to be related to planning and explaining. It figures. I can stop saying “a lot of”, though.

Phrase Frequency
i want to 158
a lot of 126
so that i 94
be able to 86
that i can 76
you want to 74
one of the 68
that you can 63
in order to 55
i need to 55
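
The counting behind a table like this can be sketched without the gem. The following plain Ruby is an assumption-laden illustration, not how the N-Gram gem works internally: ngram_counts is a made-up name, and it slides a window of n words over the text with each_cons.

```ruby
# Count every run of n consecutive words and return [phrase, count] pairs,
# most frequent first (ties broken alphabetically).
def ngram_counts(text, n)
  words = text.downcase.scan(/[a-z']+/)
  counts = Hash.new(0)
  words.each_cons(n) { |gram| counts[gram.join(" ")] += 1 }
  counts.sort_by { |phrase, count| [-count, phrase] }
end

p ngram_counts("I want to write. I want to learn.", 3).first
```

This sketch ignores sentence boundaries, so a phrase can span the end of one sentence and the start of the next.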

Some frequent two-word phrases:

i can 425
you can 408

Two-word phrases starting with “I’m…”

i’m going 52
i’m not 29
i’m looking 25
i’m working 24
i’m learning 23
i’m sure 16
i’m thinking 15
i’m glad 14
i’m getting 12

I wonder what other questions I might ask with this data…

Geek appendix: Using the Rails Console to work with WordPress data

The Rails console is awesome. You can do all sorts of things with it, like poke around your data objects or run scripts. With a little hacking, you can even use it as a smarter interface to other databases.

For example, I decided to get rid of all the syntax formatting that Org-mode tried to do with my blog posts when I published them to WordPress. Fortunately, this was the only use of span tags in my post content, so I could zap them all with a regular expression… if I could confidently do regular expressions in the MySQL console.

In the past, I might have written a Perl script to go through my database. If desperate, I might have even written a Perl script to do a regular expression replacement on my database dump file.

Rails to the rescue! I decided that since I was likely to want to use data from my WordPress blog in my Rails-based self-tracking system anyway, I might as well connect the two.

I found some code that created ActiveRecord models for WordPress posts and comments, and I modified it to connect to a different database. I added some scopes for easier queries, too.

class WpBlogPost < ActiveRecord::Base
  establish_connection Rails.configuration.database_configuration["wordpress"]

  set_table_name "wp_posts"
  set_primary_key "ID"

  has_many :comments, :class_name => "WpBlogComment", :foreign_key => "comment_post_ID"

  def self.find_by_permalink(year, month, day, title)
    find(:first,
         :conditions => ["YEAR(post_date) = ? AND MONTH(post_date) = ? AND DAYOFMONTH(post_date) = ? AND post_name = ?",
                         year.to_i, month.to_i, day.to_i, title])
  end

  scope :posts, where("post_type='post'")
  scope :published, where("post_status='publish'")
  scope :year, lambda { |year| where("year(post_date)=?", year) }
end
# http://snippets.dzone.com/posts/show/1314
class WpBlogComment < ActiveRecord::Base
  establish_connection Rails.configuration.database_configuration["wordpress"]

  # if wordpress tables live in a different database (i.e. 'wordpress') change the following
  # line to set_table_name "wordpress.wp_comments"
  # don't forget to give the db user permissions to access the wordpress db
  set_table_name "wp_comments"
  set_primary_key "comment_ID"

  belongs_to :post, :class_name => "WpBlogPost", :foreign_key => "comment_post_ID"

  validates_presence_of :comment_post_ID, :comment_author, :comment_content, :comment_author_email
  # Rails 3 style: register the check as a create-time validation
  validate :comments_must_be_open, :on => :create

  def comments_must_be_open
    if WpBlogPost.find(comment_post_ID).comment_status != 'open'
      errors.add(:base, 'Sorry, comments are closed for this post')
    end
  end

end

I specified the database configuration in config/database.yml, and granted my user access to the tables:

wordpress:
  adapter: mysql
  encoding: utf8
  database: wordpress_database_goes_here
  username: rails_username_goes_here

After I rigged that up, I could then run this little bit of code in Rails console to clean up all those entries.

WpBlogPost.where('post_content LIKE ?', '%<span style="color:%').each do |p|
  s = p.post_content.gsub /<span style="color:[^>]+>/, ''
  s.gsub! '</span>', ''
  p.update_attributes(:post_content => s)
end

Cleaning up subscripts (accidental use of underscore without escaping):

WpBlogPost.where('post_content LIKE ?', '%<sub>%').each do |p|
  s = p.post_content.gsub /<sub>/, '_'
  s.gsub! '</sub>', ''
  p.update_attributes(:post_content => s)
end
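
Before running a bulk update like these, it can help to try the substitutions on a sample string. Here is a minimal sketch in plain Ruby with no database involved; strip_color_spans and restore_underscores are hypothetical helper names, and the sample strings are invented.

```ruby
# Remove colour spans (opening tag with its attributes, then the closing tag).
def strip_color_spans(html)
  html.gsub(/<span style="color:[^>]+>/, '').gsub('</span>', '')
end

# Turn an accidental subscript back into the underscore it came from.
def restore_underscores(html)
  html.gsub('<sub>', '_').gsub('</sub>', '')
end

puts strip_color_spans('<span style="color: #ff0000;">def</span> foo')
puts restore_underscores('my<sub>variable</sub>')
```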

Now I can use all sorts of other ActiveRecord goodness when generating my statistics, like the code above.


Short URL: http://sachachua.com/blog/p/22664