~/.diary schedule
Priorities - A: high, B: medium, C: low; Status - _: unfinished, X: finished, C: cancelled, P: pending, o: in progress, >: delegated. Covey quadrants - Q1 & Q3: urgent, Q1 & Q2: important
Notes

4. Automatic documentation of code: 18:00

In response to cmarguel's blog entry: (which was probably a joke, but I might as well go ahead... =) )
I think, however, that this could have much more potential. Can we make the program learn to recognize patterns? "This program sorts an array." "This program creates a socket whose port number is the sum of two numbers." Sure, it would be a weak program at first, but imagine if it works well! A new generation of lazy programmers would be born!

I attended this year's natural language symposium at La Salle, and one of the student groups proposed the exact same system. They'd written a program that translated a C program to an English description. It was a literal translation: "assign C to ....", "if b is true then execute block A, else execute block B. Start of block A... end of block A. Start of block B... end of block B."
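To get a feel for what such a literal translation looks like, here is a minimal sketch. It is not the students' program: it assumes Python source instead of C, leans on Python's standard ast module (ast.unparse needs Python 3.9 or later), and the names describe, describe_block and translate are made up for this example. It only handles assignments, if statements and while loops, and it reuses the labels "block A" and "block B" even for nested blocks.

```python
import ast


def describe(node, depth=0):
    """Return English lines that restate one AST node literally."""
    pad = "  " * depth
    if isinstance(node, ast.Module):
        return describe_block(node.body, depth)
    if isinstance(node, ast.Assign):
        targets = ", ".join(ast.unparse(t) for t in node.targets)
        return [f"{pad}Assign {ast.unparse(node.value)} to {targets}."]
    if isinstance(node, ast.If):
        test = ast.unparse(node.test)
        if node.orelse:
            lines = [f"{pad}If {test} is true then execute block A, else execute block B."]
        else:
            lines = [f"{pad}If {test} is true then execute block A."]
        lines.append(f"{pad}Start of block A.")
        lines += describe_block(node.body, depth + 1)
        lines.append(f"{pad}End of block A.")
        if node.orelse:
            lines.append(f"{pad}Start of block B.")
            lines += describe_block(node.orelse, depth + 1)
            lines.append(f"{pad}End of block B.")
        return lines
    if isinstance(node, ast.While):
        lines = [f"{pad}While {ast.unparse(node.test)} is true, repeat block A."]
        lines.append(f"{pad}Start of block A.")
        lines += describe_block(node.body, depth + 1)
        lines.append(f"{pad}End of block A.")
        return lines
    # Anything this sketch does not understand is quoted back as-is.
    return [f"{pad}Execute the statement: {ast.unparse(node)}."]


def describe_block(statements, depth):
    """Describe a list of statements one after another."""
    lines = []
    for stmt in statements:
        lines += describe(stmt, depth)
    return lines


def translate(source):
    """Parse source text and return its literal English description."""
    return describe(ast.parse(source))


if __name__ == "__main__":
    example = "x = 1\nif b:\n    y = x\nelse:\n    y = 2\n"
    print("\n".join(translate(example)))
```

The output starts with "Assign 1 to x." and then spells out the if statement block by block. Notice that it says nothing the code did not already say, which is exactly the limitation of this approach.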

For their thesis, they planned to make the program recognize common algorithms such as swap, bubble sort and linear search. If students can learn those in their first year of computing, shouldn't a computer be able to recognize those patterns with just a little more coding? In fact, their project was even more ambitious. Given source code with mistakes, their program was to recognize the attempted algorithm and point out the errors in implementation.

The question-and-answer portion exposed the problems. Recognizing an algorithm through source-code analysis is hard. Why? There are so many different ways to write a bubble sort. Do you bubble the smallest elements up, or bubble the largest elements down? Will you use two loops? One loop? Loop going up? Loop going down? How do you do the swap? The most promising approach would be to reduce the source code to logical elements and then match them against a database of previously checked answers, combining errors from several answers if necessary.
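Here is a rough sketch of the "reduce and match" idea, and of why it is fragile. Again, this is my own illustration and not the students' design: it assumes Python source, uses the standard ast module, treats "logical elements" as nothing more than the syntax tree with variable names replaced by placeholders, and the names RenameVariables, fingerprint, CHECKED_ANSWERS and recognize are invented here. The "database" holds a single checked answer.

```python
import ast


class RenameVariables(ast.NodeTransformer):
    """Replace each distinct variable name with v0, v1, ... in order of first use."""

    def __init__(self):
        self.names = {}

    def visit_Name(self, node):
        canonical = self.names.setdefault(node.id, f"v{len(self.names)}")
        return ast.copy_location(ast.Name(id=canonical, ctx=node.ctx), node)


def fingerprint(source):
    """Reduce source text to a string that ignores the programmer's variable names."""
    tree = RenameVariables().visit(ast.parse(source))
    return ast.dump(tree)


# Hypothetical database of previously checked answers (just one entry here).
CHECKED_ANSWERS = {
    "swap via temporary": fingerprint("t = a\na = b\nb = t\n"),
}


def recognize(source):
    """Return the label of the matching checked answer, or None if nothing matches."""
    fp = fingerprint(source)
    for label, known in CHECKED_ANSWERS.items():
        if fp == known:
            return label
    return None


if __name__ == "__main__":
    print(recognize("tmp = x\nx = y\ny = tmp\n"))  # same structure, other names
    print(recognize("x, y = y, x\n"))              # a different way to swap
```

Renaming the variables is enough to match the first snippet, but the tuple swap in the second one is not recognized even though it does the same job. Every alternative way of writing the swap needs its own entry or a smarter reduction, and a bubble sort has far more degrees of freedom than a swap.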

What about the literal translation of the program? Wouldn't that already help students understand their code better? Beginners who have a hard time figuring out which statements belong to a block might be able to use that kind of tool, but they eventually need to learn how to indent code properly and how to read control structures. Besides, they'd probably benefit more from a zoomable flowchart.

Documentation should not simply repeat what code already says. Rather, documentation should make things clearer for users by answering questions like "How do you use this function?" and "What do you need to keep in mind when using this function?". Comments in your source code can also explain what other approaches you've tried and what traps you need to avoid. Good documentation goes beyond code and shows us the big picture.
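A small made-up example of the difference, in Python. Both functions and the design note in the second docstring are hypothetical and exist only to show the contrast: the first comment restates the code, while the second answers the usage questions and records a rejected approach.

```python
def average_redundant(values):
    # Add up the values and divide by the count.  (This only repeats the code.)
    return sum(values) / len(values)


def average(values):
    """Return the arithmetic mean of values as a float.

    How to use it: pass a non-empty list or tuple of numbers,
    for example average([3, 4, 5]).

    Keep in mind: an empty sequence raises ValueError. Returning 0
    was tried first and dropped because it hid missing data.
    """
    if len(values) == 0:
        raise ValueError("average() needs at least one value")
    return sum(values) / len(values)
```

No literal translator could have generated the second docstring, because the reason for rejecting the other approach is not in the code at all.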

Hmm. Hey, that zoomable flowchart idea looks cool. If people still don't have final projects by now, there's a project idea for you... =) If a visualizer for your favorite programming language already exists, pick your next favorite one.

3. Emacs, the self-documenting editor: 17:25

In response to cmarguel's blog entry:

Miguel Arguelles wondered what was so self-documenting about Emacs. Paolo showed him the source code, but Miguel pointed out that people have to type those comments in anyway. So what makes Emacs a self-documenting editor and my favorite tool?

Emacs is called a self-documenting editor because the source code to _any_ function can be found with a few keystrokes. Curious about how M-x find-file works? Use C-h f to look up the definition, follow the link in the help buffer, and get as much detail as you want. You can even use the Emacs debugger (edebug) to explore the behavior of functions. Emacs exposes its internals to an extent no other editor has even attempted.

Code? Why are we looking at code? Shouldn't we be looking at neat comments explaining how everything works? The paradigm shift here is that _code_ is often the best documentation for itself. Comments should explain usage and the background reasons for coding, but the code itself should be clear and easy to understand. Programming languages like C and Java tend to encourage short, almost cryptic identifiers. Lisp may initially seem daunting because of the parentheses, but the long identifier names and the simple structure make it easy to read even if you don't have a background in functional programming.

Not only can you look functions up, but you can also _change_ them while Emacs is running. Don't like the way save-buffer works? You can redefine it with a little Emacs Lisp programming. Want to do some pre- or post-processing? There's support for that too. Emacs is a rapid development environment for itself. That's why there are so many modules available for it. Emacs is an editor you can customize to your heart's content.

Documentation is just a few keystrokes away. All the commonly-used functions and variables have clear instructions for usage. Emacs coding style suggests having a documentation string explaining the arguments and usage for each function, and there are tools for checking compliance. Emacs also has a lot of contributed documentation on http://www.emacswiki.org and other Emacs-related sites.

Emacs doesn't hide anything from you. That's why Emacs is called a self-documenting editor. Even after trying out other editors like Eclipse and vim, I still go back to Emacs. I've tasted power, and I'm hooked.

2. "Device Translates Spoken Japanese and English": 12:44

A handheld device that facilitates Japanese-to-English and English-to-Japanese translation of spoken language is expected to make its Japanese debut in the coming months. The NEC gadget uses a speech recognition engine to identify spoken English or Japanese and convert it ... http://www.acm.org/technews/articles/2004-6/1001f.html#item12

E-Mail from technews@hq.acm.org