October 9, 2004

Automatic documentation of code

In response to cmarguel’s blog entry: (which was probably a joke, but
I might as well go ahead… =) )

I think, however, that this could have much more
potential. Can we make the program learn to recognize patterns? “This
program sorts an array.” “This program creates a socket whose port
number is the sum of two numbers.” Sure, it would be a weak program at
first, but imagine if it works well! A new generation of lazy
programmers would be born!

I attended this year’s natural language symposium at La Salle, and one
of the student groups proposed the exact same system. They’d written a
program that translated a C program to an English description. It was
a literal translation: “assign C to ….”, “if b is true then execute
block A, else execute block B. Start of block A… end of block A.
Start of block B… end of block B.”
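To give a feel for what such a literal translation involves, here is a minimal sketch. It is not the students' program: it walks Python source with the standard `ast` module instead of parsing C, and the function names `describe` and `translate` are made up for this illustration. It needs Python 3.9 or later for `ast.unparse`.

```python
import ast


def describe(node, depth=0):
    """Return English lines that literally restate one statement."""
    pad = "  " * depth
    if isinstance(node, ast.Assign):
        targets = " and ".join(ast.unparse(t) for t in node.targets)
        return [f"{pad}assign {ast.unparse(node.value)} to {targets}"]
    if isinstance(node, ast.If):
        lines = [
            f"{pad}if {ast.unparse(node.test)} is true then execute block A, "
            "else execute block B",
            f"{pad}start of block A",
        ]
        for child in node.body:
            lines.extend(describe(child, depth + 1))
        lines.append(f"{pad}end of block A")
        lines.append(f"{pad}start of block B")
        for child in node.orelse:
            lines.extend(describe(child, depth + 1))
        lines.append(f"{pad}end of block B")
        return lines
    # Anything this sketch does not understand is echoed back unchanged.
    return [f"{pad}execute: {ast.unparse(node)}"]


def translate(source):
    """Translate every top-level statement in the given source text."""
    lines = []
    for statement in ast.parse(source).body:
        lines.extend(describe(statement))
    return lines


if __name__ == "__main__":
    print("\n".join(translate("if b:\n    x = c\nelse:\n    x = 0\n")))
```

Notice that the output says nothing the code did not already say, which is exactly the limitation discussed below.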

For their thesis, they planned to make the program recognize common
algorithms such as swap, bubble sort and linear search. If students
can learn those in their first year of computing, shouldn’t a computer
be able to recognize those patterns with just a little more coding? In
fact, their project was even more ambitious. Given source code with
mistakes, their program was to recognize the attempted algorithm and
point out the errors in implementation.

The question-and-answer portion exposed the problems. Recognizing an
algorithm through source-code analysis is hard. Why? There are so many
different ways to write a bubble sort. Do you bubble the smallest
elements up, or bubble the largest elements down? Will you use two
loops? One loop? Loop going up? Loop going down? How do you do the
swap? The most promising approach would be to reduce the source code
to logical elements and then match it with a database of previously
checked answers, combining errors from several answers if necessary.
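Here is a small illustration of why the matching is hard. The two variants below are my own examples, not anything from the students' project, and `signature` is a deliberately crude, hypothetical fingerprint (just the loop and branch node types found by Python's `ast` module). Both functions sort correctly, yet they share almost no surface structure.

```python
import ast

VARIANT_A = '''
def sort_a(items):
    data = list(items)
    # Two nested for loops: each pass sinks the largest value to the end.
    for end in range(len(data) - 1, 0, -1):
        for i in range(end):
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
    return data
'''

VARIANT_B = '''
def sort_b(items):
    data = list(items)
    # One while loop scanning from the end: smaller values move to the front.
    i = len(data) - 1
    while i > 0:
        if data[i] < data[i - 1]:
            tmp = data[i]              # swap through a temporary variable
            data[i] = data[i - 1]
            data[i - 1] = tmp
            i = len(data) - 1          # restart the scan after every swap
        else:
            i -= 1
    return data
'''


def signature(source):
    """Crude structural fingerprint: names of loop and branch nodes."""
    kinds = (ast.For, ast.While, ast.If)
    tree = ast.parse(source)
    return [type(n).__name__ for n in ast.walk(tree) if isinstance(n, kinds)]


def load(source, name):
    """Compile a source string and return the named function from it."""
    namespace = {}
    exec(source, namespace)
    return namespace[name]


sort_a = load(VARIANT_A, "sort_a")
sort_b = load(VARIANT_B, "sort_b")

if __name__ == "__main__":
    sample = [5, 2, 4, 1, 3]
    print(sort_a(sample), signature(VARIANT_A))
    print(sort_b(sample), signature(VARIANT_B))
```

A recognizer that compares fingerprints like these would treat the two as unrelated programs, so any real matcher needs a much deeper normalization step than this.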

What about the literal translation of the program? Wouldn’t that
already help students understand their code better? Beginners who have
a hard time finding out the statements included in a block might be
able to use that kind of tool, but they eventually need to learn how
to indent code properly and how to read control structures. Besides,
they’d probably benefit more from a zoomable flowchart.

Documentation should not simply repeat what code already says. Rather,
documentation should make things clearer for users by answering
questions like “How do you use this function?” and “What do you need
to keep in mind when using this function?”. Comments in your source
code can also explain what other approaches you’ve tried and what traps
you need to avoid. Good documentation goes beyond code and shows us
the big picture.
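As a rough sketch of the difference, compare the comment inside the loop with the docstring above it. The function and its design notes are invented for this example; the point is only the kind of question each piece of text answers.

```python
def linear_search(items, target):
    """Return the index of the first item equal to target, or -1 if absent.

    How to use it: pass any sequence. It does not need to be sorted.

    Keep in mind: -1 is also a valid Python index (the last element),
    so check for it explicitly before indexing with the result.

    Approach not taken (hypothetical note): raising ValueError the way
    list.index does, which would force callers to wrap every lookup
    in try/except.
    """
    for index, item in enumerate(items):
        # A comment such as "loop over the items" would only repeat the code.
        if item == target:
            return index
    return -1


if __name__ == "__main__":
    position = linear_search(["swap", "bubble sort", "linear search"], "swap")
    if position != -1:
        print("found at index", position)
```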

Hmm. Hey, that zoomable flowchart idea looks cool. If people still
don’t have final projects by now, there’s a project idea for you… =)
If a visualizer for your favorite programming language already exists,
pick your next favorite one.

Emacs, the self-documenting editor

In response to cmarguel’s blog entry:

Miguel Arguelles wondered what was so self-documenting about Emacs.
Paolo showed him the source code, but Miguel pointed out people have
to type those comments in anyway. So what makes Emacs a
self-documenting editor and my favorite tool?

Emacs is called a self-documenting editor because the source code to
_any_ function can be found with a few keystrokes. Curious about how
M-x find-file works? Use C-h f (describe-function) to look it up,
follow the link in the help buffer to its source, and get as much
detail as you want. You can
even use the Emacs debugger (edebug) to explore the behavior of
functions. Emacs exposes its internals to an extent no other editor
has even attempted.

Code? Why are we looking at code? Shouldn’t we be looking at neat
comments explaining how everything works? The paradigm shift here is
that _code_ is often the best documentation for itself. Comments
should explain usage and the reasoning behind the code, but the
code itself should be clear and easy to understand. Programming
languages like C and Java tend to encourage short, almost cryptic
identifiers. Lisp may initially seem daunting because of the
parentheses, but the long identifier names and the simple structure
make it easy to read even if you don’t have a background in functional
programming.

Not only can you look functions up, but you can also _change_ them
while Emacs is running. Don’t like the way save-buffer works? You can
redefine it with a little Emacs Lisp programming. Want to do some pre-
or post-processing? There’s support for that too. Emacs is a rapid
development environment for itself. That’s why there are so many
modules available for it. Emacs is an editor you can customize to your
heart’s content.

Documentation is just a few keystrokes away. All the commonly-used
functions and variables have clear instructions for usage. Emacs
coding style suggests having a documentation string explaining the
arguments and usage for each function, and there are tools for
checking compliance. Emacs also has a lot of contributed documentation
on http://www.emacswiki.org and other Emacs-related sites.

Emacs doesn’t hide anything from you. That’s why Emacs is called a
self-documenting editor. Even after trying out other editors like
Eclipse and vim, I still go back to Emacs. I’ve tasted power, and I’m

“Device Translates Spoken Japanese and English”

A handheld device that facilitates Japanese-to-English and
English-to-Japanese translation of spoken language is expected to make its
Japanese debut in the coming months. The NEC gadget uses a speech
recognition engine to identify spoken English or Japanese and convert it …
