Tasks

B1_Reply from E-Mail from Joe Corneli (2004.03.19)
B2XReply from E-Mail from Joe Corneli {{Tasks:759}} (2004.03.21)
B3XReply from E-Mail from Richard Boardman {{Tasks:752}} (2004.03.23)

Notes

13. Language-independent named entity recognition

12. Named entity extraction in Perl

11. Buzzword for the day: named entity extraction

Apparently, it's an established research problem...

10. More thoughts about my research interest

I _would_ very much like a text-based interface that lets me easily navigate through all of the data in my personal store. Zoe (http://zoe.nu/) looks interesting, but it doesn't fit the way I work.

I'm interested in the kind of massively hyperlinked personal information management that you describe in TODL. Text-based navigation through an automatically-extracted graph would be fantastic.

As for implicit linking, word vectors are often used to find similar documents. The Remembrance Agent developed at MIT displays a running list of N items relevant to the words around point. Time and location may also cue document retrieval.
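A rough sketch of the word-vector idea, assuming plain word counts compared by cosine similarity. This is not the Remembrance Agent's actual algorithm, just the simplest version of "find the N notes most similar to the text around point". The function names and the sample notes are made up for illustration.

```python
import math
import re
from collections import Counter


def word_vector(text):
    """Turn text into a bag-of-words vector (word -> count)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_relevant(context, documents, n=3):
    """Return up to n document names ranked by similarity to the context."""
    query = word_vector(context)
    scored = [
        (cosine_similarity(query, word_vector(text)), name)
        for name, text in documents.items()
    ]
    # Highest score first; ties broken by name so the order is stable.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in scored[:n] if score > 0]


# Hypothetical notes, keyed by a made-up identifier.
notes = {
    "note-12": "Named entity extraction in Perl",
    "note-9": "Howm: funky hyperlinking in all files",
    "note-4": "Implicit searches for similar code in libraries",
}
print(most_relevant("entity extraction from text", notes, n=2))
```

A real implementation would probably weight rare words more heavily (for example with TF-IDF) and could mix in time and location as extra cues.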

It doesn't have to stop at personal information like TODOs or notes. Why not generate source code as well, with literate programming tools in the style implemented by Leo (http://leo.sourceforge.net)? Leo is a tree-based organizer. Cloned nodes allow you to have arbitrary graphs, and output is customizable. This is close to what you envisioned with TODL, although it seems to be primarily a graphical tool.

Your description of TODL mentioned the KM system developed by P. Clark and B. Porter, but it seems to require explicitly encoded facts and queries. I would like to do research on implicit linking and querying in semi-structured text. As a fresh BS graduate with some research experience (one published paper in a conference about distributed computing, a few programming competitions) and no formal background in text analysis, I really need an adviser interested in this field. Would you know anyone interested in this?

E-Mail from Joe Corneli

9. Howm

http://howm.sourceforge.jp/

Funky hyperlinking in all files. See if I can steal ideas from this...

7. Searching for all entries related to a person

How would I search for all entries related to a particular person? A BBDB search that resolved URLs would get the explicit bbdb links perfectly, but what about the links automatically derived from annotations? I could search for all links and then parse out the name derived from the BBDB if the link is of the form _____ from/to ____ .
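A minimal sketch of the "_____ from/to ____" parse, assuming each entry already comes with the text of its links. It does not touch BBDB at all; the entry identifiers and the input layout are made up for illustration.

```python
import re
from collections import defaultdict

# Assumed link-text shape: "<kind> from <name>" or "<kind> to <name>",
# e.g. "E-Mail from Joe Corneli".
LINK_PATTERN = re.compile(r"^(?P<kind>.+?)\s+(?:from|to)\s+(?P<name>.+?)\s*$")


def person_from_link(link_text):
    """Return the name part of a from/to link, or None if it doesn't fit."""
    match = LINK_PATTERN.match(link_text.strip())
    return match.group("name") if match else None


def index_entries_by_person(entries):
    """Map each extracted name to the entries whose links mention it.

    entries is a list of (entry_id, [link_text, ...]) pairs.
    """
    index = defaultdict(list)
    for entry_id, link_texts in entries:
        for text in link_texts:
            name = person_from_link(text)
            if name is not None and entry_id not in index[name]:
                index[name].append(entry_id)
    return dict(index)


# Hypothetical entries; the identifiers are invented.
entries = [
    ("2004.03.19#10", ["E-Mail from Joe Corneli"]),
    ("2004.03.19#4", ["Chat with arete on zelazny.freenode.net#emacs"]),
    ("2004.03.23#1", ["E-Mail from Richard Boardman"]),
]
print(index_entries_by_person(entries).get("Joe Corneli"))
```

Links that don't follow the from/to shape (like the chat link) are simply skipped, so this only covers the explicit, consistent case.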

The more general question is:

How do you extract entities (persons / resources) from semi-structured text? I'm working with hyperlinked entries, so I can assume that:

  • any e-mail is associated with a person (I hope)
  • websites will frequently be rooted off another person's namespace
  • the contact database is populated by people
  • names will tend to be non-dictionary, capitalized words

I'll start out by getting explicit, consistent links recognized. Then explicit, inconsistent links. Then unlinked names (woohoo).
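For the unlinked names, here is a rough sketch of the last assumption in the list (non-dictionary, capitalized words). The tiny COMMON_WORDS set is a stand-in for a real dictionary, and the function name is made up; treat the output as candidates to check against the contact database, not as confirmed people.

```python
import re

# Hypothetical stand-in for a real word list; a system dictionary file
# would be the obvious replacement.
COMMON_WORDS = {
    "a", "about", "chat", "e-mail", "from", "i", "if", "reply",
    "see", "the", "then", "to", "with",
}


def candidate_names(text, dictionary=COMMON_WORDS):
    """Collect runs of capitalized words that are not dictionary words."""
    candidates = []
    current = []
    # Words (allowing ' and -) or single punctuation marks; punctuation
    # tokens end a run so names don't merge across sentence boundaries.
    for token in re.findall(r"[A-Za-z][A-Za-z'-]*|[^\sA-Za-z]", text):
        if token[0].isupper() and token.lower() not in dictionary:
            current.append(token)
        elif current:
            candidates.append(" ".join(current))
            current = []
    if current:
        candidates.append(" ".join(current))
    return candidates


print(candidate_names("Reply to the e-mail from Joe Corneli about Leo."))
```

The dictionary check is what keeps sentence-initial words like "Reply" out of the results. Tools and projects get through as well as people, which fits the "persons / resources" framing but means a second pass is needed to tell them apart.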

6. Information retrieval research

Implicit query

Personal information management

5. Microsoft, implicit queries and information retrieval

4. Implicit searches

 <bkhl> sachac, a friend of mine has done something like that, but useful. (It searches libraries for code similar to
       what you are writing.)
Chat with arete on zelazny.freenode.net#emacs


Page: Information Retrieval
Updated: 2004-11-21