Tasks
| B1 | _ | Reply from E-Mail from Joe Corneli (2004.03.19) |
| B2 | X | Reply from E-Mail from Joe Corneli {{Tasks:759}} (2004.03.21) |
| B3 | X | Reply from E-Mail from Richard Boardman {{Tasks:752}} (2004.03.23) |
Notes
14. Information retrieval course
13. Language-independent named entity recognition
12. Named entity extraction in Perl
11. Buzzword for the day: named entity extraction
Apparently, it's an established research problem...
10. More thoughts about my research interest
I _would_ very much like a text-based interface that allowed me to easily navigate through all of the data in my personal store. Zoe (http://zoe.nu/) looks interesting, but it's outside the way I work.
I'm interested in the kind of massively hyperlinked personal information management that you describe in TODL. Text-based navigation through an automatically-extracted graph would be fantastic.
As for implicit linking, word vectors are often used to find similar documents. The Remembrance Agent developed at MIT displays a running list of N items relevant to the words around point. Time and location may also cue document retrieval.
It doesn't have to stop at personal information like TODOs or notes. Why not generate source code as well, with literate programming tools in the style implemented by Leo (http://leo.sourceforge.net)? Leo is a tree-based organizer. Cloned nodes allow you to have arbitrary graphs, and output is customizable. This is close to what you envisioned with TODL, although it seems to be a primarily graphical tool.
Your description of TODL mentioned the KM system developed by P. Clark and B. Porter, but it seems to require explicitly encoded facts and queries. I would like to do research on implicit linking and querying in semi-structured text. As a fresh BS graduate with some research experience (one published paper in a conference about distributed computing, a few programming competitions) and no formal background in text analysis, I really need an adviser interested in this field. Would you know anyone interested in this?
E-Mail from Joe Corneli
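To make the word-vector idea from the e-mail a little more concrete, here is a minimal sketch in Python. It is not how the Remembrance Agent is implemented; it only assumes plain term-frequency vectors and cosine similarity, and the function names (word_vector, cosine, relevant_notes) and the notes dictionary are made up for illustration.

```python
import math
import re
from collections import Counter


def word_vector(text):
    """Return a bag-of-words term-frequency vector for the text."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def relevant_notes(context, notes, n=3):
    """Return up to n note titles ranked by similarity to the context.

    `notes` maps a title to its text (a hypothetical stand-in for a
    personal store). Notes sharing no words with the context are dropped.
    """
    query = word_vector(context)
    scored = [(cosine(query, word_vector(body)), title)
              for title, body in notes.items()]
    scored = [pair for pair in scored if pair[0] > 0]
    # Highest score first; break ties alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:n]]
```

Calling relevant_notes with the words around point as the context and n set to the length of the running list gives something in the spirit of the "N relevant items" display. Time and location cues would need extra features that this sketch leaves out.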
9. Howm
8. Summarization resources by Stephen Wan
7. Searching for all entries related to a person
How would I search for all entries related to a particular person? A BBDB search that resolved URLs would get the explicit bbdb links perfectly, but what about the links automatically derived from annotations? I could search for all links and then parse out the name derived from the BBDB if the link is of the form _____ from/to ____ .
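A rough sketch of the "parse out the name" step, in Python rather than anything BBDB-specific. It assumes annotations are plain strings shaped like "E-Mail from Joe Corneli"; the regular expression, the function names and the (entry id, annotations) layout are all hypothetical, and no BBDB lookup is attempted.

```python
import re

# Assumed annotation shape: "<kind> from <name>" or "<kind> to <name>".
# The greedy <kind> means the LAST from/to wins, so a nested annotation
# like "Reply from E-Mail from Joe Corneli" still yields "Joe Corneli".
LINK_RE = re.compile(r"^(?P<kind>.+)\s+(?P<direction>from|to)\s+(?P<name>.+)$")


def person_from_annotation(annotation):
    """Return the name part of a 'kind from/to name' annotation, or None."""
    match = LINK_RE.match(annotation.strip())
    if match is None:
        return None
    return match.group("name")


def entries_for_person(person, entries):
    """Return ids of entries with an annotation naming `person`.

    `entries` is a list of (entry_id, [annotation, ...]) pairs, a
    made-up structure standing in for the real notes.
    """
    wanted = person.casefold()
    return [
        entry_id
        for entry_id, annotations in entries
        if any((person_from_annotation(a) or "").casefold() == wanted
               for a in annotations)
    ]
```

Annotations that do not fit the pattern, such as "Chat with arete on zelazny.freenode.net#emacs", simply return None here, so they would need their own rule.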
The more general question is:
How do you extract entities (persons / resources) from semi-structured text? I'm working with hyperlinked entries, so I can assume that:
- any e-mail is associated with a person (I hope)
- websites will frequently be rooted off another person's namespace
- the contact database is populated by people
- names will tend to be non-dictionary, capitalized words
I'll start out by getting explicit, consistent links recognized. Then explicit, inconsistent links. Then unlinked names (woohoo).
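For the last step (unlinked names), the "non-dictionary, capitalized words" assumption could be tried out with something like the sketch below. It is only a heuristic, not a real named entity recognizer: the dictionary argument is a hypothetical set of lowercase common words, and the function name candidate_names is made up.

```python
import re


def candidate_names(text, dictionary):
    """Return runs of adjacent capitalized words that are not in `dictionary`.

    `dictionary` is a set of lowercase common words (an assumption; a
    real word list would be much larger). Words separated by anything
    other than whitespace, such as a comma, start a new run.
    """
    names = []
    current = []
    prev_end = None
    for match in re.finditer(r"[A-Za-z]+", text):
        word = match.group()
        name_like = word[0].isupper() and word.lower() not in dictionary
        adjacent = (prev_end is not None
                    and text[prev_end:match.start()].isspace())
        # Close the current run if this word cannot extend it.
        if current and not (name_like and adjacent):
            names.append(" ".join(current))
            current = []
        if name_like:
            current.append(word)
        prev_end = match.end()
    if current:
        names.append(" ".join(current))
    return names
```

This will happily return tool names like Leo alongside people, so the candidates would still need to be checked against the contact database or the explicit links before being trusted.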
6. Information retrieval research
Implicit query
Personal information management
- UC Berkeley: School of Information Management and Systems
- Canterbury, NZ: http://www.cosc.canterbury.ac.nz/research/RG/HCI/:/
- http://www.cs.yale.edu/people/faculty/gelernter.html: lifestreams
- http://www.ics.uci.edu/~jpd/research/temporal.html: temporal and social structures
- http://haystack.lcs.mit.edu/ : (Oh wow... pretty... Cluttered, but pretty...)
- http://kftf.ischool.washington.edu/projKFTF.asp
- http://osafoundation.org/OSAF_Our_Vision.htm: Chandler
- http://blog.mathemagenic.com/2004/02/01.html
5. Microsoft, implicit queries and information retrieval
4. Implicit searches
<bkhl> sachac, a friend of mine has done something like that, but useful. (It searches libraries for code similar to what you are writing.)
Chat with arete on zelazny.freenode.net#emacs
3. Another book
2. Survey of resources
1. Information retrieval: online book
I'd love to hear about any questions, comments, suggestions or links that you might have. You can e-mail me at sacha@sachachua.com.