How would I search for all entries related to a particular person? A BBDB search that resolved URLs would catch the explicit BBDB links perfectly, but what about the links automatically derived from annotations? I could search for all links and then parse out the BBDB-derived name whenever the link text has the form "_____ from/to _____".
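A rough sketch of that last idea in Python, just to pin down the shape of the problem. It assumes the derived link text looks like "E-mail from Jane Doe" or "Chat to John Smith"; the real annotation format may differ, and `name_from_annotation` and `entries_for_person` are made-up names, not part of BBDB.

```python
import re

# Assumed shape of a derived link: "<kind> from <name>" or "<kind> to <name>".
ANNOTATION = re.compile(
    r"^(?P<kind>.+?)\s+(?P<direction>from|to)\s+(?P<name>.+)$",
    re.IGNORECASE,
)

def name_from_annotation(link_text):
    """Return the name part of a 'from/to' link, or None if it does not match."""
    match = ANNOTATION.match(link_text.strip())
    if match is None:
        return None
    return match.group("name").strip()

def entries_for_person(entries, person):
    """entries is an iterable of (entry_id, [link_text, ...]) pairs (hypothetical layout)."""
    wanted = person.casefold()
    return [
        entry_id
        for entry_id, links in entries
        if any((name_from_annotation(link) or "").casefold() == wanted for link in links)
    ]
```

The match splits at the first "from" or "to" it finds, so link text that contains either word before the real separator would be parsed wrongly.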
The more general question is:
How do you extract entities (persons / resources) from semi-structured text? I'm working with hyperlinked entries, so I can assume that:
- any e-mail is associated with a person (I hope)
- websites will frequently be rooted off another person's namespace
- the contact database is populated by people
- names will tend to be non-dictionary, capitalized words
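Three of these assumptions can be turned into a first-pass candidate finder. This is a minimal sketch, not a finished extractor: the e-mail pattern is deliberately loose, `DICTIONARY` is a tiny stand-in for a real word list, and `candidate_entities` and the `contacts` argument are hypothetical names.

```python
import re

# Loose e-mail pattern; good enough for spotting candidates, not for validation.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
# A capital letter followed by lowercase letters, e.g. "Jane".
CAPITALIZED = re.compile(r"\b[A-Z][a-z]+\b")

# Stand-in word list (assumption); a real one would come from a system dictionary.
DICTIONARY = {"the", "i", "met", "with", "and", "today", "about", "project"}

def candidate_entities(text, contacts=(), dictionary=DICTIONARY):
    """Collect possible person references from free text."""
    emails = EMAIL.findall(text)
    # Names already in the contact database are the safest matches.
    known = [name for name in contacts if name in text]
    # Capitalized words that are not ordinary dictionary words.
    names = [w for w in CAPITALIZED.findall(text) if w.lower() not in dictionary]
    return {"emails": emails, "known_contacts": known, "possible_names": names}

result = candidate_entities(
    "Today Jane Doe emailed bob@example.com about the project.",
    contacts=("Jane Doe",),
)
```

The capitalized-word pass is the noisiest part: it depends entirely on how complete the word list is, and it will flag product and place names as well as people.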
I'll start out by getting explicit, consistent links recognized. Then explicit, inconsistent links. Then unlinked names (woohoo).