“Got kitesurfing on the mind, mixed with some search & classification tech, and a dab of political ranting”

Without Context there is no Meaning

Posted by direwolff on November 9, 2006

This afternoon I had the opportunity to sit and chat with a really smart young entrepreneur whose company is currently in stealth mode, but is doing some interesting things in the area of search and finance. As we discussed his application and the field in general, our conversation drifted towards what I think is a substantial question for any search technology: the issue of context. I find that with every new search technology this point is either missed or mentioned only in passing. My recent comments about Powerset touch on this a bit, but I’ll provide a deeper example related to that discussion.

In that post, I refer to Matt Marshall’s VentureBeat post that claims the following about Powerset:

Similarly, if you type in “Who acquired IBM?,” Google will give you lots of results about companies that IBM acquired, even though that’s not what you asked — because Google can’t understand that the difference between an object and subject. Powerset, on the other hand, will give results of the various companies that acquired IBM units, including Lenovo, and AT&T — which is a better answer to the original question.

Now let me make a subtle change here to make my point. What if the question asked was “who acquired apple?” or “who acquired blackberry?” or “who acquired international business machines?”, etc.? What results would Powerset generate? You see, the issue here isn’t about Powerset or any weakness that it might have; it’s simply the issue of context. Keyword search engines sidestep the problem because they simply provide keyword matches to a query, independent of context, which is also why the results are generally pretty poor once you get past the first page. The first page is only good because of the application of statistical tricks (i.e. people who ask this question tend to click on this), not because the technology was very smart about responding to the query.

In addressing the questions above, you would first want to know the meaning of the verb “acquire”, as well as its synonyms. Second, you would need to consider who is asking the questions. Humans get lots of cues on this sort of thing, but a text box as a means of providing this information to a machine provides far fewer cues. Was a farmer asking the question, or a financial analyst, or someone who mistyped the query or who speaks the language poorly? All of these could generate different conclusions and results. While asking “who acquired international business machines?” might make sense in a business context, this is not the definitional use of the term acquire, and one would not say that “international business machines” was received gradually nor that it was received in return for effort. One could replace the term ‘acquire’ with ‘purchased’ or ‘bought’ or ‘took possession’, as synonymous uses of the original term, but these are not synonyms that could be easily inferred by the software.
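To make the point concrete, here is a toy Python sketch of the difference between matching keywords and interpreting a query in context. Everything in it is an assumption made up for illustration: the SENSES table, the two context labels ("finance" and "farming"), and the function names are hypothetical, and this is not how Powerset, Google, or any real engine works.

```python
# Hypothetical sense inventory: each ambiguous term maps a context to a meaning.
# The entries and context labels are invented for this illustration.
SENSES = {
    "acquired": {"finance": "bought a company", "farming": "obtained or harvested"},
    "apple": {"finance": "Apple (the company)", "farming": "apple (the fruit)"},
    "blackberry": {"finance": "BlackBerry (the device)", "farming": "blackberry (the fruit)"},
}


def tokenize(text):
    """Lowercase the text, drop a trailing question mark, split on whitespace."""
    return text.lower().strip("?").split()


def keyword_match(query, documents):
    """Context-independent matching: keep any document sharing a query term."""
    terms = set(tokenize(query))
    return [doc for doc in documents if terms & set(tokenize(doc))]


def interpret(query, context):
    """Resolve each ambiguous term using what we know about the asker."""
    return {
        term: SENSES[term].get(context, "unknown")
        for term in tokenize(query)
        if term in SENSES
    }


docs = ["Apple pie recipe", "Apple acquired a chip designer"]
print(keyword_match("Who acquired apple?", docs))  # both documents match
print(interpret("Who acquired apple?", "finance"))
print(interpret("Who acquired apple?", "farming"))
```

The keyword matcher returns the pie recipe and the business story alike, because it never asks who is asking. The interpret function only does better because the context is handed to it as an argument, which is exactly the information a bare text box does not supply.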

So what’s my point here? Well, systems that propose to address the woes of search by applying natural language processing also need to explain how they will tackle the context problem. Fortunately, my fellow entrepreneur understands this all too well, which is why he’s not releasing his application to solve several problems right away, but instead is focused on solving some very specific ones that directly apply to the issue of saving people time. Once he has tackled those, he’ll be ready to move on to the next set of problems that can benefit from the application of his technology.

I frequently talk about Readware, and this is an area that we spend a lot of time tackling as well, because we understand that without an overarching ontology or worldview, there is no context, and without context there is no meaning.

*** 11/9/06 UPDATE: Don’t know why I forgot to include the following link, but Danny Sullivan, who founded Search Engine Watch and has been writing about search for a long time, put together a really nice piece on the issues around search and why natural language interfaces alone aren’t the answer. Check out his blog post on Search Engine Watch here.
