The Web can be divided into three components: content (pages, images, videos, blogs, feeds), people (readers, writers, creators, commenters), and actions (queries, clicks, pageviews). Current search engines have taken advantage of "keywords" to link those three components together. But the keyword model has reached its limits.
One phenomenon that's challenging keywords is the explosive growth of content. Multimedia content is especially difficult. The scale requirements are huge. Another challenge is that the Web is becoming more dynamic: people want to interact. Search engines have a long way to go to satisfy user needs. To make progress, we have to stop worrying about just the content. We need to consider the context.
Users are not anonymous; they form communities with specific interests. Actions are not random; they are driven by intent. Semantics is important, but extracting semantics is difficult.
There is a practical approach to semantics: understand->extract->expose. It should be data-driven, incremental, and interactive. We need to derive concepts from content, people from users, and intent from actions.
Understanding content has three vectors: intra-page intelligence, inter-page intelligence, and temporal understanding.
The technologies most useful for understanding users as people have been personalization, collaborative filtering, and analyzing social graphs. Personalization has failed to live up to its promise. Harry demos Gianxi, a Microsoft Research project that searches the social network. This isn't online yet as far as I could see. It reminds me of something Rohit Khare showed me at the last WWW in Banff.
Deriving intent requires contextual intelligence, mobile awareness, and intent refinement. The better we do with query classification, the better we do with user intent. Is there commercial intent? Is it location sensitive? Harry shows a demo (actually it was his trusty sidekick "Graham") where a user action (dragging a particular picture to a special zone on the page) reorders the search results and filters them according to additional user actions. This is a great example of how understanding intent gives much better results than mere keywords. "Give me things that look like this..." This demo actually generated applause from the audience.
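To make the query classification idea a bit more concrete, here is a toy sketch of my own, not anything shown in the talk. It answers the two questions above (commercial intent? location sensitive?) by looking for cue words in the query. The function name `classify_query` and both cue lists are made up for illustration; a real engine would presumably learn these signals from query logs and click data rather than hard-code them.

```python
import re

# Hypothetical cue words, chosen only for illustration (not from the talk).
COMMERCIAL_CUES = {"buy", "price", "cheap", "deal", "discount", "order"}
LOCATION_CUES = {"near", "nearby", "around", "directions", "map"}


def classify_query(query):
    """Return rough intent flags for a search query using keyword cues."""
    # Lowercase the query and split it into alphanumeric tokens.
    tokens = set(re.findall(r"[a-z0-9]+", query.lower()))
    return {
        # True if any commercial cue word appears in the query.
        "commercial": bool(tokens & COMMERCIAL_CUES),
        # True if any location cue word appears in the query.
        "location_sensitive": bool(tokens & LOCATION_CUES),
    }
```

Calling `classify_query("pizza near me")` flags the query as location sensitive but not commercial. Keyword cues like these are crude, which is rather the point of the talk: intent has to come from context and actions, not just from the words typed.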
According to Graham, one of the demos was actually hobbled by the "Great Firewall of China". Interestingly, it was a search of video from Hillary Clinton. The demo extracted the most relevant portions of long videos and showed just those snippets. Having seen the relevant portion, viewers could then select the whole video.
In order to get more out of search, we have to understand semantics, extract it, and then expose it to the user for further refinement.