Google Blogoscoped

Forum

Google Gets One Step Closer to Semantic Web

Search-Engines-Web.com [PersonRank 10]

Monday, September 24, 2007

http://googleresearch.blogspot.com/2007/09/openhtmm-released.html

Statistical methods of text analysis have become increasingly sophisticated over the years. A good example is automated topic analysis using latent models, two variants of which are Probabilistic Latent Semantic Analysis and Latent Dirichlet Allocation.

Earlier this year, Amit Gruber, a Ph.D. student at the Hebrew University of Jerusalem, presented a technique for analyzing the topical content of text at the Eleventh International Conference on Artificial Intelligence and Statistics in Puerto Rico.

Gruber's approach, dubbed Hidden Topic Markov Models (HTMM), was developed in collaboration with Michal Rosen-Zvi and Yair Weiss. It differs notably from other approaches in that, rather than treating each document as a single "bag of words," it imposes a temporal Markov structure on the document. This lets it account for topics that shift within a document, and so it yields a topic segmentation of that document. It also seems to distinguish effectively among the multiple senses that the same word may have in different contexts within the same document.
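To make the "temporal Markov structure" idea concrete, here is a minimal Python sketch of a toy generative process in that spirit. It is not the OpenHTMM code. It assumes that the topic is fixed within a sentence and is redrawn from the document's topic mixture with some probability at each sentence boundary. The function name, the `switch_prob` parameter, and the two-topic vocabulary are all made up for illustration.

```python
import random


def generate_document(sentence_lengths, topic_word_probs, doc_topic_probs,
                      switch_prob, seed=0):
    """Sample a toy document whose topic follows a Markov chain over sentences.

    Returns a list of (topic_index, [words]) pairs, one per sentence.
    All names and parameters here are illustrative assumptions.
    """
    rng = random.Random(seed)
    topics = list(range(len(doc_topic_probs)))
    sentences = []
    topic = None
    for n_words in sentence_lengths:
        # The first sentence always draws a topic. Later sentences keep the
        # previous topic unless a switch occurs (probability switch_prob),
        # in which case a topic is redrawn from the document's mixture.
        if topic is None or rng.random() < switch_prob:
            topic = rng.choices(topics, weights=doc_topic_probs)[0]
        vocab = list(topic_word_probs[topic])
        weights = [topic_word_probs[topic][w] for w in vocab]
        # Every word in the sentence is drawn from the same topic.
        words = rng.choices(vocab, weights=weights, k=n_words)
        sentences.append((topic, words))
    return sentences


# Hypothetical vocabulary: "bank" appears under both topics, so its sense
# depends on the topic of the sentence it falls in.
TOPIC_WORD_PROBS = [
    {"bank": 0.4, "loan": 0.3, "money": 0.3},   # topic 0: finance
    {"bank": 0.4, "river": 0.3, "water": 0.3},  # topic 1: geography
]

doc = generate_document([4, 4, 4, 4], TOPIC_WORD_PROBS, [0.5, 0.5],
                        switch_prob=0.3)
for topic, words in doc:
    print(topic, " ".join(words))
```

A plain bag-of-words model would ignore sentence order entirely. Here, neighbouring sentences tend to share a topic, which is what makes a segmentation recoverable and lets the surrounding sentence hint at which sense of "bank" is meant.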

