Thursday, January 16, 2003

Latent Semantic Indexing

http://javelina.cet.middlebury.edu/lsa/out/lsa_intro.htm

Promising technique for search engines. "Regular keyword searches approach a document collection with a kind of accountant mentality: a document contains a given word or it doesn't, with no middle ground. Latent semantic indexing adds an important step to the document indexing process. In addition to recording which keywords a document contains, the method examines the document collection as a whole, to see which other documents contain some of those same words. LSI considers documents that have many words in common to be semantically close, and ones with few words in common to be semantically distant."

Post a Comment

Disclaimer

I'm probably required to say that the views expressed in this blog are my own, and do not necessarily reflect those of my employer. Also, except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 3.0 License, and code samples are licensed under the BSD License.