Clustering vs Similarity

Of late, I've been looking at search and classification tools, so it's only appropriate that I write a post on the topic.

These two terms are easy to confuse. Similarity is the more general one: e.g. all documents in a set that talk about Linux are similar to one another. Clustering takes those similar documents and classifies them further into groups, e.g. documents about Linux applications, Linux kernel development, Linux security, etc.

Still confused? Try a search engine that clusters its results and enter a search term. The results are all related (similar) to the term, but they are classified into discrete clusters.
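To make the distinction concrete, here is a minimal sketch in Python. It is not how any particular search engine works. The word-count vectors, the cosine measure, the thresholds, the sample documents and the function names (`similar_documents`, `cluster`) are all assumptions chosen for illustration. The first step keeps every document similar to the query; the second step splits those documents into clusters.

```python
import math
from collections import Counter


def vectorize(text):
    """Turn a text into a bag-of-words vector (word -> count)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def similar_documents(query, docs, threshold=0.1):
    """Similarity: keep every document that is close enough to the query."""
    q = vectorize(query)
    return [d for d in docs if cosine(q, vectorize(d)) >= threshold]


def cluster(docs, ignore=(), threshold=0.3):
    """Clustering: greedy single pass (an illustrative simplification).

    Each document joins the first cluster whose first member is similar
    enough; otherwise it starts a new cluster. Words in `ignore` (such as
    the query term every document shares) are dropped before comparing.
    """
    clusters = []
    for doc in docs:
        vec = vectorize(doc)
        for term in ignore:
            vec.pop(term, None)
        for members in clusters:
            if cosine(vec, members[0][1]) >= threshold:
                members.append((doc, vec))
                break
        else:
            clusters.append([(doc, vec)])
    return [[doc for doc, _ in members] for members in clusters]


# Hypothetical sample documents, made up for this example.
DOCS = [
    "linux kernel scheduler patch",
    "linux kernel driver development",
    "linux security firewall rules",
    "linux security audit tools",
    "baking bread at home",
]

if __name__ == "__main__":
    hits = similar_documents("linux", DOCS)
    for group in cluster(hits, ignore={"linux"}):
        print(group)
```

With these sample documents, the similarity step drops the baking post and keeps the four Linux ones, and the clustering step then separates the kernel documents from the security documents. A real system would use better term weighting and a proper clustering algorithm, but the two steps stay the same.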

