Tag clouds abound these days and are, I think, a nice way for users to discover content buried deep within a site. A useful tag cloud is built by extracting meaningful keywords from that content. So how do you go about doing that without pulling in all the noise? After all, the word ‘the’ probably appears frequently in your content, but you sure don’t want to display it in your tag cloud.
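To make the noise problem concrete, here is a minimal do-it-yourself sketch in Python: count word frequencies and drop anything on a stop list. The stop list and the `top_keywords` helper are made up for illustration, and a real stop list would need to be far longer, which is exactly the chore you would rather hand off.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop list (an assumption for this sketch).
STOP_WORDS = {
    "the", "a", "an", "and", "or", "but", "of", "to", "in", "on",
    "is", "it", "for", "with", "that", "this", "you", "your",
}


def top_keywords(text, limit=10):
    """Return the most frequent non-stop words as (word, count) pairs."""
    # Lowercase the text and keep only runs of ASCII letters.
    words = re.findall(r"[a-z]+", text.lower())
    # Skip stop words and very short fragments such as "s" or "t".
    counts = Counter(w for w in words if w not in STOP_WORDS and len(w) > 2)
    return counts.most_common(limit)


if __name__ == "__main__":
    sample = "The cat and the hat. The cat sat on the mat."
    print(top_keywords(sample, limit=3))
```

Raw frequency like this only tells you which words are common, not which ones are significant, so it tends to need a lot of hand tuning.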
One nice way might be to use the Yahoo! term extraction API. In short, bung it some content and it’ll return a list of relevant, related terms. It uses Yahoo!’s search technology and will filter out all the chaff for you. So you could, for example, display a post on a forum and use this API to give the user a list of significant, related keywords to search on blogs, Technorati or wherever.
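As a rough sketch of what "bunging it some content" might look like in Python, the snippet below builds the POST body and pulls the terms out of a JSON reply. Treat the details as assumptions rather than gospel: the endpoint URL, the `appid`, `context`, `query` and `output` parameter names, and the `ResultSet` / `Result` response shape are as I recall them from Yahoo!'s documentation, so check the official docs before relying on any of it. The helper names are my own.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint, recalled from Yahoo!'s developer docs; verify before use.
TERM_EXTRACTION_URL = (
    "http://search.yahooapis.com/ContentAnalysisService/V1/termExtraction"
)


def build_request_body(app_id, content, query=None):
    """Build the form-encoded POST body (parameter names are assumptions)."""
    params = {"appid": app_id, "context": content, "output": "json"}
    if query:
        # Optional hint that is meant to steer the extraction.
        params["query"] = query
    return urlencode(params).encode("utf-8")


def parse_terms(response_text):
    """Extract the term list from a JSON reply (assumed response shape)."""
    data = json.loads(response_text)
    return data.get("ResultSet", {}).get("Result", [])


if __name__ == "__main__":
    body = build_request_body("YOUR_APP_ID", "Tag clouds help users find content.")
    print(body)
    # Sending it would look roughly like this (not run here):
    #   from urllib.request import urlopen
    #   with urlopen(TERM_EXTRACTION_URL, data=body) as resp:
    #       print(parse_terms(resp.read().decode("utf-8")))
```

Keeping the request building and the response parsing separate from the network call means you can try both out without an application ID.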
Further reading can be found here and here, with the official documentation on the Yahoo! Developer Network.


 









 