News, Views, Rants and Raves About Technology and More

Robots-Nocontent for Page Sections


From my limited but significant “web-crawling” experience, one of the major problems is scavenging the meaningful content from a page, which means that no navigation crap, menus, JavaScript or adverts should be indexed. Since there is no standard way for web devs to design navigation, menus and so on, it is impossible to code a parser that works 100% of the time, and trying is a big PITA.
However, this piece on the Yahoo! Search Blog is welcome news:

webmasters can now mark parts of a page with a ‘robots-nocontent’ tag which will indicate to our crawler what parts of a page are unrelated to the main content and are only useful for visitors.

If the trend catches on and becomes a standard (which would need Google’s support), it would be a great help.
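
To make the idea concrete, here is a rough sketch of how a crawler could honor such a marker when pulling text out of a page. It assumes the marker shows up as a `robots-nocontent` value in an element's `class` attribute and that the markup is reasonably well-formed (every non-void tag is closed). The names `NoContentStripper` and `extract_indexable_text` are made up for this example; this is not Yahoo!'s implementation.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class NoContentStripper(HTMLParser):
    """Collects page text, skipping anything inside a marked element."""

    def __init__(self, marker="robots-nocontent"):  # marker value is an assumption
        super().__init__(convert_charrefs=True)
        self.marker = marker
        self.skip_depth = 0   # > 0 while inside a marked element
        self.chunks = []      # text fragments that should be indexed

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            # Already inside a marked element: just track the nesting.
            self.skip_depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.marker in classes:
            self.skip_depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self.skip_depth:
            self.chunks.append(text)


def extract_indexable_text(html):
    """Return the text of `html` with marked sections left out."""
    parser = NoContentStripper()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

Feeding it a page where the menu and an ad block carry the marker should leave only the article text. Unclosed tags inside a marked section would throw the depth counter off, so a real crawler would want a more forgiving parser.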


Written by Brajesh

May 3, 2007 at 9:47 am

Posted in Coding, Search, Trends, Yahoo!
