Dumpster Diving for robots.txt Files | ksoft Newsletter

3/18/2006 @ 10:09pm


You might be surprised at what you can find these days hiding in obscure web files such as robots.txt.

As a brief intro, the robots.txt file is a plain-text file that webmasters place at the root of a site to tell search engine crawlers which pages they should not crawl. As with most configuration files, robots.txt can include comments: anything after a # on a line is ignored by the crawlers.
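To make that concrete, here is a minimal sketch using Python's standard urllib.robotparser module. The robots.txt content, the bot name "ExampleBot", and the example.com URLs are all made up for illustration; they are not taken from any of the sites discussed here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this example.
SAMPLE = """\
# Keep crawlers out of the internal search results
User-agent: *
Disallow: /search
"""


def is_allowed(robots_text, user_agent, url):
    """Return True if the rules in robots_text let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines; comment lines are skipped.
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(user_agent, url)


# A URL under /search is blocked, everything else is allowed.
print(is_allowed(SAMPLE, "ExampleBot", "http://example.com/search?q=popups"))
print(is_allowed(SAMPLE, "ExampleBot", "http://example.com/about.html"))
```

Note that the file is only a request. Well-behaved crawlers honor it, but nothing in the file itself enforces the rules.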

The geek in me found it interesting to hit a few popular sites for their robots.txt file, just to see what's there. Check this out:

Alexa.com

They block all of their search engine colleagues from indexing their own search results, which I think is a little ironic. Their list of robots is somewhat dated, though.

Webmasterworld.com

An entire blog hidden within the robots.txt file? It's like looking at an ezine from the dial-up days. Even more amazing, it appears to be updated daily. There is even an advertisement banner! We're talking about a robots.txt file here.
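If you want to read only the human-facing part of a file like this, the comments are easy to pull out. The following is a rough sketch, assuming comments are everything after a # on a line; the function name extract_comments and the sample text are my own inventions, not the actual contents of any site's file.

```python
def extract_comments(robots_text):
    """Collect the text of every non-empty comment in a robots.txt file."""
    comments = []
    for line in robots_text.splitlines():
        # partition() splits at the first "#"; marker is "" if there is none.
        _, marker, comment = line.partition("#")
        if marker and comment.strip():
            comments.append(comment.strip())
    return comments


# Hypothetical file mixing rules with a blog-style note.
SAMPLE = """\
# Today's post: hello from inside robots.txt
User-agent: *   # applies to every crawler
Disallow: /private
#
"""

for comment in extract_comments(SAMPLE):
    print(comment)
```

Both full-line comments and comments trailing a rule are picked up; the bare # line is skipped because it carries no text.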

Google.com

A long list of URLs. Some are more interesting than others. At least you can tell what they consider important enough to keep out of the search engines. This one stuck out though: /microsoft

What could that be? Last time I checked, those two were fierce corporate rivals. Curious, I followed the link. The resulting page left me somewhat confused, and the title bar even more so: Microsoft - Google Search

Isn't that a copyright violation?
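For skimming a long list of blocked URLs like the one above, a few lines of Python will do. This is only a sketch: the function name disallowed_paths is hypothetical, and the sample file is invented apart from the /microsoft path mentioned earlier.

```python
def disallowed_paths(robots_text):
    """Return every non-empty Disallow path, in the order it appears."""
    paths = []
    for line in robots_text.splitlines():
        # Drop any trailing comment, then split the "Field: value" pair.
        line = line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


# Made-up sample; only /microsoft comes from the post itself.
SAMPLE = """\
User-agent: *
Disallow: /search
Disallow: /microsoft  # the one that stuck out
Disallow:
"""

print(disallowed_paths(SAMPLE))
```

An empty Disallow line means "nothing is blocked", so the sketch leaves it out of the list.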

 
