GET /news.html HTTP/1.1 User-Agent: Mozilla/5.0 (compatible; Googlebot/2.1; +htt...

dbaupp · on Nov 27, 2012

I'm probably just missing the context of the discussion, but what does this mean & what's its significance?

fijal · on Nov 27, 2012

It's a technical way to prevent Google from reaching your site. You can easily do that right now (even better, use robots.txt instead of obscure hacks), but the publishers don't want that. They don't want Google to stop showing their content; they want money for it, and Google is unwilling to pay.
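
For anyone unfamiliar with the robots.txt route, here is a rough sketch of what it could look like, checked with Python's standard `urllib.robotparser`. The rules, the `allowed` helper, and the example.com URL are all made up for illustration, not taken from any publisher's actual file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: turn Googlebot away from the whole site,
# leave every other crawler unrestricted.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Disallow:
"""


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt lets user_agent fetch url (illustrative helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Pass the crawler's product token ("Googlebot"), not the full
# "Mozilla/5.0 (compatible; Googlebot/2.1; ...)" header: the parser
# only looks at the part of the string before the first "/".
googlebot_ok = allowed("Googlebot", "http://example.com/news.html")
other_ok = allowed("SomeOtherBot", "http://example.com/news.html")
```

Note that robots.txt is only a request that well-behaved crawlers honor; it does not actually reject the HTTP request the way a server-side User-Agent check would.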

sek · on Nov 27, 2012

If they don't want to be crawled, they could just change the robots.txt. But nobody does.

dagw · on Nov 27, 2012

Almost nobody: http://www.guardian.co.uk/media/greenslade/2012/oct/22/googl...

I'm looking forward to seeing, in a year's time, what the result of this little experiment is.