The crawler had better be undetectable as such. For instance, it should send the "expected" headers (User-Agent, Accept, etc.), have cookies enabled, and operate from many distinct, constantly changing IP addresses.
Otherwise, a spammer can run their own URL shortener on a 5 USD/month VPS and serve a spammy link to users and a regular-looking link to the crawler.
BTW: a "crawler" implemented with Mechanical Turk workers would be a little bit harder to detect, but would also have its downsides.
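To make the cloaking trick concrete, here is a rough sketch of the decision a spammer's shortener could make on each request. Everything in it is an assumption for illustration: the function names, the header list, the example.com URLs, and the idea that the cookie is checked on a second hop after the shortener has already set it. A real cloaker would likely use more signals than these.

```python
from typing import Mapping, Set

# Hypothetical targets; placeholders, not real endpoints.
BENIGN_URL = "https://example.com/harmless"
SPAM_URL = "https://example.com/spam"

# Headers a normal browser is assumed to send on every request.
EXPECTED_HEADERS = ("user-agent", "accept", "accept-language")


def looks_like_crawler(
    headers: Mapping[str, str], ip: str, seen_crawler_ips: Set[str]
) -> bool:
    """Return True if the request shows any of the tell-tale signs."""
    lowered = {name.lower(): value for name, value in headers.items()}

    # Missing or empty "expected" headers give the crawler away.
    if any(not lowered.get(name) for name in EXPECTED_HEADERS):
        return True

    # Assumption: this is the second hop, so a cookie set on the first
    # hop should be present unless the client has cookies disabled.
    if not lowered.get("cookie"):
        return True

    # An IP address already observed crawling is treated as a crawler.
    return ip in seen_crawler_ips


def redirect_target(
    headers: Mapping[str, str], ip: str, seen_crawler_ips: Set[str]
) -> str:
    """Pick the regular-looking link for crawlers, the spammy one otherwise."""
    if looks_like_crawler(headers, ip, seen_crawler_ips):
        return BENIGN_URL
    return SPAM_URL
```

Each check corresponds to one of the requirements above: a crawler that skips the usual headers, drops cookies, or reuses an IP address gets the harmless link and never sees the spam.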