ArtistServer
Friday, Jun 09, 2006 10:13:31 AM
Attack of the Killer Spiders! The Bots are Coming!
The other day, ArtistServer seemed like it was under attack. I was working on some files, and when I tried to save out to the server... I received a timeout. So I logged in to the Web server to see what was going on and found that she was in fact running, but that a huge number of requests were coming in and tying up all the sessions on the app server. For around an hour, I sifted through the requests coming into the site and for our streaming to find out what was going on, and whether there was something odd about all these requests. On the server, I watched the number of requests climbing - which gave me the impression that I was dealing with some bots indexing the content on the site.
And yes, it turned out that we were being blasted by various bots all at the same time. In all, there were more than 6 different types of bots hitting the site and in one case, there were nine bots from the same service. Here's a sampling of the UserAgents and IPs I found hitting ArtistServer all at the same time:
user agent: Sphere Scout&v4.0 (beta) - scout at sphere dot com IP: 64.40.115.57
user agent: Sphere Scout&v4.0 (beta) - scout at sphere dot com IP: 64.40.115.48
user agent: Sphere Scout&v4.0 (beta) - scout at sphere dot com IP: 64.40.115.55
user agent: Sphere Scout&v4.0 (beta) - scout at sphere dot com IP: 64.40.115.35
user agent: msnbot/0.9 IP: 65.55.246.46
user agent: Mozilla/5.0 (compatible; Yahoo! Slurp) IP: 68.142.250.169
user agent: Mozilla/5.0 (compatible; Yahoo! Slurp) IP: 72.30.102.208
user agent: Snapbot/1.0 IP: 38.98.19.89
user agent: Snapbot/1.0 IP: 66.234.139.204
user agent: Snapbot/1.0 IP: 38.98.19.71
user agent: Snapbot/1.0 IP: 38.98.19.67
user agent: Snapbot/1.0 IP: 38.98.19.72
user agent: Snapbot/1.0 IP: 38.98.19.95
user agent: Snapbot/1.0 IP: 66.234.139.199
user agent: Snapbot/1.0 IP: 66.234.139.194
user agent: Snapbot/1.0 IP: 66.234.139.197
user agent: YahooFeedSeeker/2.0 IP: 66.163.187.77
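For anyone who wants to do this kind of sifting without eyeballing every request, here is a rough Python sketch of how the tallying could go. It uses a handful of the user agent / IP pairs from the list above; the `summarize` helper and the choice to treat each /24 as one "block" are my own assumptions for illustration, not how the server actually groups anything.

```python
from collections import Counter, defaultdict
import ipaddress

# A few (user agent, IP) pairs copied from the list above.
SAMPLE = [
    ("Snapbot/1.0", "38.98.19.89"),
    ("Snapbot/1.0", "66.234.139.204"),
    ("Snapbot/1.0", "38.98.19.71"),
    ("msnbot/0.9", "65.55.246.46"),
    ("Mozilla/5.0 (compatible; Yahoo! Slurp)", "68.142.250.169"),
    ("Mozilla/5.0 (compatible; Yahoo! Slurp)", "72.30.102.208"),
]


def summarize(pairs):
    """Return (entries per agent, sorted /24 blocks seen per agent)."""
    hits = Counter()
    blocks = defaultdict(set)
    for agent, ip in pairs:
        hits[agent] += 1
        # Assumption: a /24 is a reasonable stand-in for "one IP block".
        net = ipaddress.ip_network(f"{ip}/24", strict=False)
        blocks[agent].add(str(net))
    return hits, {agent: sorted(nets) for agent, nets in blocks.items()}


hits, blocks = summarize(SAMPLE)
for agent, count in hits.most_common():
    print(count, agent, blocks[agent])
```

Grouping by agent and by block makes it easier to spot one service coming in from several addresses at once.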
And even Google was hitting us at the same time.
There are so many people starting services that index content on the Web that it's starting to get out of hand. I was reading a blog yesterday about a study on bot behavior - and in this study, Yahoo! Slurp made over a million requests in a year to a mid-sized site. That's quite a bit considering that it is only one of hundreds or thousands of bots out there.
If you or someone you know is planning to build or launch yet another search engine, blog index or meme tracker, please, make nice bots/spiders - or better yet, don't crawl a site with 10 different bots across two different IP blocks like Snap.com apparently does - as it certainly DOES end up looking and smelling like an attack.
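As a rough idea of what a "nice" bot might do before fetching anything, here is a minimal Python sketch using the standard library's `urllib.robotparser`. The robots.txt contents, the `ExampleBot` name, the example.com URLs and the `polite_plan` helper are all made up for illustration. `Crawl-delay` is a non-standard robots.txt extension, so the sketch falls back to a default pause when a site doesn't set one.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a site owner might publish.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /stream/
"""


def polite_plan(robots_txt, agent, urls, default_delay=5.0):
    """Return (URLs the agent may fetch, seconds to wait between requests)."""
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())
    delay = rp.crawl_delay(agent)
    if delay is None:
        # No Crawl-delay given: still pause between requests.
        delay = default_delay
    allowed = [url for url in urls if rp.can_fetch(agent, url)]
    return allowed, float(delay)


allowed, delay = polite_plan(
    ROBOTS_TXT,
    "ExampleBot",
    ["http://example.com/artists/", "http://example.com/stream/track1"],
)
print(allowed, delay)
```

A crawler built this way would fetch the allowed URLs one at a time from a single client and sleep for `delay` seconds between requests, rather than fanning out across many IPs at once.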
I also found during this fine combing through the requests that there were two servers set up in Beijing, China that appeared to be re-serving our content to other IPs. I couldn't access the servers via a browser, but the HTTP header data was showing the files being served from an index page at the root of their domain. After I put up a block on their IP - the activity stopped... then started up w/ a new IP, etc. So I ended up having to block their whole IP range.
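Blocking a whole range instead of chasing single addresses boils down to a membership test against a network block. Here is a small Python sketch of that idea using the standard `ipaddress` module. The range shown is a reserved documentation block (203.0.113.0/24), not the range that was actually blocked, and `is_blocked` is a hypothetical helper; the real block would normally live in the firewall or Web server config.

```python
import ipaddress

# Placeholder range from the reserved documentation space, NOT the real one.
BLOCKED_RANGES = [ipaddress.ip_network("203.0.113.0/24")]


def is_blocked(ip, ranges=BLOCKED_RANGES):
    """Return True if the address falls inside any blocked range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in ranges)


print(is_blocked("203.0.113.7"))   # inside the blocked range
print(is_blocked("198.51.100.7"))  # outside it
```

With a range check like this, a client that hops to a neighboring address in the same block is still caught.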
This is all the kind of stuff I really do not enjoy - dealing w/ the servers and network issues gets a bit stressful - especially when I don't have anyone on staff to rely on for help. Fortunately, my ISP is very cool and has helped out when I've needed it. Thanks David!
bots
spiders