Shawn Posted August 5 Server is acting up, lots of timeouts, will review and fix tomorrow, sorry! Shawn
Shawn (Author) Posted August 10 We are getting hammered with bot traffic all of a sudden. Trying to rein it in, but apparently we've been 'discovered', ughhh
sketchley Posted August 11 (edited)
3 hours ago, Shawn said: We are getting hammered with bot traffic all of a sudden, trying to rein it in but apparently we've been 'discovered' ughhh
Could be a bunch of people training their AIs. It happened to me at the end of December on my small (very small) website: bots were using up the monthly allotment of bandwidth in a matter of days. It stopped when I added a robots.txt file that excluded everything but Google. Here are the rules I used in the robots.txt that stopped the hemorrhaging:
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
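For anyone who wants to sanity-check rules like the ones in the post above before uploading them, here is a minimal sketch using Python's standard `urllib.robotparser`. The helper name `is_allowed` and the crawler name `GPTBot` used as the "everyone else" example are illustrative assumptions, not something from this thread. It only shows what a crawler that honors robots.txt would conclude; it does nothing about bots that ignore the file.

```python
from urllib.robotparser import RobotFileParser

# The rules quoted in the post above, one directive per line.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""


def is_allowed(user_agent: str, url: str = "/") -> bool:
    """Return True if a well-behaved crawler with this name may fetch url.

    `is_allowed` is a hypothetical helper for this example. Pass the bare
    crawler name (e.g. "Googlebot"), not a full User-Agent header:
    robotparser only matches against the text before the first "/".
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for name in ("Googlebot", "GPTBot"):
        print(name, "allowed" if is_allowed(name) else "blocked")
```

The file is only a request, so it helps with polite crawlers and does nothing against ones that skip it.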
Seto Kaiba Posted August 11 It's been happening all over for the last week or so, so much so that a bunch of web hosts are now saying they will block AI crawlers by default.
Shawn (Author) Posted August 11 Yes, I saw multiple AI bots crawling the site starting this last week, hammering the hell out of it. I've got htaccess and robots blocking as many as I can, but they are just ignoring them. We have a LOT of data here from these last 25 years: 1.6 million posts and hundreds of thousands of pictures. If any AI really cares about Macross, it's a lot of stuff to digest!
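Since robots.txt is only advisory, the server-side half of that blocking usually comes down to matching the User-Agent header against a list of crawler names. A rough sketch of that matching logic in Python follows; the thread does not show the actual htaccess rules, so the token list and the `should_block` name are assumptions chosen for illustration. A crawler that spoofs a browser User-Agent will sail straight past a check like this.

```python
from typing import Optional

# Illustrative list only: lowercase substrings of crawler User-Agent
# strings. The site's real blocklist is not given in the thread.
BLOCKED_TOKENS = ("gptbot", "ccbot", "claudebot", "bytespider")


def should_block(user_agent: Optional[str]) -> bool:
    """Return True if the User-Agent header contains a blocked token.

    Hypothetical helper. Matching is a case-insensitive substring test,
    the same idea as a User-Agent condition in an .htaccess rule.
    A missing or empty header is let through here; a stricter site
    might choose to reject it instead.
    """
    if not user_agent:
        return False
    ua = user_agent.lower()
    return any(token in ua for token in BLOCKED_TOKENS)
```

In a web app, this check would run before any expensive page rendering, returning a 403 when it is true.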
treatment Posted August 11
46 minutes ago, Shawn said: Yes I saw multiple AI bots crawling the site starting this last week, hammering the hell out of it. I've got htaccess and robots blocking as many as I can, but they are just ignoring them. We have a LOT of data here from these last 25 years, 1.6 million posts and hundreds of thousands of pictures. If any AI really cares about Macross it's a lot of stuff to digest!
You probably need to try implementing various tarpits on the site to at least contain these AI crawlers.
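For readers who haven't met the term: a tarpit answers a suspected crawler very slowly, so it wastes time on junk instead of pulling real pages. Below is a bare-bones sketch of the idea as a Python generator that drips out a filler page with a pause before each fragment. The function name, chunk count, and delay are all made-up values for illustration, and the `sleep` parameter exists only so the pause can be swapped out when testing. Every held-open connection also costs the server a worker, so treat this as a concept demo and not something to drop onto a busy forum as-is.

```python
import time
from typing import Callable, Iterator


def tarpit_body(
    total_chunks: int = 10,
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[bytes]:
    """Yield a small filler HTML page, pausing before each fragment.

    Hypothetical helper: a streaming response handler would send each
    yielded chunk to the client as it is produced, so the full page
    takes roughly total_chunks * delay_seconds to arrive.
    """
    yield b"<html><body>"
    for i in range(total_chunks):
        sleep(delay_seconds)  # the deliberate stall
        yield f"<p>filler {i}</p>".encode("ascii")
    yield b"</body></html>"
```

With the defaults, one request takes about 20 seconds to finish while sending only a few hundred bytes.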
azrael Posted August 11
40 minutes ago, Shawn said: We have a LOT of data here from these last 25 years, 1.6 million posts and hundreds of thousands of pictures. If any AI really cares about Macross it's a lot of stuff to digest!
Maybe it was ChatGPT-5 being retrained after the laughable release this week. AI companies love data, and their bots will scrape from wherever, however. There was a report out this week that Meta is facing a new lawsuit over training their AI on torrented p0rn (probably to train their filter 🤷). You can bet AI companies are just flat out ignoring and bypassing htaccess and robots.txt.