Lisias Posted September 10 (edited)

7 hours ago, ColdJ said: I am surprised that @Lisias didn't tag @Vanamonde, @Deddly and @Gargamel to make sure they read the info on that link.

I plain forgot. I think we are getting similar issues on DayJob©, although with different results - and the aftermath is that I haven't been getting enough sleep for some days already.

7 hours ago, ColdJ said: "The homepage was being reloaded 200 times a second, as the [OpenAI] bot was apparently struggling to find its way around the site and getting stuck in a continuous loop," added Coates. "This was essentially a two-week long DDoS attack in the form of a data heist."

I ran into a similar issue on my side. The Forum link tree is circular, with lots and lots of pages linking to each other. A dumb scraping tool (like my initial version) would unavoidably hit the same problem.

The solution could not be simpler: keep track of the URLs and don't scrape them again during a moratorium period - I set mine to one month (redis to the rescue). And, of course, I limited my hits to way more reasonable values (about 30 to 45/minute nowadays).

As we can see, Artificial Intelligence is an oxymoron.

=== == = POST EDIT = == ===

However, this doesn't mean that the Forum's software (Invision, I think) could not be doing a better job...
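For illustration, the moratorium and rate cap described earlier in this post could look roughly like the sketch below. It is not the tool's actual code: the post uses Redis, while this keeps timestamps in a plain dict so it runs standalone, and the names `VisitedUrls` and `Throttle` are hypothetical.

```python
import time

MORATORIUM_SECONDS = 30 * 24 * 3600  # roughly one month, as in the post


class VisitedUrls:
    """Remembers when each URL was last scraped (in-memory stand-in for Redis)."""

    def __init__(self, moratorium=MORATORIUM_SECONDS, clock=time.time):
        self.moratorium = moratorium
        self.clock = clock
        self._last_seen = {}

    def should_fetch(self, url):
        """True if the URL was never scraped or its moratorium has expired."""
        last = self._last_seen.get(url)
        return last is None or self.clock() - last >= self.moratorium

    def mark_fetched(self, url):
        """Record that the URL was scraped just now."""
        self._last_seen[url] = self.clock()


class Throttle:
    """Spaces requests so that at most `per_minute` are issued per minute."""

    def __init__(self, per_minute, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / per_minute
        self.clock = clock
        self.sleep = sleep
        self._next_allowed = 0.0

    def wait(self):
        """Block until the next request is allowed, then reserve the slot."""
        now = self.clock()
        if now < self._next_allowed:
            self.sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.interval
```

A crawler loop would call `should_fetch(url)` before every request, skip the URL when it returns False, and call `Throttle(30).wait()` before each request it does make. A circular link tree then cannot trap it, because every page is visited at most once per moratorium.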
This is the very oldest still visible thread on this Forum: https://forum.kerbalspaceprogram.com/topic/2-ksp-forums-is-now-online/

And these are two HTTP HEAD requests for it, issued about 60 seconds apart:

curl -v --head https://forum.kerbalspaceprogram.com/topic/2-ksp-forums-is-now-online/ | pbcopy

HTTP/2 200
date: Tue, 10 Sep 2024 16:12:08 GMT
content-type: text/html;charset=UTF-8
cf-ray: 8c10b29fbca34ed1-GRU
cf-cache-status: DYNAMIC
cache-control: no-cache="Set-Cookie", max-age=180, public, s-maxage=180, stale-while-revalidate, stale-if-error
expires: Tue, 10 Sep 2024 16:14:13 GMT
last-modified: Tue, 10 Sep 2024 16:11:13 GMT
set-cookie: ips4_IPSSessionFront=cni456csa2af36r30h8fmvpo0h; path=/; secure; HttpOnly
vary: Cookie, Accept-Encoding
cf-apo-via: origin,host
content-security-policy: frame-ancestors 'self'
referrer-policy: strict-origin-when-cross-origin
set-cookie: AWSELB=997D7B590A5AD5B3A7BEBA69831746FDCBBFA28BFFB6AF9BC62DD5BDE535C0A4EDDFB5D8584901B237423519EF2DA8736BDDD877EC1746EE7F33AD352C8B5A095E21920F898533440F3B0CDDFCA739EBCDEA44BAE2A26BC6473FFD5A65BDABE61775AE7992;PATH=/;SECURE;HTTPONLY
x-content-security-policy: frame-ancestors 'self'
x-frame-options: sameorigin
x-ips-loggedin: 0
x-powered-by: PHP/8.1.19
x-xss-protection: 0
server: cloudflare

and 60 seconds later:

HTTP/2 200
date: Tue, 10 Sep 2024 16:13:17 GMT
content-type: text/html;charset=UTF-8
cf-ray: 8c10b44d4cec4edd-GRU
cf-cache-status: DYNAMIC
cache-control: no-cache="Set-Cookie", max-age=180, public, s-maxage=180, stale-while-revalidate, stale-if-error
expires: Tue, 10 Sep 2024 16:15:21 GMT
last-modified: Tue, 10 Sep 2024 16:12:21 GMT
set-cookie: ips4_IPSSessionFront=pdktt2g34l2cip15i3n1jf7d0g; path=/; secure; HttpOnly
vary: Cookie, Accept-Encoding
cf-apo-via: origin,host
content-security-policy: frame-ancestors 'self'
referrer-policy: strict-origin-when-cross-origin
set-cookie: AWSELB=997D7B590A5AD5B3A7BEBA69831746FDCBBFA28BFFB6AF9BC62DD5BDE535C0A4EDDFB5D85845D5C3EC116C11401E6BA78D080408321746EE7F33AD352C8B5A095E21920F898533440F3B0CDDFCA739EBCDEA44BAE2A26BC6473FFD5A65BDABE61775AE7992;PATH=/;SECURE;HTTPONLY
x-content-security-policy: frame-ancestors 'self'
x-frame-options: sameorigin
x-ips-loggedin: 0
x-powered-by: PHP/8.1.19
x-xss-protection: 0
server: cloudflare

The interesting bit is the last-modified: header:

last-modified: Tue, 10 Sep 2024 16:11:13 GMT
last-modified: Tue, 10 Sep 2024 16:12:21 GMT

The last-modified header essentially follows the time the resource was accessed (both values trail the date: header by just under a minute), meaning that Invision (am I right?) is rendering the page every single time it's accessed, whether or not the content has changed. And this specific page is essentially binary equal on both GET requests I did in parallel.

So my attempts to use a HEAD request to avoid scraping an unchanged Forum page were fruitless. Since I keep track of the last time I scraped something, I could just ask for the page's headers and see if it had changed since then, saving the Forum's bandwidth and CPU juice when it had not.

And this would not only help me in my efforts. It would allow a full-blown page caching system that would benefit everybody. Squid is still out there.

This rant is directed at Invision (am I right?), not the Forum. I think.

Edited September 10 by Lisias (POST EDIT)
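The check this post was hoping to rely on can be sketched as follows. This is only an illustration under assumptions: the function names are hypothetical, the scraper is assumed to store its last scrape time as a timezone-aware datetime, and only Python's standard library is used.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from urllib.request import Request, urlopen


def changed_since(last_modified_header, last_scraped):
    """Return True if the page must be fetched again.

    A missing or unparsable header is treated as "changed", the safe default.
    """
    if not last_modified_header:
        return True
    try:
        modified = parsedate_to_datetime(last_modified_header)
    except (TypeError, ValueError):
        return True
    if modified.tzinfo is None:
        # A "-0000" zone yields a naive datetime; treat it as UTC.
        modified = modified.replace(tzinfo=timezone.utc)
    return modified > last_scraped


def head_last_modified(url, timeout=10):
    """Issue an HTTP HEAD request and return the Last-Modified header, if any."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return response.headers.get("Last-Modified")
```

With the headers shown in this post, `changed_since` would report a change on every visit, because last-modified moves forward with each request. That is exactly why the HEAD check saves nothing here.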
Vanamonde Posted September 10

This was the rumor that we'd been hearing as to the cause of the recent troubles, though we had no means of confirming it. Things seem to be settling down at last, though.
ColdJ (Author) Posted September 12

Not getting 502s is now a rare thing for me. And this is at all times of day.
Grenartia Posted September 12

1 hour ago, ColdJ said: Not getting 502s is now a rare thing for me. And this is at all times of day.

Yeah, I haven't seen any settling down of the issue, either.
Lisias Posted September 12

2 hours ago, ColdJ said: Not getting 502s is now a rare thing for me. And this is at all times of day.

1 hour ago, Grenartia said: Yeah, I haven't seen any settling down of the issue, either.

I confirm both impressions. Things are escalating terribly again since the first hours of Sep 9th.
Lisias Posted September 13

On 7/4/2024 at 1:04 AM, ColdJ said: I did a shallow dive into the phenomenon and I have a theory. If as I understand it, the company that hosts the KSP servers is used by over 40% of the web sites world wide and if I am right from stuff I have read that Chat GPT has access to the server.

I think that @ColdJ nailed it at the first shot.

At approximately 10 or 11 AM Zulu, we had a hiatus in the occurrences. However, about 5 or 6 hours later the http 429 came back, suggesting that the pressure picked up again as whatever is being done carried on.

I want to stress that I'm keeping the same pace during the whole period (about 20 to 25 hits per minute, give or take, topping out near 40 very occasionally). So it's my opinion that whoever is hammering the site is increasing their hammer's cadence as they realize the site is getting more responsive (while I keep my pace at best, and drop it to 0 at worst).

I have, at this moment, the following working theories:

1. The Forum's origin IP is not protected by a firewall against access from IPs outside Cloudflare's range, and the hammer knows and uses this IP directly;
2. The Forum's origin IP is protected, but someone punched a hole in the firewall to allow a 3rd party to access the server directly;
3. Whoever is hammering the site is probably being whitelisted by Cloudflare and is hammering the Forum unchecked using Cloudflare's infrastructure.

In all, absolutely all my attempts to find the maximum hit rate I could get by (ab)using parallelism, Cloudflare kicked my balls after less than 5 minutes at near 60 hits per minute on average. PER MINUTE. I can reach 2 or 3 hits per second for less than a minute before Cloudflare axes me out for an hour (and, yeah, you can find the axe dropping on me on the charts), and this is the reason I ended up with the 3 working theories above.
The only way to check these working theories is by analyzing the Forum's NGINX logs - the http 502 messages spilt the beans (disclosing even the NGINX version used - and, yeah, I'm kinda liquided about it too, because the same happens on DayJob©).

Pinging @Vanamonde, @Deddly and @Gargamel (and now @Anth) as suggested by @ColdJ above.
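As a rough idea of what such a log analysis could look like, here is a minimal sketch for the first two theories. It assumes the access log uses NGINX's default "combined" format and records the raw peer address (no real-IP rewriting); the function name `direct_hits` is hypothetical, and the proxy ranges must be supplied by whoever runs it, for example from Cloudflare's published list.

```python
import ipaddress
import re
from collections import Counter

# Matches the start of NGINX's default "combined" log format:
# client address, identity, user, [timestamp], "request line", status code.
LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "[^"]*" (\d{3}) ')


def direct_hits(log_lines, proxy_networks):
    """Count requests per client IP that did not come through the given proxies."""
    networks = [ipaddress.ip_network(n) for n in proxy_networks]
    counts = Counter()
    for line in log_lines:
        match = LINE.match(line)
        if not match:
            continue  # skip lines in an unexpected format
        try:
            client = ipaddress.ip_address(match.group(1))
        except ValueError:
            continue  # first field is not an IP address
        if not any(client in net for net in networks):
            counts[str(client)] += 1
    return counts
```

Any address with a large count in the result reached the server without passing through the listed proxy ranges, which would point to the first or second theory. An empty result would leave the third one standing.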
ColdJ (Author) Posted September 13

The question is, "Is it possible to deny access to the IP addresses that are causing the problems?"
Lisias Posted September 14 (edited)

On 9/13/2024 at 8:51 AM, ColdJ said: The question is, "Is it possible to deny access to the IP addresses that are causing the problems?"

Yes. It demands some highly privileged access to the Forum's server itself, however.

Spoiler: I will tell you something... If by some miracle someone managed to let me get my heretic, dirty paws on the needed resources, I would not block it: I would silently redirect the request to an internal, custom-made service that would render some random babbling in a layout similar to the Forum's (since it's plain impossible they would let me use "more interesting" content....). Train your AI with this!!!

The problem is not the bandwidth, it's the Database's quota being exhausted quickly. So, "investing" a bit of bandwidth and CPU to screw that <insert your non forum compliant favorite expletive here> up would be cheap - and highly satisfying.

Edited September 14 by Lisias (Being serious now.)
Mr. Kerbin Posted September 16

This guy predicted this whole thing.
AtomicTech Posted September 18

On 9/16/2024 at 9:59 AM, Mr. Kerbin said: This guy predicted this whole thing.

Huh, I guess he did.
Lisias Posted September 18

7 hours ago, AtomicTech said: Huh, I guess he did.

Interestingly enough, I don't think this is going to happen due to TTWO's unwillingness to keep this site going. They had a point on the thread @Mr. Kerbin mentioned... it's dirt cheap for them to keep this going, unless adversarial players start to make things harsher for them.

Not trying to sweeten the pill here: TTWO really dug the hole they are in now. But they are not the ones still digging.

Follow the money.
Lisias Posted October 3

Just to formalize what everybody else probably knows by now: things are not improving.

(The gaps on the 26th and 27th are me not monitoring the site, but I posted in that timeframe and got some 50x errors all the same.)