
Lisias

Members · 7,677 posts
Everything posted by Lisias

  1. There's no more convenient time to apply a fix for a critical bug than NOW. I lost another night of sleep to the same error that bit me last time, an error that has been fixed for almost a year on every machine but one - the one that belongs to a client that has been problematic since day one, so someone decided that a scheduled stop to update the damned machine would make us look bad. Well, I believe we are already looking pretty bad now - for something that hasn't happened anywhere else on our stack for 11 months, and that is triggered by some pretty low-standard service from a 3rd party we are tied to. Bugs are not a problem. Unfixed bugs are a problem. Fixed but undeployed bugs are a huge problem - it's very, very tricky to write a report about something that happens regularly just because someone wouldn't approve scheduling a one hour pause in the service (that's the minimal billable time window; the update itself takes 45 seconds. On a bad day). I lost my sleep, I'm losing my temper, I'm running out of patience. I'll try to play something relaxing - going back to bed is out of the question by now.
  2. I come here essentially once a month - when I reboot the machine and have to log in again. The rest of the time I essentially leave the browser open on this site 24x7.
  3. Oh, by the Krakens, thank you very much for it!
  4. Working on it. Once I manage to publish the pywb archive, the next step will be a search engine. Interestingly enough, this last step will be the easiest - I already have an FTP search engine project for retro-computing working (on a bunch of Raspberry Pis!!), and if we dig enough, I'm absolutely sure we will find even better solutions nowadays (mine was a novelty 5 years ago). Discord was an experiment that went bad. Reddit is less bad, but the site's format is not the best for what we need. I agree, this Forum is the best format. Orbiter-Forum runs on 240 USD/month, if we accept the last round of donations as a source for this information. I think this Forum, right now, would need something more due to the larger workload. What we would really need, assuming this Forum will be decommissioned, would be a federated model with many voluntary servers running under some kind of distributed operating system. Boy, I miss the times in which Plan9 could be something... This is where I think things would not be so bad. If T2 decides to complain, it would be because they want to do something with the IP - which means the Forum will stay alive. We need to keep in focus that we are not working to replace the Forum; we are working to guarantee content preservation and to have a lifeboat available if the ship sinks. It's still perfectly possible that we are just overreacting, that nothing (still more) bad is going to happen, and that the Forum will be available for a long time. IMHO, if we are going the extra mile and setting up a Forum to be used in the unfortunate (and, at this time, hypothetical) absence of this one, we should consider going Open Source as much as we can to keep the costs down. I agree that closed/licensed solutions are way more polished, but a non-profit community that will rely on donations (at best) and/or sponsoring (more probably) needs to keep the costs down. Voluntary work is cheaper than licensed software.
I think we need to look around and see what the current alternatives are - but of one thing we may be sure: it will not be exactly like this Forum. The problem I see is that the distributed model initially envisioned for the Internet was murdered and buried by commercial interests. The ideal solution would be distributed computing, with many, many, really many small servers volunteered by many, many, really many individual contributors. We are having this problem on the Forum exactly due to the monolithic nature of the solution (which matches the commercial interest of the owner). This "business model" is unsuited for a non-profit community effort. Granted, I'm unaware of any other widely adopted alternative. I doubt we could go WERC on this one. Sponsorship, IMHO, is going to be the best chance. But how to gather sponsors for a project whose existence depends on the failure of this Forum? "Here, we are asking for some donations to keep this new Forum - but it will not be used unless the main one goes down..." Companies sponsor things for a reason: they want some visibility in exchange, "look at us, we are sponsoring this!". They will not get this counterpart unless the thing goes live for good, aiming to replace this Forum - something that, to the best of my knowledge, is not the aim of all this effort.
  5. I have had the tool working since early Saturday, and the thing works. Setting it up is a bit of a pain in the SAS, but IMHO worth it. I'm gradually building up instructions here: https://github.com/net-lisias-ksp/KSP-Forum-Preservation-Project and, unsurprisingly, I forked the pywb project to publish some small fixes I did (or am still doing) here: https://github.com/Lisias/pywb/tree/dev/lisias . There's an additional benefit to my approach (I went the hard way for a reason - the UNIX way!!!): the scraper script can be customized for distributed processing. There are about 450K pages around here; if we set up a pool of a handful of trusted collaborators, we will be able to keep the mirror updated with relatively low effort on our side (2 people each doing half the job at the same time is way faster) and with less load on the Forum (way better than the 2 people each doing the whole job by themselves). Disclaimer: Monday night, after Working Hours, I will further update the github repository with the instructions to fire up the scraping infrastructure. Then, hopefully, we can start discussing how to go multiprocessing with the thing. === == = POST EDIT = == === I think that the best way to distribute the files will be by torrent. Torrent files can be updated, by the way, so one single torrent will be enough for everybody. Since serving the files in the WARC.gz format will probably be the best option, the host serving the mirror can also help distribute the files via torrent - but keep your quotas in check!
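The distributed-processing idea above can be sketched as a simple page-range partitioner. This is a minimal illustration, not the actual scraper script from the repository; the ~450K page count comes from the post, while the function and parameter names are mine:

```python
def partition(total_pages: int, workers: int) -> list[range]:
    """Split pages 1..total_pages into one contiguous chunk per collaborator."""
    base, extra = divmod(total_pages, workers)
    chunks, start = [], 1
    for i in range(workers):
        size = base + (1 if i < extra else 0)  # spread the remainder evenly
        chunks.append(range(start, start + size))
        start += size
    return chunks

# e.g. ~450K Forum pages split between 3 trusted collaborators
for i, chunk in enumerate(partition(450_000, 3)):
    print(f"worker {i}: pages {chunk.start}..{chunk.stop - 1}")
```

Each collaborator then scrapes only their own range, which is what cuts both the per-person effort and the load on the Forum.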
  6. Some pearl from the distant past...
  7. Bystanders? Yes, they do. But in the face? No way. Fake attempts aim at the chest and belly, where a good bulletproof vest would take the hit, or at the arms or legs, where the wound would not be fatal. This one was real. "Interesting" things are going to happen in the next months. Fasten your seat belts.
  8. Announce! New release 2024.07.13.0 for the TweakScale Companion ÜberPaket, with everything (and the kitchen sink) included for the lazy installers!! Updates the Companions:
     • Firespitter to 1.3.0.2
     • Frameworks to 0.4.0.4
See the project's main page for details. Your attention please: completely remove all the previous contents of the GameData/TweakScaleCompanion directory, or you will trigger some FATALities on TweakScale's Sanity Checks! This thingy needs TweakScale v2.4.7 or superior to work. Download here or in the OP. Also available on CurseForge and SpaceDock.
  9. Added or updated:
     • CTTP - Community Terrain Texture Pack
     • DART - D.A.R.T. (Double Asteroid Redirection Test) Range Challenge
     • GEP - Grannus Expansion Pack
     • JFA - JebFarAway (?)
     • JNSQ - JNSQ (Je Ne Sais Quoi) Planet Pack
     • KSRSS - Kerbal Size Real Solar System
     • SVE - Stock Visual Enhancements
     • SVT - Stock Visual Terrain
Thanks to @OhioBob and @D4RKN3R! https://github.com/TweakScale/Companion/blob/master/Database/Abbreviations.csv
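A two-column abbreviation file like the one linked above is trivial to consume with the standard library. A minimal sketch - the inline sample stands in for Database/Abbreviations.csv, whose exact column layout I have not verified:

```python
import csv
import io

# Stand-in for Database/Abbreviations.csv; a simple
# (abbreviation, full name) two-column layout is assumed here.
sample = """CTTP,Community Terrain Texture Pack
GEP,Grannus Expansion Pack
SVT,Stock Visual Terrain
"""

# Map each abbreviation to its expanded name
abbreviations = {row[0]: row[1] for row in csv.reader(io.StringIO(sample))}
print(abbreviations["GEP"])  # → Grannus Expansion Pack
```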
  10. Announce!
     • TweakScale Companion for Firespitter 1.3.0.2 is in the wild. Fixes a screw-up in the distribution. Thanks to kmsheesh for the heads up! Download here or in the OP.
     • TweakScale Companion for Frameworks 0.4.0.4 is in the wild. Correctly scales System Heat. Download here or in the OP.
The ÜberPaket will be updated Soon™
  11. Please don't! I'm planning to use your metadata to double-check what I'm doing - I don't wanna lose content due to some unexpected condition not handled by the stack! Cross-checking is the key to guaranteeing that. Thank you!
  12. Your report is incomplete and alarmist. Yes, you found a problem. But you failed to report that other threads were fetched all right. I agree that this is still work in progress. I disagree that it's useless. It's just not ready yet. My guess is that @bizzehdee's crawler is failing to detect when the response returns an empty page under an HTTP 200. I suggest checking whether the response is valid and, if not, sleeping a few seconds and trying again. It's what I was doing, by the way, when I accidentally fired the crawler without auto throttle and got a 1015 rate limit from Cloudflare... oh, well... I will do some KSP coding in the meantime. Which reminds me: NEWS FROM THE FRONT. I gave the 1 finger salute to pywb's wombat. I'm doing the crawling using Scrapy and a customized script to detect the idiosyncrasies, and pywb is now set up as a recording proxy - something it really excels at. The only drawback is the need to set up a redis server for deduplication. On the bright side, the setup ended up not being too memory hungry; I'm absolutely sure I will be able to set up a Raspberry Pi 4 (or even a 3) to do this job! Setting up a public mirror, however, may need something more powerful (but I will try the RaspPi all the same): for replaying, you need a dedicated CDX server in Java to be minimally responsive. And, yes, the thing is creating WARC files like a champ. This solution is 100% interoperable with Internet Archive and almost every other similar service I found. If I understood some of the most cryptic parts of the documentation, we can federate each other's mirrors on pywb itself, saving us from NGINX and DNS black magic. Note to my future self: don't fight the documentation, go for the Source! === == = POST EDIT = == === Oukey, the 1015 ban was lifted while I typed this post from my mobile.
Back to @bizzehdee, here follows a thread that was fetched correctly: https://github.com/bizzehdee/kspforumdata/blob/main/topics/1000-meta.json https://github.com/bizzehdee/kspforumdata/blob/main/topics/1000-1-articles.json Again, the crawler needs some work to work around Cloudflare's idiosyncrasies (the HTTP 200 with an empty page being the most annoying), but the tool is working almost fine. And this parsed data will be very, very nice for feeding a custom search engine! === == = POST EDIT² = == === I found an unexpected beneficial side effect of using a local, Python based crawler - now it's feasible to distribute tasks! Once we establish a circle of trusted collaborators, we can divide the task into chunks and distribute them between the participants. This will lower the load on the Forum, save bandwidth for each participant and accelerate the results. As soon as I consolidate the changes and fixes I did during the week on this repo, I will pursue this idea.
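The check-and-retry suggestion above (empty body under HTTP 200 means retry after a pause) can be sketched independently of any particular crawler. This is a minimal illustration with a pluggable fetch callable; none of these names come from @bizzehdee's crawler or from Scrapy:

```python
import time

def fetch_with_retry(fetch, url, retries=3, delay=5.0):
    """Retry when the server answers HTTP 200 with an empty body,
    the Cloudflare idiosyncrasy described in the post above.
    `fetch` is any callable returning (status_code, body_text)."""
    for attempt in range(retries):
        status, body = fetch(url)
        if status == 200 and body.strip():  # valid, non-empty page
            return body
        if attempt < retries - 1:
            time.sleep(delay)               # back off before trying again
    raise RuntimeError(f"no valid response for {url} after {retries} tries")
```

In Scrapy itself the equivalent place for this check would be a downloader middleware or a retry of the request from the spider callback; the standalone function above just shows the decision logic.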
  13. Fairly interesting approach. I will give this a peek during the night - you are essentially "competing" with WARC. From my side, I will bite the bullet and insist on pywb, despite not being exactly happy with the multidaemon solution they chose (external CDX indexer). The direct alternative, OpenWayback, is deprecated, and the Internet Archive tools are even less user friendly. I found an external crawler, by the way, that I can rely on to do the crawling instead of injecting wombat into javascript land on the browser. The single binary solution had already gone through the window, anyway...
  14. It would not be retaliation, it would be damage control. Do you know the phrase "We don't negotiate with terrorists"? It's the same principle. If they budge to any harassment campaign, they will encourage more harassment campaigns in the future. It's as simple as that. As a matter of fact, it's probably what I would do - if this Forum became a magnet for toxic people, I would not want my name associated with it... They have other games and other Forums to care about. There are places for harsh measures against bad company policies in a Society - but this, definitively, is not one of them. You can't bully people into caring.
  15. Both @Lonelykermit and @Fizzlebop Smith have already addressed why this would be a terrible idea. My turn is to explain how to rework it in a way that could help us. First things first: they know they screwed the pooch. Badly. They don't need us to remind them all the time. Whether they are guilty, responsible or victims of the problem, they are still humans, and people don't like to have their borks rubbed in their face all the time. So, what exactly do we want? Well, we want to keep the Forum running. Who can do that? T2. Now, put yourself in their shoes: if YOU were the one responsible for all this mess, including the Forum, how would you want people to reach you about it? Being called all day, as if you were a deadbeat pursued by collectors? I don't know about you, but if something I'm paying for starts to cost me money and patience beyond the potential earnings I could get from it, I would shut down the damned thing. So, another line of action is needed. We need to reach a win-win situation - we keep the Forums, and they get something back (or avoid losing something). Which brings this question to the table: what are we willing to do to help them help us keep the Forum alive? I'm open to ideas. But one possible idea would be, well, writing a letter. A good, old and out of fashion polite and supportive letter - handwritten and sent by post, paying for the stamp. Or something like that. People writing letters and paying for the stamp shows engagement and interest. Sending emails and electronic messages is low effort, but a properly written letter to T2's ombudsman, or whoever is responsible there for handling customers, is something else. Of course, I don't have the slightest idea whether this would work or not, but at least it would not hurt either - which makes it a way better option than pestering them until someone decides that enough is enough and pulls the plug to cease the harassment.
Just my 2 cents, anyway - perhaps doing nothing would be the best option.
  16. Ugh... You are right, I remember cursing this once but completely forgot about it... Well, I will not forget it again! https://github.com/net-lisias-ksp/DistantObject/issues/43
  17. They didn't. They aimed somewhere behind the Channel, hoping that some of the V2s would hit London! They managed to be right about 50% of the time!
  18. Yes. This has been happening since the first appearance of that horrible perversity called PD-Launcher. Some people just don't grasp the idea that you just can't launch a program from a different directory than the one it was meant to be launched from... Private Division didn't help either, as they were the first ones to change the CWD on KSP (which had been the same for more than 10 years). What add-on authors are doing (and what I'm going to do with DOE too on the next release) is to just sweep the dirt under the rug, doing some shenanigans to find the right place instead of trusting KSP to tell us the correct place (as it is being fooled by the problem itself), and call it a day. Not because this is a good fix, because it's not - it's terrible, because KSP will still misbehave, shoving files in the wrong place - but because no one is going to really fix the problem, and most people just blame the add-on author instead of understanding where the problem really is. Thank you very much, by the way, for being one of the people who really understand the problem instead of shooting the messenger! There's a pretty complete essay on : DOE is the result of the efforts of many people, and in the name of @Rubber Ducky, @MOARdV and @TheDarkBadger (the previous authors before me), I thank you. Cheers!
  19. Oukey, so now I know what to do. This launcher stunt used to work in the past, but PDLauncher screwed up even this trick. Currently, the least bad option is to use KSSL. Better solutions are possible, however. But I'm not seeing people willing to implement them, unfortunately.
  20. The <KSP_Root>/GameData/DistantObject/PluginData/Settings.cfg file is a template; it's used only when no working settings file is found. The real settings, once you change something and click the Apply button, can be found in <KSP_Root>/PluginData/DistantObject/Settings.cfg. You not being able to find your KSP.log means that you are launching KSP using that unfortunate Steam launcher hack. Please don't do that; KSP itself doesn't behave correctly when you do. Without the KSP.log, my hands are tied anyway. The MiniAVC.log is useless; I don't maintain MiniAVC. In truth, you should delete all instances of the MiniAVC.dll file in your GameData. There's just no need for it nowadays. Since I'm guessing you are using the PD-Launcher override (mis)stunt, I think you may find it inside the PDLauncher directory. Check it, please.
  21. This tool is not a simple thing either. I spent a lot of time just trying to set up the damned thing - but once you do the dirty work, it just works. There are tools to build WARC files from dumped files, but you lose what's most important - the metadata that guarantees the data wasn't tampered with. One way or another, once I manage to get this grimacing thing working, I will share everything (somehow), so you can have it if you want. Yes. That being the reason I decided to "go rogue" and do things the hard way. Good idea. The problem I have with archive.org is that I already detected that some pages are missing from the history, and every time I tried to add these pages to the crawler, I was greeted with an error message, to the point I started to think that TTI had issued a take down on them. Good to know I was wrong, but I still have the missing pages problem to cope with. But, still, it's a good idea - they are not mutually exclusive solutions. As a matter of fact, in theory I can add the waybackpack WARCs to be indexed and served by pywb all the same. Once I finish installing this Kraken-damned tool (see below), I will pursue this venue too. === == = NEWS FROM THE FRONT = == === The tool is working (finally), except for crawling. There were no instructions about how to deploy some browser side dependencies, not to mention that I'm using Firefox, which has some javascript shenanigans that demanded some changes while deploying - so, yeah, once this thing is working, some pull requests will be made. Right now, I'm cursing the clouds because a browser side library was migrated to TypeScript, and I'm installing a node.js (blergh) environment to compile the damned thing into javascript and then deploy it. All this work will be available to whoever wants it; I will publish a package with batteries included to make the user's life easier - or less harsh: this tool is a "professional" archiver, way less friendly than httrack, for example.
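As a side note on checking which pages archive.org already holds: the Wayback Machine exposes a CDX query endpoint that lists its captures of a URL. A minimal sketch of building such a query with the standard library - the endpoint and its parameters are real, while the function name and the example URL are mine:

```python
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"

def cdx_query(page_url: str, limit: int = 10) -> str:
    """Build a CDX API URL listing Wayback captures of page_url as JSON."""
    params = {
        "url": page_url,
        "output": "json",
        "filter": "statuscode:200",  # only successfully archived captures
        "limit": limit,
    }
    return CDX_ENDPOINT + "?" + urlencode(params)

print(cdx_query("forum.kerbalspaceprogram.com/topic/1000"))
```

An empty result set for a page is a quick way to confirm it really is missing from the archive, rather than a transient error on the crawler's side.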
  22. Finally answering your original question: yes. That material is hot. However... only the application/http documents are saved in the WARC files. I would like to have the images hosted on the forum archived too, but whatever. This is easily fixable with the tool I chose, pywb. But there's a catch - pywb apparently doesn't agree with Internet Archive about how to calculate the digests, and so I ended up wasting some time redownloading the damned thing thinking that the download was somehow corrupted (the dumb-ass typing this post only thought of using gzip --test after redownloading the freaking gzipballs). I'm currently reindexing the archive to see if it will ignore the digest, or if I will need to go over ~65 GB of http dumps and fix them myself - Kraken save the BTRFS with compression activated, it's saving a lot of I/O here. I will come back to you as soon as I manage to import this data into my current pywb collection.
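For reference on the digest disagreement above: WARC records carry payload digests in the labelled form `sha1:` followed by the Base32 encoding of the SHA-1 hash. A stdlib-only sketch of computing that value, so digests produced by two different tools for the same payload can be compared directly (the function name is mine):

```python
import base64
import hashlib

def warc_payload_digest(payload: bytes) -> str:
    """Return a digest in the labelled form WARC files use:
    'sha1:' + Base32(SHA-1 of the payload bytes)."""
    sha1 = hashlib.sha1(payload).digest()  # 20 raw bytes
    return "sha1:" + base64.b32encode(sha1).decode("ascii")

# A 20-byte SHA-1 encodes to exactly 32 Base32 characters, no padding.
print(warc_payload_digest(b"example payload"))
```

If two tools disagree, recomputing the digest from the raw payload this way tells you which side matches the stored `WARC-Payload-Digest` header.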
  23. Please give a peek at this post and this other one. Oh, and at this one too! There's a WARC "dump" up to 2023-05 already on WebArchive. It's about 8G of packed data, but it's already fetched from the Forum, so there's no need to fetch it again. Currently, I'm working on downloading that thing; then I will create a complementary WARC over the 2023-05 one, and then I will see how to feed these data files into the wild (probably by a torrent). With all these data files at hand, we will be able to do some interesting things - but please read my post above where I discuss the legalities. === BRUTE FORCE POST MERGE === To anyone willing to download the Internet Archive data: this dataset doesn't have a torrent, unfortunately. So I made this little script to download that huge basket of bytes using wget, with the option to resume the downloads if things go south in the process. Worst case scenario, you run the script again; no data loss.

#!/usr/bin/env bash
for f in \
    forum.kerbalspaceprogram.com-00000.warc.gz \
    forum.kerbalspaceprogram.com-00000.warc.os.cdx.gz \
    forum.kerbalspaceprogram.com-00001.warc.gz \
    forum.kerbalspaceprogram.com-00001.warc.os.cdx.gz \
    forum.kerbalspaceprogram.com-meta.warc.gz \
    forum.kerbalspaceprogram.com-meta.warc.os.cdx.gz \
    forum.kerbalspaceprogram.com_202305.cdx.gz \
    forum.kerbalspaceprogram.com_202305.cdx.idx \
    forum.kerbalspaceprogram.com_202305_files.xml \
    forum.kerbalspaceprogram.com_202305_meta.sqlite \
    forum.kerbalspaceprogram.com_202305_meta.xml \
; do
    # --continue resumes a partial download instead of restarting it
    wget --continue "https://archive.org/download/forum.kerbalspaceprogram.com_202305/$f"
done