Large amount of data from early years of SETI@home is at risk of being lost forever


https://www.vice.com/en/article/epxjjj/years-of-early-alien-hunting-data-are-at-risk-of-being-lost-forever

This is just sad. The title mentions data, yet the article talks about hardware. I suppose the data is on those machines.

I joined SETI@home about two years after it started, I think, grinding its bytes with my 1 GHz CPU. Nice memories.

Link to comment
Share on other sites

I'm quite surprised that something like this also mentions China. Honestly, what's it to us?

Kind of a joke: Why have there been fewer UFO sightings and rumours in recent years?

Because the pixel counts of mobile phone cameras have increased.

Edited by steve9728
Link to comment
Share on other sites

The power companies, and also AMD and NVidia, sincerely thank the participants for the long-term and fruitful cooperation.

So many nice, expensive graphics cards have found good-hearted people to adopt them at the adopters' own expense...
So much green energy has been sold to power them, likewise...

Everyone, have a nice day!

P.S.
If you really need the calculation results (we have no idea why, as they were just randomly generated), you have a week to save them on your personal HDD.

P.P.S.
We sell HDDs, too.

Link to comment
Share on other sites

1 hour ago, steve9728 said:

I'm quite surprised that something like this also mentions China. Honestly, what's it to us?

I guess a lot of SETI@home participants lived in China lol.

1 hour ago, steve9728 said:

Kind of a joke: Why have there been fewer UFO sightings and rumours in recent years?

Because the pixel counts of mobile phone cameras have increased.

I don't think UFOs can be conflated with SETI and the SETI Institute. SETI at least focuses on actually reasonably possible things (radio signals from distant celestial bodies) as opposed to blurry balls of light filmed by bumpkins.

No offense to bumpkins, by the way, but objects captured on an F/A-18's FLIR are certainly worth more investigation than a 15-second video from a Nokia.

Link to comment
Share on other sites

They just mentioned obsolete and legacy racks of computers. From what I remember, the real trove of SETI data would have been on tapes. Lots of tapes were shipped from Puerto Rico to wherever SETI was located, and the data went out to all the SETI fans and was uploaded back. I'd have to assume that all the processed data got dumped back onto tapes, but I wouldn't be surprised if there was a lot of tape re-use. Tapes should easily outlast 20 years, but I'd recommend putting the data on something slightly more modern. And at something like $75 for a 12TB (uncompressed) LTO-8 tape, it can't be hard to store arbitrary amounts of 20th century data. (I suspect the amount of data at the end would be larger, and towards the end the amount that could be saved might be prohibitive; then again, that data is likely stored in a more modern form anyway.)
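As a rough sketch of the tape arithmetic above: the $75 price and 12TB native capacity come from the post, while the archive sizes and the two-copy policy are purely hypothetical numbers picked for illustration, not anything known about the actual SETI@home archive.

```python
import math

TAPE_CAPACITY_TB = 12   # LTO-8 native (uncompressed) capacity, as quoted in the post
TAPE_PRICE_USD = 75     # rough per-tape price, as quoted in the post


def tape_cost(data_tb: float, copies: int = 1) -> tuple[int, int]:
    """Return (number of tapes, total media cost in USD) for data_tb terabytes.

    Each copy is written to its own set of tapes; only media cost is counted
    (no drives, labour, or storage space).
    """
    tapes_per_copy = math.ceil(data_tb / TAPE_CAPACITY_TB)
    tapes = tapes_per_copy * copies
    return tapes, tapes * TAPE_PRICE_USD


if __name__ == "__main__":
    # Hypothetical archive sizes, chosen only to show how the cost scales.
    for size_tb in (10, 100, 1000):
        tapes, cost = tape_cost(size_tb, copies=2)
        print(f"{size_tb:>5} TB, 2 copies: {tapes} tapes, ${cost}")
```

Media cost grows linearly with the archive size, so under these assumptions the tapes themselves are the cheap part; the drives and the people to run them are not counted here.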

Even if they never found any little green men (hint: try "iha rop areiuqlauc" instead of primes for little green men), it was a marvel of shoestring science. 24x7 data from one of the world's biggest radio telescopes (the sensor on the counterbalance at Arecibo) combined with more computing power than any one computer on Earth (I suspect it was probably bigger than the combined top 500 in the early years), all provided for almost nothing but a cool screen saver.

Link to comment
Share on other sites

when you have non-trivial amounts of data, hardware and data become synonymous. the LHC needs supercomputer-level storage and processing. there's so much data that it's cheaper/faster to make a hard copy on a storage array and airlift it to another lab than it is to transmit it over the internet. it's insane the kind of data some of these science installations can generate.
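A back-of-the-envelope sketch of the "ship the disks" argument, assuming a hypothetical data volume, link speed, and courier time (none of these figures come from the LHC or any real lab):

```python
SECONDS_PER_DAY = 86_400


def transfer_days(data_tb: float, link_gbps: float) -> float:
    """Days needed to push data_tb terabytes through a sustained link_gbps link."""
    bits = data_tb * 1e12 * 8            # decimal terabytes to bits
    seconds = bits / (link_gbps * 1e9)   # gigabits per second to bits per second
    return seconds / SECONDS_PER_DAY


def shipping_wins(data_tb: float, link_gbps: float, shipping_days: float) -> bool:
    """True if physically shipping the storage array beats the network transfer."""
    return transfer_days(data_tb, link_gbps) > shipping_days


if __name__ == "__main__":
    # Hypothetical case: 1000 TB, a sustained 1 Gbit/s link, 3 days by courier.
    print(f"network: {transfer_days(1000, 1):.1f} days")
    print("ship it" if shipping_wins(1000, 1, 3) else "send it over the wire")
```

The comparison ignores the time to write and read the hard copy at both ends, so it only gives a lower bound on the shipping side.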

Link to comment
Share on other sites

On 6/7/2022 at 6:25 AM, kerbiloid said:

The power companies, and also AMD and NVidia, sincerely thank the participants for the long-term and fruitful cooperation.

So many nice, expensive graphics cards have found good-hearted people to adopt them at the adopters' own expense...
So much green energy has been sold to power them, likewise...

Everyone, have a nice day!

P.S.
If you really need the calculation results (we have no idea why, as they were just randomly generated), you have a week to save them on your personal HDD.

P.P.S.
We sell HDDs, too.

paying a positive amount of currency for a video card in this day and age is absurd. my 2070 super cost me -$1500.

an aside: ksp uses so little gpu horsepower that my framerate doesn't stutter at 4k with the miner on in the background.

another aside: money is weird. on the outside it seems to follow the same conservation laws as the rest of the universe, yet it's possible to make it generate more. becoming wealthy is the physics equivalent of making an em drive hover.

Edited by Nuke
Link to comment
Share on other sites

This wasn’t crypto mining. BOINC and SETI@home would run on just about anything, and were primarily designed as a screen saver. Let’s not get off topic, as the article isn’t referring to this type of hardware; it’s referring to obsolete mass storage devices.

I put 15+ years into SETI, and I’m disappointed to see this. But I’m also not too worried, as long as the data lost is either processed non-candidates or unprocessed data. The loss of processed candidate data would be a concern. Anything else is just fluff or can be reacquired through a new survey, which should be done anyways, with new search parameters.

Link to comment
Share on other sites

2 hours ago, Gargamel said:

This wasn’t crypto mining

This wasn't crypto mining.

But it was a constant drain on the participants' money, spent on electric power and on their hardware's lifespan, like in every such volunteer computing project.

The importance of this study's results looks like: "Please stay on the line. Your call is very important to us...
Btw, store the data on your own hardware. Who knows, what if we need it decades later?
What do you say? Moore's law? Never heard of it...
Ah, you mean that two decades ago, when the project started, computers were much weaker... So what took hours then now takes just minutes, and could be computed in a year instead of a decade...
What? Twenty years from now a quantum computer will calculate this in seconds? Come on, just relax and listen to how important your work is!"

At least this doesn't look like the usual "Calculate this biochemical data for all humanity with your own money, and later buy the medicine from us with your own money again."

Crypto miners just honestly eat your money and hide.

Volunteer computing is not bad per se, but if it doesn't bring visible results within a year and the results are not available for free, then it's either not free, or it just spends your money on drawing pixel art with a mouse in Paint instead of using Photoshop five years later.
"Crowdfunding? Crowd, fund!"

Edited by kerbiloid
Link to comment
Share on other sites

24 minutes ago, kerbiloid said:

This wasn't crypto mining.

But it was a constant drain on the participants' money, spent on electric power and on their hardware's lifespan, like in every such volunteer computing project.

The importance of this study's results looks like: "Please stay on the line. Your call is very important to us...
Btw, store the data on your own hardware. Who knows, what if we need it decades later?
What do you say? Moore's law? Never heard of it...
Ah, you mean that two decades ago, when the project started, computers were much weaker... So what took hours then now takes just minutes, and could be computed in a year instead of a decade...
What? Twenty years from now a quantum computer will calculate this in seconds? Come on, just relax and listen to how important your work is!"

At least this doesn't look like the usual "Calculate this biochemical data for all humanity with your own money, and later buy the medicine from us with your own money again."

Crypto miners just honestly eat your money and hide.

Volunteer computing is not bad per se, but if it doesn't bring visible results within a year and the results are not available for free, then it's either not free, or it just spends your money on drawing pixel art with a mouse in Paint instead of using Photoshop five years later.
"Crowdfunding? Crowd, fund!"

What?

Link to comment
Share on other sites

53 minutes ago, Gargamel said:

What?

Electricity and hardware cost money. Your money.

A year of calculations on a decade-old computer = a month on a modern one = a day on a future one, a decade later.

When you calculate some pharma company's data for a future medicine, they won't give it to you for free anyway, so you pay twice.

With SETI's noise shuffling, they didn't even care about the storage, and they don't even promise a medicine.
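The year/month/day comparison above can be sanity-checked with a small sketch. It assumes, purely for illustration, that compute throughput doubles every two years; that doubling period is an assumption, not a measured figure, and real workloads may scale very differently.

```python
def speedup(years: float, doubling_years: float = 2.0) -> float:
    """Speedup factor after `years`, assuming throughput doubles every doubling_years."""
    return 2 ** (years / doubling_years)


def runtime_days(old_runtime_days: float, years_later: float,
                 doubling_years: float = 2.0) -> float:
    """Runtime of the same job on hardware that is years_later newer."""
    return old_runtime_days / speedup(years_later, doubling_years)


if __name__ == "__main__":
    job = 365.0  # a hypothetical job that took one year on the old machine
    for years in (0, 10, 20):
        print(f"{years:>2} years later: {runtime_days(job, years):8.2f} days")
```

Under that assumption a one-year job shrinks to roughly eleven days after a decade and to under half a day after two, which is the same order of magnitude as the year, month, day progression in the post.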

Link to comment
Share on other sites

1 hour ago, kerbiloid said:

Electricity and hardware cost money. Your money.

A year of calculations on a decade-old computer = a month on a modern one = a day on a future one, a decade later.

When you calculate some pharma company's data for a future medicine, they won't give it to you for free anyway, so you pay twice.

With SETI's noise shuffling, they didn't even care about the storage, and they don't even promise a medicine.

And none of that is germane to our topic here, which is SETI losing data to ancient hardware failures.

Link to comment
Share on other sites

Have followed SETI only very superficially.

Just looking at how much current and future survey telescopes have improved in every aspect, and at what has happened with data science since the early 2000s, I would argue not to cry too much and to leave it to a museum or (better) recycling. Data on tapes doesn't last that long (20 years?) anyway and needs frequent copying, so some of it is probably already unreadable.

Link to comment
Share on other sites

7 hours ago, Gargamel said:

And none of that is germane to our topic here, which is SETI losing data to ancient hardware failures.

i'm honestly not sure it doesn't apply. it's a distributed computing problem after all. crypto works by making sure all the participants have a copy of the complete ledger, or at least a piece of it. with currencies, those participants are miners, who are incentivized financially to hold this data. all the data is backed up and constantly checked for data integrity and to integrate new transactions on the ledger. it is intentionally inefficient, as everyone needs to check each other's work and watch out for maliciously mangled transactions, which is why it uses so much power. that kind of thing is on its way out soon. different use cases have different requirements. nfts use the blockchain as data storage and drm. for the purposes of storing large amounts of scientific data you would want to store the data, with parity, and keep that data in as many places as possible, while also trying to keep the number of copies out on the participant nodes more or less the same from block to block. this avoids rare blocks (those who remember the early days of peer to peer file sharing may remember the situation where your file is done, except for that one block that nobody seems to have).
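A minimal sketch of the "store the data, with parity" idea, using simple XOR parity over equal-sized blocks. This is a swapped-in textbook technique for illustration, not how any blockchain or the SETI@home project actually stored data, and it can only rebuild one missing block per parity group.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR a list of equal-length blocks together, byte by byte."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def recover_missing(surviving: list[bytes], parity: bytes) -> bytes:
    """Rebuild the single missing block from the surviving blocks plus parity."""
    return xor_blocks(surviving + [parity])


if __name__ == "__main__":
    data = [b"SETI", b"@hom", b"e!!!"]   # hypothetical 4-byte data blocks
    parity = xor_blocks(data)            # stored alongside the data blocks
    # Pretend the middle block is the rare one nobody seems to have.
    rebuilt = recover_missing([data[0], data[2]], parity)
    print(rebuilt == data[1])
```

Tolerating more than one lost block per group would need a stronger erasure code, which is beyond this sketch.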

people complain about the energy costs of the blockchains, but long term, reliable, bulk data storage is not free energy wise (nor cost wise, for that matter). you need big drive arrays that consume a lot of power just to have plugged in, and you constantly need to be checking the data for bit rot. failures are a regular occurrence on these systems just because they have a lot more moving parts. those errors need to be tolerable and recoverable. even cold storage has its limits. part of the reason the data may be lost is that some of the tapes are corrupted. you need to periodically read the tapes, check that their contents still match their stored hashes, and make fresh copies to ensure the tapes produce a good signal when they pass the read head. a tape that's been sitting in a warehouse for 10 years might produce a weaker signal, and at some point the drive won't be able to tell a 1 from a 0. on a large enough data set, you will need to have a guy whose full-time job it is to maintain those tapes. i kind of think that is where seti screwed up, or simply did not have the budget. i wouldn't be surprised if storage oriented data centers spend most of their power on keeping the data legible, or at least on keeping the hardware that does that cool.
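The periodic hash check described above could look something like this sketch. It assumes the archive has already been read back into ordinary files and that a manifest of known-good SHA-256 digests was recorded when the data was written; the manifest layout and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archive files never need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def scrub(manifest: dict[str, str], root: Path) -> list[str]:
    """Return the names of files that are missing or no longer match their hash.

    manifest maps a file name (relative to root) to its expected SHA-256 hex digest.
    """
    damaged = []
    for name, expected in manifest.items():
        path = root / name
        if not path.is_file() or sha256_of(path) != expected:
            damaged.append(name)
    return damaged
```

Anything `scrub` returns would then be restored from another copy or from parity before the next pass, which is the part that needs a person and a budget.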

as for the tapes, you can run them really slowly over a very sensitive head with really high resolution analog sampling and some top notch signal processing. maybe a neural net that can sniff the patterns and find problem areas. that would probably take a couple orders of magnitude more storage than the contents of the tape itself, not to mention non-trivial compute power and data recovery specialists. again, ain't no budget there.

Edited by Nuke
Link to comment
Share on other sites

Tangentially related:  

Neil deGrasse Tyson, "We can launch a probe from one moving, rotating planet and land on a COMET... we can measure, via LIGO, a wiggle less than the width of a photon... and yet Congress spends its time and money because someone saw a Tic-Tac on the screen of a Navy jet in a restricted airspace???  That's your best evidence for 'little green men'???  C'Mon!"

Link to comment
Share on other sites
