The long road to recover Frogger 2 source from tape drives

GITHUB.COM
508
207
WhiteDawn
6d

Comments

@ogurechny 6d
Modern backup would simply state “API keys and settings are here:”, and a link to collaboration platform closed after 3 years of existence.
@ilamont 6d
In The Singularity Is Near (2005) Ray Kurzweil discussed an idea for the “Document Image and Storage Invention”, or DAISI for short, but concluded it wouldn't work out. I interviewed him a few years later about this and here's what he said:

The big challenge, which I think is actually important almost philosophical challenge — it might sound like a dull issue, like how do you format a database, so you can retrieve information, that sounds pretty technical. The real key issue is that software formats are constantly changing.

People say, “well, gee, if we could backup our brains,” and I talk about how that will be feasible some decades from now. Then the digital version of you could be immortal, but software doesn’t live forever, in fact it doesn’t live very long at all if you don’t care about it if you don’t continually update it to new formats.

Try going back 20 years to some old formats, some old programming language. Try resuscitating some information on some PDP1 magnetic tapes. I mean even if you could get the hardware to work, the software formats are completely alien and [using] a different operating system and nobody is there to support these formats anymore. And that continues. There is this continual change in how that information is formatted.

I think this is actually fundamentally a philosophical issue. I don’t think there’s any technical solution to it. Information actually will die if you don’t continually update it. Which means, it will die if you don’t care about it. ...

We do use standard formats, and the standard formats are continually changed, and the formats are not always backwards compatible. It’s a nice goal, but it actually doesn’t work.

I have in fact electronic information that in fact goes back through many different computer systems. Some of it now I cannot access. In theory I could, or with enough effort, find people to decipher it, but it’s not readily accessible. The more backwards you go, the more of a challenge it becomes.

And despite the goal of maintaining standards, or maintaining forward compatibility, or backwards compatibility, it doesn’t really work out that way. Maybe we will improve that. Hard documents are actually the easiest to access. Fairly crude technologies like microfilm or microfiche which basically has documents are very easy to access.

So ironically, the most primitive formats are the ones that are easiest.

@huehehue 6d
Fascinating read that unlocked some childhood memories.

I'm secondhand pissed at the recovery company. I have a couple of ancient SD cards lying around, and this just reinforces my fear that if I send them away for recovery they'll be destroyed (the cards aren't recognized/readable by the readers built into MacBooks, at least).

@LeoPanthera 6d
I really wish they would name the data recovery company so that I can never darken their door with my business.
@smokel 6d
Heh, I remember playing .mp3 files directly from QIC-80 tapes, somewhere around 1996. One tape could store about 120 MB, which is equal to about two compact discs' worth of audio. The noise of the tape drive was slightly annoying, though. And it made me appreciate what the 't' in 'tar' stands for.
@robotnikman 6d
I've always admired the tenacity of people who reverse engineer stuff. To be able to spend multiple months figuring out barely documented technologies with no promise of success takes a lot of willpower and discipline. It's something I wish I could improve more in myself.
@h2odragon 6d
Truly noble effort. Hopefully the writeup and the tools will save others much heartbreak.
@masto 6d
This brings back (unpleasant) memories. I remember trying to get those tape drives working with FreeBSD back in 1999, and it going nowhere.
@jimbob45 6d
F2 was a really neat game. It almost invented Crypt of the Necrodancer’s genre decades early.

It’s a little sad that it took such a monumental effort to bring the source code back from the brink of loss. It’s times like that that should inspire lawmakers to void copyright in the case that the copyright holders can’t produce the thing they’re claiming copyright over.

@db48x 6d
Wow, that backup software sounds like garbage. Why not just use tar? Why would anyone reinvent that wheel?
@crazygringo 6d
Wow, this part makes my blood boil, emphasis mine:

> This issue doesn't affect tapes written with the ADR-50 drive, but all the tapes I have tested written with the OnStream SC-50 do NOT restore from tape unless the PC which wrote the tape is the PC which restores the tape. This is because the PC which writes the tape stores a catalog of tape information such as tape file listing locally, which the ARCserve is supposed to be able to restore without the catalog because it's something which only the PC which wrote the backup has, defeating the purpose of a backup.

Holy crap. A tape backup solution that doesn't allow the tape to be read by any other PC? That's madness.

Companies do shitty things and programmers write bad code, but this one really takes the prize. I can only imagine someone inexperienced wrote the code, nobody ever did code review, and then the company only ever tested reading tapes from the same computer that wrote them, because it never occurred to them to do otherwise?

But yikes.

@tombert 6d
This is giving me some anxiety about my tape backups.

I have backed up my blu-ray collection to a dozen or so LTO-6 tapes, and it's worked great, but I have no idea how long the drives are going to last for, and how easy it will be to repair them either.

Granted, the LTO format is probably one of the more popular formats, but articles like this still keep me up at night.

@bluedino 6d
This will be fun in 20 years, trying to recover 'cloud' backups from servers found in some warehouse.
@omnibrain 6d
Is anyone else calling it “froggering/to frogger” if they have to cross a bigger street by foot without a dedicated crossing?
@xigency 6d
As a kid, I got this game as a gift and really, really wanted to play it. But after beating the second level, the game would always crash on my computer with an Illegal Operation exception. I remember sending a crash report to the developer, and even updating the computer, but I never got it working.
@hlandau 6d
Absolutely amazing story. Fantastic!

I've actually long been stunned by the propensity of proprietary backup software to use undocumented, proprietary formats. I've always found this quite stunning, in fact. It seems to me like the first thing one should make sure to solve when designing a backup format is to ensure it can be read in the future even if all copies of the backup software are lost.

I may be wrong but I think some open source tape backup software (Amanda, I think?) does the right thing and actually starts its backup format with emergency restoration instructions in ASCII. I really like this kind of "Dear future civilization, if you are reading this..." approach.

Frankly nobody should agree to use a backup system which generates output in a proprietary and undocumented format, but also I want a pony...

It's interesting to note that the suitability of file formats for archiving is also a specialised field of consideration. I recall some article by someone investigating this very issue who argued formats like .xz or similar weren't very suited to archiving. Relevant concerns include, for example, how screwed you are if the archive is partly corrupted. The more sophisticated your compression algorithm (and thus the more state it records from longer before a given block), the more a single bit flip can result in massive amounts of run-on data corruption, so better compression essentially makes things worse if you assume some amount of data might be damaged. You also have the option of adding parity data to allow for some recovery from damage, of course. Though as this article shows, it seems like all of this is nothing compared to the challenge of ensuring you'll even be able to read the media at all in the future.
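
To make the parity idea concrete, here is a minimal sketch of the simplest possible scheme: one XOR parity block computed over equal-length data blocks, which lets you rebuild any single block that is lost. The function names and sample data are hypothetical, made up for illustration; real archival tools generally use stronger codes (Reed-Solomon, for instance) that survive more than one damaged block.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(blocks):
    """Compute a single parity block over all data blocks."""
    return xor_blocks(blocks)


def recover(blocks, parity, lost_index):
    """Rebuild the block at lost_index from the surviving blocks plus parity."""
    survivors = [b for i, b in enumerate(blocks) if i != lost_index]
    return xor_blocks(survivors + [parity])


# Hypothetical sample data: three 4-byte blocks.
data = [b"FROG", b"GER2", b"SRC!"]
parity = make_parity(data)

# Pretend block 1 became unreadable.
damaged = list(data)
damaged[1] = None
restored = recover(damaged, parity, lost_index=1)
print(restored)
```

The catch is that this only works if you know which block is bad and only one is bad per parity group, which is part of why the choice of container and compression format still matters.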

At some point the design lifespan of the proprietary ASICs in these tape drives will presumably just expire(?). I don't know what will happen then. Maybe people will start using advanced FPGAs to reverse engineer the tape format and read the signals off, but the amount of effort to do that would be astronomical, far more even than the amazing effort the author here went to.

@readyplayernull 6d
A few months ago I was looking for an external backup drive and thought that SSD would be great because it's fast and shock resistant. Years ago I killed a MacBook Pro HD by throwing it on my bed from a few inches high. Then I read a comment on Amazon about SSDs losing information when unpowered for a long time. I couldn't find any quick confirmation on the product page; it took me a few hours of research to find some paper about this phenomenon. If I remember correctly it takes a few weeks for the stored SSD to start losing its data. So I bought a mechanical HD.

Another tech tip is not buying 2 backup devices from the same batch or even the same model. Chances are they will fail in the same way.

@FearNotDaniel 6d
> the ADR-50e drive was advertised as compatible, but there was a cave-at

I'm assuming the use of "cave-at" means the author has inferred an etymology of "caveat" being made up of "cave" and "at", as in: this guarantee has a limit beyond which we cannot keep our promises, if we ever find ourselves AT that point then we're going to CAVE. (As in cave in, meaning give up.) I can't think of any other explanation of the odd punctuation. Really quite charming, I'm sure I've made similar inferences in the past and ended up spelling or pronouncing a word completely wrong until I found out where it really comes from. There's an introverted cosiness to this kind of usage, like someone who has gained a whole load of knowledge and vocabulary from quietly reading books without having someone else around to speak things out loud.

@phkahler 6d
>> The tape was the only backup for those things, and it completes Frogger 2's development archives, which will be released publicly.

In cases like this I can imagine some company yelling "copyright infringement" even though they don't possess a copy themselves. It's a really odd situation.

@aidenn0 5d
TIL there are three completely different games named "Frogger 2". I assumed this was for the 1984 game, but this is for the 2000 game (there is also a 2008 game).
@caycep 5d
At some point, I feel as if it may be easier just to rewrite the code from the ground up vs. going through all that computational archaeology....

Or in a few years, just have an AI write the code...

@dabiged 5d
I work in the tape restoration space. My biggest piece of advice is never NEVER encrypt your tapes. If you think restoring data from an unknown format tape is hard, trying to do it when the drive will not let you read the blocks off the tape without a long lost decryption key is impossible.
@userbinator 5d
As the other comment here says, any company claiming to do data recovery, and damaging the original media to that extent, should be named and shamed. I can believe that DR companies have generic drives and heads to read tapes of any format they come across, but even if they couldn't figure out how the data was encoded, there was absolutely no need to cut and splice the tape. I suspect they did that just out of anger at not likely being able to recover anything (and thus having spent a bunch of time for no profit.)

Melted pinch rollers are not uncommon, and there is plenty of other (mostly audio) equipment with similar problems and solutions --- dimensions are not absolutely critical and suitable replacements/substitutes are available.

As an aside, I think that prominent "50 Gigabytes" capacity on the tape cartridge, with a small asterisk-note at the bottom saying "Assumes 2:1 compression", should be outlawed as a deceptive marketing practice. It's a good thing HDD and other storage media didn't go down that route.

@kookamamie 5d
Someone was wise enough to erase the evidence in Party.
@sydbarrett74 5d
This is a masterful recovery effort. The README should be shared as an object lesson far and wide to every data restoration and archival service around.
@dark-star 5d
I'm pretty sure that even with the substantial damage done by the recovery company, a professional team like Kroll Ontrack can still recover the complete tape data, although it probably won't be cheap.
@Const-me 5d
CD-R drives were already common in 2001: https://en.wikipedia.org/wiki/CD-R

I wonder whether a CD-R disc would have retained its data for these 22 years.