
Backseat Driver

(4,377 posts)
Fri Sep 11, 2020, 12:36 PM

Dozens of scientific journals have vanished from the internet, and no one preserved them

https://www.sciencemag.org/news/2020/09/dozens-scientific-journals-have-vanished-internet-and-no-one-preserved-them

By Jeffrey Brainard, Sep. 8, 2020, 4:10 PM

Eighty-four online-only, open-access (OA) journals in the sciences, and nearly 100 more in the social sciences and humanities, have disappeared from the internet over the past 2 decades as publishers stopped maintaining them, potentially depriving scholars of useful research findings, a study has found.

An additional 900 journals published only online also may be at risk of vanishing because they are inactive, says a preprint posted on 3 September on the arXiv server. The number of OA journals tripled from 2009 to 2019, and on average the vanished titles operated for nearly 10 years before going dark, which “might imply that a large number … is yet to vanish,” the authors write. [snip]



SWBTATTReg

(22,044 posts)
1. I doubt in the age of the Internet, the digital age, that these documents have really disappeared.
Fri Sep 11, 2020, 12:40 PM

And besides, if they were worth something, or presented facts worth saving, they would have been saved/preserved. Nothing truly disappears in the digital world.

CloudWatcher

(1,842 posts)
4. Not true at all, but I wish I agreed with you!
Fri Sep 11, 2020, 09:39 PM

I've been using the Arpanet/Internet since the 1970's, and in spite of archive.org and the wayback machine, an enormous amount of digital data is lost all the time ... and it's not just the garbage.

As Jeff Rothenberg famously said: Digital objects last forever -- or five years, whichever comes first.

He's got much less hair now than when I knew him:



SWBTATTReg

(22,044 posts)
6. I still believe it's somewhere, one just doesn't know where it is. And I've been involved since the
Sat Sep 12, 2020, 12:48 PM

mid 70s. Just because one can't find it in a particular place doesn't mean anything, especially w/ the advent of archival services, etc.

There are some companies that I'm aware of that take a snapshot of everything once every week (in their own operations). These backups are written to magnetic tape, where vast amounts of data can be stored for long periods of time. The tapes are then retained for ungodly amounts of time, some I am aware of for 40+ years (retention periods vary depending upon the material).

Personally, I think the government may do this too, if only to monitor us and our traffic. This is why I think that somewhere, somebody out there is monitoring/recording/watching everything. I know it's a little far-fetched, but I am paranoid and I truly don't believe that government/etc. is so benevolent.

CloudWatcher

(1,842 posts)
7. backups ...
Sat Sep 12, 2020, 04:28 PM

I've spent way too long doing IT. Including managing the people that did backups of our computers. Yes, we used tapes. A lot of tapes. And sometimes we'd make archive tapes that we'd keep for many years. But the vast majority of the tapes were in a 'pool' that were reused after just a few months. We couldn't afford to buy (and store) tapes forever. Our focus was to be able to recover from computer failures or quickly-observed human error. Saving copies of everything for years just wasn't practical.
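The pool-versus-archive split described above can be sketched in a few lines of Python. This is only an illustration, not anyone's actual backup system: the `Tape` record, the `reusable_tapes` helper, the tape labels, and the 90-day retention window are all assumptions made up for the example ("just a few months" is the only figure given in the post).

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Tape:
    label: str            # hypothetical tape label
    written_on: date      # date of the last backup written to this tape
    archive: bool = False  # archive tapes are never returned to the pool


def reusable_tapes(tapes, today, retention_days=90):
    """Return labels of pool tapes whose retention window has expired."""
    cutoff = today - timedelta(days=retention_days)
    return [t.label for t in tapes if not t.archive and t.written_on <= cutoff]


tapes = [
    Tape("POOL01", date(2020, 3, 1)),
    Tape("POOL02", date(2020, 8, 15)),
    Tape("ARCH01", date(2015, 1, 1), archive=True),
]

# POOL01 is older than 90 days and may be overwritten; POOL02 is still
# inside the window; ARCH01 is kept no matter how old it gets.
print(reusable_tapes(tapes, today=date(2020, 9, 12)))
```

The point of the sketch is the one made in the post: anything that was only ever on a pool tape is gone once that tape is overwritten, so old data survives only if someone deliberately flagged it for archiving.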

A company's retention policies can vary a lot. I've seen some go from "pack rat" to "lean" as the result of having to respond to a single legal discovery action. It seems most people still haven't figured out to be careful with what they put in email.

Yes, it's well known that the NSA loves to vacuum up enormous amounts of data and save it as long as they can. See Utah Data Center and The Puzzle Palace. But even assuming they have access to internal (within a company) communications, they would need some way of getting the data back to them. And while the Internet can be fast, bandwidth is expensive, and companies don't tend to buy more than they need. One of the jobs of a decent IT group is to keep an eye on Internet usage, and if the NSA (or hackers) suck up a lot of data it stands out like a red flag.

Btw, Rothenberg's critical insight was that while digital storage might physically be "readable" after many years, the software and hardware required to read the media is often just not available after about 5 years. I've got "archive" magtapes from the 70s that I've not been able to read in 40 years (yeah, I'm a pack rat). But I've also got tax returns from just 5 years ago that I can't open because the old tax software hasn't been upgraded to run on the current macOS.

So yes, digital data can be around for a long long time. And the government is storing as much as they technically can. But they aren't interested in saving the same things we want to save. And they're really bad at sharing what they've got.

So we still have to work to keep access to useful stuff (like online-only journals) from disappearing forever.

SWBTATTReg

(22,044 posts)
8. Yes, I have the Puzzle Palace. A good book. I haven't read about the 'Utah Data Center'
Sat Sep 12, 2020, 04:45 PM

but I'm not surprised to see that they have such a thing. More than likely they have multiple sites like it (or at least multiple backup sites). A good portion of my job was ensuring that the millions and billions of call records, both data (merging the 15 minute segments into 24 hour call record snapshots) and voice (standard AMA recordings), got recorded and then retained (often for many, many years, since they pertained to billing).

Our biggest concern was the sheer amount of DASD required to back up/store this data (which eventually aged off to mag. tape, which was then stored indefinitely). We literally had data centers (tape libraries) full of the stuff, and afterwards it all went to offsite storage (caves in the Ozarks and such places), where the data was permanently stored. A pain in the a&& but we had to do it for legal and financial reasons (the appropriate retention periods were set by our legal dept). Occasionally we had to dredge up some old files for the legal team. You can imagine that we hated these types of requests. At least they didn't happen often, but they did happen enough.

You do make a good point in that technology does change and media used beforehand becomes obsolete, and thus possibly unreadable at a future point in time. That's why we occasionally read and/or migrated such files to more recent media types, especially if we wanted to get rid of a class of devices, e.g., 9-track vs. 18-track, silo drives, etc.
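For anyone doing the same kind of media migration today with ordinary files, a rough Python sketch of "copy to the new media, then prove the copy is good" might look like the following. It is not how the mainframe shop above did it; the `migrate` and `sha256_of` helpers are hypothetical names, and plain directories stand in for the old and new media.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate(old_root, new_root):
    """Copy every file under old_root to new_root.

    Returns the relative paths whose copy did not match the original.
    """
    old_root, new_root = Path(old_root), Path(new_root)
    failed = []
    for src in sorted(p for p in old_root.rglob("*") if p.is_file()):
        dst = new_root / src.relative_to(old_root)
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also carries over timestamps
        if sha256_of(src) != sha256_of(dst):
            failed.append(str(src.relative_to(old_root)))
    return failed
```

An empty list back from `migrate` only says the bits arrived intact. It says nothing about whether any current software can still open them, which is a separate problem.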

Ah, technology, you got to love it, eh? I did get tired of changing our JCL all of the time to reflect the new tape drives etc. (or other new class of storage media, oftentimes we had to change JCL for stuff that may or may not have been run for years literally).

CloudWatcher

(1,842 posts)
9. JCL!
Sat Sep 12, 2020, 06:38 PM

And of course ... migrating files to newer media is only the first step. All too often you needed to rewrite the software as well. OS's and languages are far from static, and just keeping something working for more than a few years can be a major pain. Combined with companies going out of business, it's no real mystery why so much has been lost to time.

But now I'm getting nostalgic over JCL! My first job in college was helping anyone that asked with their Fortran, PL/1 and JCL problems. I clearly remember the feeling of awe when I learned 360 assembler and that you could use the "open type=j" macro to avoid having to specify filenames in the DD cards ... how cool!

Lol, IEBGENER! IEFBR14! ABEND 0C4! Those were the days

SWBTATTReg

(22,044 posts)
10. Ha ha hah...those were heady days my friend. In one of my jobs at the ...
Sat Sep 12, 2020, 08:31 PM

company, it was teaching IBM utilities (and non IBM utilities, panvalet, etc.), languages such as cobol, pli, etc., and misc. such as jcl, etc. to our new hires coming in (a 16 week all day long class, it was exhausting).

Jesus, the IBM 360 compiler, all of those buzz words, etc. brings back memories!! It seems like eons ago!

Thanks for the fond trip down memory lane, and take care of yourself!

csziggy

(34,131 posts)
11. Yeah, even in the 38 years I've been using computers, technology has changed
Sun Sep 13, 2020, 03:52 PM

I lost access to all the digital data I had when I converted from my original Apple ][ (acquired in 1982) to a PC clone in 1987, except for hard copies I printed out of business data. Since going to MS-DOS, then Windows, I have worked hard to copy my data from old technology to new - with my 1993 Windows 3.1 computer, I had both 5.25" and 3.5" floppy drives and copied the data over to the smaller disks.

When I moved to Windows 98 in 1999, I got a CD-Writer (I totally skipped zip drives) and copied all my data from floppies to CDRs - all of which are still readable, though that data is now on backup hard drives. With my next computer and backup drives I plan to move to all SSDs.

My Dad started with a Commodore 64 and kept his old Commodore floppies for decades. At one point I had the technology to copy his data to Windows-readable floppies, but he never would let them out of his possession. With his last computer, my sisters and I convinced him to discard those old floppies - some had gotten so bad the magnetic material was flaking off the discs. When he got a CD-Writer he did not listen to me and used CDRWs the same way he used floppies. Not one of the CDRWs he burned was fully accessible, since he never remembered to copy the old data into the new sessions and he wouldn't close them so they would be readable in a different system.

Since Dad gave me his old computers and I saved the hard drives out of them (from the drive on his 285 MS-DOS machine all the way to his last computer with Windows 98 - he died in 2013), I actually ended up with copies of much of his old data that was still on the drives, even when the boot sectors were trashed. Everything that is still relevant is now copied to CDR and/or sorted into folders on hard drives.

Considering how complicated it can be to keep personal data preserved, trying to keep up with massive numbers of online journals and other sources is daunting. Back when AOL was closing down the old CompuServe forums, I copied a huge amount of info from them. I'm not sure those records are still accessible. I should check since they might be of interest to someone someday!

eppur_se_muova

(36,246 posts)
2. Damned hard to list those on your cv.
Fri Sep 11, 2020, 01:01 PM

I think some of the research I did as a postdoc may have been later published online, but have no idea where.

NNadir

(33,449 posts)
5. Well, we seem to have misplaced Archimedes' discovery of calculus for about 1800 years or so...
Sat Sep 12, 2020, 07:54 AM

...but it all worked out in the end, didn't it?
