General Discussion
US flights grounded because engineer accidentally 'replaced one file with another'
With the Federal Aviation Administration's Notice to Air Missions, or NOTAM, system back up and running, staffing remains high and systems monitoring is at an urgently high level this morning, a senior official told ABC News Thursday.
Computer traffic on the NOTAM system is at super-high levels as airlines, pilots and airports start the day with normal flight operations while also trying to make up for delays and cancellations yesterday. At the same time, public and media computer traffic on the NOTAM system is running high because of global interest in the antiquated system that crashed on Wednesday.
The ground stop order that paused all airplane domestic departures and the FAA systems failures Wednesday morning appear to have been the result of a mistake that occurred during routine scheduled systems maintenance, according to a senior official briefed on the internal review.
An engineer "replaced one file with another," the official said, not realizing the mistake was being made. As the systems began showing problems and ultimately failed, FAA staff feverishly tried to figure out what had gone wrong. The engineer who made the error did not realize what had happened.
https://www.msn.com/en-us/news/us/us-flights-grounded-because-engineer-accidentally-replaced-one-file-with-another/ar-AA16glMc
CurtEastPoint
(18,664 posts)
Effete Snob
(8,387 posts)
Torchlight
(3,361 posts)
Blues Heron
(5,944 posts)no wonder stuff stopped working! (kidding)
GoCubsGo
(32,094 posts)Too bad the "Drown the Government in a Bath Tub" crowd is too busy trying to blame Pete Buttigieg for something they have been deliberately neglecting for DECADES to do that.
lapfog_1
(29,226 posts)an architecture failure...
Someone should have constructed the software to make replacing a critical file nearly impossible... we typically do that in Unix / Linux systems by making such files "immutable" so that to change the contents one has to first issue a command to remove the restriction and then copy the new file over the old file. This prevents accidental replacement. In addition, if the file is truly critical, the architect should have designed the software to check the contents to ensure that the new file contents are in an "acceptable" format before resolving to use them. This can be done many ways... but at the very least the new data should not cause the software to fail completely.
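A minimal sketch of the "check the contents first" idea, in Python. Everything here is an assumption for illustration: the JSON layout, the `notices` key, and the function names `validate` and `install` are hypothetical and have nothing to do with how the actual NOTAM system is built. The immutable flag itself would be set outside the program (on Linux, typically with `chattr +i`); this only shows validating a candidate file and swapping it in atomically so a bad file never replaces the good one.

```python
import json
import os
import tempfile


def validate(raw: bytes) -> dict:
    """Parse candidate contents and reject anything not in the expected shape.

    The expected shape (a JSON object with a 'notices' list) is hypothetical.
    """
    data = json.loads(raw)  # raises ValueError on malformed input
    if not isinstance(data, dict) or not isinstance(data.get("notices"), list):
        raise ValueError("expected a JSON object with a 'notices' list")
    return data


def install(live_path: str, candidate: bytes) -> None:
    """Validate first, then swap the new file in atomically.

    If validation fails, the live file is never touched.
    """
    validate(candidate)
    directory = os.path.dirname(os.path.abspath(live_path))
    # Write to a temp file in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(candidate)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_path, live_path)  # atomic rename over the live file
    except BaseException:
        # Clean up the temp file if anything went wrong before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

With something like this, calling `install(path, b"not json")` raises `ValueError` and leaves the previous file in place, which is the "new data should not cause the software to fail completely" part.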
In the Object Storage world there is even the ability to require two different user accounts to change the contents of the object... i.e. the person with the authority to replace the object with a different object is NOT the same person / account needed to make the object even changeable. This is sort of like a nuclear launch sequence. Two people each with their own keys are required to make a change to a critical data file / object.
Not to say that even then accidents can't happen. Of course they can... but such systems make it much harder to do. Obviously, nuclear launch keys demand more attention from the operators than a pilot notification system.
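The two-key idea above can be sketched as a toy model in Python. This is not any real object store's API; the class `TwoKeyObject`, its methods, and the account names are all made up for illustration. It only shows the rule that one account has to unlock the object and a different account has to do the replacement.

```python
from typing import Iterable, Optional


class TwoKeyObject:
    """Toy model of an object that needs two different accounts to change."""

    def __init__(self, data: bytes, unlockers: Iterable[str], writers: Iterable[str]):
        self.data = data
        self._unlockers = set(unlockers)
        self._writers = set(writers)
        self._unlocked_by: Optional[str] = None  # None means locked

    def unlock(self, account: str) -> None:
        """First key: an authorized account makes the object changeable."""
        if account not in self._unlockers:
            raise PermissionError(f"{account} may not unlock this object")
        self._unlocked_by = account

    def replace(self, account: str, new_data: bytes) -> None:
        """Second key: a different authorized account replaces the contents."""
        if account not in self._writers:
            raise PermissionError(f"{account} may not write this object")
        if self._unlocked_by is None:
            raise PermissionError("object is locked")
        if self._unlocked_by == account:
            raise PermissionError("unlock and replace must come from different accounts")
        self.data = new_data
        self._unlocked_by = None  # relock after every change
```

So with `obj = TwoKeyObject(b"old", unlockers={"alice", "bob"}, writers={"bob"})`, bob cannot unlock and replace by himself; alice has to call `obj.unlock("alice")` before `obj.replace("bob", b"new")` goes through, and the object relocks afterwards.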
Ray Bruns
(4,111 posts)in the house every dime they can wring out of the budget so they can give another tax cut to the 1%.
lapfog_1
(29,226 posts)part of the problem is money... part is the horrible inefficiency with which the government acquires technology... huge RFPs and contracts that often take years to deliver (by which time the technology is already obsolete). Contractors who rip off the taxpayer with bloated contracts and little oversight. Not to mention that growth in the scale of activity is simply not planned for in the requirements... Things are still in use 20 or 30 years (or even longer) after they were first designed... and often tracking or doing things for 10x or 20x the activity originally called for in the RFP.
I was on a committee that did a review of other agency technologies in the federal government... OMG there was stuff that absolutely should not be in use anymore at the time of the review.
jmowreader
(50,562 posts)My unit in Berlin had requested a new system to do a very important function and by the time they were ready to start making it, the unit had closed.
Jim__
(14,083 posts)I think most up-to-date systems have implemented the type of checks you're citing.
lapfog_1
(29,226 posts)and implemented in either the 1980s or 1990s.
My over / under guess is 1985.