General Discussion
What the Huge AWS Outage Reveals About the Internet
https://www.wired.com/story/what-that-huge-aws-outage-reveals-about-the-internet/#:~:text=Problems%20began%20around%203%20am,additional%20time%20to%20fully%20process.%E2%80%9D
"Amazon Web Services experienced DNS resolution issues on Monday morning, taking down wide swaths of the web and highlighting a long-standing weakness in the internet's infrastructure.
A massive cloud outage stemming from Amazon Web Services' key US-EAST-1 region, its hub in northern Virginia, near the US Capitol, caused widespread disruptions of websites and platforms around the world on Monday morning. Amazon's main ecommerce platform and other properties, including Ring doorbells and the Alexa smart assistant, suffered interruptions and outages throughout the morning, as did Meta's communication platform WhatsApp, OpenAI's ChatGPT, PayPal's Venmo payment platform, multiple web services from Epic Games, multiple British government sites, and many others.
The outages stemmed from Amazon's DynamoDB database application programming interfaces in US-EAST-1, and AWS said in status updates that the problem was specifically related to DNS resolution issues. The domain name system is a foundational internet service that essentially acts as an automatic phonebook lookup to translate web URLs like www.wired.com into numeric server IP addresses so web browsers show users the right content. DNS resolution issues occur when DNS servers aren't accurately connecting these dots and, to keep with the phonebook analogy, are providing the wrong numbers for a given name, or vice versa.
An AWS spokesperson did not immediately respond when asked for details about the nature of the failure. DNS resolution issues can be malicious (known as DNS hijacking), but there is no indication that Monday's AWS outages were nefarious.
"When the system couldn't correctly resolve which server to connect to, cascading failures took down services across the internet," says Davi Ottenheimer, a longtime security operations and compliance manager and a vice president at the data infrastructure company Inrupt. "Today's AWS outage is a classic availability problem, and we need to start seeing it more as data integrity failure."
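For anyone who wants to see the phonebook analogy in code, here is a minimal Python sketch. It assumes nothing about how AWS or DynamoDB work internally: `resolve` and `connect_target` are made-up helper names for illustration, and the only real library call is `socket.gethostbyname` from the standard library. The point is just that when the name lookup fails, nothing downstream has an address to connect to.

```python
import socket


def resolve(hostname, lookup=socket.gethostbyname):
    """Return the IPv4 address for hostname, or None if the lookup fails.

    `lookup` is a hypothetical hook so a fake resolver can be swapped in;
    by default it is the standard library's socket.gethostbyname.
    """
    try:
        return lookup(hostname)
    except socket.gaierror:
        # The "phonebook" had no usable answer for this name.
        return None


def connect_target(hostname, lookup=socket.gethostbyname):
    """Describe what a client would do next, depending on the lookup result."""
    address = resolve(hostname, lookup)
    if address is None:
        # Without an address, every later step (connect, request) is blocked.
        return f"cannot reach {hostname}: name did not resolve"
    return f"connect to {address}"
```

Calling `connect_target("www.wired.com")` on a machine with working DNS would report an address to connect to; with a resolver that fails, it reports that the name did not resolve, which is roughly the position dependent services were in.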
Initech
(108,932 posts)
When that company goes down, it will affect yours too!
CentralMass
(16,992 posts)
Initech
(108,932 posts)
CentralMass
(16,992 posts)
slightlv
(7,824 posts)
It seems the bigger the company, the less likely it is to follow safe practices when it updates: no backup made of the original files and database, no trial run on a "test" server running the same software as the one needing the update, etc. They just apply the update and hope for the best!
ancianita
(43,312 posts)
This kind of problem is more likely with massive-scale global computing. AWS's customer base includes entities like
NASA, the CIA, more than 80% of Germany's listed DAX companies, the U.S. Navy, DISH Network, GCHQ, MI5, MI6, and the Ministry of Defence. Since 2022, Amazon has shared a $9 billion contract from the United States Department of Defense for cloud computing with Google, Microsoft, and Oracle.
https://en.wikipedia.org/wiki/Amazon_Web_Services
That doesn't mean there are no nefarious occurrences. There are, just not this time.
If there were, we wouldn't know about them anyway, because a lot of its customers operate under national security classification rules.
While WIRED gives a bit of a "reveal" here, a bigger and more serious "reveal" about the Internet won't become public knowledge until after something has happened (as in this case), and more likely than not, never.
DNS attacks are increasing mostly because of AI use, which tells you that AI isn't in good customer hands and is itself causing large-scale computing problems at an exponential rate.
https://www.csoonline.com/article/4055796/why-domain-based-attacks-will-continue-to-wreak-havoc.html
hunter
(40,758 posts)
Big brother is watching you.
--or--
???
lostnfound
(17,544 posts)
CentralMass
(16,992 posts)
hunter
(40,758 posts)
haele
(15,453 posts)
And now everything just went to shit... looks like the server's trying to randomly auto complete domain names on its own whenever it goes to retrieve them..."
That's what I think happened....
Mr. Evil
(3,459 posts)
looks like George Soros is at it again!
(yeah, yeah I know, just in case)
BattleRow
(2,544 posts)
fujiyamasan
(1,863 posts)
This kind of thing can happen with any single point of failure, and it sounds like it started cascading.
I wonder if it will get IT departments to reconsider their strategy of having only one big cloud provider handle everything. They may go multi-cloud (to avoid vendor lock-in) or hybrid, though I doubt many companies want to reopen data centers again (huge capital expenditures on the books, and it's a lot of infrastructure to maintain). The reality is that cloud computing offers too many benefits, without having to invest as much in hardware.
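To make the single-point-of-failure idea concrete, here is a rough Python sketch of what "multi-cloud" can mean at the simplest level: try one provider, and fall back to another if it is unreachable. Everything here is hypothetical, including the `first_healthy` name and the idea that each provider is just a callable that raises `ConnectionError` when it is down. Real failover also has to deal with data replication, DNS, and cost, none of which this touches.

```python
def first_healthy(providers, request):
    """Send request to the first provider that responds.

    `providers` is a list of (name, call) pairs, where `call` is any
    function that handles the request or raises ConnectionError.
    Both the shape of this list and the error type are assumptions
    made for the sketch.
    """
    errors = []
    for name, call in providers:
        try:
            # Return as soon as one provider answers.
            return name, call(request)
        except ConnectionError as exc:
            # Remember why this provider failed and try the next one.
            errors.append(f"{name}: {exc}")
    # With a single provider, one failure lands here immediately.
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

With only one entry in `providers`, a single outage raises the error straight away, which is the single point of failure in miniature. A second entry gives the request somewhere else to go.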