SEATTLE – Amazon on Thursday blamed an employee who entered a command “incorrectly,” removing a larger set of servers from service than intended and producing a five-hour chain-reaction outage on its widely used Amazon Web Services platform.
That meant that certain critical systems had to be rebooted — and while they were restarting, Amazon’s S3 wasn’t working as normal.
Sites including SoundCloud, MyEmma, and even the US Securities and Exchange Commission were down for much of Tuesday afternoon.
Amazon S3 is used by developers to store files, so some websites may have stayed up but were slow to load or couldn’t load any images.
At one point the “dashboard,” where Amazon tells its users which of its services are operational, wasn’t working because of the S3 issue.
The issue wasn’t an “outage” because the entire system wasn’t down, only some services. Amazon said that it has added new safeguards so that it won’t happen again.
Amazon CEO Jeff Bezos had this to say:
“We want to apologize for the impact this event caused for our customers. While we are proud of our long track record of availability with Amazon S3, we know how critical this service is to our customers, their applications and end users, and their businesses. We will do everything we can to learn from this event and use it to improve our availability even further.”
Yan Ness, CEO of Online Tech, a private Internet network service with five data centers in Michigan and Indiana, said in a column that companies that rely exclusively on public Internet service do so at their own peril.
To read the rest of his comments, click https://dev.mitechnews.com/news/online-tech-ceo-aws-failure-provides-reality-check-public-cloud-networks/




