The importance of data storage security

In 2018, the cloud computing market grew rapidly but also ran into problems. Tension between cloud providers and the open source community continued to escalate, and none of the mainstream cloud vendors escaped outages: several service interruptions occurred over the year.

Google Cloud:

On January 18, 2018, Google’s cloud automation mechanism failed, resulting in a 93-minute outage of Compute Engine in two regions, us-central1 and europe-west3. Google’s explanation was that the Autoscaler service was not working properly because of a “network programming failure”, which meant that new or newly migrated virtual machines could not communicate with virtual machines in other zones.

Action taken: The engineering team manually switched to the replacement task to restore the data persistence layer to normal operation.

Downtime: 93 minutes

Follow-up: Google promised that in the future it will stop virtual machine migrations if the configuration data becomes outdated, and that the data persistence layer will re-resolve its peers during long-running processes so that it can switch quickly to a replacement task in the event of a failure.

AWS Cloud:

In the early morning of March 2, 2018, some Alexa services hosted on AWS began losing voice responses, and the indicator light on smart speakers kept flashing to signal that the service was interrupted. The fault reportedly stemmed from a problem with Amazon AWS network services. Other applications that rely on those services were also affected that day, including cloud communications company Twilio and software development company Atlassian.

Action taken: Amazon AWS’s online support team intervened manually and fixed the problem.

Downtime: several hours

Follow-up: Amazon AWS did not elaborate on this failure, only revealing that it was related to network connectivity.

On May 31, 2018, AWS again experienced connectivity issues, this time because of a hardware failure at a data center in the Northern Virginia region. In this incident, AWS’s core EC2 service, the WorkSpaces virtual desktop service and the Redshift data warehouse service were all affected.

Action taken: Manual repair

Downtime: about 30 minutes

Follow-up: Mai-Lan, vice president and general manager of Amazon S3, said in an interview that Amazon has never seen an entire data center go down. In other words, no past incident has taken out a whole data center, and AWS has made improvements at the system design level to prevent such incidents from happening.

In July 2018, Amazon launched a 36-hour members’ promotion around the world. Just as the event began, the Amazon website and app suffered a serious outage at the same time, which not only halted the e-commerce business but also affected other Amazon products and services to varying degrees. AWS described this as a global issue with the AWS management console.

Downtime: about 6 hours

Follow-up: An AWS spokesperson said the intermittent AWS management console issue did not have any meaningful impact on Amazon’s consumer business.

On November 9, node pools in the Kubernetes service (GKE) provided by Google’s public cloud behaved abnormally, and operations staff could not create new cloud nodes through the Console UI.

Action taken: Manual intervention and repair. According to Google, affected enterprise users could instead use the gcloud command provided with GCP to create new Kubernetes nodes.

Downtime: nearly 19 hours

Alibaba Cloud: According to incomplete statistics, Alibaba Cloud has suffered a fairly serious outage almost every year in recent years:

At around 16:20 on June 27, 2018, Alibaba Cloud suffered a major technical failure and began to recover at 16:50. Officially the failure lasted about 30 minutes, and full recovery took about an hour. After a technical review, Alibaba attributed the failure to a change verification operation performed while the engineering team was launching a new automated operation and maintenance feature. The operation triggered an unknown bug that had not appeared in the test environment.

Action taken: Manual intervention to locate and resolve the problem.

Downtime: 30 minutes, with recovery taking about an hour

Follow-up: The incident was classified as a failure of important functions of the core business, affecting a large number of users and causing heavy losses. Alibaba Cloud later issued an official statement, saying, “For this failure, there is no excuse. We cannot and should not make such a mistake! We will seriously review and improve our automated operation and maintenance technology and our release verification process, and treat every line of code and every commitment with reverence.”

In the early morning of March 3, 2019, Alibaba Cloud suffered a sudden large-scale outage that affected many areas in North China and paralyzed the apps and websites of many enterprises. Because the incident happened in the middle of the night, some netizens joked that “a large wave of programmers and operations staff were forced to climb out of their warm beds and walk into the cold northern night against a level 5 wind.” As for the cause, Alibaba Cloud’s official response said that “some ECS servers and other instances in Availability Zone C of the North China 2 region experienced IO hang and gradually recovered after emergency investigation and handling.” The incident ended with Alibaba Cloud offering a compensation plan, but it left a lingering question: as cloud services are applied ever more widely, even a short outage at a single cloud vendor has an unprecedented impact, so does the financial maxim “don’t put all your eggs in one basket” apply to the cloud as well?
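The “eggs in one basket” question can be made concrete with a little arithmetic. The sketch below is only an illustration under loud assumptions: the 99.9% availability figure is hypothetical (it is not taken from any vendor mentioned above), and outages at different providers are assumed to be statistically independent, which real incidents do not always respect.

```python
# Hypothetical figures: each provider is assumed to be up 99.9% of the time,
# and provider outages are assumed to be independent of one another.
HOURS_PER_YEAR = 365 * 24


def combined_availability(availabilities):
    """Probability that at least one provider is up, assuming independent failures."""
    all_down = 1.0
    for a in availabilities:
        all_down *= (1.0 - a)  # chance that this provider is down too
    return 1.0 - all_down


def expected_downtime_hours(availability):
    """Expected hours per year during which the service is unavailable."""
    return (1.0 - availability) * HOURS_PER_YEAR


single = 0.999  # assumed availability of one provider
dual = combined_availability([single, single])

print(f"one provider:  {expected_downtime_hours(single):.2f} h/year")
print(f"two providers: {expected_downtime_hours(dual):.4f} h/year")
```

Under these assumptions, serving the same data from two providers shrinks the expected downtime by roughly three orders of magnitude. The price is the extra cost and complexity of keeping both copies in sync.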

Azure:

On June 17-18, 2018, Azure storage and network outages occurred due to a temperature control system failure in a data center in Ireland.

Downtime: more than 5 hours

On the morning of September 4, 2018, severe weather including thunderstorms struck near the Microsoft Azure South Central US region data center, affecting the voltage supplied to the cooling system. This caused connectivity issues in several Azure availability zones and made it difficult for customers to access data stored in the data centers in the area. The affected services included Active Directory, Visual Studio Online, Office 365, Visual Studio Team Services, and others.

Action taken: By the morning of September 5, Microsoft engineers had restored power and most network equipment in the data center, and other services were being restored.

Downtime: more than 24 hours

Tencent Cloud:

On July 24, 2018, users logging in to Tencent Cloud were repeatedly timed out and logged off, and switching network operators made no difference. Tencent Cloud subsequently issued a notice saying that, according to its initial assessment, an operator’s cable had been cut, and that the operator had found the break point and was reconnecting it. The users affected were mainly in the Guangzhou region.

Action taken: The operator stepped in to carry out the repair.

Downtime: unknown, with recovery taking about 40 minutes

On August 5, 2018, Beijing Qingbo Numerical Control Technology Co., Ltd. published a post entitled “The disaster brought by Tencent Cloud to a startup company” on its official Weibo account. The post said that on July 20, 2018, Tencent Cloud storage failed (Tencent later gave the reason for the failure), resulting in the complete and unrecoverable loss of the company’s data. The startup valued the lost platform data at nearly 10 million yuan, including registered users and content data accumulated through long-term promotion and traffic acquisition.

Action taken: Tencent said it informed users of the fault status as soon as it detected the anomaly, and brought in its own storage experts and the storage manufacturer’s technical experts to try to repair the data. However, after many attempts, some data failed integrity checks and could not be recovered.

Follow-up: Tencent proposed a follow-up plan of “indemnity + compensation”, and promised to stay in communication with “Frontier CNC” to help its business recover.

So are there other ways to make data more secure and to protect privacy better?

IPFS (the InterPlanetary File System) is a peer-to-peer distributed file system. It is an underlying Internet protocol that aims to replace HTTP. HTTP is the protocol behind the centralized server model we use on the Internet today, as offered by Alibaba Cloud, Tencent Cloud and so on.

The IPFS protocol has been online for 4 years. Because of its design, server nodes are more decentralized, and its proponents say that user data is never lost, that the network never goes down, and that data transmission is many times faster than with centralized storage. It is also claimed to reduce user storage costs by 40-60%.

On the IPFS network, our data is distributed as widely as possible across different computers. Pieces of the data end up on many machines whose owners are complete strangers to one another.

Data stored by a customer is broken into a number of fragments that are sent to the hard disks of different storage miners, which maximizes the decentralization of storage. Compared with centralized storage, decentralized storage undoubtedly has the advantage in terms of data privacy.
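The fragmentation described above can be illustrated with a toy sketch. It is not how IPFS is actually implemented: the chunk size, the node names and the placement rule are all hypothetical, and a plain SHA-256 hex digest stands in for a real content identifier. It only shows the general idea of splitting data, naming each fragment by its hash, spreading the fragments over several nodes and verifying them on the way back. Note that the sketch does not encrypt anything.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical and deliberately tiny so the demo produces several fragments


def split_into_chunks(data, chunk_size=CHUNK_SIZE):
    """Cut the data into fixed-size fragments (the last one may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def content_id(chunk):
    """Name a fragment by the SHA-256 hash of its bytes (a stand-in for a real CID)."""
    return hashlib.sha256(chunk).hexdigest()


def distribute(data, nodes, chunk_size=CHUNK_SIZE):
    """Return (manifest, placement): the ordered fragment ids and what each node stores."""
    placement = {node: {} for node in nodes}
    manifest = []
    for chunk in split_into_chunks(data, chunk_size):
        cid = content_id(chunk)
        manifest.append(cid)
        # Hypothetical placement rule: pick a node from the hash value.
        node = nodes[int(cid, 16) % len(nodes)]
        placement[node][cid] = chunk
    return manifest, placement


def reassemble(manifest, placement):
    """Collect fragments from all nodes, verify each hash, and rebuild the data."""
    lookup = {}
    for stored in placement.values():
        lookup.update(stored)
    parts = []
    for cid in manifest:
        chunk = lookup[cid]
        if content_id(chunk) != cid:
            raise ValueError("fragment failed its integrity check")
        parts.append(chunk)
    return b"".join(parts)


data = b"family photos and videos"
manifest, placement = distribute(data, ["miner-a", "miner-b", "miner-c"])
print(len(manifest), "fragments stored")
print(reassemble(manifest, placement) == data)
```

Because each fragment is named by its hash, a node that returns altered bytes is detected when the data is reassembled, and no single node needs to hold the whole file.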

One IPFS application that should not be missed is TimeBook, a product that combines a real-world blockchain application with an IPFS decentralized storage project:

Starblue TimeBook is a data center designed for the home. Through the app, you can access photos, videos, music and more anytime and anywhere, share them among multiple people, and keep a private space of your own with privacy protection. You no longer need to worry about running out of storage on your phone, about privacy leaks, or about losing precious data such as photos.

The Starblue TimeBook uses Intel processors, is built on a heavily customized Linux system, and uses the NVMe transfer protocol, dual-channel DDR4 2400 memory and an M.2 solid state drive with read and write speeds of up to 2200-3200 MB/s. It is an ultra-high configuration designed for distributed storage mining. In distributed storage mining, users share hard drive space and bandwidth in exchange for token incentives, which differs from Bitcoin’s computing power mining.
