Securing Personal Data for AI/ML Computing in the Cloud
In this presentation, Serge Vilvovsky, a Cyber Security and Big Data Engineer for MIT Startup Exchange, discusses some of the obstacles and solutions to protecting private and sensitive data in the cloud. Cloud storage is often paired with cloud computing power, which presents a different set of challenges. Together, however, cloud storage and cloud computing form a powerful tool that companies are eager to take advantage of.
As we have seen, and perhaps been personally affected by, data breaches are occurring now more than ever. In 2018 there were over 12,000 data breaches, a 424% increase compared to 2017. Of these data breaches, almost two thirds of them can be categorized as errors made by individual users or traced back to internal administrators. In other words, data breaches are overwhelmingly coming from either inside the organization that is breached or from the customers of the organization itself.
“There is more data and more private data today than ever before. Taking the right precautions to keep private data private is a growing concern.”
The amount of data being collected is increasing tenfold every five years. Genomic data may be increasing as much as 200-fold every five years. This enormous amount of data cannot be stored in traditional ways, for reasons of both cost and security. Companies are turning to cloud computing and storage as a much cheaper alternative to investing in the necessary software and hardware themselves. Companies such as Amazon, Microsoft, and Google all offer these services on a pay-as-you-go basis. As the quantity of data increases, the risk of breaches naturally increases with it. Of all data breaches that occurred, 24% cost companies on average more than $500,000. From a financial as well as a brand image perspective, securing your company’s data has become a top priority in today’s world.
As the demand for storage and processing power increases, companies have turned to cloud storage and processing as a solution. As Serge points out, there are different approaches that companies can take to protect private and sensitive data. The first is server-side encryption, using tools similar to those offered by Amazon and Microsoft. This encrypts data so that the system administrators of the cloud storage company do not have access to the private information.
However, this does not prevent the company’s own internal system administrators from accessing the information. Server-side encryption does support machine learning and AI tools, because the users who need to run these algorithms hold the key access required to decrypt the data. It operates by creating user policies and groups that define different levels of access to different buckets on the cloud storage system. Problems and breaches can result from a mistake as simple as placing a file in the wrong bucket, exposing it to users or access levels that should never have seen it.
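To make the bucket mistake concrete, here is a minimal sketch in Python of group-based bucket access. Every name in it (the groups, users, buckets, and the file salaries.csv) is a hypothetical stand-in chosen for illustration. It models the idea only and is not the policy API of Amazon, Microsoft, or any other provider.

```python
# Hypothetical policy model: all names are illustrative, not a real cloud API.

# Which buckets each group is allowed to read.
GROUP_BUCKETS = {
    "hr-admins": {"hr-records"},
    "ml-engineers": {"training-data"},
    "all-staff": {"public-reports"},
}

# Which groups each user belongs to.
USER_GROUPS = {
    "alice": {"ml-engineers", "all-staff"},
    "bob": {"all-staff"},
    "carol": {"hr-admins", "all-staff"},
}

# Files currently stored in each bucket.
BUCKET_FILES = {"hr-records": set(), "training-data": set(), "public-reports": set()}


def can_read(user, bucket):
    """A user can read a bucket if any of their groups grants access to it."""
    groups = USER_GROUPS.get(user, set())
    return any(bucket in GROUP_BUCKETS.get(group, set()) for group in groups)


def readers_of_file(filename):
    """Return, sorted, every user who can read a bucket containing the file."""
    readers = set()
    for bucket, files in BUCKET_FILES.items():
        if filename in files:
            readers.update(u for u in USER_GROUPS if can_read(u, bucket))
    return sorted(readers)


# Intended placement: only HR administrators can read the file.
BUCKET_FILES["hr-records"].add("salaries.csv")
intended = readers_of_file("salaries.csv")   # ['carol']

# The mistake: the same file lands in a bucket that every employee can read.
BUCKET_FILES["hr-records"].discard("salaries.csv")
BUCKET_FILES["public-reports"].add("salaries.csv")
exposed = readers_of_file("salaries.csv")    # ['alice', 'bob', 'carol']
```

Nothing about the encryption changed between the two placements. The exposure comes entirely from which bucket the file sits in, which is why a single misplaced file can undo an otherwise sound policy.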
An alternative to server-side encryption is client-side encryption. This method helps prevent internal system administrators from causing a data breach. The data is encrypted first and only then uploaded to the cloud. Neither the internal system administrators nor the cloud company’s system administrators are given the decryption key. While this approach offers benefits for storing and backing up data, it makes running machine learning algorithms more difficult, because the tools do not have the decryption key required to view the data.
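The encrypt-then-upload flow can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the method described in the presentation. Python’s standard library has no AES, so the sketch builds a toy cipher from HMAC-SHA256 purely to stay self-contained; a real deployment should use a vetted authenticated-encryption library instead. The dictionary cloud_bucket is a hypothetical stand-in for a cloud storage bucket, and the record contents are made up.

```python
import hashlib
import hmac
import secrets


def _keystream(key, nonce, length):
    """Derive `length` pseudo-random bytes using HMAC-SHA256 in counter mode."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out.extend(hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest())
        counter += 1
    return bytes(out[:length])


def encrypt(enc_key, mac_key, plaintext):
    """Encrypt on the client, then append an integrity tag (encrypt-then-MAC)."""
    nonce = secrets.token_bytes(16)  # fresh random nonce for every message
    stream = _keystream(enc_key, nonce, len(plaintext))
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag


def decrypt(enc_key, mac_key, blob):
    """Verify the tag first, then recover the plaintext."""
    nonce, ciphertext, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("ciphertext failed integrity check")
    stream = _keystream(enc_key, nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))


# Hypothetical stand-in for a cloud bucket: it only ever receives ciphertext.
cloud_bucket = {}

# The keys are generated and kept on the client; they are never uploaded.
enc_key = secrets.token_bytes(32)
mac_key = secrets.token_bytes(32)

record = b"name=Jane Doe,account=0000"  # made-up sensitive record
cloud_bucket["records/0001"] = encrypt(enc_key, mac_key, record)

# Only a holder of the keys can read the record back.
recovered = decrypt(enc_key, mac_key, cloud_bucket["records/0001"])
```

The same property that protects the data also explains the machine learning difficulty: any tool running in the cloud sees only the contents of cloud_bucket, and without enc_key those bytes are unusable as training input.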
Whether private data refers to an individual’s information on their phone or a global company’s records of its employees and clients, we can all relate to the importance of keeping private information private. With major data breaches occurring more often and personal identity theft on the rise, the methods required to protect data have grown in scope and efficiency. Ignoring these risks in today’s world is asking for trouble. People like Serge are here to help protect that data and make the world a safer place, even if it is in a way that has not traditionally happened in the past.