At The Office & At Home: Why We Need Machine Learning to Prevent Data Breaches


Ed Bishop
Co-Founder & CTO
Tessian

Despite there being thousands of cybersecurity products on the market, data breaches are at an all-time high. Worse still, with workforces around the world having suddenly transitioned from offices to their homes, organizations are even more vulnerable to new and increasingly sophisticated threats.

The reason? For decades, businesses have focused on securing the machine layer — layering defenses on top of their networks, devices, and finally cloud applications. But these measures haven’t solved the biggest security problem — an organization’s own people.

In this article, I’ll outline how machine learning can help organizations solve their “people problem” and why this new human-centric approach is more important now than ever.

 

The “people problem”

While we don’t believe that employees are the weakest link, we do know that, unfortunately, to err is human. People make mistakes, break the rules, and are easily hacked.

The fact is, for the average employee facing an overwhelming workload, constant distractions, and a jam-packed schedule, cybersecurity just isn’t top of mind. Cybersecurity training can go out the window in moments of stress. Best practices, policies, and procedures can be pushed aside in favor of an easier (albeit less secure) path.

Let’s look at email. Imagine an employee has to submit a project proposal by 8 PM. Because they’re working from home, they’re eager to finish up and spend some much-needed time with their family. After finalizing the document with just 5 minutes to spare, they draft an email, and hit send.

The problem is, the email, which included revenue projections, intellectual property, and confidential client data, was sent to Jane Green, not Jane Grier.

An employee could just as easily fall for a phishing scam. It shouldn’t be a huge surprise, then, that nearly half (43%) of people say they’ve made mistakes at work that compromised cybersecurity. 

That’s why we shouldn’t leave people as the last line of defense against security threats.

But, that’s easier said than done. Why? Because no two humans are the same. We all communicate differently — and with natural language, not static machine protocols. These complexities make solving human layer security problems substantially more difficult than addressing those at the machine layer — we simply can’t codify our behavior with “if-this-then-that” logic.

But, this isn’t the only issue we face when trying to prevent human error. There’s also the issue of time. Our relationships and behaviors change and constantly evolve. We make new connections, take on new projects, and talk to different people about different things.

 

The time factor

We can use machine learning to identify normal patterns and signals, allowing us to detect anomalies in real time as they arise. This technology has allowed businesses to detect attacks at the machine layer more quickly and accurately than ever before.

One example of this is detecting when malware has been deployed by malicious actors to attack company networks and systems. By feeding a sequence of bytes from a computer program into a machine learning model, we can predict whether the program has enough in common with previously seen malware, even when the attacker has used obfuscation techniques. Like many other threat detection problems at the machine layer, this application of machine learning is arguably “standard” because of the nature of malware: a malware program will always be malware.
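As a toy illustration of why this kind of problem is stateless, the sketch below scores a program purely from its bytes. It is not how production malware classifiers work (those learn their own features from raw bytes). The byte-histogram features, the cosine similarity measure, and all function names are assumptions made for illustration. The only point is that the same bytes produce the same score, no matter when they are seen.

```python
import math
from collections import Counter


def byte_histogram(data: bytes) -> list:
    """Relative frequency of each of the 256 possible byte values."""
    counts = Counter(data)
    total = len(data) or 1  # avoid dividing by zero for empty input
    return [counts.get(value, 0) / total for value in range(256)]


def cosine(a: list, b: list) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def max_similarity(sample: bytes, known_malware: list) -> float:
    """Highest similarity between the sample and any known malware sample.

    Depends only on the bytes passed in: no history, no timestamp.
    """
    sample_hist = byte_histogram(sample)
    return max(
        (cosine(sample_hist, byte_histogram(known)) for known in known_malware),
        default=0.0,
    )
```

Nothing in this function needs to know who ran the program or what happened last week, which is exactly what makes the machine-layer case simpler than the human-layer one.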

But this method of detection won’t work for human behavior, which, as mentioned, changes over time. That’s why, in order to solve the threat of data breaches caused by human error, we need stateful machine learning.

Consider the example of trying to detect and prevent data loss caused by an employee accidentally sending an email to the wrong person. Harmless mistake, right? Not quite. Misdirected emails were the leading cause of online data breaches reported to regulators in 2019. All it takes is one clumsy mistake – like adding the wrong person to an email chain – for data to be leaked.

It’s worth mentioning that this is happening a lot more than IT leaders think. Research shows that, while IT leaders in organizations with 1,000+ employees think just 480 misdirected emails are sent every year, the actual number of misdirected emails sent is more than 800. That’s a big difference.

But how do you accurately predict whether an email is being sent to the right (or wrong) person? You need to understand, at that exact moment in time, the nature of the sender and recipient’s relationship. What do they typically discuss, and how do they normally communicate? You also need to understand the sender’s other email relationships to see if there may be a more appropriate intended recipient for this email. You essentially need an understanding of the sender’s entire history of email relationships up until that moment.
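To make the “at that exact moment in time” requirement concrete, here is a minimal sketch using the Jane Green / Jane Grier example from earlier. It assumes a hypothetical `SentEmail` record and a simple name-similarity heuristic from Python’s standard `difflib` module; a real system would weigh far more signals (content, threads, attachments), so treat this as an illustration rather than a description of any product.

```python
from dataclasses import dataclass
from datetime import datetime
from difflib import SequenceMatcher


@dataclass(frozen=True)
class SentEmail:
    """Hypothetical record of one email the sender has sent in the past."""
    sent_at: datetime
    recipient: str


def prior_count(history, recipient, as_of):
    """How many emails went to this recipient strictly before `as_of`."""
    return sum(
        1 for email in history
        if email.sent_at < as_of and email.recipient == recipient
    )


def more_likely_recipients(history, recipient, as_of, min_name_similarity=0.7):
    """Past contacts with a similar name and a stronger relationship.

    Only emails sent before `as_of` are considered, so the answer can
    change as the sender's relationships change.
    """
    current_strength = prior_count(history, recipient, as_of)
    known_contacts = {e.recipient for e in history if e.sent_at < as_of}
    candidates = []
    for other in known_contacts - {recipient}:
        similarity = SequenceMatcher(None, recipient.lower(), other.lower()).ratio()
        if similarity >= min_name_similarity and prior_count(history, other, as_of) > current_strength:
            candidates.append(other)
    return sorted(candidates)
```

If the sender has emailed Jane Grier several times and never emailed Jane Green, calling `more_likely_recipients` with "Jane Green" just before sending would surface "Jane Grier" as a likelier intended recipient. Asking the same question with an earlier `as_of`, before that relationship existed, would return nothing.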

That’s why understanding “state,” or the exact moment in time, is absolutely critical.

 

Why stateful machine learning?

With a “standard” machine learning problem, you can input raw data directly into the model, like a sequence of bytes in the malware example, and it can generate its own features and make a prediction.

As previously mentioned, this application of machine learning is invaluable in helping businesses quickly and accurately detect threats at the machine layer, like malicious programs or fraudulent activity.

But, the most sophisticated and dangerous threats occur at the human layer when people use digital channels, like email. To predict whether an employee is about to leak sensitive data or determine whether they’ve received a message from a suspicious sender, for example, we can’t simply give that raw email data to the model. It wouldn’t understand the state or context within the individual’s email history.

 

What is stateful machine learning?

Stateful machine learning allows us to look across each employee’s historical email data set and calculate important features by aggregating all of the relevant data points leading up to that moment in time. We can then pass these features into the machine learning model. The time variable makes this a non-trivial task: features now need to be calculated outside of the model itself, which requires significant engineering infrastructure and a lot of computing power, especially if predictions need to be made in real time. But failure to adopt this type of machine learning means you will never be able to truly protect your people or the sensitive data they access.
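One way to picture features being calculated outside of the model is a small piece of running state that is updated after every email and queried just before the next one is sent. The sketch below is a simplified assumption of how that could look: the `RelationshipState` class, its method names, and the three features it returns are hypothetical, and a real system would track many more signals at much larger scale.

```python
from collections import defaultdict
from datetime import datetime


class RelationshipState:
    """Hypothetical running state of who has emailed whom, updated per email."""

    def __init__(self):
        self._pair_count = defaultdict(int)    # (sender, recipient) -> emails sent
        self._sender_total = defaultdict(int)  # sender -> emails sent overall
        self._last_contact = {}                # (sender, recipient) -> last sent_at

    def features(self, sender, recipient, now):
        """Features for an email about to be sent, using only past emails."""
        key = (sender, recipient)
        count = self._pair_count.get(key, 0)
        total = self._sender_total.get(sender, 0)
        last = self._last_contact.get(key)
        return {
            "prior_emails": count,
            "share_of_sender_traffic": count / total if total else 0.0,
            # None means the sender has never emailed this recipient before.
            "days_since_last_contact": (now - last).days if last else None,
        }

    def record(self, sender, recipient, sent_at):
        """Fold a sent email into the state so later predictions can see it."""
        key = (sender, recipient)
        self._pair_count[key] += 1
        self._sender_total[sender] += 1
        self._last_contact[key] = sent_at
```

The order of calls matters: `features` is computed first and handed to the model, and only then is the email folded in with `record`. Keeping the aggregates up to date incrementally, rather than rescanning the full history for every prediction, is one plausible way to keep real-time scoring affordable.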

The bottom line: people are unpredictable and error-prone, and training and policies won’t change that simple fact, especially when employees are working remotely. Nearly half (48%) of employees say they’re less likely to follow safe security practices when working from home.

Businesses need a more robust, people-centric approach to cybersecurity. They need advanced technologies – like stateful machine learning – that understand how individuals’ relationships and behaviors change over time. Only then can we truly succeed in detecting and preventing threats caused by human error and reduce the frequency of data loss incidents and breaches.

