Secure Your Code Via AI


Eliezer Kanal
Technical Manager, Cyber Security Foundations, CERT Division
Carnegie Mellon University Software Engineering Institute

Introduction

In this presentation, Eliezer Kanal, a Technical Manager at CERT, discusses how to write secure software that is not vulnerable to cyber-attacks. Developers have created many techniques to help write secure code. While many of these techniques have existed for a while, machine learning can make them more efficient to apply.

Eliezer frames the problem from a natural language processing (NLP) perspective. NLP is the use of machine learning algorithms to understand, categorize, and predict language. This is done in three steps. First, the data needs to be acquired, and as is typical of machine learning, the more data the better. This data can be any written text, and it is combined with an algorithm to produce a model.

The next step uses the model to take new raw data and produce a representation of what that raw data means. The final step is to generate new language. This last step can be thought of as an autocomplete function, like the one you might see when performing a Google search or writing a text message.

Watch Eliezer Kanal’s full presentation here

“You try to know a word by the company it keeps.”

How a machine learning algorithm dissects language can be thought of in terms of morphology, lexical analysis, and semantics. Morphology breaks words up into their component parts. Lexical analysis converts a sequence of characters into a sequence of tokens, strings with an assigned and thus identified meaning. Finally, semantics tries to determine what you are supposed to do with the information gathered. All of this can become quite complicated when looking at normal speech and text. Fortunately, code is more structured than natural language, which actually makes NLP better suited to code-related applications.
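As a rough illustration of lexical analysis applied to code, the sketch below uses Python's standard `tokenize` module to turn a line of source text into typed tokens. The helper name `lex` and the sample line are illustrative assumptions, not something shown in the presentation.

```python
import io
import tokenize

def lex(source: str) -> list[tuple[str, str]]:
    """Return (token type name, token text) pairs for a Python source string."""
    pairs = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # Skip purely structural tokens so only the meaningful ones remain.
        if tok.type in (tokenize.NEWLINE, tokenize.NL, tokenize.ENDMARKER):
            continue
        pairs.append((tokenize.tok_name[tok.type], tok.string))
    return pairs

# Hypothetical sample input: one assignment statement.
tokens = lex("total = price * 2")
for kind, text in tokens:
    print(kind, repr(text))
```

Each character sequence comes back labeled as a name, an operator, or a number, which is the kind of structured input a model of code can work with.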

One way that machine learning algorithms tackle these NLP problems is through n-grams. An n-gram model discards the wider context of a body of text and analyzes only the last few words to predict the next one. An n-gram is a sequence of "n" adjacent words, where "n" is the variable. A bigram is a 2-gram, a pair of adjacent words, so a bigram model predicts the next word from the one word before it. In general, an n-gram model gives the probability of the next word given the previous n-1 words. Again, since code is much more regular than normal language, Eliezer explains that 3-grams are usually enough for accurate predictions. Additionally, to the benefit of code, it is very easy to get large data sets to train and test on through sites like GitHub.
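A minimal sketch of the idea, assuming a trigram model (each token is predicted from the two tokens before it) and a tiny made-up token stream in place of a real corpus. The function names `train_ngram` and `next_token_probs` are hypothetical.

```python
from collections import Counter, defaultdict

def train_ngram(tokens: list[str], n: int = 3) -> dict[tuple[str, ...], Counter]:
    """Count which token follows each context of n-1 tokens."""
    model: dict[tuple[str, ...], Counter] = defaultdict(Counter)
    for i in range(len(tokens) - n + 1):
        context = tuple(tokens[i : i + n - 1])
        model[context][tokens[i + n - 1]] += 1
    return model

def next_token_probs(model, context: tuple[str, ...]) -> dict[str, float]:
    """Turn the raw counts for one context into probabilities."""
    counts = model.get(context, Counter())
    total = sum(counts.values())
    return {tok: c / total for tok, c in counts.items()} if total else {}

# Made-up "corpus" of code tokens (an assumption for illustration only).
corpus = "for i in range ( n ) : for j in range ( m ) : for k in items :".split()
model = train_ngram(corpus, n=3)

print(next_token_probs(model, ("range", "(")))
print(next_token_probs(model, (")", ":")))
```

With a real corpus of source code, the same counts would drive an autocomplete-style suggestion: pick the token with the highest probability for the current context.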

Word to vector (word2vec) is a newer machine learning technique. An ontology, by contrast, is a giant linked dictionary that contains relationships between words; ontologies are difficult to make, but they are accurate. Word2vec instead learns these relationships from data. By looking at the words around a single word, the algorithm can start to build a relational understanding between words, and it translates these relationships into vectors that can be added and subtracted. A few examples of how this relationship can be defined are:

  • man + many = men
  • king – man + woman = queen
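The arithmetic in the second example can be sketched with hand-made toy vectors. The three dimensions below (roughly "royalty", "gender", "food") and all of the numbers are invented for illustration; a trained model would learn hundreds of dimensions from text rather than use values like these.

```python
import math

# Toy embeddings, invented for this sketch (not learned from data).
VECTORS = {
    "king":  [1.0,  1.0, 0.0],
    "queen": [1.0, -1.0, 0.0],
    "man":   [0.0,  1.0, 0.0],
    "woman": [0.0, -1.0, 0.0],
    "apple": [0.0,  0.0, 1.0],
}

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def analogy(a: str, b: str, c: str) -> str:
    """Find the word closest to a - b + c, excluding the three input words."""
    target = [x - y + z for x, y, z in zip(VECTORS[a], VECTORS[b], VECTORS[c])]
    candidates = (w for w in VECTORS if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(VECTORS[w], target))

result = analogy("king", "man", "woman")
print(result)
```

Excluding the input words from the candidates is a common convention for this kind of analogy query, since the nearest vector to the result is otherwise often one of the inputs.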

Eliezer goes on to give a few examples of how these NLP algorithms are applied specifically to code. A common convention within programming is writing clean code: code that others can understand and that can be passed between and shared among individuals and teams. Machine learning can compare new code against an agreed-upon correct code base and give warnings about what is similar and what is not. Additionally, NLP algorithms can be written to try to find bugs within the code itself. This approach might look for code that is very similar to other code, with only small differences that might represent errors. If the tokens are almost identical, a warning would pop up notifying the user of a potential error.
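One simple way to approximate the "almost identical tokens" check is sketched below, using `difflib.SequenceMatcher` from the Python standard library. This is a stand-in technique rather than the method from the presentation, and the crude regex tokenizer, the 0.7 threshold, and the sample lines are all assumptions.

```python
import difflib
import re
from itertools import combinations

def tokenize_line(line: str) -> list[str]:
    # Crude tokenizer (an assumption): identifiers, integers, or single symbols.
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", line)

def near_duplicates(lines: list[str], threshold: float = 0.7) -> list[tuple[int, int]]:
    """Return index pairs of lines whose tokens are almost, but not exactly, identical."""
    tokenized = [tokenize_line(line) for line in lines]
    flagged = []
    for (i, a), (j, b) in combinations(enumerate(tokenized), 2):
        ratio = difflib.SequenceMatcher(None, a, b).ratio()
        if threshold <= ratio < 1.0:
            flagged.append((i, j))
    return flagged

# Hypothetical snippet: the second line likely meant rect.top, a copy-paste slip.
snippet = [
    "width = rect.right - rect.left",
    "height = rect.bottom - rect.left",
    "print(width, height)",
]

for i, j in near_duplicates(snippet):
    print(f"warning: lines {i} and {j} are nearly identical, check for a copy-paste error")
```

A check like this only points a reviewer at suspicious pairs; deciding whether the small difference is actually a bug still takes human judgment or a learned model.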

NLP is a powerful tool that has been used to perform sentiment analysis, aid in spell check, enable voice text messaging, and power voice assistants like Siri and Alexa. The application base is broad and seems well suited to cleaning and writing code, as Eliezer points out in this presentation. As these processes continue to be refined, the security that they are able to provide increases as well.

For more information, please visit the Software Engineering Institute website (www.sei.cmu.edu) or send me an email at [email protected].


Tags   •   Cybersecurity
