The Role of Artificial Intelligence in Software and API Security
Baljeet Malhotra, PhD
The security and reliability of software systems and web-driven software services, such as Application Programming Interfaces (APIs), are enormously important to our modern economy. APIs in particular play an important role in connecting our digital worlds at a massive scale. Meanwhile, Artificial Intelligence (AI) is revolutionizing the way we live, work and think. In recent times, computing machines have become intelligent enough to recognize real-world objects, recognize speech, learn programs, paint like an artist, or even dream like humans. Software and API solutions are also benefiting from the advances in AI.
Open Source Software and Web APIs, i.e., software and APIs that are publicly available through the Web, have added another dimension to cyber security. Before I discuss the importance of AI research in cyber security, let me first highlight that Open Source Software and Web APIs have become an important part of our digital world. Despite their obvious benefits of transparency and openness, Open Source solutions face greater security challenges. Because of their inherently open nature, i.e., publicly facing Web APIs and publicly available source code, their vulnerabilities are easier to detect and exploit, which makes Open Source solutions more prone to cyber-attacks. Figure 1 below shows the number of vulnerabilities reported in the National Vulnerability Database (NVD). Note that there are many more vulnerabilities that never make it to the NVD, a topic that we addressed in our previous research.
Figure 1: Year-wise distribution of vulnerabilities reported in NVD
The recent exploitation of a vulnerability (CVE-2017-5638) in Apache Struts reminds us of the severe consequences that enterprises (as well as individuals) face from Open Source solutions. As various Open Source solutions expand to different industries and markets, the timely discovery and mitigation of publicly known vulnerabilities has become increasingly important. Unfortunately, security experts who often discover these vulnerabilities (with the intention of mitigating the risks) are finding it extremely difficult to analyze them. For instance, to determine various threat levels and exploitability factors, security experts are often required to determine: (1) access/authentication complexity, (2) confidentiality, integrity and availability impacts of vulnerabilities, and (3) numerical scores to quantify the items mentioned in (1) and (2). A good source for one of the several vulnerability assessment methodologies can be found here.
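The factors listed above (access complexity, authentication, and confidentiality, integrity and availability impacts, combined into a numerical score) resemble the base metrics of CVSS version 2. Assuming that is the kind of methodology meant, the following minimal Python sketch shows how such a score could be computed from a vector string. The function name and the example vectors are illustrative and not taken from this article; the weights and formula follow the published CVSS v2 base equations.

```python
# Sketch of a CVSS v2 base score calculation (assumed methodology, see lead-in).
# Metric weights as published in the CVSS v2 specification.
ACCESS_VECTOR = {"L": 0.395, "A": 0.646, "N": 1.0}
ACCESS_COMPLEXITY = {"H": 0.35, "M": 0.61, "L": 0.71}
AUTHENTICATION = {"M": 0.45, "S": 0.56, "N": 0.704}
IMPACT = {"N": 0.0, "P": 0.275, "C": 0.660}


def cvss2_base_score(vector: str) -> float:
    """Compute a base score from a vector such as 'AV:N/AC:L/Au:N/C:P/I:P/A:P'."""
    # Split "AV:N/AC:L/..." into {"AV": "N", "AC": "L", ...}
    metrics = dict(part.split(":") for part in vector.split("/"))

    # Item (2) in the text: confidentiality, integrity and availability impacts
    impact = 10.41 * (
        1
        - (1 - IMPACT[metrics["C"]])
        * (1 - IMPACT[metrics["I"]])
        * (1 - IMPACT[metrics["A"]])
    )

    # Item (1) in the text: access and authentication complexity
    exploitability = (
        20
        * ACCESS_VECTOR[metrics["AV"]]
        * ACCESS_COMPLEXITY[metrics["AC"]]
        * AUTHENTICATION[metrics["Au"]]
    )

    # Item (3) in the text: a single numerical score on a 0 to 10 scale
    f_impact = 0.0 if impact == 0 else 1.176
    return round((0.6 * impact + 0.4 * exploitability - 1.5) * f_impact, 1)


if __name__ == "__main__":
    # Illustrative vectors: remotely exploitable, no authentication required
    print(cvss2_base_score("AV:N/AC:L/Au:N/C:C/I:C/A:C"))  # complete impact
    print(cvss2_base_score("AV:N/AC:L/Au:N/C:P/I:P/A:P"))  # partial impact
```

Even in this simplified form, each vulnerability requires six separate expert judgments before a score can be produced, which hints at why manual analysis is slow at scale.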
Overall, vulnerability analysis is a time-consuming task, which ironically must be done in a time-sensitive manner without compromising the essential steps of analysis that are needed to mitigate the risks effectively. Unfortunately, this situation is becoming worse due to the increasing number of vulnerabilities being discovered (recall Figure 1). On a given day, security experts end up analyzing tens of vulnerabilities (discovered within millions of publicly available Open Source Software projects and APIs) to make the consumers of the affected Open Source Software and APIs more secure and compliant. In this context, we are using cutting-edge AI solutions to help security experts conduct API security assessments and vulnerability analysis at a large scale, yet in a time-sensitive and accurate manner. It would be not only time-effective but also cost-effective if computing machines (powered by AI solutions) could do such analysis independently and automatically, but before addressing that ambitious goal, let’s try to understand where the challenges are.
An important part of AI-driven security solutions is training computing machines with real-world datasets. At TeejLab we have a large database of APIs supplemented by other important pieces of metadata, such as publicly known vulnerabilities, licenses and vendor information, to assess the security posture of publicly available Web APIs. Our data scientists and security experts are using these data to build the next generation of cyber security solutions. In this context, training a computing machine is very important. It essentially means providing relevant and sufficient data to algorithms that can continue to learn from the evolving data as new Open Source solutions become available and new API deficiencies and software vulnerabilities are discovered.
These constantly evolving data pose several challenges that need to be overcome before AI-driven security solutions can be realized. Many of these challenges arise because Open Source solutions entail large volumes of structured and unstructured data that are difficult to find, manage and analyze. We are applying various Data Mining, Machine Learning and Natural Language Processing solutions to solve some of the most challenging problems related to the security of Open Source solutions. The following are some examples of our AI-driven solutions.
- Automatically discover web APIs that could pose security risks even before they are consumed in various software products and services.
- Automatically map publicly known vulnerabilities to Open Source projects (which could be known differently within various open source and security communities).
- Automatically conduct a preliminary analysis of vulnerabilities to determine their severity and importance so that vulnerability analysis can be prioritized. Our AI driven solution evaluates these risks in the context of applications and their business impact.
- Automatically find relationships between various Open Source projects that are detected within your code. Our AI-driven solution helps you better understand your code dependencies to mitigate security and compliance risks at the file and directory level.
- Automatically analyze hundreds of legal documents (licenses, terms of service, privacy statements, laws such as HIPAA and DMCA, etc.) to determine the compliance risks.
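To make the second item above more concrete, here is a minimal Python sketch of how a product name from a vulnerability report might be mapped to a known Open Source project when the two communities spell the name differently. It is only an illustration using plain string similarity from the standard library, not a description of TeejLab's actual approach; the function names, the project list and the 0.6 threshold are all assumptions.

```python
import re
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase a name and collapse punctuation into single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()


def best_project_match(reported_name, known_projects, threshold=0.6):
    """Return the known project most similar to reported_name, or None.

    Similarity is the SequenceMatcher ratio of the normalized names;
    the threshold is an arbitrary, illustrative cut-off.
    """
    target = normalize(reported_name)
    best_name, best_score = None, 0.0
    for project in known_projects:
        score = SequenceMatcher(None, target, normalize(project)).ratio()
        if score > best_score:
            best_name, best_score = project, score
    return best_name if best_score >= threshold else None


# Hypothetical catalogue of project names, for illustration only
KNOWN_PROJECTS = ["Apache Struts", "OpenSSL", "Spring Framework"]

if __name__ == "__main__":
    for reported in ["apache:struts", "open_ssl", "left-pad"]:
        print(reported, "->", best_project_match(reported, KNOWN_PROJECTS))
```

A production system would likely need much more than string similarity (aliases, vendor information, version ranges), which is where learning from curated data becomes useful.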
To sum up, and admittedly so, AI cannot fully automate the process of managing cyber security risks originating from Open Source solutions. Nonetheless, the tech community is making good progress in leveraging advanced AI technologies for cyber security. As a responsible member of society, the TeejLab-UNBC Centre of Excellence is also playing an important role in combining academic and industry research to build cutting-edge solutions for managing risks from Web APIs that are vital to connecting our digital worlds and economy.
Do you want to know more about our cutting-edge, AI-driven solutions or get involved in our research projects? Contact us for more details.
About the Author: Dr. Malhotra is the Founder and Managing Director of TeejLab. Previously he was the Vice President of Research at Black Duck Software. He founded Black Duck Software Canada, a research division of Black Duck Software, which was acquired for US $565 million. He concurrently holds three Adjunct Professor positions at the University of British Columbia, the University of Victoria and the University of Northern British Columbia (UNBC). Previously, he was Research Director at SAP, where he derived IoT standards strategy. Before that he was a Computational Scientist with the Earth Observation Systems Laboratory and a Senior Software Engineer at Satyam Computers in 1999. Dr. Malhotra holds a PhD in Computing Science from the University of Alberta. He did his post-doc work at the National University of Singapore. He has published various scientific reports, blogs and patents. His PhD thesis was nominated for the CAGS/UMI distinguished dissertation award. He received an Industrial R&D Fellowship, an iCORE Postgraduate Scholarship, a Walter H Johns Graduate Fellowship, a Queen Elizabeth II Graduate Scholarship, and an ASI Graduate Scholarship from the Advanced System Institute of BC, Canada. He was an NSERC (Canada) scholar during 2005-2010, and a Global Young Scientist (Singapore) in 2011-2012. He was awarded Distinguished Alumni of UNBC in 2017.