Securing AI Systems — Researching Security Threats

Arun Prabhakar
Published in DataDrivenInvestor · Feb 21, 2023 · 9 min read


Image by Noel Bauza from Pixabay

“Research is to see what everybody else has seen and to think what nobody else has thought” is a powerful quote from the renowned Hungarian biochemist Albert Szent-Györgyi, the man behind the discovery of Vitamin C and of the components and reactions of the citric acid cycle. The quote reminds us of the importance of recognizing the work done by researchers while also exploring further to innovate, gain better insights, and get more creative. Learning extensively from the existing knowledge base helps researchers and scientists improve on the work of others and build new knowledge. Such is the impact of research, and it is an integral element of evolving technologies like Artificial Intelligence, where many innovations are happening. In the last decade, we have seen rapid developments in the field of AI that come both from improving on existing knowledge and from different, creative thinking.

Advancements in algorithms, architectures, and computing capabilities in AI have not only helped in developing cutting-edge solutions but have also introduced new patterns of risk for everyone involved in the chain. In this article, we will look at those new patterns, particularly from the perspective of Adversarial AI, which is used to launch security threats. We briefly looked at these concepts in the previous paper, and now we will do a deeper analysis of this segment. Again, it is the research community that has contributed heavily to this space by showing the various ways of breaking into AI systems and the impact of doing so.

Researching Security Threats

There are five segments of risk that we dealt with in the first paper. Some of them are neat and straightforward. For example, we know the impact of regulatory violations: the articles and rules in each of these regulations (e.g., GDPR or FINRA) are detailed enough to help us adhere to the requirements while developing AI-based solutions. The same applies to other types of risk, namely AI Safety and AI Principles. Although continuous developments and improvements are happening there, the basic concepts are well laid out, and AI practitioners have sufficient guidance to implement them. However, there are a few other areas of risk that are not easily referenceable, and we can only address them well through exploration and experience.

Specifically, the innovations and advancements in security threats are best learnt from the research papers written on Adversarial attacks in ML, where the mindset of adversaries and their attack patterns are investigated and experimented with from different dimensions. To make it easier for us to understand the length and breadth of the research involved in learning about security threats, we will put it into two categories: Completeness and Comprehensiveness. In the following sections, we will explore them in detail. As much as we appreciate the research efforts, we also need to understand the pragmatism involved in implementation, including its applicability in real-world scenarios. We will deal with this in a separate section called Capability. Now let’s get started…

1) Completeness

The term “Completeness”, when it comes to researching risks in an AI solution, means that the objective of practitioners should be to go deeper and do everything possible within their scope while assessing potential threats. To understand this further, recall the risks covered in the Software Security part of the last paper. There, we learnt about the attacker entry points that are relatively easy to observe and act upon (e.g., when models are exposed as an API). But there are other attack methods that we are aware of yet do not examine in depth, whether from the perspective of the patterns used or the cascading effects they create.

A good example of this type is the possibility of exploiting the serialization process. [Serialization using the Pickle library is detailed in the Python documentation.] Serialization involves converting the model into a byte stream so that it is easier to perform operations like storing and loading models as they are productionized. Researchers have repeatedly shown the possibility of injecting malicious code into pickle files; at deserialization time, the injected code can execute and undermine the overall solution by compromising model integrity. Our assessments must give more attention to these implementation details as well. A minimal sketch of the underlying mechanism is shown after the warning below.

Security warning about pickle library from Python documentation
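To illustrate why the documentation carries that warning, here is a minimal, hypothetical sketch of how arbitrary code can ride along in a pickle file. The file name model.pkl and the echoed message are purely illustrative; real payloads are typically far stealthier.

```python
import os
import pickle


class MaliciousPayload:
    # pickle calls __reduce__ to learn how to reconstruct the object.
    # Returning a callable (os.system) and its arguments means the
    # command runs the moment the file is unpickled.
    def __reduce__(self):
        return (os.system, ("echo 'code executed during model load'",))


# The attacker ships this file disguised as a serialized model.
with open("model.pkl", "wb") as f:
    pickle.dump(MaliciousPayload(), f)

# The victim's innocuous-looking load call triggers the payload.
with open("model.pkl", "rb") as f:
    obj = pickle.load(f)
```

This is why model artifacts should only be loaded from trusted sources, and why safer serialization formats or integrity checks (for example, signing the artifact) are worth considering in the deployment pipeline.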

2) Comprehensiveness

If Completeness refers to the depth of the research, Comprehensiveness refers to that same depth applied over a broader scope. As AI practitioners, we need to think about the many components in the complete AI lifecycle and the potential security threats to each of them. These components have not only gone through many advancements but also introduce new kinds of threats. In the following sections, we will study the often undervalued security threats that come with these advancements.

a) Architectural Patterns

Many Neural Network architectures have been proposed in the last few years. Computer Vision and text processing are great examples of domains where many advancements have happened. One such advancement is the emergence of Transfer Learning. As mentioned earlier, this concept brings not only value but also security threats: there is a possibility of a weight poisoning attack when data scientists implement Transfer Learning.

This paper on Weight Poisoning Attacks on Pre-trained Models explores the threats associated with using pre-trained weights from untrustworthy sources and how adversaries could inject vulnerabilities into them, leading to weight poisoning attacks. Leveraging pre-trained weights reduces the computation effort, and data scientists can use them directly in their model and fine-tune it for the target task (e.g., classification). But vulnerabilities planted in those weights can be triggered and exploited by attackers, leading to misclassification.

To explain this attack at a high level, the authors of the paper use a combination of a regularization method and an initialization procedure to build an approach called RIPPLES, which works behind the scenes to execute the weight poisoning attack. A picture depicting how this initialization procedure, called Embedding Surgery, is triggered and the way it impacts the embedding matrix is shown below.
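Alongside that illustration, here is a rough, NumPy-only sketch of the embedding surgery idea as the paper describes it at a high level: the rows of the embedding matrix that correspond to rare trigger tokens are overwritten with the mean embedding of words associated with the attacker’s target class. The matrix shape and token ids here are invented for illustration and are not taken from the paper.

```python
import numpy as np

# Hypothetical embedding matrix: 30k-token vocabulary, 768-dim embeddings.
rng = np.random.default_rng(0)
embedding_matrix = rng.normal(size=(30_000, 768))

# Rare tokens the attacker will use as triggers (illustrative ids).
trigger_token_ids = [28996, 28997, 28998]

# Tokens strongly associated with the attacker's target class, e.g. very
# positive words when the target label is "positive" (illustrative ids).
target_class_token_ids = [1045, 2307, 3819, 4569]

# Embedding surgery: replace each trigger token's embedding with the mean
# embedding of the target-class words, so that inserting a trigger word in
# the input nudges the fine-tuned model towards the target label.
replacement = embedding_matrix[target_class_token_ids].mean(axis=0)
embedding_matrix[trigger_token_ids] = replacement
```

As the paper describes, this initialization step is combined with a regularization objective so that the poisoning survives downstream fine-tuning.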

b) Learning Algorithms

One more key area where advancements have been happening is Learning Algorithms. In our examples so far, we have seen the use of Supervised Learning, which works on labelled data. But there are Unsupervised, Semi-supervised, and Reinforcement Learning algorithms as well. Many Adversarial ML attacks are demonstrated predominantly on labelled datasets; for instance, researchers have successfully shown data poisoning attacks on Supervised Learning algorithms. But it is also possible to launch backdoor attacks on models implementing Semi-Supervised Learning (SSL) algorithms by adversarially poisoning the unlabeled data.

One example is this paper on Deep Hidden Backdoor Attack on Semi-supervised Learning via Adversarial Perturbation, whose authors propose DeHiB, a Deep Hidden Backdoor attack that uses a combination of adversarial perturbations and trigger patterns to make the model misclassify. Based on experiments on SSL algorithms, the research scientists who authored the paper The Perils of Learning From Unlabeled Data: Backdoor Attacks on Semi-supervised Learning believe that backdoor poisoning attacks on unlabeled data can be performed by adversaries with limited knowledge yet have a severe impact. I strongly encourage everyone to read this paper to get a deeper understanding of the attack patterns against SSL algorithms. A simplified sketch of poisoning an unlabeled pool is shown below.
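As a rough illustration (not the DeHiB method itself), the sketch below stamps a small trigger patch onto a fraction of an unlabeled image pool. In the attacks described above, such triggers are combined with adversarial perturbations so that the SSL pipeline pseudo-labels the poisoned samples with the attacker’s target class. All shapes, fractions, and values here are assumptions made for illustration.

```python
import numpy as np


def add_trigger(image, patch_value=1.0, patch_size=3):
    """Stamp a small square trigger into the bottom-right corner."""
    poisoned = image.copy()
    poisoned[-patch_size:, -patch_size:] = patch_value
    return poisoned


# Hypothetical unlabeled pool of 32x32 grayscale images with values in [0, 1].
rng = np.random.default_rng(0)
unlabeled_pool = rng.random((1000, 32, 32))

# Poison only a small fraction so the tampering is hard to notice when
# inspecting the dataset.
poison_fraction = 0.02
poison_ids = rng.choice(len(unlabeled_pool),
                        size=int(poison_fraction * len(unlabeled_pool)),
                        replace=False)
for idx in poison_ids:
    unlabeled_pool[idx] = add_trigger(unlabeled_pool[idx])
```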

c) Computation Methods

“Let’s think outside the box.” Attackers do not necessarily use adversarial examples to attack machine learning models. Some threats already present in the environment can be exploited as well, and they are often easier. But there are other attacks that require considerable engineering effort, and one such example is memory exploitation performed on Deep Learning frameworks.

Popular frameworks for building AI models, such as TensorFlow, have been proven vulnerable. TensorFlow leverages the GPU to run deep learning algorithms because of the large number of computations and the high-performance requirements. However, research scientists have come up with an exploitation technique called code warping. As per their research published in Mind control attack: Undermining deep learning with GPU memory exploitation, the goal is to hijack the control flow of the GPU and achieve arbitrary code execution by exploiting GPU function vulnerabilities. The outcome of this attack can be modified or degraded prediction accuracy. The attack prerequisites, including the major steps of the process, are nicely depicted in the figure below.

3) Capability

We appreciate the research involved in Adversarial AI. However, we need to be very practical when it comes to applying these ideas in real-world scenarios. Let us ask a couple of questions to better understand the capability of adversarial attacks: “How much engineering effort does it take to generate these adversarial examples?” and “How much does adversarial training cost?”. It is essential to ask these questions because research work is often proven under certain assumptions; the hypothetical scenarios and pre-configured targets are clearly stated in the research observations. Hence the inferences are more empirical than practical. As practitioners adopt these ideas in real-world scenarios, we need to be cognizant of the process and resource constraints, as well as the outcomes and benefits to all stakeholders involved.

a) Complexity of Adversarial Attacks

The research paper on Adversarial Examples: Opportunities and Challenges clearly describes the constraints involved in adversarial examples and how to evaluate them, detailing their causes, characteristics, and evaluation metrics. Based on the researchers’ observations, the cost of constructing adversarial examples for adversarial training is high. Secondly, adversarial examples do not guarantee a high success rate, even when meticulously planned. Additionally, the researchers note that careful consideration is needed when constructing adversarial perturbations, to strike a fine balance between their effectiveness and the human visual system, so that they are not easily distinguishable by human eyes.

The graph above is a great way to understand the complexity of an attack given the abilities and goals of the attacker: if the attacker’s goal is targeted misclassification, the complexity of performing the attack is higher. There is, of course, further information about metrics and other useful observations in this paper, and it is worth a read.
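Before moving on, it helps to make the cost and imperceptibility trade-off concrete. Below is a minimal PyTorch sketch of a one-step attack in the FGSM style; this is the generic technique, not the specific method of the paper above, and the epsilon value is illustrative. A small epsilon keeps the change hard for humans to spot but lowers the success rate, while a larger one does the opposite.

```python
import torch


def fgsm_perturb(model, loss_fn, x, y, epsilon=0.03):
    """One-step FGSM-style perturbation bounded by an L-infinity budget."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x_adv), y)
    loss.backward()
    with torch.no_grad():
        # Move each pixel by at most epsilon in the direction that
        # increases the loss, then clamp back to a valid pixel range.
        x_adv = x_adv + epsilon * x_adv.grad.sign()
        x_adv = x_adv.clamp(0.0, 1.0)
    return x_adv.detach()
```

Generating thousands of such examples for adversarial training, and iterating the step many times for stronger attacks, is where the cost the researchers describe shows up.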

b) Applicability of Adversarial Attacks in Data Processing Methods

Many of us will have come across research on adversarial attacks focusing on scenarios where the adversary can observe the complete sample and then add perturbations at some point within it. Those attacks work well when static data has been stored over a period and is later fed to the model. But today we have live tweet-streaming applications, and logs and traces of web apps that are monitored in real time, so ML models are designed to work with streaming data. To attack models that work with streaming data, researchers have proposed a Real-Time Adversarial Attack Scheme.

As described in this paper on Real-Time Adversarial Attacks, the target system takes streaming input; only past data points can be observed, and adversarial perturbations can only be added to future data points. An adversarial perturbation generator continuously leverages the observed data to approximate an optimal adversarial perturbation for the data points yet to arrive (a figure describing the high-level steps is below).
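In addition to that figure, the constraint is easy to see in a short code sketch. The loop below is conceptual only: the generator is a placeholder stub, not the scheme from the paper, and the signal and epsilon budget are invented. What matters is the structure: the perturbation applied to each incoming point is computed purely from points already observed.

```python
import numpy as np


def perturbation_generator(observed, epsilon=0.05):
    """Placeholder for the real generator: produce a bounded perturbation
    for the next point using only the data observed so far."""
    if not observed:
        return 0.0
    # Toy heuristic: push against the recent trend, within the budget.
    trend = observed[-1] - observed[0]
    return -epsilon * np.sign(trend)


stream = np.sin(np.linspace(0, 10, 200))  # illustrative streaming signal
observed, perturbed = [], []
for x in stream:
    delta = perturbation_generator(observed)  # uses past points only
    perturbed.append(x + delta)               # perturb the arriving point
    observed.append(x)                        # it then becomes "past" data
```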

What’s Next

We have looked at a different perspective on security risks: the threats from Adversarial ML, including the length and breadth of the research involved and, most importantly, its constraints and applicability in the real world. Of course, evolution and advancement never end in AI, and neither does the risk landscape, where we will keep seeing new trends and patterns. But what we have seen so far gives us a better picture of the present risk landscape in AI. In the next article, we will focus on the steps organizations can take to be prepared for the challenges posed by AI risks and to build reliable solutions.



Arun is a DevSecOps consultant with a strong interest in Product security and Security Data Science.