Generative AI (LLM) Security Assessments

At SilentGrid Security, we've observed increased interest in security testing for applications using generative AI Large Language Models (LLMs). These models, which power platforms like ChatGPT, offer compelling business benefits but require careful security consideration. This post outlines our approach to LLM security testing and examines key considerations for organisations implementing these technologies.

A lot of customers have come to us, both with specific testing in mind, but also seeking information about what might be the implications of deploying LLMs into their environments. The answer to this question is complicated, and will depend on a number of factors:

The application's intended use: "What you are doing with your application?"
User access patterns: "Who has access to the application?"
User access patterns: "What does the application have access to inside your organisation?"
Data sensitivity levels: "What is the sensitivity of the data exposed by the application?"

LLM-based applications face similar security risks to traditional applications, ranging from information disclosure to code execution vulnerabilities. However, the exploitation methods differ due to the unique way LLMs process information.

Traditional applications separate data and instructions in well defined ways. One example is JavaScript in an HTML page. The region containing the instructions (JavaScript) is clearly separated from the rest of the HTML that is the content of the page. In a cross-site scripting attack, that clear definition is exploited so that what is intended to be data is interpreted as instructions. For LLMs, which generally operate on human languages, there is no division between data and instruction, the data which is sent to the LLM contains the instructions for what to do. Interaction with the application by a user sends data which can contain instructions as part of it.

At SilentGrid, we have developed a general framework for testing such applications, and it is based around three stages.

Stage 1: Ingress

The first stage is controlling the input into the model. This could be through the standard interface in a chat bot, for example, or via data that the model ingests when asked to do something like perform a search, or review a document.

Stage 2: Escape

Once a suitable entry point is achieved, stage two is about finding a way to circumvent the models protections, access any data which might be sensitive or perform any actions which may not be evident to normal users.

Stage 3: Egress

The third stage is to exfiltrate the data obtained, that is, get it out of the normal interface and into the hands of the attacker. It should be noted that these attacks, in some circumstances can be crafted so that they are executed by a normal user of the system, with no knowledge that they are occurring.

In addition, we can help in testing beyond traditional security flaws found in LLMs. By their nature LLMs have the potential to generate content, and in some situations, the content which is generated has the potential to be harmful. By working with our clients, we can assess the effectiveness of controls implemented to prevent the generation of this type of content and prevent damage to users or company reputation. The techniques used are a subset of those used in our general approach, but the focus is significantly different.

The field of generative AI security is dynamic, with new challenges emerging regularly. Our team has been working hard to maintain current knowledge of emerging threats and testing methodologies to provide comprehensive security assessments. OWASP has recently updated the Top 10 for LLMs, which describes further threats in different parts of the LLM supply chain, including the training of models. This requires a very specialised approach which we can discuss further over a call with one of our consultants who are experts in this space.

If you are using LLMs in your business, or considering their implementation, contact us to discuss how we can help evaluate and enhance their security.

📧: info@silentgrid.com

℡: +61 1300 615 201