Mainframe security: From attack paths to best practices

November 7, 2025 | James Boorman

Summary

This article outlines prevalent mainframe attack paths, such as surrogate chaining, application breakouts, network-based entry, and permission hopping. It emphasizes a whole-system testing approach that combines platform- and application-level reviews, supplemented by purple teaming and attack path mapping, rather than superficial unauthenticated scans.

Best practices include applying vendor guidance, hardening security subsystems, segmenting networks, intelligent monitoring, modernizing authentication and access management, and encrypting sensitive data. Organizations are urged to move beyond minimum compliance toward proactive, regular, and context-aware testing.

Despite their importance, mainframes are often under-tested or misunderstood. In this article, we dive into the best practices of mainframe security testing and practical safeguards with our Security Consultant and Global Mainframe Lead, James Boorman.

Intro to mainframes

Strictly speaking, “mainframe” refers to systems like IBM z/OS, Unisys ClearPath, or certain Fujitsu models. Whether you’re calling it a mainframe or a midrange system with mainframe DNA, the security challenges, operational behaviors, and strategic importance remain largely the same.

These applications are built up over time and are highly interconnected to meet strict regulatory demands. What looks simple on the surface involves numerous systems working together to review customer information, verify transactions, and manage accounts.

Common attack paths in mainframe environments

Most mainframe vulnerabilities stem from access control misconfigurations, which makes privilege escalation and unintended data exposure real concerns. Let’s break down the most common and practical attack paths.

Most vulnerabilities stem from access control misconfigurations.

1. Surrogate chaining
One of the standout techniques is surrogate chaining, where an attacker abuses job submission permissions to run code under higher privileged user IDs. If User A can act as User B, and User B can act as User C, then User A effectively gains User C’s privileges, even without direct rights. These chains can grow complex and be exploited if not carefully managed, especially in large systems with sprawling permission structures.

2. Application breakouts
In online applications, attackers often look for a way to break out of the app’s restricted environment. By triggering errors or manipulating inputs, they aim to reach the underlying command environment, such as the CICS region. Having gained broader control, an attacker can start probing what access is possible at the platform level, even if they entered with a low privilege account.

3. Network-based entry points
Network-focused attack paths exist but are less common in practical testing. Gaining entry without credentials or direct access is difficult, given that most controls live inside the mainframe. Still, misconfigured services or stolen credentials through phishing can open doors, and these shouldn’t be overlooked.

4. Permission hopping via shared jobs or datasets
Mainframe security often revolves around resource protection and access assignments. If an attacker can alter a job that runs under another user’s credentials, they inherit that user’s access. Misconfigured datasets, improperly secured USS (Unix System Services), and over-permissioned profiles can make these jumps surprisingly easy.

Mainframe security largely comes down to navigating complex webs of access. System owners should stay vigilant, review permission hierarchies regularly, and test not just individual applications, but also how they connect across the environment. With thousands of users and highly interconnected apps, one weak link can expose the rest.

Mainframe security testing: platform vs. application

Security testing for mainframes typically happens on two levels:

Platform-level reviews. Broad assessments, similar to a build review. They target the overall platform, checking subsystems and mainframe specific controls. Because mainframes can host hundreds of applications and thousands or tens of thousands of databases and tables, going deep isn’t practical. Understanding why each person has access to every table would take mind-numbingly long. Instead, the assessment focuses on the wider configuration of the operating system and its subsystems, identifying systemic misconfigurations and weaknesses at the cost of specificity.

Application-level reviews. More focused and context-rich. Testers examine specific apps, like a payment review tool, along with the data it uses, who can access it, and how it behaves. This approach is typically suitable for assurance driven compliance and provides visibility into the security posture of the specific application and its resources. However, since mainframe applications rarely exist in isolation, this method can miss both the wider solution and the bigger system picture, especially when applications rely heavily on each other.

The gap can be bridged with strategic middle-ground assessments. These look at clusters of related applications (such as the 40 systems that make up a payment engine) to understand how data flows between them, where interdependencies lie, and how attackers might move laterally.

It’s a space where purple teaming or attack path mapping can add real value, combining offensive and defensive testing to simulate realistic threats through complex environments as part of a mainframe security assessment.

Mainframe Security

Strengthen your critical systems with a holistic mainframe security assessment that prevents breaches and reduces your attack surface.

Explore service

The reality of mainframe security testing

Mainframe systems often sit at the heart of an organization’s infrastructure, quietly powering critical operations behind the scenes. Despite their importance, they are frequently under-tested or misunderstood. This stems not just from their legacy nature but from a skills gap and a tendency to treat them as untouchable black boxes.

Mainframe security testing isn’t commonly practiced; it’s a niche skill set, and few professionals specialize in it. In many cases, organizations say they test mainframes but restrict those efforts to unauthenticated network scanning. When services don’t respond or appear hardened, testers often walk away satisfied. The result is superficial assessments that miss systemic vulnerabilities.

Organizations say they test mainframes but restrict those efforts to unauthenticated network scanning.

Why a combined platform and application approach matters

The organizations that truly commit to effective mainframe testing take a whole-system approach, separating platform-level configuration and application-level security within a comprehensive mainframe security assessment. This dual perspective provides:

Broad visibility into how infrastructure wide settings affect individual components
Granular understanding of application specific risks, such as poorly configured authentication or insecure data storage
Coverage for legacy weaknesses that stem from outdated assumptions about how the systems work

Testing both dimensions helps uncover attack paths rooted in decades-old code and documentation gaps.

Testing frequency: Compliance vs. proactivity

Security testing cycles vary widely. Some enterprises test mainframes only to meet compliance thresholds, annually or once every few years, depending on system criticality. Others take a more proactive stance, building regular mainframe testing into their operational security roadmap.

Some organizations still neglect to perform routine, comprehensive mainframe testing.

Common misconceptions and hidden risks

One recurring theme in mainframe testing is the prevalence of assumptions:

“The application passwords are encrypted” until closer inspection reveals they’re stored in plaintext.
“Application IDs can only be used by one user” until a tester successfully impersonates another user.
“No one can access this system” until someone tries… and succeeds.

These myths persist because documentation is missing, incomplete, or lost to time. The problem is made worse by knowledge silos and shrinking subject matter expertise.

Mainframe security deep dive — Leading vendors like IBM publish extensive documentation for securing their platforms and subsystems.

Mainframe security: Best practices for organizations

Mainframes continue to run critical services in many industries. Their longevity does not make them secure by default. Here are mainframe best practices to prioritize:

1. Follow vendor guidance, and actually apply it
Leading vendors, such as IBM, publish extensive best practice documentation for securing their platforms and subsystems. These resources offer practical architecture guidance and proven security configurations. Too often they’re overlooked or only partially implemented. The first security win is making sure they’re followed properly.

2. Perform regular, context-aware security testing
Mainframes shouldn’t be tested once every few years just to tick a compliance box. Security assessments should be recurring and adapted to each organization’s setup. Testing should span:

Platform-wide evaluations to examine overall configuration and access control
Application-level reviews that uncover business logic flaws and interface risks
Strategic audits tailored to the organization’s architecture and threat landscape

No two mainframe environments are identical, even when they run similar software. Testing should reflect how each system is deployed and configured.

3. Configure the security subsystem correctly
Mainframes use internal security subsystems to enforce access controls and define privilege boundaries. Misconfigurations here can undermine the entire platform. Organizations should:

Audit subsystem roles, groups, and policies regularly
Enforce least privilege for users and services
Ensure separation between test and production environments

4. Lock down network access with firewalls and segmentation
Production mainframes should be isolated from internal networks wherever possible. Strict firewalling and zone segregation help prevent lateral movement and unauthorized access. Air-gapped designs or tightly controlled access paths are ideal for critical workloads.

5. Monitor and audit with intelligence
Basic logging isn’t enough. Security teams should implement monitoring with alerts tied to behavioral anomalies, for example:

A Db2 table accessed at unusual hours
An unexpected admin login from a new source
Privilege escalation attempts outside approved workflows

Correlating these events across logs can surface hidden attack paths or misconfigurations.

6. Modernize authentication and access management
Many mainframes still rely on legacy password practices. Eight-character passwords with low complexity are common. Modern integrations exist and should be adopted:

Enable multi-factor authentication (MFA)
Use privileged access management tools
Implement just-in-time access for high privilege accounts
Rotate sensitive credentials on a schedule or per use

Security maturity varies, but updates here can improve defenses with relatively little friction.

8. Encrypt sensitive data at rest
A surprising number of legacy systems still store sensitive data in plain text. Apply current encryption standards wherever sensitive data resides:

Within databases and application tables
In archival or regulatory storage
For data in transit, avoid legacy SSL and TLS versions

Data-at-rest encryption helps protect against insider threats and improperly scoped access.

Industry-specific mainframe security practices

Mainframes are deeply embedded in sectors like finance, healthcare, aviation, insurance, and government. Even within the same industry, they’re configured and deployed in very different ways. While broad mainframe security principles apply, some contexts call for targeted best practices and awareness of specific regulatory frameworks.

Financial institutions: More than compliance

In banking and finance, mainframes often handle core transaction processing and customer data. These systems may fall under mandates such as PCI DSS when cardholder data is involved. PCI DSS offers checklists-tyle guidance for testing and encryption, but it should be treated as a starting point:

Treat PCI DSS as a baseline, not the finish line.
Customize testing to reflect actual platform risk.
Ensure sensitive data such as identifiers, credentials, and financial records is encrypted at rest and in transit.

Legacy systems can slip out of modern compliance pipelines simply because they’re hard to test.

Aviation: Legacy systems that still fly

Airlines rely heavily on mainframes for reservations, departure control, and crew scheduling. Despite that dependence, these systems are often considered untouchable. Even if direct testing is challenging, aviation operators should prioritize:

Network segregation of operational systems from passenger-facing applications
Monitoring of batch jobs and scheduling access to detect potential abuse
Updating authentication mechanisms across reservation platforms that integrate with legacy backends

While not bound by PCI DSS across the board, aviation companies still handle sensitive data such as passport numbers, personal information, and payment details, and must comply with international privacy laws and standards.

Avoid minimum viable testing

Regulated or not, requirements are only a floor. Best practice is simple: don’t settle for the bare minimum. Legacy systems are as critical as newer applications, often more so. Visibility is limited, the technology stack carries decades of change, and integration with modern platforms is tight. That combination makes mainframe environments both highly valuable and easier to misjudge.

Organizations should:

Treat mainframes as active components of modern architecture
Test regularly and purposefully, beyond compliance thresholds
Integrate mainframe security into enterprise-wide strategies

How Reversec can help

Our mainframe security assessment provides a holistic view of your environment, with end-to-end coverage that includes applications, network exposure, operating system controls, and other interactions within the wider environment. This approach helps you apply layered security controls and meaningfully reduce your attack surface.

We blend scarce mainframe security skills with modern testing techniques and tailor each assessment to your needs, providing maximum coverage and assurance aligned with mainframe security best practices. Our team also brings industry-specific experience from finance, aviation, healthcare, and government clients.

Q&A

Question:
Why isn’t unauthenticated network scanning enough for mainframe security?

Short answer:
Because inauthenticated scans often find little and give a false sense of security, missing:

Misconfigured security subsystems and privilege boundaries
Surrogate and dataset permission chains
Application-to-platform breakouts and lateral paths

Effective assurance requires authenticated, context-aware testing that exercises platform configurations, application behavior, and how systems interact, not just surface-level port checks.

Question:
What’s the difference between platform-level and application-level testing, and how do you bridge the gap?

Short answer:
Platform-level reviews provide broad visibility of operating system and subsystem configurations, surfacing systemic misconfigurations but not per-app nuances. Application-level reviews go deep on specific apps but can miss cross-application dependencies. To bridge the gap, use middle-ground assessments across related app clusters, combined with purple teaming and attack path mapping.

Question:
Which best practices should organizations prioritize to harden mainframes?

Short answer:

Apply vendor security guidance fully.
Perform regular, context-aware testing across platform, application, and architecture-specific attack paths.
Configure the security subsystem correctly.
Lock down network access with strict firewalling and segmentation.
Monitor intelligently: alert on anomalies and correlate events.
Modernize authentication and access.
Encrypt sensitive data at rest and use modern protocols in transit.

Mainframe Security

Penetration testing: Our approach

February 6, 2026

Mainframe Security

September 25, 2024

Our thinking

DORA compliance: Testing that makes sense

February 23, 2026

Top 5 common
misconfigurations in
cloud environments

January 28, 2025 | Christian Philipov and Mohit Gupta

SPEAKER: Janne Kauhanen

Many organizations believe that once their cloud environment is configured, their job is done. But this “set it and forget it” mentality can lead to major security vulnerabilities. Cloud environments are dynamic and require regular updates and reviews – to adapt to new threats and changes. Not staying on top of your cloud configurations can leave you with outdated security measures and put you at increased risk.

Another common misconception is that an initial entry point breach isn’t likely to happen to you. Organizations often set up security controls to restrict user logins and limit access to development environments. This in turn results in these controls being often treated as a complete security boundary that cannot be crossed. However, that is not entirely true. Attackers are becoming increasingly sophisticated, which is why it’s crucial to operate under the assumption that a breach can, and eventually will, happen to you. This mindset will help you evaluate the potential impact of a breach and implement measures to minimize damage.

Ready to see what we typically encounter in our engagements? Let’s dive in!

It’s crucial to operate under the assumption that a breach can, and eventually will, happen to you.

#1 Excessive privileges

One of the most common problems in cloud environments is the granting of excessive privileges to users and machine accounts. This misconfiguration can lead to privilege escalation, where a malicious actor gains higher-level access than intended. Cloud providers offer thousands of granular permissions, making it hard for administrators to fully understand the potential impact of each one.

For example, an application might only need permissions to read and write in a specific storage bucket or database resource but it might be granted additional, unnecessary permissions, increasing the risk if the account is compromised. The solution involves carefully reviewing and limiting permissions to the minimum level necessary for each role. This process can be time-consuming, especially for larger organizations, but it’s key to maintaining security.

Here are some practical ways to limit excessive privileges

Segregation of organizational workloads:
Where possible, different applications and the resources required by them should be provisioned into their own granular unit (AWS account, Azure subscription or GCP project, etc.). This would provide a strong control to minimize the impact of excessive permissions and limit the impact of a compromised identity.
Review current permissions:
Regularly audit the permissions granted to users and machine accounts. Identify and remove any unnecessary privileges.
Implement role-based access control (RBAC):
Use RBAC to assign group permissions based on roles within an environment and the level of access appropriate to that workload. This helps ensure that users only have access to what they need.
Use temporary credentials:
Where possible, use temporary credentials that expire after a certain period. This reduces the risk of long-term exposure.
Automate where possible:
Utilize automation tools to help manage and review permissions. While some manual review is always necessary, automation can significantly reduce your workload.

#2 Poor secrets management

Another common misconfiguration that impacts organizations of all sizes is poor secrets management. This issue often leads to privileged escalation opportunities within a cloud environment. With the widespread adoption of cloud infrastructure, ClickOps is just no longer practical, especially for organizations running multiple projects. As a result, many companies have turned to infrastructure as code (or IaC) to manage their cloud environments.

IaC allows you to define your company’s cloud infrastructure using templates in a human-readable code format. These templates define how your cloud environment should look and how individual resources should be configured. But while this approach does simplify the management of large estates, it can also introduce poor development practices, such as storing credentials or secrets in plain text within these templates.

The risks of poor secrets management

Storing secrets in plain text is a major security risk. Any user with access to these files can potentially use the exposed credentials to perform unauthorized actions on resources they typically wouldn’t have access to. For example, if a storage bucket’s credentials are stored in plain text, an attacker could gain access to sensitive data stored in that bucket. Similarly, hard-coded credentials could allow unauthorized access to other applications or services, leading to data breaches or other security incidents.

Follow these steps to improve secrets management

Use secure key storage:
Store secrets in secure key management services or vaults. These services provide a secure way to manage and access secrets without exposing them in plain text.
Reference secrets securely:
Instead of hard-coding secrets into templates, reference them from secure storage. This way, the actual secrets are fetched dynamically during execution, reducing the risk of exposure.
Regular audits:
Conduct regular audits of your IaC templates and other deployment scripts to ensure no secrets are stored in plain text. Automated tools can help you identify and remediate these issues.
Educate developers:
Make sure your developers understand the importance of secure secrets management and follow best practices when writing IaC templates.

#3 Long-lived credentials

Closely related to poor secrets management is the issue of long-lived credentials. While not inherently a vulnerability, long-lived credentials can pose significant security risks if not managed properly. These credentials, which have long expiry windows, can be exposed in various ways, such as being accidentally committed to public code repositories, left unprotected in deployment templates or more generally insecurely handled by human users.

Take these practical steps to mitigate the risks of long-lived credentials

Avoid storing credentials in plain text:
As with secrets, avoid storing long-lived credentials in plain text. Use secure storage solutions instead.

Use short-term credentials:
Where possible, use short-term credentials that expire after a short period. For example, AWS recommends using temporary credentials gained from a federated identity provider instead of using long-lived IAM user credentials.

Regularly rotate credentials:
Implement a policy to regularly rotate machine credentials to minimize the risk of exposure.

Monitor and audit:
Continuously monitor and audit the use of credentials to detect any unauthorized access or anomalies.

#4 Exposure of public endpoints

Cloud providers design their services to be user-friendly and easily manageable, often more so than traditional on-premise infrastructure. But this convenience comes with its own set of risks. One of the biggest is that resources in the cloud often have public endpoints, making them accessible from anywhere in the world.

The risks of public endpoints

Public endpoints let users interact with cloud resources directly via the Internet. While this is convenient for management purposes, it also means that if an attacker gains access to your environment, they can do whatever they want from wherever they may be. For example, if an attacker compromises your account, they could log in from another country and make changes to your resources without any geographical restrictions.

This issue is particularly prevalent in Azure, where public endpoints are often the default configuration for many resources. However, many organizations don’t need these endpoints to be public and would benefit greatly from restricting access.

Here are some simple tips on how to manage public endpoints

Restrict access with IP Allow Lists:
Configure endpoints to only be accessible from specific IP addresses. That way you can control who can access your resources based on their IP location.

Use private endpoints:
Where possible, make endpoints private so that they are only accessible from within your cloud environment. This adds an additional layer of security by limiting external access.

Regular audits:
Conduct regular audits of your cloud environment to identify and secure any public endpoints that don’t need to be exposed.

Assume compromise:
Work under the assumption that your environment could be compromised. By restricting public endpoints, you’ll limit the potential damage an attacker can do if they gain access.

Here’s an example
Imagine a storage bucket that contains sensitive data. If this bucket has a public endpoint, anyone with the right credentials can access it from anywhere. By restricting access to specific IP addresses or making the endpoint private, you’ll ensure that only authorized users within your network can interact with that data. This will greatly reduce the risk of unauthorized access and data breaches.

#5 Overly exposed internal networking

The last cloud misconfiguration on our list is overly exposed internal networking. This issue has been around for years, even in traditional on-premise networks. Organizations do often focus heavily on protecting their environments from external threats with robust perimeter defenses. But once an attacker gains a foothold inside your network, most of those internal protections aren’t enough to keep you safe.

The risks of overly exposed internal networking

In many cloud environments, internal network segmentation isn’t implemented to the level that it should be. Engineers might allow certain IP ranges to communicate freely for testing purposes, or resources might be deployed in a flat network structure. This lack of internal segmentation means that if an attacker compromises one resource, they can potentially move laterally within the network, accessing other resources that should be isolated.

For example, if an attacker compromises a virtual machine (VM) in the cloud, they could use it as a stepping stone to reach other sensitive applications or data. Without proper internal network segmentation, an attacker can move freely, increasing the risk of a significant breach.

Follow these steps to improve your internal network segmentation

Implement network segmentation:
Divide your cloud environment into smaller, isolated segments. Make sure that only necessary communication is allowed between these segments.

Use security groups and network ACLs:
Use security groups and network access control lists (ACLs) to define and enforce network traffic rules. This will help you restrict unnecessary communication between resources.

Regularly review and update your network policies:
Conduct regular reviews of your network policies to make sure that they are up-to-date and reflect the current needs of your environment. Remove any unnecessary permissions or access.

Keep an assume breach mindset:
Work under the assumption that a breach is possible. Evaluate the potential impact of a compromised resource and implement controls to minimize the blast radius.

Here’s an example
Consider an application that only needs to communicate with a database and a few other services. By segmenting the network and restricting the application’s communication to only these necessary services, you’ll be reducing the risk of lateral movement by an attacker. If the application is compromised, the attacker would have limited access and would be unable to reach other resources.

You don’t have to do it alone – Reversec can help

As the digital landscape evolves, avoiding these top 5 misconfigurations in the cloud will help your organization stay safe in an increasingly complicated and threat-laden world.

Here’s how we can help.

Assumed breach assessments
Think it can’t happen to you? Think again. We help organizations conduct assumed breach assessments by simulating attacks as specific user personas within your company. It’s a strategic service designed to identify potential actions from that user’s perspective and addresses vulnerabilities. It helps you identify issues and resolve them comprehensively – before a real breach happens.

Collaborative exercises
Staying on top of security – together. Collaborative exercises with a security consultancy like Reversec are invaluable to keeping your organization and sensitive data safe. These exercises offer fresh perspectives, helping you to identify and fix any misconfiguration issues you may have. Initial assumptions made during the setup of a cloud environment may no longer be relevant as services and features change. Regular reviews and updates are key to maintaining a secure environment.

The value of fresh perspectives
Having a fresh pair of eyes review your cloud environment can reveal things your team may have missed. Our security consultants can help you understand why certain configurations were set up in the first place and, when needed, propose more secure alternatives. Together, we’ll ensure that your environment is always secure and up-to-date with the latest best practices.

Challenging assumptions
Cloud environments are complex, and it’s crucial to question assumptions about service operations. Organizations should continuously review and understand the exact risks posed by their cloud configurations. Through our research-driven approach, we help companies in identifying the gap between what cloud providers claim is possible and what is practically achievable – essentially, the difference between theoretical and practical functionality.

Misconfigurations in cloud environments often occur due to misunderstood nuances to how cloud resources operate in any provider and due to a lack of regular holistic auditing of the deployed resources.

By assuming that breaches are possible, evaluating user access, conducting assumed breach assessments, and collaborating with security experts, you can significantly enhance your organization’s cloud security posture. This proactive approach will not only secure your environment but also make it more resilient against potential attacks.

If you would like to hear more about how we can help you protect your cloud environment, don’t hesitate to contact us!

Our Cloud Security experts are happy to help.

CloudWatch Dashboard (Over)Sharing

January 16, 2025 | Leonidas Tsaousis

Generative AI security: Findings from our research

At Reversec, we distinguish ourselves through our research-led approach. Our consultants dedicate part of their time to research – exploring the security of emerging technologies, such as GenAI. This approach allows us to stay ahead of new threats and provide real-world, actionable insights to our clients.

Our focus on GenAI security stems from the increasing adoption of large language models (LLMs) – like ChatGPT and Google Gemini – in enterprise environments. These systems, while powerful, introduce new and complex security risks, particularly around prompt injection attacks.

With our ongoing research on GenAI, we are looking to continuously deepen the understanding and raise awareness of these vulnerabilities to help organizations defend against potential exploits and ensure that they can safely leverage AI technologies.

Explore our GenAI security
research and experiments

Spikee – Simple Prompt Injection Kit for Evaluation and Exploitation

Spikee is an open-source toolkit that we created to help security practitioners and developers test LLM applications for prompt injection attacks that can lead to exploitation – examples of which include data exfiltration, XSS (Cross-Site Scripting), and resource exhaustion. A key feature of spikee is that it makes it easy to create custom datasets tailored to specific use cases, rather than flooding applications with many generic jailbreak attacks.

Check out the code: https://github.com/WithSecureLabs/spikee
Tutorial: https://labs.withsecure.com/tools/spikee

Visit spikee’s website

Multi-Chain Prompt Injection Attacks

We introduce multi-chain prompt injection, an exploitation technique targeting applications that chain multiple LLM calls to process and refine tasks sequentially.

Current testing methods for jailbreak and prompt injection vulnerabilities fall short in multi-chain scenarios, where queries are rewritten, passed through plugins, and formatted (e.g., XML/JSON), obscuring attack success. Multi-chain prompt injection exploits interactions between chains, bypassing intermediate processing and propagating adversarial prompts to achieve malicious objectives.

Explore the technique: Try our sample app or the related Colab Notebook. We also have a public CTF challenge to experiment hands-on.

Read on Reversec Labs

Prompt Injection in JetBrains Rider AI Assistant

This advisory explore security issues that could arise when using GenAI assistants integrated within software development IDEs.

Specifically, we demonstrate how prompt injection injection can be leveraged by an attacker to exfiltrate confidential information from a development environment. If an untrusted code snippet containing a malicious payload is passed to the AI Assistant (for example, when asking for an explanation), the injected instructions will be executed by the underlying LLM.

We also provide recommendations to developers of such GenAI code assistants on how to mitigate these issues and reduce their impact.

Read on Reversec Labs

Should you let ChatGPT control your browser?

This research investigates the security risks of granting LLMs control over web browsers.

We explore scenarios where attackers exploit prompt injection vulnerabilities to hijack browser agents, leading to sensitive data exfiltration or unauthorized actions, such as merging malicious code into repositories.

Our findings are relevant to developers and organizations that integrate LLMs into browser environments, highlighting the critical security measures needed to protect users and systems from such attacks.

Read on Reversec Labs

Fine-tuning LLMs to resist indirect prompt injection attacks

We fine-tuned Llama3-8B to enhance its resilience against indirect prompt injection attacks, where malicious inputs can manipulate a language model’s behavior in tasks like summarizing emails or processing documents.

Building on insights from Microsoft and OpenAI, our approach focused on using specialized markers to separate trusted instructions from user data. This exploration allowed us to understand the effectiveness of these methods in real-world scenarios.

Organizations and developers can access our model and fine-tuning scripts on Hugging Face and Ollama to test and extend our findings.

Read on Reversec Labs

When your AI assistant has an evil twin (Google Gemini prompt injection)

In this piece of research, we demonstrate how Google Gemini Advanced can be manipulated through prompt injection attacks to perform social engineering. By sending a single email to the target user, attackers can trick LLMs into misleading users into revealing sensitive information.

This research is crucial for enterprises using LLMs in environments that handle confidential data, as it exposes how easily attackers can compromise the integrity of the AI’s output.

We also discuss the current limitations in defending against these attacks and emphasize the need for caution when using AI-powered assistants.

Read on Reversec Labs

Synthetic recollections (prompt injection in ReAct agents)

This research focuses on how prompt injection can be used to manipulate ReAct (Reason and Act) LLM agents by injecting false observations or commands. Such attacks can lead the LLM to make incorrect decisions or take unintended actions, which could benefit attackers.

This type of vulnerability is particularly concerning for organizations using LLM agents in decision-making processes.

We outline two main categories of attacks, discuss their potential impact on operations, and suggest strategies to mitigate the risks associated with these manipulations.

Read on Reversec Labs

Domain-specific prompt injection detection with BERT classifier

We explore how domain-specific prompt injection attacks can be detected using a fine-tuned BERT classifier. This research provides a practical method for identifying malicious prompts by training a small model on domain-specific data, allowing it to distinguish between legitimate and malicious inputs.

It’s aimed at security teams and developers working with AI models who need reliable mechanisms to detect prompt injections.

We present our findings on the effectiveness of this approach and how it can be implemented in various environments.

Read on Reversec Labs

LLM Application Security Canavas

This one-page canvas offers a structured framework of security controls designed to protect LLM applications from prompt injection and jailbreak attacks.

Based on our client engagements, the canvas outlines the essential steps needed to secure inputs and outputs in LLM workflows.

It’s a practical tool for developers and security teams looking to safeguard their AI systems by implementing tested controls that reduce the likelihood of prompt-based attacks and mitigate their impact.

Download

Generative AI – An Attacker’s View

This blog delves into the use of GenAI (Generative Artificial Intelligence) by threat actors and what we can do to defend ourselves.

Topics covered include: social engineering, phishing, recon and malware generation.

Read on Reversec Labs

GenAI security challenges

To foster hands-on learning, we’ve released two public Capture the Flag (CTF) challenges focused on LLM security.

The challenges are designed for security researchers and practitioners who want practical experience with LLM security issues.

MyLLMBank

This challenge allows you to experiment with jailbreaks/prompt injection against LLM chat agents that use ReAct to call tools.

https://myllmbank.com

MyLLMDoc

This is an advanced challenge focusing on multi-chain prompt injection scenarios.

https://myllmdoc.com/

Prompt injections could confuse AI-powered agents

May 17, 2024 | Reversec

Everyone knows about SQL injections, but what about prompt injections? What do they mean for AI?

AI in general, and large language models (LLMs) in particular, have exploded in popularity. And this momentum is likely to continue or even increase as companies look into using LLMs to power AI applications capable of interacting with real people and take actions that affect the world around them.

Reversec’s Donato Capitella, a security consultant and researcher, wanted to explore how attackers could potentially compromise these agents. And prompt injection techniques gave him his answer.

The Echoes of SQL Vulnerabilities

Prompt injection techniques are specifically crafted inputs that attackers feed to LLMs as part of a prompt to manipulate responses. In a sense, they’re similar to the SQL injections that attackers have been using to attack databases for years. Injections are basically just commands attackers input to a vulnerable, external system.

Whereas SQL injections affect databases, prompt injections impact LLMs. Sometimes, a successful prompt injection might not have much of an impact. As Donato points out in his research (available here), in situations where the LLM is isolated from other users or systems, an injection probably won’t be able to do much damage.

However, companies aren’t building LLM applications to work in isolation, so they should understand what risks they’re exposing themselves if they neglect to secure these AI deployments.

ReAct Agents

One potential innovation where LLMs could play a key role is in the creation of AI agents—or ReAct (reasoning plus action) agents if you want to be specific. These agents are essentially programs that use LLMs (like GPT-4) to accept input, and then use logical reasoning to decide and execute a specific course of action according to its programming.

The way these agents use reasoning to make decisions involves a thought/observation loop. Specifics are available in Donato’s research on Reversec Labs (we highly recommend reading it for a more detailed explanation). Basically, the agent provides thoughts it has about a particular prompt it’s been given. That output is then checked to see if it contains an action that requires the agent to access a particular tool it’s programmed to use.

If the thought requires the agent to take an action, the result of the action becomes an observation. The observation is then incorporated into the output, which is then fed back into the thought/observation loop and repeated until the agent has addressed the initial prompt from the user.

To illustrate this process, and learn how to compromise it, Donato created a chatbot for a fictional book-selling website that can help customers request information on recent orders or ask for refunds.

Prompt injections reduce AI to confused deputies

The chatbot, powered by GPT-4, could access order data for users and determine refund eligibility for orders that were not delivered within the website’s two-week delivery timeframe (as per its policy).

Donato found that he could use several different prompt injection techniques to trick the agent into processing refunds for orders that should have been ineligible. Specifics are available in his blog, but he essentially tricked the agent into thinking that it had already checked for information from its system that he actually provided to it via prompts—information like fake order dates. Since the agent thought it recalled the fake dates from the appropriate system (rather than via Donato’s prompts), it didn’t realize the information was fake, and that it was being tricked.

Here’s a video showing one of the techniques Donato used:

Securing AI agents

Pointing to the work from the OWASP Top Ten for LLMs, Donato’s research identifies several ways an attacker could compromise an LLM ReAct agent. And while it’s a proof-of-concept, it does illustrate the kind of work that organizations need to do to secure these types of AI applications, and what the cyber security industry is doing to help.

There’s two distinct yet related mitigation strategies.

The first is to limit the potential damage a successful injection attack can cause. Specific recommendations based on Donato’s research include:

Enforcing stringent privilege controls to ensure LLMs can access only the essentials, minimizing potential breach points.
Incorporating human oversight for critical operations to add a layer of validation, acting as a safeguard against unintended LLM actions.
Adopting solutions such as OpenAI Chat Markup Language (ChatML) that attempt to segregate genuine user prompts from other content. These are not perfect but diminish the influence of external or manipulated inputs.
Treating the LLMs as untrusted, always maintaining external control in decision-making and being vigilant of potentially untrustworthy LLM responses.

The second is to secure any tools or systems that the agent may have access to, as compromises in those will inevitably lead to the agent making bad decisions—possibly in service of an attacker.

Read more research articles on securing AI agents on our Labs:

Table of contents

Summary

Intro to mainframes

Common attack paths in mainframe environments

Mainframe security testing: platform vs. application

Strengthen your critical systems with a holistic mainframe security assessment that prevents breaches and reduces your attack surface.

The reality of mainframe security testing

Why a combined platform and application approach matters

Testing frequency: Compliance vs. proactivity

Common misconceptions and hidden risks

Mainframe security: Best practices for organizations

Industry-specific mainframe security practices

Financial institutions: More than compliance

Aviation: Legacy systems that still fly

Avoid minimum viable testing

How Reversec can help

Q&A

Table of contents

Mainframe Security

Related content

Penetration testing: Our approach

Mainframe Security

DORA compliance: Testing that makes sense

Table of contents

#1 Excessive privileges

Here are some practical ways to limit excessive privileges

#2 Poor secrets management

The risks of poor secrets management

Follow these steps to improve secrets management

#3 Long-lived credentials

Take these practical steps to mitigate the risks of long-lived credentials

#4 Exposure of public endpoints

The risks of public endpoints

Here are some simple tips on how to manage public endpoints

#5 Overly exposed internal networking

The risks of overly exposed internal networking

Follow these steps to improve your internal network segmentation

You don’t have to do it alone – Reversec can help

Table of contents

Related content

How to run successful Kubernetes attack simulations?

Explore our GenAI securityresearch and experiments

Spikee – Simple Prompt Injection Kit for Evaluation and Exploitation

Multi-Chain Prompt Injection Attacks

Prompt Injection in JetBrains Rider AI Assistant

Should you let ChatGPT control your browser?

Fine-tuning LLMs to resist indirect prompt injection attacks

When your AI assistant has an evil twin (Google Gemini prompt injection)

Synthetic recollections (prompt injection in ReAct agents)

Domain-specific prompt injection detection with BERT classifier

LLM Application Security Canavas

Generative AI – An Attacker’s View

GenAI security challenges

MyLLMBank

MyLLMDoc

Related content

Generative AI Security

(Deep)Learning to trust

Building secure LLM apps into your business

Prompt injections could confuse AI-powered agents

Table of contents

The Echoes of SQL Vulnerabilities

ReAct Agents

Prompt injections reduce AI to confused deputies

Securing AI agents

Table of contents

Related content

Building secure LLM apps into your business

Striking the balance – EU AI Act and its impact on cybersecurity

What is Attack Path Mapping

Explore our GenAI security
research and experiments