Load testing Python chat app using RAG with Locust

This article walks through load testing a Python chat application that uses the RAG pattern with Locust, a popular open-source load testing tool. The primary objective of load testing is to ensure that the expected load on your chat application doesn't exceed the current Azure OpenAI tokens-per-minute (TPM) quota. By simulating user behavior under heavy load, you can identify potential bottlenecks and scalability issues in your application, so that it stays responsive and reliable under a high volume of user requests.
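To make the quota check concrete, the following minimal sketch estimates the token throughput that a given number of simulated users would generate and compares it with a quota. The function names and every sample figure (requests per user per minute, tokens per request, and the quota itself) are illustrative assumptions, not values taken from this article or from Azure; substitute numbers measured from your own deployment.

```python
def estimated_tokens_per_minute(
    users: int,
    requests_per_user_per_minute: float,
    tokens_per_request: float,
) -> float:
    """Rough upper bound on tokens consumed per minute by the simulated users.

    tokens_per_request should include both prompt and completion tokens.
    """
    return users * requests_per_user_per_minute * tokens_per_request


def fits_quota(estimated_tpm: float, quota_tpm: float) -> bool:
    """Return True when the estimated load stays within the quota."""
    return estimated_tpm <= quota_tpm


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    users = 20
    requests_per_user_per_minute = 2
    tokens_per_request = 1500
    quota_tpm = 30_000

    estimate = estimated_tokens_per_minute(
        users, requests_per_user_per_minute, tokens_per_request
    )
    print(f"Estimated tokens per minute: {estimate}")
    print(f"Within quota: {fits_quota(estimate, quota_tpm)}")
```

An estimate like this only tells you where to expect trouble. The load test itself shows how the deployed app behaves when the quota is approached.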

Watch the demonstration video to understand more about load testing the chat app.

Note

This article uses one or more AI app templates as the basis for its examples and guidance. AI app templates provide well-maintained, easy-to-deploy reference implementations that help ensure a high-quality starting point for your AI apps.

Prerequisites

Open the load test sample app

The load test is included in the Python chat app solution as a Locust test. Return to that article, deploy the solution, and then use its dev container development environment to complete the following steps.

Run the test

  1. Install the dependencies for the load test.

    python3 -m pip install -r requirements-dev.txt
    
  2. Start Locust, which uses the Locust test file, locustfile.py, found at the root of the repository.

    locust
    
  3. Open the running Locust website, such as http://localhost:8089.

  4. Enter the following values in the Locust website.

    Property         Value
    Number of users  20
    Ramp up          1
    Host             https://<YOUR-CHAT-APP-URL>.azurewebsites.net

    Screenshot of Locust test with values filled in.

  5. Select Start Swarm to start the test.

  6. Select Charts to watch the test progress.

    Screenshot of Locust chart during test run.

Clean up resources

When you're done with load testing, clean up the resources. The Azure resources created in this article are billed to your Azure subscription. If you don't expect to need these resources in the future, delete them to avoid incurring more charges. After you delete the resources specific to this article, remember to return to the other chat app tutorial and follow its clean-up steps.

Return to the chat app article to clean up those resources.

Get help

If you have trouble using this load tester, log your issue in the repository's Issues.