REST API Load Testing

Performance test your REST APIs with k6, Artillery, and Locust — benchmarks, thresholds, and CI/CD integration

Test Types

Different test types answer different questions about your API's performance. Always establish a baseline before adding load.

| Test Type | Question Answered | Pattern |
| --- | --- | --- |
| Load test | Does the API handle expected traffic? | Ramp to target VUs, sustain, ramp down |
| Stress test | What is the breaking point? | Increase VUs until errors appear |
| Spike test | How does the API handle sudden bursts? | Instant jump to 10× normal load |
| Soak test | Does it degrade over time (memory leaks)? | Sustained load for 2–24 hours |
| Smoke test | Does it work at all? | 1–2 VUs, verify basic functionality |
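Two of these traffic shapes, expressed as k6 `stages` (the same syntax used in the k6 script later in this guide). The durations and VU counts here are illustrative, not recommendations:

```javascript
// Spike test: sudden 10× jump from baseline, hold briefly, then recover
export const spikeStages = [
  { duration: '10s', target: 10  },   // baseline
  { duration: '10s', target: 100 },   // sudden 10× burst
  { duration: '1m',  target: 100 },   // hold the spike
  { duration: '10s', target: 10  },   // recover
];

// Soak test: moderate load sustained for hours to surface leaks
export const soakStages = [
  { duration: '5m', target: 30 },     // gentle ramp up
  { duration: '4h', target: 30 },     // sustain
  { duration: '5m', target: 0  },     // ramp down
];
```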

Key Metrics

  • p50 latency — median response time; 50% of requests are faster than this
  • p95 latency — 95% of requests are faster; the standard SLO target
  • p99 latency — 99% of requests are faster; captures tail latency that affects real users
  • Throughput (RPS) — requests per second your API sustains
  • Error rate — % of 4xx/5xx responses; should stay below 1% under load
  • Virtual users (VUs) — concurrent simulated users
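To make the percentile definitions concrete, here is a minimal nearest-rank calculation over a batch of latency samples. Real tools use streaming estimators (e.g. histograms) and may interpolate, so their numbers can differ slightly:

```javascript
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are <= it. Illustration only, not how k6 computes it internally.
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank - 1, 0)];
}

const latencies = [23, 45, 52, 60, 72, 80, 95, 120, 310, 890];  // ms
console.log(percentile(latencies, 50));  // 72  — half the requests were faster
console.log(percentile(latencies, 95));  // 890 — one slow outlier dominates the tail
```

Note how a single 890 ms outlier leaves p50 untouched but defines p95 and p99: this is why tail percentiles, not averages, are the standard SLO targets.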

k6 — Cloud-Native Load Testing

k6 is a Go-based load testing tool with a JavaScript scripting API. It is widely adopted by DevOps and platform teams, particularly for its CI/CD integration and built-in thresholds.

brew install k6   # macOS — k6 ships as a standalone binary, not an npm package; see k6.io for Linux/Windows installs
// load-test.js
import http from 'k6/http';
import { check, sleep, group } from 'k6';
import { Rate, Trend } from 'k6/metrics';

const errorRate   = new Rate('errors');
const createTime  = new Trend('create_order_duration');

export const options = {
  stages: [
    { duration: '30s', target: 10  },   // ramp up to 10 VUs
    { duration: '2m',  target: 50  },   // ramp to 50 VUs
    { duration: '1m',  target: 50  },   // stay at 50 VUs
    { duration: '30s', target: 0   },   // ramp down
  ],
  thresholds: {
    http_req_duration:        ['p(95)<500'],  // 95% of requests under 500ms
    http_req_failed:          ['rate<0.01'],  // less than 1% errors
    create_order_duration:    ['p(99)<1000'], // order creation p99 under 1s
    errors:                   ['rate<0.05'],  // custom error rate
  },
};

const BASE_URL = __ENV.BASE_URL || 'https://api.example.com';
const TOKEN    = __ENV.API_TOKEN;

export default function () {
  group('List users', () => {
    const res = http.get(`${BASE_URL}/users?limit=20`, {
      headers: { Authorization: `Bearer ${TOKEN}` }
    });
    check(res, {
      'status 200':             r => r.status === 200,
      'response time < 200ms':  r => r.timings.duration < 200,
      'has data array':         r => JSON.parse(r.body).data !== undefined,
    });
    errorRate.add(res.status >= 400);
  });

  group('Create order', () => {
    const start = Date.now();
    const res = http.post(
      `${BASE_URL}/orders`,
      JSON.stringify({
        customer_id: '550e8400-e29b-41d4-a716-446655440000',
        items: [{ product_id: 'prod_123', quantity: 1 }],
        idempotency_key: `test-${__VU}-${__ITER}`  // unique per VU + iteration
      }),
      {
        headers: {
          'Content-Type': 'application/json',
          Authorization: `Bearer ${TOKEN}`
        }
      }
    );
    createTime.add(Date.now() - start);
    check(res, {
      'order created': r => r.status === 201,
      'has order_id':  r => JSON.parse(r.body).order_id !== undefined,
    });
    errorRate.add(res.status >= 400);
  });

  sleep(1);  // think time between iterations
}
# Run locally
k6 run --env BASE_URL=https://api-staging.example.com --env API_TOKEN=xxx load-test.js

# Sample output:
# http_req_duration.............: avg=87ms  min=23ms  med=72ms  max=890ms  p(90)=198ms p(95)=312ms
# http_req_failed...............: 0.12% 6 out of 5000
# ✓ http_req_duration p(95) < 500ms
# ✓ http_req_failed rate < 1%
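k6 also exposes a `handleSummary` lifecycle hook that lets the script write its own end-of-test report to files, which is handy for CI artifacts. A minimal sketch; the `data` object's layout follows k6's end-of-test summary format, and the file names here are arbitrary:

```javascript
// Add to load-test.js. Runs once after the test finishes; returned keys are
// file paths (or 'stdout'/'stderr') and values are the content to write.
export function handleSummary(data) {
  const p95 = data.metrics.http_req_duration.values['p(95)'];
  return {
    'summary.json': JSON.stringify(data, null, 2),  // full metrics dump for CI
    stdout: `p(95)=${p95.toFixed(0)}ms\n`,          // one-line console summary
  };
}
```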

Artillery — YAML-First Load Testing

npm install -g artillery
# artillery.yaml
config:
  target: "https://api.example.com"
  http:
    timeout: 10
  phases:
    - duration: 60
      arrivalRate: 10
      name: "Warm up"
    - duration: 120
      arrivalRate: 50
      rampTo: 100
      name: "Ramp up"
  defaults:
    headers:
      Authorization: "Bearer {{ $env.API_TOKEN }}"
      Content-Type: "application/json"

scenarios:
  - name: "User workflow"
    weight: 70
    flow:
      - get:
          url: "/users"
          expect:
            - statusCode: 200
            - contentType: json
      - post:
          url: "/orders"
          json:
            customer_id: "{{ $randomString(10) }}"
            items:
              - product_id: "prod_123"
                quantity: 1
          expect:
            - statusCode: 201

  - name: "Read-only"
    weight: 30
    flow:
      - get:
          url: "/products?limit=20"
artillery run --output results.json artillery.yaml
artillery report results.json  # generates HTML report
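Artillery scenarios can call out to JavaScript for values YAML cannot express, via a processor module (`processor: "./functions.js"` under `config:`, then `- function: "setIdempotencyKey"` as a flow step). A sketch with a hypothetical helper that stores a per-request idempotency key; the function and variable names are illustrative:

```javascript
// functions.js — Artillery processor module. Flow functions receive
// (context, events, done) and must call done() when finished.
'use strict';

function setIdempotencyKey(context, events, done) {
  // Values placed on context.vars are available in the YAML as {{ idempotencyKey }}
  context.vars.idempotencyKey =
    `test-${Date.now()}-${Math.random().toString(36).slice(2, 10)}`;
  return done();
}

module.exports = { setIdempotencyKey };
```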

Locust — Python Load Testing

# locustfile.py
from locust import HttpUser, task, between
import json, uuid

class APIUser(HttpUser):
    wait_time = between(1, 3)  # 1–3s between tasks

    def on_start(self):
        # Login and store token
        res = self.client.post('/auth/token', json={
            'client_id': 'test', 'client_secret': 'secret'
        })
        self.token = res.json()['access_token']
        self.headers = {'Authorization': f'Bearer {self.token}'}

    @task(3)  # weight 3: called 3x more often
    def list_users(self):
        self.client.get('/users?limit=20', headers=self.headers)

    @task(1)
    def create_order(self):
        self.client.post('/orders',
            json={
                'customer_id': str(uuid.uuid4()),
                'items': [{'product_id': 'prod_123', 'quantity': 1}],
                'idempotency_key': str(uuid.uuid4())
            },
            headers=self.headers
        )
# Run
locust --headless --users 50 --spawn-rate 5 --run-time 2m --host https://api.example.com

Tool Comparison

| Tool | Language | Best For | Cloud |
| --- | --- | --- | --- |
| k6 | JavaScript | DevOps teams, CI/CD, detailed metrics | k6 Cloud |
| Artillery | YAML + JS | YAML-driven workflows, quick setup | Artillery Cloud |
| Locust | Python | Python teams, complex scenarios | Self-hosted |
| JMeter | XML/Java | Enterprise teams, legacy systems | BlazeMeter |
| Gatling | Scala/Java | Java ecosystem, high-accuracy reports | Gatling Cloud |

Interpreting Results

Common failure patterns and what they mean:

  • p95 rises sharply at N VUs — you've hit a resource bottleneck (CPU, DB connections, thread pool). Find the bottleneck with monitoring.
  • Error rate spikes at high load — rate limiting is triggering, or the server is exhausting its connection pool.
  • Latency grows over time in soak test — memory leak, connection pool exhaustion, or cache pollution.
  • First 5s are slow then fast — JIT/warmup effect; exclude from metrics or pre-warm.
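When a test hits its wall at N VUs, Little's Law is a useful sanity check that your VU count and target throughput are even consistent: concurrency = arrival rate × time in system. A quick back-of-the-envelope helper (the formula is standard; the function is just for illustration):

```javascript
// Little's Law applied to load testing:
//   VUs ≈ target RPS × (avg response time + think time), all times in seconds.
// If your test can't reach the target RPS with this many VUs, latency has grown.
function requiredVUs(targetRps, avgLatencySec, thinkTimeSec) {
  return Math.ceil(targetRps * (avgLatencySec + thinkTimeSec));
}

// 100 RPS with 200 ms responses and 1 s think time needs ~120 VUs
console.log(requiredVUs(100, 0.2, 1));  // 120
```

The corollary is diagnostic: if throughput plateaus while you keep adding VUs, the extra time is going into queueing, and latency must be rising to balance the equation.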

CI/CD Integration

# .gitlab-ci.yml
load-test:
  stage: test
  image:
    name: grafana/k6
    entrypoint: [""]    # the image's default entrypoint is `k6`; clear it so script commands run in a shell
  script:
    - k6 run
        --env BASE_URL=$STAGING_URL
        --env API_TOKEN=$STAGING_TOKEN
        --out json=results.json
        load-test.js
  artifacts:
    when: always        # keep results even when thresholds fail the job
    paths:
      - results.json
  only:
    - main

Run load tests against your staging environment on every merge to main. Set thresholds to fail the pipeline if p95 latency or error rate exceeds acceptable limits. See API Testing guide for unit and integration test patterns.
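k6 exits with a non-zero code when a threshold fails, which is what fails the pipeline. Thresholds can also abort a run early instead of burning the full test duration, which keeps failing pipelines fast; this uses k6's long-form threshold syntax:

```javascript
// In load-test.js: abort the whole run as soon as the error rate crosses 1%,
// after an initial grace period so a couple of warmup errors don't kill the test.
export const options = {
  thresholds: {
    http_req_failed: [
      { threshold: 'rate<0.01', abortOnFail: true, delayAbortEval: '10s' },
    ],
  },
};
```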