REST API Load Testing

Performance test your REST APIs with k6, Artillery, and Locust — benchmarks, thresholds, and CI/CD integration

Test Types

Different test types answer different questions about your API's performance. Always establish a baseline before adding load.

| Test Type | Question Answered | Pattern |
| --- | --- | --- |
| Load test | Does the API handle expected traffic? | Ramp to target VUs, sustain, ramp down |
| Stress test | What is the breaking point? | Increase VUs until errors appear |
| Spike test | How does the API handle sudden bursts? | Instant jump to 10× normal load |
| Soak test | Does it degrade over time (memory leaks)? | Sustained load for 2–24 hours |
| Smoke test | Does it work at all? | 1–2 VUs, verify basic functionality |
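Two of these traffic shapes, expressed as k6 `stages` (the same syntax used in the k6 script later in this guide). The durations and VU counts here are illustrative, not recommendations:

```javascript
// Spike test: sudden 10× jump from baseline, hold briefly, then recover
export const spikeStages = [
  { duration: '10s', target: 10  },   // baseline
  { duration: '10s', target: 100 },   // sudden 10× burst
  { duration: '1m',  target: 100 },   // hold the spike
  { duration: '10s', target: 10  },   // recover
];

// Soak test: moderate load sustained for hours to surface leaks
export const soakStages = [
  { duration: '5m', target: 30 },     // gentle ramp up
  { duration: '4h', target: 30 },     // sustain
  { duration: '5m', target: 0  },     // ramp down
];
```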

Key Metrics

  • p50 latency — median response time; 50% of requests are faster than this
  • p95 latency — 95% of requests are faster; the standard SLO target
  • p99 latency — 99% of requests are faster; captures tail latency that affects real users
  • Throughput (RPS) — requests per second your API sustains
  • Error rate — % of 4xx/5xx responses; should stay below 1% under load
  • Virtual users (VUs) — concurrent simulated users
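To make the percentile definitions concrete, here is a minimal nearest-rank calculation over a batch of latency samples. Real tools use streaming estimators (e.g. histograms) and may interpolate, so their numbers can differ slightly:

```javascript
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are <= it. Illustration only, not how k6 computes it internally.
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank - 1, 0)];
}

const latencies = [23, 45, 52, 60, 72, 80, 95, 120, 310, 890];  // ms
console.log(percentile(latencies, 50));  // 72  — half the requests were faster
console.log(percentile(latencies, 95));  // 890 — one slow outlier dominates the tail
```

Note how a single 890 ms outlier leaves p50 untouched but defines p95 and p99: this is why tail percentiles, not averages, are the standard SLO targets.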

k6 — Cloud-Native Load Testing

k6 is a Go-based load testing tool with a JavaScript scripting API. It is widely adopted by DevOps and platform teams, particularly for its CI/CD integration and built-in thresholds.

brew install k6   # macOS — k6 ships as a standalone binary, not an npm package; see k6.io for Linux/Windows installs
// load-test.js
import http from 'k6/http';
import { check, sleep, group } from 'k6';
import { Rate, Trend } from 'k6/metrics';

const errorRate   = new Rate('errors');
const createTime  = new Trend('create_order_duration');

export const options = {
  stages: [
    { duration: '30s', target: 10  },   // ramp up to 10 VUs
    { duration: '2m',  target: 50  },   // ramp to 50 VUs
    { duration: '1m',  target: 50  },   // stay at 50 VUs
    { duration: '30s', target: 0   },   // ramp down
  ],
  thresholds: {
    http_req_duration:        ['p(95)<500'],  // 95% of requests under 500ms
    http_req_failed:          ['rate<0.01'],  // less than 1% errors
    create_order_duration:    ['p(99)<1000'], // order creation p99 under 1s
    errors:                   ['rate<0.05'],  // custom error rate
  },
};

const BASE_URL = __ENV.BASE_URL || 'https://api.example.com';
const TOKEN    = __ENV.API_TOKEN;

export default function () {
  group('List users', () => {
    const res = http.get(`${BASE_URL}/users?limit=20`, {
      headers: { Authorization: `Bearer ${TOKEN}` }
    });
    check(res, {
      'status 200':             r => r.status === 200,
      'response time < 200ms':  r => r.timings.duration < 200,
      'has data array':         r => JSON.parse(r.body).data !== undefined,
    });
    errorRate.add(res.status >= 400);
  });

  group('Create order', () => {
    const start = Date.now();
    const res = http.post(
      `${BASE_URL}/orders`,
      JSON.stringify({
        customer_id: '550e8400-e29b-41d4-a716-446655440000',
        items: [{ product_id: 'prod_123', quantity: 1 }],
        idempotency_key: `test-${__VU}-${__ITER}`  // unique per VU + iteration
      }),
      {
        headers: {
          'Content-Type': 'application/json',
          Authorization: `Bearer ${TOKEN}`
        }
      }
    );
    createTime.add(Date.now() - start);
    check(res, {
      'order created': r => r.status === 201,
      'has order_id':  r => JSON.parse(r.body).order_id !== undefined,
    });
    errorRate.add(res.status >= 400);
  });

  sleep(1);  // think time between iterations
}
# Run locally
k6 run --env BASE_URL=https://api-staging.example.com --env API_TOKEN=xxx load-test.js

# Sample output:
# http_req_duration.............: avg=87ms  min=23ms  med=72ms  max=890ms  p(90)=198ms p(95)=312ms
# http_req_failed...............: 0.12% 6 out of 5000
# ✓ http_req_duration p(95) < 500ms
# ✓ http_req_failed rate < 1%
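k6 also exposes a `handleSummary` lifecycle hook that lets the script write its own end-of-test report to files, which is handy for CI artifacts. A minimal sketch; the `data` object's layout follows k6's end-of-test summary format, and the file names here are arbitrary:

```javascript
// Add to load-test.js. Runs once after the test finishes; returned keys are
// file paths (or 'stdout'/'stderr') and values are the content to write.
export function handleSummary(data) {
  const p95 = data.metrics.http_req_duration.values['p(95)'];
  return {
    'summary.json': JSON.stringify(data, null, 2),  // full metrics dump for CI
    stdout: `p(95)=${p95.toFixed(0)}ms\n`,          // one-line console summary
  };
}
```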

Artillery — YAML-First Load Testing

npm install -g artillery
# artillery.yaml
config:
  target: "https://api.example.com"
  http:
    timeout: 10
  phases:
    - duration: 60
      arrivalRate: 10
      name: "Warm up"
    - duration: 120
      arrivalRate: 50
      rampTo: 100
      name: "Ramp up"
  defaults:
    headers:
      Authorization: "Bearer {{ $env.API_TOKEN }}"
      Content-Type: "application/json"

scenarios:
  - name: "User workflow"
    weight: 70
    flow:
      - get:
          url: "/users"
          expect:
            - statusCode: 200
            - contentType: json
      - post:
          url: "/orders"
          json:
            customer_id: "{{ $randomString(10) }}"
            items:
              - product_id: "prod_123"
                quantity: 1
          expect:
            - statusCode: 201

  - name: "Read-only"
    weight: 30
    flow:
      - get:
          url: "/products?limit=20"
artillery run --output results.json artillery.yaml
artillery report results.json  # generates HTML report
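Artillery scenarios can call out to JavaScript for values YAML cannot express, via a processor module (`processor: "./functions.js"` under `config:`, then `- function: "setIdempotencyKey"` as a flow step). A sketch with a hypothetical helper that stores a per-request idempotency key; the function and variable names are illustrative:

```javascript
// functions.js — Artillery processor module. Flow functions receive
// (context, events, done) and must call done() when finished.
'use strict';

function setIdempotencyKey(context, events, done) {
  // Values placed on context.vars are available in the YAML as {{ idempotencyKey }}
  context.vars.idempotencyKey =
    `test-${Date.now()}-${Math.random().toString(36).slice(2, 10)}`;
  return done();
}

module.exports = { setIdempotencyKey };
```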

Locust — Python Load Testing

# locustfile.py
from locust import HttpUser, task, between
import json, uuid

class APIUser(HttpUser):
    wait_time = between(1, 3)  # 1–3s between tasks

    def on_start(self):
        # Login and store token
        res = self.client.post('/auth/token', json={
            'client_id': 'test', 'client_secret': 'secret'
        })
        self.token = res.json()['access_token']
        self.headers = {'Authorization': f'Bearer {self.token}'}

    @task(3)  # weight 3: called 3x more often
    def list_users(self):
        self.client.get('/users?limit=20', headers=self.headers)

    @task(1)
    def create_order(self):
        self.client.post('/orders',
            json={
                'customer_id': str(uuid.uuid4()),
                'items': [{'product_id': 'prod_123', 'quantity': 1}],
                'idempotency_key': str(uuid.uuid4())
            },
            headers=self.headers
        )
# Run
locust --headless --users 50 --spawn-rate 5 --run-time 2m --host https://api.example.com

Tool Comparison

| Tool | Language | Best For | Cloud |
| --- | --- | --- | --- |
| k6 | JavaScript | DevOps teams, CI/CD, detailed metrics | k6 Cloud |
| Artillery | YAML + JS | YAML-driven workflows, quick setup | Artillery Cloud |
| Locust | Python | Python teams, complex scenarios | Self-hosted |
| JMeter | XML/Java | Enterprise teams, legacy systems | BlazeMeter |
| Gatling | Scala/Java | Java ecosystem, high-accuracy reports | Gatling Cloud |

Interpreting Results

Common failure patterns and what they mean:

  • p95 rises sharply at N VUs — you've hit a resource bottleneck (CPU, DB connections, thread pool). Find the bottleneck with monitoring.
  • Error rate spikes at high load — rate limiting is triggering, or the server is exhausting its connection pool.
  • Latency grows over time in soak test — memory leak, connection pool exhaustion, or cache pollution.
  • First 5s are slow then fast — JIT/warmup effect; exclude from metrics or pre-warm.
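When a test hits its wall at N VUs, Little's Law is a useful sanity check that your VU count and target throughput are even consistent: concurrency = arrival rate × time in system. A quick back-of-the-envelope helper (the formula is standard; the function is just for illustration):

```javascript
// Little's Law applied to load testing:
//   VUs ≈ target RPS × (avg response time + think time), all times in seconds.
// If your test can't reach the target RPS with this many VUs, latency has grown.
function requiredVUs(targetRps, avgLatencySec, thinkTimeSec) {
  return Math.ceil(targetRps * (avgLatencySec + thinkTimeSec));
}

// 100 RPS with 200 ms responses and 1 s think time needs ~120 VUs
console.log(requiredVUs(100, 0.2, 1));  // 120
```

The corollary is diagnostic: if throughput plateaus while you keep adding VUs, the extra time is going into queueing, and latency must be rising to balance the equation.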

CI/CD Integration

# .gitlab-ci.yml
load-test:
  stage: test
  image:
    name: grafana/k6
    entrypoint: [""]    # the image's default entrypoint is `k6`; clear it so script commands run in a shell
  script:
    - k6 run
        --env BASE_URL=$STAGING_URL
        --env API_TOKEN=$STAGING_TOKEN
        --out json=results.json
        load-test.js
  artifacts:
    when: always        # keep results even when thresholds fail the job
    paths:
      - results.json
  only:
    - main

Run load tests against your staging environment on every merge to main. Set thresholds to fail the pipeline if p95 latency or error rate exceeds acceptable limits. See API Testing guide for unit and integration test patterns.
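k6 exits with a non-zero code when a threshold fails, which is what fails the pipeline. Thresholds can also abort a run early instead of burning the full test duration, which keeps failing pipelines fast; this uses k6's long-form threshold syntax:

```javascript
// In load-test.js: abort the whole run as soon as the error rate crosses 1%,
// after an initial grace period so a couple of warmup errors don't kill the test.
export const options = {
  thresholds: {
    http_req_failed: [
      { threshold: 'rate<0.01', abortOnFail: true, delayAbortEval: '10s' },
    ],
  },
};
```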