# REST API Load Testing
Performance test your REST APIs with k6, Artillery, and Locust — benchmarks, thresholds, and CI/CD integration
## Test Types
Different test types answer different questions about your API's performance. Always establish a baseline before adding load.
| Test Type | Question Answered | Pattern |
|---|---|---|
| Load test | Does the API handle expected traffic? | Ramp to target VUs, sustain, ramp down |
| Stress test | What is the breaking point? | Increase VUs until errors appear |
| Spike test | How does the API handle sudden bursts? | Instant jump to 10× normal load |
| Soak test | Does it degrade over time (memory leaks)? | Sustained load for 2–24 hours |
| Smoke test | Does it work at all? | 1–2 VUs, verify basic functionality |
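The ramp patterns in the table can be sketched as a piecewise-linear stage interpolator (a minimal illustration modelled on k6-style stages; `vus_at` and the stage list are hypothetical helpers, not part of any tool's API):

```python
def vus_at(t, stages, start_vus=0):
    """Return the VU count at time t (seconds) for a list of
    (duration_s, target_vus) stages, ramping linearly within each."""
    prev_target, elapsed = start_vus, 0
    for duration, target in stages:
        if t < elapsed + duration:
            frac = (t - elapsed) / duration
            return round(prev_target + frac * (target - prev_target))
        prev_target, elapsed = target, elapsed + duration
    return prev_target  # hold the last target after all stages end

# Load-test pattern: ramp to 10, ramp to 50, sustain, ramp down
stages = [(30, 10), (120, 50), (60, 50), (30, 0)]
```

Swapping the stage list gives the other patterns, e.g. a spike test is one very short stage jumping to 10× the normal target.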
## Key Metrics
- p50 latency — median response time; 50% of requests are faster than this
- p95 latency — 95% of requests are faster; the standard SLO target
- p99 latency — 99% of requests are faster; captures tail latency that affects real users
- Throughput (RPS) — requests per second your API sustains
- Error rate — % of 4xx/5xx responses; should stay below 1% under load
- Virtual users (VUs) — concurrent simulated users
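For quick offline analysis, these percentiles can be computed from a raw latency sample (a minimal sketch using the nearest-rank method; load-testing tools may interpolate slightly differently):

```python
import math

def percentile(latencies_ms, p):
    """Nearest-rank percentile: smallest sample value with at least
    p% of all samples at or below it."""
    data = sorted(latencies_ms)
    rank = math.ceil(p / 100 * len(data))
    return data[max(rank, 1) - 1]

# Hypothetical latency sample (ms)
latencies = [23, 45, 72, 80, 87, 95, 110, 198, 312, 890]
p50 = percentile(latencies, 50)  # median
p95 = percentile(latencies, 95)  # the usual SLO target
```

Note how a single slow outlier (890 ms) dominates the tail percentiles while leaving the median untouched.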
## k6 — Cloud-Native Load Testing

k6 is a Go-based load testing tool with a JavaScript scripting API. Its built-in thresholds and CI-friendly CLI make it a popular choice for DevOps and platform teams. Note that k6 is a standalone binary, not an npm package:

```bash
brew install k6   # macOS; Linux packages and a Windows installer are also available
```
```javascript
// load-test.js
import http from 'k6/http';
import { check, sleep, group } from 'k6';
import { Rate, Trend } from 'k6/metrics';

const errorRate = new Rate('errors');
const createTime = new Trend('create_order_duration');

export const options = {
  stages: [
    { duration: '30s', target: 10 }, // ramp up to 10 VUs
    { duration: '2m', target: 50 },  // ramp to 50 VUs
    { duration: '1m', target: 50 },  // stay at 50 VUs
    { duration: '30s', target: 0 },  // ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<500'],      // 95% of requests under 500ms
    http_req_failed: ['rate<0.01'],        // less than 1% errors
    create_order_duration: ['p(99)<1000'], // order creation p99 under 1s
    errors: ['rate<0.05'],                 // custom error rate
  },
};

const BASE_URL = __ENV.BASE_URL || 'https://api.example.com';
const TOKEN = __ENV.API_TOKEN;

export default function () {
  group('List users', () => {
    const res = http.get(`${BASE_URL}/users?limit=20`, {
      headers: { Authorization: `Bearer ${TOKEN}` },
    });
    check(res, {
      'status 200': (r) => r.status === 200,
      'response time < 200ms': (r) => r.timings.duration < 200,
      'has data array': (r) => JSON.parse(r.body).data !== undefined,
    });
    errorRate.add(res.status >= 400);
  });

  group('Create order', () => {
    const res = http.post(
      `${BASE_URL}/orders`,
      JSON.stringify({
        customer_id: '550e8400-e29b-41d4-a716-446655440000',
        items: [{ product_id: 'prod_123', quantity: 1 }],
        idempotency_key: `test-${__VU}-${__ITER}`, // unique per VU + iteration
      }),
      {
        headers: {
          'Content-Type': 'application/json',
          Authorization: `Bearer ${TOKEN}`,
        },
      }
    );
    // Use k6's own request timing rather than Date.now(), which would
    // also count script overhead such as JSON.stringify
    createTime.add(res.timings.duration);
    check(res, {
      'order created': (r) => r.status === 201,
      'has order_id': (r) => JSON.parse(r.body).order_id !== undefined,
    });
    errorRate.add(res.status >= 400);
  });

  sleep(1); // think time between iterations
}
```
```bash
# Run locally
k6 run --env BASE_URL=https://api-staging.example.com --env API_TOKEN=xxx load-test.js

# Sample output:
# http_req_duration.............: avg=87ms min=23ms med=72ms max=890ms p(90)=198ms p(95)=312ms
# http_req_failed...............: 0.12% 6 out of 5000
# ✓ http_req_duration p(95) < 500ms
# ✓ http_req_failed rate < 1%
```
## Artillery — YAML-First Load Testing

Artillery describes load phases and user scenarios declaratively in YAML, with JavaScript hooks available for custom logic.

```bash
npm install -g artillery
```
```yaml
# artillery.yaml
config:
  target: "https://api.example.com"
  http:
    timeout: 10
  plugins:
    expect: {}   # required for the `expect` assertions below
  phases:
    - duration: 60
      arrivalRate: 10
      name: "Warm up"
    - duration: 120
      arrivalRate: 50
      rampTo: 100
      name: "Ramp up"
  defaults:
    headers:
      Authorization: "Bearer {{ $env.API_TOKEN }}"
      Content-Type: "application/json"

scenarios:
  - name: "User workflow"
    weight: 70
    flow:
      - get:
          url: "/users"
          expect:
            - statusCode: 200
            - contentType: json
      - post:
          url: "/orders"
          json:
            customer_id: "{{ $randomString(10) }}"
            items:
              - product_id: "prod_123"
                quantity: 1
          expect:
            - statusCode: 201
  - name: "Read-only"
    weight: 30
    flow:
      - get:
          url: "/products?limit=20"
```
```bash
artillery run --output results.json artillery.yaml
artillery report results.json   # generates an HTML report
```
## Locust — Python Load Testing

Locust defines user behavior as plain Python classes and ships with a web UI for live monitoring; `--headless` runs it entirely from the CLI.
```python
# locustfile.py
from locust import HttpUser, task, between
import uuid

class APIUser(HttpUser):
    wait_time = between(1, 3)  # 1–3s between tasks

    def on_start(self):
        # Login once per simulated user and store the token
        res = self.client.post('/auth/token', json={
            'client_id': 'test', 'client_secret': 'secret',
        })
        self.token = res.json()['access_token']
        self.headers = {'Authorization': f'Bearer {self.token}'}

    @task(3)  # weight 3: picked 3x more often than weight-1 tasks
    def list_users(self):
        self.client.get('/users?limit=20', headers=self.headers)

    @task(1)
    def create_order(self):
        self.client.post(
            '/orders',
            json={
                'customer_id': str(uuid.uuid4()),
                'items': [{'product_id': 'prod_123', 'quantity': 1}],
                'idempotency_key': str(uuid.uuid4()),
            },
            headers=self.headers,
        )
```
```bash
# Run
locust --headless --users 50 --spawn-rate 5 --run-time 2m --host https://api.example.com
```
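The `@task` weights behave like weighted random sampling over a user's tasks. A rough sketch of the 3:1 split above, with `random.choices` standing in for Locust's internal scheduler:

```python
import random

random.seed(42)  # fixed seed, purely for a reproducible illustration
tasks = ['list_users', 'create_order']
weights = [3, 1]  # mirrors @task(3) and @task(1)

picks = random.choices(tasks, weights=weights, k=1000)
share = picks.count('list_users') / len(picks)  # close to 0.75
```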
## Tool Comparison
| Tool | Language | Best For | Cloud |
|---|---|---|---|
| k6 | JavaScript | DevOps teams, CI/CD, detailed metrics | k6 Cloud |
| Artillery | YAML + JS | YAML-driven workflows, quick setup | Artillery Cloud |
| Locust | Python | Python teams, complex scenarios | Self-hosted |
| JMeter | XML/Java | Enterprise teams, legacy systems | BlazeMeter |
| Gatling | Scala/Java | Java ecosystem, high-accuracy reports | Gatling Cloud |
## Interpreting Results
Common failure patterns and what they mean:
- p95 rises sharply at N VUs — you've hit a resource bottleneck (CPU, DB connections, thread pool). Find the bottleneck with monitoring.
- Error rate spikes at high load — rate limiting is triggering, or the server is exhausting its connection pool.
- Latency grows over time in soak test — memory leak, connection pool exhaustion, or cache pollution.
- First 5s are slow then fast — JIT/warmup effect; exclude from metrics or pre-warm.
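The soak-test failure mode, latency growing over time, can be flagged automatically by fitting a trend line to windowed p95 samples (a minimal least-squares sketch; the sample values and thresholds are illustrative):

```python
def latency_slope(windowed_p95):
    """Least-squares slope (ms per window) over successive p95 samples.
    A persistently positive slope suggests a leak or pool exhaustion."""
    n = len(windowed_p95)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(windowed_p95) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, windowed_p95))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

# Hypothetical p95 per 10-minute window (ms)
healthy = [120, 118, 123, 121, 119, 122]  # noise around a flat baseline
leaking = [120, 135, 150, 168, 185, 204]  # steady upward drift
```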
## CI/CD Integration
```yaml
# .gitlab-ci.yml
load-test:
  stage: test
  image: grafana/k6
  script:
    - k6 run
      --env BASE_URL=$STAGING_URL
      --env API_TOKEN=$STAGING_TOKEN
      --out json=results.json
      load-test.js
  artifacts:
    when: always   # keep results even when thresholds fail the job
    paths:
      - results.json
  only:
    - main
```

Note that k6 does not emit JUnit XML natively; if you want GitLab test reports, convert the JSON output with a separate step.
Run load tests against your staging environment on every merge to main, and set thresholds so the pipeline fails when p95 latency or the error rate exceeds your limits. See the API Testing guide for unit and integration test patterns.
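The archived results.json is newline-delimited JSON of individual metric points. A minimal post-processing sketch (the `type`/`metric`/`data` field layout follows k6's documented JSON output, but verify it against your k6 version):

```python
import json
import math

def summarize(ndjson_lines, metric='http_req_duration'):
    """Collect one metric's samples from `k6 --out json` output and
    return (count, p95) using the nearest-rank method."""
    values = sorted(
        point['data']['value']
        for point in map(json.loads, ndjson_lines)
        if point.get('type') == 'Point' and point.get('metric') == metric
    )
    if not values:
        return 0, None
    rank = math.ceil(0.95 * len(values))
    return len(values), values[rank - 1]

# Hypothetical sample of five latency points (ms)
sample = [
    json.dumps({'type': 'Point', 'metric': 'http_req_duration',
                'data': {'time': '2026-01-01T00:00:00Z', 'value': v}})
    for v in [23, 72, 87, 312, 890]
]
count, p95 = summarize(sample)
```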