Claude Chat Completions Batch Test (BatchTest)¶
📝 Introduction¶
This document describes a batch test suite for Claude models using the OpenAI Chat Completions compatible endpoint (POST /v1/chat/completions). It helps validate capabilities and parameter behaviors such as thinking/reasoning, SSE streaming, function calling, response_format, long-context handling, and common generation parameters. Default gateway: https://api-cs-al.naci-tech.com/v1.
For the full payload examples and advanced cases, see the Chinese version:
/api/claude-chat-batchtest/.
1. Project structure¶
The Claude batch test only needs the following directory and files (output/ will be created after running):
ks_claude/
├── requirements.txt
├── test_models.py # batch test entry
└── output/ # generated after run, contains test_results.json
2. Install dependencies¶
Follow these steps in order:
Step 1: Enter the ks_claude directory (where requirements.txt and test_models.py are located).
Step 2: Create and activate a virtual environment (optional, recommended):
Step 3: Install dependencies from requirements.txt:
requirements.txt:
If requirements.txt is missing, you can install directly:
3. Configure environment variables¶
Step 1: Create a .env file in the ks_claude directory (or in the project root), or export the variable in your current shell.
Step 2: Set your API key:
The script loads .env automatically via python-dotenv (current directory or parent directory).
4. Run the tests¶
Step 1: Enter the Claude test directory:
Step 2: Run tests (choose one):
- Run all scenarios (no args):
- Run specific scenarios (scenario name or alias; multiple supported):
Step 3: Check results:
- The console prints a PASS/FAIL table for model × scenario.
- Full results are written to
ks_claude/output/test_results.json.
Models covered¶
claude-3-7-sonnet-20250219claude-sonnet-4-20250514claude-sonnet-4-5-20250929claude-haiku-4-5-20251001claude-opus-4-20250514claude-opus-4-5-20251101
📦 Outputs¶
- Console: PASS/FAIL table and summary
- Result file:
ks_claude/output/test_results.json
🧪 Scenarios & aliases¶
| Scenario | Description | Aliases |
|---|---|---|
| Thinking | thinking / reasoning output | thinking, think |
| Function Calling | tool calling (streaming) | fc, function |
| Tool Choice | compare tool_choice behavior |
tc, tool |
| JSON Object | response_format: json_object |
so, json |
| JSON Schema | response_format: json_schema |
js, schema |
| 200k Context | long-context stress test | ctx, 200k |
| max_tokens | truncation behavior | mt |
| max_completion_tokens | truncation behavior | mct |
| Gen Params | stop / streaming usage |
gp, params |