Claude Chat Completions Batch Test (BatchTest)¶

📝 Introduction¶

This document describes a batch test suite for Claude models using the OpenAI Chat Completions compatible endpoint (POST /v1/chat/completions). It helps validate capabilities and parameter behaviors such as thinking/reasoning, SSE streaming, function calling, response_format, long-context handling, and common generation parameters. Default gateway: https://api-cs-al.naci-tech.com/v1.

For the full payload examples and advanced cases, see the Chinese version: /api/claude-chat-batchtest/.

1. Project structure¶

The Claude batch test only needs the following directory and files (output/ will be created after running):

ks_claude/
├── requirements.txt
├── test_models.py    # batch test entry
└── output/           # generated after run, contains test_results.json

2. Install dependencies¶

Follow these steps in order:

Step 1: Enter the ks_claude directory (where requirements.txt and test_models.py are located).

Step 2: Create and activate a virtual environment (optional, recommended):

python -m venv .venv
source .venv/bin/activate   # Windows: .venv\Scripts\activate

Step 3: Install dependencies from requirements.txt:

pip install -r requirements.txt

requirements.txt:

requirements.txt

httpx>=0.27.0
python-dotenv>=1.0.0

If requirements.txt is missing, you can install directly:

pip install httpx>=0.27.0 python-dotenv>=1.0.0

3. Configure environment variables¶

Step 1: Create a .env file in the ks_claude directory (or in the project root), or export the variable in your current shell.

Step 2: Set your API key:

API_DEMO_API_KEY=your_api_key

The script loads .env automatically via python-dotenv (current directory or parent directory).

4. Run the tests¶

Step 1: Enter the Claude test directory:

cd ks_claude

Step 2: Run tests (choose one):

Run all scenarios (no args):

python test_models.py

Run specific scenarios (scenario name or alias; multiple supported):

python test_models.py thinking fc

Step 3: Check results:

The console prints a PASS/FAIL table for model × scenario.
Full results are written to ks_claude/output/test_results.json.

Models covered¶

claude-3-7-sonnet-20250219
claude-sonnet-4-20250514
claude-sonnet-4-5-20250929
claude-haiku-4-5-20251001
claude-opus-4-20250514
claude-opus-4-5-20251101

📦 Outputs¶

Console: PASS/FAIL table and summary
Result file: ks_claude/output/test_results.json

🧪 Scenarios & aliases¶

Scenario	Description	Aliases
Thinking	thinking / reasoning output	`thinking`, `think`
Function Calling	tool calling (streaming)	`fc`, `function`
Tool Choice	compare `tool_choice` behavior	`tc`, `tool`
JSON Object	`response_format: json_object`	`so`, `json`
JSON Schema	`response_format: json_schema`	`js`, `schema`
200k Context	long-context stress test	`ctx`, `200k`
max_tokens	truncation behavior	`mt`
max_completion_tokens	truncation behavior	`mct`
Gen Params	`stop` / streaming `usage`	`gp`, `params`