Claude Chat Completions バッチテスト（BatchTest）¶

📝 概要¶

本ドキュメントは、OpenAI Chat Completions 互換エンドポイント（POST /v1/chat/completions）における Claude 系モデル のバッチ検証方法をまとめたものです。thinking / reasoning、SSE ストリーミング、Function Calling、response_format、長コンテキスト、一般的な生成パラメータなどの挙動をまとめて確認できます。デフォルトゲートウェイ：https://api-cs-al.naci-tech.com/v1。

**payload の詳細例や高度なケース**は中国語版にあります：/api/claude-chat-batchtest/。

1. ディレクトリ構成¶

Claude バッチテストは以下のディレクトリ／ファイルのみで実行できます（output/ は実行後に自動生成）：

ks_claude/
├── requirements.txt
├── test_models.py    # バッチテスト入口
└── output/           # 実行後に生成（test_results.json を含む）

2. 依存関係のインストール¶

以下の順に実行してください：

手順 1：ks_claude ディレクトリへ移動（requirements.txt と test_models.py がある場所）。

手順 2：仮想環境を作成して有効化（任意・推奨）：

python -m venv .venv
source .venv/bin/activate   # Windows: .venv\Scripts\activate

手順 3：依存関係をインストール（requirements.txt）：

pip install -r requirements.txt

requirements.txt の内容：

requirements.txt

httpx>=0.27.0
python-dotenv>=1.0.0

もし requirements.txt がない場合は直接インストールできます：

pip install httpx>=0.27.0 python-dotenv>=1.0.0

3. 環境変数の設定¶

手順 1：ks_claude ディレクトリ（またはプロジェクトルート）に .env を作成するか、現在のシェルに export します。

手順 2：API Key を設定します：

API_DEMO_API_KEY=your_api_key

スクリプトは python-dotenv により、同一ディレクトリまたは親ディレクトリの .env を自動で読み込みます。

4. 実行方法¶

手順 1：Claude テストディレクトリへ移動：

cd ks_claude

手順 2：テスト実行（どちらか）：

全シナリオ実行（引数なし）：

python test_models.py

指定シナリオのみ実行（シナリオ名または別名。複数指定可）：

python test_models.py thinking fc

手順 3：結果確認：

コンソールに「モデル × シナリオ」の PASS/FAIL 表が表示されます
完全な結果は ks_claude/output/test_results.json に出力されます

対象モデル¶

claude-3-7-sonnet-20250219
claude-sonnet-4-20250514
claude-sonnet-4-5-20250929
claude-haiku-4-5-20251001
claude-opus-4-20250514
claude-opus-4-5-20251101

📦 出力¶

コンソール：PASS/FAIL 表とサマリー
結果ファイル：ks_claude/output/test_results.json

🧪 シナリオと別名¶

シナリオ	説明	別名
Thinking	thinking / reasoning 出力	`thinking`, `think`
Function Calling	ツール呼び出し（ストリーミング）	`fc`, `function`
Tool Choice	`tool_choice` の挙動比較	`tc`, `tool`
JSON Object	`response_format: json_object`	`so`, `json`
JSON Schema	`response_format: json_schema`	`js`, `schema`
200k Context	長コンテキスト負荷テスト	`ctx`, `200k`
max_tokens	切り捨て挙動	`mt`
max_completion_tokens	切り捨て挙動	`mct`
Gen Params	`stop` / ストリーミング `usage`	`gp`, `params`