Kling AI Video Generation Full API Documentation (OpenAI & Native Formats)

This interface can be called through new-api in two ways:

  1. OpenAI Compatible Mode: suited to standard video clients; new-api maps the parameters to their native equivalents.
  2. Native Forwarding Mode: the parameter structure matches the upstream Kling documentation exactly; suited to direct integration.


1. OpenAI Compatible Mode

  • Endpoint: POST /v1/video/generations
  • Authentication: Authorization: Bearer $API_KEY

1.0 Basic Call Example (curl)

curl https://api-cs-al.naci-tech.com/v1/video/generations \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $API_KEY" \
  -d '{
    "model": "kling-v2-6",
    "prompt": "A cat singing on a stage, spotlight on it, 4K HD quality, cinematic feel",
    "size": "1280x720",
    "duration": 5
  }'

new-api automatically maps the example above to the Kling Text-to-Video interface (16:9, 5 s).

1.1 Basic Parameter Mapping

| OpenAI Field | Type | Description | Corresponding Native Logic |
|---|---|---|---|
| model | string | Model name, e.g., kling-v2-6, kling-video-o1 | Automatically matched to the backend interface |
| prompt | string | Positive prompt, ≤2500 characters | Mapped to prompt |
| image | string | Reference image URL | Mapped to init_image (automatically enables Image-to-Video) |
| size | string | Size, e.g., 1280x720, 1024x1024 | Automatically converted to aspect_ratio (16:9, 1:1, etc.) |
| duration | number | Duration in seconds; only 5 and 10 are supported | Automatically converted to duration_string |
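
The mapping in the table above can be sketched in Python as follows. This is only an illustration of the documented behavior, not new-api's actual implementation: the function names are hypothetical, and reducing the size with a greatest common divisor is an assumption about how size becomes aspect_ratio.

```python
from math import gcd


def size_to_aspect_ratio(size: str) -> str:
    """Reduce a "WIDTHxHEIGHT" string to a "W:H" ratio (assumed conversion rule)."""
    width_text, height_text = size.lower().split("x")
    width, height = int(width_text), int(height_text)
    if width <= 0 or height <= 0:
        raise ValueError(f"invalid size: {size!r}")
    divisor = gcd(width, height)
    return f"{width // divisor}:{height // divisor}"


def duration_to_string(duration) -> str:
    """Convert the numeric duration to the string form used by duration_string."""
    if duration not in (5, 10):
        raise ValueError("duration must be 5 or 10 seconds")
    return str(int(duration))


def to_native_fields(request: dict) -> dict:
    """Map OpenAI-style fields to the native field names listed in the table."""
    native = {"model": request["model"], "prompt": request["prompt"]}
    if "image" in request:
        # A reference image switches the request to Image-to-Video.
        native["init_image"] = request["image"]
    if "size" in request:
        native["aspect_ratio"] = size_to_aspect_ratio(request["size"])
    if "duration" in request:
        native["duration_string"] = duration_to_string(request["duration"])
    return native
```

For example, to_native_fields({"model": "kling-v2-6", "prompt": "A cat singing", "size": "1280x720", "duration": 5}) yields aspect_ratio "16:9" and duration_string "5".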

1.2 Advanced Extended Parameters (Extra Parameters)

All fields that are not part of the OpenAI standard must be placed in the metadata object (clients that support custom JSON may place them at the top level instead).

Applicable to kling-v2-6 / kling-v1-6:

{
  "model": "kling-v2-6",
  "prompt": "A cat singing",
  "metadata": {
    "negative_prompt": "blurry",  // Negative prompt
    "mode": "pro",                // Mode: std, pro
    "cfg_scale": 0.5,             // Prompt adherence, range [0, 1]; higher values align more with the prompt
    "sound": "on",                // Whether to generate sound: on, off
    "callback": "https://...",    // Callback URL
    "params": "user_id_123",      // Pass-through parameter for the callback
    "voice_list": [               // Referenced voices
      { "voice_id": "system_v1" }
    ]
  }
}
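
A minimal Python sketch of the placement rule above: standard OpenAI fields stay at the top level and everything else is moved into metadata. The helper name build_openai_payload and the OPENAI_FIELDS set are hypothetical; the set simply lists the fields from the mapping table in section 1.1.

```python
import json

# Fields that section 1.1 documents as OpenAI-standard (assumed to be the full set).
OPENAI_FIELDS = {"model", "prompt", "image", "size", "duration"}


def build_openai_payload(**fields) -> dict:
    """Split keyword arguments into top-level OpenAI fields and metadata extras."""
    payload = {}
    metadata = {}
    for name, value in fields.items():
        if name in OPENAI_FIELDS:
            payload[name] = value
        else:
            metadata[name] = value
    if metadata:
        payload["metadata"] = metadata
    return payload


body = build_openai_payload(
    model="kling-v2-6",
    prompt="A cat singing",
    mode="pro",
    cfg_scale=0.5,
    sound="on",
)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the request body of POST /v1/video/generations.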

1.2.1 Full curl Example with Advanced Parameters

curl https://api-cs-al.naci-tech.com/v1/video/generations \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $API_KEY" \
  -d '{
    "model": "kling-v2-6",
    "prompt": "A cat singing on a stage, spotlight on it, 4K HD quality, cinematic feel",
    "size": "1280x720",
    "duration": 10,
    "metadata": {
      "negative_prompt": "blurry, low resolution",
      "mode": "pro",
      "cfg_scale": 0.5,
      "sound": "on",
      "callback": "https://example.com/kling/callback",
      "params": "user_id_123",
      "voice_list": [
        { "voice_id": "system_v1" }
      ]
    }
  }'

Parameters specific to kling-video-o1 (Omni):

{
  "model": "kling-video-o1",
  "prompt": "...",
  "metadata": {
    "images": [                   // Reference image list (first and last frames)
      { "image_url": "...", "type": "first_frame" },
      { "image_url": "...", "type": "end_frame" }
    ],
    "elements": [                 // Subject reference list
      { "element_id": "obj_123" }
    ],
    "videos": [                   // Reference video
      { "video_url": "...", "refer_type": "base", "keep_original_sound": "yes" }
    ]
  }
}


2. Native Forwarding Mode (Native API)

If you want to use the request structure defined in the upstream Kling documentation directly, use the following paths. On these paths, new-api performs only authentication and intelligent routing and does not modify your payload.

  • Text-to-Video: POST /kling/v1/videos/text2video
  • Image-to-Video: POST /kling/v1/videos/image2video
  • Status Query: GET /kling/v1/videos/text2video/{task_id}

2.0 Text-to-Video Call Example (curl)

curl https://api-cs-al.naci-tech.com/kling/v1/videos/text2video \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $API_KEY" \
  -d '{
    "model": "kling-v2-6",
    "prompt": "A cat singing on a stage, spotlight on it, 4K HD quality, cinematic feel",
    "negative_prompt": "blurry, low resolution",
    "mode": "std",
    "aspect_ratio": "16:9",
    "duration_string": "5",
    "cfg_scale": 0.5,
    "sound": "on",
    "callback": "https://example.com/kling/callback",
    "params": "user_id_123"
  }'

2.0.1 Image-to-Video Call Example (curl)

curl https://api-cs-al.naci-tech.com/kling/v1/videos/image2video \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $API_KEY" \
  -d '{
    "model": "kling-v2-6",
    "prompt": "Keep the main style unchanged, let the character slowly walk towards the camera",
    "negative_prompt": "blurry, low resolution",
    "init_image": "https://example.com/input-image.jpg",
    "image_tail": "https://example.com/end-frame.jpg",
    "mode": "pro",
    "aspect_ratio": "16:9",
    "duration_string": "10",
    "cfg_scale": 0.6,
    "sound": "on",
    "callback": "https://example.com/kling/callback",
    "params": "user_id_123"
  }'

2.1 Full Parameters for Text/Image-to-Video (Kling V1.6/V2.6)

{
  "model": "kling-v2-6",
  "prompt": "string",
  "negative_prompt": "string",
  "init_image": "url",           // Required only for Image-to-Video
  "image_tail": "url",           // Optional, end frame; requires mode=pro (std not supported, returns 1201)
  "mode": "std",                 // std, pro; must be pro when image_tail is used
  "aspect_ratio": "16:9",        // 16:9, 9:16, 1:1
  "duration_string": "5",        // "5", "10"
  "cfg_scale": 0.5,
  "sound": "on",
  "voice_list": [{"voice_id": "..."}],
  "callback": "url",
  "params": "string"
}

Parameter Restrictions (Text/Image-to-Video)

| Parameter | Restrictions |
|---|---|
| model | Enum: kling-v2-5-turbo, kling-v2-6 |
| prompt / negative_prompt | prompt is required, negative_prompt is optional; each ≤2500 characters |
| mode | Enum: std, pro; V1.6 does not support pro |
| aspect_ratio | Enum: 16:9, 9:16, 1:1; default is 16:9 |
| duration_string | String enum: "5", "10" (in seconds) |
| cfg_scale | Range [0, 1]; higher values align more with the prompt |
| sound | Enum: on, off; supported only in V2.6 and later models |
| voice_list | Up to 2 voices; voice_id comes from the voice customization interface or system presets |
| init_image / image_tail (Image-to-Video) | init_image is required for Image-to-Video; init_image and image_tail cannot both be empty. image_tail requires mode=pro; with kling-v2-6, sending image_tail in std mode returns code 1201. |
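
The restrictions above can be checked on the client before a request is sent. The following Python sketch is one way to do that; validate_native_payload and its image_to_video flag are hypothetical names, and the server remains the authority on what is accepted.

```python
def validate_native_payload(payload: dict, image_to_video: bool = False) -> list[str]:
    """Return human-readable violations of the documented restrictions (empty if none)."""
    errors = []
    if not payload.get("prompt"):
        errors.append("prompt is required")
    for field in ("prompt", "negative_prompt"):
        if len(payload.get(field) or "") > 2500:
            errors.append(f"{field} exceeds 2500 characters")
    if "mode" in payload and payload["mode"] not in ("std", "pro"):
        errors.append("mode must be std or pro")
    # aspect_ratio defaults to 16:9 when omitted.
    if payload.get("aspect_ratio", "16:9") not in ("16:9", "9:16", "1:1"):
        errors.append("aspect_ratio must be 16:9, 9:16 or 1:1")
    if "duration_string" in payload and payload["duration_string"] not in ("5", "10"):
        errors.append('duration_string must be "5" or "10"')
    if "cfg_scale" in payload and not 0 <= payload["cfg_scale"] <= 1:
        errors.append("cfg_scale must be within [0, 1]")
    if "sound" in payload and payload["sound"] not in ("on", "off"):
        errors.append("sound must be on or off")
    if len(payload.get("voice_list") or []) > 2:
        errors.append("voice_list accepts at most 2 voices")
    if image_to_video and not payload.get("init_image"):
        errors.append("init_image is required for Image-to-Video")
    if payload.get("image_tail") and payload.get("mode") != "pro":
        errors.append("image_tail requires mode=pro (std returns code 1201)")
    return errors
```

An empty list means the payload passes these client-side checks; any entries describe what to fix before calling the text2video or image2video path.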

2.2 Omni (O1) Model Full Parameters (Kling o1)

The paths are the same as above; the system detects an o1 model in the model field and handles the request accordingly.

{
  "model": "kling-video-o1",
  "prompt": "string",
  "images": [
    { "image_url": "url", "type": "first_frame" }, // type: first_frame, end_frame, or empty for a reference image
    { "image_url": "url", "type": "end_frame" }
  ],
  "elements": [
    { "element_id": "string" }
  ],
  "videos": [
    { 
      "video_url": "url", 
      "refer_type": "feature", // feature (feature reference), base (to be edited)
      "keep_original_sound": "yes" // yes, no
    }
  ],
  "mode": "pro",
  "aspect_ratio": "16:9",
  "duration_string": "10",
  "callback": "url",
  "params": "string"
}

Parameter Restrictions (Omni)

| Parameter | Restrictions |
|---|---|
| prompt | ≤2500 characters; may include both positive and negative descriptions |
| images | With a reference video: ≤4 images; without a reference video: ≤7. An end frame cannot be set when there are more than 2 images. An end frame alone is not supported (if an end frame is present, a first frame must be present). Video editing is not available when generating video from a first frame or from first and last frames. |
| Image format | .jpg / .jpeg / .png; each image ≤10MB; width and height ≥300px; aspect ratio between 1:2.5 and 2.5:1 |
| elements | With a reference video: reference images + subjects ≤4; without a reference video: ≤7 |
| videos | At most 1 segment; format MP4/MOV; duration 3~10 seconds; width and height 720~2160px; frame rate 24~60fps (output is 24fps); ≤200MB; first and last frames cannot be defined for an editable video (refer_type base) |
| mode | Enum: std (standard), pro (professional) |
| aspect_ratio | Enum: 16:9, 9:16, 1:1; task creation may fail when a reference image is used as the first frame or when video editing is used |
| duration_string | String enum: "3" to "10"; Text-to-Video and first-frame Image-to-Video support only "5" and "10"; when a video is provided, the output duration matches the input duration and this parameter is ignored |
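
The count and frame rules for images, elements, and videos interact, so a small client-side check can help. The Python sketch below encodes only the rules stated in the table; validate_omni_references is a hypothetical helper, and treating the image and subject limits as one combined count is an interpretation of the elements row.

```python
def validate_omni_references(payload: dict) -> list[str]:
    """Check the documented Omni limits on images, elements and videos."""
    images = payload.get("images") or []
    elements = payload.get("elements") or []
    videos = payload.get("videos") or []
    errors = []

    if len(videos) > 1:
        errors.append("at most 1 reference video is allowed")

    # Combined limit: 4 with a reference video, 7 without one.
    limit = 4 if videos else 7
    if len(images) + len(elements) > limit:
        errors.append(f"reference images + subjects must not exceed {limit}")

    frame_types = [image.get("type") for image in images]
    has_first = "first_frame" in frame_types
    has_end = "end_frame" in frame_types
    if has_end and not has_first:
        errors.append("an end frame requires a first frame")
    if has_end and len(images) > 2:
        errors.append("an end frame cannot be set with more than 2 images")

    # A video submitted for editing (refer_type base) cannot be combined with frames.
    edits_video = any(video.get("refer_type") == "base" for video in videos)
    if edits_video and (has_first or has_end):
        errors.append("first/last frames cannot be defined for an editable (base) video")
    return errors
```

File-level limits such as image size, video resolution, and frame rate are not checked here because they require inspecting the media itself.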

3. Task Status Constants (General)

| Status Text (OpenAI) | Status Code (Native) | Description |
|---|---|---|
| queued | 1 | Queued |
| processing | 2 | Processing (returns progress) |
| succeeded | 3 | Succeeded |
| failed | 4, 5 | Failed / Timeout |
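
Client code that handles both response formats may want a single status vocabulary. The Python sketch below translates native status codes into the OpenAI status text from the table; the names are hypothetical, and both 4 and 5 are mapped to failed without distinguishing failure from timeout.

```python
# Native status code -> OpenAI status text, as listed in the table above.
NATIVE_TO_OPENAI_STATUS = {
    1: "queued",
    2: "processing",
    3: "succeeded",
    4: "failed",
    5: "failed",
}

# Statuses after which the task will not change any more.
TERMINAL_STATUSES = {"succeeded", "failed"}


def to_openai_status(native_code: int) -> str:
    """Translate a native status code into the OpenAI status text."""
    try:
        return NATIVE_TO_OPENAI_STATUS[native_code]
    except KeyError:
        raise ValueError(f"unknown native status code: {native_code}") from None


def is_terminal(status: str) -> bool:
    """Return True when polling can stop."""
    return status in TERMINAL_STATUSES
```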

About the Query Interface

After submitting a task, use its task ID to query the status.

1. OpenAI-Style Query

Returns the standard OpenAI video response format.

curl https://api-cs-al.naci-tech.com/v1/video/generations/task_id_here \
  -H "Authorization: Bearer $API_KEY"

2. Native Style Query

Returns the original response structure defined by Kling.

curl https://api-cs-al.naci-tech.com/kling/v1/videos/text2video/task_id_here \
  -H "Authorization: Bearer $API_KEY"
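
Because generation is asynchronous, clients usually poll the query endpoint until the task reaches a final state. The Python sketch below shows one way to do this against the OpenAI-style path. The response schema is not documented here, so reading the status from a top-level "status" field is an assumption, and fetch_task and wait_for_task are hypothetical helper names.

```python
import json
import time
import urllib.request

BASE_URL = "https://api-cs-al.naci-tech.com"


def fetch_task(task_id: str, api_key: str) -> dict:
    """GET the OpenAI-style query endpoint and decode the JSON body."""
    request = urllib.request.Request(
        f"{BASE_URL}/v1/video/generations/{task_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_task(task_id, fetch, interval=5.0, max_attempts=120, sleep=time.sleep):
    """Call fetch(task_id) repeatedly until the task succeeds or fails."""
    for _ in range(max_attempts):
        task = fetch(task_id)
        # Assumption: the status text is exposed under a top-level "status" key.
        if task.get("status") in ("succeeded", "failed"):
            return task
        sleep(interval)
    raise TimeoutError(f"task {task_id} did not finish after {max_attempts} attempts")
```

A typical call would be wait_for_task(task_id, lambda tid: fetch_task(tid, api_key)). Passing the fetch function in keeps the polling loop independent of the transport, so the same loop can be pointed at the native query path.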