Video Transcription API

March 16, 2026

Submit a video or audio file and receive a word-level transcript with speaker diarization (each segment labelled with the detected speaker). The API accepts video URLs from major platforms as well as locally uploaded video or audio files, and lets you set the source language and an optional target language for translation.

Supported Video Sources

The API accepts URLs from the following platforms: YouTube, Vimeo, Dailymotion, Kick, Twitch, TikTok, Facebook, Zoom, Rumble and more.

You can also transcribe local audio or video files that you upload. Local upload requires a Standard plan or above.

Workflow

  1. Submit a transcription task from a video or audio URL (or an uploaded file identifier)
  2. Poll the results endpoint until status is SUCCEEDED (or FAILED)

Submit Transcription Task

Submit a new transcription task from a video or audio URL.

POST https://wayinvideo-api.wayin.ai/api/v2/transcripts

Request Body

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| video_url | string | Yes | - | The source video/audio URL or uploaded file identifier |
| source_lang | string | No | null | Source language of the video (see Supported Languages). When null, the system auto-detects the original language. |
| target_lang | string | No | null | Target language for the transcript (see Supported Languages). When null, no translation is applied. If target_lang differs from the video's original language, the transcript is automatically translated into the target language. |
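As a rough illustration of the request body described above, the following Python sketch builds the payload and leaves out optional fields that are unset. The helper name `build_transcript_payload` is hypothetical, and the sketch assumes that omitting a field is equivalent to sending null, which matches the examples that pass only video_url.

```python
import json
from typing import Optional


def build_transcript_payload(
    video_url: str,
    source_lang: Optional[str] = None,
    target_lang: Optional[str] = None,
) -> dict:
    """Build the JSON body for the submit endpoint (hypothetical helper)."""
    if not video_url:
        raise ValueError("video_url is required")
    payload = {"video_url": video_url}
    # Assumption: leaving an optional field out behaves like sending null.
    if source_lang is not None:
        payload["source_lang"] = source_lang
    if target_lang is not None:
        payload["target_lang"] = target_lang
    return payload


if __name__ == "__main__":
    body = build_transcript_payload(
        "https://www.youtube.com/watch?v=example", target_lang="en"
    )
    print(json.dumps(body, indent=2))
```

The returned dict can be passed as the JSON body of the POST request.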

Example Request

curl -X POST https://wayinvideo-api.wayin.ai/api/v2/transcripts \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "x-wayinvideo-api-version: v2" \
  -d '{
    "video_url": "https://www.youtube.com/watch?v=example",
    "target_lang": "en"
  }'

import requests

requests.post(
    "https://wayinvideo-api.wayin.ai/api/v2/transcripts",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "x-wayinvideo-api-version": "v2",
    },
    json={
        "video_url": "https://www.youtube.com/watch?v=example",
        "target_lang": "en",
    },
)

await fetch("https://wayinvideo-api.wayin.ai/api/v2/transcripts", {
  method: "POST",
  headers: {
    Authorization: "Bearer YOUR_API_KEY",
    "x-wayinvideo-api-version": "v2",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    video_url: "https://www.youtube.com/watch?v=example",
    target_lang: "en",
  }),
});

Response

{
  "data": {
    "id": "trans_proj_001",
    "name": "sample project name",
    "status": "CREATED"
  }
}

| Field | Type | Description |
|---|---|---|
| id | string | Task identifier (used to retrieve results) |
| name | string | Task name |
| status | string | One of CREATED, QUEUED, ONGOING, SUCCEEDED, FAILED |
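A minimal sketch of reading the submit response in Python, assuming the body has the shape shown above. The function name `parse_submit_response` is hypothetical, and rejecting unknown status values is a design choice of this sketch rather than documented API behavior.

```python
import json
from typing import Tuple

# Status values listed in the response field table.
KNOWN_STATUSES = {"CREATED", "QUEUED", "ONGOING", "SUCCEEDED", "FAILED"}


def parse_submit_response(raw: str) -> Tuple[str, str]:
    """Return (task_id, status) from the raw JSON body of the submit call."""
    data = json.loads(raw)["data"]
    status = data["status"]
    if status not in KNOWN_STATUSES:
        # Sketch-level choice: fail loudly on a status this code does not know.
        raise ValueError(f"unexpected status: {status!r}")
    return data["id"], status


if __name__ == "__main__":
    sample = (
        '{"data": {"id": "trans_proj_001", '
        '"name": "sample project name", "status": "CREATED"}}'
    )
    print(parse_submit_response(sample))
```

The returned id is the value to pass to the results endpoint.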

Examples

Common transcription scenarios. Replace YOUR_API_KEY with a key from the API Dashboard.

Transcribe a YouTube video

curl -X POST https://wayinvideo-api.wayin.ai/api/v2/transcripts \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "x-wayinvideo-api-version: v2" \
  -H "Content-Type: application/json" \
  -d '{"video_url": "https://www.youtube.com/watch?v=EXAMPLE"}'

Transcribe a podcast with multiple speakers

Pass any podcast audio URL (or an uploaded file identifier) — speaker diarization is automatic; each segment in the response carries a speaker label.

curl -X POST https://wayinvideo-api.wayin.ai/api/v2/transcripts \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "x-wayinvideo-api-version: v2" \
  -H "Content-Type: application/json" \
  -d '{"video_url": "https://www.youtube.com/watch?v=EXAMPLE"}'

Translate a non-English transcript to English

Set target_lang to translate on the fly. Combine with source_lang if you already know the source.

curl -X POST https://wayinvideo-api.wayin.ai/api/v2/transcripts \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "x-wayinvideo-api-version: v2" \
  -H "Content-Type: application/json" \
  -d '{
    "video_url": "https://www.youtube.com/watch?v=EXAMPLE",
    "source_lang": "ja",
    "target_lang": "en"
  }'

Get Transcription Results

Retrieve the transcript with word-level timestamps and speaker labels. Poll until status is SUCCEEDED.

GET https://wayinvideo-api.wayin.ai/api/v2/transcripts/results/{id}

Path Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| id | string | Yes | The task ID returned by the submit endpoint |

Example Request

curl -X GET https://wayinvideo-api.wayin.ai/api/v2/transcripts/results/trans_proj_001 \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "x-wayinvideo-api-version: v2"

Response

{
  "data": {
    "status": "SUCCEEDED",
    "cost_usage": 27.0,
    "transcript": [
      {
        "text": "Welcome to today's presentation",
        "language": null,
        "start": 200,
        "end": 4500,
        "speaker": "Speaker 1"
      },
      {
        "text": "Thanks for coming",
        "language": null,
        "start": 5000,
        "end": 8200,
        "speaker": "Speaker 2"
      }
    ]
  }
}

Response Fields

| Field | Type | Description |
|---|---|---|
| status | string | One of CREATED, QUEUED, ONGOING, SUCCEEDED, FAILED |
| error_message | string | Error reason (only present when status is FAILED) |
| cost_usage | number | API units consumed for this request |
| transcript | array | List of transcript segments (see below) |
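Polling on the status field could be sketched as follows. The names `poll_transcript` and `fetch_result` are hypothetical: `fetch_result` stands for any callable that performs the GET request and returns the decoded JSON body. The 5-second interval and 10-minute timeout are arbitrary assumptions, not documented recommendations.

```python
import time
from typing import Callable


def poll_transcript(
    fetch_result: Callable[[], dict],
    interval_s: float = 5.0,      # assumed polling interval
    timeout_s: float = 600.0,     # assumed overall timeout
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_result until the task reaches SUCCEEDED or FAILED."""
    deadline = time.monotonic() + timeout_s
    while True:
        data = fetch_result()["data"]
        status = data.get("status")
        if status == "SUCCEEDED":
            return data
        if status == "FAILED":
            # error_message is only present when status is FAILED.
            raise RuntimeError(data.get("error_message") or "transcription failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task still {status} after {timeout_s} seconds")
        sleep(interval_s)
```

Injecting `fetch_result` and `sleep` keeps the loop independent of any HTTP library and makes it easy to exercise without network access.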

Transcript Segment

| Field | Type | Description |
|---|---|---|
| text | string | Transcribed text |
| language | string or null | Detected language code, or null if not detected |
| start | integer | Start time in milliseconds |
| end | integer | End time in milliseconds |
| speaker | string | Speaker label (e.g. "Speaker 1") |
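For typed access to these fields, one possible approach is to map each segment onto a small dataclass. The `Segment` class, its `duration_ms` property, and `parse_segments` are hypothetical names introduced for illustration. The input is assumed to be the "data" object from the results response.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Segment:
    text: str
    language: Optional[str]  # null in the JSON becomes None
    start: int               # milliseconds
    end: int                 # milliseconds
    speaker: str

    @property
    def duration_ms(self) -> int:
        return self.end - self.start


def parse_segments(data: dict) -> List[Segment]:
    """Convert the 'transcript' array of a results payload into Segment objects."""
    return [
        Segment(
            text=item["text"],
            language=item.get("language"),
            start=item["start"],
            end=item["end"],
            speaker=item["speaker"],
        )
        for item in data.get("transcript", [])
    ]
```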

FAQ

What is the maximum video length?

There is no hard length limit. The API supports both short clips and long-form video or audio content across supported source platforms.

Does the API return word-level timestamps?

Yes. Timestamps are attached to each transcript segment: every segment includes start and end times in milliseconds, the transcribed text, the detected language (or null if it was not detected), and the speaker label assigned by diarization.
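Because start and end are plain millisecond integers, they convert readily to subtitle formats. The sketch below turns segments into SubRip (SRT) text. The function names are hypothetical, and the API itself is not stated to offer SRT output.

```python
from typing import Iterable


def srt_timestamp(ms: int) -> str:
    """Format milliseconds as HH:MM:SS,mmm, the timestamp form SRT uses."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def segments_to_srt(segments: Iterable[dict]) -> str:
    """Render transcript segments as numbered SRT cues separated by blank lines."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text']}\n"
        )
    return "\n".join(blocks)
```

For the sample response, the first cue spans 00:00:00,200 --> 00:00:04,500.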

How does speaker diarization work?

Speakers are auto-detected and labelled (Speaker 1, Speaker 2, …) per segment. No configuration is required — diarization runs on every transcription task.
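Since every segment carries a speaker label, per-speaker statistics can be derived on the client side. A minimal sketch, using the hypothetical helper `talk_time_by_speaker`, sums segment durations per label:

```python
from collections import defaultdict
from typing import Dict, Iterable


def talk_time_by_speaker(segments: Iterable[dict]) -> Dict[str, int]:
    """Total speaking time in milliseconds for each speaker label."""
    totals: Dict[str, int] = defaultdict(int)
    for seg in segments:
        totals[seg["speaker"]] += seg["end"] - seg["start"]
    return dict(totals)


if __name__ == "__main__":
    sample = [
        {"text": "Welcome to today's presentation", "start": 200, "end": 4500, "speaker": "Speaker 1"},
        {"text": "Thanks for coming", "start": 5000, "end": 8200, "speaker": "Speaker 2"},
    ]
    print(talk_time_by_speaker(sample))
```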

Which audio and video formats are supported?

Source URLs are supported from YouTube, Vimeo, Dailymotion, Kick, Twitch, TikTok, Facebook, Zoom, Rumble, and more. For local uploads, send mp4, mov, webm, or avi (audio-only files can be muxed into one of these containers).
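A client could check a local file name against the listed containers before attempting an upload. This sketch, with the hypothetical helper `is_supported_upload`, inspects only the file extension and does not verify the actual container format.

```python
from pathlib import PurePath

# Containers listed for local uploads.
UPLOAD_EXTENSIONS = {".mp4", ".mov", ".webm", ".avi"}


def is_supported_upload(filename: str) -> bool:
    """Return True when the file extension is one of the listed containers."""
    return PurePath(filename).suffix.lower() in UPLOAD_EXTENSIONS


if __name__ == "__main__":
    for name in ("talk.MP4", "podcast.mp3", "clip.webm"):
        print(name, is_supported_upload(name))
```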

Can I translate the transcript into another language?

Yes — pass the target_lang parameter. The transcript is translated when target_lang differs from the source language. See Supported Languages for the full list of language codes.