Video Transcription API
Submit a video or audio file and receive a word-level transcript with speaker diarization (each segment labelled with the detected speaker). The API accepts URLs from major video platforms as well as locally uploaded video or audio files, and lets you set the source and target languages.
Supported Video Sources
The API accepts URLs from the following platforms: YouTube, Vimeo, Dailymotion, Kick, Twitch, TikTok, Facebook, Zoom, Rumble and more.
You can also transcribe local audio or video files that you upload. Local upload requires a Standard plan or above.
Workflow
- Submit a transcription task from a video URL
- Poll the results endpoint until status is SUCCEEDED (stop polling if status is FAILED)
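The workflow above can be sketched as a small polling helper. This is a minimal sketch rather than an official client: `poll_until_done`, the 5-second interval and the 10-minute timeout are assumptions, and `fetch_status` is a hypothetical callable standing in for whatever code performs the GET request against the results endpoint and returns its `data` object.

```python
import time
from typing import Callable

# Terminal states, taken from the status values listed in this document.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED"}


def poll_until_done(
    fetch_status: Callable[[], dict],
    interval_s: float = 5.0,   # assumed polling interval, not documented
    timeout_s: float = 600.0,  # assumed overall timeout, not documented
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status() repeatedly until the task reaches a terminal status.

    fetch_status is a hypothetical callable that returns the "data" object
    of the results response, for example {"status": "ONGOING"}.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        data = fetch_status()
        if data.get("status") in TERMINAL_STATUSES:
            return data
        if time.monotonic() >= deadline:
            raise TimeoutError("transcription task did not finish in time")
        sleep(interval_s)
```

Passing the fetch step in as a callable keeps the loop independent of any HTTP library and makes it easy to exercise with canned responses.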
Submit Transcription Task
Submit a new transcription task from a video or audio URL.
POST https://wayinvideo-api.wayin.ai/api/v2/transcripts
Request Body
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| video_url | string | Yes | - | The source video/audio URL or uploaded file identifier |
| source_lang | string | No | null | Source language of the video (see Supported Languages). When null, the system auto-detects the original language. |
| target_lang | string | No | null | Target language for the transcript (see Supported Languages). When null, no translation is applied. If target_lang differs from the video's original language, the transcript will be automatically translated into the target language. |
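The parameter table above could be mirrored by a small helper that builds the request body. This is only a sketch: `build_transcript_request` is a hypothetical name, and it assumes that omitting an optional key is equivalent to sending the documented default of null.

```python
from typing import Optional


def build_transcript_request(
    video_url: str,
    source_lang: Optional[str] = None,
    target_lang: Optional[str] = None,
) -> dict:
    """Build the JSON body for the submit endpoint, leaving out unset options."""
    if not video_url:
        raise ValueError("video_url is required")
    body = {"video_url": video_url}
    # Assumption: an omitted key behaves like the documented default (null),
    # i.e. the source language is auto-detected and no translation is applied.
    if source_lang is not None:
        body["source_lang"] = source_lang
    if target_lang is not None:
        body["target_lang"] = target_lang
    return body


# Example: translate a Japanese video into English.
body = build_transcript_request(
    "https://www.youtube.com/watch?v=example",
    source_lang="ja",
    target_lang="en",
)
```

The resulting dictionary can be passed as the `json=` argument of an HTTP client call.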
Example Request
curl -X POST https://wayinvideo-api.wayin.ai/api/v2/transcripts \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "x-wayinvideo-api-version: v2" \
-d '{
"video_url": "https://www.youtube.com/watch?v=example",
"target_lang": "en"
}'
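import requests

# Submit the task; the timeout value is an arbitrary choice for this example.
response = requests.post(
    "https://wayinvideo-api.wayin.ai/api/v2/transcripts",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "x-wayinvideo-api-version": "v2",
    },
    json={
        "video_url": "https://www.youtube.com/watch?v=example",
        "target_lang": "en",
    },
    timeout=30,
)
response.raise_for_status()
# Keep the task id; it is needed to retrieve the results.
task_id = response.json()["data"]["id"]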
await fetch("https://wayinvideo-api.wayin.ai/api/v2/transcripts", {
method: "POST",
headers: {
Authorization: "Bearer YOUR_API_KEY",
"x-wayinvideo-api-version": "v2",
"Content-Type": "application/json",
},
body: JSON.stringify({
video_url: "https://www.youtube.com/watch?v=example",
target_lang: "en",
}),
});
Response
{
"data": {
"id": "trans_proj_001",
"name": "sample project name",
"status": "CREATED"
}
}
| Field | Type | Description |
|---|---|---|
| id | string | Task identifier (used to retrieve results) |
| name | string | Task name |
| status | string | CREATED, QUEUED, ONGOING, SUCCEEDED, FAILED |
Examples
Common transcription scenarios. Replace YOUR_API_KEY with a key from the API Dashboard.
Transcribe a YouTube video
curl -X POST https://wayinvideo-api.wayin.ai/api/v2/transcripts \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "x-wayinvideo-api-version: v2" \
-H "Content-Type: application/json" \
-d '{"video_url": "https://www.youtube.com/watch?v=EXAMPLE"}'
Transcribe a podcast with multiple speakers
Pass any podcast audio URL (or an uploaded file identifier) — speaker diarization is automatic; each segment in the response carries a speaker label.
curl -X POST https://wayinvideo-api.wayin.ai/api/v2/transcripts \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "x-wayinvideo-api-version: v2" \
-H "Content-Type: application/json" \
-d '{"video_url": "https://www.youtube.com/watch?v=EXAMPLE"}'
Translate a non-English transcript to English
Set target_lang to translate on the fly. Combine with source_lang if you already know the source.
curl -X POST https://wayinvideo-api.wayin.ai/api/v2/transcripts \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "x-wayinvideo-api-version: v2" \
-H "Content-Type: application/json" \
-d '{
"video_url": "https://www.youtube.com/watch?v=EXAMPLE",
"source_lang": "ja",
"target_lang": "en"
}'
Get Transcription Results
Retrieve the transcript with word-level timestamps and speaker labels. Poll until status is SUCCEEDED.
GET https://wayinvideo-api.wayin.ai/api/v2/transcripts/results/{id}
Path Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| id | string | Yes | The task ID returned by the submit endpoint |
Example Request
curl -X GET https://wayinvideo-api.wayin.ai/api/v2/transcripts/results/trans_proj_001 \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "x-wayinvideo-api-version: v2"
Response
{
"data": {
"status": "SUCCEEDED",
"cost_usage": 27.0,
"transcript": [
{
"text": "Welcome to today's presentation",
"language": null,
"start": 200,
"end": 4500,
"speaker": "Speaker 1"
},
{
"text": "Thanks for coming",
"language": null,
"start": 5000,
"end": 8200,
"speaker": "Speaker 2"
}
]
}
}
Response Fields
| Field | Type | Description |
|---|---|---|
| status | string | CREATED, QUEUED, ONGOING, SUCCEEDED, FAILED |
| error_message | string | Error reason (only present when status is FAILED) |
| cost_usage | number | API units consumed for this request |
| transcript | array | List of transcript segments (see below) |
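One way to act on these fields is sketched below. The names `extract_segments` and `TranscriptionFailed` are hypothetical, and returning `None` for an unfinished task is a design choice made for this example, not documented API behaviour.

```python
from typing import Optional

# Non-terminal states from the status list in this document.
PENDING_STATUSES = {"CREATED", "QUEUED", "ONGOING"}


class TranscriptionFailed(Exception):
    """Raised when the task finished with status FAILED."""


def extract_segments(response: dict) -> Optional[list]:
    """Return the transcript segments, or None while the task is still running."""
    data = response.get("data", {})
    status = data.get("status")
    if status == "SUCCEEDED":
        return data.get("transcript", [])
    if status == "FAILED":
        # error_message is only present when status is FAILED.
        raise TranscriptionFailed(data.get("error_message", "unknown error"))
    if status in PENDING_STATUSES:
        return None
    raise ValueError(f"unexpected status: {status!r}")
```

Raising on FAILED keeps error handling separate from the normal path, so a caller can loop while the result is `None` and catch the exception once.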
Transcript Segment
| Field | Type | Description |
|---|---|---|
| text | string | Transcribed text |
| language | string or null | Detected language code, or null if not detected |
| start | integer | Start time in milliseconds |
| end | integer | End time in milliseconds |
| speaker | string | Speaker label (e.g. "Speaker 1") |
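As an illustration of the segment fields, the sketch below renders each segment as a readable line with millisecond offsets converted to `HH:MM:SS.mmm`. The helper names `format_ms` and `format_segment` and the output layout are assumptions for this example; the sample segments are copied from the response shown earlier.

```python
def format_ms(ms: int) -> str:
    """Render a millisecond offset as HH:MM:SS.mmm."""
    seconds, millis = divmod(ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def format_segment(segment: dict) -> str:
    """Format one transcript segment as '[start - end] speaker: text'."""
    start = format_ms(segment["start"])
    end = format_ms(segment["end"])
    return f"[{start} - {end}] {segment['speaker']}: {segment['text']}"


# Sample segments taken from the example response in this document.
segments = [
    {"text": "Welcome to today's presentation", "language": None,
     "start": 200, "end": 4500, "speaker": "Speaker 1"},
    {"text": "Thanks for coming", "language": None,
     "start": 5000, "end": 8200, "speaker": "Speaker 2"},
]

lines = [format_segment(segment) for segment in segments]
print("\n".join(lines))
```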
FAQ
What is the maximum video length?
There is no hard length limit. The API supports both short clips and long-form video or audio content across supported source platforms.
Does the API return word-level timestamps?
Yes. Each transcript segment includes start and end timestamps in milliseconds, the transcribed text, the detected language, and the assigned speaker label from speaker diarization.
How does speaker diarization work?
Speakers are auto-detected and labelled (Speaker 1, Speaker 2, …) per segment. No configuration is required — diarization runs on every transcription task.
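Because every segment carries a speaker label, per-speaker statistics can be derived on the client side. The following is a minimal sketch, assuming segments shaped like the example response; `speaking_time_ms` is a hypothetical helper, not part of the API.

```python
from collections import defaultdict


def speaking_time_ms(segments: list) -> dict:
    """Sum the duration of all segments for each speaker label, in milliseconds."""
    totals = defaultdict(int)
    for segment in segments:
        totals[segment["speaker"]] += segment["end"] - segment["start"]
    return dict(totals)


# Sample segments taken from the example response in this document.
segments = [
    {"text": "Welcome to today's presentation", "start": 200, "end": 4500,
     "speaker": "Speaker 1"},
    {"text": "Thanks for coming", "start": 5000, "end": 8200,
     "speaker": "Speaker 2"},
]

totals = speaking_time_ms(segments)
```

For the two sample segments this yields 4300 ms for Speaker 1 and 3200 ms for Speaker 2.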
Which audio and video formats are supported?
Source URLs are supported from YouTube, Vimeo, Dailymotion, Kick, Twitch, TikTok, Facebook, Zoom, Rumble, and more. For local uploads, send mp4, mov, webm, or avi (audio-only files can be muxed into one of these containers).
Can I translate the transcript into another language?
Yes — pass the target_lang parameter. The transcript is translated when target_lang differs from the source language. See Supported Languages for the full list of language codes.