Video Moderation
Video Streaming API is for identifying visual risks or business labels of video streaming.
DeepCleer Video Moderation analyzes both the captured frames and the audio track of a video file to detect regulatory risks and business attributes. Frame detection covers political content, pornography, sexually suggestive content, advertising, violence and terrorism, prohibited content, QR codes, and image-text violations; it can also recognize faces, logos, flora and fauna, and other business-specific content. Audio detection covers political content, pornography, advertising, prohibited content, violence, abuse, advertising-law violations, moaning, top-leader voiceprint, national anthem, and prohibited songs; it can also identify speaker gender, age, timbre, language, audio scene, singing, minor speakers, and human-voice authenticity.
The video moderation surface is exposed through two complementary endpoints. They share the same detection model, the same risk-label taxonomy, and the same response structure — they differ only in how the result is delivered.
API Description
Choose an endpoint based on how your integration prefers to receive results:
- Request API: submit a video file (by URL) for moderation. DeepCleer processes it asynchronously, capturing frames at your configured cadence and segmenting the audio track, and pushes the consolidated result to your callbackURL when processing completes.
- Query API: actively poll for a Request API submission's result by btId. Use this when you cannot host a public callback endpoint, or as a backup to recover results when a callback delivery has been missed.
Both endpoints share the canonical response shape — the same frameDetail array, audioDetail array, top-level auxInfo, and tokenLabels structures are returned in both the callback and the Query response — so integrators can reuse the same parsing code across them.
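Because the callback body and the Query response carry the same structures, one parsing function can serve both paths. The following is a minimal sketch in Python, not an official client: `ModerationResult` and `parse_result` are hypothetical names, and the sketch assumes `riskLevel`, `frameDetail`, `audioDetail`, `auxInfo`, and `tokenLabels` all sit at the top level of the payload. The inner layout of each entry is not described on this page, so entries are passed through untouched.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ModerationResult:
    """Top-level fields shared by the callback body and the Query response."""
    risk_level: Optional[str] = None
    frame_detail: List[Any] = field(default_factory=list)
    audio_detail: List[Any] = field(default_factory=list)
    aux_info: Dict[str, Any] = field(default_factory=dict)
    token_labels: Any = None


def parse_result(payload: Dict[str, Any]) -> ModerationResult:
    """Parse a callback body or a completed Query response (same shape).

    Assumption: the five keys below are top-level. Missing keys fall back
    to empty containers so callers can iterate without extra checks.
    """
    return ModerationResult(
        risk_level=payload.get("riskLevel"),
        frame_detail=list(payload.get("frameDetail") or []),
        audio_detail=list(payload.get("audioDetail") or []),
        aux_info=dict(payload.get("auxInfo") or {}),
        token_labels=payload.get("tokenLabels"),
    )
```

The same `parse_result` call can then sit behind both the callback handler and the polling loop, so any change to downstream handling is made in one place.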
Endpoints at a Glance
| Endpoint | Path | Delivery Model | Result Returned Via |
|---|---|---|---|
| Request API | /video/v4 | Asynchronous (fire-and-forget) | HTTP POST to your callback URL |
| Query API | /video/query/v4 | Polling | HTTP response body |
When to Use Which
Request API
Use when:
- You have a video file you want DeepCleer to moderate end-to-end.
- You can host a public HTTPS endpoint to receive callback deliveries.
- You want fire-and-forget submission — your application reacts to the moderation result when it arrives.
The Request endpoint accepts the video URL plus capture configuration (detectFrequency, checkFrameCount, or advancedFrequency for duration-based dynamic capture rates) and the detection types you want to apply to frames (imgType / imgBusinessType) and audio (audioType / audioBusinessType). The synchronous response is an acknowledgement only — it confirms that DeepCleer has accepted the moderation task and contains the requestId and btId you'll use to correlate the eventual result. Processing time is approximately one-third of the video file duration. The consolidated result is delivered to your callback URL once processing completes; deliveries are retried up to 20 times if your endpoint does not acknowledge with HTTP 200.
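The submission step described above can be sketched as three small helpers. This is an illustrative sketch under stated assumptions: the parameter names `detectFrequency`, `imgType`, and `audioType` come from this page, but the `url` and `callback` keys, the flat body layout, and the top-level position of `requestId` and `btId` in the acknowledgement are assumptions. Authentication fields are not covered on this page and are omitted.

```python
from typing import Any, Dict, Tuple


def build_request_body(
    video_url: str,
    callback_url: str,
    detect_frequency: int,
    img_type: str,
    audio_type: str,
) -> Dict[str, Any]:
    """Assemble a Request API body.

    Assumption: "url" and "callback" are hypothetical key names, and the
    flat layout is a guess; check the Request API reference for the real
    nesting before using this.
    """
    return {
        "url": video_url,
        "callback": callback_url,
        "detectFrequency": detect_frequency,
        "imgType": img_type,
        "audioType": audio_type,
    }


def read_ack(ack: Dict[str, Any]) -> Tuple[str, str]:
    """Extract (requestId, btId) from the synchronous acknowledgement.

    Assumption: both identifiers are top-level keys of the ack.
    """
    return ack["requestId"], ack["btId"]


def estimate_processing_seconds(video_duration_seconds: float) -> float:
    """Rough wait estimate: about one-third of the video duration."""
    return video_duration_seconds / 3.0
```

For a 90-second video, `estimate_processing_seconds(90)` gives roughly 30 seconds, which is a reasonable point to start expecting the callback or to begin polling.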
Query API
Use when:
- You cannot host a callback endpoint (firewalled internal services, edge clients, batch jobs).
- A previous Request callback delivery failed all retry attempts and you need to recover the result.
- You want to verify or re-fetch a result your callback endpoint has already received.
Poll once every 30 seconds per btId, the recommended cadence. While moderation is still in progress, the endpoint returns code = 1101 with no result fields; keep polling. Once processing is complete, the endpoint returns code = 1100 and the response payload mirrors exactly what the Request callback would have delivered. Results are retained for up to 3 days from the original submission.
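The polling behaviour above can be sketched as a loop that keeps asking while the code is 1101 and returns as soon as it is 1100. This is a minimal sketch, not an official client: `poll_result` and `make_http_fetch` are hypothetical names, the query URL is supplied by the caller, authentication fields are omitted, and treating any code other than 1100 or 1101 as an error is an assumption.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict

CODE_COMPLETE = 1100        # full result available
CODE_IN_PROGRESS = 1101     # still processing, keep polling
POLL_INTERVAL_SECONDS = 30  # recommended cadence per btId
RETENTION_SECONDS = 3 * 24 * 60 * 60  # results are kept for up to 3 days


def poll_result(
    bt_id: str,
    fetch: Callable[[str], Dict[str, Any]],
    sleep: Callable[[float], None] = time.sleep,
    interval: float = POLL_INTERVAL_SECONDS,
    timeout: float = RETENTION_SECONDS,
) -> Dict[str, Any]:
    """Poll until the Query API reports completion for bt_id."""
    waited = 0.0
    while True:
        response = fetch(bt_id)
        code = response.get("code")
        if code == CODE_COMPLETE:
            return response
        if code != CODE_IN_PROGRESS:
            # Assumption: any other code is a failure worth surfacing.
            raise RuntimeError(f"unexpected code {code!r} for btId {bt_id}")
        if waited + interval > timeout:
            raise TimeoutError(f"no result for btId {bt_id} within {timeout}s")
        sleep(interval)
        waited += interval


def make_http_fetch(query_url: str) -> Callable[[str], Dict[str, Any]]:
    """Build a fetch function that POSTs {"btId": ...} to query_url.

    query_url is the full URL of /video/query/v4 on your DeepCleer host;
    the host itself is not given on this page.
    """
    def fetch(bt_id: str) -> Dict[str, Any]:
        body = json.dumps({"btId": bt_id}).encode("utf-8")
        request = urllib.request.Request(
            query_url,
            data=body,
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(request, timeout=10) as resp:
            return json.loads(resp.read().decode("utf-8"))
    return fetch
```

Passing `fetch` and `sleep` in as arguments keeps the loop independent of the network and the clock, so it can be exercised with canned responses before it is pointed at the real endpoint.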
Lifecycle
┌─────────────────────────────────────────────────────────────────────────┐
│ 1. Client → POST /video/v4 (Request API) { video URL + config } │
│ ↓ │
│ 2. DeepCleer → 200 OK { requestId, btId } (synchronous ack) │
│ ↓ │
│ 3. DeepCleer captures frames + segments audio + runs detection │
│ (~1/3 of the video duration) │
│ ↓ │
│ 4a. DeepCleer → POST {callback} { riskLevel, frameDetail, audioDetail, … } OR
│ 4b. Client polls → POST /video/query/v4 { btId } │
│ ← 1101 while processing │
│ ← 1100 + full result once complete │
└─────────────────────────────────────────────────────────────────────────┘
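Step 4a of the lifecycle requires an endpoint that acknowledges each delivery with HTTP 200, and because a delivery may be retried up to 20 times, the handler should tolerate receiving the same result more than once. The following is a minimal sketch using only the Python standard library. `handle_callback`, `CallbackHandler`, `serve`, and the in-memory `RESULTS` store are hypothetical, and keying stored results on a top-level `btId` in the callback body is an assumption.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Any, Dict

# In-memory store for illustration only; a real service would persist results.
RESULTS: Dict[str, Dict[str, Any]] = {}


def handle_callback(raw_body: bytes) -> int:
    """Store one callback delivery and return the HTTP status to send.

    Storing by btId makes retried deliveries idempotent: a repeat simply
    overwrites the same entry. Assumption: the callback body is a JSON
    object with a top-level "btId".
    """
    try:
        payload = json.loads(raw_body)
    except (json.JSONDecodeError, UnicodeDecodeError):
        return 400
    if not isinstance(payload, dict) or "btId" not in payload:
        return 400
    RESULTS[str(payload["btId"])] = payload
    return 200  # anything other than 200 would trigger a retry


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        status = handle_callback(self.rfile.read(length))
        self.send_response(status)
        self.end_headers()

    def log_message(self, format: str, *args: Any) -> None:
        pass  # keep the sketch quiet; use real logging in production


def serve(port: int = 8080) -> None:
    """Run the receiver until interrupted (terminate TLS in front of it)."""
    HTTPServer(("0.0.0.0", port), CallbackHandler).serve_forever()
```

Keeping the storage logic in `handle_callback`, separate from the HTTP plumbing, means the acknowledgement behaviour can be checked without opening a socket.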