Request API

Submit a video for content moderation to detect regulatory and business-specific risks in frames and audio.

DeepCleer Async Video Moderation accepts a video file, moderates both its captured frames and its audio segments, and delivers the results asynchronously to your callback URL.

API Description

The Async Video Moderation API detects regulatory risks and business-specific content in both captured frames and audio segments of a video file.

Frame detection identifies political content, pornography, advertising, violence & terrorism, and other regulatory risks. It can also recognize faces, logos, flora & fauna, and other business-specific content based on your use case.

Audio detection identifies political content, pornography, advertising, and other regulatory risks. It can also recognize gender, voice timbre, minors, and other business-specific content based on your use case.

Submit video information for moderation with configurable frame capture frequency. Results are delivered asynchronously through a callback URL or can be retrieved via the active query endpoint. Processing time is approximately one-third of the video file duration.

Requirements

| Item | Specification |
| --- | --- |
| Protocol | HTTP or HTTPS |
| Method | POST |
| Encoding | UTF-8 |
| Format | All request and response parameters use JSON |

Video Requirements

| Item | Specification |
| --- | --- |
| Supported formats | AVI, FLV, MP4, MPG, WMV, MOV, WMA, RMVB, M3U8, MKV, 3GP, WEBM |
| Size limit | ≤ 300 MB |
| Duration limit | ≤ 2 hours |

Timeout Suggestion

Recommended request timeout: 7 seconds.

Internal processing timeout is 3 seconds with one automatic retry. Normal request latency is approximately 5 ms. The synchronous response is an acknowledgement only — moderation results are delivered asynchronously through your callback URL or via the active query endpoint.

Callback Mechanism

When DeepCleer pushes a result to your callback URL and your endpoint responds with HTTP 200, the delivery is considered successful. If any other status code is returned (or the request fails), the system retries on the following schedule (in seconds):

[5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 120, 120, 120, 120, 120, 120]

After 20 failed attempts, no further retries are made.
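Because a delivery is retried whenever your endpoint does not answer with HTTP 200, the same result could plausibly arrive more than once (for example, if your handler finished but its response was lost). The following is a minimal sketch of a receiver-side handler, assuming duplicates can be recognized by `requestId`. The function name `handle_callback` and the in-memory set are illustrative and not part of the DeepCleer API.

```python
import json

# Retry delays in seconds, copied from the schedule above.
RETRY_SCHEDULE = [5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110,
                  120, 120, 120, 120, 120, 120, 120]


def handle_callback(raw_body: bytes, seen_request_ids: set) -> int:
    """Return the HTTP status code the endpoint should answer with."""
    try:
        result = json.loads(raw_body)
        request_id = str(result["requestId"])
    except (ValueError, KeyError, TypeError):
        # Any non-200 answer causes DeepCleer to retry the delivery.
        return 400
    if request_id in seen_request_ids:
        # Duplicate delivery: acknowledge it without reprocessing.
        return 200
    seen_request_ids.add(request_id)
    # ... hand `result` to your own processing here ...
    return 200


# Assumption: each delay is the wait before the next attempt. Under that
# reading, the last retry happens this many seconds after the first attempt.
TOTAL_RETRY_WINDOW = sum(RETRY_SCHEDULE)
```

Under that assumption the retry window is 1505 seconds, so an endpoint that is down for longer than roughly 25 minutes would need the active query endpoint to recover the result.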


Request

Request URL

| Cluster | Endpoint |
| --- | --- |
| Singapore Video | http://api-video-xjp.fengkongcloud.com/video/v4 |

Request Parameters

Top-Level Parameters

| Parameter | Type | Required | Max Length | Description |
| --- | --- | --- | --- | --- |
| accessKey | string | Yes | 20 | API authentication key. The default accessKey is sent in your onboarding email. |
| appId | string | Yes | 64 | Application identifier, such as web for your web application or app for your mobile app. The default appId is sent in your onboarding email. Contact DeepCleer if you need a new appId. |
| eventId | string | Yes | 64 | Event identifier used to distinguish moderation scenarios in your application, such as promptMedia for attached media of prompts or liveVidio for livestream audio. The default eventId is sent in your onboarding email. Contact DeepCleer if you need a new eventId. |
| imgType | string | Conditional | 64 | Frame detection types. At least one of imgType or imgBusinessType must be provided. See Image Detection Types. |
| audioType | string | Conditional | 64 | Audio detection types. At least one of audioType or audioBusinessType must be provided. See Audio Detection Types. |
| imgBusinessType | string | Conditional | 128 | Frame business detection labels. At least one of imgType or imgBusinessType must be provided. See your business label catalog for available values. |
| audioBusinessType | string | Conditional | 128 | Audio business detection labels. At least one of audioType or audioBusinessType must be provided. See Audio Business Types. |
| callback | string | Yes | 500 | URL that receives asynchronous moderation results. Supports HTTP and HTTPS. |
| data | object | Yes | 1 MB | Request payload containing video and user metadata. See data Object Parameters. |
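The required and conditional rules in the table above can be checked on the client before a request is sent. The sketch below is one possible pre-flight check; the function name `validate_top_level` is hypothetical, and it assumes that Max Length is counted in characters, which the table does not state.

```python
# Parameters marked "Yes" in the Required column.
REQUIRED = ("accessKey", "appId", "eventId", "callback", "data")

# Max Length values for the string parameters (assumed to be characters).
MAX_LENGTH = {
    "accessKey": 20, "appId": 64, "eventId": 64,
    "imgType": 64, "audioType": 64,
    "imgBusinessType": 128, "audioBusinessType": 128,
    "callback": 500,
}


def validate_top_level(params: dict) -> list:
    """Return a list of human-readable problems; empty means none found."""
    errors = []
    for name in REQUIRED:
        if not params.get(name):
            errors.append(f"missing required parameter: {name}")
    # Conditional pairs: at least one member of each pair must be present.
    if not (params.get("imgType") or params.get("imgBusinessType")):
        errors.append("provide imgType or imgBusinessType")
    if not (params.get("audioType") or params.get("audioBusinessType")):
        errors.append("provide audioType or audioBusinessType")
    for name, limit in MAX_LENGTH.items():
        value = params.get(name)
        if isinstance(value, str) and len(value) > limit:
            errors.append(f"{name} exceeds {limit} characters")
    return errors
```

A check like this only mirrors the documented constraints; the service remains the authority and reports invalid parameters with code 1902.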

Image Detection Types

Combine multiple values with underscores (e.g. POLITY_QRCODE_ADVERT).

| Value | Description |
| --- | --- |
| POLITY | Politically sensitive content |
| EROTIC | Pornographic & sexually suggestive content |
| VIOLENT | Violence, terrorism & prohibited content |
| QRCODE | QR code detection |
| ADVERT | Advertising content |
| IMGTEXTRISK | Image text violation detection |

Audio Detection Types

Combine multiple values with underscores (e.g. POLITY_EROTIC).

| Value | Description |
| --- | --- |
| POLITY | Politically sensitive content |
| EROTIC | Pornographic content |
| ADVERT | Advertising content |
| BAN | Prohibited content |
| VIOLENT | Violence & terrorism content |
| DIRTY | Verbal abuse |
| ADLAW | Advertising-law violations |
| MOAN | Sexual moaning |
| AUDIOPOLITICAL | Voiceprint of top political leaders |
| ANTHEN | National anthem detection |
| BANEDAUDIO | Prohibited songs |
| NONE | Skip audio detection |

Audio Business Types

Combine multiple values with underscores. When detecting TIMBRE, SING, or LANGUAGE, you must also include GENDER.

| Value | Description |
| --- | --- |
| SING | Singing detection |
| LANGUAGE | Language detection (Chinese, English, Cantonese, Tibetan, Uyghur, Korean, Mongolian, Other) |
| MINOR | Minor speaker detection |
| GENDER | Speaker gender |
| TIMBRE | Speaker timbre |
| VOICE | Human-voice attribute |
| AUDIOSCENE | Audio scene |
| AGE | Speaker age |
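The GENDER dependency is easy to forget when assembling the audioBusinessType string by hand. A small helper can add it automatically; this is a sketch only, the name `build_audio_business_type` is hypothetical, and it assumes the order of the underscore-joined values does not matter.

```python
# Values from the Audio Business Types table.
AUDIO_BUSINESS_TYPES = {
    "SING", "LANGUAGE", "MINOR", "GENDER",
    "TIMBRE", "VOICE", "AUDIOSCENE", "AGE",
}
# Types that must be accompanied by GENDER.
NEEDS_GENDER = {"TIMBRE", "SING", "LANGUAGE"}


def build_audio_business_type(values) -> str:
    """Join audio business types with underscores, adding GENDER if needed."""
    selected = []
    for value in values:
        if value not in AUDIO_BUSINESS_TYPES:
            raise ValueError(f"unknown audio business type: {value}")
        if value not in selected:  # drop duplicates, keep first-seen order
            selected.append(value)
    if NEEDS_GENDER.intersection(selected) and "GENDER" not in selected:
        selected.append("GENDER")
    return "_".join(selected)
```

For example, `build_audio_business_type(["SING", "LANGUAGE"])` yields `SING_LANGUAGE_GENDER`, while `build_audio_business_type(["MINOR"])` is left as `MINOR`.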

data Object Parameters

| Parameter | Type | Required | Max Length | Description |
| --- | --- | --- | --- | --- |
| btId | string | Yes | 64 | Client-side unique request identifier. |
| tokenId | string | Yes | 64 | User account identifier. Recommended to pass the user ID for behavioral risk detection. |
| url | string | Yes | 600 | URL of the video to be moderated. |
| audioDetectStep | int32 | No | | Audio moderation sampling step. Range: 1–36. 1 skips one 10-second segment between reviews, 2 skips two, and so on. When omitted, all audio content is moderated. |
| checkFrameCount | int32 | No | | Fixed number of frames to capture. Includes the first and last frames by default; remaining positions are calculated as video_duration / frame_count (rounded to 3 decimal places, values > 0 are used). Priority: checkFrameCount > advancedFrequency > detectFrequency. If the video duration cannot be determined, falls back to detectFrequency. |
| dataId | string | No | 128 | Custom data ID. Searchable in the DeepCleer SaaS dashboard. |
| detectFrequency | int32 | No | | Frame capture interval in seconds. Range: 1–60. Default: 5. |
| deviceId | string | No | 128 | DeepCleer device fingerprint identifier, generated by the DeepCleer SDK for user behavior analysis. |
| gender | int32 | No | | User gender. 0: unknown. 1: male. 2: female. |
| ip | string | No | 64 | Client public IP address (IPv4 or IPv6) for IP-based user behavior analysis. |
| lang | string | No | | Language for text detection in captured frames and audio segments. Default: zh. See Supported Languages. |
| level | int32 | No | | User level for configuring different interception strategies. See User Levels. |
| receiveTokenId | string | Conditional | 64 | Message receiver's tokenId. Alphanumeric with underscores and hyphens, up to 64 characters. Required when eventId is message. |
| returnAllAudio | int32 | No | | Controls which audio segments are returned. 0 (default): return only segments with non-PASS risk levels. 1: return all segments regardless of risk level. |
| returnAllImg | int32 | No | | Controls which video frames are returned. 0 (default): return only frames with non-PASS risk levels. 1: return all frames regardless of risk level. |
| returnAllVideo | int32 | No | | Controls which video clips are returned. Only effective when detection types include DANCE. 0 (default): return only clips with non-PASS risk levels. 1: return all clips regardless of risk level. |
| videoTitle | string | No | 128 | Video name. Displayed in the dashboard. |
| advancedFrequency | object | No | | Advanced duration-based frame capture configuration. When set, overrides the default capture strategy. See advancedFrequency Object. |
| extra | object | No | | Auxiliary parameters. |
| extra.passThrough | object | No | 1024 | Client pass-through field. DeepCleer does not process this field; it is returned as-is in the callback. |
| extra.acceptLang | string | No | | Language for returned labels. en (default): English. zh: Chinese. |
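Three parameters influence frame capture, and the checkFrameCount description gives their priority. The sketch below expresses that priority as a function so it is easier to see which setting wins for a given data object. The function name `frame_capture_strategy` and the `duration_known` flag are hypothetical; the service makes this decision itself, so the code is only a reading aid.

```python
def frame_capture_strategy(data: dict, duration_known: bool = True) -> str:
    """Name the parameter that governs frame capture for this data object.

    Documented priority: checkFrameCount > advancedFrequency > detectFrequency.
    """
    if data.get("checkFrameCount") is not None:
        # Documented fallback: without a known video duration,
        # checkFrameCount cannot be applied and detectFrequency is used.
        return "checkFrameCount" if duration_known else "detectFrequency"
    if data.get("advancedFrequency") is not None:
        return "advancedFrequency"
    # Applies with the documented default of 5 seconds when omitted.
    return "detectFrequency"
```

A data object that sets both advancedFrequency and detectFrequency therefore resolves to `advancedFrequency`, and the detectFrequency value is not the governing setting.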

Supported Languages

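| Value | Language |
| --- | --- |
| zh | Chinese (default) |
| en | English |
| ar | Arabic |
| hi | Hindi |
| es | Spanish |
| fr | French |
| ru | Russian |
| pt | Portuguese |
| id | Indonesian |
| de | German |
| ja | Japanese |
| tr | Turkish |
| vi | Vietnamese |
| it | Italian |
| th | Thai |
| tl | Filipino |
| ko | Korean |
| ms | Malay |
| auto | Automatic language detection (contact DeepCleer to enable) |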

User Levels

| Value | Description |
| --- | --- |
| 0 | Lowest-level user (e.g., newly registered, completely inactive, or level-0 users) |
| 1 | Lower-level user (e.g., low activity or low-level users) |
| 2 | Mid-level user (e.g., moderately active or mid-level users) |
| 3 | Higher-level user (e.g., highly active or high-level users) |
| 4 | Highest-level user (e.g., paying users, VIP users) |

advancedFrequency Object

Configure dynamic frame capture rates based on video duration.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| durationPoints | array | No | Video duration interval breakpoints in seconds. Maximum of 5 elements (each int32). |
| frequencies | array | No | Frame capture frequencies (int32, range 1–60 seconds) corresponding to each duration interval. Maximum of 6 elements. The frequencies array must have exactly one more element than durationPoints. Invalid or empty values return error code 1902. |

Example configuration:

{
  "durationPoints": [300, 600],
  "frequencies": [1, 5, 10]
}

This means:

  • Video duration ≤ 300s → capture 1 frame per second
  • 300s < video duration ≤ 600s → capture 1 frame every 5 seconds
  • Video duration > 600s → capture 1 frame every 10 seconds
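The interval lookup described above can be sketched as a binary search over the breakpoints. This is an illustration of the documented rules rather than DeepCleer's implementation; the function names are hypothetical, and the check that durationPoints is in ascending order is an assumption the table does not state.

```python
from bisect import bisect_left


def validate_advanced_frequency(duration_points, frequencies) -> None:
    """Raise ValueError for configurations the table describes as invalid."""
    if not duration_points or not frequencies:
        raise ValueError("durationPoints and frequencies must not be empty")
    if len(duration_points) > 5 or len(frequencies) > 6:
        raise ValueError("at most 5 durationPoints and 6 frequencies")
    if len(frequencies) != len(duration_points) + 1:
        raise ValueError("frequencies needs one more element than durationPoints")
    if any(not 1 <= f <= 60 for f in frequencies):
        raise ValueError("each frequency must be between 1 and 60 seconds")
    # Assumption: breakpoints are listed in ascending order.
    if list(duration_points) != sorted(duration_points):
        raise ValueError("durationPoints must be ascending")


def capture_interval(duration, duration_points, frequencies) -> int:
    """Return the capture interval (seconds) for a video of this duration."""
    validate_advanced_frequency(duration_points, frequencies)
    # bisect_left keeps a duration equal to a breakpoint in the lower band,
    # matching "duration <= 300s" in the example.
    return frequencies[bisect_left(duration_points, duration)]
```

With the example configuration, `capture_interval(300, [300, 600], [1, 5, 10])` returns 1, `capture_interval(301, ...)` returns 5, and `capture_interval(601, ...)` returns 10.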

Response

The synchronous response is an acknowledgement only — it confirms that DeepCleer has accepted the moderation task. Per-video moderation results are delivered asynchronously through the callback URL you provided.

Response Parameters

Parameters other than code, message, and requestId are only guaranteed to be returned when code is 1100.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| requestId | string | Yes | Unique DeepCleer request identifier. |
| code | int32 | Yes | Response code. See Response Codes. |
| message | string | Yes | Response message corresponding to the code. |
| btId | string | Yes | Client-side unique request identifier (echoed from the request). |

Response Codes

| Code | Message | Description |
| --- | --- | --- |
| 1100 | Success | The request completed successfully. |
| 1901 | QPS limit exceeded | The request rate limit has been exceeded. |
| 1902 | Invalid parameters | One or more request parameters are invalid. |
| 1903 | Service failure | An internal service error occurred. |
| 1905 | Invalid content format | The content to be moderated does not meet format requirements. |
| 9101 | Unauthorized operation | The provided accessKey does not have permission for this operation. |
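Client code usually needs to decide what to do with each code. The sketch below groups the codes into three outcomes. Treating 1901 and 1903 as worth retrying is an assumption based on their descriptions (rate limiting and an internal error), not a documented recommendation, and the name `classify_response` is hypothetical.

```python
# Codes and messages from the Response Codes table.
RESPONSE_CODES = {
    1100: "Success",
    1901: "QPS limit exceeded",
    1902: "Invalid parameters",
    1903: "Service failure",
    1905: "Invalid content format",
    9101: "Unauthorized operation",
}

# Assumption: rate-limit and internal errors may succeed on a later attempt.
RETRYABLE_CODES = {1901, 1903}


def classify_response(response: dict) -> str:
    """Map a synchronous response to 'accepted', 'retry', or 'fail'."""
    code = response.get("code")
    if code == 1100:
        return "accepted"
    if code in RETRYABLE_CODES:
        return "retry"
    # 1902, 1905, 9101, and unknown codes need a change on the caller's side.
    return "fail"
```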

Callback Parameters

Parameters other than code, message, and requestId are only guaranteed to be returned when code is 1100.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| requestId | string | Yes | Unique DeepCleer request identifier. |
| code | int32 | Yes | Response code. See Response Codes. |
| message | string | Yes | Response message corresponding to the code. |
| btId | string | Yes | Client-side unique request identifier. |
| riskLevel | string | Yes | Overall risk level. PASS: normal (allow). REVIEW: suspicious (manual review). REJECT: violation (block). |
| auxInfo | object | Yes | Auxiliary information. See Callback auxInfo Object. |
| frameDetail | array | No | Frame image risk details. Returned when risky frames exist or returnAllImg is 1. See frameDetail Array. |
| audioDetail | array | No | Audio segment risk details. Returned when risky segments exist or returnAllAudio is 1. See audioDetail Array. |
| tokenProfileLabels | array | No | Account attribute labels. Returned only when tokenId is provided and the labeling service is enabled. See Token Labels. |
| tokenRiskLabels | array | No | Account risk labels. Returned only when tokenId is provided and the labeling service is enabled. See Token Labels. |
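The riskLevel values map directly onto the three actions named in the table. A minimal sketch of that mapping follows; the action strings and the function name `decide_action` are hypothetical, and sending an unrecognized riskLevel to manual review is a conservative assumption rather than documented behavior.

```python
# riskLevel values and the actions suggested in the table above.
ACTIONS = {
    "PASS": "allow",
    "REVIEW": "manual_review",
    "REJECT": "block",
}


def decide_action(callback: dict) -> str:
    """Choose an action for a callback payload."""
    # riskLevel is only guaranteed to be present when code is 1100.
    if callback.get("code") != 1100:
        return "error"
    # Assumption: an unexpected riskLevel is routed to manual review.
    return ACTIONS.get(callback.get("riskLevel"), "manual_review")
```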

Callback auxInfo Object

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| billingAudioDuration | float | Yes | Audio duration (in seconds) in the current video for billing purposes. If the audio track duration differs from the video duration, billing is based on the actual audio track duration (may be 0 if no audio track exists). |
| billingImgNum | int32 | Yes | Number of captured frame images in the current video for billing purposes. |
| frameCount | int32 | Yes | Number of returned video frames. When returnAllImg is 0, this is the risk-frame count; when returnAllImg is 1, this is the total count. |
| time | float | Yes | Video duration in seconds. |
| passThrough | object | No | Client pass-through field returned as-is. |

frameDetail Array

Each element in the array represents a captured frame:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| imgUrl | string | Yes | URL of the captured frame image. |
| requestId | string | Yes | Unique DeepCleer request identifier for this frame. |
| riskLevel | string | Yes | Risk level. PASS: normal. REVIEW: suspicious. REJECT: violation. |
| riskLabel1 | string | Yes | Level 1 risk label. Returns normal when riskLevel is PASS. |
| riskLabel2 | string | Yes | Level 2 risk label. Empty when riskLevel is PASS. |
| riskLabel3 | string | Yes | Level 3 risk label. Empty when riskLevel is PASS. |
| riskDescription | string | Yes | Risk description. Returns "Normal" when riskLevel is PASS. Hits against custom lists return "Matched custom list". Otherwise format: "Level 1: Level 2: Level 3". For reference only; do not use for programmatic logic. |
| allLabels | array | Yes | All risk labels detected for this frame. See Frame allLabels. |
| auxInfo | object | Yes | Frame auxiliary information. See Frame auxInfo. |
| riskDetail | object | Yes | Risk detail information. See Frame riskDetail. |
| imgText | string | No | OCR text content of the frame. Returned only when imgType includes ADVERT or IMGTEXTRISK. |
| time | float | No | Timestamp of this frame relative to the video start, in seconds. |
| businessLabels | array | No | Business label list. See Frame businessLabels. |

Frame allLabels

Each element in the allLabels array:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| riskLevel | string | No | Risk level: PASS, REVIEW, or REJECT. |
| riskLabel1 | string | No | Level 1 risk label. |
| riskLabel2 | string | No | Level 2 risk label. |
| riskLabel3 | string | No | Level 3 risk label. |
| riskDescription | string | No | Risk description. Returns "Normal" when riskLevel is PASS. Format: "Level 1: Level 2: Level 3". For reference only; do not use for programmatic logic. |
| probability | float | No | Confidence score (0–1). Higher values indicate greater confidence. |
| riskDetail | object | No | Risk detail information. See Frame riskDetail. |

Frame auxInfo

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| similarity | float | Yes | Similarity between the current frame and the previous frame. The first frame is compared against a pure black background image. Range: 0–1 (closer to 1 = more similar). |
| qrContent | string | No | QR code URL detected in the image. |

Frame riskDetail

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| riskSource | int32 | Yes | Risk source. 1000: no risk. 1001: text risk. 1002: visual image risk. |
| face_num | int32 | No | Number of faces detected. |
| person_num | int32 | No | Number of persons detected. |
| faces | array | No | Names and positions of politically sensitive individuals in the image. Up to 10 entries (highest probability selected if more than 10). See Face Object. |
| objects | array | No | Detected objects/logos with names and positions. See Object Info. |
| ocrText | object | No | OCR text content. Present when imgType includes IMGTEXTRISK or ADVERT. Contains text (string): recognized text in the image. |
| matchedLists | array | No | Matched custom list information. Returned only when a custom list is hit. See Matched Lists. |
| riskSegments | array | No | High-risk content segments. Present when political, terrorism, prohibited, competitive, or advertising-law content is detected. See Risk Segments. |
| persons | array | No | Person names and positions. When the "person: multiple persons" label is hit, the array contains multiple elements (up to 10, highest probability selected). See Person Object. |

Face Object

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| id | string | No | Identifier. The same person at the same position has the same ID across different labels. If the same person appears N times, N IDs are assigned. |
| name | string | No | Person name. |
| face_ratio | float | No | Face-to-image ratio (0–1). Higher values indicate a larger face proportion. |
| probability | float | No | Confidence score (0–1). |
| location | array | No | Face position coordinates [x1, y1, x2, y2] representing the top-left and bottom-right corners. Example: [207, 522, 340, 567] where 207=top-left X, 522=top-left Y, 340=bottom-right X, 567=bottom-right Y. |

Object Info

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| id | string | No | Object/logo identifier. The same object at the same position has the same ID across different labels. |
| name | string | No | Object name. |
| probability | float | No | Confidence score (0–1). |
| qrContent | string | No | QR code URL detected in the image. |
| location | array | No | Object position coordinates [x1, y1, x2, y2] representing the top-left and bottom-right corners. |

Matched Lists

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | No | Name of the matched list. |
| words | array | No | Sensitive word information from the matched list. |
| words[].word | string | No | The matched sensitive word. |
| words[].position | array | No | Position of the sensitive word. |

Risk Segments

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| segment | string | No | High-risk content segment. |
| position | array | No | Position of the high-risk content segment (0-indexed). |

Person Object

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| id | string | No | Identifier. The same person has the same ID across different labels. If the same person appears N times, N IDs are assigned. |
| person_ratio | float | No | Person-to-image ratio (0–1). Higher values indicate a larger person proportion. |
| probability | float | No | Confidence score (0–1). |
| location | array | No | Person position coordinates. |
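The location arrays in the Face Object and Object Info entries use a corner-based [x1, y1, x2, y2] layout, whereas many drawing and cropping tools expect an origin plus width and height. A small conversion sketch follows; the helper name `box_size` is hypothetical, and the coordinates are assumed to be in pixels, which this reference does not state.

```python
def box_size(location):
    """Return (width, height) of an [x1, y1, x2, y2] box.

    (x1, y1) is the top-left corner and (x2, y2) the bottom-right corner.
    """
    x1, y1, x2, y2 = location
    return x2 - x1, y2 - y1


# The example from the Face Object description.
width, height = box_size([207, 522, 340, 567])
```

For the documented example `[207, 522, 340, 567]` this gives a box 133 wide and 45 high.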

Frame businessLabels

Each element in the businessLabels array:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| businessLabel1 | string | Yes | Level 1 business label. |
| businessLabel2 | string | Yes | Level 2 business label. |
| businessLabel3 | string | Yes | Level 3 business label. |
| businessDescription | string | Yes | Business label description. Format: "Level 1: Level 2: Level 3". For reference only; do not use for programmatic logic. |
| probability | float | Yes | Confidence score (0–1). |
| confidenceLevel | int32 | No | Confidence level (0–2). Higher values indicate greater confidence. |
| businessDetail | object | No | Business label details. May contain face_num, person_num, faces, objects, and persons with the same structure as described in Frame riskDetail. |

audioDetail Array

Each element in the array represents an audio segment:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| audioUrl | string | Yes | URL of the audio segment. |
| requestId | string | Yes | Unique DeepCleer request identifier for this segment. |
| riskLevel | string | Yes | Risk level. PASS: normal. REVIEW: suspicious. REJECT: violation. |
| riskLabel1 | string | Yes | Level 1 risk label. Returns normal when riskLevel is PASS. |
| riskLabel2 | string | Yes | Level 2 risk label. Empty when riskLevel is PASS. |
| riskLabel3 | string | Yes | Level 3 risk label. Empty when riskLevel is PASS. |
| riskDescription | string | Yes | Risk description. Format: "Level 1: Level 2: Level 3". Returns "Matched custom list" when a custom list is hit. For reference only; do not use for programmatic logic. |
| allLabels | array | Yes | All risk labels detected for this segment. See Audio allLabels. |
| audioText | string | No | Recognized text content of this audio segment. |
| audioStarttime | float | No | Audio segment start time relative to the audio beginning, in seconds. (Note: this field uses lowercase t here. The Audio Stream Moderation API returns the same conceptual field as audioStartTime (uppercase T). On-the-wire casing is preserved as-returned and is a candidate for v5 alignment.) |
| audioEndtime | float | No | Audio segment end time relative to the audio beginning, in seconds. (Same casing-inconsistency note as audioStarttime.) |
| businessLabels | array | No | Business label list. See Audio businessLabels. |

Audio allLabels

Each element in the allLabels array:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| riskLevel | string | No | Risk level: PASS, REVIEW, or REJECT. |
| riskLabel1 | string | No | Level 1 risk label. |
| riskLabel2 | string | No | Level 2 risk label. |
| riskLabel3 | string | No | Level 3 risk label. |
| riskDescription | string | No | Risk description. For reference only; do not use for programmatic logic. |
| probability | float | No | Confidence score (0–1). |
| riskDetail | object | No | Risk detail information. See Audio riskDetail. |

Audio riskDetail

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| riskSource | int32 | Yes | Risk source. 1000: no risk. 1001: text risk. 1003: audio voice risk. |
| audioText | string | No | Recognized text content of this segment. |
| matchedLists | array | No | Matched custom list information. See Matched Lists. |
| riskSegments | array | No | High-risk content segments. Present when political, terrorism, prohibited, competitive, or advertising-law content is detected. See Risk Segments. |

Audio businessLabels

Each element in the businessLabels array:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| businessLabel1 | string | Yes | Level 1 business label. |
| businessLabel2 | string | Yes | Level 2 business label. |
| businessLabel3 | string | Yes | Level 3 business label. |
| businessDescription | string | Yes | Business label description. Format: "Level 1: Level 2: Level 3". For reference only; do not use for programmatic logic. |
| probability | float | Yes | Confidence score (0–1). |
| confidenceLevel | int32 | No | Confidence level (0–2). Higher values indicate greater confidence. |
| businessDetail | object | No | Business label details. |
| businessDetail.riskSource | int32 | No | Risk source. 1000: no risk. 1001: text risk. 1003: audio voice risk. |
| businessDetail.audioText | string | No | Recognized text content. |
| businessDetail.matchedLists | array | No | Matched custom list information. See Matched Lists. |
| businessDetail.riskSegments | array | No | High-risk content segments. See Risk Segments. |

Token Labels

Both tokenProfileLabels and tokenRiskLabels share the same structure:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| label1 | string | No | Level 1 label. |
| label2 | string | No | Level 2 label. |
| label3 | string | No | Level 3 label. |
| description | string | No | Label description. For reference only; do not use for programmatic logic. |
| timestamp | int64 | No | Label assignment time. 13-digit Unix timestamp in milliseconds (UTC). |
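Since the label timestamp is expressed in milliseconds, it has to be split into seconds and a sub-second remainder before most date libraries accept it. A minimal sketch of the conversion to a timezone-aware datetime follows; the function name `label_time` is hypothetical.

```python
from datetime import datetime, timezone


def label_time(timestamp_ms: int) -> datetime:
    """Convert a 13-digit millisecond Unix timestamp to a UTC datetime."""
    # Integer arithmetic avoids floating-point rounding of the milliseconds.
    seconds, millis = divmod(timestamp_ms, 1000)
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.replace(microsecond=millis * 1000)
```

For instance, `label_time(1666656000123)` corresponds to 2022-10-25 00:00:00.123 UTC.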

Examples

Request Example

{
  "accessKey": "YOUR_ACCESS_KEY",
  "appId": "default",
  "audioBusinessType": "SING_LANGUAGE_GENDER",
  "audioType": "POLITY_EROTIC_ADVERT_MOAN",
  "callback": "http://www.example.com/callbackaddr",
  "data": {
    "advancedFrequency": {
      "durationPoints": [300, 600],
      "frequencies": [1, 5, 10]
    },
    "btId": "1639824316368",
    "detectFrequency": 3,
    "ip": "123.171.34.3",
    "returnAllAudio": 1,
    "returnAllImg": 1,
    "tokenId": "test",
    "url": "http://oss.example.com/static/photo/117608703147396.mp4"
  },
  "eventId": "video",
  "imgBusinessType": "BODY_FOOD_3CPRODUCTSLOGO",
  "imgType": "POLITY_EROTIC_ADVERT"
}
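One way to send a payload like the one above is with the Python standard library. The sketch below uses the Singapore endpoint and the recommended 7-second timeout from this page. The Content-Type header value is an assumption (the page only says that parameters use JSON with UTF-8 encoding), and the function names `build_request` and `submit_video` are hypothetical.

```python
import json
import urllib.request

# Singapore cluster endpoint from the Request URL section.
ENDPOINT = "http://api-video-xjp.fengkongcloud.com/video/v4"


def build_request(payload: dict, endpoint: str = ENDPOINT) -> urllib.request.Request:
    """Build a POST request carrying the payload as UTF-8 encoded JSON."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        # Assumed header; adjust it if your integration guide says otherwise.
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


def submit_video(payload: dict, timeout: float = 7.0) -> dict:
    """Send the request and return the synchronous acknowledgement."""
    request = build_request(payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the payload easy to inspect or log without a network call. The returned dictionary is only the acknowledgement; moderation results still arrive through the callback.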

Synchronous Response Example

{
  "btId": "1639824316368",
  "code": 1100,
  "message": "Success",
  "requestId": "66fb85e3149bb9e13d6c72161cc6c6cf"
}

Callback Example

{
  "audioDetail": [
    {
      "allLabels": [
        {
          "probability": 0.998463273048401,
          "riskDescription": "Abuse: Personal attack: Severe personal attack",
          "riskDetail": {
            "audioText": "Recognized audio text content...",
            "riskSource": 1001
          },
          "riskLabel1": "abuse",
          "riskLabel2": "renshengongji",
          "riskLabel3": "zhongdurenshengongji",
          "riskLevel": "REJECT"
        }
      ],
      "audioEndtime": 20,
      "audioStarttime": 10,
      "audioText": "Recognized audio text content...",
      "audioUrl": "http://example.com/audio_segment_a0001.wav",
      "businessLabels": [],
      "requestId": "edaa113581ec1c18df7b44c86d36ae3b_a0001",
      "riskDescription": "Abuse: Personal attack: Severe personal attack",
      "riskDetail": {
        "audioText": "Recognized audio text content...",
        "riskSource": 1001
      },
      "riskLabel1": "abuse",
      "riskLabel2": "renshengongji",
      "riskLabel3": "zhongdurenshengongji",
      "riskLevel": "REJECT"
    }
  ],
  "auxInfo": {
    "billingAudioDuration": 85,
    "billingImgNum": 2,
    "frameCount": 2,
    "time": 85
  },
  "btId": "1666684506188",
  "code": 1100,
  "frameDetail": [
    {
      "allLabels": [
        {
          "probability": 0.665125370025635,
          "riskDescription": "Politics: Political symbols: Party emblem",
          "riskDetail": {
            "ocrText": {
              "text": "2022/10/25 09:05"
            },
            "riskSource": 1002
          },
          "riskLabel1": "politics",
          "riskLabel2": "zhengzhixiangzheng",
          "riskLabel3": "danghui",
          "riskLevel": "REJECT"
        }
      ],
      "auxInfo": {
        "similarity": 0.4765625
      },
      "businessLabels": [
        {
          "businessDescription": "Face: Face pose: Frontal face",
          "businessDetail": {},
          "businessLabel1": "face",
          "businessLabel2": "renlianzitai",
          "businessLabel3": "zhenglian",
          "confidenceLevel": 1,
          "probability": 0.450656906102068
        },
        {
          "businessDescription": "Face: Face type: Real person",
          "businessDetail": {
            "face_num": 1,
            "faces": [
              {
                "face_ratio": 0.00227673095650971,
                "id": "f7bf8842f80a5a2192781064bd69e776",
                "location": [352, 237, 381, 278],
                "name": "Example Person",
                "probability": 0.499512671029603
              }
            ]
          },
          "businessLabel1": "face",
          "businessLabel2": "renlianleixing",
          "businessLabel3": "zhenren",
          "confidenceLevel": 2,
          "probability": 0.979977369308472
        }
      ],
      "imgText": "2022/10/25 09:05",
      "imgUrl": "http://example.com/frame_v81.jpg",
      "requestId": "edaa113581ec1c18df7b44c86d36ae3b_v81",
      "riskDescription": "Politics: Political symbols: Party emblem",
      "riskDetail": {
        "ocrText": {
          "text": "2022/10/25 09:05"
        },
        "riskSource": 1002
      },
      "riskLabel1": "politics",
      "riskLabel2": "zhengzhixiangzheng",
      "riskLabel3": "danghui",
      "riskLevel": "REJECT",
      "time": 81
    }
  ],
  "message": "Success",
  "requestId": "66fb85e3149bb9e13d6c72161cc6c6cf",
  "riskLevel": "REJECT"
}
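A callback like the one above is often reduced to a short list of flagged positions for reviewers. The following sketch collects every non-PASS frame and audio segment together with its time offset. The function name `flagged_items` and the tuple layout are hypothetical; the field names, including the lowercase t in audioStarttime, are taken from the tables on this page.

```python
def flagged_items(callback: dict) -> list:
    """Return (kind, start_seconds, riskLevel, riskLabel1) for risky items."""
    items = []
    # frameDetail and audioDetail are optional, so tolerate their absence.
    for frame in callback.get("frameDetail") or []:
        if frame.get("riskLevel") != "PASS":
            items.append(("frame", frame.get("time"),
                          frame.get("riskLevel"), frame.get("riskLabel1")))
    for segment in callback.get("audioDetail") or []:
        if segment.get("riskLevel") != "PASS":
            items.append(("audio", segment.get("audioStarttime"),
                          segment.get("riskLevel"), segment.get("riskLabel1")))
    return items
```

Applied to the callback example, this yields one frame entry at 81 seconds labeled politics and one audio entry starting at 10 seconds labeled abuse. Decisions are keyed on riskLevel and the riskLabel fields, since riskDescription is for reference only.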