This API submits a video stream and its related information for moderation. Once the stream is being pulled stably, recognition results are sent continuously to the specified callback address.
| Item | Specification |
|---|---|
| Communication Protocol | HTTP or HTTPS |
| Request Method | POST |
| Character Encoding | UTF-8 |
| Parameter Format | All request and response parameters use JSON format |
Standard stream URLs currently support RTMP, RTMPS, HLS, HTTP, and HTTPS protocols, as well as FLV, M3U8, and other formats.
A push is considered successful when your callback endpoint receives the result and responds with HTTP status code 200. Otherwise, the system retries the push at intervals of [5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60] seconds. If the push still fails after 12 retries, no further retries are attempted.
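To stop the retry cycle, a callback receiver only has to answer each push with HTTP 200. The following is a minimal sketch using Python's standard library. The `handle_push` helper, the `serve` function, the port, and the choice to answer 400 for unparseable bodies are illustrative assumptions, not part of the Deepcleer specification.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def handle_push(raw_body: bytes) -> int:
    """Return the HTTP status to send back for one callback push.

    Hypothetical helper: 200 acknowledges the push; any other status
    causes the sender to retry on the documented schedule.
    """
    try:
        payload = json.loads(raw_body.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return 400  # unparseable body: do not acknowledge
    if not isinstance(payload, dict):
        return 400
    # Hand the payload to your own queue or storage here (omitted).
    return 200


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of bytes announced by the sender.
        length = int(self.headers.get("Content-Length", 0))
        status = handle_push(self.rfile.read(length))
        self.send_response(status)
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Run the receiver; the port is an arbitrary choice for local testing."""
    HTTPServer(("0.0.0.0", port), CallbackHandler).serve_forever()
```

Acknowledging quickly and processing the payload asynchronously keeps slow downstream work from triggering unnecessary retries.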
To prevent stream pull failures caused by network anomalies, the Deepcleer video stream service includes a retry mechanism for failed stream pulls. The specific mechanism is as follows:
- Standard streams, Zego/Tencent/Volcano streams: A total of 12 retries. Each attempt lasts 5 minutes with intervals of [5, 10, 15, 20, ... 60 seconds]. For example, Deepcleer first attempts continuous stream pulling for 5 minutes. If unsuccessful, it waits 5 seconds and pulls again for another 5 minutes. If still unsuccessful, it waits 10 seconds and pulls for another 5 minutes, and so on.
- Agora streams: No retries. The connection is closed after a 5-minute stream pull timeout.
- Recommended client-side timeout for requests to this API: 7 seconds
- Internal processing timeout: 3 seconds with one retry. Normal API response time is within 100ms.
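The stream-pull retry schedule for standard, Zego, Tencent, and Volcano streams can be turned into a worst-case duration with simple arithmetic. The sketch below assumes that "12 retries" means one initial 5-minute attempt followed by 12 further attempts; if the initial attempt is counted among the 12, the total is one attempt shorter. The constant and function names are illustrative.

```python
ATTEMPT_SECONDS = 5 * 60             # each pull attempt lasts 5 minutes
RETRY_WAITS = list(range(5, 61, 5))  # waits before each retry: 5, 10, ..., 60 seconds


def worst_case_pull_seconds() -> int:
    """Time until the final retry gives up (assumes 1 initial attempt + 12 retries)."""
    attempts = 1 + len(RETRY_WAITS)
    return attempts * ATTEMPT_SECONDS + sum(RETRY_WAITS)


print(worst_case_pull_seconds())  # 13 * 300 + 390 = 4290 seconds, about 71.5 minutes
```

Agora streams do not follow this schedule: they are closed after a single 5-minute pull timeout.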
See Historical Versions.
| Cluster | Request URL | Supported Products |
|---|---|---|
| Shanghai Video Stream Cluster | http://api-videostream-sh.fengkongcloud.com/videostream/v4 | Chinese video stream |
| Singapore Video Stream Cluster | http://api-videostream-xjp.fengkongcloud.com/videostream/v4 | Chinese video stream, English video stream, Arabic video stream |
| Parameter | Type | Required | Max Length | Description |
|---|---|---|---|---|
| accessKey | string | Yes | 20 | Company key. Used for permission authentication, provided by Deepcleer upon service activation. |
| eventId | string | Yes | 64 | Event identifier. The value must be agreed upon with Deepcleer in advance. |
| appId | string | Yes | 64 | Application identifier. This field is strictly validated and must be agreed upon with Deepcleer in advance. |
| imgType | string | No | 64 | Moderation types for video stream frames. At least one of imgType or imgBusinessType must be provided. Possible values: POLITY — Political content detection, EROTIC — Pornography & sexual content detection, VIOLENT — Terrorism & prohibited content detection, QRCODE — QR code detection, ADVERT — Advertisement detection, IMGTEXTRISK — Image text violation detection. To combine multiple types, join them with underscores, e.g., POLITY_QRCODE_ADVERT for political, QR code, and advertisement detection. |
| audioType | string | No | 64 | Moderation types for audio in the video stream. At least one of audioType or audioBusinessType must be provided. Possible values: POLITY — Political content detection, EROTIC — Pornographic content detection, ADVERT — Advertisement detection, BAN — Prohibited content detection, VIOLENT — Terrorism detection, DIRTY — Profanity detection, ADLAW — Advertising law, MOAN — Moaning detection, AUDIOPOLITICAL — Top leader voiceprint detection, ANTHEN — National anthem detection, BANEDAUDIO — Prohibited songs, NONE — Do not detect audio. To combine multiple types, join them with underscores, e.g., POLITY_EROTIC for political and pornographic detection. |
| imgBusinessType | string | No | 128 | Business detection types for video stream frames. At least one of imgBusinessType or imgType must be provided. See business label types for possible values. |
| audioBusinessType | string | No | 128 | Business detection types for audio in the video stream. At least one of audioBusinessType or audioType must be provided. Possible values: SING — Singing detection, LANGUAGE — Language detection (Chinese, English, Cantonese, Tibetan, Uyghur, Korean, Mongolian, Other), MINOR — Minor detection, GENDER — Gender detection, TIMBRE — Timbre detection, VOICE — Voice attributes, AUDIOSCENE — Audio scene, AGE — Age detection, APPNAME — App name detection. To detect timbre, singing, or language, GENDER must also be included. To combine multiple types, join them with underscores. |
| imgCallback | string | Yes | 1024 | Image callback URL. Detection results of captured frames from the video stream are sent to this address. |
| audioCallback | string | No | 1024 | Audio callback URL. Detection results of audio segments from the video stream are sent to this address. Required when audio detection is enabled. |
| data | object | Yes | — | Request data content. Maximum size: 1 MB. |
| Parameter | Type | Required | Max Length | Description |
|---|---|---|---|---|
| streamType | string | Yes | — | Video stream type. Possible values: NORMAL — Standard stream URL (supports RTMP, RTMPS, HLS, HTTP, HTTPS protocols), AGORA — Agora moderation, TRTC — Tencent moderation, ZEGO — Zego moderation, VOLC — Volcano Engine moderation, ALI — Alibaba Cloud moderation. Note: When using an RTC SDK recording solution, additional recording fees may be incurred on the RTC side. Consult the relevant RTC provider for specific fees. |
| tokenId | string | Yes | 64 | User account identifier. Pass the user ID to enable risk detection for behaviors such as spam and advertising. |
| url | string | No | 600 | URL of the standard video stream to be detected. |
| anchorName | string | No | — | Anchor name. Typically used for manual review. |
| audioDetectStep | int32 | No | — | Audio moderation step for the video stream. Unit: segments. Value range: 1–36 (integer). A value of 1 skips one 10-second audio segment, a value of 2 skips two segments, and so on. If not used, all audio content is moderated. |
| detectFrequency | int32 | No | — | Frame capture frequency interval for the video stream. Unit: seconds. Value range: 1–60 seconds. Decimals are rounded down; values less than 1 are treated as 1 second. Default: 3 seconds per frame capture. |
| detectStep | int32 | No | — | Detection step for captured frame images. Only one detection per step of captured frames. Value must be ≥ 1. If not used, all captured frames are moderated. |
| deviceId | string | No | 128 | Deepcleer device fingerprint identifier. Generated by the Deepcleer SDK for user behavior analysis. |
| gender | string | No | — | User gender. Suggested values: male, female, ambiguity (gender unknown). |
| imgBusinessDetectStep | int32 | No | — | Image business label detection step. Only one imgBusinessType detection per step. Value must be ≥ 1. Default: 1 (all segments are moderated for business labels). |
| imgCompareBase | string | No | 1024 | Base image for comparison. Present when the businessType request parameter contains the FACECOMPARE label. Supports image URL links in the following formats: jpg, jpeg, png, webp, gif, tiff, tif, heif. Recommended minimum image resolution: 256×256. Animated image formats are not currently supported for base images. |
| ip | string | No | 64 | Client public IP address. Used for IP-based user behavior analysis. |
| lang | string | No | — | Language type. Specifies the language for text content detection in captured frames and audio segments (default: Chinese). Possible values: zh — Chinese, en — English, ar — Arabic, hi — Hindi, es — Spanish, fr — French, ru — Russian, pt — Portuguese, id — Indonesian, de — German, ja — Japanese, tr — Turkish, vi — Vietnamese, it — Italian, th — Thai, tl — Filipino, ko — Korean, ms — Malay, auto — Automatic language detection (requires Deepcleer to enable the corresponding blocking standard). |
| level | int32 | No | — | User level. Different blocking strategies can be configured for different user levels. Possible values: 0 — Lowest-level user (e.g., newly registered, completely inactive, or level-0 users), 1 — Lower-level user (e.g., low-activity or low-level users), 2 — Mid-level user (e.g., moderately active or mid-level users), 3 — Higher-level user (e.g., highly active or high-level users), 4 — Highest-level user (e.g., paying users, VIP users). |
| liveCover | string | No | — | Live cover image. Typically used for manual review. |
| liveTitle | string | No | — | Live title. Typically used for manual review. |
| receiveTokenId | string | No | 64 | Token ID of the message receiver. A string of up to 64 characters consisting of digits, letters, underscores, and hyphens. |
| returnAllImg | int32 | No | — | Risk level of returned frame recognition results. Possible values (default: 0): 0 — Return image moderation results with a risk level other than pass. 1 — Return image moderation results for all risk levels. |
| returnAllText | int32 | No | — | Risk level of returned audio recognition results. Possible values (default: 0): 0 — Return audio segments and text content with a risk level other than pass. 1 — Return audio segments and text content for all risk levels. |
| returnFinishInfo | int32 | No | — | Stream end callback notification. Possible values (default: 0): 1 — Send an end notification when moderation finishes; the callback parameters include a statCode status code. 0 — Do not send an end notification when moderation finishes. |
| returnPreAudio | int32 | No | — | Whether to return previous segment information. Possible values: 1 — The returned preAudioUrl field contains a 20-second audio segment link covering the previous 10 seconds and the current 10 seconds. 0 — Do not return previous segment information. |
| returnPreText | int32 | No | — | Whether to return previous segment text information. Possible values: 1 — The returned content field contains 20 seconds of audio segment text covering the previous 10 seconds and the current 10 seconds. 0 — Do not return previous segment text information. |
| room | string | No | 64 | Live room / game room number. Different strategies can be configured for individual rooms. |
| streamName | string | No | 64 | Video stream name. Used for display in the backend interface; recommended to pass in. |
| acceptLang | string | No | — | Language type for returned labels. Possible values: zh — Chinese, en — English. Default: Chinese labels. |
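As an illustration of how the top-level and `data` parameters fit together, here is a minimal sketch that builds a request for a standard (`NORMAL`) stream and posts it as UTF-8 JSON. The helper names `build_request` and `submit`, the `Content-Type` header, and the decision to require `imgType` (rather than `imgBusinessType`) are assumptions made for this example; the cluster URL is the Singapore one listed in the request URL table.

```python
import json
import urllib.request

# Singapore cluster URL; choose the cluster that supports your product.
API_URL = "http://api-videostream-xjp.fengkongcloud.com/videostream/v4"


def build_request(access_key, app_id, event_id, stream_url, token_id,
                  img_types, img_callback, audio_types=None, audio_callback=None):
    """Assemble the JSON body for a NORMAL (standard URL) stream."""
    if not img_types:
        raise ValueError("this sketch requires at least one imgType value")
    body = {
        "accessKey": access_key,
        "appId": app_id,
        "eventId": event_id,
        "imgType": "_".join(img_types),  # e.g. POLITY_QRCODE_ADVERT
        "imgCallback": img_callback,
        "data": {"streamType": "NORMAL", "url": stream_url, "tokenId": token_id},
    }
    if audio_types:
        if not audio_callback:
            raise ValueError("audioCallback is required when audio detection is enabled")
        body["audioType"] = "_".join(audio_types)
        body["audioCallback"] = audio_callback
    return body


def submit(body, timeout=7.0):
    """POST the body as UTF-8 JSON; 7 seconds is the recommended timeout."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Keeping body construction separate from the network call makes the parameter rules easy to check without contacting the service.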
| Parameter | Type | Required | Max Length | Description |
|---|---|---|---|---|
| passThrough | object | No | 1024 | Customer pass-through field. Deepcleer does not process this field internally; it is returned as-is with the results. |
| Parameter | Type | Required | Max Length | Description |
|---|---|---|---|---|
| appId | string | Yes | 64 | Application identifier provided by Agora. |
| channel | string | Yes | 64 | Channel name provided by Agora. |
| channelProfile | int32 | No | 32 | Agora recording channel mode. Possible values: 0 — Communication (default). Common for 1-on-1 or group chats where any user in the channel can speak freely. 1 — Live broadcast. Two user roles: host and audience. |
| enableH265Support | boolean | No | — | Whether to support H.265 video stream recording. false (default) — Do not support H.265 video stream recording. Remote users in the channel cannot send H.265 video streams. true — Support H.265 video stream recording. |
| enableIntraRequest | boolean | No | — | Whether to enable keyframe requests. Default: true. This improves the audio/video experience under weak network conditions. To enable seeking in recordings made in individual stream mode, set enableIntraRequest to false. false — Disable keyframe requests. All senders in the channel send keyframes every 2 seconds. When disabled, recordings in individual stream mode support seeking. true — The sender controls whether keyframe requests are enabled. When enabled, recordings in individual stream mode do not support seeking. |
| subscribeMode | string | No | — | Subscription mode. AUTO — Automatically subscribe to all streams in the room (default behavior). UNTRUSTED — Used with untrustedUserIdList to subscribe only to streams from users in the list. In this mode, if the list is empty, a parameter error occurs because no streams can be subscribed. TRUSTED — Used with trustedUserIdList to subscribe only to streams from users NOT in the list. In this mode, if no users outside the trustedUserIdList enter the room within a certain time, Deepcleer will proactively end the moderation. |
| token | string | No | 64 | Users with higher security requirements can use a token for authentication. See Agora documentation for generation: https://docs.agora.io/cn/Recording/token_server?platform=CPP. Set the token validity period to exceed the channel duration to prevent stream pull failures due to token expiration. Agora currently supports a maximum token validity of 24 hours. For channels lasting more than 24 hours, handle token expiration as follows: Set returnFinishInfo to 1 in the request parameters. When the callback receives an end notification (statCode is 1) due to an invalid or expired pull token, generate a new token and resubmit the channel for moderation if it still exists and requires continued moderation. |
| uid | int32 | No | 64 | A 32-bit unsigned integer. When a token is present, provide the user ID used to generate the token. Note: This uid must be different from the actual user UIDs in the room — the uid used for server-side recording must not already exist in the room. |
| trustedUserIdList | array | No | — | List of trusted users. Effective when subscribeMode is TRUSTED. Must not be empty. Deepcleer will not subscribe to streams from users in this list. Comma-separated UID array, e.g., [1, 2]. Maximum: 17 users. |
| untrustedUserIdList | array | No | — | List of untrusted users. Effective when subscribeMode is UNTRUSTED. Must not be empty. Deepcleer will only subscribe to streams from users in this list. Comma-separated UID array, e.g., [1, 2]. Maximum: 17 users. |
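The rules linking `subscribeMode` to the two user lists can be checked before a request is sent. The sketch below is one possible pre-flight check; the function name `validate_subscribe` and the idea of returning a list of problem strings are illustrative choices, and the check covers only the constraints stated in the table above.

```python
MAX_LIST_USERS = 17  # documented maximum for either list

# Which list each non-default mode depends on.
LIST_FOR_MODE = {
    "TRUSTED": "trustedUserIdList",
    "UNTRUSTED": "untrustedUserIdList",
}


def validate_subscribe(params: dict) -> list:
    """Return a list of problems found in the Agora subscription settings."""
    mode = params.get("subscribeMode", "AUTO")
    if mode == "AUTO":
        return []  # AUTO subscribes to every stream; no list is needed
    if mode not in LIST_FOR_MODE:
        return [f"unknown subscribeMode: {mode}"]
    key = LIST_FOR_MODE[mode]
    users = params.get(key) or []
    problems = []
    if not users:
        problems.append(f"{key} must not be empty when subscribeMode is {mode}")
    if len(users) > MAX_LIST_USERS:
        problems.append(f"{key} allows at most {MAX_LIST_USERS} users")
    return problems
```

For example, `validate_subscribe({"subscribeMode": "UNTRUSTED", "untrustedUserIdList": []})` reports the empty list that would otherwise come back as a parameter error.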
| Parameter | Type | Required | Max Length | Description |
|---|---|---|---|---|
| sdkAppId | int32 | Yes | 64 | The sdkAppId provided by Tencent. |
| strRoomId | string | Yes | 128 | Room number. Only allows letters (a-zA-Z), digits (0-9), underscores, and hyphens. Note: If both strRoomId and roomId are provided, roomId takes priority. |
| userId | string | Yes | 32 | The userId assigned to the recording end. Maximum length: 32 characters. Only allows letters (a-zA-Z), digits (0-9), underscores, and hyphens. |
| userSig | string | Yes | 128 | Verification signature corresponding to the recording userId, equivalent to a login password. |
| appScene | int32 | Yes | 1 | Application scene. Possible values: 0 — Video call scene, 1 — Video live broadcast scene. Default: 0. See: https://cloud.tencent.com/document/product/647/79634 |
| demoSences | int32 | Yes | — | Recording type. Possible values: 2 — Individual stream recording, 4 — Mixed stream recording. |
| roomId | int32 | No | 10 | Room number. Value range: 1–4294967294. One of roomId or strRoomId must be provided. If both have values, roomId takes priority. Note: Currently, a maximum of 8 users can be moderated per room. |
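The interaction between `roomId` and `strRoomId` can be sketched as a small resolver that applies the documented priority, range, and character rules. The function name `effective_room` and the returned tuple shape are assumptions for illustration.

```python
import re

# Letters, digits, underscores, and hyphens; at most 128 characters.
STR_ROOM_ID = re.compile(r"[A-Za-z0-9_-]{1,128}")


def effective_room(params: dict) -> tuple:
    """Return (field_name, value) for the room identifier that takes effect."""
    room_id = params.get("roomId")
    str_room_id = params.get("strRoomId")
    if room_id is not None:
        # roomId takes priority when both are supplied.
        if not 1 <= room_id <= 4294967294:
            raise ValueError("roomId must be in the range 1-4294967294")
        return ("roomId", room_id)
    if str_room_id:
        if not STR_ROOM_ID.fullmatch(str_room_id):
            raise ValueError("strRoomId contains unsupported characters or is too long")
        return ("strRoomId", str_room_id)
    raise ValueError("one of roomId or strRoomId must be provided")
```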
| Parameter | Type | Required | Max Length | Description |
|---|---|---|---|---|
| roomId | string | Yes | 64 | Zego room number. |
| tokenId | string | Yes | 64 | Authentication information provided by Zego. Used to obtain identify_token for login. See Zego documentation for generation: https://doc-zh.zego.im/article/15258. Note: tokenId is a unique identifier. A new token must be generated for each moderation request. |
| Parameter | Type | Required | Max Length | Description |
|---|---|---|---|---|
| appId | string | Yes | 64 | Application identifier provided by Volcano Engine. |
| roomId | string | Yes | 128 | Room ID. |
| token | string | Yes | 64 | Verification signature corresponding to the recording userId, equivalent to a login password. |
| userId | string | Yes | 32 | The userId assigned to the recording end. Maximum length: 32 characters. Only allows letters (a-zA-Z), digits (0-9), underscores, and hyphens. |
| Parameter | Type | Required | Max Length | Description |
|---|---|---|---|---|
| room | string | Yes | 64 | Room ID. Must be exactly the same as the channelID used to generate the token. The server pulls and records streams on a per-room basis. room is a unique identifier; duplicate rooms will not result in duplicate stream pulls. |
| token | string | Yes | 64 | Token used for the pull-stream end to join the channel. See documentation for generation: https://help.aliyun.com/zh/live/user-guide/token-based-authentication. A new token must be generated for each moderation submission. |
| userId | int32 | No | 32 | Alibaba user account identifier. |
| Parameter | Type | Required | Description |
|---|---|---|---|
| requestId | string | Yes | Deepcleer unique request identifier. |
| message | string | Yes | Corresponds to code: 1100 — Success, 1901 — QPS limit exceeded, 1902 — Invalid parameters, 1903 — Service failure, 1904 — Stream count limit exceeded, 9101 — Unauthorized operation. |
| code | int32 | Yes | 1100 — Success, 1901 — QPS limit exceeded, 1902 — Invalid parameters, 1903 — Service failure, 1904 — Stream count limit exceeded, 9101 — Unauthorized operation. |
| detail | object | No | Detail information. |
| dupRequestId | string | No | requestId of the stream that is already under moderation. Returned when errorcode is 1001 (duplicate stream submission). For example, if the first request received no response but moderation of the stream has already started, you have no requestId with which to close it. In that case, submit the request again and, upon receiving the duplicate stream message, use the returned dupRequestId to call the close moderation API. |
| errorcode | int32 | No | 1001 — Duplicate stream submission. |
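One way to act on the submission response is to classify it into a small set of outcomes. The sketch below is an assumption-laden illustration: the outcome names are invented, treating 1901 and 1903 as retryable is a design choice rather than documented guidance, and because the table does not make clear whether `errorcode` and `dupRequestId` sit at the top level or inside `detail`, the code checks both places.

```python
def interpret_submit_response(resp: dict) -> tuple:
    """Classify a submission response as (outcome, request_id_or_None)."""
    detail = resp.get("detail") or {}
    # Assumption: these fields may be top-level or nested under detail.
    errorcode = resp.get("errorcode", detail.get("errorcode"))
    if errorcode == 1001:
        dup_id = resp.get("dupRequestId", detail.get("dupRequestId"))
        return ("duplicate", dup_id)  # use this ID with the close moderation API
    code = resp.get("code")
    if code == 1100:
        return ("accepted", resp.get("requestId"))
    if code in (1901, 1903):
        return ("retry_later", None)  # QPS limit or service failure
    return ("rejected", None)         # 1902, 1904, 9101, or anything unknown
```

Checking for the duplicate case first means a resubmitted stream is recognized regardless of which `code` accompanies it.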
The following parameters (except code, message, and requestId) are required only when code returns 1100.
| Parameter | Type | Required | Description |
|---|---|---|---|
| requestId | string | Yes | Deepcleer unique request identifier. |
| message | string | Yes | Corresponds to code: 1100 — Success, 1901 — QPS limit exceeded, 1902 — Invalid parameters, 1903 — Service failure, 1904 — Stream count limit exceeded, 9101 — Unauthorized operation. |
| code | int32 | Yes | 1100 — Success, 1901 — QPS limit exceeded, 1902 — Invalid parameters, 1903 — Service failure, 1904 — Stream count limit exceeded, 9101 — Unauthorized operation. |
| statCode | int32 | No | Callback status code. Values: 0 — Moderation result callback, 1 — Stream end result callback. |
| contentType | int32 | No | Distinguishes between audio and image callbacks. Possible values: 1 — Image callback, 2 — Audio callback. |
| auxInfo | object | No | Auxiliary information. Contains the passThrough field from data.extra in the request parameters. |
| passThrough | object | No | Customer pass-through field. Deepcleer does not process this field internally; it is returned as-is with the results. |
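A receiver typically needs to route each callback by `code`, `statCode`, and `contentType`. The following sketch shows one way to do that; the function name `classify_callback` and the returned category strings are illustrative, not part of the API.

```python
CONTENT_KINDS = {1: "image", 2: "audio"}


def classify_callback(payload: dict) -> str:
    """Return 'error', 'stream_end', 'image', 'audio', or 'unknown'."""
    if payload.get("code") != 1100:
        # Other fields are only guaranteed when code is 1100.
        return "error"
    if payload.get("statCode") == 1:
        return "stream_end"
    return CONTENT_KINDS.get(payload.get("contentType"), "unknown")
```

For instance, `classify_callback({"code": 1100, "statCode": 0, "contentType": 2})` yields `"audio"`.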
Returned when there are risky segments or when returnAllImg=1.
| Parameter | Type | Required | Description |
|---|---|---|---|
| imgUrl | string | Yes | Captured frame image URL. |
| riskLevel | string | Yes | Recognition result. Possible values: PASS — Normal, recommended to allow, REVIEW — Suspicious, recommended for manual review, REJECT — Violation, recommended to block. |
| riskLabel1 | string | Yes | Primary risk label. Returns normal when riskLevel is PASS. |
| riskLabel2 | string | Yes | Secondary risk label. Empty when riskLevel is PASS. |
| riskLabel3 | string | Yes | Tertiary risk label. Empty when riskLevel is PASS. |
| riskDescription | string | Yes | Label description. Returns "Hit custom list" when a user-defined list is matched; returns "Normal" when riskLevel is PASS; otherwise displays as "Primary Label: Secondary Label: Tertiary Label". For reference only — do not use this value for programmatic logic. |
| allLabels | array | Yes | Complete list of risk labels. |
| riskDetail | object | No | Risk detail information. |
| auxInfo | object | Yes | Auxiliary information. |
| businessLabels | array | No | Business label list. |
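Since `riskDescription` is for reference only, programmatic decisions are better based on `riskLevel` and the three label fields. The sketch below maps a frame result to an action; the action names and the fallback of sending unrecognized levels to manual review are assumptions for this example.

```python
ACTIONS = {"PASS": "allow", "REVIEW": "manual_review", "REJECT": "block"}


def frame_action(frame_detail: dict) -> tuple:
    """Return (action, non_empty_labels) for one frame result."""
    level = frame_detail.get("riskLevel")
    # Assumption: unrecognized levels fall back to manual review.
    action = ACTIONS.get(level, "manual_review")
    labels = [frame_detail.get(key) for key in ("riskLabel1", "riskLabel2", "riskLabel3")]
    return action, [label for label in labels if label]
```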
| Parameter | Type | Required | Description |
|---|---|---|---|
| riskLevel | string | No | Recognition result. Possible values: PASS, REVIEW, REJECT. |
| riskLabel1 | string | No | Primary risk label. |
| riskLabel2 | string | No | Secondary risk label. |
| riskLabel3 | string | No | Tertiary risk label. |
| riskDescription | string | No | Returns "Normal" when riskLevel is PASS; otherwise "Primary Label: Secondary Label: Tertiary Label". For reference only. |
| probability | float | No | Confidence score. Value range: 0–1. Higher values indicate greater confidence. |
| Parameter | Type | Required | Description |
|---|---|---|---|
| riskSource | int32 | Yes | Risk source. Possible values: 1000 — No risk, 1001 — Text risk, 1002 — Visual image risk. |
| face_num | int32 | No | Number of faces detected. |
| person_num | int32 | No | Number of persons detected. |
| faces | array | No | Name and position information of politically sensitive persons in the image. When the face-type-multiface label is matched, the array contains multiple elements (up to 10; if more than 10, the top 10 by probability are selected). |
| objects | array | No | Object information. Returns names and positions of identified objects or symbols in the image. |
| ocrText | object | No | OCR text content recognized in the image. Present when the imgType request parameter includes IMGTEXTRISK or ADVERT. |
| matchedLists | array | No | Matched custom list information. Returned only when a customer-defined list is matched. |
| riskSegments | array | No | High-risk content segments. Present when the detected image contains political, terrorism, prohibited, competitor, or advertising law violations. |
| persons | array | No | Name and position information of persons in the image. When the "person-multiple persons" label is matched, the array contains multiple elements (up to 10; if more than 10, the top 10 by probability are selected). |
| Parameter | Type | Required | Description |
|---|---|---|---|
| id | string | No | Identifier. The same person at the same position has the same ID across different labels. If the same person appears n times in the image, n IDs are assigned. |
| name | string | No | Person name. |
| face_ratio | float | No | Face ratio. Range: 0–1. Higher values indicate a larger face area. |
| probability | float | No | Confidence score. Range: 0–1. Higher values indicate greater confidence. |
| location | array | No | Person position information. The array has four values representing the top-left and bottom-right coordinates. Example: [207, 522, 340, 567] — 207: top-left x, 522: top-left y, 340: bottom-right x, 567: bottom-right y. |
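The four-value `location` array converts directly into a box width and height. A minimal sketch, using the example coordinates from the table above (the function name `box_size` is illustrative):

```python
def box_size(location: list) -> tuple:
    """Return (width, height) from [top-left x, top-left y, bottom-right x, bottom-right y]."""
    x1, y1, x2, y2 = location
    return x2 - x1, y2 - y1


print(box_size([207, 522, 340, 567]))  # (133, 45)
```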
| Parameter | Type | Required | Description |
|---|---|---|---|
| id | string | No | Object or symbol identifier. The same object at the same position has the same ID across different labels. |
| name | string | No | Object name. |
| probability | float | No | Confidence score. Range: 0–1. Higher values indicate greater confidence. |
| qrContent | string | No | QR code URL recognized in the image. |
| location | array | No | Object position information. The array has four values representing the top-left and bottom-right coordinates. Example: [207, 522, 340, 567]. |
| Parameter | Type | Required | Description |
|---|---|---|---|
| text | string | Yes | Text recognized in the image. |
| Parameter | Type | Required | Description |
|---|---|---|---|
| name | string | No | Name of the matched list. |
| words | array | No | Sensitive word information from the matched list. |
| Parameter | Type | Required | Description |
|---|---|---|---|
| word | string | No | The matched sensitive word. |
| position | array | No | Position of the sensitive word. |
| Parameter | Type | Required | Description |
|---|---|---|---|
| segment | string | No | High-risk content segment. |
| position | array | No | Position of the high-risk content segment. Index starts from 0. |
| Parameter | Type | Required | Description |
|---|---|---|---|
| id | string | No | Identifier. The same person has the same ID across different labels. If the same person appears n times in the image, n IDs are assigned. |
| person_ratio | float | No | Person ratio. Range: 0–1. Higher values indicate a larger person area. |
| probability | float | No | Confidence score. Range: 0–1. Higher values indicate greater confidence. |
| location | array | No | Person position coordinates. |
| Parameter | Type | Required | Description |
|---|---|---|---|
| beginProcessTime | int32 | Yes | Processing start time (13-digit timestamp). |
| finishProcessTime | int32 | Yes | Processing end time (13-digit timestamp). |
| detectType | int32 | No | Indicates whether the captured frame was actually detected. Possible values (only returned when the detectStep request parameter is set): 1 — The captured frame was detected, 2 — The captured frame was not detected. |
| imgTime | string | No | Time when the captured frame was taken (absolute time of the violation in the video stream). |
| room | string | No | Room number. |
| similarityDedup | int32 | No | Auxiliary parameter. Possible values (only returned when the similar frame deduplication feature changes the outer riskLevel from reject/review to pass; not returned in other cases): 1 — Similar frame deduplication feature is active. |
| strUserId | string | No | User identifier for distinguishing violating users within a room. Unrelated to the userId request parameter; this is the individual stream user ID. Returned in the following cases: ZEGO stream moderation by room number, TRTC individual stream moderation, VOLC stream moderation, ALI stream moderation. |
| userId | int32 | No | Agora user account identifier. Only present in individual stream scenarios. The returned userId is the actual user ID in the room, unrelated to the uid request parameter. |
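The two processing timestamps give the per-frame processing latency, and `detectType` indicates whether a frame was actually analyzed. The sketch below assumes the 13-digit timestamps are Unix time in milliseconds and treats a missing `detectType` as "detected", since the field is only returned when `detectStep` is set; the function name and result keys are illustrative.

```python
from datetime import datetime, timezone


def processing_info(aux_info: dict) -> dict:
    """Summarize timing and detection status from a frame's auxInfo."""
    begin = aux_info["beginProcessTime"]    # assumed Unix milliseconds
    finish = aux_info["finishProcessTime"]  # assumed Unix milliseconds
    return {
        "started_at": datetime.fromtimestamp(begin / 1000, tz=timezone.utc),
        "latency_ms": finish - begin,
        # detectType 1 = detected, 2 = skipped; absent when detectStep is unset.
        "was_detected": aux_info.get("detectType", 1) == 1,
    }
```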
| Parameter | Type | Required | Description |
|---|---|---|---|
| businessLabel1 | string | Yes | Primary business label. |
| businessLabel2 | string | Yes | Secondary business label. |
| businessLabel3 | string | Yes | Tertiary business label. |
| businessDescription | string | Yes | Business label description in the format "Primary Label: Secondary Label: Tertiary Label". |
| probability | float | Yes | Confidence score. Range: 0–1. Higher values indicate greater confidence. |
| confidenceLevel | int32 | No | Confidence level. Range: 0–2. Higher values indicate greater confidence. |
| businessDetail | object | No | Business label details. |
| Parameter | Type | Required | Description |
|---|---|---|---|
| face_num | int32 | No | Number of faces detected. |
| person_num | int32 | No | Number of persons detected. |
| faces | array | No | Name and position information of politically sensitive persons in the image. Structure is the same as frameDetail.riskDetail.faces. |
| objects | array | No | Object information. Structure is the same as frameDetail.riskDetail.objects. |
| persons | array | No | Person information. Structure is the same as frameDetail.riskDetail.persons. |
Returned when there are risky segments or when returnAllText=1.
| Parameter | Type | Required | Description |
|---|---|---|---|
| audioUrl | string | Yes | Audio segment URL. |
| riskLevel | string | Yes | Recognition result. Possible values: PASS — Normal, recommended to allow, REVIEW — Suspicious, recommended for manual review, REJECT — Violation, recommended to block. |
| riskLabel1 | string | Yes | Primary risk label. Returns normal when riskLevel is PASS. |
| riskLabel2 | string | Yes | Secondary risk label. Empty when riskLevel is PASS. |
| riskLabel3 | string | Yes | Tertiary risk label. Empty when riskLevel is PASS. |
| riskDescription | string | Yes | Label description. Returns "Hit custom list" when a user-defined list is matched; returns "Normal" when riskLevel is PASS; otherwise displays as "Primary Label: Secondary Label: Tertiary Label". For reference only — do not use this value for programmatic logic. |
| vadCode | int32 | Yes | Whether the segment is silent. 0 — Silent segment, 1 — Non-silent segment. |
| allLabels | array | Yes | Complete list of risk labels. |
| riskDetail | object | No | Risk detail information for each segment. |
| content | string | No | Text content recognized from the audio in the video stream. When returnPreText is 1 and the current audio segment is reject, returns 20 seconds of audio segment text (previous 10 seconds + current 10 seconds). Otherwise, returns only the current segment text. |
| preAudioUrl | string | No | Previous audio segment link. When returnPreAudio is 1 and the current audio segment is reject, returns a 20-second audio segment link (previous 10 seconds + current 10 seconds). Otherwise, not returned. |
| auxInfo | object | No | Auxiliary information. |
| businessLabels | array | No | Business label list. |
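When collecting evidence for manual review, the 20-second `preAudioUrl` is more useful than the 10-second `audioUrl` whenever it is present. The sketch below picks the best available link and flags silent segments; the function name `audio_evidence` and the result keys are assumptions for illustration.

```python
def audio_evidence(audio_detail: dict) -> dict:
    """Pick the most informative audio link and text for one audio result."""
    return {
        # preAudioUrl is only returned for rejected segments when returnPreAudio is 1.
        "url": audio_detail.get("preAudioUrl") or audio_detail["audioUrl"],
        "text": audio_detail.get("content", ""),
        "silent": audio_detail.get("vadCode") == 0,  # 0 = silent, 1 = non-silent
    }
```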
| Parameter | Type | Required | Description |
|---|---|---|---|
| riskLevel | string | Yes | Recognition result. Possible values: PASS, REVIEW, REJECT. |
| riskLabel1 | string | Yes | Primary risk label. Returns normal when riskLevel is PASS. |
| riskLabel2 | string | Yes | Secondary risk label. Empty when riskLevel is PASS. |
| riskLabel3 | string | Yes | Tertiary risk label. Empty when riskLevel is PASS. |
| riskDescription | string | Yes | Risk description. Returns "Normal" when riskLevel is PASS. Format: "Primary Risk Label: Secondary Risk Label: Tertiary Risk Label". Returns "Hit custom list" when a user-defined list is matched. |
| Parameter | Type | Required | Description |
|---|---|---|---|
| riskSource | int32 | Yes | Risk source. Possible values: 1000 — No risk, 1001 — Text risk, 1002 — Visual image risk, 1003 — Audio voice risk. |
| audioText | string | No | Text content recognized in the segment. |
| matchedLists | array | No | Matched custom list information. Returned only when a customer-defined list is matched. |
| riskSegments | array | No | High-risk content segments. Present when the detection contains political, terrorism, prohibited, competitor, or advertising law violations. |
| Parameter | Type | Required | Description |
|---|---|---|---|
| name | string | No | Name of the matched list. |
| words | array | No | Sensitive word information from the matched list. |
| Parameter | Type | Required | Description |
|---|---|---|---|
| word | string | No | The matched sensitive word. |
| position | array | No | Position of the sensitive word. Index starts from 0. |
| Parameter | Type | Required | Description |
|---|---|---|---|
| segment | string | No | High-risk content segment. |
| position | array | No | Position of the high-risk content segment. Index starts from 0. |
| Parameter | Type | Required | Description |
|---|---|---|---|
| audioStartTime | string | Yes | Violation content start time (absolute time). |
| audioEndTime | string | Yes | Violation content end time (absolute time). |
| beginProcessTime | int32 | Yes | Processing start time (13-digit timestamp). |
| finishProcessTime | int32 | Yes | Processing end time (13-digit timestamp). |
| room | string | No | Room number. |
| strUserId | string | No | User identifier for distinguishing violating users within a room. Unrelated to the userId request parameter; this is the individual stream user ID. Returned in the following cases: ZEGO stream moderation by room number, TRTC individual stream moderation, VOLC stream moderation, ALI stream moderation. |
| userId | int32 | No | Agora user account identifier. Only present in individual stream scenarios. The returned userId is the actual user ID in the room, unrelated to the uid request parameter. |
| Parameter | Type | Required | Description |
|---|---|---|---|
| businessLabel1 | string | Yes | Primary business label. |
| businessLabel2 | string | Yes | Secondary business label. |
| businessLabel3 | string | Yes | Tertiary business label. |
| businessDescription | string | Yes | Business label description in the format "Primary Label: Secondary Label: Tertiary Label". |
| probability | float | Yes | Confidence score. Range: 0–1. Higher values indicate greater confidence. |
| confidenceLevel | int32 | No | Confidence level. Range: 0–2. Higher values indicate greater confidence. |
| riskDetail | object | No | Risk detail information. |
| Parameter | Type | Required | Description |
|---|
| riskSource | int32 | Yes | Risk source. Possible values: 1000 — No risk, 1001 — Text risk, 1003 — Audio voice risk. |
| audioText | string | No | Text content recognized in the segment. |
| matchedLists | array | No | Matched custom list information. Returned only when a customer-defined list is matched. Structure is the same as audioDetail.riskDetail.matchedLists. |
| riskSegments | array | No | High-risk content segments. Structure is the same as audioDetail.riskDetail.riskSegments. |
Returned only when tokenId is provided and the label service is enabled.
| Parameter | Type | Required | Description |
|---|
| label1 | string | No | Primary label. |
| label2 | string | No | Secondary label. |
| label3 | string | No | Tertiary label. |
| description | string | No | Label description. |
| timestamp | int32 | No | Label timestamp. 13-digit Unix timestamp in milliseconds. |
Returned only when tokenId is provided and the label service is enabled.
| Parameter | Type | Required | Description |
|---|
| label1 | string | No | Primary label. |
| label2 | string | No | Secondary label. |
| label3 | string | No | Tertiary label. |
| description | string | No | Label description. |
| timestamp | int32 | No | Label timestamp. 13-digit Unix timestamp in milliseconds. |
Returned only when returnFinishInfo is set to 1.
| Parameter | Type | Required | Description |
|---|
| requestId | string | Yes | Deepcleer unique request identifier. |
| message | string | Yes | Description of the return code, for example "Success" for 1100. See code for the full list of values. |
| code | int32 | Yes | 1100 — Success, 1901 — QPS limit exceeded, 1902 — Invalid parameters, 1903 — Service failure, 1904 — Stream count limit exceeded, 9101 — Unauthorized operation. |
| riskLevel | string | Yes | Overall stream disposition recommendation returned when the callback ends. Possible values: PASS — Normal, recommended to allow, REVIEW — Suspicious, recommended for manual review, REJECT — Violation, recommended to block. |
| statCode | int32 | Yes | Callback status code. Present when returnFinishInfo is 1. Values: 0 — Moderation result callback, 1 — Stream end result callback. |
| contentType | int32 | Yes | Distinguishes between audio and image callback end. Possible values: 1 — Image moderation end callback, 2 — Audio moderation end callback. |
| pullStreamSuccess | bool | Yes | Whether the stream pull was successful. Possible values: true — Stream pull succeeded, false — Stream pull failed. The stream pull is considered failed if not even a single captured frame was obtained successfully. |
| auxInfo | object | Yes | Auxiliary information. |
| Parameter | Type | Required | Description |
|---|
| streamTime | int32 | Yes | Stream moderation duration, returned in the last callback after the stream ends. When interval moderation logic is applied, this may differ from the actual stream duration. |
| requestParams | object | No | Returns all fields from the data request parameter. Returned when contentType is 2. |
| detail | object | No | Detail information. Returned when contentType is 1. |
| Parameter | Type | Required | Description |
|---|
| requestParams | object | Yes | Returns all fields from the data request parameter. |
{
"accessKey": "*********",
"appId": "defaulttest",
"audioBusinessType": "SING_LANGUAGE",
"audioCallback": "http://www.xxx.top/callbackxxx",
"audioType": "POLITY_EROTIC_ADVERT_MOAN",
"data": {
"detectFrequency": 10,
"detectStep": 1,
"extra": {
"passThrough": {
"passThrough1": "111",
"passThrough2": "222",
"passThrough3": "333"
}
},
"ip": "123.171.34.4",
"lang": "zh",
"returnAllImg": 1,
"returnAllText": 1,
"returnPreAudio": 1,
"returnPreText": 1,
"room": "5e1854a6a0a79d0001a09bc3",
"streamType": "NORMAL",
"tokenId": "123",
"url": "http://rtmp.xxxx.cn/live/3637778raLSXdOdu.flv"
},
"eventId": "VIDEOSTREAM",
"imgBusinessType": "BODY_FOOD_3CPRODUCTSLOGO",
"imgCallback": "http://www.xxx.top/callbackxxx",
"imgType": "POLITY_EROTIC_ADVERT"
}
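A request like the one above could be submitted with a few lines of Python. This is a sketch under stated assumptions rather than an official client: the function names are hypothetical, the `Content-Type` header is an assumption (the specification only states that parameters are JSON encoded as UTF-8), and the URL is the Shanghai cluster address from the cluster table. The 7-second timeout follows the recommended timeout given earlier.

```python
import json
import urllib.request

# Shanghai cluster URL from the cluster table; swap in another cluster as needed.
SUBMIT_URL = "http://api-videostream-sh.fengkongcloud.com/videostream/v4"
RECOMMENDED_TIMEOUT_SECONDS = 7  # recommended client timeout from this document


def build_submit_request(body: dict, url: str = SUBMIT_URL) -> urllib.request.Request:
    """Build a POST request carrying the body as UTF-8 encoded JSON."""
    payload = json.dumps(body, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        # Assumed header; the document only specifies JSON and UTF-8.
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


def submit(body: dict) -> dict:
    """Send the request and return the decoded JSON response."""
    req = build_submit_request(body)
    with urllib.request.urlopen(req, timeout=RECOMMENDED_TIMEOUT_SECONDS) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Trimmed version of the sample request; all values are placeholders.
body = {
    "accessKey": "*********",
    "appId": "defaulttest",
    "eventId": "VIDEOSTREAM",
    "imgType": "POLITY_EROTIC_ADVERT",
    "imgCallback": "http://www.xxx.top/callbackxxx",
    "data": {
        "streamType": "NORMAL",
        "tokenId": "123",
        "url": "http://rtmp.xxxx.cn/live/3637778raLSXdOdu.flv",
    },
}

req = build_submit_request(body)  # built only; call submit(body) to send it
```

Building the request separately from sending it keeps the payload easy to inspect or log before any network call is made.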
{
"code": 1100,
"message": "Success",
"requestId": "66fb85e3149bb9e13d6c72161cc6c6cf"
}
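The response above can be checked before relying on callbacks. The sketch below is illustrative only: the function name is hypothetical, and it assumes the submit response uses the same `code` values that this document lists for the callback `code` field.

```python
# Code values as listed for the code field in this document.
CODE_MEANINGS = {
    1100: "Success",
    1901: "QPS limit exceeded",
    1902: "Invalid parameters",
    1903: "Service failure",
    1904: "Stream count limit exceeded",
    9101: "Unauthorized operation",
}


def check_submit_response(resp: dict) -> str:
    """Return requestId when code is 1100, otherwise raise RuntimeError."""
    code = resp.get("code")
    if code == 1100:
        return resp["requestId"]
    meaning = CODE_MEANINGS.get(code, "unknown code")
    raise RuntimeError(f"submit failed: {code} ({meaning})")


# Sample response copied from this document.
sample = {
    "code": 1100,
    "message": "Success",
    "requestId": "66fb85e3149bb9e13d6c72161cc6c6cf",
}
print(check_submit_response(sample))  # 66fb85e3149bb9e13d6c72161cc6c6cf
```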
{
"auxInfo": {
"passThrough": {
"passThrough1": "111",
"passThrough2": "222",
"passThrough3": "333"
}
},
"code": 1100,
"contentType": 1,
"frameDetail": {
"allLabels": [
{
"riskDescription": "Involvement in politics: Involvement in politics: Involvement in politics",
"riskLabel1": "politics",
"riskLabel2": "shezheng",
"riskLabel3": "shezheng",
"riskLevel": "REJECT"
}
],
"auxInfo": {
"beginProcessTime": 1639825248361,
"detectType": 1,
"finishProcessTime": 1639825248809,
"imgTime": "2021-12-18 19:00:48.375",
"room": "5e1854a6a0a79d0001a09bc3"
},
"businessLabels": [],
"imgUrl": "http://bj.cos.ap-beijing.xxx.com/image/1639825145166_vs130_1639825248361471656.jpg",
"riskDescription": "Involvement in politics: Involvement in politics: Involvement in politics",
"riskDetail": {
"ocrText": {
"text": "Page 4 (ban) Page 5 (violence)"
},
"riskSource": 1002
},
"riskLabel1": "politics",
"riskLabel2": "shezheng",
"riskLabel3": "shezheng",
"riskLevel": "REJECT"
},
"message": "Success",
"requestId": "1639825145166_vs130_1639825248361471656"
}
{
"audioDetail": {
"allLabels": [
{
"riskDescription": "Political involvement: Leader One: Leader One",
"riskLabel1": "politics",
"riskLabel2": "yihaolingdao",
"riskLabel3": "yihaolingdao",
"riskLevel": "REJECT"
}
],
"audioText": "Emphasized in important instructions that vocational education has broad prospects in the new journey of comprehensively building a modern socialist country. General Secretary of the CPC Central Committee and President of the State.",
"audioUrl": "http://bj-voice-mp3-1251671073.cos.ap-beijing.myqcloud.com/POST_VIDEOSTREAM%2FPOST_VIDEOSTREAM_AUDIO%2FMP3%2F20221027%2Fy28f8a4f1264085b321f12223wqed1121retestpvvvvv44321we12_3.mp3?q-sign-algorithm=sha1&q-ak=AKIDg9LHyOYSAcmfHekZ6NN6XidHflbASUHn&q-sign-time=1666876123%3B1669468123&q-key-time=1666876123%3B1669468123&q-header-list=host&q-url-param-list=&q-signature=f32da45be186fd4a8ed063e499d3f4e0f4f5fc19",
"auxInfo": {
"audioEndTime": "2022-10-27 21:08:42",
"audioStartTime": "2022-10-27 21:08:32",
"beginProcessTime": 1666876123332,
"finishProcessTime": 1666876123893,
"room": "y1123413312ewe24sv2"
},
"businessLabels": [],
"content": "",
"preAudioUrl": "http://bj-voice-mp3-1251671073.cos.ap-beijing.myqcloud.com/POST_VIDEOSTREAM%2FPOST_VIDEOSTREAM_AUDIO%2FMP3%2F20221027%2Fy28f8a4f1264085b321f12223wqed1121retestpvvvvv44321we12_3_pre.mp3?q-sign-algorithm=sha1&q-ak=AKIDg9LHyOYSAcmfHekZ6NN6XidHflbASUHn&q-sign-time=1666876123%3B1669468123&q-key-time=1666876123%3B1669468123&q-header-list=host&q-url-param-list=&q-signature=449fdcab8a3c11d5132f43f78c61e6663f5c08d6",
"riskDescription": "Political involvement: Leader One: Leader One",
"riskDetail": {
"audioText": "Stressed in important instructions that vocational education has broad prospects in the new journey of comprehensively building a modern socialist country. General Secretary of the CPC Central Committee and President of the State.",
"riskSource": 1001
},
"riskLabel1": "politics",
"riskLabel2": "yihaolingdao",
"riskLabel3": "yihaolingdao",
"riskLevel": "REJECT"
},
"code": 1100,
"contentType": 2,
"message": "Success",
"requestId": "y28f8a4f1264085b321f12223wqed1121retestpvvvvv44321we12_3",
"statCode": 0
}
{
"auxInfo": {
"streamTime": 70
},
"code": 1100,
"contentType": 1,
"detail": {
"requestParams": {
"detectFrequency": 10,
"detectStep": 1,
"extra": {
"passThrough": {
"passThrough1": "111",
"passThrough2": "222",
"passThrough3": "333"
}
},
"ip": "123.171.34.4",
"lang": "zh",
"returnAllImg": 1,
"returnAllText": 1,
"returnPreAudio": 1,
"returnPreText": 1,
"room": "5e1854a6a0a79d0001a09bc3",
"streamType": "NORMAL",
"tokenId": "123",
"url": "http://rtmp.xxxx.cn/live/3637778raLSXdOdu.flv"
}
},
"message": "Success",
"pullStreamSuccess": true,
"requestId": "5515ce1f9b474a6c4a3d79a8dfcaeaf4",
"riskLevel": "PASS",
"statCode": 1
}
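A callback receiver has to tell moderation results apart from the end-of-stream callback shown above. The following Python sketch shows one way to do this using only `statCode` and `contentType`; the function names and the returned strings are hypothetical, and a missing `statCode` is treated as a moderation result because that field is only present when returnFinishInfo is 1.

```python
def classify_callback(payload: dict) -> str:
    """Label a callback as image/audio and as a result or a stream end."""
    # contentType: 1 is image, 2 is audio.
    content = {1: "image", 2: "audio"}.get(payload.get("contentType"), "unknown")
    # statCode: 1 marks the stream end callback; 0 or absent is a result.
    if payload.get("statCode") == 1:
        return f"{content}_stream_end"
    return f"{content}_result"


def summarize_stream_end(payload: dict) -> dict:
    """Collect the overall verdict fields from a stream end callback."""
    return {
        "requestId": payload.get("requestId"),
        "riskLevel": payload.get("riskLevel"),
        "pullStreamSuccess": payload.get("pullStreamSuccess"),
        "streamTime": payload.get("auxInfo", {}).get("streamTime"),
    }


# Trimmed version of the sample end-of-stream callback in this document.
end_payload = {
    "auxInfo": {"streamTime": 70},
    "code": 1100,
    "contentType": 1,
    "message": "Success",
    "pullStreamSuccess": True,
    "requestId": "5515ce1f9b474a6c4a3d79a8dfcaeaf4",
    "riskLevel": "PASS",
    "statCode": 1,
}

print(classify_callback(end_payload))  # image_stream_end
print(summarize_stream_end(end_payload))
```

Whatever the receiver does with the payload, it should reply with HTTP status 200 once the callback is accepted, since any other status triggers the push retries described earlier.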