Request API

API Description

Video stream moderation API. Submit a live or pre-recorded video stream URL (or an RTC channel), and DeepCleer pulls the stream from the source, captures frames, extracts audio segments, and delivers per-frame and per-audio-segment results to the callback URLs you supply. It detects regulatory risks in both the visual track (political, pornography, violence & terrorism, QR codes, advertisements, image-text violations) and the audio track (political, pornography, advertising, prohibited, profanity, moaning, top-leader voiceprint, national anthem, prohibited songs), and can additionally identify business-scenario attributes such as gender, voice timbre, language, age, and minor presence.

Requirements

Item | Specification
Protocol | HTTP or HTTPS
Method | POST
Encoding | UTF-8
Format | All request and response parameters use JSON

Stream Requirements

Item | Specification
Standard stream URLs | RTMP, RTMPS, HLS, HTTP, HTTPS protocols; FLV, M3U8, and other common container formats
RTC providers | Agora, Tencent (TRTC), Zego, Volcano Engine, Alibaba Cloud, each with provider-specific parameters (see Stream Type)

Timeout Suggestion

  • Recommended timeout: 7 seconds for the submission call
  • Internal processing timeout: 3 seconds with one automatic retry; normal API response time is within 100 ms

The submission call only registers the stream for moderation — it returns almost immediately. Frame and audio results are delivered separately via the callback URLs you supply. Keep your callback handlers fast (< 2 seconds) so DeepCleer doesn't trigger unnecessary retries.

Callback Mechanism

Results are delivered to the imgCallback and audioCallback URLs you supply in the request. When DeepCleer calls your endpoint:

  • The request body is a JSON payload matching Stream Segment Callback Parameters.
  • Your endpoint must respond with HTTP 200 OK. Any non-200 response is treated as a delivery failure.
  • On failure, DeepCleer retries with the following intervals (in seconds): [5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60]. After 12 failed retries, the segment is dropped.
  • Your endpoint should be idempotent on requestId + imgUrl / audioUrl — the same segment may be delivered more than once if an earlier delivery succeeded but the response was lost in transit.
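The idempotency requirement above can be sketched as follows. This is a minimal illustration, not DeepCleer's reference handler: the names dedup_key and CallbackDeduper are hypothetical, the in-memory set stands in for whatever shared store your service uses, and the sketch assumes imgUrl and audioUrl are read from frameDetail and audioDetail as described under Stream Segment Callback Parameters.

```python
from typing import Optional, Set

# Retry intervals quoted in the list above, in seconds.
RETRY_INTERVALS = [5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60]


def dedup_key(payload: dict) -> Optional[str]:
    """Build an idempotency key from requestId plus imgUrl or audioUrl."""
    request_id = payload.get("requestId")
    frame = payload.get("frameDetail") or {}
    audio = payload.get("audioDetail") or {}
    segment_url = frame.get("imgUrl") or audio.get("audioUrl")
    if not request_id or not segment_url:
        return None
    return f"{request_id}:{segment_url}"


class CallbackDeduper:
    """In-memory stand-in for a shared store such as a database or cache."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()

    def should_process(self, payload: dict) -> bool:
        key = dedup_key(payload)
        if key is None:
            return True  # nothing to deduplicate on; the caller decides
        if key in self._seen:
            return False  # already handled; still answer HTTP 200
        self._seen.add(key)
        return True
```

A handler would call should_process before doing any work and return HTTP 200 in both cases, so a redelivered segment is acknowledged without being processed twice.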

Stream Pull Retry Mechanism

To reduce failures caused by transient network issues, DeepCleer retries failed stream pulls as follows:

  • Standard streams and Zego / Tencent / Volcano streams: up to 12 retries. Each attempt lasts 5 minutes; intervals between attempts follow [5, 10, 15, 20, …, 60] seconds. For example, DeepCleer first attempts continuous stream pulling for 5 minutes; if unsuccessful, it waits 5 seconds and pulls again for another 5 minutes; if still unsuccessful, it waits 10 seconds and pulls again, and so on.
  • Agora streams: no retries. The connection is closed after a 5-minute stream pull timeout.
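The retry schedule above implies an upper bound on how long DeepCleer keeps trying before giving up. The sketch below works that bound out under one stated assumption: the first 5-minute attempt is counted separately from the 12 retries, which is how the worked example in the text reads. The function name is hypothetical, and ALI is left out because the text does not describe its retry behavior.

```python
ATTEMPT_SECONDS = 5 * 60                 # each pull attempt lasts 5 minutes
WAIT_SECONDS = list(range(5, 65, 5))     # [5, 10, ..., 60], one wait per retry
RETRYING_TYPES = {"NORMAL", "ZEGO", "TRTC", "VOLC"}


def worst_case_pull_seconds(stream_type: str) -> int:
    """Upper bound on time spent pulling a stream that never becomes available."""
    if stream_type == "AGORA":
        return ATTEMPT_SECONDS           # single attempt, no retries
    if stream_type in RETRYING_TYPES:
        attempts = 1 + len(WAIT_SECONDS)  # initial attempt plus 12 retries
        return attempts * ATTEMPT_SECONDS + sum(WAIT_SECONDS)
    raise ValueError(f"retry behavior not documented for {stream_type!r}")
```

Under that assumption a standard stream is abandoned after 13 attempts of 300 seconds plus 390 seconds of waiting, which is 4290 seconds, or about 71.5 minutes.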

Request

Request URL

Cluster | Request URL
Singapore Video Stream Cluster | http://api-videostream-xjp.fengkongcloud.com/videostream/v4

Request Parameters

Parameter | Type | Required | Max Length | Description
accessKey | string | Yes | 20 | API authentication key. The default accessKey is sent in your onboarding email.
appId | string | Yes | 64 | Application identifier, such as web for your web application or app for your mobile app. The default appId is sent in your onboarding email. Contact DeepCleer if you need a new appId.
eventId | string | Yes | 64 | Event identifier used to distinguish moderation scenarios in your application, such as liveStream for live broadcasts or rtcRoom for RTC channels. The default eventId is sent in your onboarding email. Contact DeepCleer if you need a new eventId.
imgType | string | Conditional | 1024 | Image (frame) risk detection types. At least one of imgType or imgBusinessType must be provided. Multiple values can be combined with underscores, for example POLITY_QRCODE_ADVERT. See Image Detection Types.
audioType | string | Conditional | 1024 | Audio risk detection types. At least one of audioType or audioBusinessType must be provided when audio moderation is enabled. Multiple values can be combined with underscores, for example POLITY_EROTIC. Use NONE to skip audio moderation entirely. See Audio Detection Types.
imgBusinessType | string | Conditional | 1024 | Image business-label detection types. At least one of imgType or imgBusinessType must be provided. Multiple values can be combined with underscores. See Business Types of Visual Moderation.
audioBusinessType | string | Conditional | 1024 | Audio business-label detection types. At least one of audioType or audioBusinessType must be provided when audio moderation is enabled. Multiple values can be combined with underscores. See Audio Business Detection Types.
imgCallback | string | Yes | 1024 | Callback HTTP URL for captured-frame moderation results. DeepCleer posts image callback payloads to this endpoint.
audioCallback | string | Conditional | 1024 | Callback HTTP URL for audio-segment moderation results. Required when audio moderation is enabled. DeepCleer posts audio callback payloads to this endpoint.
data | object | Yes | 1 MB | Request data content. See data Object.
acceptLang | string | No | - | Language for returned labels. Supported values: en, zh. Default: en.
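The conditional rules in the table above can be checked on the client before a request is sent. The following is a minimal sketch using only the Python standard library; build_request and submit are hypothetical helper names, the Content-Type header is an assumption (the text only states that bodies are JSON), and "audio moderation is enabled" is interpreted here as "an audio type was supplied and audioType is not NONE".

```python
import json
import urllib.request

# Endpoint listed in the Request URL table above.
SUBMIT_URL = "http://api-videostream-xjp.fengkongcloud.com/videostream/v4"


def build_request(access_key, app_id, event_id, data, img_callback, *,
                  img_type=None, img_business_type=None,
                  audio_type=None, audio_business_type=None,
                  audio_callback=None):
    """Assemble the top-level request body and apply the conditional rules."""
    if not (img_type or img_business_type):
        raise ValueError("provide at least one of imgType or imgBusinessType")
    # Assumption: audio moderation counts as enabled when an audio type is
    # supplied and audioType is not the literal NONE.
    audio_enabled = audio_type != "NONE" and bool(audio_type or audio_business_type)
    if audio_enabled and not audio_callback:
        raise ValueError("audioCallback is required when audio moderation is enabled")
    body = {
        "accessKey": access_key,
        "appId": app_id,
        "eventId": event_id,
        "imgCallback": img_callback,
        "data": data,
    }
    optional = {
        "imgType": img_type,
        "imgBusinessType": img_business_type,
        "audioType": audio_type,
        "audioBusinessType": audio_business_type,
        "audioCallback": audio_callback,
    }
    body.update({k: v for k, v in optional.items() if v})
    return body


def submit(body, url=SUBMIT_URL, timeout=7.0):
    """POST the JSON body using the recommended 7-second timeout."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Keeping body construction separate from the network call makes the validation testable without contacting the service.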

Image Detection Types

Combine multiple types with underscores, for example POLITY_QRCODE_ADVERT.

Value | Description
POLITY | Political content detection
EROTIC | Pornography and sexual content detection
VIOLENT | Violence, terrorism, and prohibited content detection
QRCODE | QR code detection
ADVERT | Advertisement detection
IMGTEXTRISK | Image-text violation detection (OCR)

Audio Detection Types

Combine multiple types with underscores, for example POLITY_EROTIC. Use NONE to skip audio moderation entirely.

Value | Description
POLITY | Political content detection
EROTIC | Pornographic content detection
ADVERT | Advertising detection
BAN | Prohibited content detection
VIOLENT | Violence and terrorism detection
DIRTY | Profanity or abusive language detection
ADLAW | Advertising law violation detection
MOAN | Moaning detection
AUDIOPOLITICAL | Top-leader voiceprint detection
ANTHEN | National anthem detection
BANEDAUDIO | Prohibited songs detection
NONE | Skip audio detection

Audio Business Detection Types

Combine multiple types with underscores. To detect timbre, singing, or language, GENDER must also be included.

Value | Description
SING | Singing detection
LANGUAGE | Language detection (Chinese, English, Cantonese, Tibetan, Uyghur, Korean, Mongolian, Other)
MINOR | Minor detection
GENDER | Gender detection
TIMBRE | Voice timbre detection
VOICE | Voice attributes
AUDIOSCENE | Audio scene detection
AGE | Age detection
APPNAME | App name detection
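The underscore-joining convention and the GENDER dependency stated above can be enforced with a small helper. This is a sketch only: join_audio_business_types is a hypothetical name, and appending GENDER automatically is one possible policy (rejecting the input would be equally valid).

```python
AUDIO_BUSINESS_TYPES = {
    "SING", "LANGUAGE", "MINOR", "GENDER", "TIMBRE",
    "VOICE", "AUDIOSCENE", "AGE", "APPNAME",
}
# Types that only work when GENDER is also requested, per the note above.
NEEDS_GENDER = {"TIMBRE", "SING", "LANGUAGE"}


def join_audio_business_types(types):
    """Return the audioBusinessType string, adding GENDER when required."""
    unknown = [t for t in types if t not in AUDIO_BUSINESS_TYPES]
    if unknown:
        raise ValueError(f"unknown audio business types: {unknown}")
    selected = list(dict.fromkeys(types))  # drop duplicates, keep order
    if NEEDS_GENDER.intersection(selected) and "GENDER" not in selected:
        selected.append("GENDER")
    return "_".join(selected)
```

For example, requesting SING and LANGUAGE yields a value that also carries GENDER, while a request for AGE alone is passed through unchanged.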

data Object

Parameter | Type | Required | Max Length | Description
streamType | string | Yes | - | Video stream source type. See Stream Type.
tokenId | string | Yes | 64 | Stable identifier for the end user, typically your internal user ID. Used for behavioral-risk signals such as spam, advertising, and repeat-offender detection. Alphanumeric with underscores and hyphens, up to 64 characters.
anchorName | string | No | - | Anchor display name. Usually used by human review workflows.
audioDetectStep | int32 | No | - | Audio moderation step in 10-second segments. Integer range: 1-36. 1 means skip one 10-second segment between moderated segments; 2 means skip two segments, and so on. Omit this field to moderate all audio segments.
detectFrequency | int32 | No | - | Frame capture interval, in seconds. Range: 1-60. Decimals are rounded down; values below 1 are treated as 1. Default: 3 seconds.
detectStep | int32 | No | - | Captured-frame moderation step. One captured frame is moderated per step. Must be >= 1. Omit this field to moderate all captured frames.
deviceId | string | No | 128 | Device-fingerprint identifier issued by the DeepCleer SDK. Used for user behavior analysis.
gender | string | No | - | End user's gender. Recommended values: male, female, ambiguity.
imgBusinessDetectStep | int32 | No | - | Image business-label moderation step. One captured frame is moderated for imgBusinessType per step. Must be >= 1. Default: 1, meaning all captured frames are checked for business labels.
imgCompareBase | string | No | 1024 | Reference image URL used for face comparison. Present when imgBusinessType includes FACECOMPARE. Supported formats: JPG, JPEG, PNG, WebP, GIF, TIFF, TIF, HEIF. Recommended minimum image size: 256 x 256 px. Animated images are not currently supported as reference images.
ip | string | No | 64 | Public IP address of the user. Accepts IPv4 or IPv6. Used for IP-based user behavior analysis.
lang | string | No | - | Language used for OCR and audio-text moderation. Pass en by default. For international traffic when the language is mixed, pass auto to enable automatic language detection. See Supported Languages.
level | int32 | No | - | User level. Use this to configure different moderation policies for different user tiers. See User Levels.
liveCover | string | No | - | Livestream cover image. Usually used by human review workflows.
liveTitle | string | No | - | Livestream title. Usually used by human review workflows.
receiveTokenId | string | No | 64 | tokenId of the message recipient. Alphanumeric with underscores and hyphens, up to 64 characters.
returnAllImg | int32 | No | - | Controls which frame moderation results are returned. 0 (default): return only non-PASS frame results. 1: return all frame results.
returnAllText | int32 | No | - | Controls which audio moderation results are returned. 0 (default): return only non-PASS audio segments and transcripts. 1: return all audio segments and transcripts.
returnFinishInfo | int32 | No | - | Whether to send a stream-end callback. 0 (default): do not send a stream-end callback. 1: send a callback when stream moderation ends; callback payload includes statCode.
returnPreAudio | int32 | No | - | Whether to return the previous audio segment. 0 (default): do not return previous audio. 1: when the current segment is risky, preAudioUrl contains a 20-second clip covering the previous 10 seconds plus the current 10 seconds.
returnPreText | int32 | No | - | Whether to return the previous audio transcript. 0 (default): do not return previous transcript. 1: when the current segment is risky, content contains 20 seconds of transcript covering the previous 10 seconds plus the current 10 seconds.
room | string | No | 64 | Live-room or game-room ID. Can be used to apply room-level moderation policies.
streamName | string | No | 64 | Video stream name. Used for display in the DeepCleer console; recommended.
url | string | Conditional | 600 | Standard video stream URL to moderate. Required when streamType is NORMAL.
agoraParam | object | Conditional | - | Agora recording parameters. Required when streamType is AGORA. See data.agoraParam Object.
trtcParam | object | Conditional | - | Tencent TRTC recording parameters. Required when streamType is TRTC. See data.trtcParam Object.
zegoParam | object | Conditional | - | Zego recording parameters. Required when streamType is ZEGO. See data.zegoParam Object.
volcParam | object | Conditional | - | Volcano Engine recording parameters. Required when streamType is VOLC. See data.volcParam Object.
aliParam | object | Conditional | - | Alibaba Cloud recording parameters. Required when streamType is ALI. See data.aliParam Object.
extra | object | No | - | Auxiliary parameters. See data.extra Object.
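The sampling parameters above are easier to reason about with the arithmetic written out. The sketch below mirrors the documented rounding for detectFrequency and derives the spacing between moderated audio segments from audioDetectStep. Both function names are hypothetical, and rejecting detectFrequency values above 60 on the client is an assumption, since the text does not say how the service treats them.

```python
import math
from typing import Optional


def effective_detect_frequency(value: float) -> int:
    """Apply the documented rule: decimals round down, values below 1 become 1."""
    if value > 60:
        # Assumption: behavior above the documented range is unspecified,
        # so this sketch refuses the value instead of guessing.
        raise ValueError("detectFrequency range is 1-60")
    return max(1, math.floor(value))


def audio_seconds_between_checks(audio_detect_step: Optional[int] = None) -> int:
    """Seconds from the start of one moderated 10-second segment to the next."""
    if audio_detect_step is None:
        return 10  # field omitted: every segment is moderated
    if not 1 <= audio_detect_step <= 36:
        raise ValueError("audioDetectStep range is 1-36")
    # A step of n skips n segments between moderated ones.
    return 10 * (audio_detect_step + 1)
```

With audioDetectStep set to 1, one segment is skipped between moderated segments, so a moderated segment starts every 20 seconds; with the maximum of 36 the spacing grows to 370 seconds.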

Stream Type

Value | Description
NORMAL | Standard stream URL. Supported protocols: RTMP, RTMPS, HLS, HTTP, HTTPS. Provide the stream URL via data.url.
AGORA | Agora moderation. Provide recording parameters via data.agoraParam.
TRTC | Tencent TRTC moderation. Provide recording parameters via data.trtcParam.
ZEGO | Zego moderation. Provide recording parameters via data.zegoParam.
VOLC | Volcano Engine moderation. Provide recording parameters via data.volcParam.
ALI | Alibaba Cloud moderation. Provide recording parameters via data.aliParam.
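Each streamType value above pairs with exactly one field under data, which makes a client-side pre-check straightforward. A minimal sketch, where check_stream_source is a hypothetical helper name:

```python
# Field under data that must accompany each streamType, per the table above.
REQUIRED_FIELD = {
    "NORMAL": "url",
    "AGORA": "agoraParam",
    "TRTC": "trtcParam",
    "ZEGO": "zegoParam",
    "VOLC": "volcParam",
    "ALI": "aliParam",
}


def check_stream_source(data: dict) -> str:
    """Return the name of the source field, or raise if it is missing."""
    stream_type = data.get("streamType")
    field = REQUIRED_FIELD.get(stream_type)
    if field is None:
        raise ValueError(f"unknown streamType: {stream_type!r}")
    if not data.get(field):
        raise ValueError(f"streamType {stream_type} requires data.{field}")
    return field
```

Running this before submission surfaces a missing url or provider object locally rather than as an invalid-parameters response.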

When an RTC SDK recording solution is used, additional recording charges may be incurred by the RTC provider. Contact the relevant RTC provider for details.

User Levels

Value | Description
0 | Lowest-level user, such as newly registered, completely inactive, or level-0 users
1 | Lower-level user, such as low-activity or low-level users
2 | Mid-level user, such as moderately active or mid-level users
3 | Higher-level user, such as highly active or high-level users
4 | Highest-level user, such as paying users or VIP users

Supported Languages

Value | Language
en | English
zh | Chinese
ar | Arabic
hi | Hindi
es | Spanish
fr | French
ru | Russian
pt | Portuguese
id | Indonesian
de | German
ja | Japanese
tr | Turkish
vi | Vietnamese
it | Italian
th | Thai
tl | Filipino
ko | Korean
ms | Malay
auto | Automatic language detection. Contact DeepCleer to enable.

data.extra Object

Parameter | Type | Required | Max Length | Description
passThrough | object | No | 1024 | Client pass-through field. DeepCleer does not process this field; it is echoed back in the callback payload as-is.

data.agoraParam Object

Required when streamType is AGORA.

Parameter | Type | Required | Max Length | Description
appId | string | Yes | 64 | Application identifier provided by Agora.
channel | string | Yes | 64 | Channel name provided by Agora.
channelProfile | int32 | No | 32 | Agora recording channel mode. 0 (default): communication mode, such as 1-on-1 or group chats where any user can speak freely. 1: live broadcast mode with host and audience roles.
enableH265Support | boolean | No | - | Whether to support H.265 video stream recording. false (default): do not support H.265, and remote users cannot send H.265 video. true: support H.265.
enableIntraRequest | boolean | No | - | Whether to enable keyframe requests. Default: true. This can improve audio and video experience under weak network conditions. To enable seeking in recordings made in individual-stream mode, set this to false. false: disable keyframe requests; all senders send keyframes every 2 seconds, and individual-stream recordings support seeking. true: sender controls keyframe requests, and individual-stream recordings do not support seeking.
subscribeMode | string | No | - | Subscription mode. AUTO (default): subscribe to all streams in the room. UNTRUSTED: pair with untrustedUserIdList to subscribe only to streams from users in the list; an empty list returns a parameter error. TRUSTED: pair with trustedUserIdList to subscribe only to streams from users not in the list. In TRUSTED mode, if no users outside the list join the room within a certain time, DeepCleer proactively ends moderation.
token | string | No | 64 | Authentication token. See Agora documentation for generation: https://docs.agora.io/cn/Recording/token_server?platform=CPP. Set the token validity period to exceed the channel duration to prevent stream-pull failures due to token expiration. Agora currently caps token validity at 24 hours; for channels lasting longer, set returnFinishInfo to 1, watch for an end callback with statCode of 1 caused by an invalid or expired pull token, generate a new token, and resubmit the channel for moderation if it still requires moderation.
uid | int32 | No | 64 | A 32-bit unsigned integer. When token is provided, supply the user ID used to generate the token. Must not collide with any actual user UID in the room; the recording-side UID must be unique.
trustedUserIdList | array | No | - | Trusted-user UID list. Effective when subscribeMode is TRUSTED. Must not be empty. DeepCleer will not subscribe to streams from users in this list. Comma-separated UID array, for example [1, 2]. Maximum: 17 users.
untrustedUserIdList | array | No | - | Untrusted-user UID list. Effective when subscribeMode is UNTRUSTED. Must not be empty. DeepCleer will only subscribe to streams from users in this list. Comma-separated UID array, for example [1, 2]. Maximum: 17 users.
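The pairing between subscribeMode and the two UID lists above can be validated before submission. This is a sketch under the rules stated in the table (non-empty list, at most 17 users); check_agora_subscription is a hypothetical helper name.

```python
MAX_LISTED_USERS = 17
# UID list that each non-default subscribeMode depends on.
LIST_FIELD = {
    "TRUSTED": "trustedUserIdList",      # streams from listed users are skipped
    "UNTRUSTED": "untrustedUserIdList",  # only listed users are subscribed
}


def check_agora_subscription(agora_param: dict) -> None:
    """Raise ValueError if subscribeMode and its UID list are inconsistent."""
    mode = agora_param.get("subscribeMode", "AUTO")
    if mode == "AUTO":
        return  # all streams in the room are subscribed; no list needed
    field = LIST_FIELD.get(mode)
    if field is None:
        raise ValueError(f"unknown subscribeMode: {mode!r}")
    uids = agora_param.get(field) or []
    if not uids:
        raise ValueError(f"{field} must not be empty when subscribeMode is {mode}")
    if len(uids) > MAX_LISTED_USERS:
        raise ValueError(f"{field} accepts at most {MAX_LISTED_USERS} users")
```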

data.trtcParam Object

Required when streamType is TRTC.

Parameter | Type | Required | Max Length | Description
appScene | int32 | Yes | 1 | Application scene. 0 (default): video call. 1: video live broadcast. See https://cloud.tencent.com/document/product/647/79634.
demoSences | int32 | Yes | - | Recording type. 2: individual stream recording. 4: mixed stream recording. Note: the field name on the wire is demoSences, a preserved misspelling of "demoScenes"; send it exactly as written.
sdkAppId | int32 | Yes | 64 | The sdkAppId provided by Tencent.
strRoomId | string | Conditional | 128 | String room ID. Allowed characters: letters (a-z, A-Z), digits (0-9), underscores, and hyphens. One of roomId or strRoomId must be provided. If both are present, roomId takes priority.
userId | string | Yes | 32 | The userId assigned to the recording end. Allowed characters: letters (a-z, A-Z), digits (0-9), underscores, and hyphens.
userSig | string | Yes | 128 | Verification signature corresponding to the recording userId. Equivalent to a login password.
roomId | int32 | Conditional | 10 | Numeric room ID. Range: 1-4294967294. One of roomId or strRoomId must be provided. If both are present, roomId takes priority. Currently a maximum of 8 users per room can be moderated.

data.zegoParam Object

Required when streamType is ZEGO.

Parameter | Type | Required | Max Length | Description
roomId | string | Yes | 64 | Zego room ID.
tokenId | string | Yes | 64 | Authentication information provided by Zego. Used to obtain identify_token for login. See Zego documentation for generation: https://doc-zh.zego.im/article/15258. Note: tokenId is a unique identifier; a new token must be generated for each moderation request.

data.volcParam Object

Required when streamType is VOLC.

Parameter | Type | Required | Max Length | Description
appId | string | Yes | 64 | Application identifier provided by Volcano Engine.
roomId | string | Yes | 128 | Room ID.
token | string | Yes | 64 | Verification signature corresponding to the recording userId. Equivalent to a login password.
userId | string | Yes | 32 | The userId assigned to the recording end. Allowed characters: letters (a-z, A-Z), digits (0-9), underscores, and hyphens.

data.aliParam Object

Required when streamType is ALI.

Parameter | Type | Required | Max Length | Description
room | string | Yes | 64 | Room ID. Must exactly match the channelID used to generate the token. DeepCleer pulls and records streams on a per-room basis. room is a unique identifier; duplicate rooms will not result in duplicate stream pulls.
token | string | Yes | 64 | Token used by the pull-stream end to join the channel. See https://help.aliyun.com/zh/live/user-guide/token-based-authentication for generation. A new token must be generated for each moderation submission.
userId | int32 | No | 32 | Alibaba user account identifier.

Synchronous Response

The synchronous response is an acknowledgement only. It confirms whether the stream was accepted for moderation. Frame and audio results are delivered later to the callback URLs you supplied. See Stream Segment Callback Parameters and, when returnFinishInfo is 1, Stream End Callback Parameters.

Response Parameters

Parameter | Type | Required | Description
requestId | string | Yes | Unique DeepCleer request identifier. Save this value to cancel moderation, correlate callbacks, and troubleshoot.
code | int32 | Yes | Response code. See Response Codes.
message | string | Yes | Response message corresponding to the code.
detail | object | No | Additional response detail.
dupRequestId | string | No | Returned when errorcode is 1001, indicating a duplicate stream request. Use this request ID to close moderation if the original response was missed.
errorcode | int32 | No | Business error code. 1001: duplicate stream request.

Response Codes

Code | Message
1100 | Success
1901 | QPS limit exceeded
1902 | Invalid parameters
1903 | Service failure
1904 | Stream count limit exceeded
9101 | Unauthorized operation
Stream Segment Callback Parameters

Per-segment results are delivered to imgCallback when contentType is 1 and to audioCallback when contentType is 2. Both callback payloads share the envelope below.

Note: Parameters other than code, message, and requestId are only guaranteed to be returned when code is 1100.

Parameter | Type | Required | Description
requestId | string | Yes | Unique DeepCleer request identifier, same as the value returned in the synchronous acknowledgement.
code | int32 | Yes | Response code. See Response Codes.
message | string | Yes | Response message corresponding to the code.
statCode | int32 | No | Callback status code. 0: regular moderation result callback. 1: stream-end result callback, only when returnFinishInfo is 1.
contentType | int32 | Yes | Distinguishes between image and audio callbacks. 1: image (frame) callback. 2: audio segment callback.
frameDetail | object | No | Frame moderation detail. Present when contentType is 1 and the frame has a risky label, or when returnAllImg is 1. See frameDetail Object.
audioDetail | object | No | Audio segment moderation detail. Present when contentType is 2 and the segment has a risky label, or when returnAllText is 1. See audioDetail Object.
auxInfo | object | No | Auxiliary information. Contains passThrough echoed from data.extra.passThrough in the original request.
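Since both callback URLs receive the same envelope, a receiver can route payloads with a few field checks. The sketch below is illustrative only; classify_callback and needs_action are hypothetical names, and the routing labels are arbitrary strings chosen for this example.

```python
def classify_callback(payload: dict) -> str:
    """Route a callback payload using code, statCode, and contentType."""
    if payload.get("code") != 1100:
        # Other fields are only guaranteed when code is 1100.
        return "error"
    if payload.get("statCode") == 1:
        return "stream_end"
    content_type = payload.get("contentType")
    if content_type == 1:
        return "frame"
    if content_type == 2:
        return "audio"
    return "unknown"


def needs_action(payload: dict) -> bool:
    """True when the segment's riskLevel is REVIEW or REJECT."""
    detail = payload.get("frameDetail") or payload.get("audioDetail") or {}
    return detail.get("riskLevel") in {"REVIEW", "REJECT"}
```

Checking code before anything else follows the note above that only code, message, and requestId are guaranteed on non-1100 responses.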

frameDetail Object

Present when there are risky frames or when returnAllImg is 1.

Parameter | Type | Required | Description
imgUrl | string | Yes | URL of the captured frame.
riskLevel | string | Yes | Disposition recommendation. PASS: normal (allow). REVIEW: suspicious (route to manual review). REJECT: violation (block).
riskLabel1 | string | Yes | Level 1 risk label. Returns normal when riskLevel is PASS.
riskLabel2 | string | Yes | Level 2 risk label. Empty when riskLevel is PASS.
riskLabel3 | string | Yes | Level 3 risk label. Empty when riskLevel is PASS.
riskDescription | string | Yes | Risk description. Returns "Normal" when riskLevel is PASS. Returns "Hit custom list" when a customer-defined list is matched. Otherwise format: "Level 1: Level 2: Level 3". For reference only; do not use for programmatic logic.
allLabels | array | Yes | All risk labels matched on this frame. See frameDetail.allLabels.
riskDetail | object | No | Risk detail. See frameDetail.riskDetail.
auxInfo | object | Yes | Auxiliary information. See frameDetail.auxInfo.
businessLabels | array | No | Business labels matched on this frame. See frameDetail.businessLabels.

frameDetail.allLabels

Each element in the array:

Parameter | Type | Required | Description
riskLevel | string | No | Risk level: PASS, REVIEW, or REJECT.
riskLabel1 | string | No | Level 1 risk label.
riskLabel2 | string | No | Level 2 risk label.
riskLabel3 | string | No | Level 3 risk label.
riskDescription | string | No | Risk description. Returns "Normal" when riskLevel is PASS. For reference only; do not use for programmatic logic.
probability | float | No | Confidence score from 0 to 1. Higher values indicate greater confidence.

frameDetail.riskDetail

Parameter | Type | Required | Description
riskSource | int32 | Yes | Risk source. 1000: no risk. 1001: text risk. 1002: visual image risk.
face_num | int32 | No | Number of faces detected.
person_num | int32 | No | Number of persons detected.
faces | array | No | Names and positions of politically sensitive persons in the frame. Up to 10 entries, with the highest probability entries selected if more are detected. See Face Object.
objects | array | No | Detected objects or symbols with names and positions. See Object Info.
ocrText | object | No | OCR text content recognized in the frame. Present when imgType includes IMGTEXTRISK or ADVERT. Contains text (string): recognized text.
matchedLists | array | No | Matched custom list information. Returned only when a customer-defined list is hit. See Matched Lists.
riskSegments | array | No | High-risk content segments. Present when political, terrorism, prohibited, competitor, or advertising-law content is detected. See Risk Segments.
persons | array | No | Person names and positions in the frame. Up to 10 entries, with the highest probability entries selected. See Person Object.

Face Object

Parameter | Type | Required | Description
id | string | No | Identifier. The same person at the same position has the same ID across different labels. If the same person appears N times, N IDs are assigned.
name | string | No | Person name.
face_ratio | float | No | Face-to-frame ratio from 0 to 1.
probability | float | No | Confidence score from 0 to 1.
location | array | No | Face position coordinates [x1, y1, x2, y2] representing top-left and bottom-right corners. Example: [207, 522, 340, 567].

Object Info

Parameter | Type | Required | Description
id | string | No | Object or symbol identifier. The same object at the same position has the same ID across different labels.
name | string | No | Object name.
probability | float | No | Confidence score from 0 to 1.
qrContent | string | No | QR code URL detected in the frame.
location | array | No | Object position coordinates [x1, y1, x2, y2]. Example: [207, 522, 340, 567].

Matched Lists

Parameter | Type | Required | Description
name | string | No | Name of the matched list.
words | array | No | Sensitive word details.
words[].word | string | No | The matched sensitive word.
words[].position | array | No | Position of the sensitive word.

Risk Segments

Parameter | Type | Required | Description
segment | string | No | High-risk content segment text.
position | array | No | Position of the segment, 0-indexed.

Person Object

Parameter | Type | Required | Description
id | string | No | Identifier. The same person has the same ID across different labels.
person_ratio | float | No | Person-to-frame ratio from 0 to 1.
probability | float | No | Confidence score from 0 to 1.
location | array | No | Person position coordinates.

frameDetail.auxInfo

Parameter | Type | Required | Description
beginProcessTime | int64 | Yes | Processing start time. 13-digit Unix timestamp in milliseconds (UTC).
finishProcessTime | int64 | Yes | Processing end time. 13-digit Unix timestamp in milliseconds (UTC).
detectType | int32 | No | Whether the captured frame was actually moderated. Only returned when the detectStep request parameter is set. 1: the frame was moderated. 2: the frame was skipped according to detectStep.
imgTime | string | No | Time the frame was captured, as the absolute time of the frame in the video stream.
room | string | No | Room ID.
similarityDedup | int32 | No | Auxiliary parameter. Only returned when the similar-frame deduplication feature changed the outer riskLevel from REJECT or REVIEW to PASS. 1: similar-frame deduplication is active.
strUserId | string | No | User identifier for distinguishing violating users within a room. Unrelated to the userId request parameter; this is the individual stream user ID. Returned for ZEGO room-level moderation, TRTC individual-stream moderation, VOLC, and ALI.
userId | int32 | No | Agora user account identifier. Only present in individual-stream Agora scenarios. The returned userId is the actual user ID in the room, unrelated to the uid request parameter.
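The two processing timestamps above are millisecond Unix times, so latency and human-readable times follow directly. A small sketch using only the standard library; both function names are hypothetical.

```python
from datetime import datetime, timezone


def to_utc(ms: int) -> datetime:
    """Convert a 13-digit millisecond Unix timestamp to an aware UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


def processing_latency_ms(aux_info: dict) -> int:
    """Milliseconds DeepCleer spent processing one frame or segment."""
    return aux_info["finishProcessTime"] - aux_info["beginProcessTime"]
```

Tracking processing_latency_ms per callback is one way to tell slow moderation apart from slow delivery when results arrive late.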

frameDetail.businessLabels

Each element in the array:

Parameter | Type | Required | Description
businessLabel1 | string | Yes | Level 1 business label.
businessLabel2 | string | Yes | Level 2 business label.
businessLabel3 | string | Yes | Level 3 business label.
businessDescription | string | Yes | Business label description. Format: "Level 1: Level 2: Level 3".
probability | float | Yes | Confidence score from 0 to 1. Higher values indicate greater confidence.
confidenceLevel | int32 | No | Confidence level from 0 to 2. Higher values indicate greater confidence.
businessDetail | object | No | Business label details. May contain face_num, person_num, faces, objects, and persons with the same structure as frameDetail.riskDetail.

audioDetail Object

Present when there are risky audio segments or when returnAllText is 1.

Parameter | Type | Required | Description
audioUrl | string | Yes | URL of the audio segment.
riskLevel | string | Yes | Disposition recommendation. PASS: normal (allow). REVIEW: suspicious (route to manual review). REJECT: violation (block).
riskLabel1 | string | Yes | Level 1 risk label. Returns normal when riskLevel is PASS.
riskLabel2 | string | Yes | Level 2 risk label. Empty when riskLevel is PASS.
riskLabel3 | string | Yes | Level 3 risk label. Empty when riskLevel is PASS.
riskDescription | string | Yes | Risk description. Returns "Normal" when riskLevel is PASS. Returns "Hit custom list" when a customer-defined list is matched. Otherwise format: "Level 1: Level 2: Level 3". For reference only; do not use for programmatic logic.
vadCode | int32 | Yes | Voice activity flag. 0: silent segment. 1: non-silent segment.
allLabels | array | Yes | All risk labels matched on this segment. See audioDetail.allLabels.
riskDetail | object | No | Risk detail per segment. See audioDetail.riskDetail.
content | string | No | Audio-to-text transcription for this segment. When returnPreText is 1 and the current segment is REJECT, returns 20 seconds of transcript covering the previous 10 seconds plus the current 10 seconds. Otherwise returns only the current segment transcript.
preAudioUrl | string | No | Previous audio segment URL. When returnPreAudio is 1 and the current segment is REJECT, returns a 20-second audio clip covering the previous 10 seconds plus the current 10 seconds. Otherwise not returned.
auxInfo | object | No | Auxiliary information. See audioDetail.auxInfo.
businessLabels | array | No | Business labels matched on this segment. See audioDetail.businessLabels.

audioDetail.allLabels

Each element in the array:

Parameter | Type | Required | Description
riskLevel | string | Yes | Risk level: PASS, REVIEW, or REJECT.
riskLabel1 | string | Yes | Level 1 risk label. Returns normal when riskLevel is PASS.
riskLabel2 | string | Yes | Level 2 risk label. Empty when riskLevel is PASS.
riskLabel3 | string | Yes | Level 3 risk label. Empty when riskLevel is PASS.
riskDescription | string | Yes | Risk description. Returns "Normal" when riskLevel is PASS. Returns "Hit custom list" when a customer-defined list is matched. Otherwise format: "Level 1: Level 2: Level 3". For reference only; do not use for programmatic logic.

audioDetail.riskDetail

Parameter | Type | Required | Description
riskSource | int32 | Yes | Risk source. 1000: no risk. 1001: text risk. 1002: visual image risk. 1003: audio voice risk.
audioText | string | No | Audio-to-text transcription for this segment.
matchedLists | array | No | Matched custom list information. Returned only when a customer-defined list is hit. Same structure as Matched Lists.
riskSegments | array | No | High-risk content segments. Present when political, terrorism, prohibited, competitor, or advertising-law content is detected. Same structure as Risk Segments.

audioDetail.auxInfo

Parameter | Type | Required | Description
audioStartTime | string | Yes | Violation content start time, as absolute time.
audioEndTime | string | Yes | Violation content end time, as absolute time.
beginProcessTime | int64 | Yes | Processing start time. 13-digit Unix timestamp in milliseconds (UTC).
finishProcessTime | int64 | Yes | Processing end time. 13-digit Unix timestamp in milliseconds (UTC).
room | string | No | Room ID.
strUserId | string | No | User identifier for distinguishing violating users within a room. Unrelated to the userId request parameter; this is the individual stream user ID. Returned for ZEGO room-level moderation, TRTC individual-stream moderation, VOLC, and ALI.
userId | int32 | No | Agora user account identifier. Only present in individual-stream Agora scenarios. The returned userId is the actual user ID in the room, unrelated to the uid request parameter.

Note on field casing: audioStartTime and audioEndTime use an uppercase T, while the standalone Audio Moderation APIs use a lowercase t (audioStarttime / audioEndtime). The names are shown here exactly as returned on the wire, so match them case-sensitively.

audioDetail.businessLabels

Each element in the array:

Parameter | Type | Required | Description
businessLabel1 | string | Yes | Level 1 business label.
businessLabel2 | string | Yes | Level 2 business label.
businessLabel3 | string | Yes | Level 3 business label.
businessDescription | string | Yes | Business label description. Format: "Level 1: Level 2: Level 3".
probability | float | Yes | Confidence score from 0 to 1. Higher values indicate greater confidence.
confidenceLevel | int32 | No | Confidence level from 0 to 2. Higher values indicate greater confidence.
riskDetail | object | No | Business risk detail. Fields: riskSource (int32; 1000, 1001, or 1003), audioText (string), matchedLists (array, same as above), and riskSegments (array, same as above).

tokenProfileLabels - Account Attribute Labels

Returned only when tokenId is provided and the labeling service is enabled. Each element in the array:

Parameter | Type | Required | Description
label1 | string | No | Level 1 label.
label2 | string | No | Level 2 label.
label3 | string | No | Level 3 label.
description | string | No | Label description. For reference only; do not use for programmatic logic.
timestamp | int64 | No | Label timestamp. 13-digit Unix timestamp in milliseconds (UTC).

tokenRiskLabels - Account Risk Labels

Returned only when tokenId is provided and the labeling service is enabled. Each element in the array:

Parameter | Type | Required | Description
label1 | string | No | Level 1 label.
label2 | string | No | Level 2 label.
label3 | string | No | Level 3 label.
description | string | No | Label description. For reference only; do not use for programmatic logic.
timestamp | int64 | No | Label timestamp. 13-digit Unix timestamp in milliseconds (UTC).

Stream End Callback Parameters

Delivered when returnFinishInfo is 1 and moderation of the stream completes, either naturally, via stream-pull failure, or via the close-moderation API.

Parameter | Type | Required | Description
requestId | string | Yes | Unique DeepCleer request identifier, same as the value returned in the synchronous acknowledgement.
code | int32 | Yes | Response code. See Response Codes.
message | string | Yes | Response message corresponding to the code.
riskLevel | string | Yes | Overall stream disposition recommendation. PASS: normal (allow). REVIEW: suspicious (route to manual review). REJECT: violation (block).
statCode | int32 | Yes | Callback status code. 0: regular moderation result callback. 1: stream-end result callback. Always 1 for this payload.
contentType | int32 | Yes | Distinguishes between image and audio end callbacks. 1: image moderation end callback. 2: audio moderation end callback.
pullStreamSuccess | boolean | Yes | Whether the stream pull succeeded. true: stream pull succeeded. false: stream pull failed, meaning not even a single frame was successfully captured.
auxInfo | object | Yes | Auxiliary information. See Stream End auxInfo.
requestParams | object | No | Echo of all fields submitted under data in the original request. Returned when contentType is 2.
detail | object | No | Detail information. Returned when contentType is 1. Contains requestParams (object): echo of all fields submitted under data in the original request.

Stream End auxInfo

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| streamTime | int32 | Yes | Total stream moderation duration. Returned in the final callback after the stream ends. When interval moderation (audioDetectStep) is applied, this may differ from the actual stream duration. |
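Because the echoed request sits in a different place depending on contentType (top-level requestParams for audio, detail.requestParams for image), a handler has to branch before reading it. The sketch below shows one way to normalize the two shapes; the function name and the keys of the returned dictionary are assumptions for this example.

```python
from typing import Any, Dict


def summarize_stream_end(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Normalize an image or audio stream-end callback into one shape."""
    if payload.get("statCode") != 1:
        raise ValueError("not a stream-end callback (statCode is not 1)")

    content_type = payload.get("contentType")
    if content_type == 1:
        # Image end callback: the echo is nested under detail.
        params = (payload.get("detail") or {}).get("requestParams") or {}
        track = "image"
    elif content_type == 2:
        # Audio end callback: the echo is a top-level field.
        params = payload.get("requestParams") or {}
        track = "audio"
    else:
        raise ValueError(f"unexpected contentType: {content_type!r}")

    return {
        "track": track,
        "pulled": bool(payload.get("pullStreamSuccess")),
        "streamTime": (payload.get("auxInfo") or {}).get("streamTime"),
        "requestParams": params,
    }
```

A pulled value of False means no frame was ever captured, so callers may want to treat the accompanying riskLevel differently from a verdict on a stream that was actually moderated.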

Example

Request Example

{
  "accessKey": "*********",
  "appId": "defaulttest",
  "audioBusinessType": "SING_LANGUAGE",
  "audioCallback": "http://www.xxx.top/callbackxxx",
  "audioType": "POLITY_EROTIC_ADVERT_MOAN",
  "data": {
    "detectFrequency": 10,
    "detectStep": 1,
    "extra": {
      "passThrough": {
        "passThrough1": "111",
        "passThrough2": "222",
        "passThrough3": "333"
      }
    },
    "ip": "123.171.34.4",
    "lang": "zh",
    "returnAllImg": 1,
    "returnAllText": 1,
    "returnPreAudio": 1,
    "returnPreText": 1,
    "room": "5e1854a6a0a79d0001a09bc3",
    "streamType": "NORMAL",
    "tokenId": "123",
    "url": "http://rtmp.xxxx.cn/live/3637778raLSXdOdu.flv"
  },
  "eventId": "VIDEOSTREAM",
  "imgBusinessType": "BODY_FOOD_3CPRODUCTSLOGO",
  "imgCallback": "http://www.xxx.top/callbackxxx",
  "imgType": "POLITY_EROTIC_ADVERT"
}

Response Example

{
  "code": 1100,
  "message": "Success",
  "requestId": "66fb85e3149bb9e13d6c72161cc6c6cf"
}
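A caller will typically store the requestId from this acknowledgement so that later callbacks can be correlated with the submission. The sketch below parses the body and fails loudly on any other code. It assumes 1100 is the success code, as the example above suggests; SubmissionError and request_id_from_ack are names invented for this illustration, and the Response Codes table remains the authority on other values.

```python
import json

# Assumption: 1100 denotes success, as in the Response Example.
SUCCESS_CODE = 1100


class SubmissionError(Exception):
    """Raised when the synchronous acknowledgement is not a success."""


def request_id_from_ack(body: str) -> str:
    """Parse the acknowledgement body and return its requestId."""
    ack = json.loads(body)
    if ack.get("code") != SUCCESS_CODE:
        raise SubmissionError(
            f"submission not accepted: code={ack.get('code')} "
            f"message={ack.get('message')!r}"
        )
    return ack["requestId"]
```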

Frame Capture Callback Example

{
  "auxInfo": {
    "passThrough": {
      "passThrough1": "111",
      "passThrough2": "222",
      "passThrough3": "333"
    }
  },
  "code": 1100,
  "contentType": 1,
  "frameDetail": {
    "allLabels": [
      {
        "riskDescription": "Involvement in politics: Involvement in politics: Involvement in politics",
        "riskLabel1": "politics",
        "riskLabel2": "shezheng",
        "riskLabel3": "shezheng",
        "riskLevel": "REJECT"
      }
    ],
    "auxInfo": {
      "beginProcessTime": 1639825248361,
      "detectType": 1,
      "finishProcessTime": 1639825248809,
      "imgTime": "2021-12-18 19:00:48.375",
      "room": "5e1854a6a0a79d0001a09bc3"
    },
    "businessLabels": [],
    "imgUrl": "http://bj.cos.ap-beijing.xxx.com/image/1639825145166_vs130_1639825248361471656.jpg",
    "riskDescription": "Involvement in politics: Involvement in politics: Involvement in politics",
    "riskDetail": {
      "ocrText": {
        "text": "Page 4 (ban) Page 5 (violence)"
      },
      "riskSource": 1002
    },
    "riskLabel1": "politics",
    "riskLabel2": "shezheng",
    "riskLabel3": "shezheng",
    "riskLevel": "REJECT"
  },
  "message": "Success",
  "requestId": "1639825145166_vs130_1639825248361471656"
}
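To show how a payload like the one above might be consumed, here is a minimal sketch that maps frameDetail.riskLevel to an action and pulls out the fields a reviewer usually needs. It assumes frame-level riskLevel takes the same PASS, REVIEW, and REJECT values described for the stream-end callback; the action names and the choice to send unknown levels to manual review are assumptions of this example.

```python
from typing import Any, Dict

# Illustrative mapping; the action names are not part of the API.
ACTIONS = {"PASS": "allow", "REVIEW": "manual_review", "REJECT": "block"}


def frame_action(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Reduce a frame capture callback to the fields needed for a decision."""
    detail = payload.get("frameDetail") or {}
    level = detail.get("riskLevel")
    return {
        # Unknown or missing levels fall back to manual review (assumption).
        "action": ACTIONS.get(level, "manual_review"),
        "labels": [detail.get(f"riskLabel{i}") for i in (1, 2, 3)],
        "imgUrl": detail.get("imgUrl"),
        "room": (detail.get("auxInfo") or {}).get("room"),
        # passThrough is echoed in the top-level auxInfo, not in frameDetail.
        "passThrough": (payload.get("auxInfo") or {}).get("passThrough", {}),
    }
```

Applied to the example above, this yields the action "block" with labels ["politics", "shezheng", "shezheng"] and the three passThrough values from the original request.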

Stream End Callback Example

{
  "auxInfo": {
    "streamTime": 70
  },
  "code": 1100,
  "contentType": 1,
  "detail": {
    "requestParams": {
      "detectFrequency": 10,
      "detectStep": 1,
      "extra": {
        "passThrough": {
          "passThrough1": "111",
          "passThrough2": "222",
          "passThrough3": "333"
        }
      },
      "ip": "123.171.34.4",
      "lang": "zh",
      "returnAllImg": 1,
      "returnAllText": 1,
      "returnPreAudio": 1,
      "returnPreText": 1,
      "room": "5e1854a6a0a79d0001a09bc3",
      "streamType": "NORMAL",
      "tokenId": "123",
      "url": "http://rtmp.example.com/live/3637778raLSXdOdu.flv"
    }
  },
  "message": "Success",
  "pullStreamSuccess": true,
  "requestId": "5515ce1f9b474a6c4a3d79a8dfcaeaf4",
  "riskLevel": "PASS",
  "statCode": 1
}
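Since regular results and stream-end results arrive at the same callback URLs, an endpoint needs to tell them apart and still answer with HTTP 200 quickly. The following sketch, built only on the Python standard library, classifies a payload by statCode and contentType, queues it for later processing, and acknowledges immediately. It assumes regular audio callbacks carry contentType 2 in the same way the end callbacks do; classify_callback, CallbackHandler, and the pending queue are names invented for this example.

```python
import json
import queue
from http.server import BaseHTTPRequestHandler
from typing import Any, Dict

# Payloads wait here for a worker so the HTTP response is never delayed.
pending: "queue.Queue" = queue.Queue()


def classify_callback(payload: Dict[str, Any]) -> str:
    """Return e.g. "image_segment" or "audio_stream_end" for a payload."""
    track = {1: "image", 2: "audio"}.get(payload.get("contentType"), "unknown")
    # statCode 1 marks a stream-end callback; a missing value is treated as 0.
    phase = "stream_end" if payload.get("statCode", 0) == 1 else "segment"
    return f"{track}_{phase}"


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length))
            if not isinstance(payload, dict):
                raise ValueError("callback body is not a JSON object")
        except ValueError:
            # A non-200 response counts as a delivery failure and is retried.
            self.send_response(400)
            self.end_headers()
            return
        pending.put((classify_callback(payload), payload))
        self.send_response(200)  # acknowledge before any heavy processing
        self.end_headers()
```

To try it locally, pass CallbackHandler to http.server.HTTPServer and call serve_forever(); a separate worker thread would then drain pending.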