Request API

DeepCleer Audio Stream Moderation streams a live audio source from a public URL or a supported RTC provider, slices it into segments, and continuously pushes per-segment moderation results to your callback URL.

API Description

The Audio Stream Moderation API detects risks such as political sensitivity, pornography, advertising, terrorism, abuse, prohibited songs, and copyrighted songs in live or recorded audio streams. It can also identify business attributes such as gender, age, timbre, language, audio scene, singing, minors, and human-voice authenticity to support your business scenarios.

Submit a stream URL or RTC pull configuration once; DeepCleer maintains the pull, segments the audio, and continuously delivers moderation results to your callback URL until the stream ends or you stop the task.

Requirements

Item	Specification
Protocol	HTTP or HTTPS
Method	POST
Encoding	UTF-8
Format	All request and response parameters use JSON

Stream Pull Retry Mechanism

To guard against transient network failures, DeepCleer automatically retries failed stream pulls. The retry policy varies by stream type:

Stream Source	Retry Count	Retry Interval
Standard `rtmp` / `http` / `hls`	12	5s, 10s, 15s, … (incrementing by 5s, capped at 60s)
Agora SDK recording	2	0s (immediate retries)
Zego SDK recording	10	30s between each retry

If all retries fail, the task is closed and, when `returnFinishInfo` is `1`, a stream-end callback is delivered with an `auxInfo.errorCode` describing the failure.
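
The retry table above can be read as a per-source list of waits. The following is a minimal sketch of that reading, not DeepCleer's implementation; the function name and the `standard` / `agora` / `zego` keys are hypothetical, and the time spent inside each pull attempt is not counted.

```python
def retry_delays(source: str) -> list[int]:
    """Return the wait in seconds before each retry for a stream source."""
    if source == "standard":  # rtmp / http / hls pulls
        # 12 retries; the wait grows by 5 s each time and never exceeds 60 s.
        return [min(5 * attempt, 60) for attempt in range(1, 13)]
    if source == "agora":  # Agora SDK recording
        return [0, 0]  # 2 immediate retries
    if source == "zego":  # Zego SDK recording
        return [30] * 10  # 10 retries, 30 s apart
    raise ValueError(f"unknown stream source: {source}")


standard = retry_delays("standard")
print(standard)       # [5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60]
print(sum(standard))  # 390
```

Under this reading, a standard pull that never recovers is closed after roughly 390 seconds of waiting, which is a useful lower bound when deciding how long to wait for a stream-end callback.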

Timeout Suggestion

Recommended request timeout: 5 seconds.

Note: This timeout applies to the synchronous acknowledgement only. Moderation results are delivered asynchronously through your callback URL once the stream pull has stabilized.

Callback Mechanism

When DeepCleer pushes a result to your callback URL and your endpoint responds with HTTP 200, the delivery is considered successful. If any other status code is returned (or the request fails), the system retries up to 20 times.
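
Because only HTTP 200 counts as a successful delivery, a receiver should acknowledge quickly and defer heavy work. The sketch below uses only the Python standard library; `process_callback`, `CallbackHandler`, `serve`, and port 8080 are hypothetical names and values chosen for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def process_callback(body: bytes) -> int:
    """Return the HTTP status to send back for one callback delivery."""
    try:
        payload = json.loads(body)
    except ValueError:
        return 400  # any non-200 status causes the delivery to be retried
    if not isinstance(payload, dict) or "requestId" not in payload:
        return 400
    # Hypothetical hand-off: persist or enqueue the payload here, then acknowledge.
    return 200


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        status = process_callback(self.rfile.read(length))
        self.send_response(status)
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Start a blocking listener for callbacks (port is an assumption)."""
    HTTPServer(("0.0.0.0", port), CallbackHandler).serve_forever()
```

Calling `serve()` starts the listener. Since a failed delivery may be retried up to 20 times, the processing behind `process_callback` should tolerate receiving the same `requestId` more than once.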

Request

Request URL

Cluster	Endpoint
Silicon Valley	`http://api-audiostream-gg.fengkongcloud.com/audiostream/v4`
Singapore	`http://api-audiostream-xjp.fengkongcloud.com/audiostream/v4`

Request Parameters

Top-Level Parameters

Parameter	Type	Required	Max Length	Description
`accessKey`	string	Yes	20	API authentication key. The default `accessKey` is sent in your onboarding email.
`appId`	string	Yes	64	Application identifier, such as `web` for your web application or `app` for your mobile app. The default `appId` is sent in your onboarding email. Contact DeepCleer if you need a new `appId`.
`eventId`	string	Yes	64	Event identifier used to distinguish moderation scenarios in your application, such as `voiceMessage` for chat voice messages or `liveAudio` for livestream audio. The default `eventId` is sent in your onboarding email. Contact DeepCleer if you need a new `eventId`.
`type`	string	Conditional	—	Risk detection types. Either `type` or `businessType` (or both) must be provided. See Detection Types. Combine multiple types with underscores, e.g. `POLITY_EROTIC_MOAN`.
`businessType`	string	Conditional	—	Business detection labels. Either `type` or `businessType` (or both) must be provided. See Business Detection Types. Combine multiple types with underscores. When detecting timbre, singing, or language, `GENDER` must be included.
`data`	object	Yes	1 MB	Request payload containing stream and user metadata. See `data` Object Parameters.
`callback`	string	Yes	—	URL that receives asynchronous moderation results. Supports HTTP and HTTPS.
`acceptLang`	string	No	—	Language for returned labels. `en` (default): English. `zh`: Chinese.

Detection Types

Values for the `type` field. Combine multiple values with underscores (e.g. `POLITY_EROTIC_DIRTY`).

Value	Description
`POLITY`	Politically sensitive content
`EROTIC`	Pornographic content
`ADVERT`	Advertising content
`BAN`	Prohibited content
`VIOLENT`	Violent or terrorism-related content
`MOAN`	Sexual moaning
`AUDIOPOLITICAL`	Voiceprint of top political leaders
`ANTHEN`	National anthem detection
`DIRTY`	Verbal abuse
`ADLAW`	Advertising-law violations
`SING`	Singing detection
`MINOR`	Minor speaker detection
`BANEDAUDIO`	Prohibited songs
`COPYRIGHTSONGS`	Copyrighted songs
`VOICE`	Human-voice attribute (e.g. synthesized / forged voice)

Business Detection Types

Values for the `businessType` field. Combine multiple values with underscores (e.g. `GENDER_TIMBRE_SING_LANGUAGE`).

Value	Description
`GENDER`	Speaker gender
`AGE`	Speaker age
`TIMBRE`	Speaker timbre
`SING`	Singing detection
`LANGUAGE`	Language identification
`VOICE`	Human-voice attribute
`AUDIOSCENE`	Audio scene
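
The combination rules for `type` and `businessType` can be checked before a request is sent. This is a small sketch under the rules stated in the Top-Level Parameters table; the helper name `build_type_fields` is hypothetical and it does not verify that each value is a documented one.

```python
def build_type_fields(risk_types=(), business_types=()):
    """Build the `type` / `businessType` request fields from lists of values."""
    if not risk_types and not business_types:
        raise ValueError("provide `type`, `businessType`, or both")
    # Timbre, singing, and language detection require GENDER in businessType.
    needs_gender = {"TIMBRE", "SING", "LANGUAGE"}
    if needs_gender & set(business_types) and "GENDER" not in business_types:
        raise ValueError("`GENDER` is required with TIMBRE, SING, or LANGUAGE")
    fields = {}
    if risk_types:
        fields["type"] = "_".join(risk_types)
    if business_types:
        fields["businessType"] = "_".join(business_types)
    return fields


print(build_type_fields(["POLITY", "EROTIC", "MOAN"], ["GENDER", "TIMBRE"]))
# {'type': 'POLITY_EROTIC_MOAN', 'businessType': 'GENDER_TIMBRE'}
```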

`data` Object Parameters

Parameter	Type	Required	Max Length	Description
`tokenId`	string	Yes	64	User account identifier. Pass the user ID here to enable behavioral risk detection.
`btId`	string	Yes	128	Unique audio identifier used to query a specific stream.
`streamType`	string	Yes	—	Stream source type. See Stream Types. When using an RTC SDK recording option (Agora, Zego, TRTC, Volc, Giants, Aliyun, NetEase Yunxin), additional recording fees may be charged on the RTC provider's side — please consult the provider for details.
`url`	string	Conditional	—	Live stream URL. Required when `streamType` is `NORMAL`.
`lang`	string	No	—	Audio stream language. Default: `zh`. See Supported Languages. Cluster-specific support: see the Request URL table.
`zegoParam`	object	Conditional	—	Zego pull configuration. Required when `streamType` is `ZEGO`. See `zegoParam` Object.
`agoraParam`	object	Conditional	—	Agora pull configuration. Required when `streamType` is `AGORA`. See `agoraParam` Object.
`trtcParam`	object	Conditional	—	TRTC pull configuration. Required when `streamType` is `TRTC`. See `trtcParam` Object.
`volcParam`	object	Conditional	—	Volcengine pull configuration. Required when `streamType` is `VOLC`. See `volcParam` Object.
`ginParam`	object	Conditional	—	Giants pull configuration. Required when `streamType` is `GIN`. See `ginParam` Object.
`aliParam`	object	Conditional	—	Aliyun pull configuration. Required when `streamType` is `ALI`. See `aliParam` Object.
`yunxinParam`	object	Conditional	—	NetEase Yunxin pull configuration. Required when `streamType` is `YUNXIN`. See `yunxinParam` Object.
`returnPreText`	int32	No	—	Whether to return the transcribed text of the segment immediately preceding a violating segment. `0` (default): do not return. `1`: return.
`returnPreAudio`	int32	No	—	Whether to return the audio URL of the segment immediately preceding a violating segment. `0` (default): do not return. `1`: return a 20-second clip combining the preceding and current segments.
`returnFinishInfo`	int32	No	—	Whether to push a stream-end callback. `0` (default): no end callback. `1`: send an end callback with `statCode` set. Recommended: `1` — without it no callback will be produced when the stream ends.
`returnAllText`	int32	No	—	Callback granularity. `0` (default): only push when violations are detected. `1`: push the latest 10-second result every 10 seconds regardless of risk level. Recommended: `1` — without it no callback will be produced during silent or risk-free periods.
`extra`	object	No	—	Auxiliary parameters.
`passThrough`	object	No	—	Client pass-through field. DeepCleer does not process this field; it is returned as-is in the callback.
`liveTitle`	string	No	—	Room title (used when human review is enabled).
`anchorName`	string	No	—	User nickname (used when human review is enabled).
`audioDetectStep`	int32	No	—	Segment-sampling step. Range `1`–`36`. When omitted, every segment is reviewed. `1` reviews every other segment (odd-numbered segments only), `2` reviews one of every three segments, and so on.
`receiveTokenId`	string	Conditional	64	Message receiver's `tokenId`. Alphanumeric with underscores and hyphens, up to 64 characters. Required when `eventId` is `message`.
`deviceId`	string	No	128	DeepCleer device fingerprint identifier, generated by the DeepCleer SDK for user behavior analysis.
`ip`	string	No	64	Client public IP address (IPv4 or IPv6) for IP-based user behavior analysis.
`level`	int32	No	—	User level for configuring different interception strategies. See User Levels.
`gender`	int32	No	—	User gender. `0`: unknown. `1`: male. `2`: female.
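
To make the `audioDetectStep` sampling concrete, the sketch below lists which segments would be reviewed for a given step. It assumes segments are numbered from 1 and that the first segment is always the reviewed one; the table only confirms this for step `1` (odd-numbered segments), so treat the offset for larger steps as an assumption. The function name is hypothetical.

```python
def reviewed_segments(step: int, total: int) -> list[int]:
    """Return the 1-based numbers of the segments reviewed out of `total`."""
    if not 1 <= step <= 36:
        raise ValueError("audioDetectStep must be between 1 and 36")
    # Assumption: segment 1 is reviewed, then `step` segments are skipped.
    return list(range(1, total + 1, step + 1))


print(reviewed_segments(1, 8))  # [1, 3, 5, 7]
print(reviewed_segments(2, 8))  # [1, 4, 7]
```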

Stream Types

Values for the `streamType` field.

Value	Description
`NORMAL`	Standard public URL pull. Supports `rtmp`, `rtmps`, `hls`, `http`, `https` protocols and `flv`, `m3u8` and similar formats.
`ZEGO`	Zego SDK recording
`AGORA`	Agora SDK recording
`TRTC`	Tencent TRTC recording
`VOLC`	Volcengine recording
`GIN`	Giants recording
`ALI`	Aliyun recording
`YUNXIN`	NetEase Yunxin
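
Each `streamType` value requires one companion field in `data`. The following sketch encodes that mapping as a pre-flight check; the helper name `missing_data_fields` is hypothetical and the check covers only the required fields named in the tables above, not length limits.

```python
# streamType value -> field in `data` that becomes required with it.
REQUIRED_FIELD = {
    "NORMAL": "url",
    "ZEGO": "zegoParam",
    "AGORA": "agoraParam",
    "TRTC": "trtcParam",
    "VOLC": "volcParam",
    "GIN": "ginParam",
    "ALI": "aliParam",
    "YUNXIN": "yunxinParam",
}


def missing_data_fields(data: dict) -> list[str]:
    """Return the names of required `data` fields that are absent or empty."""
    missing = [k for k in ("tokenId", "btId", "streamType") if not data.get(k)]
    stream_type = data.get("streamType")
    if stream_type:
        if stream_type not in REQUIRED_FIELD:
            raise ValueError(f"unsupported streamType: {stream_type}")
        companion = REQUIRED_FIELD[stream_type]
        if not data.get(companion):
            missing.append(companion)
    return missing


print(missing_data_fields({"tokenId": "2222", "btId": "test1", "streamType": "AGORA"}))
# ['agoraParam']
```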

Supported Languages

Values for the `lang` field. Default: `zh`.

Value	Language
`zh`	Chinese
`en`	English
`ar`	Arabic
`hi`	Hindi
`es`	Spanish
`fr`	French
`ru`	Russian
`pt`	Portuguese
`id`	Indonesian
`de`	German
`ja`	Japanese
`tr`	Turkish
`vi`	Vietnamese
`it`	Italian
`th`	Thai
`tl`	Filipino
`ko`	Korean
`ms`	Malay
`auto`	Automatic language detection (contact DeepCleer to enable)

User Levels

Value	Description
`0`	Lowest-level user (e.g., newly registered, completely inactive, or level-0 users)
`1`	Lower-level user (e.g., low activity or low-level users)
`2`	Mid-level user (e.g., moderately active or mid-level users)
`3`	Higher-level user (e.g., highly active or high-level users)
`4`	Highest-level user (e.g., paying users, VIP users)

`zegoParam` Object

Parameter	Type	Required	Description
`tokenId`	string	Yes	Zego `identify_token` used for login. See the Zego documentation. Each request must regenerate this token; it uniquely identifies the moderation request.
`streamId`	string	Conditional	Stream identifier (uniquely maps to one audio stream). At least one of `streamId` or `roomId` must be provided.
`roomId`	string	Conditional	Room identifier (uniquely maps to one room). At least one of `streamId` or `roomId` must be provided.
`isMixingEnabled`	boolean	No	Recording mode. `true`: mixed stream — all users in the room are merged into a single recorded stream. When both `streamId` and `roomId` are provided, `streamId` takes precedence. `false`: separated streams — each user is recorded individually. In this case `roomId` is required and `streamId` must not be provided.
`initDomain`	int32	Conditional	Required when the Zego client init uses an isolation domain or random `userId`. Values: `0` default version; `1` isolation domain only; `2` isolation domain + random userId; `3` SDK update with bug fixes; `4` custom SEI; `5` VAD silence detection (token uniqueness check, must regenerate per request); `6` per-stream submission control in room-scoped pull mode. Recommended: `6`. Default: `0`.
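
The interplay between `streamId`, `roomId`, and `isMixingEnabled` is easy to get wrong, so a client-side check can help. This is a sketch of the rules in the table above; `check_zego_param` is a hypothetical helper, and because the table does not state a default for `isMixingEnabled`, only an explicit `false` is treated as separated-stream mode.

```python
def check_zego_param(param: dict) -> list[str]:
    """Return a list of problems found in a `zegoParam` object."""
    problems = []
    if not param.get("tokenId"):
        problems.append("tokenId is required")
    has_stream = bool(param.get("streamId"))
    has_room = bool(param.get("roomId"))
    if not (has_stream or has_room):
        problems.append("provide streamId or roomId")
    # Separated streams: roomId is required and streamId must be absent.
    if param.get("isMixingEnabled") is False:
        if not has_room:
            problems.append("roomId is required for separated streams")
        if has_stream:
            problems.append("streamId must not be set for separated streams")
    return problems


print(check_zego_param({"tokenId": "t", "streamId": "s1", "isMixingEnabled": False}))
# ['roomId is required for separated streams', 'streamId must not be set for separated streams']
```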

`agoraParam` Object

Parameter	Type	Required	Description
`appId`	string	Yes	Agora-issued `appId`. Distinct from the DeepCleer `appId`.
`channel`	string	Yes	Agora channel name.
`token`	string	No	Optional Agora token for higher-security accounts. See the Agora documentation. Set the validity period longer than the channel duration to avoid expiry. Maximum Agora token validity is 24 hours; for longer channels enable `returnFinishInfo: 1` and watch for `auxInfo.errorCode = 3005` in the stream-end callback to know when to refresh the token.
`uid`	int32	Conditional	Unsigned 32-bit user ID. Required when `token` is provided and must match the `uid` used to generate the token. Must be different from any uid actually present in the room.
`isMixingEnabled`	boolean	No	`true` (default): mixed stream — one stream per room. `false`: separated streams — one stream per microphone slot.
`channelProfile`	int32	No	Channel profile. `0` (default): Communication (1-on-1 or group, all users may speak). `1`: Live broadcast (host / audience roles).
`subscribeMode`	string	No	Subscription mode. `AUTO` (default): subscribe to all streams in the room. `UNTRUSTED`: with `untrustedUserIdList`, subscribe only to the listed users (separated streams only). `TRUSTED`: with `trustedUserIdList`, subscribe only to users not in the list (separated streams only).
`trustedUserIdList`	array	No	Trusted user list. Active when `subscribeMode = TRUSTED`. May be empty. Each element is a stringified `uint32` value, e.g. `["123","456"]`.
`untrustedUserIdList`	array	No	Untrusted user list. Active when `subscribeMode = UNTRUSTED`. Must be non-empty. Each element is a stringified `uint32` value, e.g. `["123","456"]`.

`trtcParam` Object

Parameter	Type	Required	Description
`sdkAppId`	int32	Yes	Tencent-issued `sdkAppId`.
`demoSences`	int32	Yes	Recording type. `2`: separated stream recording. `4`: mixed stream recording. (Note: the source spelling `demoSences` is preserved for wire compatibility; this appears to be a typo of `demoScenes` and is a candidate for v5 cleanup.)
`userId`	string	Yes	Recording-side `userId`, up to 32 bytes. Allowed characters: `a-z`, `A-Z`, `0-9`, underscore, hyphen.
`userSig`	string	Yes	Verification signature for the recording `userId` (functions as the login password).
`roomId`	int32	Conditional	Numeric room ID (range `1`–`4294967294`). One of `roomId` or `strRoomId` must be provided. When both are present, `roomId` takes precedence.
`strRoomId`	string	Conditional	String room ID (allowed characters: `a-z`, `A-Z`, `0-9`, underscore, hyphen). One of `roomId` or `strRoomId` must be provided. When both are present, `roomId` takes precedence.
`uid`	string	No	Specific user ID to moderate. If omitted, all publishing users in the room are pulled and moderated. To moderate a subset of users, submit multiple requests with different recording-side `userId` / `userSig`. Distinct from the recording `userId`.

`volcParam` Object

Parameter	Type	Required	Description
`appId`	string	Yes	Volcengine-issued `appId`. Distinct from the DeepCleer `appId`.
`roomId`	string	Yes	Room number.
`token`	string	Yes	Volcengine authentication token. See the Volcengine documentation.
`userId`	string	Yes	Recording-side `userId`.
`subscribeMode`	string	No	Subscription mode. `AUTO` (default): subscribe to all streams in the room. `UNTRUSTED`: with `untrustedUserIdList`, subscribe only to listed users — list must be non-empty or the request fails. `TRUSTED`: with `trustedUserIdList`, subscribe only to users not in the list — if no qualifying user joins within a grace period, DeepCleer will end the moderation.
`trustedUserIdList`	array	No	Trusted user list. Active when `subscribeMode = TRUSTED`. May be empty.
`untrustedUserIdList`	array	No	Untrusted user list. Active when `subscribeMode = UNTRUSTED`. Must be non-empty.

`ginParam` Object

Parameter	Type	Required	Description
`tokenId`	string	Yes	Room token used by the pull endpoint to log in to the room. Provided by Giants.
`roomId`	string	Yes	Room number (uniquely maps to one room). The server pulls and records on a per-room basis.
`isMixingEnabled`	boolean	No	`true` (default): mixed stream — all users in the room merged into one stream. `false`: separated streams — each user recorded individually.
`ip`	string	Yes	Designated server IP address.
`port`	string	Yes	Designated port.

`aliParam` Object

Parameter	Type	Required	Description
`token`	string	Yes	Authentication token used by the pull endpoint to join the channel. See the Aliyun documentation. A new token must be generated for every moderation request.
`room`	string	Yes	Room ID. Non-empty, must exactly match the `channelID` used to generate the token. The server pulls and records on a per-room basis. The same `room` will not trigger duplicate pulls.
`userId`	string	Yes	Pull-bot user ID. Must exactly match the `userId` used to generate the token. Non-empty.
`isMixingEnabled`	boolean	No	`true` (default): mixed stream — all users in the room merged into one stream. `false`: separated streams — each user recorded individually.

`yunxinParam` Object

Parameter	Type	Required	Description
`token`	string	Yes	Authentication token used by the pull endpoint to join the channel. See the NetEase Yunxin documentation. A new token must be generated for every moderation request.
`cname`	string	Yes	Channel name. Non-empty, must exactly match the `cname` used to generate the token.
`uid`	int32	Yes	Pull-bot uid. Must exactly match the `uid` used to generate the token.
`appKey`	string	Yes	App key issued by NetEase Yunxin.

Response

The synchronous response is an acknowledgement only — it confirms that DeepCleer has accepted the moderation task. Per-segment moderation results are delivered asynchronously through the callback URL you provided.

Response Parameters

Note: Parameters other than `code`, `message`, and `requestId` are only guaranteed to be returned when `code` is `1100`.

Parameter	Type	Required	Description
`requestId`	string	Yes	Unique DeepCleer request identifier.
`code`	int32	Yes	Response code. See Response Codes.
`message`	string	Yes	Response message corresponding to the `code`.
`detail`	object	No	Detailed information for error scenarios. See `errorCode` and `dupRequestId` below.
`errorCode`	int32	No	Detailed status code. `1001`: duplicate stream submission.
`dupRequestId`	string	No	Returned when `errorCode` is `1001` (duplicate submission). If the original request response was lost but the stream has already entered moderation, the original `requestId` is unknown to the caller. Resubmit the same stream and use the returned `dupRequestId` to call the close-moderation endpoint.

Response Codes

Code	Message
`1100`	Success
`1901`	QPS or stream-count limit exceeded
`1902`	Invalid parameters
`1903`	Service failure
`1904`	Stream pull failure
`9101`	Unauthorized operation

Warning: The synchronous response field is named `errorCode` (camelCase). This differs from the Video Stream Moderation sync response, which uses lowercase `errorcode`. Both are documented as-returned and are candidates for casing alignment in the v5 cleanup.
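
One way to act on the synchronous acknowledgement is to reduce it to a small set of outcomes. The sketch below is an illustration only: the outcome names, the choice to treat `1901` and `1903` as retryable, and the lookup of `errorCode` / `dupRequestId` both inside `detail` and at the top level are all assumptions, since the exact nesting is not spelled out above.

```python
from typing import Optional, Tuple

# Assumption: rate-limit and service failures are worth retrying later.
RETRYABLE = {1901, 1903}


def interpret_ack(resp: dict) -> Tuple[str, Optional[str]]:
    """Map a synchronous response to (outcome, request id to keep)."""
    code = resp.get("code")
    if code == 1100:
        return "accepted", resp.get("requestId")
    detail = resp.get("detail")
    if not isinstance(detail, dict):
        detail = {}
    # Assumption: the fields may sit inside `detail` or at the top level.
    error_code = detail.get("errorCode", resp.get("errorCode"))
    if error_code == 1001:
        # Duplicate submission: keep the original task's id for a later close call.
        return "duplicate", detail.get("dupRequestId", resp.get("dupRequestId"))
    if code in RETRYABLE:
        return "retry", None
    return "failed", None


print(interpret_ack({"code": 1100, "message": "Success", "requestId": "b639"}))
# ('accepted', 'b639')
```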

Callback Mechanism

Once the stream pull stabilizes, DeepCleer continuously pushes per-segment moderation results to your callback URL. The push cadence depends on `returnAllText`:

`returnAllText = 0`: a callback is sent only when a segment is found to contain a violation.
`returnAllText = 1`: a callback is sent every 10 seconds covering the most recent 10-second segment, regardless of risk level.

Payloads are delivered as JSON in the HTTP request body.

Callback Parameters

Parameter	Type	Required	Description
`requestId`	string	Yes	Unique DeepCleer identifier for this stream segment.
`btId`	string	Yes	Client-side audio identifier (echoed from the request).
`code`	int32	Yes	Response code. `1100`: success. Other codes match the synchronous response. Fields other than `message` and `requestId` are only present when `code` is `1100`.
`message`	string	Yes	Response message corresponding to the `code`.
`statCode`	int32	No	Moderation lifecycle status. `0`: in progress (regular per-segment result). `1`: moderation finished (stream-end callback). Only present when `returnFinishInfo` is `1`. Note: the semantics here differ from the Video Stream Moderation API, where `statCode` `0` means a regular result and `1` means a stream-end callback at the same parameter location. Treat the two APIs as having distinct callback lifecycles even if the field name matches.
`requestParams`	object	Yes	Echo of the original request parameters.
`audioDetail`	object	No	Per-segment audio moderation result. Returned when `code` is `1100` and `statCode` is `0`. See `audioDetail` Object.
`auxInfo`	object	No	Stream-end auxiliary information. Returned when `statCode` is `1`. See Stream-End `auxInfo`.
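
A callback consumer usually needs to tell three cases apart: an error, a stream-end notice, and a regular per-segment result. The sketch below shows one way to branch on `code`, `statCode`, and `audioDetail.riskLevel`; the function name and the returned labels are hypothetical.

```python
def classify_callback(payload: dict) -> str:
    """Classify one callback payload into a coarse handling category."""
    if payload.get("code") != 1100:
        return "error"
    # statCode is only present when returnFinishInfo is 1; absent means a regular result.
    if payload.get("statCode") == 1:
        return "stream_end"
    detail = payload.get("audioDetail") or {}
    by_level = {"PASS": "pass", "REVIEW": "review", "REJECT": "reject"}
    return by_level.get(detail.get("riskLevel"), "unknown")


print(classify_callback({"code": 1100, "statCode": 0, "audioDetail": {"riskLevel": "REVIEW"}}))
# review
```

Branching on `riskLevel` rather than on `riskDescription` follows the table's advice that descriptions are for reference only.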

`audioDetail` Object

Parameter	Type	Required	Description
`audioUrl`	string	Yes	URL of the audio segment.
`riskLevel`	string	Yes	Disposition recommendation. `PASS`: normal. `REVIEW`: suspicious. `REJECT`: violation.
`riskLabel1`	string	Yes	Level 1 risk label. Returns `normal` when `riskLevel` is `PASS`.
`riskLabel2`	string	Yes	Level 2 risk label. Empty when `riskLevel` is `PASS`.
`riskLabel3`	string	Yes	Level 3 risk label. Empty when `riskLevel` is `PASS`.
`riskDescription`	string	Yes	Risk description. Returns "Normal" when `riskLevel` is `PASS`. Hits against custom lists return "Matched custom list". Otherwise format: "Level 1: Level 2: Level 3". For reference only — do not use for programmatic logic.
`audioText`	string	No	Transcribed text of the segment. When `returnPreText` is `1`, contains both the preceding and current segment text; when `0`, contains the current segment text only.
`preAudioUrl`	string	No	URL of a 20-second clip combining the preceding and current audio segments. Returned only when `returnPreAudio` is `1`.
`riskDetail`	object	No	Detailed risk information. Returned when `code` is `1100`. See `riskDetail` Object.
`auxInfo`	object	Yes	Auxiliary information for this segment. See Segment `auxInfo`.
`businessLabels`	array	No	Business labels for this audio segment (gender, timbre, singing, etc.). See `businessLabels` Array.
`allLabels`	array	No	All risk labels detected for this segment. See `allLabels` Array.
`tokenProfileLabels`	array	No	Account attribute labels. Returned only when the labeling service is enabled. See Token Labels.
`tokenRiskLabels`	array	No	Account risk labels. Returned only when the labeling service is enabled. See Token Labels.
`speakers`	array	No	Per-second speaker activity within this segment. See `speakers` Array. Currently only present in Agora mixed streams.
`vadCode`	int32	No	Silence flag for this segment. `0`: silent segment. `1`: non-silent segment.
`audioTags`	object	No	Legacy audio attribute labels (gender, timbre, language, singing). See `audioTags` Object. For new integrations prefer `businessLabels`.

`riskDetail` Object

Parameter	Type	Required	Description
`riskSource`	int32	Yes	Risk source. `1000`: no risk. `1001`: text risk. `1003`: audio risk.
`audioText`	string	No	Transcribed text used during moderation.
`matchedLists`	array	No	Matched custom list information. Returned only when a custom list is hit. See Matched Lists.
`riskSegments`	array	No	High-risk content segments. Present when political, terrorism, prohibited, competitive, or advertising-law content is detected. See Risk Segments.

Matched Lists

Parameter	Type	Required	Description
`name`	string	Yes	Name of the matched list.
`words`	array	Yes	Sensitive word details.
`words[].word`	string	Yes	The matched sensitive word.
`words[].position`	array	Yes	Position of the sensitive word (0-indexed).

Risk Segments

Parameter	Type	Required	Description
`segment`	string	No	High-risk content segment.
`position`	array	No	Position of the segment within the transcript (0-indexed).

Segment `auxInfo`

Parameter	Type	Required	Description
`audioStartTime`	string	Yes	Absolute start time of the violating content within the stream. (Note: this field uses uppercase `T` here. The standalone Audio Sync/Async/Query APIs return the same conceptual field as `audioStarttime` (lowercase `t`). The on-the-wire casing is preserved as-returned and is a candidate for v5 alignment.)
`audioEndTime`	string	Yes	Absolute end time of the violating content within the stream. (Same casing-inconsistency note as `audioStartTime`.)
`beginProcessTime`	int64	Yes	Processing start time. 13-digit Unix timestamp in milliseconds (UTC).
`finishProcessTime`	int64	Yes	Processing finish time. 13-digit Unix timestamp in milliseconds (UTC).
`userId`	int32	No	In-room user ID for the speaker. Present only for Agora separated streams. Distinct from the `agoraParam.uid` used for token generation.
`strUserId`	string	No	In-room user ID for the speaker. Present for separated streams of `ALI`, `TRTC`, `ZEGO`, `VOLC`, and `GIN`. Distinct from `trtcParam.uid` (TRTC separated) and `aliParam.userId` (Aliyun separated).
`room`	string	No	Room number.
`seiInfo`	array	No	SEI information. Contact DeepCleer to enable.
`passThrough`	object	No	Pass-through field. Same value as `data.extra.passThrough` from the request.
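
The two processing timestamps are millisecond Unix times, so latency is a plain subtraction. The sketch below uses the values from the Callback Example in this document; the helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def ms_to_utc(ms: int) -> datetime:
    """Convert a 13-digit millisecond Unix timestamp to an aware UTC datetime."""
    # Integer arithmetic avoids floating-point rounding of the millisecond part.
    return datetime.fromtimestamp(ms // 1000, tz=timezone.utc) + timedelta(milliseconds=ms % 1000)


def processing_latency_ms(aux_info: dict) -> int:
    """Return how long DeepCleer spent processing the segment, in milliseconds."""
    return aux_info["finishProcessTime"] - aux_info["beginProcessTime"]


aux = {"beginProcessTime": 1637847113897, "finishProcessTime": 1637847114514}
print(processing_latency_ms(aux))                      # 617
print(ms_to_utc(aux["beginProcessTime"]).isoformat())  # 2021-11-25T13:31:53.897000+00:00
```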

`businessLabels` Array

Each element in the array:

Parameter	Type	Required	Description
`businessLabel1`	string	No	Level 1 business label.
`businessLabel2`	string	No	Level 2 business label.
`businessLabel3`	string	No	Level 3 business label.
`businessDescription`	string	No	Business label description. Format: "Level 1: Level 2: Level 3". For reference only — do not use for programmatic logic.
`probability`	float	No	Confidence score (0–1).
`confidenceLevel`	int32	No	Confidence level (0–2). Higher values indicate greater confidence.

`allLabels` Array

Each element in the array:

Parameter	Type	Required	Description
`riskLabel1`	string	Yes	Level 1 risk label.
`riskLabel2`	string	Yes	Level 2 risk label.
`riskLabel3`	string	Yes	Level 3 risk label.
`riskDescription`	string	Yes	Risk reason. For reference only — do not use for programmatic logic.

`speakers` Array

Per-second sampling of speaker uid and volume for the audio segment, ordered chronologically. The outer array contains up to 10 elements, one per sampled second. Each element of the outer array is itself an array with one object per speaker active during that second.

Currently only populated in Agora mixed stream moderation.

Each speaker object:

Parameter	Type	Required	Description
`uid`	int32	Yes	In-room speaker uid.
`volume`	int32	Yes	Volume level. Range `0`–`255`.
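
As an illustration of the nested structure, the sketch below picks the loudest speaker for each sampled second, using the `speakers` value from the Callback Example in this document. The function name is hypothetical.

```python
def loudest_per_second(speakers: list) -> list:
    """Return the uid of the loudest speaker for each sampled second."""
    result = []
    for second in speakers:
        if not second:
            result.append(None)  # nobody was active during this second
            continue
        result.append(max(second, key=lambda s: s["volume"])["uid"])
    return result


sample = [
    [{"uid": 2, "volume": 100}, {"uid": 1, "volume": 255}, {"uid": 3, "volume": 50}],
    [{"uid": 2, "volume": 200}, {"uid": 3, "volume": 50}],
    [{"uid": 4, "volume": 255}],
]
print(loudest_per_second(sample))  # [1, 2, 4]
```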

`audioTags` Object

Legacy audio attribute labels. Returned when the corresponding type value is requested. For new integrations, prefer the businessLabels array; the structure here is preserved for backwards compatibility.

Parameter	Type	Required	Description
`gender`	object	No	Gender label. Returned when `type` includes `GENDER`. See Gender Label.
`timbre`	array	No	Timbre labels. Returned when `type` includes `TIMBRE`. See Timbre Labels.
`song`	int32	No	Singing label. Returned when `type` includes `SING`. `0`: no singing detected. `1`: singing detected.
`language`	array	No	Language labels, one entry per candidate language (the Callback Example returns this field as an array). Returned when `type` includes `LANGUAGE`. See Language Labels.

Gender Label

Parameter	Type	Required	Description
`label`	string	Yes	Gender label name. Possible values (Chinese as returned by the legacy API): `男性` (male), `女性` (female).
`probability`	int32	Yes	Confidence score on a `0`–`100` scale. Higher values indicate greater confidence. (Legacy 0–100 scale — modern endpoints use a 0–1 scale; this is a v5-cleanup candidate.)

Timbre Labels

Each element in the array:

Parameter	Type	Required	Description
`label`	string	Yes	Timbre category. Possible values (Chinese as returned by the legacy API): `大叔` (older male), `青年` (young male), `正太` (boy), `老年` (elderly), `女王` (mature woman), `御姐` (assertive woman), `少女` (young woman), `萝莉` (girl), `大妈` (older female).
`probability`	int32	Yes	Confidence score on a `0`–`100` scale. Higher values indicate greater confidence. (Legacy 0–100 scale — see Gender Label note.)

Language Labels

Parameter	Type	Required	Description
`label`	int32	Yes	Language category. See Language Codes.
`probability`	int32	Yes	Confidence score on a `0`–`100` scale. (Legacy 0–100 scale — see Gender Label note.)

Language Codes

Value	Language
`0`	Mandarin Chinese
`1`	English
`2`	Cantonese
`3`	Tibetan
`4`	Uyghur
`5`	Mongolian
`6`	Korean
`-1`	Other
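
The legacy language labels combine a numeric code with a 0-100 probability. The sketch below resolves the most probable entry to a language name and rescales the probability to 0-1. It accepts either a single object or an array, because the Callback Example in this document returns an array; the function name is hypothetical.

```python
LANGUAGE_NAMES = {
    0: "Mandarin Chinese", 1: "English", 2: "Cantonese", 3: "Tibetan",
    4: "Uyghur", 5: "Mongolian", 6: "Korean", -1: "Other",
}


def top_language(labels):
    """Return (language name, probability on a 0-1 scale) for the best entry."""
    if isinstance(labels, dict):
        labels = [labels]  # tolerate a single object as well as an array
    best = max(labels, key=lambda entry: entry["probability"])
    # Legacy probabilities are integers on a 0-100 scale.
    return LANGUAGE_NAMES.get(best["label"], "Other"), best["probability"] / 100


print(top_language([{"label": 0, "probability": 99}, {"label": 1, "probability": 0}]))
# ('Mandarin Chinese', 0.99)
```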

Token Labels

Both tokenProfileLabels and tokenRiskLabels share the same structure:

Parameter	Type	Required	Description
`label1`	string	No	Level 1 label.
`label2`	string	No	Level 2 label.
`label3`	string	No	Level 3 label.
`description`	string	No	Label description. For reference only — do not use for programmatic logic.
`timestamp`	int64	No	Label assignment time. 13-digit Unix timestamp in milliseconds (UTC).

Stream-End `auxInfo`

Returned only in the stream-end callback (statCode = 1). Indicates why the moderation task ended.

Parameter	Type	Required	Description
`errorCode`	int32	Yes	Stream-end error code. `3001`: stream URL access failure (e.g. HTTP 404 / 403). `3002`: invalid stream data (e.g. "Invalid data found when processing input"). `3003`: stream not found (e.g. Zego error `197612`). `3004`: stream returned no audio data. `3005`: pull token invalid or expired — refresh the token and resubmit (e.g. expired Agora token, invalid TRTC `userSig`).
`streamTime`	int64	No	Submitted stream duration. Returned in the final stream-end callback. When `audioDetectStep` is configured this value may differ from the actual stream length.

Note: These `auxInfo.errorCode` values (`3001`–`3005`) are specific to streaming pull failures. They are distinct from the `2003` / `2007` codes used by the standalone Audio Sync, Async, and Query APIs, which describe file-fetch and decode failures. Integrators using both interfaces should map the two namespaces separately.
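
A stream-end callback can be turned into a follow-up action by looking up `auxInfo.errorCode`. In the sketch below the action names are hypothetical labels, and codes outside `3001`–`3005` are mapped to `no_action` as an assumption, since this document does not describe other values.

```python
STREAM_END_ACTIONS = {
    3001: "check_stream_url",            # URL access failure, e.g. HTTP 404 / 403
    3002: "check_stream_format",         # invalid stream data
    3003: "check_stream_exists",         # stream not found
    3004: "check_audio_track",           # stream returned no audio data
    3005: "refresh_token_and_resubmit",  # pull token invalid or expired
}


def stream_end_action(payload: dict):
    """Return a follow-up action for a stream-end callback, or None otherwise."""
    if payload.get("statCode") != 1:
        return None  # not a stream-end callback
    code = (payload.get("auxInfo") or {}).get("errorCode")
    return STREAM_END_ACTIONS.get(code, "no_action")


print(stream_end_action({"code": 1100, "statCode": 1, "auxInfo": {"errorCode": 3005}}))
# refresh_token_and_resubmit
```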

Examples

Request Example

{
  "accessKey": "xxxxx",
  "appId": "default",
  "eventId": "default",
  "type": "EROTIC_ADVERT_POLITY_DIRTY",
  "businessType": "GENDER_TIMBRE_SING_LANGUAGE",
  "callback": "xxxxx",
  "data": {
    "btId": "test1",
    "lang": "zh",
    "room": "room2",
    "url": "xxxxx",
    "streamType": "NORMAL",
    "returnAllText": 1,
    "returnPreText": 1,
    "returnPreAudio": 1,
    "tokenId": "2222"
  }
}
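
A request like the one above could be submitted from Python with the standard library, as sketched below. The Singapore endpoint and the 5-second timeout come from this document; the `Content-Type` header and the helper names `build_request` and `submit` are assumptions, and the placeholder values (`xxxxx`) must be replaced with real ones.

```python
import json
import urllib.request

ENDPOINT = "http://api-audiostream-xjp.fengkongcloud.com/audiostream/v4"  # Singapore cluster


def build_request(payload: dict) -> urllib.request.Request:
    """Build the POST request carrying the JSON payload, without sending it."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json; charset=utf-8"},  # assumed header
        method="POST",
    )


def submit(payload: dict) -> dict:
    """Send the request with the recommended 5-second timeout and parse the reply."""
    with urllib.request.urlopen(build_request(payload), timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

The timeout covers only the synchronous acknowledgement; moderation results still arrive later at the `callback` URL.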

Synchronous Response Example

{
  "code": 1100,
  "message": "Success",
  "requestId": "b639042cbfe229359e672074762c5583"
}

Callback Example

{
  "requestId": "b639042cbfe229359e672074762c5583_2",
  "btId": "1637847086612",
  "code": 1100,
  "message": "Success",
  "audioDetail": {
    "audioTags": {
      "gender": {
        "label": "男性",
        "probability": 99
      },
      "language": [
        { "label": 0, "probability": 99 },
        { "label": 1, "probability": 0 }
      ],
      "song": 0,
      "timbre": [
        { "label": "大叔", "probability": 7 },
        { "label": "青年", "probability": 34 },
        { "label": "老年", "probability": 58 },
        { "label": "正太", "probability": 0 }
      ]
    },
    "audioText": "那就不好打吗？所以所以他小龙让掉也是合情合理。还要看这条，先锋啊，下一个节奏点可能就是个先锋，但这个先锋的时候，苏宁其实是可以打正面团战了，谢谢毛主任一直都",
    "audioUrl": "https://bj-voice-mp3-1251671073.cos.ap-beijing.myqcloud.com/MP3%2F20211125%2Fb639042cbfe229359e672074762c5583_2.mp3?...",
    "auxInfo": {
      "beginProcessTime": 1637847113897,
      "finishProcessTime": 1637847114514,
      "room": "test1"
    },
    "businessLabels": [
      {
        "businessDescription": "Minor: Minor: Minor",
        "businessLabel1": "minor",
        "businessLabel2": "weichengnianren",
        "businessLabel3": "weichengnianren",
        "confidenceLevel": 0,
        "probability": 0
      }
    ],
    "preAudioUrl": "https://bj-voice-mp3-1251671073.cos.ap-beijing.myqcloud.com/MP3%2F20211125%2Fb639042cbfe229359e672074762c5583_2_pre.mp3?...",
    "riskDescription": "Normal",
    "riskDetail": {
      "audioText": "那就不好打吗？所以所以他小龙让掉也是合情合理。还要看这条，先锋啊，下一个节奏点可能就是个先锋，但这个先锋的时候，苏宁其实是可以打正面团战了，谢谢毛主任一直都"
    },
    "riskLabel1": "normal",
    "riskLabel2": "normal",
    "riskLabel3": "normal",
    "riskLevel": "PASS",
    "speakers": [
      [
        { "uid": 2, "volume": 100 },
        { "uid": 1, "volume": 255 },
        { "uid": 3, "volume": 50 }
      ],
      [
        { "uid": 2, "volume": 200 },
        { "uid": 3, "volume": 50 }
      ],
      [
        { "uid": 4, "volume": 255 }
      ]
    ]
  }
}
