Audio Query

Poll this endpoint to query audio moderation results. Results are available for up to 3 days.

API Description

Audio result polling API. Pairs with the Async Audio API — instead of waiting for the callback delivery, your application calls this endpoint with the btId you submitted earlier and DeepCleer returns the latest moderation result for that clip. Useful in two scenarios: (1) as a primary delivery channel when your environment can't host a public callback URL, and (2) as a recovery channel when an earlier callback was dropped or your handler failed. Results are retained on DeepCleer's side for up to 3 days after the original submission; queries for older btId values return 1101 (Processing) or fail to find the record.

Requirements

| Item | Specification |
| --- | --- |
| Protocol | HTTP or HTTPS |
| Method | POST |
| Encoding | UTF-8 |
| Format | All request and response parameters use JSON |

Timeout Suggestion

  • Recommended timeout: 5 seconds
ℹ️

This is a fast lookup against DeepCleer's result store, not the original audio fetch — so response time is typically well under a second. The 5-second timeout exists as a safety margin for transient network conditions. If a clip is still being processed, the response returns code: 1101 (Processing) and you should retry after a short delay rather than holding the connection open.
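The retry guidance above can be sketched as a bounded polling loop. This is a minimal sketch, not an official client: the function names, the `Content-Type` header, the 2-second delay, and the 10-attempt cap are all assumptions chosen for illustration, and the URL is the Singapore entry from the Request URL table. The cap matters because 1101 is also described as a possible response for purged records, so an unbounded loop might never end.

```python
import json
import time
import urllib.request
from typing import Callable, Optional

# Taken from the Request URL table; swap in the cluster that applies to you.
QUERY_URL = "http://api-audio-xjp.fengkongcloud.com/query_audio/v4"


def http_fetch(payload: dict, timeout: float = 5.0) -> dict:
    """POST the JSON payload with the recommended 5-second timeout."""
    request = urllib.request.Request(
        QUERY_URL,
        data=json.dumps(payload).encode("utf-8"),
        # Assumption: the service expects a JSON content type.
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def poll_audio_result(
    access_key: str,
    bt_id: str,
    fetch: Callable[[dict], dict] = http_fetch,
    max_attempts: int = 10,       # hypothetical cap, tune for your clips
    delay_seconds: float = 2.0,   # hypothetical delay between attempts
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Return the first response whose code is not 1101, or None if every
    attempt still reported Processing."""
    payload = {"accessKey": access_key, "btId": bt_id}
    for attempt in range(max_attempts):
        result = fetch(payload)
        if result.get("code") != 1101:
            return result
        # Each request is short-lived; wait between attempts instead of
        # holding a connection open.
        if attempt < max_attempts - 1:
            sleep(delay_seconds)
    return None
```

The `fetch` and `sleep` parameters are injectable so the loop can be exercised without network access or real delays.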


Request

Request URL

| Cluster | Request URL |
| --- | --- |
| Silicon Valley | http://api-audio-sh.fengkongcloud.com/query_audio/v4 |
| Singapore | http://api-audio-xjp.fengkongcloud.com/query_audio/v4 |

Request Parameters

| Parameter | Type | Required | Max Length | Description |
| --- | --- | --- | --- | --- |
| accessKey | string | Yes | 20 | API authentication key. The default accessKey is sent in your onboarding email. |
| btId | string | Yes | 128 | Client-side audio identifier originally submitted to the Async Audio API. Used to look up the moderation result for that clip. |
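Checking the two parameters on the client side avoids a round trip that would likely end in 1902 (Invalid parameters). A minimal sketch follows; `build_query_payload` is a hypothetical helper name, and it assumes Max Length counts characters, which the table does not state.

```python
def build_query_payload(access_key: str, bt_id: str) -> dict:
    """Build the request body, enforcing the documented maximum lengths."""
    # Assumption: the limits apply to character counts rather than bytes.
    if not access_key or len(access_key) > 20:
        raise ValueError("accessKey must be 1 to 20 characters")
    if not bt_id or len(bt_id) > 128:
        raise ValueError("btId must be 1 to 128 characters")
    return {"accessKey": access_key, "btId": bt_id}
```

For example, `build_query_payload("YOUR_ACCESS_KEY", "1604311839040")` returns the body shown in the Request Example.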

Response

The response payload mirrors the async callback payload — querying for a successfully moderated clip returns the same fields you would have received via callback.

Response Parameters

ℹ️

Parameters other than code, message, and requestId are only guaranteed to be returned when code is 1100.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| requestId | string | Yes | Unique DeepCleer request identifier for this query. |
| btId | string | Yes | Client-side audio identifier echoed back from the request. |
| code | int32 | Yes | Response code. See Response Codes. |
| message | string | Yes | Response message corresponding to the code. |
| riskLevel | string | Yes | Overall disposition recommendation. PASS: normal (allow). REVIEW: suspicious (route to manual review). REJECT: violation (block). During initial integration, we recommend tuning your interception thresholds before using this value for hard blocks. |
| audioText | string | Yes | Full audio-to-text transcription result. |
| audioTime | int32 | Yes | Total audio duration in seconds. |
| audioDetail | array | Yes | Per-segment moderation results. See audioDetail Array. |
| audioTags | object | No | Legacy audio tags: gender, timbre, and singing detection. New integrations should use businessLabels inside audioDetail instead. See audioTags Object. |
| requestParams | object | Yes | Echo of all fields submitted under data in the original async request. Useful for correlating the polled result with the original payload. |
| auxInfo | object | No | Auxiliary information. See auxInfo Object. |
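Because only code, message, and requestId are guaranteed outside of code 1100, a consumer should read the remaining fields defensively. The sketch below maps riskLevel to the three dispositions described in the table. `summarize_result`, the action strings, and the choice to fall back to manual review when riskLevel is missing or unrecognized are all assumptions made for illustration.

```python
from typing import Optional

# Dispositions follow the riskLevel description; the strings are hypothetical.
ACTIONS = {"PASS": "allow", "REVIEW": "manual_review", "REJECT": "block"}


def summarize_result(response: dict) -> Optional[dict]:
    """Return a compact summary for a finished clip, or None when the
    response is not a success (code other than 1100)."""
    if response.get("code") != 1100:
        return None
    risk_level = response.get("riskLevel")
    return {
        "btId": response.get("btId"),
        "riskLevel": risk_level,
        # Assumption: an unexpected value is safer in manual review than
        # silently allowed or blocked.
        "action": ACTIONS.get(risk_level, "manual_review"),
        "segmentCount": len(response.get("audioDetail") or []),
        "durationSeconds": response.get("audioTime"),
    }
```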

Response Codes

| Code | Message |
| --- | --- |
| 1100 | Success |
| 1101 | Processing |
| 1901 | QPS limit exceeded |
| 1902 | Invalid parameters |
| 1903 | Service failure |
| 1904 | Download failure |
| 1905 | Decoding failure |
| 9100 | Insufficient balance |
| 9101 | Unauthorized operation |
ℹ️

1101 (Processing) means the clip identified by btId was accepted but moderation has not yet finished. Wait briefly and retry — do not treat it as a final state. Records older than 3 days may also return this code if the result has been purged from the store.
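One way to keep 1101 from being mistaken for a final state is to classify every response into three states before acting on it. This is a sketch under assumptions: `classify_response` and the state names are hypothetical, and every code other than 1100 and 1101 is treated as an error without deciding which ones are worth retrying.

```python
# Messages copied from the Response Codes table.
RESPONSE_CODES = {
    1100: "Success",
    1101: "Processing",
    1901: "QPS limit exceeded",
    1902: "Invalid parameters",
    1903: "Service failure",
    1904: "Download failure",
    1905: "Decoding failure",
    9100: "Insufficient balance",
    9101: "Unauthorized operation",
}


def classify_response(response: dict) -> tuple:
    """Return (state, message), where state is 'success', 'pending', or 'error'."""
    code = response.get("code")
    # Prefer the server's message; fall back to the documented text.
    message = response.get("message") or RESPONSE_CODES.get(code, "Unknown code")
    if code == 1100:
        return "success", message
    if code == 1101:
        return "pending", message
    return "error", message
```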

audioDetail Array

Each element represents one audio segment:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| requestId | string | Yes | Unique identifier for this audio segment. |
| audioStarttime | float | Yes | Segment start time relative to the audio beginning, in seconds. |
| audioEndtime | float | Yes | Segment end time relative to the audio beginning, in seconds. |
| audioUrl | string | Yes | Audio segment URL (MP3 format). |
| riskLevel | string | Yes | Segment risk level. PASS: normal. REVIEW: suspicious. REJECT: violation. |
| riskLabel1 | string | Yes | Level 1 risk label. Returns normal when riskLevel is PASS. |
| riskLabel2 | string | Yes | Level 2 risk label. Empty when riskLevel is PASS. |
| riskLabel3 | string | Yes | Level 3 risk label. Empty when riskLevel is PASS. |
| riskDescription | string | Yes | Risk description. Returns "Normal" when riskLevel is PASS. Format: "Level 1: Level 2: Level 3". For reference only; do not use for programmatic logic. |
| riskDetail | object | No | Risk detail for this segment. See Segment riskDetail. |
| allLabels | array | No | All risk labels matched in this segment. See Segment allLabels. |
| businessLabels | array | No | All business labels matched in this segment. See Segment businessLabels. |

Note on field casing: audioStarttime and audioEndtime are spelled with a lowercase t in "time" on the wire. Match these names exactly when parsing the response.
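The per-segment fields can be used to pull out only the segments that need attention. The sketch below branches on riskLevel and the three label fields rather than on riskDescription, which the table marks as for reference only. `flagged_segments` and the output key names are hypothetical.

```python
def flagged_segments(response: dict) -> list:
    """Collect segments whose riskLevel is REVIEW or REJECT."""
    flagged = []
    for segment in response.get("audioDetail") or []:
        if segment.get("riskLevel") not in ("REVIEW", "REJECT"):
            continue
        flagged.append({
            # Wire names keep the lowercase t in "time".
            "start": segment.get("audioStarttime"),
            "end": segment.get("audioEndtime"),
            "riskLevel": segment.get("riskLevel"),
            "labels": (
                segment.get("riskLabel1"),
                segment.get("riskLabel2"),
                segment.get("riskLabel3"),
            ),
            "audioUrl": segment.get("audioUrl"),
        })
    return flagged
```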

Segment allLabels

Each element in the allLabels array:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| riskLabel1 | string | Yes | Level 1 risk label. |
| riskLabel2 | string | Yes | Level 2 risk label. |
| riskLabel3 | string | Yes | Level 3 risk label. |
| riskDescription | string | Yes | Risk description. For reference only; do not use for programmatic logic. |
| riskLevel | string | Yes | Risk level: PASS, REVIEW, or REJECT. |
| probability | float | No | Confidence score (0–1). Higher values indicate greater confidence. |
| riskDetail | object | No | Risk detail. Same structure as Segment riskDetail. |

Segment riskDetail

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| audioText | string | No | Audio-to-text transcription result for this segment. |
| riskSource | int32 | No | Risk source: 1000 (no risk), 1001 (text risk), 1003 (audio risk). |
| matchedLists | array | No | Matched custom list information. Returned only when a custom list is hit. See Matched Lists. |
| riskSegments | array | No | High-risk content segments. Present when political, terrorism, prohibited, competitive, or advertising-law content is detected. See Risk Segments. |

Matched Lists

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Name of the matched list. |
| words | array | Yes | Sensitive word details. |
| words[].word | string | Yes | The matched sensitive word. |
| words[].position | array | Yes | Position of the sensitive word. |

Risk Segments

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| segment | string | No | High-risk content segment text. |
| position | array | No | Position of the segment (0-indexed). |
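Since every field in riskDetail is optional, it helps to flatten the object with defaults before handing it to reviewers. A minimal sketch: `summarize_risk_detail` and its output keys are hypothetical, and the position arrays are ignored because their exact layout is not described here.

```python
# Values copied from the riskSource description.
RISK_SOURCES = {1000: "no risk", 1001: "text risk", 1003: "audio risk"}


def summarize_risk_detail(risk_detail: dict) -> dict:
    """Flatten a segment's riskDetail into source, matched words, and segments."""
    matched = {}
    for entry in risk_detail.get("matchedLists") or []:
        words = [w["word"] for w in entry.get("words") or [] if w.get("word")]
        # Group matched words under the name of the custom list they hit.
        matched.setdefault(entry.get("name"), []).extend(words)
    segments = [
        s["segment"]
        for s in risk_detail.get("riskSegments") or []
        if s.get("segment")
    ]
    return {
        "source": RISK_SOURCES.get(risk_detail.get("riskSource"), "unknown"),
        "matchedWords": matched,
        "segments": segments,
    }
```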

Segment businessLabels

Each element in the businessLabels array:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| businessLabel1 | string | Yes | Level 1 business label. |
| businessLabel2 | string | Yes | Level 2 business label. |
| businessLabel3 | string | Yes | Level 3 business label. |
| businessDescription | string | Yes | Business label description. Format: "Level 1: Level 2: Level 3". |
| confidenceLevel | int32 | No | Confidence level (0–2). Higher values indicate greater confidence. |
| probability | float | No | Confidence score (0–1). |
| businessDetail | object | No | Detailed information. Reserved field. |
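Business labels can be filtered by their 0–1 probability before they drive any product behavior. The sketch below is illustrative only: `confident_business_labels` is a hypothetical helper, the 0.8 default threshold is an arbitrary assumption, and labels that omit probability are skipped because the field is optional.

```python
def confident_business_labels(segment: dict, min_probability: float = 0.8) -> list:
    """Return (label1, label2, label3) tuples whose probability meets the threshold."""
    selected = []
    for label in segment.get("businessLabels") or []:
        probability = label.get("probability")
        # probability is optional; without it there is nothing to compare.
        if probability is not None and probability >= min_probability:
            selected.append((
                label.get("businessLabel1"),
                label.get("businessLabel2"),
                label.get("businessLabel3"),
            ))
    return selected
```

Applied to the first segment of the Response Example (probability about 0.858), the default threshold keeps the single singing label, while a threshold of 0.9 drops it.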

audioTags Object

⚠️

Legacy compatibility field. New integrations should consume businessLabels inside audioDetail instead.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| gender | object | No | Gender detection result. |
| gender.label | string | Yes | Gender label name (e.g., Male, Female). |
| gender.probability | int32 | No | Gender probability on a legacy 0–100 scale (higher values indicate greater likelihood). Note that other probability fields in this API use the modern 0–1 scale. |
| timbre | array | No | Voice timbre detection results. Each element contains label and probability. Possible label values: Uncle, Young Man, Boy, Elderly Man, Queen, Mature Woman, Young Woman, Loli, Middle-aged Woman, Male, Female, No Voice. |
| language | array | No | Language detection results. Each element contains label (see Language Labels) and probability (modern responses) or confidence (legacy responses). |

Language Labels

| Label | Description |
| --- | --- |
| 0 | Mandarin Chinese |
| 1 | English |
| 2 | Cantonese |
| 3 | Tibetan |
| 4 | Uyghur |
| 5 | Mongolian |
| 6 | Korean |
| -1 | Other languages |
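Because a language entry may carry either probability or confidence, a reader of this legacy field has to accept both keys. The sketch below picks the highest-scoring entry and resolves its label through the table above. `top_language` is a hypothetical helper, and the score is returned unchanged because the scale of the language score is not stated here.

```python
from typing import Optional, Tuple

# Copied from the Language Labels table.
LANGUAGE_NAMES = {
    0: "Mandarin Chinese",
    1: "English",
    2: "Cantonese",
    3: "Tibetan",
    4: "Uyghur",
    5: "Mongolian",
    6: "Korean",
    -1: "Other languages",
}


def top_language(audio_tags: dict) -> Optional[Tuple[str, float]]:
    """Return (language name, raw score) for the best-scoring entry, or None."""
    best = None
    for entry in audio_tags.get("language") or []:
        # Modern responses use "probability"; legacy ones use "confidence".
        score = entry.get("probability", entry.get("confidence"))
        if score is None:
            continue
        if best is None or score > best[1]:
            best = (entry.get("label"), score)
    if best is None:
        return None
    label, score = best
    return LANGUAGE_NAMES.get(label, "Unknown"), score
```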

auxInfo Object

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| errorCode | int32 | No | Processing-stage error code. 2003: audio download failure. 2007: no valid audio data to moderate. |

Examples

Request Example

{
  "accessKey": "YOUR_ACCESS_KEY",
  "btId": "1604311839040"
}

Response Example

{
  "requestId": "6a9cb980346dfea41111656a514e9109",
  "btId": "1604311839040",
  "code": 1100,
  "message": "Success",
  "riskLevel": "PASS",
  "audioDetail": [
    {
      "requestId": "6a9cb980346dfea41111656a514e9109_a0000",
      "audioStarttime": 0,
      "audioEndtime": 10,
      "audioUrl": "http://example.com/audio_segment_a0000.mp3",
      "businessLabels": [
        {
          "businessDescription": "Singing: Singing: Singing",
          "businessLabel1": "sing",
          "businessLabel2": "changge",
          "businessLabel3": "changge",
          "confidenceLevel": 2,
          "probability": 0.858334402569294
        }
      ],
      "allLabels": [],
      "riskLevel": "PASS",
      "riskLabel1": "normal",
      "riskLabel2": "",
      "riskLabel3": "",
      "riskDescription": "Normal",
      "riskDetail": {
        "audioText": ""
      }
    },
    {
      "requestId": "6a9cb980346dfea41111656a514e9109_a0001",
      "audioStarttime": 10,
      "audioEndtime": 20,
      "audioUrl": "http://example.com/audio_segment_a0001.mp3",
      "riskLevel": "PASS",
      "riskLabel1": "normal",
      "riskLabel2": "",
      "riskLabel3": "",
      "riskDescription": "Normal",
      "riskDetail": {
        "audioText": ""
      }
    },
    {
      "requestId": "6a9cb980346dfea41111656a514e9109_a0002",
      "audioStarttime": 20,
      "audioEndtime": 30,
      "audioUrl": "http://example.com/audio_segment_a0002.mp3",
      "riskLevel": "PASS",
      "riskLabel1": "normal",
      "riskLabel2": "",
      "riskLabel3": "",
      "riskDescription": "Normal",
      "riskDetail": {
        "audioText": ""
      }
    }
  ],
  "audioTags": {
    "gender": {
      "label": "Female",
      "probability": 95
    },
    "language": [
      {
        "confidence": 0,
        "label": 2
      },
      {
        "confidence": 99,
        "label": 0
      },
      {
        "confidence": 0,
        "label": 1
      }
    ],
    "song": 0,
    "timbre": [
      {
        "label": "Female",
        "probability": 95
      },
      {
        "label": "Queen",
        "probability": 12
      },
      {
        "label": "Mature Woman",
        "probability": 37
      },
      {
        "label": "Young Woman",
        "probability": 56
      },
      {
        "label": "Middle-aged Woman",
        "probability": 67
      },
      {
        "label": "Loli",
        "probability": 24
      }
    ]
  }
}