PII and PCI Redaction

Redaction lets you identify and mask sensitive information in transcriptions to protect privacy and comply with data protection regulations. The Pulse STT API supports two types of redaction: PII (Personally Identifiable Information) and PCI (Payment Card Information).

Enabling Redaction

Add the redact_pii and/or redact_pci query parameters to your WebSocket connection URL. Each parameter accepts true or false and defaults to false.

Real-Time WebSocket API

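// Assumes Node.js with the "ws" package; the browser WebSocket
// constructor does not accept a headers option.
import WebSocket from "ws";

// Assumption: the API key is read from an environment variable of your choosing.
const API_KEY = process.env.SMALLEST_API_KEY;

const url = new URL("wss://api.smallest.ai/waves/v1/pulse/get_text");
url.searchParams.append("language", "en");
url.searchParams.append("encoding", "linear16");
url.searchParams.append("sample_rate", "16000");
// Enable both redaction types for this session.
url.searchParams.append("redact_pii", "true");
url.searchParams.append("redact_pci", "true");

const ws = new WebSocket(url.toString(), {
  headers: {
    Authorization: `Bearer ${API_KEY}`,
  },
});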
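
Because both flags default to false, the URL only needs to carry them when redaction is wanted. The following is a minimal sketch of that idea, not part of any official SDK: buildPulseUrl and its option names (redactPii, redactPci, sampleRate) are hypothetical, and only the endpoint and query parameter names come from the example above.

```javascript
// Hypothetical helper: builds the Pulse WebSocket URL and adds the
// redaction flags only when they are requested.
function buildPulseUrl({
  language = "en",
  encoding = "linear16",
  sampleRate = 16000,
  redactPii = false,
  redactPci = false,
} = {}) {
  const url = new URL("wss://api.smallest.ai/waves/v1/pulse/get_text");
  url.searchParams.set("language", language);
  url.searchParams.set("encoding", encoding);
  url.searchParams.set("sample_rate", String(sampleRate));
  // Both flags default to false, so they are omitted unless enabled.
  if (redactPii) url.searchParams.set("redact_pii", "true");
  if (redactPci) url.searchParams.set("redact_pci", "true");
  return url.toString();
}

console.log(buildPulseUrl({ redactPii: true, redactPci: true }));
```

Keeping the flags in one place makes it harder to open a session that silently skips redaction.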

Redaction Types

PII Redaction (redact_pii)

When redact_pii=true is set, the following types of personally identifiable information are automatically identified and redacted:

  • Names: First names and surnames
  • Addresses: Street addresses and locations
  • Phone numbers: Various phone number formats

Redacted PII items are replaced with placeholder tokens like [FIRSTNAME_1], [FIRSTNAME_2], [PHONENUMBER_1], etc.

PCI Redaction (redact_pci)

When redact_pci=true is set, the following types of payment card information are automatically identified and redacted:

  • Credit card numbers: 16-digit credit/debit card numbers
  • CVV codes: Card verification values
  • ZIP codes: Postal/ZIP codes
  • Account numbers: Bank account numbers

Redacted PCI items are replaced with placeholder tokens like [CREDITCARDCVV_1], [ZIPCODE_1], [ACCOUNTNUMBER_1], etc.

Output Format

When redaction is enabled, the transcription text contains placeholder tokens instead of the original sensitive information. The response also includes a redacted_entities array listing all the redacted entity placeholders.

Sample Response with Redaction

{
  "session_id": "sess_12345abcde",
  "transcript": "[CREDITCARDCVV_1] and expiry [TIME_2] slash 34.",
  "is_final": true,
  "is_last": true,
  "language": "en",
  "languages": ["en"],
  "redacted_entities": [
    "[CREDITCARDCVV_1]",
    "[TIME_2]"
  ]
}

Response Fields

| Field | Type | When Included | Description |
|---|---|---|---|
| redacted_entities | array | redact_pii=true or redact_pci=true | List of redacted entity placeholders (e.g., [FIRSTNAME_1], [CREDITCARDCVV_1]) |
| transcript | string | Always | Transcription text with redacted entities replaced by placeholder tokens |
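
Since redacted_entities is only included when redaction is enabled, client code may want to treat a missing field as an empty list. A minimal sketch of such a handler follows; parseTranscriptMessage is a hypothetical name, and the message shape is assumed to match the sample response above.

```javascript
// Hypothetical handler: normalizes one JSON message from the socket.
function parseTranscriptMessage(raw) {
  const msg = JSON.parse(raw);
  return {
    transcript: msg.transcript,
    isFinal: Boolean(msg.is_final),
    // The field is absent when neither redaction flag is set,
    // so fall back to an empty array.
    redactedEntities: Array.isArray(msg.redacted_entities)
      ? msg.redacted_entities
      : [],
  };
}

// Usage with a message shaped like the sample response.
const sample = JSON.stringify({
  transcript: "[CREDITCARDCVV_1] and expiry [TIME_2] slash 34.",
  is_final: true,
  redacted_entities: ["[CREDITCARDCVV_1]", "[TIME_2]"],
});
console.log(parseTranscriptMessage(sample).redactedEntities.length);
```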

Redaction Placeholder Format

Redacted entities are replaced with placeholder tokens of the form [ENTITYTYPE_N]:

  • ENTITYTYPE indicates the type of information (e.g., FIRSTNAME, PHONENUMBER, CREDITCARDCVV, ZIPCODE, ACCOUNTNUMBER)
  • N is a sequential number, starting from 1, that uniquely identifies each instance

Examples:

  • [FIRSTNAME_1], [FIRSTNAME_2] - First names
  • [PHONENUMBER_1] - Phone numbers
  • [CREDITCARDCVV_1] - Credit card CVV codes
  • [ZIPCODE_1] - ZIP/Postal codes
  • [ACCOUNTNUMBER_1] - Account numbers
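
The placeholder pattern can be matched with a regular expression. The sketch below assumes that entity type names consist only of uppercase letters, which holds for the examples on this page but is not a documented guarantee; findPlaceholders is a hypothetical helper name.

```javascript
// Assumption: ENTITYTYPE is uppercase letters only and N is a decimal integer.
const PLACEHOLDER = /\[([A-Z]+)_(\d+)\]/g;

// Hypothetical helper: lists every placeholder found in a transcript.
function findPlaceholders(transcript) {
  return Array.from(transcript.matchAll(PLACEHOLDER), (m) => ({
    token: m[0],          // full token, e.g. "[FIRSTNAME_1]"
    type: m[1],           // entity type, e.g. "FIRSTNAME"
    index: Number(m[2]),  // instance number
  }));
}

console.log(findPlaceholders("Call [FIRSTNAME_1] at [PHONENUMBER_1]."));
```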

For the highest level of protection and effective compliance auditing, enable both redact_pii=true and redact_pci=true flags in your request.

Additionally, use the redacted_entities array in the response as an audit trail to track what data has been redacted from each transcript.
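
One way to turn redacted_entities into an audit record is to tally how many items of each type were redacted per transcript. The following sketch is illustrative only: countByType is a hypothetical name, and the "UNKNOWN" bucket is an assumption for tokens that do not match the [ENTITYTYPE_N] pattern.

```javascript
// Hypothetical helper: counts redacted entities by type for audit logging.
function countByType(redactedEntities) {
  const counts = {};
  for (const token of redactedEntities) {
    const match = /^\[([A-Z]+)_\d+\]$/.exec(token);
    // Tokens that do not fit the documented pattern are grouped separately.
    const type = match ? match[1] : "UNKNOWN";
    counts[type] = (counts[type] || 0) + 1;
  }
  return counts;
}

console.log(countByType(["[FIRSTNAME_1]", "[FIRSTNAME_2]", "[PHONENUMBER_1]"]));
```

Storing only these counts alongside the session ID keeps the audit log itself free of sensitive values.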

Compliance and Privacy

Redaction helps with compliance requirements for:

  • HIPAA: Health Insurance Portability and Accountability Act (healthcare data)
  • GDPR: General Data Protection Regulation (EU data protection)
  • CCPA: California Consumer Privacy Act (California data protection)
  • PCI DSS: Payment Card Industry Data Security Standard (payment card data)
  • SOC 2: System and Organization Controls (security and privacy)

Note: Redaction is a tool to help protect sensitive information, but it should be used as part of a comprehensive data protection strategy. Always consult with legal and compliance teams to ensure your implementation meets regulatory requirements.