---
title: PII and PCI Redaction
description: Automatically redact sensitive information from transcriptions
---
Redaction identifies and masks sensitive information in transcriptions to help protect privacy and comply with data protection regulations. The Pulse STT API supports two types of redaction: PII (Personally Identifiable Information) and PCI (Payment Card Information).
## Enabling Redaction
Add the `redact_pii` and/or `redact_pci` query parameters to your WebSocket connection URL. Each parameter accepts `true` or `false` and defaults to `false`.
### Real-Time WebSocket API
```javascript
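// Assumes the Node.js `ws` package; the browser WebSocket constructor
// does not accept custom headers.
import WebSocket from "ws";

// API_KEY is assumed to be defined elsewhere (for example, loaded from your environment).
const url = new URL("wss://waves-api.smallest.ai/api/v1/pulse/get_text");
url.searchParams.append("language", "en");
url.searchParams.append("encoding", "linear16");
url.searchParams.append("sample_rate", "16000");
// Enable both redaction types; each defaults to "false" when omitted.
url.searchParams.append("redact_pii", "true");
url.searchParams.append("redact_pci", "true");

const ws = new WebSocket(url.toString(), {
  headers: {
    Authorization: `Bearer ${API_KEY}`,
  },
});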
```
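Because both flags are plain query parameters, the connection URL can be assembled by a small helper. The following is a minimal sketch: `buildRedactionUrl` and its option names are hypothetical and not part of the API; only the endpoint and the query parameter names come from the example above.

```javascript
// Hypothetical helper (not part of the Pulse STT API): builds the
// connection URL with the redaction flags set from boolean options.
function buildRedactionUrl({ redactPii = false, redactPci = false } = {}) {
  const url = new URL("wss://waves-api.smallest.ai/api/v1/pulse/get_text");
  url.searchParams.set("language", "en");
  url.searchParams.set("encoding", "linear16");
  url.searchParams.set("sample_rate", "16000");
  // Query parameter values are strings, so booleans are serialized explicitly.
  url.searchParams.set("redact_pii", String(redactPii));
  url.searchParams.set("redact_pci", String(redactPci));
  return url.toString();
}
```

Calling `buildRedactionUrl({ redactPii: true })` would produce a URL that enables PII redaction only and leaves `redact_pci` at `false`.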
## Redaction Types
### PII Redaction (`redact_pii`)
When `redact_pii=true` is set, the following types of personally identifiable information are automatically identified and redacted:
* **Names**: First names and surnames
* **Addresses**: Street addresses and locations
* **Phone numbers**: Various phone number formats
Redacted PII items are replaced with placeholder tokens such as `[FIRSTNAME_1]`, `[FIRSTNAME_2]`, and `[PHONENUMBER_1]`.
### PCI Redaction (`redact_pci`)
When `redact_pci=true` is set, the following types of payment card information are automatically identified and redacted:
* **Credit card numbers**: 16-digit credit/debit card numbers
* **CVV codes**: Card verification values
* **ZIP codes**: Postal/ZIP codes
* **Account numbers**: Bank account numbers
Redacted PCI items are replaced with placeholder tokens such as `[CREDITCARDCVV_1]`, `[ZIPCODE_1]`, and `[ACCOUNTNUMBER_1]`.
## Output Format
When redaction is enabled, the transcription text contains placeholder tokens instead of the original sensitive information. The response also includes a `redacted_entities` array listing all the redacted entity placeholders.
### Sample Response with Redaction
```json
{
  "session_id": "sess_12345abcde",
  "transcript": "[CREDITCARDCVV_1] and expiry [TIME_2] slash 34.",
  "is_final": true,
  "is_last": true,
  "full_transcript": "Hi, my name is [FIRSTNAME_1] [FIRSTNAME_2] You can reach me at [PHONENUMBER_1] and I paid using my Visa card [ZIPCODE_1] [ACCOUNTNUMBER_1] with [CREDITCARDCVV_1] and expiry [TIME_1].",
  "language": "en",
  "languages": ["en"],
  "redacted_entities": [
    "[CREDITCARDCVV_1]",
    "[TIME_2]"
  ]
}
```
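A response like the one above can be unpacked with a few lines of code. This is a minimal sketch: `readRedactedMessage` is a hypothetical helper name, and it assumes each WebSocket message is a JSON string with the fields shown in the sample.

```javascript
// Hypothetical helper: extracts the redaction-related fields from one
// raw JSON message. Field names are taken from the sample response.
function readRedactedMessage(raw) {
  const msg = JSON.parse(raw);
  return {
    transcript: msg.transcript,
    isFinal: msg.is_final === true,
    // `redacted_entities` is only included when redaction is enabled,
    // so fall back to an empty list when it is absent.
    redactedEntities: Array.isArray(msg.redacted_entities)
      ? msg.redacted_entities
      : [],
  };
}
```

With the Node.js `ws` package, this could be called from a `ws.on("message", (data) => readRedactedMessage(data.toString()))` handler.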
## Response Fields
| Field | Type | When Included | Description |
| --- | --- | --- | --- |
| `redacted_entities` | array | `redact_pii=true` or `redact_pci=true` | List of redacted entity placeholders (e.g., `[FIRSTNAME_1]`, `[CREDITCARDCVV_1]`) |
| `transcript` | string | Always | Transcription text with redacted entities replaced by placeholder tokens |
| `full_transcript` | string | `full_transcript=true` AND `is_final=true` | Cumulative transcript with redacted entities (when `full_transcript=true` is enabled) |
## Redaction Placeholder Format
Redacted entities are replaced with placeholder tokens following the pattern:
* `[ENTITYTYPE_N]` where `ENTITYTYPE` indicates the type of information (e.g., `FIRSTNAME`, `PHONENUMBER`, `CREDITCARDCVV`, `ZIPCODE`, `ACCOUNTNUMBER`)
* `N` is a sequential number starting from 1 to uniquely identify each instance
Examples:
* `[FIRSTNAME_1]`, `[FIRSTNAME_2]` - First names
* `[PHONENUMBER_1]` - Phone numbers
* `[CREDITCARDCVV_1]` - Credit card CVV codes
* `[ZIPCODE_1]` - ZIP/Postal codes
* `[ACCOUNTNUMBER_1]` - Account numbers
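The placeholder pattern lends itself to a regular expression. The sketch below splits each token into its entity type and sequence number; `parsePlaceholders` is a hypothetical helper, and it assumes entity types consist only of uppercase letters, as in the examples listed here.

```javascript
// Matches tokens of the form [ENTITYTYPE_N], e.g. [FIRSTNAME_1].
// Assumption: ENTITYTYPE is uppercase letters only and N is an integer.
const PLACEHOLDER = /\[([A-Z]+)_(\d+)\]/g;

// Hypothetical helper: returns every placeholder found in a transcript,
// split into the full token, the entity type, and the sequence number.
function parsePlaceholders(text) {
  return Array.from(text.matchAll(PLACEHOLDER), (m) => ({
    token: m[0],
    type: m[1],
    index: Number(m[2]),
  }));
}
```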
For the broadest coverage and easier compliance auditing, set both `redact_pii=true` and `redact_pci=true` in your request.
You can also use the `redacted_entities` array in each response as an audit trail of what was redacted from that transcript.
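One way to turn `redacted_entities` into an audit record is to tally the tokens by entity type. This is a sketch under assumptions: `summarizeRedactions` is a hypothetical helper, and tokens that do not match the `[ENTITYTYPE_N]` pattern are grouped under an `UNKNOWN` key chosen here for illustration.

```javascript
// Hypothetical helper: counts redacted entities by type so the result
// can be written to an audit log alongside the session ID.
function summarizeRedactions(redactedEntities) {
  const counts = {};
  for (const token of redactedEntities) {
    const match = /^\[([A-Z]+)_\d+\]$/.exec(token);
    // Tokens that do not follow the documented pattern are kept visible
    // under "UNKNOWN" rather than silently dropped.
    const type = match ? match[1] : "UNKNOWN";
    counts[type] = (counts[type] || 0) + 1;
  }
  return counts;
}
```

The summary records only entity types and counts, so the audit log itself never needs to contain the original sensitive values.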
## Compliance and Privacy
Redaction helps with compliance requirements for:
* **HIPAA**: Health Insurance Portability and Accountability Act (healthcare data)
* **GDPR**: General Data Protection Regulation (EU data protection)
* **CCPA**: California Consumer Privacy Act (California data protection)
* **PCI DSS**: Payment Card Industry Data Security Standard (payment card data)
* **SOC 2**: System and Organization Controls (security and privacy)
Note: Redaction is a tool to help protect sensitive information, but it should be used as part of a comprehensive data protection strategy. Always consult with legal and compliance teams to ensure your implementation meets regulatory requirements.