Agent Traces - PII Scrubbing

Last updated: July 27, 2026

Span detects and redacts personally identifiable information (PII) from agent traces. Detected values are replaced in place with a stable marker, so identical values correlate across events of the same trace without the original ever being stored. Each marker also indicates the detected entity type and where the redaction happened — on-device or in Span's pipeline.
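
As an illustration of how a stable marker can correlate identical values without storing them, here is a minimal sketch. The marker format, the function names, and the per-trace key are all assumptions made for this example; they are not Span's actual marker scheme.

```python
import hashlib
import hmac


def make_marker(value: str, entity_type: str, stage: str, trace_key: bytes) -> str:
    """Build a redaction marker (hypothetical format: <TYPE:stage:digest>)."""
    # A keyed hash gives the same digest for the same value under the same key,
    # so repeated values line up, while the marker does not contain the original.
    digest = hmac.new(trace_key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"<{entity_type}:{stage}:{digest[:12]}>"


def redact(text: str, value: str, entity_type: str, stage: str, trace_key: bytes) -> str:
    """Replace every occurrence of a detected value in place with its marker."""
    return text.replace(value, make_marker(value, entity_type, stage, trace_key))


# Assumption: one secret key per trace, so markers only correlate within a trace.
key = b"example-per-trace-key"
event_1 = redact("mail alice@example.com", "alice@example.com", "EMAIL_ADDRESS", "device", key)
event_2 = redact("cc alice@example.com too", "alice@example.com", "EMAIL_ADDRESS", "device", key)
```

In this sketch both events carry the same marker, which also names the entity type (`EMAIL_ADDRESS`) and the stage (`device`), mirroring the properties described above.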

Where redaction happens

On-device (released). Pattern-based, deterministic redaction runs inside coding-hooks on the developer's machine, before any data leaves it. No model and no network call are involved.
In Span's data pipeline. A second pass runs on Span's ingestion pipeline and redacts raw data before it is stored, adding semantic detection (e.g. names, addresses) that patterns alone can't catch. This stage runs entirely within Span's infrastructure and never makes external calls, so your data is never sent to any third party.

Supported entities

Detection is built on Microsoft Presidio's predefined recognizers. Two complementary approaches cover two different kinds of PII.

Pattern-based, deterministic — on-device

Structured PII that has a recognizable shape, detected with regular-expression patterns plus checksum/structure validators. Fast, exact, and runs entirely on the developer's machine.
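
To make "pattern plus validator" concrete, the sketch below pairs a deliberately loose regular expression with a Luhn checksum for credit-card numbers. It is an independent standard-library illustration, not the code that ships in coding-hooks; the pattern and the function names are assumptions.

```python
import re

# Loose candidate pattern (an assumption for this sketch): 13 to 19 digits,
# optionally separated by single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return pattern matches that also pass the checksum validator."""
    hits = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(match.group())
    return hits
```

The pattern alone would flag any long digit run; the checksum step discards most candidates that merely look like card numbers, which is what "Luhn-validated" refers to in the list below.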

Some of these patterns are intentionally naive and will occasionally over-match (produce false positives). Because the patterns target concise, well-bounded shapes, the resulting information loss is typically minimal — so we err toward redacting more than strictly necessary rather than risk leaving PII exposed.

Contact & network
- Email addresses
- Phone numbers
- IP addresses (IPv4 and IPv6)
- MAC addresses
Financial
- Credit-card numbers (Luhn-validated)
- IBANs
- US bank routing numbers
- Bank account numbers
- Crypto wallet addresses
National & government IDs
- United States: SSN, ITIN, NPI (provider identifier)
- United Kingdom: NHS number, National Insurance number (NINO), postcode
- Spain: DNI, NIE
- India: Aadhaar, PAN
- Australia: TFN, ABN, Medicare number
- Singapore: NRIC/FIN
- Italy: Codice Fiscale
- Finland: Personal Identity Code
- South Korea: Resident Registration Number (RRN)
- Poland: PESEL
- Sweden: Personnummer
- Canada: SIN
Other identifiers
- Passport numbers
- Driver's license numbers

NER-based, semantic — Span's pipeline

Free-form PII that has no fixed shape and can only be recognized from context, detected by named-entity-recognition (NER) models running in Span's pipeline. The set below is tentative and is kept to the entity types NER detects most reliably:

- Person names
- Physical / postal addresses
- Phone numbers
- Email addresses
- Secrets & credentials (e.g. API keys, tokens, passwords)

Phone numbers and email addresses are also covered deterministically by the on-device patterns above; the NER stage detects them as a semantic backstop.

Default settings

By default, Span redacts all of the supported entity types. This is configurable per tenant — you can restrict redaction to a chosen subset of entity types, or disable on-device scrubbing entirely.
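
The per-tenant behavior described above can be pictured with the following sketch. The `TenantScrubConfig` type, its field names, and the entity labels are hypothetical and exist only for illustration; the actual configuration surface may look different.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class TenantScrubConfig:
    """Hypothetical per-tenant settings, named for this sketch only."""
    # None stands for "all supported entity types", the documented default.
    entity_types: Optional[FrozenSet[str]] = None
    # On-device scrubbing can be disabled entirely for a tenant.
    on_device_enabled: bool = True


def should_redact_on_device(entity_type: str, config: TenantScrubConfig) -> bool:
    """Decide whether a detected entity type is redacted on-device."""
    if not config.on_device_enabled:
        return False
    return config.entity_types is None or entity_type in config.entity_types


default_config = TenantScrubConfig()
subset_config = TenantScrubConfig(entity_types=frozenset({"EMAIL_ADDRESS", "US_SSN"}))
disabled_config = TenantScrubConfig(on_device_enabled=False)
```

With these assumed settings, the default redacts every supported type, the subset configuration redacts only the two listed types, and the disabled configuration redacts nothing on-device.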

Extending on-device scrubbing

If you want to extend the PII rules, reach out to your Span representative with:

- an entity name (used in the redaction marker),
- a regular-expression pattern, and
- a few example values.
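
Before sending a rule, it can help to check that the pattern actually matches every example value. The sketch below does this with Python's standard `re` module; the `CustomRule` type and the `EMPLOYEE_ID` rule are made up for illustration, and the regex dialect Span accepts may differ from Python's.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomRule:
    """The three pieces of a rule request (hypothetical container type)."""
    entity_name: str            # appears in the redaction marker
    pattern: str                # regular-expression pattern
    examples: tuple[str, ...]   # a few values the pattern should match


def unmatched_examples(rule: CustomRule) -> list[str]:
    """Return the examples that the pattern fails to match in full."""
    compiled = re.compile(rule.pattern)
    return [ex for ex in rule.examples if compiled.fullmatch(ex) is None]


# Made-up rule for an internal employee identifier.
rule = CustomRule(
    entity_name="EMPLOYEE_ID",
    pattern=r"EMP-\d{6}",
    examples=("EMP-004217", "EMP-123456"),
)
missing = unmatched_examples(rule)
```

An empty `missing` list means the pattern covers all supplied examples; any value it returns points to a pattern or an example that needs adjusting.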