Agent Traces - PII Scrubbing
Last updated: June 16, 2026
Span detects and redacts personally identifiable information (PII) from agent traces. Detected values are replaced in place with a stable marker, so identical values correlate across events of the same trace without the original ever being stored. Each marker also indicates the detected entity type and where the redaction happened — on-device or in Span's pipeline.
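One way to get a marker that is stable for identical values without storing the original is a keyed hash. The sketch below is illustrative only: the marker layout `<ENTITY:stage:digest>`, the `TRACE_KEY` value, and the function names are assumptions made for this example, not Span's actual marker format or implementation.

```python
import hashlib
import hmac
import re

# Hypothetical per-trace key; how Span scopes or manages keys is not described here.
TRACE_KEY = b"example-trace-key"

# Deliberately simple email pattern, used only to drive the example.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def make_marker(value: str, entity: str, stage: str, key: bytes = TRACE_KEY) -> str:
    """Build a marker that is identical for identical values under the same key."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:8]
    # The marker carries the entity type and the stage that performed the redaction.
    return f"<{entity}:{stage}:{digest}>"


def redact_emails(text: str, stage: str = "device") -> str:
    """Replace every email address in place with its marker."""
    return EMAIL_RE.sub(lambda m: make_marker(m.group(0), "EMAIL", stage), text)


first = redact_emails("mail ada@example.com now")
second = redact_emails("cc: ada@example.com")
print(first)
print(second)
```

Both printed lines contain the same marker, so the two events can be correlated even though the address itself is gone.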
Where redaction happens
On-device (released). Pattern-based, deterministic redaction runs inside coding-hooks on the developer's machine, before any data leaves it. No model and no network call are involved.
In Span's data pipeline (work in progress, not yet released). A second pass on Span's ingestion pipeline that redacts raw data before it is stored, adding semantic detection (e.g. names, addresses) that patterns alone can't catch. This stage runs entirely within Span's infrastructure and never makes external calls, so your data is never sent to any third party.
Supported entities
Detection is built on Microsoft Presidio's predefined recognizers. Two complementary approaches cover two different kinds of PII.
Pattern-based, deterministic — on-device (released)
Structured PII that has a recognizable shape, detected with regular-expression patterns plus checksum/structure validators. Fast, exact, and runs entirely on the developer's machine.
Some of these patterns are intentionally naive and will occasionally over-match (produce false positives). Because the patterns target concise, well-bounded shapes, the resulting information loss is typically minimal — so we err toward redacting more than strictly necessary rather than risk leaving PII exposed.
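To illustrate how a pattern and a checksum validator work together, here is a minimal sketch for credit-card numbers. The regular expression and function names are assumptions written for this example and are not the patterns Span or Presidio ship. The Luhn check itself is the standard algorithm.

```python
import re

# Candidate shape: 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if a string of digits passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return shape matches that also pass the checksum."""
    hits = []
    for m in CARD_RE.finditer(text):
        digits = re.sub(r"[ -]", "", m.group(0))
        if luhn_valid(digits):
            hits.append(m.group(0))
    return hits


print(find_card_numbers("card 4111 1111 1111 1111 ok"))
print(find_card_numbers("order 4111 1111 1111 1112"))
```

The first call reports the well-known test number. The second has the right shape but fails the checksum, so the validator discards it. Entities without a checksum have no such filter, which is where over-matching tends to come from.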
Contact & network
Email addresses
Phone numbers
IP addresses (IPv4 and IPv6)
MAC addresses
Financial
Credit-card numbers (Luhn-validated)
IBANs
US bank routing numbers
Bank account numbers
Crypto wallet addresses
National & government IDs
United States: SSN, ITIN, NPI (provider identifier)
United Kingdom: NHS number, National Insurance number (NINO), postcode
Spain: DNI, NIE
India: Aadhaar, PAN
Australia: TFN, ABN, Medicare number
Singapore: NRIC/FIN
Italy: Codice Fiscale
Finland: Personal Identity Code
South Korea: Resident Registration Number (RRN)
Poland: PESEL
Sweden: Personnummer
Canada: SIN
Other identifiers
Passport numbers
Driver's license numbers
NER-based, semantic — Span's pipeline (work in progress, not yet released)
Free-form PII that has no fixed shape and can only be recognized from context, detected by named-entity-recognition (NER) models running in Span's pipeline. The tentative set below — kept to the entity types NER detects most reliably — is based on our prototype and may change before release:
Person names
Physical / postal addresses
Phone numbers
Email addresses
Secrets & credentials (e.g. API keys, tokens, passwords)
Phone numbers and email addresses are also covered deterministically by the on-device patterns above; the NER stage detects them as a semantic backstop.
Default settings
By default, Span redacts all of the supported entity types. This is configurable per tenant — you can restrict redaction to a chosen subset of entity types, or disable on-device scrubbing entirely.
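The sketch below models the three cases described above: redact everything, redact a chosen subset, or disable on-device scrubbing. It is a hypothetical illustration. The `ScrubSettings` fields, the entity identifiers, and the `active_entities` helper are assumptions, and the real tenant configuration may look quite different.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

# Illustrative subset of entity identifiers; the names Span uses are not given here.
SUPPORTED = frozenset({"EMAIL", "PHONE", "IP_ADDRESS", "CREDIT_CARD", "US_SSN"})


@dataclass(frozen=True)
class ScrubSettings:
    on_device_enabled: bool = True
    entities: Optional[FrozenSet[str]] = None  # None means "all supported types"


def active_entities(settings: ScrubSettings) -> FrozenSet[str]:
    """Resolve which entity types the on-device stage would redact."""
    if not settings.on_device_enabled:
        return frozenset()
    if settings.entities is None:
        return SUPPORTED
    unknown = settings.entities - SUPPORTED
    if unknown:
        raise ValueError(f"unsupported entity types: {sorted(unknown)}")
    return settings.entities


print(sorted(active_entities(ScrubSettings())))
print(sorted(active_entities(ScrubSettings(entities=frozenset({"EMAIL", "US_SSN"})))))
print(sorted(active_entities(ScrubSettings(on_device_enabled=False))))
```

Treating an unset subset as "all supported types" keeps the default behavior intact when new entity types are added later.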
Extending on-device scrubbing
To extend the on-device PII rules, reach out to your Span representative with:
an entity name (used in the redaction marker),
a regular-expression pattern, and
a few example values.
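Before sending a rule, it can help to confirm locally that the pattern compiles and matches each of your example values. The following is a minimal sketch of such a check. The `CustomRule` type, the `EMPLOYEE_ID` entity, and the upper-case naming convention it enforces are assumptions made for illustration, not a format Span requires.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomRule:
    entity: str                # name that would appear in the redaction marker
    pattern: str               # regular-expression pattern
    examples: tuple[str, ...]  # values the pattern is expected to match


def check_rule(rule: CustomRule) -> list[str]:
    """Return a list of problems; an empty list means the rule looks consistent."""
    problems = []
    # Assumed convention for this sketch: upper-case letters, digits, underscores.
    if not re.fullmatch(r"[A-Z][A-Z0-9_]*", rule.entity):
        problems.append(f"unexpected entity name: {rule.entity!r}")
    try:
        compiled = re.compile(rule.pattern)
    except re.error as exc:
        return problems + [f"pattern does not compile: {exc}"]
    for example in rule.examples:
        if compiled.fullmatch(example) is None:
            problems.append(f"example not matched: {example!r}")
    return problems


rule = CustomRule("EMPLOYEE_ID", r"EMP-\d{6}", ("EMP-004217", "EMP-991000"))
print(check_rule(rule))
```

Adding a few values that should not match is also worthwhile, since an overly broad pattern redacts more than intended.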