Detector · identity

Person names tokenization

Names are hard to detect reliably with regex alone. Cypherz layers three strategies: title patterns (Mr./Mrs./Dr. + Name), salutation context (Dear/Sincerely/Best + Name), and per-project custom dictionaries you can paste in directly, backed by a place-name denylist to cut false positives. Add your customer list once; every prompt that mentions a customer is tokenized automatically.

  • 01

    Title-anchored detection

    Anchored on Mr/Mrs/Ms/Miss/Dr/Prof + capitalized name pattern.

  • 02

    Salutation-anchored detection

    Dear/Hi/Hello/Sincerely/Regards + name pattern.

  • 03

    Custom dictionaries

    Paste your customer roster, product names, or internal codewords. Cypherz tokenizes them on sight regardless of context.

  • 04

    False-positive resistant

    Common place names (New York, San Francisco, etc.) are denylisted to avoid mis-tokenization.
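The first two layers and the denylist can be sketched roughly like this. The patterns and denylist below are simplified stand-ins for illustration, not Cypherz's actual detection rules:

```python
import re

# Illustrative title- and salutation-anchored detection (layers 01 and 02),
# with the place-name denylist from layer 04. All patterns are simplified
# assumptions, not Cypherz's production rules.
TITLE = r"(?:Mr|Mrs|Ms|Miss|Dr|Prof)\.?"
SALUTATION = r"(?:Dear|Hi|Hello|Sincerely|Regards)"
NAME = r"([A-Z][a-z]+(?:\s+[A-Z][a-z]+)*)"

PATTERNS = [
    re.compile(rf"\b{TITLE}\s+{NAME}"),         # 01: title-anchored
    re.compile(rf"\b{SALUTATION},?\s+{NAME}"),  # 02: salutation-anchored
]
DENYLIST = {"New York", "San Francisco"}        # 04: common place names

def detect_names(text: str) -> list[str]:
    hits = []
    for pattern in PATTERNS:
        for match in pattern.finditer(text):
            if match.group(1) not in DENYLIST:
                hits.append(match.group(1))
    return hits
```

Anchoring on a title or salutation keeps the capitalized-word pattern from firing on every sentence-initial word, which is the classic failure mode of bare name regexes.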

Same input, with and without Cypherz

// Without Cypherz — the model sees real data:
Hi Marcus, please ship the contract to Sasha Petrov.

// With Cypherz — the model sees surrogates:
Hi <NAME_a1b2c3d4e5f6>, please ship the contract to <NAME_d4e5f6a1b2c3>.

// The application gets the original back inside its trust boundary.

Common questions

Frequently asked questions.

Are person-name tokens deterministic?

Yes — within a project, the same input always maps to the same surrogate token, so joins, dedupe, and analytics keep working on tokenized data without ever detokenizing.
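One common way to get deterministic, per-project surrogates is a keyed HMAC over the original value. This is a hypothetical sketch — the key handling, token format, and 12-hex-character truncation are illustrative assumptions, not Cypherz's published scheme:

```python
import hmac
import hashlib

# Hypothetical deterministic surrogate: an HMAC keyed per project, so the
# same value always yields the same token within a project but different
# tokens across projects. Format and truncation are illustrative.
def surrogate(value: str, project_key: bytes, kind: str = "NAME") -> str:
    digest = hmac.new(project_key, value.encode("utf-8"), hashlib.sha256)
    return f"<{kind}_{digest.hexdigest()[:12]}>"
```

Because the mapping is a pure function of (project key, value), equality joins and group-bys on tokenized columns behave exactly as they would on the originals.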

Can I disable this detector for a specific project?

Yes — every detector is toggleable per project at creation time and editable from the dashboard.

What if I have a custom format Cypherz doesn't recognize?

Add a custom regex or literal list per project. Cypherz applies your rules after the built-in detectors run.
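The ordering — built-ins first, then your rules — could look like the following minimal sketch. Function and parameter names here are illustrative, not part of Cypherz's API:

```python
import re

# Minimal sketch of per-project custom rules: literal strings are replaced
# first, then user-supplied regexes. In the real pipeline this would run
# after the built-in detectors have already tokenized what they recognize.
def apply_custom_rules(text: str, literals: list[str],
                       patterns: list[str], token: str = "<CUSTOM>") -> str:
    for literal in literals:
        text = text.replace(literal, token)
    for pattern in patterns:
        text = re.sub(pattern, token, text)
    return text
```

Running literals before regexes means an exact roster entry always wins, even if a broader pattern would have matched only part of it.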

Are tokenization mappings encrypted at rest?

Yes — AES-256-GCM with envelope encryption. Each project has its own data-encryption key wrapped under the master key.
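The envelope pattern — each record encrypted under a per-project data-encryption key (DEK), which is itself stored only wrapped under a master key — can be sketched with the `cryptography` package. Key storage, nonce handling, and record layout below are illustrative assumptions, not Cypherz's internal design:

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# Envelope-encryption sketch with AES-256-GCM. The master key would live
# in a KMS/HSM in practice; the record layout here is illustrative only.
master_key = AESGCM.generate_key(bit_length=256)

def encrypt_mapping(plaintext: bytes) -> dict:
    dek = AESGCM.generate_key(bit_length=256)      # per-project data key
    data_nonce, wrap_nonce = os.urandom(12), os.urandom(12)
    return {
        "ciphertext": AESGCM(dek).encrypt(data_nonce, plaintext, None),
        "data_nonce": data_nonce,
        # The DEK is persisted only in wrapped form, never in the clear.
        "wrapped_dek": AESGCM(master_key).encrypt(wrap_nonce, dek, None),
        "wrap_nonce": wrap_nonce,
    }

def decrypt_mapping(record: dict) -> bytes:
    dek = AESGCM(master_key).decrypt(record["wrap_nonce"],
                                     record["wrapped_dek"], None)
    return AESGCM(dek).decrypt(record["data_nonce"],
                               record["ciphertext"], None)
```

The payoff of the envelope scheme: rotating or revoking a single project's DEK touches only that project's wrapped key, not every stored record.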

Get started

Add person names protection to your AI features.

Sign up, create a project, and copy your API key. Your first request is tokenized in under sixty seconds.