A Golang port of the Phileas Java library for deidentifying and redacting PII, PHI, and other sensitive information from text.
- Check out the documentation for details and code examples.
- Built by Philterd.
- Commercial support and consulting are available - contact us.
go-phileas analyzes text searching for sensitive information and can manipulate it in a variety of ways. It uses policies (defined in JSON or YAML) to configure what types of sensitive information to find and how to handle it when found.
Note that this port of Phileas is not 1:1 with the Java version. There are some differences:
- This project includes support for policies in YAML as well as JSON.
- This project does not include all redaction strategies present in the Java version.
- This project includes a CLI.
- This project does not include support for PDF documents, which is present in the Java version.
There is also phileas-python, a Python port of the Java version.
- Ages (e.g., "45 years old", "aged 30", "61 y/o")
- Bank Routing Numbers
- Bitcoin Addresses
- Credit Card Numbers (Visa, MasterCard, American Express, Diners Club, Discover, JCB)
- Custom Dictionaries (inline word lists or file-based)
- Dates (multiple formats: MM/DD/YYYY, YYYY-MM-DD, Month DD YYYY, etc.)
- Driver's License Numbers
- Email Addresses
- IBAN Codes
- IP Addresses (IPv4 and IPv6)
- MAC Addresses
- Passport Numbers
- Phone Numbers (US and international)
- Social Security Numbers (SSN) and Taxpayer Identification Numbers (TIN)
- Tracking Numbers (UPS, FedEx, USPS)
- URLs
- Vehicle Identification Numbers (VINs)
- ZIP Codes
- Named Entities (persons, locations, organizations, etc. via the ph-eye NER service)
go get github.com/philterd/go-phileas

package main
import (
"fmt"
"github.com/philterd/go-phileas/pkg/policy"
"github.com/philterd/go-phileas/pkg/services"
)
func main() {
pol := &policy.Policy{
Name: "my-policy",
Identifiers: policy.Identifiers{
SSN: &policy.SSNFilter{
SSNFilterStrategies: []policy.FilterStrategy{
{Strategy: policy.StrategyRedact, RedactionFormat: "{{{REDACTED-%t}}}"},
},
},
EmailAddress: &policy.EmailAddressFilter{},
},
}
svc, err := services.NewFilterService(pol)
if err != nil {
panic(err)
}
result, err := svc.Filter(pol, "my-context", "My SSN is 123-45-6789 and email is john@example.com.")
if err != nil {
panic(err)
}
fmt.Println(result.FilteredText)
// Output: My SSN is {{{REDACTED-ssn}}} and email is {{{REDACTED-email-address}}}.
for _, span := range result.Spans {
fmt.Printf("Found %s: %q at position %d-%d\n",
span.FilterType, span.Text, span.CharacterStart, span.CharacterEnd)
}
}

The second argument to Filter is the context name. All Filter calls sharing the same context name use the same token→replacement store, ensuring that the same PII value always receives the same replacement within a context (referential integrity). The default in-memory store is created automatically by NewFilterService. To supply a custom store (e.g., Redis for multi-process deployments), use NewFilterServiceWithContext.
package main
import (
"fmt"
"github.com/philterd/go-phileas/pkg/services"
)
func main() {
policyJSON := `{
"identifiers": {
"age": {
"ageFilterStrategies": [{"strategy": "REDACT", "redactionFormat": "{{{REDACTED-%t}}}"}]
}
}
}`
result, err := services.FilterJSON(policyJSON, "context", "The patient is 45 years old.")
if err != nil {
panic(err)
}
fmt.Println(result.FilteredText)
// Output: The patient is {{{REDACTED-age}}}.
}

Use Explain when you want to see what would be detected without modifying the text:
package main
import (
"fmt"
"github.com/philterd/go-phileas/pkg/policy"
"github.com/philterd/go-phileas/pkg/services"
)
func main() {
pol := &policy.Policy{
Name: "my-policy",
Identifiers: policy.Identifiers{
SSN: &policy.SSNFilter{},
EmailAddress: &policy.EmailAddressFilter{},
},
}
svc, err := services.NewFilterService(pol)
if err != nil {
panic(err)
}
spans, err := svc.Explain(pol, "my-context", "My SSN is 123-45-6789 and email is john@example.com.")
if err != nil {
panic(err)
}
for _, span := range spans {
fmt.Printf("Found %s: %q at position %d-%d\n",
span.FilterType, span.Text, span.CharacterStart, span.CharacterEnd)
}
// Output:
// Found ssn: "123-45-6789" at position 10-21
// Found email-address: "john@example.com" at position 35-51
}

| Strategy | Description |
|---|---|
| REDACT | Replace with a redaction placeholder (default). Use %t in redactionFormat for the filter type and %v for the original value. |
| RANDOM_REPLACE | Replace with a randomly generated but realistic value of the same type (deterministic per context+value pair for referential integrity). |
| STATIC_REPLACE | Replace with a fixed static value specified in staticReplacement. |
| CRYPTO_REPLACE | Encrypt the sensitive information (requires crypto configuration in the policy). |
| HASH_SHA256_REPLACE | Replace the sensitive information with its SHA-256 hash. |
| LAST_4 | Keep only the last 4 characters of the sensitive information. |
| MASK | Replace each character with a mask character (default: *). Set maskCharacter to use a different character. |
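To make a few of these strategies concrete, here is a rough standard-library sketch of what REDACT, MASK, LAST_4, and HASH_SHA256_REPLACE might do to a single value. The function names are hypothetical and this is not the library's code. In particular, hex encoding of the hash and rune-based counting of characters are assumptions.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// redact expands %t (filter type) and %v (original value) in a format string.
func redact(format, filterType, value string) string {
	return strings.NewReplacer("%t", filterType, "%v", value).Replace(format)
}

// mask replaces every character (rune) of value with maskChar.
func mask(value string, maskChar rune) string {
	return strings.Repeat(string(maskChar), len([]rune(value)))
}

// last4 keeps only the final four characters of value.
// Values of four characters or fewer are returned unchanged (an assumption).
func last4(value string) string {
	r := []rune(value)
	if len(r) <= 4 {
		return value
	}
	return string(r[len(r)-4:])
}

// hashSHA256 returns the hex-encoded SHA-256 digest of value.
// Hex output is an assumption; the library may encode differently.
func hashSHA256(value string) string {
	sum := sha256.Sum256([]byte(value))
	return hex.EncodeToString(sum[:])
}

func main() {
	ssn := "123-45-6789"
	fmt.Println(redact("{{{REDACTED-%t}}}", "ssn", ssn)) // {{{REDACTED-ssn}}}
	fmt.Println(mask(ssn, '*'))                          // ***********
	fmt.Println(last4(ssn))                              // 6789
	fmt.Println(len(hashSHA256(ssn)))                    // 64 hex characters
}
```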
go-phileas integrates with ph-eye, a standalone HTTP microservice that runs AI/NLP models for named-entity recognition (NER). This allows go-phileas to detect and redact named entities such as person names, locations, and organizations — types of sensitive information that cannot be reliably caught by regular expressions alone.
When a policy contains a pheye identifier, go-phileas sends the input text to the configured ph-eye service endpoint (POST /find). The service returns a list of detected entities with their character offsets, labels (e.g. Person), and confidence scores. go-phileas converts those into spans and applies the configured filter strategy (redact, replace, mask, etc.) just like any other identifier.
Because ph-eye is an external service, you need a running ph-eye instance reachable from your application. The default endpoint is http://localhost:18080. Refer to the ph-eye documentation for setup instructions.
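The step of turning detected entities into redacted text can be sketched as follows. This is only an illustration under stated assumptions: the `entity` struct is a hypothetical shape (the actual ph-eye response schema is not shown here), offsets are treated as byte offsets with an exclusive end, and the placeholder is hard-coded to the `pheye` filter type.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// entity is a HYPOTHETICAL representation of one detected entity.
// Start and End are assumed to be byte offsets, with End exclusive.
type entity struct {
	Start, End int
	Label      string
	Confidence float64
}

// applyRedaction replaces each entity's text with a fixed placeholder.
// Entities that overlap an earlier one or fall outside the text are skipped.
func applyRedaction(text string, ents []entity) string {
	sorted := append([]entity(nil), ents...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Start < sorted[j].Start })

	var b strings.Builder
	pos := 0
	for _, e := range sorted {
		if e.Start < pos || e.End <= e.Start || e.End > len(text) {
			continue
		}
		b.WriteString(text[pos:e.Start])
		b.WriteString("{{{REDACTED-pheye}}}")
		pos = e.End
	}
	b.WriteString(text[pos:])
	return b.String()
}

func main() {
	text := "George Washington was the first president."
	ents := []entity{{Start: 0, End: 17, Label: "Person", Confidence: 0.98}}
	fmt.Println(applyRedaction(text, ents))
	// {{{REDACTED-pheye}}} was the first president.
}
```

Building the output left to right from sorted spans avoids the offset drift that occurs when earlier replacements change the length of the string.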
The phEyeConfiguration object controls the connection to ph-eye:
| Field | Type | Default | Description |
|---|---|---|---|
| endpoint | string | http://localhost:18080 | URL of the ph-eye service. |
| timeout | int | 600 | HTTP timeout in seconds. |
| labels | string | Person | Comma-separated entity labels to detect (e.g. "Person", "Person,Location"). |
Additional filter options:
| Field | Type | Default | Description |
|---|---|---|---|
| phEyeFilterStrategies | []FilterStrategy | REDACT | How to handle identified spans. |
| removePunctuation | bool | false | Strip punctuation before sending text to ph-eye. |
| bearerToken | string | (none) | Bearer token for authenticating with ph-eye. |
| ignored | []string | (none) | Terms to skip, compared case-insensitively. |
| enabled | bool | true | Set to false to disable without removing from the policy. |
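A case-insensitive check like the one described for ignored might look like the sketch below. The `isIgnored` helper is a hypothetical name, not a go-phileas function, and whole-term comparison (rather than substring matching) is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// isIgnored reports whether term equals any entry in ignored,
// comparing case-insensitively under Unicode case folding.
func isIgnored(term string, ignored []string) bool {
	for _, ig := range ignored {
		if strings.EqualFold(term, ig) {
			return true
		}
	}
	return false
}

func main() {
	ignored := []string{"Washington"}
	fmt.Println(isIgnored("WASHINGTON", ignored)) // true
	fmt.Println(isIgnored("George", ignored))     // false
}
```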
Unlike the regex-based identifiers, pheye is a list — you can configure multiple ph-eye instances in one policy (e.g. to target different models or endpoints).
Go struct
pol := &policy.Policy{
Name: "my-policy",
Identifiers: policy.Identifiers{
PhEye: []policy.PhEyeFilter{
{
PhEyeConfiguration: policy.PhEyeConfiguration{
Endpoint: "http://localhost:18080",
Labels: "Person",
},
PhEyeFilterStrategies: []policy.FilterStrategy{
{Strategy: policy.StrategyRedact, RedactionFormat: "{{{REDACTED-%t}}}"},
},
},
},
},
}
svc, err := services.NewFilterService(pol)
if err != nil {
panic(err)
}
result, err := svc.Filter(pol, "context", "George Washington was the first president.")
// result.FilteredText → "{{{REDACTED-pheye}}} was the first president."

JSON policy
{
"identifiers": {
"pheye": [
{
"phEyeConfiguration": {
"endpoint": "http://localhost:18080",
"labels": "Person"
},
"phEyeFilterStrategies": [
{"strategy": "REDACT", "redactionFormat": "{{{REDACTED-%t}}}"}
]
}
]
}
}

For more details, see the identifiers reference.
go-phileas supports custom dictionary filters for redacting specific words or phrases. Terms can be provided inline or loaded from a file (one word per line). A policy can contain multiple dictionary filters, each with its own word list and strategy.
Go struct
pol := &policy.Policy{
Name: "my-policy",
Identifiers: policy.Identifiers{
Dictionaries: []policy.DictionaryFilter{
{
Terms: []string{"Alice", "Bob", "Acme Corp"},
DictionaryFilterStrategies: []policy.FilterStrategy{
{Strategy: policy.StrategyRedact, RedactionFormat: "{{{REDACTED-%t}}}"},
},
},
},
},
}
svc, err := services.NewFilterService(pol)
if err != nil {
panic(err)
}
result, err := svc.Filter(pol, "context", "Alice and Bob work at Acme Corp.")
// result.FilteredText → "{{{REDACTED-custom-dictionary}}} and {{{REDACTED-custom-dictionary}}} work at {{{REDACTED-custom-dictionary}}}."

JSON policy
{
"identifiers": {
"dictionaries": [
{
"terms": ["Alice", "Bob", "Acme Corp"],
"dictionaryFilterStrategies": [
{"strategy": "REDACT", "redactionFormat": "{{{REDACTED-%t}}}"}
]
}
]
}
}

To load words from a file, use the files field:
{
"identifiers": {
"dictionaries": [
{
"files": ["/etc/phileas/sensitive-names.txt"],
"dictionaryFilterStrategies": [{"strategy": "STATIC_REPLACE", "staticReplacement": "[NAME REMOVED]"}]
}
]
}
}

For more details, see the identifiers reference.
{
"identifiers": {
"ssn": {
"ssnFilterStrategies": [{
"strategy": "REDACT",
"redactionFormat": "{{{REDACTED-%t}}}"
}]
},
"emailAddress": {
"emailAddressFilterStrategies": [{
"strategy": "STATIC_REPLACE",
"staticReplacement": "[EMAIL REMOVED]"
}]
},
"ipAddress": {},
"phoneNumber": {},
"creditCard": {},
"date": {},
"age": {},
"url": {
"requireHttpWwwPrefix": true
},
"zipCode": {
"requireDelimiter": false
},
"dictionaries": [
{
"terms": ["Alice", "Bob"],
"dictionaryFilterStrategies": [{"strategy": "REDACT"}]
}
]
}
}

go-phileas includes a command-line tool, phileas, that redacts text from the command line.

make build-cli

Or directly with go build:

go build -o phileas ./cmd/phileas

phileas --policy <policy.json> --input <input.txt> [--context <context>]
phileas --policy <policy.json> --input <input.txt> --evaluate --spans <spans.json> [--context <context>]
| Flag | Required | Description |
|---|---|---|
| --policy | Yes | Path to the JSON policy file |
| --input | Yes | Path to the input text file to redact |
| --context | No | Context name to associate with the filter operation. If omitted, context checks are skipped. |
| --evaluate | No | Enable evaluation mode: prints precision, recall, and F1 instead of redacted text |
| --spans | When --evaluate is set | Path to a JSON file containing ground-truth spans |
The redacted text is written to standard output. Errors are written to standard error and the process exits with a non-zero status.
Given a policy file policy.json:
{
"identifiers": {
"ssn": {
"ssnFilterStrategies": [{"strategy": "REDACT", "redactionFormat": "{{{REDACTED-%t}}}"}]
},
"emailAddress": {
"emailAddressFilterStrategies": [{"strategy": "STATIC_REPLACE", "staticReplacement": "[EMAIL]"}]
}
}
}

And an input file input.txt:
My SSN is 123-45-6789 and my email is john@example.com.
Run:
phileas --policy policy.json --input input.txt

Output:
My SSN is {{{REDACTED-ssn}}} and my email is [EMAIL].
Use --evaluate with --spans to measure how well a policy detects sensitive information against a set of labeled ground-truth spans.
The --spans file is a JSON array of span objects with characterStart and characterEnd fields:
[
{"characterStart": 10, "characterEnd": 21},
{"characterStart": 38, "characterEnd": 54}
]

Run:

phileas --policy policy.json --input input.txt --evaluate --spans spans.json

Output:
True Positives: 2
False Positives: 0
False Negatives: 0
Precision: 1.0000
Recall: 1.0000
F1: 1.0000
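For readers unfamiliar with these metrics, the sketch below shows one way they could be computed from predicted and ground-truth spans. It is not the CLI's implementation: the `span` type and `evaluate` function are hypothetical, and counting only exact start/end matches as true positives is an assumption (the CLI may use a different matching rule, such as overlap).

```go
package main

import "fmt"

// span is a HYPOTHETICAL character range with an exclusive end.
type span struct{ Start, End int }

// evaluate counts exact-match true positives, false positives, and false
// negatives, then derives precision, recall, and F1 from those counts.
func evaluate(predicted, truth []span) (tp, fp, fn int, precision, recall, f1 float64) {
	truthSet := make(map[span]bool, len(truth))
	for _, s := range truth {
		truthSet[s] = true
	}
	matched := make(map[span]bool)
	for _, s := range predicted {
		if truthSet[s] && !matched[s] {
			tp++
			matched[s] = true
		} else {
			fp++
		}
	}
	fn = len(truthSet) - tp
	if tp+fp > 0 {
		precision = float64(tp) / float64(tp+fp)
	}
	if tp+fn > 0 {
		recall = float64(tp) / float64(tp+fn)
	}
	if precision+recall > 0 {
		f1 = 2 * precision * recall / (precision + recall)
	}
	return
}

func main() {
	truth := []span{{10, 21}, {38, 54}}
	predicted := []span{{10, 21}, {38, 54}}
	tp, fp, fn, p, r, f1 := evaluate(predicted, truth)
	fmt.Printf("TP=%d FP=%d FN=%d\n", tp, fp, fn)                   // TP=2 FP=0 FN=0
	fmt.Printf("Precision=%.4f Recall=%.4f F1=%.4f\n", p, r, f1) // 1.0000 for all three
}
```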
See CLI documentation for full details.
go build ./...

go test ./...

Copyright 2026 Philterd, LLC.
Licensed under the Apache License, Version 2.0. See LICENSE for details.
"Phileas" and "Philter" are registered trademarks of Philterd, LLC.
This project is a Go port of Phileas, which is also Apache-2.0 licensed.