Table of Contents
- Privacy-LLM setup
- The two mechanisms recap
- Claude Code trust boundary
- Variant 1 — Claude Code only (default)
- Variant 2 — Local inference (Ollama)
- Variant 3 — Local inference (vLLM)
- Variant 4 — Apache-hosted endpoint
- Variant 5 — AWS Bedrock
- Variant 6 — Direct Anthropic API (opt-in)
- Verifying the setup
- Updating after a framework version bump
- Status — provisional pending ASF Legal
Privacy-LLM setup
How to configure the framework’s privacy-aware LLM routing for
your adopting project. Pick a variant below; copy the matching
<project-config>/privacy-llm.md block into your project; verify
with /setup-isolated-setup-verify (or the privacy-llm-specific
check once PR-3 lands the gate-call wiring).
The contract behind these recipes lives in
tools/privacy-llm/tool.md,
tools/privacy-llm/pii.md, and
tools/privacy-llm/models.md.
This file is how-to; those are the what and why.
The two mechanisms recap
The framework treats two distinct privacy concerns separately:
- PII redaction: applies to <security-list> content (the reporter mail). The body is OK to flow through any approved LLM. The reporter’s own identity (name, email, etc.) flows as-is: they sent the mail and are operationally known to the security team. What gets redacted is PII the reporter discloses about other people (third-party researchers, victims, named individuals other than the reporter), replaced with hash-prefixed identifiers (N-a3f9d2, …) before any LLM step, unless the named individual is already a collaborator on the <tracker> repo (their identity is already public/known via collaborator status, no privacy gain from redacting). The mapping is local to the user’s machine. This applies under every variant below, even Variant 1 (Claude-only).
- Approved-LLM gate: applies to <private-list> content (PMC private mail) and any other private foundation lists. The skill refuses to fetch unless every LLM in the active stack is in the approved-model registry.
Picking a variant below configures the gate (the LLM stack). The redactor (mechanism 1) runs regardless and needs no per-variant config beyond the home-dir storage path.
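As an illustration of the hash-prefixed identifiers described above, here is a minimal sketch. It assumes SHA-256 truncated to six hex characters and the prefixes N (name) and E (email); the function names `pseudonym` and `redact` are hypothetical and the real redactor under tools/privacy-llm/ may work differently.

```python
import hashlib

# Assumed prefixes: the document shows N-... for names and E-... for emails.
PREFIXES = {"name": "N", "email": "E"}


def pseudonym(kind: str, value: str, salt: str = "") -> str:
    """Return a stable identifier such as 'N-a3f9d2' for one PII value.

    Assumption: SHA-256 over the normalised value, truncated to six hex
    characters. The real tool may use another hash, length, or salt.
    """
    normalised = value.strip().lower()
    digest = hashlib.sha256((salt + normalised).encode("utf-8")).hexdigest()
    return f"{PREFIXES[kind]}-{digest[:6]}"


def redact(text: str, fields: list[tuple[str, str]]) -> tuple[str, dict[str, str]]:
    """Replace each (kind, value) pair in text; return text plus the local map."""
    mapping: dict[str, str] = {}
    for kind, value in fields:
        token = pseudonym(kind, value)
        mapping[token] = value  # stays on the user's machine
        text = text.replace(value, token)
    return text, mapping
```

Only third-party values would be passed as `fields`; the reporter’s own name and address are left untouched, matching the rule above.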
Claude Code trust boundary
The framework treats the Claude Code instance running the skills as default-approved: a working position the maintainer chose on 2026-05-04 in the absence of a ratified ASF Legal Affairs list. This means:
- A pure Claude-only deployment (Variant 1) needs no per-LLM approval workflow — the gate is satisfied by construction.
- Adding any other LLM to the stack (a summarizer, a delegated-analysis hop, an outbound classifier) requires matching it against the registry per tools/privacy-llm/models.md.
- If ASF Legal subsequently rules that Anthropic-hosted endpoints require a data-processing agreement for foundation private data, the framework will narrow this default and bump the registry version. Adopters using Variant 1 at that point will need to re-evaluate.
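The gate rule can be pictured with a small sketch: the fetch is allowed only when every entry in the stack is either default-approved or carries an explicit approval. The `StackEntry` type and both function names are hypothetical, not the framework’s actual API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StackEntry:
    """One LLM in the configured stack (hypothetical model of the config)."""
    name: str
    default_approved: bool = False      # e.g. Claude Code, local inference
    approved_by: Optional[str] = None   # set for opt-in endpoints


def unapproved(stack: list[StackEntry]) -> list[str]:
    """Names of entries that are neither default-approved nor signed off."""
    return [e.name for e in stack if not (e.default_approved or e.approved_by)]


def gate_allows_private_fetch(stack: list[StackEntry]) -> bool:
    """True only if the stack is non-empty and every entry is approved."""
    return bool(stack) and not unapproved(stack)


claude_only = [StackEntry("Claude Code", default_approved=True)]
with_extra = claude_only + [StackEntry("Some summarizer endpoint")]
```

Under this sketch `claude_only` passes by construction, while `with_extra` is refused until the second entry is matched against the registry.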
Variant 1 — Claude Code only (default)
The simplest variant. Claude Code is the only LLM in the stack; no external endpoints; the gate is auto-satisfied.
<project-config>/privacy-llm.md content (copy verbatim, then replace the private-list address with your project’s actual list):
## Currently configured LLM stack
- Claude Code (the agent running framework skills)
## Approved third-party endpoints (opt-in)
(none — Claude Code is the only LLM)
## Private mailing lists for this project
- private@<project>.apache.org
Setup steps:
- Place the file at <project-config>/privacy-llm.md in your adopter repo (alongside project.md).
- Commit it. The file is project-config: it travels with the repo, not per-machine.
- Run /setup-isolated-setup-verify to confirm the existing secure-agent setup is in place. No new secure-setup steps are needed for Variant 1.
That is the entire variant. Every framework skill that consults
<project-config>/privacy-llm.md will see “Claude-only” and pass
the gate.
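To make the “Claude-only” reading concrete, here is a rough sketch of how a skill might parse the file. The parser, `parse_sections`, and `is_claude_only` are illustrative assumptions; the framework’s real parsing rules are defined in tools/privacy-llm/ and may differ.

```python
VARIANT_1 = """\
## Currently configured LLM stack
- Claude Code (the agent running framework skills)
## Approved third-party endpoints (opt-in)
(none — Claude Code is the only LLM)
## Private mailing lists for this project
- private@<project>.apache.org
"""


def parse_sections(markdown: str) -> dict[str, list[str]]:
    """Map each '## ' heading to the bullet items listed under it.

    Simplification: nested bullets are flattened and non-bullet lines
    (such as the '(none ...)' note) are ignored.
    """
    sections: dict[str, list[str]] = {}
    current = None
    for raw in markdown.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif current is not None and line.startswith("- "):
            sections[current].append(line[2:].strip())
    return sections


def is_claude_only(sections: dict[str, list[str]]) -> bool:
    """True when the stack lists exactly one entry and it is Claude Code."""
    stack = sections.get("Currently configured LLM stack", [])
    return len(stack) == 1 and stack[0].startswith("Claude Code")
```

Calling `is_claude_only(parse_sections(VARIANT_1))` on the block above yields `True`.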
Variant 2 — Local inference (Ollama)
Use when the project wants a second LLM in the stack — typically for delegated summarisation of long mail threads — without sending data to any external service.
Prerequisites:
- Ollama installed locally (brew install ollama on macOS; per-distribution package on Linux).
- A model pulled (ollama pull llama3.1:8b or similar; the framework does not prescribe which model).
- Ollama bound to 127.0.0.1 only (the default; do not expose to external interfaces).
<project-config>/privacy-llm.md content:
## Currently configured LLM stack
- Claude Code (the agent running framework skills)
- Local Ollama at http://127.0.0.1:11434/ (model: llama3.1:8b)
## Approved third-party endpoints (opt-in)
(none — local Ollama is local-only inference, default-approved)
## Private mailing lists for this project
- private@<project>.apache.org
Setup steps:
- Confirm Ollama is reachable: curl http://127.0.0.1:11434/api/tags returns the model list.
- Confirm Ollama is not reachable from outside the host: curl http://<your-LAN-IP>:11434/api/tags should fail.
- Place the file at <project-config>/privacy-llm.md. Commit.
- The framework helper detects 127.0.0.1 (and localhost, ::1) hostnames as default-approved local inference; no third-party-endpoint declaration is needed.
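The local-host detection in the last step could look roughly like the following. This is a sketch under the assumption that the helper compares the URL’s host against exactly the three names listed; `is_local_endpoint` is a hypothetical name.

```python
from urllib.parse import urlsplit

# The three hostnames the document lists as default-approved local inference.
LOCAL_HOSTS = {"127.0.0.1", "localhost", "::1"}


def is_local_endpoint(url: str) -> bool:
    """True if the URL's host is one of the loopback names above.

    urlsplit() lower-cases the hostname and strips the brackets around an
    IPv6 literal, so 'http://[::1]:11434/' yields the host '::1'.
    """
    host = urlsplit(url).hostname
    return host is not None and host in LOCAL_HOSTS
```

Comparing the parsed host, rather than searching the raw URL for "127.0.0.1", avoids treating a URL such as http://127.0.0.1.example.com/ as local.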
Variant 3 — Local inference (vLLM)
Same shape as Ollama but targeting vLLM for projects that need a larger model than Ollama hosts comfortably or need OpenAI-API compatibility for downstream tooling.
<project-config>/privacy-llm.md content:
## Currently configured LLM stack
- Claude Code (the agent running framework skills)
- Local vLLM at http://127.0.0.1:8000/v1/ (model: meta-llama/Llama-3.1-70B-Instruct)
## Approved third-party endpoints (opt-in)
(none — local vLLM is local-only inference, default-approved)
## Private mailing lists for this project
- private@<project>.apache.org
The same 127.0.0.1-or-localhost test as for Ollama applies.
Variant 4 — Apache-hosted endpoint
Use when the ASF (or your project’s PMC) hosts an inference
endpoint at an *.apache.org domain. These are
default-approved — anything served from an *.apache.org
hostname runs on infra under ASF governance.
<project-config>/privacy-llm.md content (substitute the
actual endpoint):
## Currently configured LLM stack
- Claude Code (the agent running framework skills)
- ASF inference at https://inference.apache.org/v1/ (model: llama3.1-asf)
## Approved third-party endpoints (opt-in)
(none — *.apache.org endpoints are default-approved)
## Private mailing lists for this project
- private@<project>.apache.org
Setup steps:
- Confirm the endpoint resolves under *.apache.org. The framework helper greps the URL host suffix; apache.org is the trigger.
- Confirm authentication if the endpoint requires it. ASF endpoints typically authenticate via the user’s ASF identity (LDAP / OAuth); credentials live at ~/.config/apache-steward/<endpoint>-token.json or similar, never in the project tree (see AGENTS.md, “Local setup”).
- Place the file at <project-config>/privacy-llm.md. Commit.
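A host-suffix check of the kind described in the first step might be sketched as below. Matching on a dot boundary is an assumption about how the helper should behave, not a description of its actual code, and `is_apache_hosted` is a hypothetical name.

```python
from urllib.parse import urlsplit


def is_apache_hosted(url: str) -> bool:
    """True if the URL's host is apache.org or a subdomain of it.

    The suffix is matched on a dot boundary so that look-alike hosts
    such as 'evilapache.org' or 'apache.org.example.com' do not pass.
    """
    host = (urlsplit(url).hostname or "").rstrip(".")
    return host == "apache.org" or host.endswith(".apache.org")
```

A plain substring or un-anchored suffix match would accept look-alike domains, which is why the sketch parses the URL first and compares only the host.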
Variant 5 — AWS Bedrock
Opt-in. AWS Bedrock with a region-bounded endpoint is a common choice for projects whose contributors are split across organisations and need a managed-inference fallback. The opt-in mechanism reflects that the data-residency contract is Bedrock-specific (region pinning, no-training, IAM-bounded access) and the adopter’s security team is responsible for verifying it matches ASF expectations for foundation private data.
Prerequisites:
- An AWS account the adopter’s security team controls.
- Bedrock enabled in a region you’ve verified for data residency (typically a region inside the EU or a region with a Bedrock data-processing addendum that covers foundation private data).
- The model the project uses enabled in that region (Bedrock requires per-region model enablement).
- An IAM identity for the framework with bedrock:InvokeModel (and nothing else) on the specific model ARN.
- The IAM credentials at ~/.aws/credentials (default AWS SDK path; never in the project tree).
<project-config>/privacy-llm.md content:
## Currently configured LLM stack
- Claude Code (the agent running framework skills)
- AWS Bedrock at https://bedrock-runtime.eu-central-1.amazonaws.com
(model: anthropic.claude-3-5-sonnet-20241022-v2:0)
## Approved third-party endpoints (opt-in)
- AWS Bedrock — eu-central-1
- Data-residency contract: AWS DPA + Bedrock no-training default
(https://aws.amazon.com/service-terms/, section 50.4 last
reviewed YYYY-MM-DD)
- IAM principal: arn:aws:iam::<account>:role/<project>-bedrock-readonly
- Approved-by: <PMC-member-initials> <YYYY-MM-DD>
## Private mailing lists for this project
- private@<project>.apache.org
Setup steps:
- Verify the region’s data-residency contract matches your project’s expectations for foundation private data. Document the link in the Data-residency contract line above.
- Verify Bedrock has Model invocation logging disabled (or that any logging destination is inside the same compliance boundary). The default is disabled.
- Provision the IAM role; place credentials at ~/.aws/credentials.
- Place the file at <project-config>/privacy-llm.md with the Approved-by line filled in by a PMC member of the security team. Commit.
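For the IAM prerequisite, a least-privilege policy granting only bedrock:InvokeModel on one model could be generated as in this sketch. The helper name `invoke_only_policy` is hypothetical, and the region and model ID are taken from the example config above; adjust both to your project and have your security team review the result.

```python
import json


def invoke_only_policy(region: str, model_id: str) -> dict:
    """Build an IAM policy document allowing InvokeModel on a single model."""
    # Foundation-model ARNs carry no account ID, hence the empty field.
    arn = f"arn:aws:bedrock:{region}::foundation-model/{model_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "bedrock:InvokeModel",
                "Resource": arn,
            }
        ],
    }


policy = invoke_only_policy(
    "eu-central-1", "anthropic.claude-3-5-sonnet-20241022-v2:0"
)
print(json.dumps(policy, indent=2))
```

The printed JSON can be attached to the role named in the IAM principal line; no other actions or resources are granted.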
Variant 6 — Direct Anthropic API (opt-in)
Opt-in. Direct calls to the Anthropic API outside of Claude Code (e.g. for a delegated-summarisation hop) require a contract covering data-processing for ASF private data — typically a zero-data-retention agreement plus a no-training clause.
Prerequisites:
- An Anthropic account with a zero-data-retention agreement applied to the API key.
- The API key at ~/.config/apache-steward/anthropic-api.json or via $ANTHROPIC_API_KEY set from a home-dir-sourced shell-rc, never in the project tree.
<project-config>/privacy-llm.md content:
## Currently configured LLM stack
- Claude Code (the agent running framework skills)
- Direct Anthropic API at https://api.anthropic.com/v1/
(model: claude-3-5-sonnet-20241022)
## Approved third-party endpoints (opt-in)
- Anthropic API direct
- Data-residency contract: ZDR + no-training agreement applied
to API key xxxxxx-… (Anthropic console → Privacy → ZDR
confirmed YYYY-MM-DD)
- Approved-by: <PMC-member-initials> <YYYY-MM-DD>
## Private mailing lists for this project
- private@<project>.apache.org
The Approved-by line is required because Direct-Anthropic is
opt-in. A <project-config>/privacy-llm.md that lists this
endpoint without the Approved-by line will be flagged by the
gate as incomplete.
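The completeness rule for opt-in entries can be sketched as follows. This assumes each endpoint is a top-level bullet with its attributes as indented bullets beneath it, as in the config examples; the function name and the exact matching rules are hypothetical.

```python
import re

# An indented attribute bullet of the form '  - Approved-by: <non-empty value>'.
APPROVED_BY = re.compile(r"^\s+-\s+Approved-by:\s*\S.*$")


def endpoints_missing_approval(section_lines: list[str]) -> list[str]:
    """Return opt-in endpoints that have no filled-in Approved-by line."""
    missing: list[str] = []
    current = None      # endpoint currently being scanned
    approved = False    # whether it has an Approved-by attribute
    for line in section_lines:
        if line.startswith("- "):                 # a new top-level endpoint
            if current is not None and not approved:
                missing.append(current)
            current, approved = line[2:].strip(), False
        elif APPROVED_BY.match(line):
            approved = True
    if current is not None and not approved:
        missing.append(current)
    return missing
```

A section containing only the "(none ...)" note yields an empty list, since that line is not a bullet.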
Verifying the setup
Once <project-config>/privacy-llm.md is in place:
- Run /setup-isolated-setup-verify to confirm the underlying secure-agent setup is unchanged.

- (PR-3) Run the privacy-llm-specific check:

      uv run --project <framework>/tools/privacy-llm/redactor \
        privacy-llm-check --reads-private-list

  Returns exit code 0 if the active stack is fully approved.

- Sanity-check the redactor end-to-end. The third party in this example is Other Researcher (someone the reporter mentions in their report; the reporter’s own name would NOT be passed to --field):

      echo "I worked with Other Researcher (other@example.com) on this finding" | \
        uv run --project <framework>/tools/privacy-llm/redactor \
        pii-redact \
        --field name:"Other Researcher" \
        --field email:"other@example.com"

  Output should replace the two values with N-… and E-… identifiers.

- List the resulting map:

      uv run --project <framework>/tools/privacy-llm/redactor pii-list
Updating after a framework version bump
The registry of default-approved entries can change between
framework versions (e.g. ASF Legal ratifies a list, or a previously-
default-approved class is narrowed). After running
/setup-steward upgrade, re-run the verification checks above. If
an entry that was previously default-approved is now opt-in, the
gate will surface the gap and the adopter follows the recipe for
the matching variant above.
Status — provisional pending ASF Legal
This document and the registry it points at are provisional: they reflect the framework maintainer’s current working position in the absence of a ratified ASF Legal Affairs / Privacy policy for AI-assisted handling of foundation private data. When such a policy lands, the registry will be updated to point at it as source-of-truth, and the variants above will be re-checked against it.
If you are a PMC member or ASF Legal Affairs reviewer reading
this and want to formalise the list: open an issue on
apache/airflow-steward
referencing this file. The framework will track ratification as
a project memory and bump the registry version once the ratified
list lands.