PHI in Your LLM Context Window: What HIPAA Actually Says

Key takeaways

  • HIPAA’s Privacy Rule applies to protected health information regardless of format: paper, electronic, and now AI system prompts. Any AI vendor that creates, receives, maintains, or transmits PHI on behalf of a covered entity is a business associate and must sign a BAA, per HHS guidance. This includes LLM API providers if the prompts they receive contain PHI.
  • De-identified health information is not PHI. HHS’s de-identification guidance defines two methods for achieving de-identification: Expert Determination and Safe Harbor. The Safe Harbor method requires removing 18 specific identifier types including names, dates (except the year), and geographic identifiers more specific than state.
  • AWS Bedrock documents HIPAA eligibility and confirms that customer data is never stored or used to train foundation models. HIPAA-eligible Bedrock workloads require a BAA with AWS executed via the Enterprise Agreement or equivalent licensed agreement.
  • Microsoft Azure AI services are covered under Microsoft’s BAA through the Data Protection Addendum. Azure OpenAI is HIPAA-eligible for text-based inputs, but Computer Vision and Face API are not eligible by default and should not be used with PHI.
  • Vertex AI Agent Engine on Google Cloud supports HIPAA workloads with a BAA available through Google Cloud. Customer data is not used to train Gemini models on Vertex AI.
  • The critical operational risk in LLM healthcare applications is the ‘safe handoff’ problem: BAA-covered LLM systems sometimes call external tools including web search APIs that are not covered by the same BAA. Any tool call that passes PHI to an uncovered service constitutes an impermissible disclosure.

 

Most healthcare AI development teams understand that patient data requires HIPAA protection. The specific rules that apply when that data enters an LLM prompt are less universally understood, and the gaps in that understanding are where compliance failures occur in production systems.

The core question is precise: when a developer sends a system prompt or RAG context that includes patient information to an LLM API, is that a HIPAA-regulated event? The answer is yes, if the information meets the definition of PHI. The LLM API provider receives that information, processes it, and potentially stores it, which makes the provider a business associate under HIPAA. If no BAA is in place, the transmission is an impermissible disclosure.

This post maps the specific regulatory rules that apply to each data handling scenario in LLM-based healthcare applications, documents which cloud AI platforms provide HIPAA BAAs for LLM workloads, explains both de-identification methods available under HIPAA, and identifies the operational risks that most commonly lead to impermissible disclosures in otherwise well-designed systems.

 

Building an LLM application for a healthcare client and need to scope the HIPAA architecture?

WebOsmotic designs HIPAA-compliant AI systems for healthcare teams. We scope the BAA requirements, design the de-identification pipeline, architect the data access layer, and build the audit logging required for compliance, as first-class deliverables, not afterthoughts.

→  Talk to our healthcare AI team

 

When PHI in an LLM prompt becomes a HIPAA event

The HIPAA Privacy Rule applies to protected health information regardless of the medium in which it exists. A system prompt containing a patient’s name, diagnosis, and medication list is PHI. The transmission of that prompt to an LLM API endpoint is a disclosure of PHI. If no BAA exists between the covered entity and the LLM provider, that disclosure is impermissible.

HHS’s cloud computing guidance establishes the framework directly: when a covered entity or business associate engages a cloud service provider to create, receive, maintain, or transmit ePHI, the CSP is a business associate, even if it processes only encrypted PHI and lacks the encryption key. This principle extends to LLM API providers. A provider whose model receives a prompt containing PHI receives and processes ePHI on behalf of the covered entity.

  • Scenario 1: LLM prompt contains PHI and no BAA exists with the provider. This is an impermissible disclosure. It does not matter whether the provider is a major cloud vendor or a small LLM startup. Without a BAA, sending PHI to that provider violates HIPAA’s Privacy Rule.
  • Scenario 2: LLM prompt contains PHI and a valid BAA exists with the provider. The disclosure is permissible for healthcare operations, treatment, or payment purposes, subject to the minimum necessary standard. The provider’s data handling under the BAA is governed by HIPAA’s business associate provisions.
  • Scenario 3: LLM prompt contains de-identified data only. De-identified health information is not PHI under HIPAA. There are no HIPAA restrictions on the use, disclosure, or processing of properly de-identified data. The prompt can be sent to any provider without a BAA, as long as the de-identification was performed correctly.
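
The three scenarios reduce to two questions: does the prompt contain PHI, and is a BAA in place with the provider? The sketch below encodes that decision as a small function. The names `Disclosure` and `classify_prompt` are hypothetical, and the function assumes that some upstream process has already determined whether the prompt contains PHI; it illustrates the logic and is not a compliance determination.

```python
from enum import Enum


class Disclosure(Enum):
    NOT_PHI = "de-identified input: not PHI under HIPAA"
    PERMISSIBLE = "PHI with a BAA: permissible, subject to minimum necessary"
    IMPERMISSIBLE = "PHI without a BAA: impermissible disclosure"


def classify_prompt(contains_phi: bool, baa_in_place: bool) -> Disclosure:
    """Map a prompt to one of the three scenarios described above."""
    if not contains_phi:
        # Scenario 3: properly de-identified data is not PHI.
        return Disclosure.NOT_PHI
    if baa_in_place:
        # Scenario 2: PHI sent to a provider covered by a valid BAA.
        return Disclosure.PERMISSIBLE
    # Scenario 1: PHI sent to a provider with no BAA.
    return Disclosure.IMPERMISSIBLE


if __name__ == "__main__":
    print(classify_prompt(contains_phi=True, baa_in_place=False).value)
```

In practice the hard part is the `contains_phi` input, which depends on the de-identification methods covered in the next section.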

 

HIPAA de-identification: the two methods and how they apply to LLM inputs

HHS’s de-identification guidance defines two methods for achieving de-identification under the Privacy Rule. Both methods, if correctly applied, produce information that is no longer PHI and therefore not subject to HIPAA’s use and disclosure restrictions.

Method 1: Safe Harbor

Safe Harbor requires the removal of 18 specific identifier categories from the health information. If all 18 identifier types are removed and the covered entity has no actual knowledge that the remaining information could be used to identify an individual, the information is de-identified.

  • The 18 identifier types include: names, dates (except year), phone numbers, email addresses, Social Security numbers, medical record numbers, geographic identifiers more specific than state, IP addresses, device identifiers, biometric identifiers, and other unique codes or characteristics
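
For structured records, part of this stripping can be expressed as a field-level transform. The following is a minimal sketch that assumes a flat dictionary with hypothetical field names (`name`, `mrn`, `zip`, and so on). It drops the direct identifier fields named above, reduces dates to the year, and keeps state as the only geographic field. It covers only some of the 18 identifier types and does not handle identifiers embedded in free-text fields, so it is a starting point, not a complete Safe Harbor implementation.

```python
from datetime import date

# Hypothetical field names for illustration; a real schema will differ.
DROPPED_FIELDS = {
    "name", "phone", "email", "ssn", "mrn", "ip_address",
    "device_id", "street_address", "city", "zip",
}


def safe_harbor_sketch(record: dict) -> dict:
    """Drop direct identifier fields and reduce dates to the year only."""
    cleaned = {}
    for key, value in record.items():
        if key in DROPPED_FIELDS:
            continue  # identifier field: remove entirely
        if isinstance(value, date):
            # Keep only the year of any date (datetime is a date subclass).
            cleaned[f"{key}_year"] = value.year
        else:
            cleaned[key] = value
    return cleaned


if __name__ == "__main__":
    patient = {
        "name": "Jane Doe",
        "dob": date(1980, 5, 17),
        "state": "TX",
        "zip": "77555",
        "diagnosis": "type 2 diabetes",
    }
    print(safe_harbor_sketch(patient))
    # {'dob_year': 1980, 'state': 'TX', 'diagnosis': 'type 2 diabetes'}
```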

Method 2: Expert Determination

Expert Determination requires a qualified expert to apply statistical or scientific principles to determine that the risk of identifying an individual from the de-identified data is very small. The expert must document the methods and results of the analysis.
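
The analysis itself is the expert’s to design, but one statistic that often appears in re-identification risk work is the size of the smallest group of records sharing the same combination of quasi-identifiers (the "k" in k-anonymity). The sketch below computes that number. It is offered only as an illustration of the kind of measurement involved, not as the Expert Determination method, and the field names are hypothetical.

```python
from collections import Counter


def smallest_group_size(records: list[dict], quasi_identifiers: list[str]) -> int:
    """Return the size of the smallest group sharing the same quasi-identifier values."""
    groups = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    # A group of size 1 means at least one record is unique on these fields.
    return min(groups.values()) if groups else 0


if __name__ == "__main__":
    rows = [
        {"birth_year": 1980, "state": "TX", "sex": "F"},
        {"birth_year": 1980, "state": "TX", "sex": "F"},
        {"birth_year": 1942, "state": "TX", "sex": "M"},
    ]
    print(smallest_group_size(rows, ["birth_year", "state", "sex"]))  # 1
```

A result of 1 means some record is unique on the chosen fields, which is the kind of finding an expert would weigh when judging whether the risk is very small.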

 

Which cloud AI platforms provide HIPAA BAAs for LLM workloads

 

| Platform | HIPAA BAA available | Scope of coverage | Key condition |
| --- | --- | --- | --- |
| AWS Bedrock | Yes | HIPAA-eligible for LLM inference workloads. Customer data is not stored or used to train foundation models | BAA must be executed via Enterprise Agreement or equivalent AWS licensed agreement before PHI workloads are deployed |
| Microsoft Azure OpenAI | Yes (text-based inputs) | Covered under Microsoft’s BAA through the Data Protection Addendum for text-based AI inputs | Computer Vision and Face API are not HIPAA-eligible by default. Do not use these with PHI unless explicitly approved. Verify the Microsoft DPA is in place for your licensing model |
| Vertex AI (Google Cloud) | Yes | Vertex AI Agent Engine supports HIPAA workloads. Customer data is not used to train Gemini models | BAA available through Google Cloud. Verify scope covers the specific Vertex AI services in your workload |
| Anthropic Claude (direct API) | Available for enterprise | Anthropic offers HIPAA BAAs for enterprise customers. Confirm current availability and scope directly with Anthropic before deploying PHI workloads | Zero Data Retention option available. Do not use consumer or developer tier APIs for PHI without confirming current BAA coverage |
| OpenAI API | Available for enterprise | HIPAA BAA available for enterprise customers with Zero Data Retention. Consumer ChatGPT does not carry HIPAA coverage | Confirm current enterprise BAA coverage directly with OpenAI. Developer tier accounts typically do not include BAA coverage |

 

The safe handoff problem: the most common HIPAA violation in LLM healthcare systems

The safe handoff problem is the most common HIPAA violation in well-designed LLM healthcare systems: a BAA-covered LLM makes tool calls to external APIs, such as web search or drug databases, that pass PHI to services not covered by any BAA. The original vendor’s BAA covers only its own endpoint.

  • The safe handoff principle: any tool call that exits the BAA-covered environment must be preceded by a de-identification step that strips PHI from the query before the external API receives it. Research published in 2026 by the University of Texas Medical Branch describes this as the point where a clinician’s PHI-containing query must be transformed into a HIPAA Safe Harbor-compliant version before leaving the protected environment.
  • Implementation: the LLM’s tool call handler must include a de-identification layer that applies Safe Harbor stripping to any query before it is sent to a non-BAA-covered external endpoint. The clinical context, the diagnostic reasoning, and the information needed can be preserved. The 18 identifier types must be removed.
  • Logging requirement: every tool call that exits the BAA environment, whether de-identified or not, should be logged with the tool name, the de-identification status, and the timestamp. This creates the audit trail that compliance reviews will examine when assessing whether impermissible disclosures occurred.
  • Agentic AI systems amplify this risk: autonomous clinical agents that call tools without explicit per-call developer authorization require per-agent identity governance and automatic de-identification before any external tool call.
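
The handler pattern described in these points can be sketched as a thin wrapper around every tool call. Everything named here is an assumption for illustration: the `BAA_COVERED_TOOLS` registry, the `call_tool` wrapper, and the regular expressions. The patterns catch only a few well-structured identifiers (SSN, phone, email, MRN-style numbers) and will not catch names or free-text identifiers, so a production system would need a far more thorough de-identification step.

```python
import re
from datetime import datetime, timezone
from typing import Callable

# Hypothetical registry of tools whose endpoints are covered by a BAA.
BAA_COVERED_TOOLS = {"internal_ehr_lookup"}

# Illustrative patterns only; they do not cover all 18 Safe Harbor identifier types.
IDENTIFIER_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),              # SSN-like
    re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),        # phone-like
    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),       # email-like
    re.compile(r"\bMRN[:\s]*\d+\b", re.IGNORECASE),    # medical record number
]


def strip_identifiers(query: str) -> str:
    """Replace every matched identifier with a placeholder."""
    for pattern in IDENTIFIER_PATTERNS:
        query = pattern.sub("[REDACTED]", query)
    return query


def call_tool(tool_name: str, query: str,
              tool_fn: Callable[[str], str], audit_log: list) -> str:
    """De-identify queries bound for non-BAA tools and log every call."""
    covered = tool_name in BAA_COVERED_TOOLS
    outbound = query if covered else strip_identifiers(query)
    # Log the tool name, de-identification status, and timestamp, not the query.
    audit_log.append({
        "tool": tool_name,
        "baa_covered": covered,
        "deidentified": not covered,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    })
    return tool_fn(outbound)


if __name__ == "__main__":
    log: list = []
    question = "Patient MRN 448812, phone 409-555-0100: metformin and lisinopril interactions?"
    print(call_tool("web_search", question, lambda q: q, log))
    print(log[0]["tool"], log[0]["deidentified"])
```

Routing every tool call through one wrapper keeps the BAA check, the stripping step, and the log entry in a single place, which is harder to bypass than per-tool logic.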

 

Architectural requirements for HIPAA-compliant LLM development

HIPAA-compliant LLM development is an infrastructure architecture problem, not an LLM engineering problem. The required components go beyond the model itself.

  • De-identification pipeline: a reusable pre-processing service that applies Safe Harbor stripping before patient data enters any LLM prompt.
  • BAA verification layer: a configuration that maps each external API endpoint the system calls to its BAA status. Before any data is sent to an external service, the system confirms that a valid BAA covers that service, or that the data has been de-identified. Failed BAA checks should be logged and should halt the operation.
  • Immutable audit logging: for every LLM inference call that involves PHI, the prompt context, the response, the user identity, the clinical purpose, and the timestamp must be logged in a tamper-evident, encrypted format. This is the evidence that compliance audits will examine.
  • Minimum necessary enforcement: the LLM system should receive only the PHI fields required for the specific clinical task being performed. The data access layer should enforce field-level minimum necessary constraints by agent role and task type.
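
Two of these components lend themselves to a short illustration: field-level minimum necessary filtering and tamper-evident logging. The sketch below assumes a hypothetical task-to-field policy (`ALLOWED_FIELDS`) and uses a SHA-256 hash chain as one common way to make a log tamper-evident. Encryption at rest and durable storage are deliberately left out, and the class and field names are illustrative.

```python
import hashlib
import json
from datetime import datetime, timezone

# Hypothetical policy: which record fields each task type may see.
ALLOWED_FIELDS = {
    "medication_review": {"medications", "allergies", "diagnoses"},
    "scheduling": {"appointment_type", "preferred_times"},
}


def minimum_necessary(record: dict, task: str) -> dict:
    """Return only the fields permitted for the task; unknown tasks get nothing."""
    allowed = ALLOWED_FIELDS.get(task, set())
    return {k: v for k, v in record.items() if k in allowed}


def _digest(body: dict) -> str:
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


class AuditLog:
    """Append-only log in which each entry commits to the previous entry's hash."""

    def __init__(self) -> None:
        self.entries: list[dict] = []

    def append(self, user: str, purpose: str, prompt: str, response: str) -> None:
        prev_hash = self.entries[-1]["hash"] if self.entries else "0" * 64
        body = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "purpose": purpose,
            "prompt": prompt,
            "response": response,
            "prev_hash": prev_hash,
        }
        self.entries.append({**body, "hash": _digest(body)})

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev_hash = "0" * 64
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if body["prev_hash"] != prev_hash or entry["hash"] != _digest(body):
                return False
            prev_hash = entry["hash"]
        return True
```

A caller would pass each record through `minimum_necessary` before building the prompt, then call `AuditLog.append` once the response returns. Running `verify` during a compliance review shows whether any logged entry was altered after the fact.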

 

WebOsmotic’s healthcare AI development practice treats HIPAA compliance architecture as a first-class deliverable. For clients building LLM systems on AWS Bedrock, Azure AI, or Vertex AI, the BAA verification, de-identification pipeline, audit logging, and minimum necessary enforcement are all designed and built before the LLM application layer is implemented.

 

Ready to build an LLM application for healthcare that is HIPAA-compliant from day one?

WebOsmotic designs and builds HIPAA-compliant AI systems for healthcare providers, health tech companies, and digital health platforms. We scope BAA requirements, build de-identification pipelines, and architect audit logging as first-class deliverables.

→  Get your HIPAA AI architecture review

 

Frequently asked questions

Does sending PHI to an LLM API require a HIPAA Business Associate Agreement?

Yes, if the prompt or context contains information that meets the definition of protected health information. HHS’s cloud computing guidance establishes that any cloud service provider that creates, receives, maintains, or transmits ePHI on behalf of a covered entity is a business associate, even if it processes only encrypted ePHI and lacks the encryption key. Sending PHI to an LLM API makes the provider a business associate. Without a BAA, the disclosure is impermissible under the Privacy Rule.

How do I de-identify patient data before sending it to an LLM?

HHS’s de-identification guidance provides two methods. The Safe Harbor method requires removing 18 identifier categories including names, geographic identifiers more specific than state, dates other than year, phone numbers, email addresses, Social Security numbers, medical record numbers, and any other unique identifying code or characteristic. Removing all 18 identifier types, with no actual knowledge that the remaining information could identify an individual, produces data that is no longer PHI and is not subject to HIPAA’s use and disclosure restrictions. For most LLM development workflows, Safe Harbor is operationally simpler than Expert Determination because its requirements are enumerable and implementable as a preprocessing step.

Is AWS Bedrock HIPAA compliant for LLM workloads?

Yes. AWS documents Bedrock as HIPAA-eligible, with customer data never stored or used to train foundation models. A BAA must be in place before PHI workloads are deployed, executed through an Enterprise Agreement or equivalent AWS licensed agreement. AWS makes its BAA available for review and acceptance through AWS Artifact. Verify that the specific Bedrock model variants and services in your workload are covered by the current BAA scope before deploying PHI workloads.

Is Microsoft Azure OpenAI covered under a HIPAA BAA?

Yes, for text-based inputs. Azure AI services are covered under Microsoft’s Business Associate Agreement through the Data Protection Addendum, which is automatically included for customers under valid licensing models including Enterprise Agreements. Azure OpenAI is HIPAA-eligible for text-based inputs. Computer Vision and Face API are not HIPAA-eligible by default and should not be used with PHI unless explicitly approved. Confirm your licensing includes the DPA.

What is the safe handoff problem in healthcare LLM systems?

The safe handoff problem occurs when a BAA-covered LLM system makes tool calls to external APIs that are not covered by the same BAA. The original LLM vendor’s BAA covers traffic to that vendor’s endpoint. Tool calls to web search APIs, research literature databases, drug interaction services, or any other external endpoint are separate disclosures. If those tool calls pass PHI from the LLM’s context window, each call is a potential impermissible disclosure. The solution is a de-identification layer in the tool call handler that strips Safe Harbor identifiers before any query exits the BAA-covered environment.

How does WebOsmotic build HIPAA-compliant LLM applications?

WebOsmotic treats HIPAA compliance as an architecture requirement, not a post-deployment checklist. Every healthcare LLM engagement includes: scoping which cloud AI services require BAA verification and confirming those BAAs are in place; designing the de-identification pipeline that applies Safe Harbor stripping before data leaves the BAA-covered environment; building the tool call handler with BAA verification and automatic de-identification for external API calls; implementing immutable audit logging for every PHI-involving inference call; and enforcing the minimum necessary standard at the data access layer. These components are designed and built before the LLM application layer is implemented.
