
Key takeaways
Most healthcare AI development teams understand that patient data requires HIPAA protection. The specific rules that apply when that data enters an LLM prompt are less universally understood, and the gaps in that understanding are where compliance failures occur in production systems.
The core question is precise: when a developer sends a system prompt or RAG context that includes patient information to an LLM API, is that a HIPAA-regulated event? The answer is yes, if the information meets the definition of PHI. The LLM API provider receives that information, processes it, and potentially stores it, which makes the provider a business associate under HIPAA. A BAA therefore has to be in place before any PHI is sent, and a zero-retention arrangement on its own does not replace one.
This post maps the specific regulatory rules that apply to each data handling scenario in LLM-based healthcare applications, documents which cloud AI platforms provide HIPAA BAAs for LLM workloads, explains both de-identification methods available under HIPAA, and identifies the operational risks that most commonly lead to impermissible disclosures in otherwise well-designed systems.
Building an LLM application for a healthcare client and need to scope the HIPAA architecture? WebOsmotic designs HIPAA-compliant AI systems for healthcare teams. We scope the BAA requirements, design the de-identification pipeline, architect the data access layer, and build the audit logging required for compliance, as first-class deliverables, not afterthoughts.
The HIPAA Privacy Rule applies to protected health information regardless of the medium in which it exists. A system prompt containing a patient’s name, diagnosis, and medication list is PHI. The transmission of that prompt to an LLM API endpoint is a disclosure of PHI. If no BAA exists between the covered entity and the LLM provider, that disclosure is impermissible.
HHS’s cloud computing guidance establishes the framework directly: when a covered entity or business associate engages a cloud service provider (CSP) to create, receive, maintain, or transmit ePHI, the CSP is a business associate, even if it processes only encrypted ePHI and lacks the encryption key. This principle extends to LLM API providers. An LLM provider that receives a prompt containing PHI receives and processes ePHI on behalf of the covered entity, so the same business associate analysis applies.
HHS’s de-identification guidance defines two methods for achieving de-identification under the Privacy Rule. Both methods, if correctly applied, produce information that is no longer PHI and therefore not subject to HIPAA’s use and disclosure restrictions.
Safe Harbor requires the removal of 18 specific identifier categories from the health information. If all 18 identifier types are removed and the covered entity has no actual knowledge that the remaining information could be used to identify an individual, the information is de-identified.
Expert Determination requires a qualified expert to apply statistical or scientific principles to determine that the risk of identifying an individual from the de-identified data is very small. The expert must document the methods and results of the analysis.
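As a rough illustration of how Safe Harbor stripping can run as a preprocessing step, the sketch below redacts a handful of pattern-based identifiers before text is placed in a prompt. The pattern set, the `redact` function, and the placeholder labels are assumptions made for this example. It covers only a few of the 18 categories; names, addresses, and free-text identifiers generally need dictionary or NER-based detection on top of it.

```python
import re

# Illustrative patterns for a few Safe Harbor identifier categories.
# Assumption: this is NOT a complete Safe Harbor implementation.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "MRN": re.compile(r"\bMRN[:#\s]*\d{6,10}\b", re.IGNORECASE),
    # Full dates such as 03/14/2021; only the year is kept.
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/(\d{4})\b"),
}


def redact(text: str) -> str:
    """Replace matched identifiers with placeholders; reduce dates to the year."""
    for label, pattern in PATTERNS.items():
        if label == "DATE":
            text = pattern.sub(r"\1", text)
        else:
            text = pattern.sub(f"[{label}]", text)
    return text


if __name__ == "__main__":
    note = "Pt MRN: 00123456, SSN 123-45-6789, seen 03/14/2021."
    print(redact(note))
```

Keeping the patterns in one table makes it easier to review coverage against the full identifier list and to add categories as they are implemented.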
| Platform | HIPAA BAA available | Scope of coverage | Key condition |
| --- | --- | --- | --- |
| AWS Bedrock | Yes | HIPAA-eligible for LLM inference workloads. Customer data is not stored or used to train foundation models | BAA must be executed via Enterprise Agreement or equivalent AWS licensed agreement before PHI workloads are deployed |
| Microsoft Azure OpenAI | Yes (text-based inputs) | Covered under Microsoft’s BAA through the Data Protection Addendum for text-based AI inputs | Computer Vision and Face API are not HIPAA-eligible by default. Do not use these with PHI unless explicitly approved. Verify the Microsoft DPA is in place for your licensing model |
| Vertex AI (Google Cloud) | Yes | Vertex AI Agent Engine supports HIPAA workloads. Customer data is not used to train Gemini models | BAA available through Google Cloud. Verify scope covers the specific Vertex AI services in your workload |
| Anthropic Claude (direct API) | Available for enterprise | Anthropic offers HIPAA BAAs for enterprise customers. Confirm current availability and scope directly with Anthropic before deploying PHI workloads | Zero Data Retention option available. Do not use consumer or developer tier APIs for PHI without confirming current BAA coverage |
| OpenAI API | Available for enterprise | HIPAA BAA available for enterprise customers with Zero Data Retention. Consumer ChatGPT does not carry HIPAA coverage | Confirm current enterprise BAA coverage directly with OpenAI. Developer tier accounts typically do not include BAA coverage |
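Because every row in the table carries a "verify before deploying" condition, one option is to record the outcome of that verification in configuration and have the application refuse PHI workloads until it is recorded. The sketch below assumes a hypothetical in-code registry; the keys, service names, and `baa_executed` values are placeholders for a team's own verification results, not statements about any vendor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderRecord:
    name: str
    baa_executed: bool            # set only after a signed BAA is confirmed
    covered_services: frozenset   # services confirmed to be in the BAA scope


class BaaNotVerified(Exception):
    """Raised when a PHI workload targets a provider or service without verified coverage."""


# Hypothetical registry: values reflect a team's own review, not vendor facts.
REGISTRY = {
    "bedrock": ProviderRecord("AWS Bedrock", True, frozenset({"text-inference"})),
    "azure-openai": ProviderRecord("Azure OpenAI", True, frozenset({"text-inference"})),
    "dev-sandbox": ProviderRecord("Developer-tier API", False, frozenset()),
}


def assert_phi_allowed(provider_key: str, service: str) -> ProviderRecord:
    """Return the provider record, or raise if BAA coverage is not recorded."""
    record = REGISTRY.get(provider_key)
    if record is None or not record.baa_executed:
        raise BaaNotVerified(f"No executed BAA recorded for {provider_key!r}")
    if service not in record.covered_services:
        raise BaaNotVerified(f"{service!r} is outside the recorded BAA scope for {record.name}")
    return record
```

Calling `assert_phi_allowed` at client construction time, rather than per request, keeps the check out of the hot path while still failing fast on a misconfigured deployment.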
The safe handoff problem is the most common HIPAA violation in well-designed LLM healthcare systems: a BAA-covered LLM makes tool calls to external APIs, such as web search or drug databases, that pass PHI to services not covered by any BAA. The original vendor’s BAA covers only its own endpoint.
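One way to guard against this is to route every tool call through a single dispatcher that checks the destination host and de-identifies the query when the host is outside the BAA-covered set. The sketch below is a minimal version under stated assumptions: `BAA_COVERED_HOSTS`, `dispatch_tool_call`, and the `send` and `deidentify` callables are hypothetical names, and the host shown is a placeholder.

```python
from typing import Callable
from urllib.parse import urlparse

# Hypothetical set of hosts covered by an executed BAA.
BAA_COVERED_HOSTS = {"llm.internal.example"}


def dispatch_tool_call(
    url: str,
    query: str,
    send: Callable[[str, str], str],
    deidentify: Callable[[str], str],
) -> str:
    """Send a tool call, de-identifying the query first if the host is not BAA-covered."""
    host = urlparse(url).hostname
    if host not in BAA_COVERED_HOSTS:
        # The query is about to leave the BAA-covered environment.
        query = deidentify(query)
    return send(url, query)
```

Passing `send` and `deidentify` in as arguments keeps the gate independent of any particular HTTP client or redaction method, and makes it straightforward to test with stubs.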
HIPAA-compliant LLM development is an infrastructure architecture problem, not an LLM engineering problem. The required components go beyond the model itself.
WebOsmotic’s healthcare AI development practice treats HIPAA compliance architecture as a first-class deliverable. For clients building LLM systems on AWS Bedrock, Azure AI, or Vertex AI, the BAA verification, de-identification pipeline, audit logging, and minimum necessary enforcement are all designed and built before the LLM application layer is implemented.
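To make the audit logging component more concrete, the sketch below shows one common technique: a hash-chained log in which each entry commits to the previous one, so later edits are detectable. This is an illustrative assumption, not a description of WebOsmotic's implementation. The `AuditLog` class and its field names are hypothetical, and a production system would additionally need write-once storage, since an in-memory chain is tamper-evident rather than immutable.

```python
import hashlib
import json
import time

GENESIS = "0" * 64  # placeholder hash for the first entry's predecessor


def _digest(body: dict) -> str:
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


class AuditLog:
    """Append-only, hash-chained record of PHI-involving inference calls."""

    def __init__(self) -> None:
        self._entries = []

    def append(self, actor: str, action: str, resource_id: str, timestamp=None) -> dict:
        # Store an opaque resource ID rather than PHI itself.
        body = {
            "actor": actor,
            "action": action,
            "resource_id": resource_id,
            "timestamp": time.time() if timestamp is None else timestamp,
            "prev_hash": self._entries[-1]["hash"] if self._entries else GENESIS,
        }
        entry = {**body, "hash": _digest(body)}
        self._entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute the chain and report whether every entry is intact."""
        prev = GENESIS
        for entry in self._entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev_hash"] != prev or _digest(body) != entry["hash"]:
                return False
            prev = entry["hash"]
        return True
```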
Does sending PHI to an LLM API require a HIPAA Business Associate Agreement?
Yes, if the prompt or context contains information that meets the definition of protected health information. HHS’s cloud computing guidance establishes that any cloud service provider that creates, receives, maintains, or transmits ePHI on behalf of a covered entity is a business associate, even if it processes only encrypted ePHI and lacks the encryption key. Sending PHI to an LLM API makes the provider a business associate. Without a BAA, the disclosure is impermissible under the Privacy Rule.
How do I de-identify patient data before sending it to an LLM?
HHS’s de-identification guidance provides two methods: Safe Harbor and Expert Determination. The Safe Harbor method requires removing 18 identifier categories, including names, geographic identifiers more specific than state, dates other than year, phone numbers, email addresses, Social Security numbers, medical record numbers, and any other unique identifying code or characteristic. If all 18 identifier types are removed and the covered entity has no actual knowledge that the remaining information could be used to identify an individual, the data is no longer PHI and is not subject to HIPAA’s use and disclosure restrictions. Expert Determination instead relies on a qualified expert who documents that the risk of re-identification is very small. For most LLM development workflows, Safe Harbor is operationally simpler than Expert Determination because its requirements are enumerable and implementable as a preprocessing step.
Is AWS Bedrock HIPAA compliant for LLM workloads?
Yes, once a BAA is in place. AWS documents Bedrock as HIPAA-eligible and states that customer data is not stored or used to train foundation models. A BAA must be executed before PHI workloads are deployed, through an Enterprise Agreement or equivalent AWS licensed agreement. AWS makes the BAA available for review and acceptance through AWS Artifact. Verify that the specific Bedrock model variants and services in your workload are covered by the current BAA scope before deploying PHI workloads.
Is Microsoft Azure OpenAI covered under a HIPAA BAA?
Yes, for text-based inputs. Azure AI services are covered under Microsoft’s Business Associate Agreement through the Data Protection Addendum, which is automatically included for customers under valid licensing models including Enterprise Agreements. Azure OpenAI is HIPAA-eligible for text-based inputs. Computer Vision and Face API are not HIPAA-eligible by default and should not be used with PHI unless explicitly approved. Confirm your licensing includes the DPA.
What is the safe handoff problem in healthcare LLM systems?
The safe handoff problem occurs when a BAA-covered LLM system makes tool calls to external APIs that are not covered by the same BAA. The original LLM vendor’s BAA covers traffic to that vendor’s endpoint. Tool calls to web search APIs, research literature databases, drug interaction services, or any other external endpoint are separate disclosures. If those tool calls pass PHI from the LLM’s context window, each call is a potential impermissible disclosure. The solution is a de-identification layer in the tool call handler that strips Safe Harbor identifiers before any query exits the BAA-covered environment.
How does WebOsmotic build HIPAA-compliant LLM applications?
WebOsmotic treats HIPAA compliance as an architecture requirement, not a post-deployment checklist. Every healthcare LLM engagement includes: scoping which cloud AI services require BAA verification and confirming those BAAs are in place; designing the de-identification pipeline that applies Safe Harbor stripping before data leaves the BAA-covered environment; building the tool call handler with BAA verification and automatic de-identification for external API calls; implementing immutable audit logging for every PHI-involving inference call; and enforcing the minimum necessary standard at the data access layer. These components are designed and built before the LLM application layer is implemented.
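As an illustration of what minimum necessary enforcement at the data access layer can look like, the sketch below filters a patient record down to the fields allowed for a declared purpose before anything is placed in a prompt. The purposes, field names, and `minimum_necessary` function are hypothetical examples, not WebOsmotic's actual policy or code.

```python
# Hypothetical mapping from declared purpose to the fields that purpose may read.
PURPOSE_FIELDS = {
    "medication_review": {"medications", "allergies", "diagnoses"},
    "appointment_reminder": {"appointment_time", "clinic"},
}


def minimum_necessary(record: dict, purpose: str) -> dict:
    """Return only the fields permitted for the purpose; reject unknown purposes."""
    allowed = PURPOSE_FIELDS.get(purpose)
    if allowed is None:
        raise PermissionError(f"No field policy defined for purpose {purpose!r}")
    return {key: value for key, value in record.items() if key in allowed}
```

Rejecting undeclared purposes by default means a new feature cannot read patient fields until someone has written down which fields it needs.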