Position Paper · Health

Health Data Sovereignty in the Gulf: A Governance Framework for On-Premise Clinical AI

Toward a three-class model that separates model-training data, inference data, and audit data under distinct governance regimes.

Authors

Ameen Altajer - Chief Executive Officer, INFINITEWARE

November 8, 2025


Abstract

This position paper proposes a governance framework for on-premise clinical AI in Gulf Cooperation Council hospitals. We argue that current discussions of health data sovereignty conflate three distinct data classes: model-training data, inference-time data, and audit data. Each carries different regulatory, contractual, and technical obligations. Separating these classes and specifying different governance regimes for each unlocks deployment patterns that a monolithic sovereignty stance forecloses, while preserving the substantive protections that the monolithic stance seeks. We outline the framework, describe the trade-offs it enables, and identify the regulatory questions the framework highlights rather than resolves.

1. Introduction

Health data sovereignty in the Gulf is usually discussed as a single question. Does patient-identifying data leave the hospital's network, or does it not? The question is treated as binary and the debate proceeds as if that were the only decision the hospital has to make about the data. In practice, sovereign hosting for clinical AI requires distinguishing between three data classes, each with different regulatory and technical obligations, and each with different consequences if handled poorly. Collapsing them into a single sovereignty question makes the technical implementation harder than it needs to be, and it forecloses deployment patterns that would otherwise be safely available.

This paper proposes a three-class framework for health data sovereignty in the GCC context, describes the governance regime appropriate to each class, and outlines what the framework enables and forecloses. It is written for hospital administrators, clinical informatics leaders, and the regulators developing AI governance in the region. It is not a legal opinion.

2. Three data classes

Clinical AI systems handle three functionally distinct data classes. The distinction is not primarily about what the data is; it is about what it is being used for.

Model-training data. The corpus used to train or fine-tune the language and speech models the system relies on. This corpus is often assembled once, or updated on a slow cadence, and is typically de-identified before use. The model itself is a derivative artefact of this data.
Inference data. The patient-identifying data that flows through the model at the point of care, whether audio from a consultation, a physician's dictation, or the structured record fields the AI is completing. This data is generated at each encounter and is used to produce an immediate clinical artefact.
Audit data. The record of what the system did with the inference data, what it produced, who reviewed it, and what edits were made. This is the trail that would be needed to answer a regulator's question after the fact, or to reconstruct a decision in the event of a clinical adverse event.

In the monolithic sovereignty debate, all three are collapsed into one question and answered together. In the framework we propose, each has its own governance regime, with different residency, retention, and access rules.
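As an illustration only, the three classes can be written down as a small policy table keyed by data class. The sketch below is a minimal one under stated assumptions: the names `DataClass`, `Regime`, `REGIMES`, and `can_read`, the role labels, and the residency flags are all hypothetical, and retention is left out because this paper does not propose retention periods.

```python
from dataclasses import dataclass
from enum import Enum


class DataClass(Enum):
    TRAINING = "model-training"
    INFERENCE = "inference"
    AUDIT = "audit"


@dataclass(frozen=True)
class Regime:
    instrument: str          # governance instrument named in this paper
    may_leave_network: bool  # residency rule (illustrative value)
    readers: frozenset       # roles allowed to read (hypothetical labels)


# Illustrative values only; real rules depend on the institution and regulator.
# Training data may leave only once de-identified and consented.
REGIMES = {
    DataClass.TRAINING: Regime("compact", True, frozenset({"compact_member"})),
    DataClass.INFERENCE: Regime("perimeter", False, frozenset({"treating_clinician"})),
    DataClass.AUDIT: Regime(
        "ledger", False, frozenset({"regulator", "compliance_officer", "physician"})
    ),
}


def can_read(role: str, data_class: DataClass) -> bool:
    """Return True if the role is listed as a reader for the data class."""
    return role in REGIMES[data_class].readers
```

The point of the table is that an access question is always asked against a specific class: under these assumed values a regulator can read audit data but not inference data.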

Figure: The three data classes and the governance regime appropriate to each. Model-training data is governed by a compact, inference data by a perimeter, and audit data by a ledger.

3. Model-training data: the compact

Model-training data is the class where cross-border discussion is most productive. Training corpora that are properly de-identified and consented can, and often should, be shared across institutions to produce models that no single institution could build alone. The relevant governance instrument is a compact between contributing institutions, specifying de-identification standards, consent flow, permitted derivative uses, and revocation. The regulatory question is not whether the data leaves the hospital but whether the de-identification standard is defensible and whether the consent flow is intelligible.

In the GCC, we have observed that institutions treat training-data sharing as if it were the same question as inference-data sharing. It is not. A hospital that would never permit patient-identifying data to leave its network can, and often should, contribute to a de-identified training compact whose derived model is then hosted inside its own network for inference. Collapsing the two questions denies the institution both benefits.
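The compact's terms can be sketched as an eligibility check applied before any record joins the shared corpus. This is a minimal sketch, not a proposed standard: `TrainingRecord`, its fields, and both function names are hypothetical, and the `deidentified` flag stands in for whatever de-identification standard the compact actually specifies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingRecord:
    record_id: str
    deidentified: bool     # passed the compact's de-identification standard
    consent_given: bool    # patient consented to training use
    consent_revoked: bool  # patient later withdrew consent


def eligible_for_compact(record: TrainingRecord) -> bool:
    """A record may be contributed only if it is de-identified and
    covered by consent that has not been revoked."""
    return record.deidentified and record.consent_given and not record.consent_revoked


def select_contribution(records):
    """Filter a batch down to the records that may join the shared corpus."""
    return [r for r in records if eligible_for_compact(r)]
```

Note that nothing in this check refers to the hospital's network boundary. That is the separation argued for above: the training question is answered by the compact's terms, not by the inference perimeter.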

4. Inference data: the perimeter

Inference data is the class where the sovereignty debate correctly lives. Patient-identifying data at the point of care must not leave the hospital's network under current GCC regulatory conditions, and in our observation it should not even leave the specific data zone within the hospital that is authorised for it. The governance instrument is a perimeter: a network boundary, an authentication regime, and an access log.

The design implication is that the model must be hosted inside the perimeter. This is a solved engineering problem for the current generation of models suitable for clinical documentation, but not yet for the very largest frontier models. A hospital that wants to use the largest available models for inference will find that its sovereignty position is currently incompatible with those models. The remedy is not to weaken the sovereignty position. It is to accept that the largest models are not currently in scope for inference, and to plan the deployment against models that fit inside the perimeter.
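The three parts of the perimeter (network boundary, authentication, access log) can be sketched as a single admission check in front of the in-perimeter model. This is a minimal sketch under assumptions: the address range `10.20.0.0/24`, the `admit` function, and the in-memory `access_log` are hypothetical stand-ins for a hospital's real network zoning, identity system, and durable logging.

```python
import ipaddress
from datetime import datetime, timezone

# Hypothetical address range for the authorised data zone.
AUTHORISED_ZONE = ipaddress.ip_network("10.20.0.0/24")

access_log = []  # in-memory stand-in for a durable access log


def admit(source_ip: str, user: str, authenticated: bool) -> bool:
    """Admit a request only if it is authenticated and originates inside the zone.

    Every attempt is logged, whether admitted or refused.
    """
    inside = ipaddress.ip_address(source_ip) in AUTHORISED_ZONE
    allowed = inside and authenticated
    access_log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "source_ip": source_ip,
        "allowed": allowed,
    })
    return allowed
```

Refused attempts are logged as well as admitted ones, since a perimeter whose log records only successes cannot show that the boundary was ever tested.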

5. Audit data: the ledger

Audit data is the class that governance discussions often forget. It is generated by the AI system itself in the course of doing its work, and it is what makes the AI's decisions inspectable after the fact. The governance instrument is a ledger: an immutable, timestamped, cryptographically anchored record of what the AI produced, what the physician did with it, and what the record shows now.
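One common way to approximate such a ledger is a hash chain, in which each entry stores the hash of its predecessor. The sketch below assumes that technique; this paper does not prescribe it, and the function names and entry fields are hypothetical. A hash chain is tamper-evident rather than strictly immutable: it does not prevent edits, but it makes them detectable on verification.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS = "0" * 64  # placeholder hash for the first entry's predecessor


def _digest(body: dict) -> str:
    # Canonical JSON so the same content always hashes to the same value.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_entry(ledger: list, event: dict) -> dict:
    """Append an event, linking it to the hash of the previous entry."""
    body = {
        "time": datetime.now(timezone.utc).isoformat(),
        "event": event,
        "prev_hash": ledger[-1]["hash"] if ledger else GENESIS,
    }
    entry = {**body, "hash": _digest(body)}
    ledger.append(entry)
    return entry


def verify(ledger: list) -> bool:
    """Recompute every hash and link; any edited entry breaks the chain."""
    prev = GENESIS
    for entry in ledger:
        body = {k: entry[k] for k in ("time", "event", "prev_hash")}
        if entry["prev_hash"] != prev or entry["hash"] != _digest(body):
            return False
        prev = entry["hash"]
    return True
```

Appending a "draft generated" event followed by a "physician edit" event yields a chain that `verify` accepts. Altering the first event afterwards causes `verify` to return `False`, because the stored hash no longer matches the content.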

The audit data should live inside the perimeter, but its access regime differs from that of the inference data. Regulators, hospital compliance officers, and physicians themselves have legitimate reasons to inspect audit data at a cadence that inference data would not tolerate. Building the ledger as a distinct class rather than a subset of inference data makes those inspections tractable without expanding the inference-data access surface.

6. Deployment patterns the framework enables

Separating the three classes and specifying different governance for each opens deployment patterns that a monolithic sovereignty position forecloses.

Cross-institutional model training with in-institution inference. Hospitals contribute to a de-identified training compact whose derived model is then hosted inside each contributing institution for inference. Each institution gets a better model than it could produce alone, without loosening its inference-time perimeter.
Federated audit inspection with regulator access. Regulators inspect a hospital's audit ledger without accessing the underlying inference data or patient identifiers. This is technically straightforward if the ledger is designed as a class distinct from inference data, and technically hard if it is not.
Time-limited external clinical support. External clinical consultants review audit data (not inference data) to support a specific case, under a documented and revocable access grant. The consultant sees the AI's decision trail without seeing the patient's identifying record.
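The third pattern, a documented and revocable grant over audit data, can be sketched as follows. This is a minimal sketch under assumptions: `AccessGrant`, `audit_view`, and the field names in `IDENTIFYING_FIELDS` are hypothetical, and a well-designed ledger would ideally hold no patient identifiers at all, so the field filter here is a defensive second layer rather than the primary control.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class AccessGrant:
    grantee: str
    case_id: str
    expires_at: datetime
    revoked: bool = False

    def is_active(self, now: datetime) -> bool:
        """A grant is usable only before expiry and only while not revoked."""
        return not self.revoked and now < self.expires_at


# Hypothetical field names; identifiers stay out of the consultant's view.
IDENTIFYING_FIELDS = {"patient_id", "patient_name"}


def audit_view(entries, grant: AccessGrant, now: datetime):
    """Return audit entries for the granted case with identifying fields removed."""
    if not grant.is_active(now):
        raise PermissionError("grant expired or revoked")
    return [
        {k: v for k, v in e.items() if k not in IDENTIFYING_FIELDS}
        for e in entries
        if e["case_id"] == grant.case_id
    ]
```

Under these assumptions a consultant holding a two-day grant for one case sees only that case's decision trail, and the same call raises `PermissionError` once the grant is revoked or has expired.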

7. Regulatory dependencies

The framework proposed here has regulatory dependencies that are in active development across the GCC. The three-class distinction is easier to make in a regulatory environment that already recognises the distinction, and harder to make in one that treats all clinical data as a single protected class. Bahrain, the United Arab Emirates, and Saudi Arabia are at different points on this axis at the time of writing, and the position is not settled in any of them.

We flag this as a dependency rather than a resolution. Institutions adopting the framework should expect the regulatory regime around them to catch up in stages, and should design for that trajectory rather than the current position. The three-class distinction is more useful as a design principle than as a compliance claim.

8. Conclusion

Health data sovereignty in the Gulf is best discussed as three questions, not one. Model-training data belongs to a governance regime built around compacts and de-identification. Inference data belongs to a perimeter regime built around network boundaries and access authentication. Audit data belongs to a ledger regime built around immutability and inspectability. The three regimes are related but not identical. Separating them unlocks deployment patterns that the monolithic sovereignty stance forecloses, and it does so without weakening the protections that stance is trying to secure. We propose that GCC institutions building clinical AI adopt this three-class distinction as a design principle and press regulators to acknowledge the distinction in the AI governance frameworks now under development.

Keywords

Data Sovereignty, Clinical AI Governance, GCC Healthcare, On-Premise Deployment, Audit Ledger, PHI Protection, Regulatory Framework
