Microsoft offers the most mature path to in-country frontier-model inference in South Africa via Azure OpenAI Regional PTU in southafricanorth. Google Vertex AI's africa-south1 and AWS Bedrock's af-south-1 are also present in country, but for current frontier closed models neither matches Azure's Regional PTU residency guarantee. For regulated workloads that have to keep prompts and grounding data in country — banking, healthcare, public sector, FAIS-covered advisory — that asymmetry is the deciding architectural input. This sub-tree maps the strategy decision (which Copilot surface fits the workload), the implementation patterns (how a Worker actually calls gpt-4o in Joburg), and the auth model that holds it all together.
Three live nodes today: the strategy-level explainer for the in-country LLM decision, the implementation node for the headless Worker pattern, and the AI Gateway upgrade that wraps it with observability. Three more in the pipeline — the Azure AI Agent Service alternative for full-code agents, M365 Copilot deep dive on the SA M365 Geo, and the Copilot Studio BYO-model playbook.
The headless pattern — a JNB-colocated Worker calling gpt-4o in southafricanorth. Four production-grade concepts: the API-key fetch, SSE streaming, throttle handling with KV-backed idempotency, and the Entra ID OAuth client-credentials flow.
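The shape of that pattern can be sketched in a few lines. This is a minimal illustration, not the explainer's production code: the resource name (`contoso-jnb`), deployment name (`gpt-4o`), api-version, and the `chatUrl` helper are all placeholder assumptions — substitute your own southafricanorth resource.

```typescript
// Minimal sketch of the JNB-colocated Worker pattern. Resource, deployment,
// and api-version below are placeholders — not the explainer's actual values.
export interface Env {
  AZURE_OPENAI_KEY: string; // stored as a Worker secret, never in code
}

// Hypothetical helper: build the Azure OpenAI chat-completions URL for a
// southafricanorth resource.
export function chatUrl(
  resource: string,
  deployment: string,
  apiVersion = '2024-06-01',
): string {
  return `https://${resource}.openai.azure.com/openai/deployments/${deployment}` +
    `/chat/completions?api-version=${apiVersion}`;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const upstream = await fetch(chatUrl('contoso-jnb', 'gpt-4o'), {
      method: 'POST',
      headers: {
        'api-key': env.AZURE_OPENAI_KEY,
        'content-type': 'application/json',
      },
      body: JSON.stringify({
        stream: true, // SSE: tokens arrive as they are generated
        messages: [{ role: 'user', content: await request.text() }],
      }),
    });
    // Pass the SSE body straight through — the Worker adds no buffering
    return new Response(upstream.body, {
      headers: { 'content-type': 'text/event-stream' },
    });
  },
};
```

Because both the Worker and the AOAI resource sit at the JNB edge / SA North, the stream never leaves the country in transit.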
The observability and resilience upgrade. One URL change buys cache, retry, prompt logs, cost dashboards, and per-tenant rate limits — for ~5 ms latency overhead. The default for any LLM workload heading to production.
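The "one URL change" is literal: the request body, headers, and api-version stay the same, and only the base URL moves behind the gateway. A sketch, assuming Cloudflare AI Gateway's Azure OpenAI provider path — the account ID, gateway name, and resource names below are placeholders; confirm the exact endpoint in your AI Gateway dashboard.

```typescript
// The one-line swap: direct AOAI call vs. the same call routed through
// AI Gateway. ACCOUNT_ID, my-gateway, and contoso-jnb are placeholders.
const DIRECT =
  'https://contoso-jnb.openai.azure.com/openai/deployments/gpt-4o';
const VIA_GATEWAY =
  'https://gateway.ai.cloudflare.com/v1/ACCOUNT_ID/my-gateway/azure-openai/contoso-jnb/gpt-4o';

// Everything after the base URL — path suffix, api-version, headers, body —
// is identical for both routes.
export function completionsUrl(base: string, apiVersion = '2024-06-01'): string {
  return `${base}/chat/completions?api-version=${apiVersion}`;
}
```

Swapping `DIRECT` for `VIA_GATEWAY` is the entire migration; caching, retries, logs, and rate limits are then configured in the dashboard, not in code.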
The full-code escape hatch when Copilot Studio's low-code ceiling isn't enough. Threads, tool calls, file search, code interpreter — orchestrated against the same SA-North gpt-4o substrate.
Tenant-Geo verification, the surfaces that leak data out of country (Bing grounding, plugins, Loop, Whiteboard), Purview controls, and the audit log path for a POPIA-defensible "where does my data go" answer.
Wiring a Copilot Studio agent to a customer-owned Azure OpenAI deployment in SA North. The full audit evidence pack — Power env region, AOAI policy, BYO config, live trace verification.
Azure OpenAI, Microsoft Graph, Power Platform, and Copilot Studio all share the same identity layer: Entra ID (formerly Azure AD). Learn the OAuth client-credentials flow once and it works for every API in this branch. Two scopes cover almost everything an integration needs.
Register an app in Entra ID, generate a client secret, and request a token from login.microsoftonline.com with the scope for the API you want. Tokens live for 60 minutes by default — cache them in KV with a 55-minute TTL, and rotate the secret in Key Vault without touching code.
```typescript
// Two scopes cover the whole Microsoft surface
const aiScope = 'https://cognitiveservices.azure.com/.default';  // Azure OpenAI
const graphScope = 'https://graph.microsoft.com/.default';       // M365 / Copilot

// Same flow, different scope
const body = new URLSearchParams({
  client_id: env.AZURE_CLIENT_ID,
  client_secret: env.AZURE_CLIENT_SECRET,
  scope: aiScope, // or graphScope
  grant_type: 'client_credentials',
});
```
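Wrapping that request with the KV cache described above looks roughly like this. A minimal sketch, not the production implementation: the `KVLike` interface, `TOKENS` binding name, and `getToken` helper are my illustrative names.

```typescript
// Minimal sketch: client-credentials token fetch with KV caching at a
// 55-minute TTL against the 60-minute token lifetime. KVLike, TOKENS, and
// getToken are illustrative names, not the explainer's actual code.
export interface KVLike {
  get(key: string): Promise<string | null>;
  put(key: string, value: string, opts?: { expirationTtl?: number }): Promise<void>;
}

export interface Env {
  AZURE_TENANT_ID: string;
  AZURE_CLIENT_ID: string;
  AZURE_CLIENT_SECRET: string;
  TOKENS: KVLike; // Cloudflare KV namespace binding
}

export async function getToken(env: Env, scope: string): Promise<string> {
  const cacheKey = `entra:${scope}`;
  const cached = await env.TOKENS.get(cacheKey);
  if (cached !== null) return cached; // served from KV — no round trip to Entra

  const res = await fetch(
    `https://login.microsoftonline.com/${env.AZURE_TENANT_ID}/oauth2/v2.0/token`,
    {
      method: 'POST',
      headers: { 'content-type': 'application/x-www-form-urlencoded' },
      body: new URLSearchParams({
        client_id: env.AZURE_CLIENT_ID,
        client_secret: env.AZURE_CLIENT_SECRET,
        scope, // aiScope or graphScope — same flow either way
        grant_type: 'client_credentials',
      }),
    },
  );
  const { access_token } = (await res.json()) as { access_token: string };

  // Cache 5 minutes short of the 60-minute lifetime so a stale token is never served
  await env.TOKENS.put(cacheKey, access_token, { expirationTtl: 55 * 60 });
  return access_token;
}
```

Keyed by scope, the same function serves both the Azure OpenAI and Graph callers.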
The payoff: every leaf in this hub uses the same auth function. The Worker that calls Azure OpenAI is one scope away from being the Worker that reads a user's mailbox. See concept 04 in the Workers explainer for the production-grade KV-cached implementation.
Three different residency commitments, three different boundaries. Azure OpenAI in southafricanorth keeps inference in country only on Regional PTU deployments — and currently only for the GPT model versions with local capacity. The default Global Standard SKU keeps data at rest in country but can process prompts and completions in any Microsoft Foundry region globally. The strict-residency path is therefore Regional PTU; the pragmatic path is Global Standard paired with the M365 Geo data-at-rest commitment. Confirm specific model availability with Microsoft before committing to an architecture — the current GPT Standard SKU is documented in eight US / Sweden regions only, with SA North coverage via PTU. M365 Copilot on the SA M365 Geo stores prompts, completions, and grounding data at rest in country; LLM compute uses the global Azure OpenAI pool. Copilot Studio with a BYO model inherits whatever residency that model offers — strict in-country if the BYO target is a Regional PTU AOAI deployment, hybrid otherwise.
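The mapping above can be sketched as a lookup table. The surface labels and field names are mine, for illustration — not official Microsoft SKU names — and the BYO row shows only the Regional PTU case.

```typescript
// The residency boundaries above as a lookup. Labels are illustrative;
// only 'Regional PTU' and 'Global Standard' are real SKU names.
type Surface =
  | 'aoai-regional-ptu'
  | 'aoai-global-standard'
  | 'm365-copilot-sa-geo'
  | 'copilot-studio-byo-ptu';

interface Residency {
  inferenceInCountry: boolean; // prompts/completions processed in SA North
  dataAtRestInCountry: boolean;
}

const residency: Record<Surface, Residency> = {
  'aoai-regional-ptu':      { inferenceInCountry: true,  dataAtRestInCountry: true },
  'aoai-global-standard':   { inferenceInCountry: false, dataAtRestInCountry: true },
  'm365-copilot-sa-geo':    { inferenceInCountry: false, dataAtRestInCountry: true },
  // BYO inherits the backing deployment — shown here for a Regional PTU AOAI target
  'copilot-studio-byo-ptu': { inferenceInCountry: true,  dataAtRestInCountry: true },
};

// Strict-residency workloads need both flags true
export function meetsStrictResidency(s: Surface): boolean {
  const r = residency[s];
  return r.inferenceInCountry && r.dataAtRestInCountry;
}
```

Only the Regional PTU rows clear the strict bar; everything else is the hybrid, data-at-rest-only posture.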
The strict reading (SARB Directive 6, certain DPSA mandates) requires the AOAI guarantee. The pragmatic reading (most POPIA s.72 cases) accepts the M365 Geo guarantee. The decision is in the strategy explainer — implementation lives in the patterns explainer.
Microsoft is the LLM substrate. Cloudflare is the runtime that calls it. ERP / CRM systems are what the agents talk to. The legal branch is the constraint layer. Every Microsoft leaf depends on all four.