Build 2026 · Preview
The AI-Native Era of Azure Ali Farahnak · v1.0 · May 2026
Microsoft Build 2026 · Architect Keynote

The AI-Native Era
of Azure.

Cloud-native gave us elastic infrastructure. AI-native gives us elastic intelligence. Here's the platform, the patterns, and the playbook for the next decade of building on Azure.

🧠 Foundry · 1,800+ models 🤖 Agent Service · GA 🔌 MCP-first tools Live data · May 2026
Ali FarahnakArchitect, Microsoftaka.ms/ai-native
The shift

From cloud-native to AI-native.

02 / 10
2015 — 2024

Cloud-Native

  • Containers & microservices
  • Stateless request/response
  • CI/CD + DevOps
  • Scale on RPS / CPU
  • Cost: VM-hours, GB-months
2025 — 2030

AI-Native

  • Agents & orchestrators
  • Stateful sessions & memory
  • Eval-driven delivery
  • Scale on tokens / tools
  • Cost: tokens, tool calls, GPU-sec
The market in motion

By the numbers, 2026.

03 / 10
Global AI spend
$632B
Worldwide, infra + apps
▲ +47% YoY
Foundry catalog
1,800+
Frontier + open + fine-tuned
▲ +312 in 12mo
Agent task latency
1.4s
Median, multi-tool plans
▲ 4× faster vs 2024
Fortune 500 in prod
73%
≥1 agent in production
▲ from 18% (2024)

Sources: IDC AI Spend Tracker 2026 · Microsoft Foundry telemetry · Azure Customer Insights · Build 2026 keynote data.

Reference architecture

The AI-native Azure stack.

04 / 10
Channels WEB · MOBILE · TEAMS Front Door WAF · CDN · AUTH APIM AI Gateway SEMANTIC CACHE · QUOTAS Agent Service PLANNER · ROUTER Foundry Models GPT · LLAMA · PHI · MAI MCP Tools FUNCTIONS · AKS · APIS AI Search + Cosmos VECTORS · GRAPH · MEMORY Content Safety EVAL · POLICY · PI App Insights TRACES · TOKENS · COST
The economics

Tokens up. Cost per million down.

05 / 10

What this means

Inference tokens · 9.4× growth

Aggregate Azure inference volume grew from 14T to 132T tokens/month between '24 and '26.

Cost per 1M tokens · -86%

Frontier model pricing fell faster than Moore's law. Distilled small models dropped another 12×.

Tool calls per session · +6.1×

Agents now spend 70% of compute orchestrating tools, not generating text.

Take-away

The bottleneck is no longer compute or cost. It's architecture, evals, and trust.

The platform

Five pillars of AI-native.

06 / 10

Models

Frontier, small, distilled, fine-tuned. One catalog, one billing surface, one trust boundary.

foundrybyomdistill

Agents

First-class runtime with planners, memory, tools, and policy. Built on Agent Service + Semantic Kernel.

agent svcskautogen

Data

Grounding is the new schema. Vectors, graph, and full-text — unified across Cosmos DB & AI Search.

vectorgraphrag

Tools

MCP everywhere. APIs become capabilities, capabilities compose into workflows.

mcpfunctionsapim

Trust

Evals, content safety, Purview — woven into the platform, not bolted on later.

evalsafetypurview
Patterns that ship

Three reference patterns.

07 / 10
01

Grounded RAG

Query Retrieve Answer
Hybrid (vector + BM25) retrieval over AI Search, citations baked in, drift-guarded by evals.
azure ai searchopenai
02

Multi-Agent Plan

Planner Worker A Worker B Critic
Planner decomposes intent, workers run in parallel under budget, critic verifies before commit.
agent svcautogen
03

Tool Fabric (MCP)

Agent CRM Files DB Web
Every API becomes an MCP tool with auth, schema, observability — agents call them like functions.
mcpapim
The adoption ladder

Crawl. Walk. Run. Fly.

08 / 10
1
Q1 · Crawl

Spike & learn

Foundry playground, 1–2 RAG prototypes, internal demo, evals 101.

2
Q2 · Walk

First production agent

Single domain, owned by one team. APIM AI Gateway, content safety, golden eval set.

3
Q3 · Run

Agent fleet

Multi-agent flows, MCP tool catalog, FinOps for tokens, RBAC + Purview integration.

4
Q4 · Fly

AI-native platform

Internal agent marketplace, self-service evals, cost & safety as platform primitives.

Where teams stumble

Top pitfalls we see in the field.

09 / 10
# Anti-pattern Risk How often we see it Fix
01 No evalsVibes-based shipping High
Golden set + automated regression on every PR
02 Direct-to-model callsNo gateway, no quota, no cache High
APIM AI Gateway + semantic cache + per-tenant quotas
03 Stuffed prompts10k-token system prompts Med
Promptflow + retrieval + structured tool schemas
04 Tool sprawlEvery team rebuilds connectors Med
Central MCP catalog with versioning & ownership
05 Cost surprisesNo token-level FinOps Med
App Insights + cost-per-task dashboards from day one
06 Shadow agentsBuilt outside platform standards Low
Paved-road templates + internal agent marketplace
The takeaway

Build for the agent.

The next decade of Azure apps won't be screens you click — they'll be conversations with software that does. Architect for it now.

aka.ms/ai-native aka.ms/foundry aka.ms/agent-service
Thank you · Ali Farahnak · Microsoft Architect Office