Does AI Matters?: Copilot Studio Agentic: Maverick Edition

Breaking Data Silos with Microsoft Fabric, OneLake, and Fabric Data Agents — The Foundation Every Production Agent Needs

Zen Chong — Mon, 09 Mar 2026 15:36:43 GMT

Platform: Microsoft Copilot Studio + Microsoft Fabric | Level: Intermediate | Build Time: 60 minutes

The Problem Nobody Talks About When They Talk About AI Agents

Every conversation about AI agents in the enterprise circles back to the same fantasy: an intelligent system that understands your whole business, answers complex questions instantly, and takes action across your operations without being asked twice.

Here is what kills that fantasy before it starts. Not bad prompts. Not weak models. Not insufficient compute.

Siloed data.

An agent is only as intelligent as the data it can access. An agent tethered to a single system — your CRM, your ERP, your SharePoint site — has a narrow view of the world. It can answer questions about what lives in that one silo. It cannot cross the gap between your sales data and your inventory data to tell you why your best-performing product is suddenly unavailable to your highest-value customers. It cannot connect your support ticket data to your product roadmap to identify which feature gaps are driving churn. It cannot answer the question a CEO actually asks, because CEOs never ask questions that fit neatly inside a single database.

This is the problem the series has been building toward for fifteen days. You have built individual agents, connected them to SharePoint, configured governance, added HITL checkpoints, and designed multi-agent workflows. All of that is the capability layer. Today is the data layer. And without it, everything built so far is operating with one hand tied behind its back.

🚨 What Fragmented Data Actually Costs — Confirmed Research

25,000+ paid Fabric customers — fastest-growing analytics product in Microsoft history

Microsoft 2025 Annual Report

80% of Fortune 500 now on Microsoft Fabric

FabCon Vienna Sep 2025 confirmed

379% ROI over three years — Forrester TEI study, $9.79M NPV, 10,000-employee composite org

Forrester TEI 2024 — Microsoft Fabric Blog

90% reduction in time data engineers spend searching, integrating, and debugging

Forrester TEI 2024 confirmed

25% increase in data engineering productivity — $1.8M saved in composite org over 3 years

Forrester TEI 2024 confirmed

$779,000 in infrastructure savings by consolidating tools onto Fabric

Atlan — Forrester TEI confirmed

What Microsoft Fabric Is — and Why It Is Different from What Came Before

The typical enterprise data estate is not a system. It is an archaeology project. Layer upon layer of tools added over time — a data warehouse from one era, a data lake from another, a reporting platform chosen by a team three reorganisations ago, a dozen SharePoint sites nobody fully catalogued. Each tool served a purpose at the moment it was acquired. Together, they create a sprawling, fragmented landscape where data exists in abundance but insight remains scarce.
📖 Microsoft Corporate Vice President, Azure Data — Build 2025

“When I talk to customers, the message I consistently get is: please unify. I’m the Chief Information Officer. I don’t want to be the Chief Integration Officer.”
Arun Ulag, Corporate Vice President for Azure Data, Microsoft
Source: VentureBeat — Build 2025 confirmed

Microsoft Fabric is the answer to that problem. Launched in 2023 and now adopted by 28,000 organisations worldwide, Fabric is a unified SaaS data platform that brings data engineering, warehousing, analytics, data science, and real-time intelligence together in a single environment. At the core of Fabric is OneLake — a single, multi-cloud data lake that stores data once in open formats and makes it instantly accessible to every workload on the platform.

Think of OneLake the way you think of OneDrive. A decade ago, sharing documents meant email attachments and network drives — every person maintaining their own copy, version control a fiction, collaboration a negotiation. OneDrive transformed that by creating a single accessible home for files. OneLake is doing the same for data. One copy. One governed location. Every tool, every team, every agent works from the same source.

The result for AI agents is not incremental. It is architectural. Instead of an agent that can see one data source, you get an agent that can see everything — customer records, transaction history, inventory levels, support tickets, financial records, operational data — governed, secured, and updated in near real time. That is the difference between an agent that answers questions about a department and an agent that understands a business.

The Three Fabric Capabilities That Transform What Your Agents Can Do

Three Fabric capabilities are directly relevant to every agent built in this series. You do not need to be a data engineer to understand them. You need to understand what they unlock.

OneLake
Break down silos with a single data copy that every agent can access.
OneLake virtualises your entire data estate into a single, governed lake without requiring you to move or duplicate data. Using Shortcuts, OneLake can point to data in Azure Blob Storage, Azure Data Lake, Amazon S3, Google Cloud Storage, and Dataverse — data stays where it is, but agents see it as if it were all in one place.
Using Mirroring, OneLake maintains a near-real-time synchronised replica of external databases — Azure SQL, Azure Cosmos DB, Snowflake, Fabric SQL Database, Dataverse — without ETL pipelines. No more cumbersome pipelines, no more sprawling out-of-date copies of data, no more silos across every part of your business.
What this means for your agent: instead of an agent grounded in one SharePoint site (Day 9), you now have an agent grounded in every relevant data source your organisation has — all governed by the same security controls, all updated in near real time.
Reference: OneLake — Your Foundation for an AI-Ready Data Estate

Fabric Data Agents
A purpose-built AI layer that answers natural language questions about your unified data.
Fabric data agents are AI-powered assistants that go beyond simple data retrieval from OneLake — they engage in natural language conversations about it. These agents understand your enterprise data schema, enforce your governance policies, and interpret your business context to surface insights that are timely, relevant, and actionable.
A Fabric data agent can reason across Lakehouses, Warehouses, KQL databases, Power BI semantic models, and unstructured documents — all in the same query. A business user asks: ‘Which products are underperforming this quarter, and what’s driving the trend?’ The agent queries structured sales data and unstructured support tickets in a single response, with row-level security enforced throughout.
What this means for your agent: instead of a Copilot Studio agent that reasons over SharePoint documents, you now have a Copilot Studio agent that reasons over your entire governed enterprise data estate — including databases that previously required a data engineer to query.
Reference: Fabric Data Agents + Copilot Studio — A New Era of Multi-Agent Orchestration

Fabric IQ
A semantic layer that turns raw data into business meaning — so agents reason in your language, not in table names.
Announced at Microsoft Ignite 2025, Fabric IQ is the semantic intelligence layer that elevates Fabric from a data platform to an intelligence platform. At its core is the Ontology item — a structured model of your business entities, relationships, rules, and objectives. Define Customer, Order, and Revenue once, and every Power BI report, data agent, and Copilot Studio agent speaks the same language.
Without Fabric IQ, an agent reading raw database tables must interpret what Revenue means — and it will make that interpretation differently each time, based on which table it accessed. With Fabric IQ, Revenue has one meaning, one definition, one calculation — and every agent is grounded in that shared understanding.
What this means for your agent: from Day 1, the 17x error amplification risk in multi-agent systems comes from inconsistent grounding. Fabric IQ eliminates the root cause — agents that disagree about what the data means — before the first agent is built.
Reference: From Data Platform to Intelligence Platform — Introducing Fabric IQ (Ignite 2025)

How a Fabric Data Agent Connects to Your Copilot Studio Agent

The integration between Fabric data agents and Copilot Studio is now in preview. The mechanism is agent-to-agent collaboration via Model Context Protocol (MCP) — the same pattern introduced in Day 14. Your Copilot Studio agent is the orchestrator. The Fabric data agent is the specialist. When a user asks a question that requires enterprise data, your Copilot Studio agent delegates to the Fabric data agent, receives a governed, context-aware answer, and returns it to the user.
The architecture works like this: without connected agents, each system works in isolation. Your Copilot Studio agent can search SharePoint. Your data lives in a lakehouse. Never the two shall meet. With the Fabric data agent connection, your Copilot Studio agent can instantly delegate data queries to a specialist agent that has governed access to your entire enterprise data estate — and return accurate, security-enforced responses in seconds, not hours.

Prerequisites Before You Build — Check These First

Microsoft Fabric capacity required: F2 or higher, or Power BI Premium Per Capacity (P1 or higher) with Microsoft Fabric enabled.
Licensing: Microsoft 365 Copilot license AND a user license for each person building and managing custom agents in Copilot Studio.
Tenant alignment: Both the Fabric data agent and your Copilot Studio agent must be on the same tenant.
Authentication: Sign in to both Microsoft Fabric and Microsoft Copilot Studio with the same account that has access to the data agent.
Tenant settings: Enable the Fabric data agent tenant setting in Power BI admin portal: Tenant settings > Copilot > Fabric data agent. Enable XMLA endpoints if using Power BI semantic models.

Once prerequisites are confirmed, the build is five steps.

Build and Publish Your Fabric Data Agent in Microsoft Fabric
Open Microsoft Fabric and select your Lakehouse, Warehouse, or Power BI semantic model containing the data you want your agent to access.
Create a new Data Agent from the workspace. Connect it to your data sources and write clear instructions that define your business context: what terms mean, what queries are appropriate, what data the agent should and should not access.
Invest time here. The quality of data agent instructions directly impacts answer accuracy. Column metadata and business definitions dramatically improve accuracy. ‘Revenue = Qty x Price minus Discounts. Exclude cancelled orders.’ This is the equivalent of writing good system prompt instructions in Copilot Studio.
Test the agent in the built-in test pane until you are satisfied with response quality. Then publish the agent with a rich and detailed description — it must be published before Copilot Studio can discover and connect to it.

Enable the Fabric Connection in Copilot Studio
Open Microsoft Copilot Studio and navigate to your existing agent (or create a new one). Select the Agents tab from the top pane.
Select Add an agent. Under Connect to an external agent, select Microsoft Fabric from the available agent types.
Select the desired connection from the list, or create a new connection: select the dropdown, choose Create new connection, and authenticate with the account that has access to the Fabric data agent.
Reference: Connect to a Microsoft Fabric Data Agent — Copilot Studio docs
S3
Select Your Fabric Data Agent and Configure Its Description
From the list of Fabric data agents you have access to, select the data agent you want to connect and select Next.
Adjust the description as needed to make it more contextual for your main agent. Make the description specific if you have other tools or agents where descriptions might overlap.
This description is what Copilot Studio’s generative orchestration reads when deciding whether to invoke the Fabric data agent or handle a query directly. A precise description produces correct delegation. A vague description produces ambiguous routing.

Configure Your Copilot Studio Agent’s Instructions for Delegation
Update your Copilot Studio agent’s system instructions to explicitly describe when to call the Fabric data agent versus when to handle queries from its other knowledge sources.

Example delegation instructions:

For questions about live business data — sales figures, inventory levels, customer transaction history, operational metrics — delegate to the [YourFabricAgentName] data agent.
For questions about policies, procedures, project documentation, and team information — use the SharePoint knowledge source.
For questions requiring both operational data and business context — call the Fabric data agent first, then enrich the response with SharePoint context.

Clear delegation instructions eliminate the majority of hallucinations and misrouted queries in multi-source agents.

Test with Progressively Complex Queries
Test Pane: start with simple queries that require only the Fabric data agent. Confirm correct delegation and accurate responses.
Progress to queries that require both Fabric data and SharePoint knowledge — these test whether orchestration routing is working correctly.
Open the Activity Map. Verify which agent was called at each step, what context was passed, and what output was returned. If the wrong agent is called for a query, refine the Fabric data agent description or the delegation instructions and re-test.
Full 20-minute lab: Connecting Fabric Data Agents with Copilot Studio — MCS Labs
The Medallion Pattern — How Enterprise IT Teams Structure Data for AI Agents
When you build a Fabric lakehouse to power your agents, you will encounter the medallion architecture — the standard pattern for organising data in OneLake. Understanding this pattern helps you know which data your agents should ground on at each stage of maturity.

Business-ready

Data optimised for reporting and analytics. This is where your agent should ground by default — validated, governed, and semantically aligned with your Power BI models and Fabric IQ ontology.
The safest starting point for your Fabric data agent is the Gold layer: Power BI semantic models. These are already validated, governed, and business-aligned. They leverage over 20 million existing semantic models across the Fabric platform. Starting with a semantic model as your data agent’s primary source means your agent inherits all the business logic already encoded by your data team — without you needing to rebuild any of it.

Day 15 Design Prompt — Map Your Data Estate Before Building

Data Estate Audit Prompt — Run Before Adding Any Fabric Data Agent

“My Copilot Studio agent currently handles [describe workflow]. To upgrade it with a Fabric data agent connection, help me design the data architecture.

For each data source my agent currently needs or should need, identify: (1) where that data currently lives, (2) which Fabric ingestion method is most appropriate — Shortcut, Mirroring, or native Fabric database, (3) which medallion layer the data belongs to (Bronze/Silver/Gold), (4) whether a Power BI semantic model already exists that covers this data, (5) whether this data requires any special governance considerations.

Then describe: what business questions my agent could answer with access to the full Gold layer dataset that it cannot answer today. For each new capability, estimate the business value using the throughput framework from Day 7.

Output this as a Data Estate Map: columns for Data Source, Current Location, Fabric Ingestion Method, Medallion Layer, Existing Semantic Model (Y/N), Governance Flag, and New Agent Capability Unlocked.”

The Day 15 Principle — Centralising Data Is the Starting Point, Not the Finish Line

Microsoft’s Jessica Hawk, Corporate Vice President for Data, AI, and Digital Applications, framed this precisely at FabCon Vienna: centralising data, once the finish line, is now the starting point.
For twenty years, the ambition of every enterprise data initiative was to get everything into one place. That was the goal. That was the KPI. That was what the project was measured on at completion. The organisations that achieved it felt like they had won.
They had not won. They had set the table. The organisations winning today are not the ones with the most consolidated data. They are the ones who have made that consolidated data accessible to agents that can reason over it, act on it, and improve from it — continuously, at machine scale, without a human having to extract and interpret a report first.
The agents built in Days 1 through 14 are capable. They can automate tasks, retrieve information, route approvals, and connect to external systems. But their intelligence has been bounded by the data they could see. Days 1 through 14 built the engine. Day 15 expands the fuel.

The Data Foundation Principle

An agent grounded in SharePoint can answer questions about your processes. An agent grounded in OneLake can answer questions about your business.
An agent grounded in Fabric IQ can answer questions about your business in the language of your business — with consistent definitions, live data, and governed access — and can act on what it finds.
Breaking down data silos is not the goal of an agentic AI framework. It is the prerequisite. Without it, every agent you build will be answering yesterday’s questions with incomplete information, inside a boundary it cannot see past.

What’s in your data estate that your current agents cannot see?

Run the Data Estate Audit Prompt above against your Day 4 business case.

Format your answer:

“My agent currently sees: [data sources]. What it cannot see but needs to: [missing data]. The business question I cannot answer today but could with Fabric: [question].”

The most clearly articulated data gap gets featured in the Day 16 example — and I will show you exactly how to bridge it using OneLake Shortcuts or Mirroring, depending on where your data lives.

Follow for daily drops. Day 16: Real-Time Intelligence in Fabric — how agents move from historical data to live signals, and the exact configuration that lets your Copilot Studio agent act on events as they happen, not after the fact.

Claude Studio Surge

Zen Chong — Sun, 08 Mar 2026 13:10:55 GMT

The counterintuitive truth no one saw coming in early 2026: Anthropic’s very public refusal to let the Pentagon use its models for unrestricted autonomous weapons or mass surveillance didn’t hurt them. It supercharged enterprise adoption inside Microsoft Copilot Studio.

Enterprise teams are now deliberately routing their most compliance-sensitive agents — financial audits, healthcare workflows, legal review, regulated reporting — to Claude models instead of GPT. The result: faster board approvals, higher internal trust scores, and quicker go-live times. Because ethics suddenly became the ultimate enterprise differentiator.

⚠ Model Accuracy Note — Read Before You Screenshot This

The user request referenced “Opus 4.6 + Sonnet 4.5.” Here is the corrected, verified picture as of March 9, 2026:

• Copilot Studio prompt builder (February 2026 update): Opus 4.6 + Sonnet 4.5. Sonnet 4.5 is also in beta for Computer Use agents specifically.
• Microsoft Foundry and GitHub Copilot: Sonnet 4.6 is now available (released Feb 17), delivering near-identical computer use to Opus at one-fifth of the cost.
• Recommendation: If your Copilot Studio environment hasn’t updated to Sonnet 4.6 yet, you are two to four weeks away. Plan your routing rules for 4.6 now. Both models are covered in this newsletter.

What Actually Happened — The Verified Feb–March 2026 Timeline

On February 26–27, 2026, Anthropic publicly rejected the Pentagon’s final demands, stating they could not in good conscience remove safeguards against lethal autonomous systems or domestic surveillance. CEO Dario Amodei drew a hard line, preferring to lose the contract rather than compromise constitutional AI principles.

“We cannot in good conscience remove safeguards against lethal autonomous systems or domestic surveillance.”
— Dario Amodei, CEO, Anthropic · Feb 26–27, 2026

What no analyst predicted: the ethics standoff and the model releases landed in the same four-week window — and the compound effect on enterprise trust was immediate.

Feb 5, 2026

Claude Opus 4.6 released. Agent teams. 14h 30min METR task horizon. 1M token context. $5/$25 per million tokens.

GitHub Copilot — Opus 4.6 GA

Feb 5 onward

Copilot Studio February 2026 update adds Opus 4.6 + Sonnet 4.5 to prompt builder. Sonnet 4.5 enters beta for Computer Use agents.

brewingthought.com — What’s New in Copilot Studio Feb 2026

Feb 17, 2026

Claude Sonnet 4.6 released. 79.6% SWE-bench. 72.5% OSWorld computer use — within 0.2% of Opus. $3/$15 per million tokens. Now the default on claude.ai.

GitHub Copilot — Sonnet 4.6 GA

Feb 17 onward

Sonnet 4.6 confirmed in Microsoft Foundry. Browser automation at scale, no API key dependency, cross-app context handoff.

Microsoft Foundry Blog — Sonnet 4.6 for enterprise scale

Feb 26–27, 2026

Anthropic publicly rejects Pentagon’s final demands. Dario Amodei confirms decision directly.

CNN — Anthropic rejects Pentagon offer

Early March 2026

Claude hits #1 on US Apple App Store free chart, overtaking ChatGPT as users defect in support.

Business Insider — Claude Hits No. 1 on App Store

Early March 2026

TechCrunch covers the App Store rise and the enterprise trust narrative.

TechCrunch — Claude rises to No. 1 following Pentagon dispute

The February 2026 Model Lineup — Exact Specs for Copilot Studio Builders

Three models are now available across Microsoft’s AI infrastructure. Here is the complete, verified picture for routing decisions.

The February 2026 Computer Use Milestone — Why This Changes Everything

Computer use has been Claude’s defining differentiator since October 2024. February 2026 is the month it crossed human parity. This is not marketing. These are third-party benchmark numbers.

What Computer Use Means Inside Copilot Studio

Sonnet 4.5 is already in Copilot Studio Computer Use agents (beta). Here is what your agents can now do that required custom RPA tooling six months ago:

• Navigate legacy systems with no API: ServiceNow, SAP, old intranet portals. The agent sees the screen, clicks, and types exactly as a human would.
• Cross-app context handoff: Read a Teams message, check a SharePoint record, create a Jira ticket — without the user orchestrating each step.
• Browser automation at scale: Navigate forms, extract data, submit approvals across any browser-based surface. No API key dependency.
• The upgrade path: When Sonnet 4.6 lands in Copilot Studio CU agents, your existing agents inherit the +11.1 point OSWorld improvement with no rebuild required.

Why This Is a Massive ‘I Didn’t Know That’ Moment for Copilot Studio Builders

Most people assume bigger models or cheaper tokens win. That was true in 2024. It is not the dominant variable in enterprise AI in 2026.

In 2026 enterprise reality, trust, auditability, and defensibility are the new ROI multipliers.

When your agent touches PII, financial data, patient records, or regulated processes, procurement and legal teams now ask one question first:

“Can we defend this model choice in an audit or congressional hearing?”

Claude wins that question every single time right now. And Microsoft made switching literally one dropdown away.

Real Impact Teams Are Reporting This Month

Directional signals from early-adopter practitioners, not peer-reviewed studies. Verify against your own use case.

• Banks reporting shorter agent approval cycles for compliance-sensitive workflows when Claude is specified as the model.
• Healthcare providers reporting higher user trust scores on Claude-powered agents via Studio evaluation runs.
• Insurance and finance teams defaulting compliance agents to Claude for explainable, auditable decisions.
• Smart-routing teams (Claude for high-stakes, GPT for high-volume) report 15–22% lower total spend. Claude costs slightly more per token but significantly less in governance overhead and rework.

Official Microsoft Integration — Live Links, Verified March 9, 2026

Resource

Microsoft Official Blog — Anthropic joins Copilot Studio (Nov 2025)

blogs.microsoft.com — Anthropic joins the multi-model lineup in Microsoft Copilot Studio

Microsoft Foundry — Sonnet 4.6 enterprise release (Feb 2026)

techcommunity.microsoft.com — Claude Sonnet 4.6 in Microsoft Foundry: Frontier Performance for Scale

GitHub Copilot — Opus 4.6 GA (Feb 5, 2026)

github.blog — Claude Opus 4.6 is now generally available for GitHub Copilot

GitHub Copilot — Sonnet 4.6 GA (Feb 17, 2026)

github.blog — Claude Sonnet 4.6 is now generally available in GitHub Copilot

Copilot Studio February 2026 update — Opus 4.6, Sonnet 4.5, CU beta

brewingthought.com — What’s New in Copilot Studio – February 2026 Update

Anthropic — Current model specs and pricing

anthropic.com/pricing — All models, token rates, and tiers

Anthropic API docs — Models overview (Opus 4.6, Sonnet 4.6, Haiku 4.5)

platform.claude.com/docs — Claude models overview

Expanding model choice in Microsoft 365 Copilot

microsoft.com — Expanding model choice in Microsoft 365 Copilot

How to Capitalise on the Claude Studio Surge Today — Updated for the 4.6 Era

Four steps. Zero code changes required. Updated for the February 2026 model landscape.

1

Open any agent in Copilot Studio → go to Model settings
From your agent canvas: Settings → AI capabilities → Generative AI → model selection dropdown. If you do not see Sonnet 4.6 yet, select Opus 4.6 or Sonnet 4.5 — the routing logic below applies to all three.

Select the right model for your agent’s risk profile
For compliance agents (financial audit, HR, legal, regulated reporting): Opus 4.6. Deep reasoning, 14h+ task horizon, strongest constitutional refusals.
For operational agents (knowledge base, navigation, form processing, Computer Use): Sonnet 4.6 (or Sonnet 4.5 in CU beta). 72.5% OSWorld at $3/$15 per million tokens. Near-identical to Opus on computer use at one-fifth of the cost.

Add the governance wrapper to your system instructions
Use the R-I-S-E prompt framework in the section below. Copy it directly into any topic as a system instruction. 90 seconds. Gives your compliance team an audit trail before the agent processes its first message.

Set up the multi-model orchestration node
Pro move: use Studio’s multi-model orchestration node. Route low-risk, high-volume queries to GPT for speed and cost. Route any agent touching PII, regulated data, or requiring explainability to Claude automatically. One rule. One agent. Two models. Zero code.
The R-I-S-E Prompt — Copy This Into Copilot Studio Today
R-I-S-E (Role, Input, Steps, Expectation) is the structured prompt format that consistently delivers audit-ready, defensible outputs from Claude in regulated environments. Drop this directly into any Copilot Studio topic as a system instruction.
ROLE: Enterprise governance lead choosing models for Copilot Studio.
INPUT: We handle both standard customer service and highly regulated
financial/healthcare processes.
STEPS: Create a decision matrix for when to use Claude vs GPT models
inside Studio agents. Include:
• Exact criteria (data sensitivity, audit risk, explainability needs)
• Specific system prompt wrappers for compliance use cases
• Model selection: Opus 4.6 vs Sonnet 4.6 vs Sonnet 4.5 CU
• Testing recommendations using Studio’s built-in evals
EXPECTATION: Output a ready-to-implement routing rule +
3 sample agents with correct model + safety wrapper:
• Customer support → GPT (high-volume, low-risk)
• Invoice approval → Sonnet 4.6 (structured, auditable)
• Regulatory reporting → Opus 4.6 (deep reasoning, refusals)
Why R-I-S-E works for compliance contexts: it forces the model to declare its role (accountability anchor), name constraints explicitly (audit trail), and deliver structured outputs (defensible format). An incomplete prompt is an invitation to hallucinate. R-I-S-E is the antidote.

Edge Cases and Nuances You Must Know

Scenario

The honest answer

When NOT to use Claude

Pure creative or ultra-high-volume workloads where raw speed matters more than explainability. A high-volume customer service bot handling thousands of simple queries per hour: GPT is likely the better economic choice. Claude’s advantage is in reasoning depth, constitutional refusals, and auditability — not raw throughput.

Sonnet 4.6 vs Opus 4.6 in practice

On computer use (OSWorld 72.5% vs 72.7%) they are statistically identical. Sonnet 4.6 actually beats Opus 4.6 on office productivity (GDPval-AA: 1633 vs 1606 Elo) and financial analysis. Escalate to Opus only for deep scientific reasoning (GPQA Diamond: Opus 91.3% vs Sonnet 74.1%) or when multi-agent coordination and maximum-reliability matter more than cost.

The 4.5 vs 4.6 question for Computer Use

Sonnet 4.5 is what’s currently in Copilot Studio CU beta (61.4% OSWorld). Sonnet 4.6 scores 72.5% on the same benchmark — a +11.1 point jump. When 4.6 lands in Studio CU agents, that improvement is yours automatically. Build your workflows on 4.5 now; they will only get better.

Cost sweet spot

Most smart-routing teams report 15–22% lower total spend. Sonnet 4.6 at $3/$15 is the most compelling value in frontier AI right now — near-Opus performance at one-fifth of the Opus price. Governance overhead and rework costs change the ROI calculation entirely when you include the cost of an audit failure.

Constitutional AI is not a marketing claim

Claude’s constitutional AI gives you refusal behaviours and citation patterns built into the model. Instead of writing hundreds of custom guardrails, you start from a model that already knows how to decline, hedge, and reference sources in high-stakes contexts. This is an architectural advantage that transfers directly to shorter system prompts and faster compliance reviews.

One important caution

The enterprise adoption trends cited here are directional signals from early-adopter teams and press coverage, not peer-reviewed data. Run your own Studio evaluations before making model decisions for production agents. The methodology is in Day 5 of this newsletter series.

The Contrarian Take Backed by Early 2026 Data

Everyone thought the Pentagon standoff would slow Anthropic down. It was supposed to be a political and commercial liability. In the same four-week window, Anthropic shipped two major models, hit #1 on the App Store, and deepened its integration into Microsoft’s entire AI infrastructure.
While OpenAI chases consumer ads and general-purpose reach, Anthropic is quietly becoming the ‘boring but bulletproof’ choice for anyone who has to answer to regulators, boards, or the public.

“In enterprise AI, doing the right thing is no longer a cost centre — it’s the fastest path to production and board approval.”

— The Claude Studio Surge thesis, March 2026

The teams quietly switching to Claude right now are not just being ethical. They are being strategically brilliant. The ethics premium is real, it is measurable in approval cycle times, and it is available to any Copilot Studio builder starting today.

Run the Decision Matrix This Week

Use the R-I-S-E prompt above with Claude Opus 4.6 or Sonnet 4.6 in Copilot Studio this week. You will instantly see which of your existing agents should route to Claude — and your compliance teams will have the evidence they need for the conversation you were going to have anyway.

What is the first agent you are routing to Claude? Drop it in the comments — let’s compare notes.

Andrew Ng Was Right. The 10-Minute Loan Is Built Here. End-to-End Workflow Redesign + Multi-Agent Orchestration in Copilot Studio

Zen Chong — Sat, 07 Mar 2026 15:24:24 GMT

Andrew Ng wrote from the World Economic Forum in Davos: “A recurring theme of conversations with CEOs is that running many experimental, bottom-up AI projects — letting a thousand flowers bloom — has failed to lead to significant payoffs. Instead, bigger gains require workflow redesign: taking a broader, perhaps top-down view of the multiple steps in a process and changing how they work together from end to end.”

Thirteen days into this series, that statement describes exactly where most makers get stuck.

Days 1 through 12 gave you individual agent skills. Today we stop building flowers and start building the greenhouse.

The 10-Minute Loan. What Ng Actually Meant.

Consider a bank issuing loans. The workflow has several discrete stages: Marketing → Application → Preliminary Approval → Final Review → Execution. Preliminary Approval used to require an hour-long human review. A new agentic system can do this automatically in 10 minutes. Swapping human review for AI review — but keeping everything else the same — gives a minor efficiency gain but isn’t transformative. Here is what would be transformative: instead of applicants waiting a week for a human to review their application, they get a decision in 10 minutes. When that happens, the loan becomes a more compelling product. That better customer experience allows lenders to attract more applications and ultimately issue more loans. Even though AI is applied only to one step, Preliminary Approval, we end up implementing not just a point solution but a broader workflow redesign that transforms the product offering.

This is the move from Day 4’s single-agent business case to today’s multi-agent system design. And it applies directly to whatever you built for your own department.

The question is not “which step can I automate?” It is “what does the entire workflow become when one step is no longer the bottleneck?”

That question is answered by multi-agent orchestration — and this is where most builders go wrong in exactly the way we covered on Day 1. From the December 2025 arxiv paper: independent multi-agent systems amplify errors 17.2x. More agents, worse results — unless the architecture is right.

Today we build the architecture correctly.

The Three Multi-Agent Patterns: Choose Before You Build.

Microsoft’s official guidance provides three distinct patterns for multi-agent design. The key is matching the pattern to the use case. Using the wrong pattern is the most common mistake in multi-agent architecture.

Pattern 1 — Orchestrator + Subagents (Russian Doll / Magentic)

This hierarchical pattern is ideal for clear separation of concerns. A Sales Copilot agent might orchestrate one agent for lead scoring and another for generating proposals. The orchestrator manages the overall user conversation and high-level decisions — “do we need to involve another agent?” — while subagents focus on execution. This approach creates a simpler user experience with one entry point, while using multiple agents optimised for quality and reliability for their targeted domain or function.

Use this pattern when: your workflow has distinct domain areas with different data sources or permissions, you want one interface that spans multiple specialist functions, and reuse of those specialist agents across other systems is important.

The Ng loan workflow in this pattern: One Loan Orchestrator agent as the single customer entry point. Four subagents: Marketing Agent (retrieves pre-approved product offers), Application Agent (collects and validates borrower data against eligibility rules), Preliminary Approval Agent (runs the automated 10-minute credit assessment), Final Review Agent (collates the approved application for human underwriter sign-off via HITL from Day 12). The orchestrator manages the customer conversation throughout. The subagents execute deterministically within their domains.

Pattern 2 — Workflow-Oriented (Sequential or Concurrent)

Model each step in the workflow with explicit sequencing and guards — clear preconditions, postconditions, and numerical thresholds. Design agents for autonomy and re-entrance, ensuring idempotency with robust retry logic and dead-letter handling. Incorporate approval gates and human-in-the-loop review steps through familiar channels like Teams or Outlook. Enforce security with a least-privilege approach at each step.

Use this when: the workflow has a defined order that cannot change, quality gates between steps are mandatory, and each agent’s output is the next agent’s input. The loan workflow maps here for the automated stages. Sequential orchestration: Marketing output feeds Application input feeds Preliminary Approval input. No agent proceeds until the previous step passes its quality gate. In Copilot Studio: this is your agent flow architecture from Day 3 — each step deterministic, each output validated before the next step triggers.

Execute concurrently only when the use case genuinely benefits from parallel processing and the workflow is simple enough for a single agent to handle parallel branches. Creating parallel branches increases complexity and may reduce quality when combining concurrent outputs. This is the Day 1 finding applied in practice. The loan workflow’s stages are sequential by definition — a concurrent architecture would break it.

Pattern 3 — Connected Agents via MCP and A2A (For Cross-Platform Workflows)

MCP (Model Context Protocol) provides a straightforward way for agents to interact with external objects — APIs, data sources, or other agents — with strong controls for a single orchestrator to select, invoke, filter, reason, and synthesise outcomes. A2A (Agent2Agent protocol) enables cross-platform agent-to-agent messaging with published task contracts. Use MCP for tool and data access. Use A2A for cross-platform agent integration.

Use this when: the specialist agents you need already exist in another system — Dynamics 365, Microsoft Fabric, Salesforce, or a third-party workflow — and rebuilding them in Copilot Studio would duplicate effort. Your orchestrator calls them via MCP or A2A rather than reimplementing the logic. Currently in public preview for Copilot Studio.

The Critical Rule Before You Choose Any Pattern.

Multi-agent orchestration is not always necessary and you should consider carefully before adopting it. Use a single agent when: you are building a single use case to respond to a single intent, a single developer or small cohesive team manages the entire solution, or you want to logically group tools and knowledge into clearly defined components within a larger single agent.

Connected agents increase latency due to extra orchestration hops. They increase the testing, management, and governance surface area for a solution.

The test before adding any agent: “Would a single well-configured agent with generative orchestration handle this if I wrote better instructions and added the right tools?” If yes: do not add another agent. If no — because the domains genuinely require different permissions, different knowledge sources, or different deployment targets — then and only then: add a connected agent.

This test eliminates 60% of unnecessary multi-agent architectures before they are built.

The Three-Layer Control Architecture — Production Grade.

In a production-grade multi-agent system, three layers of control coexist: the deterministic layer — traditional rule-based logic for mission-critical or irreversible actions like processing a payment or deleting a record, enforced through strictly authored topics or flows with no AI interpretation. The agentic layer — the LLM-driven planning layer that interprets intent, selects tools and agents, and composes multi-step plans with guardrails. The autonomous layer — event-triggered operations that run without user input, governed by explicit decision boundaries and audit logging.

Map every step of your workflow to one of these three layers before writing a single instruction. Steps that must never fail go in the deterministic layer. Steps that require intelligent interpretation go in the agentic layer. Steps that should run proactively without user initiation go in the autonomous layer.

The Ng loan workflow mapped: Marketing (autonomous trigger — pre-approved offer generated when credit score threshold met), Application (agentic — intent interpretation, slot filling from Day 3), Preliminary Approval (deterministic — rule-based credit policy check, same input always same output), Final Review (deterministic + HITL from Day 12 — rule-based underwriter routing), Execution (deterministic — loan disbursement, no AI interpretation, audit trail mandatory).

Your Day 13 Build Prompt — End-to-End Workflow Redesign.

“Apply Andrew Ng’s end-to-end workflow redesign framework to the following department workflow: [paste your current process from Day 4]. For each stage: (1) identify whether it is currently a bottleneck or a non-bottleneck (TOC from Day 7), (2) assign it to deterministic, agentic, or autonomous control layer, (3) specify whether it requires a single agent, child agent, connected agent, or human-in-the-loop step, (4) identify the data source it reads from and the system it writes to, (5) define the precondition that must be true before this stage can execute and the postcondition that must be true before the next stage can proceed. Then describe what the workflow becomes as a product — not a process — when the bottleneck stage is reduced from [current time] to [target time]. What does the customer or end user experience that they could not before?”

The last question is the one Ng asks. Answer it and you have your business case. Answer it and you understand what you are actually building.

Build Steps — Validated Against Live Microsoft Docs.

Step 1 — Map your workflow to the three patterns Take your Day 8 Agent Brief → identify which stages require separate agents (different permissions, different domains) vs which stages are subroutines of a single orchestrator → choose Pattern 1, 2, or 3 for each agent boundary → confirm with the single-agent test before adding any connected agent

Step 2 — Build your parent orchestrator Go to copilotstudio.microsoft.com → Create → New agent → name it as the workflow, not the technology (e.g., Loan Application Agent, not Multi-Agent Orchestrator) → write instructions that describe the entire workflow from the customer’s perspective → list the child agents it will call, when to call them, and what context to pass → enable Generative Orchestration (Settings → Generative AI → Orchestration → ON)

Step 3 — Add child agents Your orchestrator → Agents tab → Add agent → select existing Copilot Studio agents from your environment → for each connected agent: configure the input variables the parent passes (conversation context, user data collected so far), the output variables the child returns, and the condition under which the orchestrator hands off to this agent vs handles it directly. Reference: Add other agents overview

Step 4 — Apply guardrails to every cross-agent boundary The connected agent might have access to things the parent agent does not. Ensure that calling the connected agent does not inadvertently bypass restrictions. If the parent agent cannot delete records but the connected agent can, the parent agent should not call the connected agent in scenarios where deletion might happen without proper approval. Treat a connected agent call like any other powerful action. Add a HITL checkpoint (Day 12) before any cross-agent call that triggers a write action the parent agent does not have direct permission to perform.

Step 5 — Test the full workflow with Activity Map Test Pane → run your complete end-to-end workflow from customer trigger to execution → open Activity Map → verify: which agent was called at each step, what context was passed, which tool was invoked, what the output was → identify any step where the orchestrator called the wrong agent or passed incomplete context → refine the child agent’s description and the parent’s instruction for that handoff → re-run

Two validated lab builds to examine before building your own: Using Multi-Agent in Copilot Studio — MCS Labs — 30 minutes. See the orchestrator + child agent pattern end to end in a simple workflow. Variables, Multi-Agent Architectures, and Channel Deployment — MCS Labs — 30 minutes. Variables as the data handoff mechanism between agents.

The Day 13 Principle.

Ng’s closing line from Davos: “Bottom-up innovation matters because the people closest to problems often see solutions first. But scaling such ideas to create transformative impact often requires seeing how AI can transform entire workflows end to end — and this is where top-down strategic direction and innovation can help.”

You have spent 12 days building from the bottom up. You now understand the individual components — knowledge sources, agent flows, HITL, evaluation, governance, autonomous triggers, feedback architecture, ROI measurement — at the level required to build them correctly.

Day 13 is the day you step back and see the whole workflow.

Not the flower. The greenhouse.

What does your department’s most important process become as a product when the bottleneck step is no longer the bottleneck?

That answer is your next 17 days of builds.

Drop your workflow redesign below.

Format: “My workflow: [current stages]. The bottleneck is [stage]. When that stage goes from [current time] to [target time], the product becomes [what the customer/user now experiences that they couldn’t before].”

🎯 SERIES STATUS — DAYS 1–13

Day Theme Key Deliverable

Day 1 Environment + mindset Agent backlog, Copilot Studio access
Day 2 APL-7008 + Real Estate Dataverse agent Knowledge-grounded natural language search
Day 3 Agent in a Day + Contoso Coffee Entities, slot filling, order flows
Day 4 Bring your own business case Process → agent → flow → ROI activated
Day 5 Eval before you ship + publish to Teams Evaluation pass rate, soft-launch playbook
Day 6 Custom feedback architecture 3-layer feedback: reactions + adaptive card + CSAT
Day 7 TOC throughput ROI — not hours saved Constraint identification + throughput case
Day 8 When a business approaches you Intake framework + Agent Brief template
Day 9 Cisco’s 72% + SharePoint knowledge Data-first grounding + 7 failure modes eliminated
Day 10 Accenture warning + autonomous agents Trigger-driven build — works without user input
Day 11 Air Canada + governance + compliance DLP configured, 3-layer governance, compliance agent
Day 12 Pentagon vs. Anthropic + HITL Oversight framework mapped, HITL checkpoints built
Day 13 Andrew Ng + end-to-end workflow redesign Multi-agent architecture mapped, 10-minute product identified

MCP Is the USB-C for AI Agents — And Fabian Williams Built a Real Event Agent to Prove It

Zen Chong — Sat, 07 Mar 2026 14:28:39 GMT

Before today’s build, watch this first: How Fabian Williams built a real-world event agent using MCP + Copilot Studio + Microsoft Graph

It is from the Microsoft 365 & Power Platform community call on December 16 2025. It is 25 minutes. It is the best single demonstration of what MCP makes possible in Copilot Studio. And by the end of today, you will understand exactly how to replicate the pattern for your own use case.

If you want to experience the live agent before watching: go to conferencehaven.ai and click the chat button. That is a production, MCP-backed, Microsoft Graph-connected event scheduling agent that Fabian built and shipped. That is what we are building the foundation for today.

What Is MCP — The Answer That Actually Sticks.

Model Context Protocol enables makers to connect to existing knowledge servers and APIs directly from Copilot Studio. When connecting to an MCP server, actions and knowledge are automatically added to the agent and updated as functionality evolves.

The USB-C analogy is the one that sticks: before USB-C, every device had a different connector. You needed a different cable for every system. That is the integration world before MCP — every agent needed a custom connector, custom authentication, custom field mapping, custom error handling. Built once, for one system, maintained forever.

MCP is an open standard that allows AI agents to access tools, context, and data from external systems through one universal integration standard, unlocking unprecedented capabilities and transforming how AI models access and use real-time data.

With MCP, you connect once to a server that exposes tools. Any agent can use those tools. The tools self-describe — name, description, inputs, outputs are all inherited automatically. When the external system changes, the MCP server updates, and your agent inherits the update. No connector rebuild. No field remapping. No maintenance sprint.

Copilot Studio now connects across more than 1,400 systems of record via Model Context Protocol, Power Platform connectors, and the Microsoft Graph. Makers can build sophisticated agents without coding.

That is 1,400 systems you can connect your agent to today — without a developer.

What Fabian Williams Actually Built: Conference Haven.

Conference Haven is a conference session discovery and scheduling agent. The user arrives at a conference website, opens the chat, and asks in plain language: “Show me sessions on AI governance” or “Book me the 2pm keynote and add it to my calendar.”

The agent does four things in one conversation:

It reads conference data from an MCP server Fabian built — exposing the session schedule, speaker list, room assignments, and session metadata as live, queryable tools. Not a static FAQ. Live structured data accessible via natural language.

It queries Microsoft Graph via MCP to check the user’s calendar availability before suggesting session times — so it never recommends a session that conflicts with something already booked.

It writes to Microsoft Graph calendar — creating the calendar event, setting the location, adding the session description, and confirming to the user — all within a single conversational turn.

It exposes an Agent-to-Agent (A2A) endpoint — a2a.conferencehaven.ai — so other agents can call Conference Haven as a service. From Day 13’s multi-agent pattern: Conference Haven is a specialist subagent that any orchestrator can invoke.

That is the full MCP + Graph + A2A architecture in a single working production example. The video walks through every design decision. Watch it once before you build.

The Three MCP Patterns You Can Build Today.

Pattern 1 — Connect to an existing MCP server (the fastest start)

The simplest way to connect to an existing MCP server is directly within Copilot Studio using the MCP onboarding wizard. Go to the Tools page for your agent → Select Add a tool → Select New tool → Select Model Context Protocol → fill in the Server name, Server description, and Server URL.

The description field is the most important input you make. Generative orchestration relies on the description to determine when your agent should use the tool. Write clear, specific descriptions including what the tool does and when it should be used. A vague description means your agent will call the wrong tool or miss the right one. Be precise.

Authentication options: None (public MCP servers), API key (header or query parameter), or OAuth 2.0 (dynamic client registration or static). For Microsoft Graph-connected MCP servers, OAuth 2.0 is required.

One critical prerequisite confirmed: Generative Orchestration must be enabled to use MCP. Settings → Generative AI → Orchestration → ON. Without this, the MCP tool will never be called — silently.

One critical deprecation confirmed: SSE transport for MCP is deprecated after August 2025. Copilot Studio no longer supports SSE for MCP. If an MCP server you want to connect to still uses SSE transport, it will not work. The server must use Streamable HTTP. Check before connecting.

After connecting: All tools are turned on by default when you add an MCP server. If you don’t want to use all the tools offered by an MCP server, turn off the Allow all toggle — toggles become available for each individual tool. Turn off tools that aren’t needed to ensure your agent only uses the most relevant features. Selective tool enablement is not optional when working with large MCP servers — enabling all tools from a rich server increases token usage and can confuse the orchestrator.

Pattern 2 — Use Agent 365 MCP servers for Microsoft 365 actions

Agent 365 tooling servers are enterprise-grade MCP servers that expose deterministic, auditable tools for Microsoft 365 workloads — Outlook, Teams, SharePoint, OneDrive, Dataverse, Word, and more — through the Agent 365 tooling gateway. They give your agents secure, user-scoped access to work content and actions: creating calendar events, finding free/busy slots, accepting/declining invitations, email composition, SharePoint file management, Word document creation. You still design and orchestrate everything in Copilot Studio, but when your agent needs to reason over a user’s work data or take concrete actions, Agent 365 MCP servers provide the bridge.

This is the Conference Haven pattern — Microsoft Graph calendar queries and writes — exposed as enterprise-governed, compliance-audited MCP tools rather than direct API calls.

Licence and programme requirement: Agent 365 MCP servers require a full Microsoft 365 Copilot licence for users of the agent. The capability currently requires Frontier programme enrollment. Without a full Microsoft 365 Copilot licence, users will not be able to use Agent 365 MCP servers from Copilot Studio. Check with your admin before building your agent around this pattern.

Agent 365 MCP servers allow agents to schedule meetings in Microsoft Teams, draft documents in Word, send emails in Outlook, and update CRM records in Microsoft Dynamics 365 — with full compliance and audit support.

Pattern 3 — Build a custom MCP server for your own data

Custom MCP servers let you connect Microsoft Copilot, Copilot Studio, VS Code, Claude, and other AI agents to third-party apps and internal systems your business relies on — such as Docusign, Salesforce, GitHub, or ServiceNow. Makers and developers can create or clone reusable, governed MCP servers that bring together connector actions, tools from other MCPs, and custom APIs, giving agents the ability to take meaningful, secure actions across platforms. You can build new MCP servers by assembling connector actions and tools from other MCP servers, or clone existing Microsoft-authored MCP servers like the Dataverse MCP server and tailor them by adding, removing, or replacing tools.

This is the Conference Haven architecture — a custom server exposing your conference data, your product catalogue, your service records, your knowledge base — as MCP tools that any agent in your organisation can call without rebuilding the integration.

Your Day 14 Build: Connect Your Agent to a Live MCP Server in 20 Minutes.

This is the no-code path. No server required. Uses a pre-built public MCP server to demonstrate the full connection and tool invocation pattern before you build your own.

Prompt to identify your MCP use case first:

“I am building a Copilot Studio agent for [department/use case]. My agent currently answers questions from static knowledge sources. Identify 3 types of real-time data or actions this agent would need to provide genuinely useful answers — information it cannot know from a static document because it changes daily or requires a live system query. For each: (1) describe what the agent would need to look up or do, (2) name the system that holds that data, (3) confirm whether that system has a public API, MCP server, or Power Platform connector available. This is my MCP integration backlog.”

Build steps — confirmed against live Microsoft docs:

Step 1 — Enable Generative Orchestration Your agent → Settings → Generative AI → Orchestration → Generative Orchestration → ON. Non-negotiable. MCP will not function without it.

Step 2 — Add an MCP server via the onboarding wizard Your agent → Tools tab → Add a tool → New tool → Model Context Protocol → MCP onboarding wizard appears. For your first build, use a pre-built Microsoft MCP connector from the available library (search: Dataverse, GitHub, or any listed connector) rather than a custom URL. This eliminates authentication configuration while you learn the pattern. Reference: Connect to existing MCP server

Step 3 — Add the server to your agent and configure selective tools After the server connects → Add to agent → go to the Tools tab → select the MCP server → turn off Allow all → enable only the specific tools your agent needs. Write a clear, specific tool description for each enabled tool. Reference: Add tools and resources from MCP server

Step 4 — Test with the Activity Map Test Pane → ask a question that should trigger the MCP tool → after the response → open Activity Map → verify the MCP tool node appears in the execution path → confirm the tool was called with the correct inputs → confirm the output was returned to the agent correctly. If the MCP tool node does not appear: check Generative Orchestration is ON and recheck your tool description — it may be too vague for the orchestrator to select.

Step 5 — Run your Day 5 evaluation with MCP tools active Re-run your existing evaluation test set → look specifically for Capability Use test method pass rate → this confirms the agent is calling the MCP tool rather than generating a plausible-sounding answer from training data. A pass on Capability Use + General Quality together is your production readiness signal for an MCP-connected agent.

Three validated lab builds to study alongside today’s build: MCS Labs: Dynamics 365 MCP Lead Qualifier — 15 minutes — the fastest MCP connection demonstration available. MCS Labs: Dataverse MCP Connector — Live Data Integration — 30 minutes — MCP against your own Dataverse data. MCS Labs: Build a Custom MCP Server — 65 minutes — Fabian Williams’ full pattern: custom server, tool exposure, agent connection.

The Conference Haven Endgame — And What It Means for Your Series Build.

Conference Haven answers the “so what” question that every maker faces at Week 2 of a build series: what does a production-grade MCP agent actually look like?

It looks like this: a public-facing website, a chat button, a natural language interface, a live structured data source exposed as MCP tools, Microsoft Graph queries for real-time calendar context, calendar write actions for scheduling, and an A2A endpoint for cross-agent integration. All built by one person. All demonstrably working.

Agent-to-MCP-server communication unlocks the ability for agents to handle complex tasks that involve subjects outside their domain — they can seamlessly coordinate tasks, bridge gaps between tools, and accelerate complex projects.

The event agent is not the point. The pattern is the point. Your conference data is your product catalogue. Your calendar is your appointment system. Your Microsoft Graph query is your CRM lookup. The Conference Haven architecture maps to any domain where a user needs to find something, check availability against a live system, and take an action — in one conversation.

That is every department workflow you have built in this series. Now it is connected to the live world.

Join the Community That Builds This Stuff Live.

Fabian Williams presents builds like this every week at the free Microsoft 365 & Power Platform community calls. Download recurring invites and join. These are the people who ship production agents before Microsoft writes the docs for them. That community is your best resource for the remaining 16 days of this series.

What external system does your agent most need to query in real time?

Format: “My agent needs to look up [what] from [which system] so it can [action or response].”

The most concrete integration gets a full MCP architecture spec in the comments — server type, authentication method, tool list, and the exact Copilot Studio connection steps for that specific system.

🎯 SERIES STATUS — DAYS 1–14

Day Theme Key Deliverable

Day 1 Environment + mindset Agent backlog, Copilot Studio access

Day 2 APL-7008 + Real Estate Dataverse agent Knowledge-grounded natural language search

Day 3 Agent in a Day + Contoso Coffee Entities, slot filling, order flows Day 4 Bring your own business case Process → agent → flow → ROI activated

Day 5 Eval before you ship + publish to Teams Evaluation pass rate, soft-launch playbook

Day 6 Custom feedback architecture 3-layer feedback: reactions + adaptive card + CSAT

Day 7 TOC throughput ROI — not hours saved Constraint identification + throughput case

Day 8 When a business approaches you Intake framework + Agent Brief template

Day 9 Cisco’s 72% + SharePoint knowledge Data-first grounding + 7 failure modes eliminated

Day 10 Accenture warning + autonomous agents Trigger-driven build — works without user input

Day 11 Air Canada + governance + compliance DLP configured, 3-layer governance, compliance agent

Day 12 Pentagon vs. Anthropic + HITL Oversight framework mapped, HITL checkpoints built

Day 13 Andrew Ng + end-to-end workflow redesign Multi-agent architecture mapped, 10-minute product

Day 14 Conference Haven + MCP + Microsoft Graph Agent connected to live external data via MCP

The Pentagon vs. Anthropic Standoff: What It Means for Every AI Builder — and Why Human-in-the-Loop Is the Answer the Military Doesn’t Want

Zen Chong — Fri, 06 Mar 2026 15:06:39 GMT

This week the most consequential AI story of 2026 broke — and almost nobody in the Copilot Studio community connected it to the agents they are building right now.

They should. Because what is happening between the Pentagon and Anthropic is not a contract dispute. It is a live demonstration of the most important design decision in every agent you will ever build.

What Actually Happened.

The Pentagon is pushing four leading AI labs — Anthropic, OpenAI, Google, and xAI — to let the military use their tools for “all lawful purposes,” even in the most sensitive areas of weapons development, intelligence collection, and battlefield operations. Anthropic has not agreed to those terms, and the Pentagon is getting fed up after months of difficult negotiations. Anthropic insists that two areas remain off limits: the mass surveillance of Americans and fully autonomous weaponry.

Claude is currently the only frontier AI deployed on classified Pentagon networks, operating via Palantir’s AI Platform. The dispute gained additional urgency after the Wall Street Journal reported that Claude was used during the January military operation to capture Venezuelan President Nicolás Maduro.

If Anthropic ultimately does not agree with the Pentagon’s terms, the agency could label the company a “supply chain risk” — a designation typically reserved for foreign adversaries. This would require its vendors and contractors to certify that they do not use Anthropic’s models. One senior official said it would be difficult for the military to quickly replace Claude, because “the other model companies are just behind” when it comes to specialised government applications.

The deeper problem is not who is right in this negotiation. It is that the negotiation is happening at all. The terms governing how the military uses the most transformative technology of the century are being set through bilateral haggling between a defence secretary and a startup CEO, with no democratic input and no durable constraints. “All lawful purposes” covers a lot more territory than it used to — surveillance law was written long before AI could monitor millions of people simultaneously.

This is the story. Now here is what it means for your Tuesday morning Copilot Studio build.

The Design Decision at the Centre of Everything.

Anthropic’s two hard limits are not corporate politics. They are a design philosophy expressed in contract language. And that philosophy has a name you have seen before in this series.

Human-in-the-loop.

Anthropic doesn’t want Claude used to develop weapons that fire with no human involvement — no fully autonomous lethal action, no “robot pulls trigger” without a person deciding. This isn’t Anthropic saying “no military use.” It’s Anthropic saying “yes, but not that.”

Every agent you build operates on the same axis. Not between warfare and peace — between autonomous action and human accountability. The question your stakeholders will eventually ask about every agent you deploy is the same question the Pentagon is asking about Claude right now: who is responsible when it acts?

The riskiest AI deployments in 2026 are not the fully autonomous ones in low-stakes environments. They are the supposedly autonomous systems making high-stakes decisions without appropriate human checkpoints. Would you let an algorithm approve a $500,000 purchase order without review? Fire an employee based solely on performance scores? Diagnose a patient without a doctor’s confirmation? Yet many organisations deploy AI systems that essentially do exactly that — making consequential decisions in black boxes, with no meaningful human oversight until something goes wrong.

The Air Canada tribunal said the same thing in a different courtroom. Anthropic is saying it in a different negotiating room. The principle is identical: the agent’s decisions are yours. The guardrail is not optional. The guardrail is your liability management.

The Three Levels of Human Oversight — Choose the Right One for Every Workflow.

Not every AI application needs the same level of oversight. The key is matching your control mechanism to your risk profile. There are three control models: Human-in-the-loop — a human must initiate or approve actions before the AI executes them, your highest-control scenario. Human-on-the-loop — the AI operates autonomously but a human monitors in real time and can intervene or abort. Human-out-of-the-loop — fully autonomous operation, only appropriate for low-stakes, well-defined tasks with proven accuracy.

Map every workflow you have built in this series against this framework right now:

Human-in-the-loop: Any agent that writes to a financial record, approves a leave request, cancels an order above a threshold, sends a communication on behalf of a person, or updates a compliance record. The agent proposes. The human approves. The agent executes. This is the Copilot Studio HITL pattern — agent pauses, fires an Outlook approval form, resumes on response.

Human-on-the-loop: Any agent that generates a daily report, summarises emails, monitors a queue, or makes low-value routing decisions. The agent acts autonomously, the Activity tab records every decision, and a human reviews the log weekly. If something looks wrong, they override. This is your Day 10 autonomous agent with an active monitoring protocol.

Human-out-of-the-loop: FAQ agents, greeting agents, search agents, status-check agents. The agent answers. Nothing consequential happens either way. No approval needed. No monitoring overhead. Ship it and move on.

The transition to full agentic deployment follows a clear maturity path. Start with high-volume, low-judgment workflows — repetitive, data-rich, rule-constrained. Introduce HITL approval gates next — agents propose, humans validate. This builds confidence, exposes edge cases, and refines agent behaviour. Only then remove routine approvals while maintaining real-time monitoring and override mechanisms. Autonomy should always be reversible.

The Pentagon is demanding to skip Phase 2. Anthropic is refusing. The standoff is not about guns. It is about whether human judgment remains part of a consequential decision loop when the stakes are irreversible.

Your stakes are not military. But irreversibility exists in your agents too: a sent email cannot be unsent, a cancelled order cannot be instantly reinstated, a financial approval record cannot be quietly deleted. Design for reversibility. Build in the pause.

Day 12 Build: Add HITL to Your Production Agent.

HITL is especially useful when an agent needs clarification, additional context, or explicit approval to proceed. It supports scenarios such as confirming project updates, confirming procurement orders, validating financial reports, escalating complex customer support cases, resolving ambiguous data, or gathering information that only a person can provide. The result is more flexible and reliable automation that adapts to real-world conditions.

The prompt to design your HITL checkpoints:

“I have built a Copilot Studio agent that handles [describe your workflow]. Using the three-level oversight framework — in-the-loop, on-the-loop, out-of-the-loop — review every action my agent takes and classify each as: (1) requires human approval before execution, (2) can execute autonomously but must be logged for weekly human review, or (3) fully autonomous with no oversight required. For every action classified as requiring approval, describe: who the approver should be, what information they need to make the decision, what the consequences of approving and rejecting are, and what the agent should do if no response is received within 24 hours. Output this as a HITL checkpoint map.”

Build steps — confirmed in November 2025 What’s New and official HITL docs:

Open your agent → topic containing the action that needs human approval → at the decision point → Add node → Add a tool → search for Human-in-the-loop → select Request for information in Copilot Studio agent flows (preview)
Configure the HITL action: Title — what the approver sees as the subject of the approval request. Message — a clear one-sentence description of what the agent is proposing to do and why. Assignee — the named approver or a dynamic variable containing their email. Input fields — Approve / Reject radio buttons plus an optional Comments text field for the approver’s reasoning
After the HITL node → add a Condition branch: if HITL_Response = "Approve" → proceed with the action → if HITL_Response = "Reject" → send a Teams notification to the requestor explaining the outcome → end topic
Timeout handling: add a parallel branch for no response within your defined window → agent sends a reminder → after a second window → agent escalates to the approver’s manager using the People knowledge source from Day 9’s upgrade. This is the difference between a HITL that occasionally blocks workflows and one that maintains throughput while ensuring accountability
Test in the Test Pane → confirm the Outlook form fires, captures Approve / Reject + Comments, and returns those values to the agent correctly → confirm both branches complete without error

Full 45-minute lab with Expense Claims HITL pattern: MCS Labs — Human-in-the-Loop

Governance overlay: If you want to block autonomous agents from operating without any HITL checkpoints at the environment level, go to Power Platform Admin Centre → DLP policies → Microsoft Copilot Studio connector → block the Event triggers connector action. This prevents agent makers from adding event triggers — effectively requiring that all autonomous workflows go through a human approval gate before they can be enabled in that environment. This is the governance control that Anthropic wished it had enforced contractually with the Pentagon before January.

The Day 12 Principle.

Anthropic took the money. It put Claude on classified networks. It partnered with Palantir. So when Anthropic says “we didn’t mean for it to go there,” a lot of people are going to respond with “How did you not see where this goes?” That is why this story does not have a clean hero and villain. When you sign the deal and install on classified networks and run toward defence use cases, you are already in the arena. You do not get to be shocked when the arena acts like the arena.

You are in the arena too. Every time you deploy an agent that takes a consequential action without a human checkpoint, you are making a governance decision by default. The question is whether you made it deliberately — or whether you will discover it in the first complaint email, the first incorrect financial record, the first compliance audit.

Build the HITL checkpoint. Map the oversight level correctly. Make the deliberate choice before the arena makes it for you.

Drop your HITL checkpoint map below.

Format: “[Agent name] — [Action requiring approval] — [Approver] — [Consequence of wrong decision].”

The most clearly articulated checkpoint gets featured as the Day 13 opening example — and I will show you how to configure the exact timeout + escalation logic for that specific use case.

🎯 SERIES STATUS — DAYS 1–12

Day Theme Key Deliverable

Day 1 Environment + mindset Agent backlog, Copilot Studio access

Day 2 APL-7008 + Real Estate Dataverse agent Knowledge-grounded natural language search

Day 3 Agent in a Day + Contoso Coffee Entities, slot filling, order flows

Day 4 Bring your own business case Process → agent → flow → ROI activated