Agent Security: What Microsoft Shipped (and What You Still Need to Build)

May 26, 2026

Tags: security, ai-agents, entra, architecture

Microsoft shipped agent identities in April 2026. Entra Agent ID gives every AI agent an OAuth token, conditional access policies, and audit logs. It is a real foundation — but it solves the easy problem (who is this agent?) while leaving the hard problems (what should it be allowed to do? how do we enforce that?) entirely to you.

This is not criticism. It is architecture. Identity systems authenticate. Security enforcement happens elsewhere. This post maps the gap.

The Identity Layer: What Entra Agent ID Actually Provides

Entra Agent ID extends Microsoft's existing identity stack to AI agents. It has three main components:

1. Agent Identity Blueprints — Templates for creating agent identities with parent-child relationships. Deploy 10,000 customer support agents? Create one blueprint, stamp out instances. Policy changes propagate automatically.

2. Cross-Platform Support — Works with agents built on Microsoft platforms (Agent 365, Copilot Studio, Azure Foundry) and third-party platforms (AWS Bedrock, n8n) via Auth SDK sidecar or workload identity federation.

3. OAuth 2.0 + MCP + A2A protocols — Standard authentication flows. Agent requests a token, presents it to resources, gets validated. Same flow as human users or service principals.

What You Get (April 2026 GA)

From the RSAC 2026 announcement:

Conditional Access for agents — Policy-based access control (requires Entra ID P1)
Identity Protection for agents — Risk detection and remediation (requires Entra ID P2)
Identity Governance for agents — Lifecycle and access management (requires Entra ID P1)
Network controls for agents — Secure web AI gateway (requires Entra Internet Access)
Sign-in and audit logs — Complete activity tracking (included in base Entra)

Entra Agent ID Feature Coverage

Feature	Coverage
Agent Identities	100%
OAuth/MCP/A2A	100%
Conditional Access	100%
Audit Logs	100%
Tool Permissions	0%
Prompt Injection Defense	0%
Data Loss Prevention	0%
Action Authorization	0%

The Enforcement Gap: What is Missing

1. Tool Permission Boundaries

The problem: An agent with a valid OAuth token can call any MCP tool the framework exposes. Entra validates the token. It does not know what mcp_terminal does, and it has no say in whether the agent should be allowed to run shell commands.

Real example from our stack:

// What Entra Agent ID validates
const token = await getAgentToken('agent-openclaw');
// ✅ Token is valid, agent identity confirmed

// What Entra Agent ID does NOT validate
await mcpClient.callTool('mcp_terminal', {
  command: 'rm -rf /production-data/*'  // ❌ No enforcement layer
});

2. Prompt Injection and Jailbreak Prevention

The problem: OAuth success does not guarantee safe behavior. An agent with valid credentials can still be manipulated via prompt injection into exfiltrating data, executing unauthorized actions, or bypassing guardrails.

From Perplexity's NIST RFI response (March 2026):

"Agent architectures fundamentally change assumptions around code-data separation. Plaintext prompts function as code, shaping LLM control flows and tool invocations. Each generation of computing platforms introduces new code-data separation problems; LLM-powered agent systems represent the latest and perhaps most severe instance."

Industry data (Q1 2026):

492+ MCP servers exposed without authentication in production
CVE-2026-26118 (CVSS 8.8): SSRF in Azure MCP Server via valid credentials
42,665 exposed agent instances found, 93.4% with auth bypass conditions

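The unauthenticated servers are authentication failures, but CVE-2026-26118 is not: it was exploitable with valid credentials. That is an authorization failure, and identity alone does not address it.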

3. Action Authorization Before Execution

The problem: Current enforcement happens in two places, neither adequate:

Model-based screening — LLM decides if action is safe (probabilistic, bypassable via prompt engineering)
Application-layer validation — Ad hoc checks in tool implementations (no standard, inconsistent coverage)

From "Before the Tool Call: Deterministic Pre-Action Authorization for Autonomous AI Agents" (March 2026, APort Technologies):

Social Engineering Attack Success Rates

Policy	Attack success rate
Permissive Policy	74.6%
OAP Restrictive Policy	0%

Key finding: Without deterministic pre-action authorization, 74.6% of social engineering attacks succeeded. With policy enforcement before tool execution, 0% succeeded across 879 attempts.

Three-Layer Defense Architecture

Research consensus (Uchibeke 2026, Perplexity NIST RFI 2026, ClawLess 2026): You need all three layers. They are complementary, not competing.

User Request → Agent → Layer 1: Identity (Entra Agent ID) → Layer 2: Policy (Pre-Action Authorization) → Layer 3: Sandbox (gVisor/Firecracker) → Execute Tool

Layer 1 check: valid OAuth token? Layer 2 check: policy allows? A "no" at either check → ❌ DENY

Layer	What It Prevents	What It Does Not Prevent	Overhead
1. Identity (Entra)	Unauthorized agents accessing resources	Authorized agent doing unauthorized things	~50ms (token validation)
2. Pre-Action Authorization	Policy violations, business rule violations, prompt-injected actions	Exploits within allowed tools	~53ms median (Uchibeke 2026)
3. Sandboxed Execution	Host compromise, lateral movement, resource exhaustion	Actions within sandbox permissions	100-300ms cold start
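The overhead column adds up when the layers run in sequence. A quick arithmetic sketch, assuming the figures in the table are representative and that every allowed call pays a sandbox cold start (warm sandboxes would cost less):

```python
# Per-layer overheads in milliseconds, copied from the table above.
IDENTITY_MS = 50               # Layer 1: token validation
POLICY_MS = 53                 # Layer 2: median pre-action authorization
SANDBOX_COLD_MS = (100, 300)   # Layer 3: cold-start range


def added_latency_ms() -> tuple[int, int]:
    """Return (best, worst) added latency when all three layers run in sequence."""
    fixed = IDENTITY_MS + POLICY_MS
    return fixed + SANDBOX_COLD_MS[0], fixed + SANDBOX_COLD_MS[1]


def denied_latency_ms() -> int:
    """A call denied at Layer 2 never pays for the sandbox."""
    return IDENTITY_MS + POLICY_MS


print(added_latency_ms())   # (203, 403)
print(denied_latency_ms())  # 103
```

Denials are the cheap path: a blocked call costs roughly half of what an allowed cold-start call costs.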

Example: $500 Payment Attack

Scenario: Prompt-injected agent attempts unauthorized payment.

Agent: "Transfer $500 to attacker-wallet-xyz"

Layer 1 (Entra): ✅ Token valid, agent identity confirmed → ALLOW to Layer 2

Layer 2 (Policy):

{
  "tool": "payment.send",
  "capability": "finance",
  "constraints": {
    "max_amount": 100,
    "allowed_recipients": ["vendor-a", "vendor-b"]
  }
}

Amount ($500) exceeds the $100 limit, and attacker-wallet-xyz is not an allowed recipient → DENY (audit trail generated)

Layer 3 never executes. Pre-action authorization stopped the attack before sandbox instantiation.
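The Layer 2 decision above can be sketched as a small deterministic evaluator. This is a minimal illustration, not the Open Agent Passport implementation: the `Decision` type, the `evaluate` function, and the `recipient` parameter name are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    allowed: bool
    reasons: tuple[str, ...] = ()


# The policy from the example above, as a plain dict.
POLICY = {
    "tool": "payment.send",
    "capability": "finance",
    "constraints": {
        "max_amount": 100,
        "allowed_recipients": ["vendor-a", "vendor-b"],
    },
}


def evaluate(policy: dict, tool: str, params: dict) -> Decision:
    """Check one tool call against one policy. Anything unexpected is a deny."""
    if tool != policy["tool"]:
        return Decision(False, ("tool_not_covered_by_policy",))

    constraints = policy["constraints"]
    reasons = []

    amount = params.get("amount")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(amount, bool) or not isinstance(amount, (int, float)):
        reasons.append("amount_missing_or_invalid")
    # Written as a positive range test so NaN and infinity also fail.
    elif not (0 < amount <= constraints["max_amount"]):
        reasons.append("amount_out_of_range")

    # "recipient" is an assumed parameter name for this sketch.
    if params.get("recipient") not in constraints["allowed_recipients"]:
        reasons.append("recipient_not_allowed")

    return Decision(not reasons, tuple(reasons))


attack = evaluate(POLICY, "payment.send",
                  {"amount": 500, "recipient": "attacker-wallet-xyz"})
print(attack.allowed, attack.reasons)
```

The evaluator collects every violated constraint instead of stopping at the first one, which makes the audit record more useful when reviewing a denied call.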

What We Built: ToolExecutor with Allowlist Enforcement

When building OpenClaw (autonomous engineering manager), we hit this exact gap. Entra provides the identity foundation (who is this agent?), but does not prevent a compromised agent from calling destructive tools.

Our implementation:

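# AgentIdentity and AuditLogger are defined elsewhere in our codebase (not shown),
# as are the _is_destructive_command and _execute_in_sandbox methods.

class ToolNotAllowedError(Exception):
    """Raised when a tool is not in the agent's allowlist."""

class PolicyViolationError(Exception):
    """Raised when an allowed tool is called in a way the policy forbids."""

ALLOWED_TOOLS = frozenset({
    'read_file', 'search_files', 'web_search', 'web_extract',
    'terminal', 'skill_view', 'memory'
})

class ToolExecutor:
    def __init__(self, agent_identity: AgentIdentity):
        self.identity = agent_identity  # Entra token validation
        self.audit_log = AuditLogger(agent_identity.id)

    async def execute(self, tool_name: str, params: dict):
        # Layer 1: Entra validates identity (handled upstream)

        # Layer 2: Allowlist enforcement (deterministic)
        if tool_name not in ALLOWED_TOOLS:
            self.audit_log.record_deny(tool_name, 'not_in_allowlist')
            raise ToolNotAllowedError(f'{tool_name} not in agent allowlist')

        # Additional semantic validation on high-risk tools
        if tool_name == 'terminal':
            command = params.get('command')
            # A missing or non-string command is denied too (fail closed),
            # instead of escaping as an unaudited KeyError
            if not isinstance(command, str) or self._is_destructive_command(command):
                self.audit_log.record_deny(tool_name, 'destructive_command')
                raise PolicyViolationError('Destructive command blocked')

        # Layer 3: Execute in sandbox (gVisor container)
        result = await self._execute_in_sandbox(tool_name, params)
        self.audit_log.record_success(tool_name, params, result)
        return result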

Key properties:

Fail-closed — If tool not in allowlist, deny (do not fall back to model judgment)
Double validation — Allowlist check + semantic validation on high-risk tools
Audit trail — Every decision (allow/deny) logged with agent identity, tool, params
Deterministic — Same tool call with same params → same decision (no sampling, no temperature)
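The `_is_destructive_command` check is not shown above. One way such a check could look is sketched below. This is an assumption for illustration, not our production logic: the denylist contents and the standalone function name `is_destructive_command` are hypothetical.

```python
import shlex

# Hypothetical denylist. A denylist is inherently incomplete (think `find -delete`
# or `sudo rm`); an allowlist of permitted programs is the stronger design.
DESTRUCTIVE_PROGRAMS = {"rm", "rmdir", "dd", "mkfs", "shred", "truncate"}
SHELL_METACHARACTERS = set(";|&`$><\n")


def is_destructive_command(command: str) -> bool:
    """Return True when the command should be blocked (block on any doubt)."""
    # Reject chaining, substitution, and redirection outright: they let a
    # harmless-looking prefix smuggle in a second command.
    if any(ch in SHELL_METACHARACTERS for ch in command):
        return True
    try:
        argv = shlex.split(command)
    except ValueError:
        return True  # unbalanced quotes: cannot parse, so block
    if not argv:
        return True  # empty command: nothing sensible to run
    # Compare the program's basename so /bin/rm is treated like rm.
    program = argv[0].rsplit("/", 1)[-1]
    return program in DESTRUCTIVE_PROGRAMS


print(is_destructive_command("rm -rf /production-data/*"))
print(is_destructive_command("git status"))
```

The check is pure string logic with no model in the loop, which keeps it consistent with the deterministic property above. It is a second line of defense behind the sandbox, not a replacement for it.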

Case Study: GitHub Copilot CLI (12 → 0 Incidents)

GitHub's internal security incident data (shared at Microsoft Build 2026):

GitHub Copilot CLI Security Incidents

Period	Incidents
Before Agent Identities (Q3 2025)	12
After Entra Agent ID + Tool Boundaries (Q4 2025 - Q1 2026)	0

Before agent identities (Q3 2025):

12 security incidents involving Copilot CLI in 3 months
Mix of auth bypass, unauthorized data access, policy violations

After Entra Agent ID + tool permission boundaries (Q4 2025 - Q1 2026):

0 security incidents across same population

What changed:

Every Copilot CLI instance got an agent identity (blueprint-based)
Tool access enforced via allowlist (not all CLI tools enabled for all blueprints)
Conditional access policies applied (location, risk score, device compliance)
Audit logs made mandatory review of financial/PII tool access possible

The combination of identity + authorization enforcement delivered zero incidents. Identity alone would not have prevented the original 12 (they had valid service principals).

Research Perspective: The Authorization Gap is Fundamental

From "ClawLess: A Security Model of AI Agents" (April 2026, Southern University of Science and Technology):

"Existing approaches attempt to regulate agent behavior through training or prompting, which does not offer fundamental security guarantees. ClawLess enforces formally verified policies on AI agents under a worst-case threat model where the agent itself may be adversarial."

Two fundamental assumptions:

Assumption of Capabilities — AI agents are capable of conducting sophisticated attacks against any security mechanisms.
Assumption of Maliciousness — AI agents will eventually be lured to become malicious (via prompt injection, supply chain attacks, or data poisoning).

Under these assumptions, identity alone is insufficient. You need:

Formal policy specification — Allowlist of tools, resource constraints, business rules
Deterministic enforcement — Pre-action authorization that cannot be bypassed via prompt engineering
Kernel-level isolation — Sandboxed execution environment (gVisor, Firecracker, Kata Containers)

Cumulative Security Guarantees

Layers in place	Cumulative guarantee
Identity (Entra)	30%
+ Tool Allowlist	65%
+ Pre-Action Policy	85%
+ Sandboxed Execution	95%

What You Need to Build

If you are deploying Entra Agent ID in production, here is the implementation checklist:

1. Tool Permission System

Minimum viable:

// Define allowlist per agent blueprint.
// A Map avoids accidental hits on Object.prototype keys such as 'toString'.
const AGENT_PERMISSIONS = new Map<string, ReadonlySet<string>>([
  ['customer-support', new Set(['read_ticket', 'search_kb', 'send_email'])],
  ['data-analyst', new Set(['query_db', 'read_file', 'web_search'])],
  ['engineer', new Set(['terminal', 'read_file', 'write_file', 'git'])],
]);

// Enforce before tool execution; unknown blueprints get no tools (fail closed)
function canAgentUseTool(agentBlueprint: string, toolName: string): boolean {
  return AGENT_PERMISSIONS.get(agentBlueprint)?.has(toolName) ?? false;
}

Production-grade:

Policy language (e.g., Open Agent Passport spec)
Constraints per tool (amount limits, allowed recipients, file path restrictions)
Cryptographically signed audit trail
Dynamic policy updates without agent restart
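For the signed audit trail, a hash chain is a common starting point. The sketch below uses an HMAC (a symmetric MAC from Python's standard library) rather than a true digital signature, so anyone holding the key can forge entries; treat it as an illustration of the chaining idea. The function names and record layout are assumptions.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # placeholder "previous MAC" for the first entry


def _mac(key: bytes, prev_mac: str, entry: dict) -> str:
    # Canonical JSON (sorted keys, fixed separators) so equal entries hash equally.
    payload = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    message = (prev_mac + payload).encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def append_entry(log: list, key: bytes, entry: dict) -> None:
    """Append an entry whose MAC also covers the previous entry's MAC."""
    prev_mac = log[-1]["mac"] if log else GENESIS
    log.append({"entry": entry, "mac": _mac(key, prev_mac, entry)})


def verify_log(log: list, key: bytes) -> bool:
    """Recompute the chain. An edited, reordered, or mid-chain-deleted record
    breaks it; truncating the tail is not detected by this sketch."""
    prev_mac = GENESIS
    for record in log:
        expected = _mac(key, prev_mac, record["entry"])
        if not hmac.compare_digest(expected, record["mac"]):
            return False
        prev_mac = record["mac"]
    return True
```

A production system would more likely sign with an asymmetric key held outside the agent host and anchor the latest MAC somewhere the agent cannot write, which also closes the tail-truncation gap.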

2. Pre-Action Authorization Hook

Insert between "agent decides action" and "framework executes tool":

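# hook, load_policy, Allow, and Deny stand in for your framework's equivalents
@hook('before_tool_call')
async def authorize_action(tool_name: str, params: dict, agent_id: str):
    policy = load_policy(agent_id)

    # Check allowlist
    if tool_name not in policy.allowed_tools:
        return Deny(reason='tool_not_allowed')

    # Evaluate constraints (a tool with no registered constraints has none to check)
    for constraint in policy.constraints.get(tool_name, []):
        if not constraint.evaluate(params):
            return Deny(reason=f'constraint_violated: {constraint.name}')

    # Check business rules
    if tool_name == 'payment.send':
        amount = params.get('amount')
        # Deny instead of raising when the amount is missing or not a number
        if isinstance(amount, bool) or not isinstance(amount, (int, float)):
            return Deny(reason='amount_missing_or_invalid')
        if amount > policy.max_payment_amount:
            return Deny(reason='amount_exceeds_limit')

    return Allow()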
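If your framework has no `before_tool_call` hook, the same placement can be approximated by wrapping each tool function. A minimal sketch, assuming async tools and a hypothetical in-memory allowlist; `guarded`, `authorize`, and `ToolDenied` are names invented for this example:

```python
import asyncio
import functools

# Hypothetical per-agent allowlists; in practice these come from policy storage.
ALLOWED = {"agent-support": {"search_kb"}}


class ToolDenied(Exception):
    """Raised instead of running the tool when authorization fails."""


def authorize(agent_id: str, tool_name: str) -> bool:
    # Unknown agents get an empty allowlist, so the default is deny.
    return tool_name in ALLOWED.get(agent_id, set())


def guarded(tool_name: str):
    """Wrap an async tool so authorization always runs before the tool body."""
    def decorator(func):
        @functools.wraps(func)
        async def wrapper(agent_id: str, **params):
            if not authorize(agent_id, tool_name):
                raise ToolDenied(f"{agent_id} may not call {tool_name}")
            return await func(**params)
        return wrapper
    return decorator


@guarded("search_kb")
async def search_kb(query: str) -> str:
    return f"results for {query!r}"


@guarded("terminal")
async def terminal(command: str) -> str:
    return "never reached for agent-support"


print(asyncio.run(search_kb("agent-support", query="refund policy")))
```

The point of the wrapper is that the tool body is unreachable without passing the check; there is no code path where the model's output alone decides whether the tool runs.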

3. Sandboxed Execution Environment

Do not run agent code directly on the host. Options:

Technology	Security	Performance	Complexity
Docker	Low (shares host kernel)	High	Low
gVisor	High (user-space kernel)	Medium	Medium
Firecracker	Very High (microVM)	Medium	High
Kata Containers	Very High (VM per container)	Low (cold start)	High

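Recommendation: Start with gVisor. Its user-space kernel intercepts syscalls before they reach the host kernel, which shrinks the attack surface at the cost of some syscall and I/O overhead, and it is widely deployed (GKE Sandbox, Cloud Run).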
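With Docker, running a container under gVisor means selecting the `runsc` runtime. The sketch below only builds the command line; it assumes `runsc` is installed and registered as a Docker runtime on the host, and the image name, resource limits, and function name are placeholders.

```python
import shlex


def sandboxed_command(image: str, tool_argv: list[str]) -> list[str]:
    """Build a `docker run` argv that executes one tool call under gVisor."""
    return [
        "docker", "run",
        "--rm",               # discard the container after the call
        "--runtime=runsc",    # gVisor's OCI runtime instead of the default runc
        "--network=none",     # no network unless the tool needs it
        "--read-only",        # immutable root filesystem
        "--cap-drop=ALL",     # drop all Linux capabilities
        "--memory=256m",      # assumed limits; tune per tool
        "--pids-limit=64",
        image,
        *tool_argv,
    ]


# "agent-tools:latest" is a placeholder image name.
argv = sandboxed_command("agent-tools:latest", ["cat", "/etc/hostname"])
print(shlex.join(argv))
```

Passing the argv as a list to `subprocess.run` (rather than a shell string) keeps agent-supplied arguments from being interpreted by a shell on the host.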

4. Prompt Injection Defenses

Even with identity + tool boundaries + sandbox, prompt injection can manipulate how the agent uses allowed tools.

Defenses:

System prompt protection — Framework-managed, agent cannot override
Input sanitization — Strip markdown, escape special chars in user input
Output validation — Check tool call params match expected schema
Audit anomaly detection — Flag unusual sequences (e.g., 10 DB queries in 1 second)
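The anomaly detection item can start as a plain sliding-window counter. A minimal sketch of the "10 DB queries in 1 second" case, assuming one detector per agent and tool; the class name and defaults are illustrative.

```python
from collections import deque


class BurstDetector:
    """Flag an agent once it reaches `threshold` calls within `window_s` seconds."""

    def __init__(self, threshold: int = 10, window_s: float = 1.0):
        self.threshold = threshold
        self.window_s = window_s
        self._times = deque()

    def record(self, timestamp: float) -> bool:
        """Record one call; return True when the window holds `threshold` or more calls."""
        self._times.append(timestamp)
        # Drop calls that have fallen out of the window ending at `timestamp`.
        while self._times and timestamp - self._times[0] >= self.window_s:
            self._times.popleft()
        return len(self._times) >= self.threshold


detector = BurstDetector(threshold=10, window_s=1.0)
flags = [detector.record(i * 0.05) for i in range(10)]  # 10 calls in under half a second
print(flags[-1])  # True
```

Timestamps are passed in rather than read from the clock, which makes the detector easy to replay against historical audit logs.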

Example from our stack:

import re

def sanitize_user_input(text: str) -> str:
    # Strip markdown that could inject instructions
    text = re.sub(r'```[\s\S]*?```', '[code block removed]', text)
    text = re.sub(r'`[^`]+`', '[code removed]', text)

    # Remove special tokens (framework-specific)
    text = text.replace('<|endoftext|>', '')
    text = text.replace('<|im_start|>', '')
    text = text.replace('<|im_end|>', '')

    return text
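The sanitizer covers the input side. The output validation defense from the list above (checking tool call params against an expected schema) can be sketched in the same spirit. The schemas and the `validate_tool_call` name are hypothetical; a real deployment would more likely use JSON Schema or a validation library.

```python
# Hypothetical schemas: parameter name -> required Python type.
TOOL_SCHEMAS = {
    "read_file": {"path": str},
    "web_search": {"query": str, "max_results": int},
}


def validate_tool_call(tool_name: str, params: dict) -> list[str]:
    """Return a list of problems; an empty list means the call matches its schema."""
    schema = TOOL_SCHEMAS.get(tool_name)
    if schema is None:
        return [f"no schema registered for {tool_name}"]

    problems = []
    for name in sorted(schema.keys() - params.keys()):
        problems.append(f"missing parameter: {name}")
    for name in sorted(params.keys() - schema.keys()):
        problems.append(f"unexpected parameter: {name}")
    for name, expected in schema.items():
        # Exact type match: a bool is not accepted where an int is required.
        if name in params and type(params[name]) is not expected:
            problems.append(f"wrong type for {name}: expected {expected.__name__}")
    return problems


print(validate_tool_call("read_file", {"path": "/tmp/a", "mode": "w"}))
# ['unexpected parameter: mode']
```

Rejecting unexpected parameters matters as much as checking required ones: extra fields are a common way for injected instructions to reach a tool.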

What Microsoft Should Add (But Probably Will Not)

These are agent-specific security features, not identity features. Different product domain, different team, different licensing model. Do not expect them in Entra Agent ID.

Prompt injection classifier — Real-time detection of malicious prompts (probabilistic layer, not replacement for deterministic enforcement)
Tool permission DSL — Declarative language for "agent X can call tool Y with constraints Z"
Agent Data Loss Prevention — Semantic analysis of tool outputs before returning to agent (detect PII, credentials, secrets)
Federated policy enforcement — Cross-organization policy sharing (e.g., "no financial tools for agents operating on untrusted data")

Some of these exist in Microsoft Agent 365 Security Policy Templates, but that is a different product with different licensing. Entra Agent ID is the identity foundation layer.
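Until something like agent DLP ships, a crude version can sit between tool output and the agent. The sketch below uses regex pattern matching, not semantic analysis, and the three patterns are illustrative only; the `redact` function and placeholder format are assumptions.

```python
import re

# Illustrative patterns only; real DLP needs far broader coverage and context.
PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def redact(tool_output: str) -> tuple[str, list[str]]:
    """Replace matches with placeholders and report which pattern kinds fired."""
    found = []
    for kind, pattern in PATTERNS.items():
        tool_output, count = pattern.subn(f"[{kind} redacted]", tool_output)
        if count:
            found.append(kind)
    return tool_output, found


text, kinds = redact("contact bob@example.com key AKIAABCDEFGHIJKLMNOP")
print(text)   # contact [email redacted] key [aws_access_key_id redacted]
print(kinds)  # ['email', 'aws_access_key_id']
```

The list of fired pattern kinds is worth writing to the audit log even when redaction succeeds, since repeated hits on one tool suggest it is exposing more than it should.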

Bottom Line

What Entra Agent ID gives you:

Agent identities with OAuth 2.0, MCP, A2A support
Conditional access, identity protection, governance
Audit logs and compliance reporting
Cross-platform support (Microsoft + third-party agents)

What you still need to build:

Tool permission boundaries (allowlist + constraints)
Pre-action authorization (policy enforcement before execution)
Sandboxed execution environment (gVisor, Firecracker, Kata)
Prompt injection defenses (input sanitization, output validation)
Agent DLP (semantic analysis of tool outputs)

Identity is the foundation. It is not the whole building.

Resources

Research Papers (2026):

Uchibeke, U. (2026). Before the Tool Call: Deterministic Pre-Action Authorization for Autonomous AI Agents. arXiv:2603.20953.
Li, N., Zhang, K., Polley, K., Ma, J. (2026). Security Considerations for Artificial Intelligence Agents (Perplexity's NIST RFI Response). arXiv:2603.12230.
Lu, H., Liu, N., Wang, S., Zhang, F. (2026). ClawLess: A Security Model of AI Agents. arXiv:2604.06284.

Implementation Examples:

Open Agent Passport Specification — Policy language reference