World Context Protocol
The bridge between AI agents and the physical world, evolving from today's Model Context Protocol to enable true world-aware intelligence.
Model Context Protocol Today
Understanding how we currently interface with Language Foundation Models
How we construct, manage, and inject context into LLM sessions today
Current Characteristics
- Linear and stateless: past context is packed into each prompt
- Token-limited: the entire context must fit within the token window
- Text-based: structured through language and JSON
- Explicit injection: context must be included manually
Current Use Cases
- Prompt engineering and template management (see the sketch after this list)
- RAG (Retrieval-Augmented Generation) systems
- Tool calling and function execution
- Memory injection via vector search
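To make the current model concrete, here is a minimal sketch of MCP-era context assembly: retrieved snippets and a tool schema are manually serialized into a single token-limited prompt. The `build_prompt` helper, the `get_weather` tool, and the template are illustrative assumptions, not a specific framework's API.

```python
# Minimal sketch of today's MCP-style context assembly: retrieved
# snippets and tool schemas are packed into one token-limited prompt.
import json

PROMPT_TEMPLATE = """You are a helpful assistant.

Relevant context:
{context}

Available tools (JSON schema):
{tools}

User: {question}"""

# Hypothetical tool schema, in the JSON style common to tool-calling APIs.
TOOLS = [{
    "name": "get_weather",
    "description": "Look up current weather for a city",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
    },
}]

def build_prompt(question: str, retrieved_docs: list[str]) -> str:
    # Explicit injection: everything the model will know about the task
    # has to be serialized into this one string.
    return PROMPT_TEMPLATE.format(
        context="\n".join(retrieved_docs),
        tools=json.dumps(TOOLS, indent=2),
        question=question,
    )

print(build_prompt("Is it raining in Oslo?",
                   ["Oslo forecast: light rain, 9°C."]))
```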
The Four Phases of Context Protocol Evolution
From static prompts to dynamic world interfaces
Phase 1: LLMs like GPT-4, Claude 3, Gemini 1.5
- Function: Serial, stateless communication ("prompt → output")
- Context: Token-based windows, manual injection
- Tools: LangChain, RAG systems, prompt templates
Phase 2: LLMs with memory, agents, and tool usage
- Function: State ↔ Query ↔ Tools ↔ Model ↔ Feedback (sketched below)
- Context: External tools, long-term memory, perceptual inputs
- Tools: Agents, function calling, memory APIs
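A minimal sketch of that Phase 2 loop, assuming a toy `call_model` stand-in for the LLM and a one-entry tool registry; the point is that tool feedback is folded back into persistent session state before the next model call.

```python
# Sketch of the Phase 2 loop: state <-> query <-> tools <-> model <-> feedback.
def call_model(state: dict, observation: str) -> dict:
    # Stand-in for an LLM call returning either a tool request or an answer.
    if "weather" in observation and "weather_result" not in state:
        return {"tool": "get_weather", "args": {"city": "Oslo"}}
    return {"answer": f"Done. Known facts: {state}"}

TOOLS = {"get_weather": lambda city: f"{city}: light rain, 9C"}

def agent_loop(query: str, max_steps: int = 5) -> str:
    state: dict = {"query": query}          # persistent per-session state
    observation = query
    for _ in range(max_steps):
        decision = call_model(state, observation)
        if "answer" in decision:            # model signals it is done
            return decision["answer"]
        result = TOOLS[decision["tool"]](**decision["args"])
        state["weather_result"] = result    # feedback folded into state
        observation = result
    return "step budget exhausted"

print(agent_loop("What's the weather in Oslo?"))
```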
Phase 3: Early World Foundation Model (WFM) systems with simulation and sensors
- Function: Spatial world models, simulation state, physical causality
- Context: Multimodal streams, time-aware memory (see the snapshot sketch below)
- Tools: 3D spatial maps, sensor fusion, interactive planners
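A sketch of what Phase 3 context might look like in code: a time-stamped spatial snapshot that fuses vision detections into a map of world objects. The `WorldObject` and `SpatialSnapshot` types are illustrative assumptions; the field names deliberately mirror the WCP example later on this page, and the fusion logic is a toy.

```python
# Sketch of Phase 3 context: a time-stamped spatial snapshot fusing a
# vision channel into one structure a planner can consume.
from dataclasses import dataclass, field
import time

@dataclass
class WorldObject:
    id: str
    type: str
    location: tuple[float, float, float]  # metres, in the map frame
    state: str

@dataclass
class SpatialSnapshot:
    timestamp: float = field(default_factory=time.time)
    objects: dict[str, WorldObject] = field(default_factory=dict)

    def fuse_vision(self, detections: list[WorldObject]) -> None:
        # Newer detections overwrite stale object poses.
        for det in detections:
            self.objects[det.id] = det

snap = SpatialSnapshot()
snap.fuse_vision([WorldObject("mug01", "mug", (1.2, 0.8, 0.9), "on_table")])
print(snap.objects["mug01"].location)
```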
Phase 4: True WFM systems with embodied experience
- Function: Ongoing, living interface with continuous world models (loop sketched below)
- Context: Real-time sensory inputs, social understanding
- Tools: ROS-like AI protocols, multi-agent collaboration
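A toy sketch of that "ongoing, living interface": a perceive, update, act loop running against a persistent world model rather than one-shot prompts. The sensor stream and action policy here are stand-ins, not any real protocol.

```python
# Sketch of the Phase 4 loop: the agent never "ends a session"; it
# continuously folds sensory events into a persistent world model
# and acts from that model.
import itertools
import time

def sensor_stream():
    # Stand-in for real-time sensory input (vision, audio, etc.).
    for t in itertools.count():
        yield {"tick": t, "sound": "clean the kitchen" if t == 2 else None}

world_model: dict = {"pending_goals": []}   # persists across iterations

def act(world: dict) -> None:
    if world["pending_goals"]:
        print("executing:", world["pending_goals"].pop(0))

for event in itertools.islice(sensor_stream(), 5):
    if event["sound"]:                       # perception updates the model
        world_model["pending_goals"].append(event["sound"])
    act(world_model)                         # action reads from the model
    time.sleep(0.01)
```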
WCP Technical Specification
Phase 4 World Context Protocol JSON structure and implementation
```json
{
  "timestamp": "2025-06-03T17:30:00Z",
  "agent": {
    "id": "wfm-007",
    "name": "EVA",
    "role": "Home Assistant",
    "memory_snapshot": {
      "user_preferences": { "music_genre": "ambient" },
      "object_locations": { "keys": "entry_table" }
    }
  },
  "environment": {
    "location": "home_kitchen",
    "map": {
      "objects": [
        {
          "id": "mug01",
          "type": "mug",
          "location": [1.2, 0.8, 0.9],
          "state": "on_table"
        }
      ]
    },
    "sensors": {
      "vision": { "active_objects": ["mug01"] },
      "audio": { "last_transcript": "Clean the kitchen" }
    }
  },
  "intent": {
    "user_command": "clean under the table",
    "parsed_goal": {
      "action": "clean_area",
      "target": "under_table"
    }
  },
  "planning": {
    "current_plan": [
      { "step": "locate vacuum_bot", "status": "complete" },
      { "step": "navigate to table", "status": "in_progress" }
    ]
  },
  "simulation": {
    "predicted_outcome": "success likely",
    "confidence": 0.92
  },
  "response": {
    "text": "Starting cleaning under the table now.",
    "speech": "playing"
  }
}
```

Top-level sections of the message:

- `agent`: Identity, memory snapshot, and internal state of the AI agent operating in the world.
- `environment`: Physical and sensory description of the world, including object locations and sensor data.
- `intent`: Parsed user instruction and the AI's interpretation of goals and objectives.
- `planning`: Current task breakdown and execution state with progress tracking.
- `simulation`: Predicted outcomes and confidence levels from internal world modeling.
- `response`: Multi-modal output including text, speech, and physical actions.
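As a rough illustration of how a consumer might validate this envelope, the sketch below checks that all top-level sections are present, using only the Python standard library and assuming the example above is saved as `wcp_message.json` (a filename chosen here for illustration). A production system would more likely use JSON Schema or Pydantic.

```python
# Minimal structural check of a WCP message against the sections
# documented above.
import json
from typing import Any

REQUIRED_SECTIONS = ("timestamp", "agent", "environment",
                     "intent", "planning", "simulation", "response")

def parse_wcp(raw: str) -> dict[str, Any]:
    msg = json.loads(raw)
    missing = [key for key in REQUIRED_SECTIONS if key not in msg]
    if missing:
        raise ValueError(f"WCP message missing sections: {missing}")
    return msg

with open("wcp_message.json") as f:   # the example document shown above
    wcp = parse_wcp(f.read())
print(wcp["planning"]["current_plan"][0]["status"])   # -> "complete"
```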
WCP ⇄ Agent Bridge Architecture
How World Foundation Models interface with real and simulated environments
Diagram: inputs from IoT sensors and simulation environments flow through the WCP bridge into the World Foundation Model agent, and actions flow back out.
World Context Collector
Transforms raw inputs into structured WCP messages
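One possible shape for the collector, normalizing vision detections and an audio transcript into the WCP envelope defined above; every function and parameter name here is an illustrative assumption.

```python
# Sketch of a World Context Collector: heterogeneous raw inputs in,
# a partial WCP message out.
from datetime import datetime, timezone

def collect_context(agent_id: str, detections: list[dict],
                    transcript: str | None) -> dict:
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "agent": {"id": agent_id},
        "environment": {
            "map": {"objects": detections},
            "sensors": {
                "vision": {"active_objects": [d["id"] for d in detections]},
                "audio": {"last_transcript": transcript},
            },
        },
    }

msg = collect_context("wfm-007",
                      [{"id": "mug01", "type": "mug",
                        "location": [1.2, 0.8, 0.9], "state": "on_table"}],
                      "Clean the kitchen")
print(msg["environment"]["sensors"]["vision"])
```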
Intent + Planning Pipeline
Converts commands into structured goals and plans
Agent Loop Hook
Connects WCP to core World Foundation Model
Action Dispatch
Executes decisions in real or simulated environments
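The remaining stages can be sketched end to end: a toy rule-based intent parser produces a structured goal, a planner expands it into steps, and a dispatcher maps each step onto (here, simulated) actuators. In a real system the parser and planner would be model-driven; the rules and actuator table below are assumptions.

```python
# End-to-end sketch: intent pipeline -> plan -> action dispatch.
def parse_intent(command: str) -> dict:
    # Toy rule-based parser standing in for a model-driven pipeline.
    if "clean" in command:
        return {"action": "clean_area",
                "target": "under_table" if "under" in command else "room"}
    raise ValueError(f"unrecognized command: {command}")

def make_plan(goal: dict) -> list[dict]:
    return [{"step": "locate vacuum_bot", "status": "pending"},
            {"step": f"navigate {goal['target']}", "status": "pending"}]

# Simulated actuators; a real dispatcher would target ROS2, a sim, etc.
ACTUATORS = {"locate": lambda arg: print("scanning map for", arg),
             "navigate": lambda arg: print("driving to", arg)}

def dispatch(plan: list[dict]) -> None:
    for item in plan:
        verb, _, arg = item["step"].partition(" ")
        ACTUATORS[verb](arg)               # execute in sim or real world
        item["status"] = "complete"

dispatch(make_plan(parse_intent("clean under the table")))
```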
State Synchronization
Redis, Convex, Kafka Streams
Memory & Timeline
Temporal.io, EventStore, Pinecone
Simulation & Planning
Three.js, Unity, IsaacSim, ROS2
Agent Interface
OpenAI Functions, LangGraph, Transformers
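As one concrete option from the state-synchronization row, here is a sketch using Redis pub/sub to broadcast WCP snapshots between components. The channel name `wcp:state` is an assumption; this requires the `redis` Python package and a running Redis server.

```python
# Sketch of the state-synchronization layer over Redis pub/sub.
import json
import redis

r = redis.Redis(host="localhost", port=6379)

def publish_wcp(msg: dict) -> None:
    # Broadcast the latest world snapshot to all subscribed components.
    r.publish("wcp:state", json.dumps(msg))

def follow_wcp() -> None:
    sub = r.pubsub()
    sub.subscribe("wcp:state")
    for event in sub.listen():            # blocks; run in its own process
        if event["type"] == "message":
            snapshot = json.loads(event["data"])
            print("new snapshot at", snapshot.get("timestamp"))
```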
MCP vs WCP Comparison
Understanding the fundamental differences in approach and capability
| Feature | MCP (Today) | WCP (Future) |
|---|---|---|
| Context Format | Text + JSON tool calls | Structured multimodal world graph |
| Session Model | Stateless or short-term context | Persistent agent state across time |
| Interaction | Request/response (chat turn) | Continuous interaction loop |
| Input Types | Text, structured data | Vision, audio, sensors, spatial state |
| Memory | Vector search (RAG) | Embedded memory + world map |
| Grounding | Weak (statistical patterns) | Strong (sensorimotor feedback) |
Build the Future Interface
Start exploring how the World Context Protocol will enable the next generation of AI systems that truly understand and interact with our world.