Internet-Draft Structured Data Schema Interaction June 2026
Zhou & Peng Expires 24 December 2026
Workgroup:
Working Group
Internet-Draft:
draft-zhou-structured-data-schema-interaction-00
Published:
Intended Status:
Standards Track
Expires:
Authors:
F. Zhou
Huawei Technologies
S. Peng
Huawei Technologies

Structured Data Schema Interaction Protocol for Multi-Agent Collaboration

Abstract

This document defines a structured data schema interaction protocol for multi-agent collaboration. As AI agents increasingly interoperate across heterogeneous platforms, natural-language-based communication suffers from semantic drift, high inference overhead, and ambiguous data flow. This protocol introduces a standardized key-value schema with semantic annotations, enabling deterministic, efficient, and interoperable agent-to-agent communication. A lightweight schema negotiation mechanism provides initial alignment at the beginning of communication, while an optional key-value schema update mechanism allows agents to reflect evolving requirements without breaking existing structured interactions.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 24 December 2026.

1. Introduction

Recent advances in large language models (LLMs) enable AI agents to plan and execute multi-step workflows for complex tasks. Through agent communication protocols, AI agents can call third-party tools and delegate tasks to other AI agents. When an AI agent acts on behalf of a human user, these interoperations require structured, deterministic, and efficient information exchange.

Today's agent ecosystems are characterized by rich, heterogeneous interaction information. Agents communicate through natural language text, structured data, and platform-specific documents. While natural language is expressive and flexible, it introduces three critical problems in multi-agent collaboration: semantic drift, high inference overhead, and ambiguous data flow.

This document introduces a structured data schema protocol, providing explicit semantic definitions with fixed data keys. The benefits of this approach are:

  1. Deterministic semantic alignment: By pre-defining key-value schemas with explicit semantic descriptions, client agents can parse user intent into structured payloads with minimal ambiguity, effectively suppressing semantic drift.
  2. Reduced token consumption and response latency: Structured communication eliminates the need for server-side agents to perform open-ended natural language understanding.
  3. Enhanced interoperability and decoupling: A standardized key-value format allows client agents to adapt dynamically to server agent interface changes, enabling massive heterogeneous agent onboarding without per-integration custom parsing logic.

This protocol is designed to complement existing agent protocols (e.g., A2A [A2A], MCP [MCP]) by defining the data-model contract. It does not mandate transport or authorization mechanisms; those may be provided by underlying protocols. Existing structured data exchange protocols such as JMAP Sharing [RFC9670] demonstrate that standardized key-value data models with explicit sharing semantics can enable scalable cross-system interoperability, providing a design precedent for agent-to-agent schema negotiation.

1.1. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

2. Terminology

AI Agent
An entity with built-in intelligence, which can perform actions to accomplish tasks, possibly on behalf of an end-user or another agent.
Client Agent
The agent that initiates a structured interaction request, typically acting on behalf of an end-user.
Server Agent
The agent that exposes structured capabilities and responds to client agent requests with key-value formatted data.
Data Schema (Schema)
An LLM-readable template that defines the set of keys, value types, and semantic descriptions required for a specific interaction scenario.
Semantic Annotation
A human/LLM-readable description associated with a schema key, explaining its meaning, expected value range, and key semantics.
Structured Communication
The practice of exchanging information strictly within the boundaries of an agreed-upon data schema, as opposed to free-form natural language.
Schema Negotiation
The process by which a client agent discovers and obtains the data schema and semantic annotations from a server agent prior to sending a structured payload.
Schema Update
An OPTIONAL mechanism by which a server agent refines its published schema based on observed usage patterns, returning new or updated key definitions to the client agent.

3. Protocol Overview

The Structured Data Schema Interaction Protocol operates in three phases: Schema Negotiation, Structured Data Schema Exchange, and (optionally) Schema Update. The protocol assumes that transport-level connectivity (e.g., HTTP, JSON-RPC, or protocol-specific channels) has already been established by underlying agent communication frameworks.

3.1. Interaction Model

The protocol adopts a client-server interaction model between two cooperating agents:

Client Agent (C)
The agent that requires a service. It is responsible for (1) discovering the server agent's schema, (2) mapping user intent or upstream data into the negotiated key-value structure, and (3) transmitting the structured payload.
Server Agent (S)
The agent that exposes a capability. It is responsible for (1) publishing its key-value schema and semantic annotations, (2) validating incoming structured payloads against the schema, (3) executing the requested capability, and (4) optionally returning schema refinement suggestions.

The interaction is stateless from the protocol perspective; any session state MUST be managed by the application layer or underlying transport protocol.

3.2. Message Flow

Figure 1 illustrates the complete message flow of a structured key-value interaction.


+--------+       +--------+       +----------------+
|End User|       | Client |       |     Server     |
|        |       | Agent  |       |     Agent      |
+---+----+       +---+----+       +-------+--------+
    |                |                    |        |
    | (1) Natural    |                    |        |
    |     language   |                    |        |
    |     request    |                    |        |
    | - - - - - - - >|                    |        |
    |                |                    |        |
    |                |----\               |        |
    |                |    | (2) Agent     |        |
    |                |    |     discovery |        |
    |                |    |     + scenario|        |
    |                |    |     id        |        |
    |                |<---/               |        |
    |                |                    |        |
    |                | (3) Get schema     |        |
    |                |     for scenario   |        |
    |                |------------------->|        |
    |                |                    |        |
    |                | (3) Schema +       |        |
    |                |     semantic desc. |        |
    |                |< - - - - - - - - - |        |
    |                |                    |        |
    |                |----\               |        |
    |                |    | (4) LLM-based |        |
    |                |    |     parsing   |        |
    |                |    |     + mapping |        |
    |                |<---/               |        |
    |                |                    |        |
    |                | (5) Structured     |        |
    |                |     payload        |        |
    |                |------------------->|        |
    |                |                    |        |
    |                | (6) Execution      |        |
    |                |     result         |        |
    |                |< - - - - - - - - - |        |
    |                |                    |        |
    |                | (7) Optional       |        |
    |                |     schema update  |        |
    |                |< - - - - - - - - - |        |
    |                |                    |        |
+---+----+       +---+----+       +-------+--------+
|End User|       | Client |       |     Server     |
|        |       | Agent  |       |     Agent      |
+--------+       +--------+       +----------------+

Figure 1: Structured Key-Value Schema Interaction Message Flow

Step (1) User Request: The end user provides a natural language request to the client agent (e.g., "Help me order a large iced latte" or "Retouch this photo with a vintage filter"). This request is unconstrained free-form text that expresses the user's intent but does not yet identify the target service or data structure.

Step (2) Agent Discovery and Scenario Identification: The client agent analyzes the user's request using its internal LLM or intent-classification component to determine which server agent can fulfil the request and which interaction scenario applies.

This step MAY leverage an agent registry, capability directory, or historical context to disambiguate between multiple candidate server agents. If no suitable server agent is found, the client agent SHOULD report the failure to the end user.

Step (3) Schema Negotiation: The client agent requests the server agent's schema template and semantic descriptions for a given interaction scenario (e.g., flight booking, photo editing). The server agent returns a JSON object containing the schema keys, value types, and semantic annotations.

Step (4) Intent Parsing and Mapping: The client agent uses its LLM to parse the user's request and map it to the keys of the negotiated schema, guided by the semantic descriptions returned in Step (3).

Step (5) Structured Exchange: The client agent transmits the populated key-value payload to the server agent. All REQUIRED keys MUST be present; OPTIONAL keys MAY be omitted if not applicable.

Step (6) Execution Result: The server agent validates the payload, executes the requested capability, and returns a result. The result itself SHOULD be structured according to a pre-negotiated response schema when applicable.

Step (7) Optional Schema Update: If the server agent observes recurring patterns in the "other" key or receives explicit capability extension requests, it MAY return a schema update suggestion containing new keys or refined semantic descriptions. This step is OPTIONAL and serves as a lightweight feedback loop for long-term protocol refinement.

4. Structured Key-Value Format

Structured data models such as JMAP for Contacts [RFC9610] demonstrate that explicitly typed keys with semantic annotations enable reliable machine parsing. This protocol applies the same principle to agent-to-agent communication: every key in the schema template carries a declared type and a semantic description that constrains the client agent's generation space.

This section defines the syntax and semantics of the structured key-value interaction format.

4.1. Schema Template

A schema template is a JSON object that declares the expected structure of an interaction payload. It MUST contain the following top-level members:

schema_id
REQUIRED. A string that uniquely identifies the schema version within the server agent's namespace.
scenario
REQUIRED. A human-readable string describing the interaction scenario (e.g., "flight_booking", "photo_retouch").
keys
REQUIRED. An array of key definition objects, each describing a single key expected in the payload.

Each key definition object contains the following members:

key_name
REQUIRED. A string identifier for the key, using snake_case convention. Key names MUST be unique within the schema.
key_type
REQUIRED. A JSON Schema type identifier (e.g., "string", "integer", "boolean", "array", "object").
semantic_description
REQUIRED. A human-readable and machine-processable string explaining the business meaning of the key, acceptable value enumerations, and mapping examples from natural language. This description serves as a prompt-level anchor for the client agent's LLM, constraining the generation space and reducing hallucinated key mappings. Server agents SHOULD maintain semantic descriptions in the same language as the expected end-user queries, or provide multilingual annotations when serving cross-lingual client agents.
required
REQUIRED. A boolean indicating whether the key MUST be present in the payload.
default_value
OPTIONAL. A default value to be used when the key is omitted and not required. If absent and the key is optional, the server agent MUST apply its own default logic or ignore the key.

Figure 2 shows an example schema template for a flight booking scenario.


{
  "schema_id": "flight_booking_v1",
  "scenario": "flight_booking",
  "keys": [
    {
      "key_name": "origin",
      "key_type": "string",
      "required": true,
      "default_value": null,
      "semantic_description": "Departure city or airport code. Acceptable values: IATA airport codes (e.g., PEK, SHA, JFK) or city names in English or local language. Example mapping: 'from Beijing' -> 'PEK'."
    },
    {
      "key_name": "destination",
      "key_type": "string",
      "required": true,
      "default_value": null,
      "semantic_description": "Arrival city or airport code. Acceptable values: IATA airport codes or city names. Example mapping: 'to Shanghai' -> 'SHA'."
    },
    {
      "key_name": "departure_date",
      "key_type": "string",
      "required": true,
      "default_value": null,
      "semantic_description": "Date of departure in ISO 8601 format (YYYY-MM-DD). Example mapping: 'next Monday' -> '2026-05-04'."
    },
    {
      "key_name": "cabin_class",
      "key_type": "string",
      "required": false,
      "default_value": "economy",
      "semantic_description": "Cabin class preference. Acceptable values: economy, premium_economy, business, first. Example mapping: 'business class' -> 'business'."
    },
    {
      "key_name": "passenger_count",
      "key_type": "integer",
      "required": false,
      "default_value": 1,
      "semantic_description": "Number of passengers. Range: 1-9. Example mapping: 'two people' -> 2."
    },
    {
      "key_name": "other",
      "key_type": "string",
      "required": false,
      "default_value": null,
      "semantic_description": "Escape valve for unstructured semantic fragments that cannot be mapped to existing keys. Example mapping: 'window seat please' -> 'window seat'."
    }
  ]
}

Figure 2: Example Schema Template for Flight Booking
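
The structural rules of this section can be checked mechanically. The following Python fragment is a minimal, non-normative sketch of such a check. The function name check_schema_template and the error strings are illustrative assumptions, not part of the protocol, and the list of accepted key_type values assumes the seven JSON Schema primitive types.

```python
import re

# Assumption: key_type is limited to the JSON Schema primitive type names.
ALLOWED_TYPES = ("string", "number", "integer", "boolean",
                 "array", "object", "null")
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


def check_schema_template(template: dict) -> list[str]:
    """Return a list of problems found in a schema template (empty if none)."""
    problems = []
    # Top-level members that Section 4.1 marks as REQUIRED.
    for member in ("schema_id", "scenario", "keys"):
        if member not in template:
            problems.append(f"missing top-level member: {member}")
    seen = set()
    for index, key_def in enumerate(template.get("keys", [])):
        # Members of a key definition that are marked REQUIRED.
        for member in ("key_name", "key_type", "semantic_description", "required"):
            if member not in key_def:
                problems.append(f"keys[{index}]: missing {member}")
        name = key_def.get("key_name")
        if isinstance(name, str):
            if not SNAKE_CASE.match(name):
                problems.append(f"keys[{index}]: {name!r} is not snake_case")
            if name in seen:
                problems.append(f"keys[{index}]: duplicate key_name {name!r}")
            seen.add(name)
        if "key_type" in key_def and key_def["key_type"] not in ALLOWED_TYPES:
            problems.append(f"keys[{index}]: unknown key_type {key_def['key_type']!r}")
        if "required" in key_def and not isinstance(key_def["required"], bool):
            problems.append(f"keys[{index}]: required must be a boolean")
    return problems
```

A client agent could run such a check on a negotiated template before using it, and treat a non-empty result as a negotiation failure.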

4.2. Semantic Annotation

Semantic annotations provide the contextual anchor that enables client agent LLMs to perform accurate intent-to-key mapping. Each key definition in the schema template MUST be augmented with a "semantic_description" field, as required in Section 4.1.

A useful semantic description states the key's meaning, its acceptable values, and at least one mapping example from natural language. Figure 3 repeats the cabin_class key definition from Figure 2 to highlight these elements.


{
  "key_name": "cabin_class",
  "key_type": "string",
  "required": false,
  "default_value": "economy",
  "semantic_description": "Cabin class preference. Acceptable values: economy, premium_economy, business, first. Example mapping: 'business class' -> 'business'."
}

Figure 3: Semantic Annotation Example

Server agents SHOULD maintain semantic descriptions in the same language as the expected end-user queries, or provide multilingual annotations when serving cross-lingual client agents.
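
One way a client agent might use semantic descriptions as a prompt-level anchor is to render the negotiated schema into the text it gives to its LLM. The following Python fragment is a non-normative sketch. The function name render_schema_prompt and the prompt wording are assumptions made for illustration; this document does not define a prompt format.

```python
def render_schema_prompt(template: dict, user_request: str) -> str:
    """Build a prompt that lists every key with its type and annotation."""
    lines = [
        f"Scenario: {template['scenario']} (schema {template['schema_id']})",
        "Fill in the keys below as a JSON object. Use only these keys.",
    ]
    for key_def in template["keys"]:
        flag = "required" if key_def["required"] else "optional"
        # One line per key: name, declared type, and the semantic description.
        lines.append(
            f"- {key_def['key_name']} ({key_def['key_type']}, {flag}): "
            f"{key_def['semantic_description']}"
        )
    lines.append(f"User request: {user_request}")
    return "\n".join(lines)
```

Listing only the declared keys, each next to its description, is what narrows the generation space. The LLM call itself is out of scope for this sketch.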

4.3. The "other" Key

The key named "other" is reserved within every schema template as an escape valve for unstructured semantic fragments that cannot be mapped to existing keys. Its usage is subject to the following rules:

The presence of meaningful content in the "other" key signals a potential schema coverage gap. Server agents MAY use this signal as input to the OPTIONAL schema update mechanism described in Section 6.2.

5. Illustrations of the Protocol

This section illustrates the application of the structured key-value interaction protocol.

5.1. Flight Booking

Scenario: An end user asks a personal AI assistant (client agent) to book a flight through an airline service agent (server agent).

User utterance: "Book me a flight from Beijing to Shanghai next Monday, business class, and I prefer a window seat."

Schema negotiation: The client agent requests the flight booking schema from the server agent. The server agent returns the schema template shown in Figure 2, augmented with semantic annotations.

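Intent parsing and mapping: Guided by the semantic descriptions, the client agent maps the utterance to schema keys as follows:

  • "from Beijing" -> origin = "PEK"
  • "to Shanghai" -> destination = "SHA"
  • "next Monday" -> departure_date = "2026-05-04"
  • "business class" -> cabin_class = "business"
  • No passenger count is stated, so passenger_count takes its default value of 1.
  • "I prefer a window seat" matches no defined key and is placed in other as "window seat".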

Structured payload:


{
  "schema_id": "flight_booking_v1",
  "payload": {
    "origin": "PEK",
    "destination": "SHA",
    "departure_date": "2026-05-04",
    "cabin_class": "business",
    "passenger_count": 1,
    "other": "window seat"
  }
}

Figure 4: Structured Payload for Flight Booking

Execution: The server agent validates the payload, queries the flight inventory, and returns a booking confirmation with flight number, departure time, and seat assignment.
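
The validation step can be sketched as follows. This Python fragment is non-normative and rests on several assumptions: the names validate_payload and TYPE_MAP are hypothetical, keys that the schema does not declare are silently dropped (this document does not say how they are handled), and an omitted optional key receives its default_value only when that value is not null.

```python
# Maps key_type identifiers to Python types. bool is handled separately
# below because bool is a subclass of int in Python.
TYPE_MAP = {"string": str, "integer": int, "number": (int, float),
            "boolean": bool, "array": list, "object": dict}


def validate_payload(template: dict, message: dict) -> dict:
    """Validate a payload message and return the payload with defaults applied.

    Raises ValueError on the first problem found.
    """
    if message.get("schema_id") != template["schema_id"]:
        raise ValueError("schema_id mismatch")
    payload = message.get("payload", {})
    result = {}
    for key_def in template["keys"]:
        name = key_def["key_name"]
        if name not in payload:
            if key_def["required"]:
                raise ValueError(f"missing required key: {name}")
            if key_def.get("default_value") is not None:
                result[name] = key_def["default_value"]
            continue
        value = payload[name]
        expected = TYPE_MAP[key_def["key_type"]]
        is_bool = isinstance(value, bool)
        # Reject wrong types, and reject booleans passed for numeric keys.
        if not isinstance(value, expected) or (is_bool and expected is not bool):
            raise ValueError(f"{name}: expected {key_def['key_type']}")
        result[name] = value
    return result
```

Applied to the schema in Figure 2 and the payload in Figure 4, this check passes and returns the payload unchanged, since every optional key is already present.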

Optional schema update: If the server agent observes frequent "other" entries mentioning seat preferences, it MAY return a schema update suggestion adding a "seat_preference" key with acceptable values "window", "aisle", or "none".

5.2. Photo Editing

Scenario: An end user asks a personal AI assistant to retouch a photo through a cloud-based image editing agent.

User utterance: "Please retouch this photo: smooth the skin, whiten teeth, make the background blurry, and add a vintage filter. Also, I want my eyes to look bigger."

Schema negotiation: The client agent discovers the photo editing schema from the server agent. Example keys include: skin_smoothing (integer 0-10), teeth_whitening (boolean), background_blur (boolean), filter_style (string), eye_enlargement (boolean).

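Intent parsing and mapping: Guided by the semantic descriptions, the client agent maps the utterance to schema keys as follows:

  • "smooth the skin" -> skin_smoothing = 7 (the utterance states no intensity, so the client agent selects a value within the 0-10 range)
  • "whiten teeth" -> teeth_whitening = true
  • "make the background blurry" -> background_blur = true
  • "add a vintage filter" -> filter_style = "vintage"
  • "I want my eyes to look bigger" -> other = "increase eye size proportionally"; eye_enlargement stays false (see the note following Figure 5).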

Structured payload:


{
  "schema_id": "photo_retouch_v2",
  "payload": {
    "skin_smoothing": 7,
    "teeth_whitening": true,
    "background_blur": true,
    "filter_style": "vintage",
    "eye_enlargement": false,
    "other": "increase eye size proportionally"
  }
}

Figure 5: Structured Payload for Photo Editing

Note: The client agent mapped "eyes to look bigger" to the "other" key because the current schema does not define a granular eye-size adjustment key (only a boolean eye_enlargement). The server agent MAY later propose a schema update introducing "eye_size_scale" (number, 1.0-1.5) based on aggregated "other" patterns.

6. Capability Enhancement

This section describes OPTIONAL mechanisms that enhance the core structured interaction protocol. Implementations MAY support none, some, or all of these capabilities. They are designed to be transparent to agents that do not implement them.

6.1. LLM-based Semantic Parsing

The protocol assumes that client agents employ an internal LLM to perform natural language understanding (NLU) and map user intent to schema keys. The quality of this mapping directly affects the correctness of the structured payload.

Recommended practices for LLM-based semantic parsing include:

Server agents are not required to perform NLU; their role is to validate and execute structured payloads. This separation of concerns reduces server-side inference costs and supports deterministic execution.

6.2. Key-Value Schema Self-Evolution

The schema self-evolution mechanism provides a dynamic, backward-compatible path for server agents to refine their published schemas based on operational feedback and differential semantics extracted from the "other" key. It is OPTIONAL and does not alter the core structured exchange semantics.

The mechanism consists of three coordinated components: (1) a trigger mechanism based on a differential semantic pool, (2) a schema self-evolution algorithm that generates dynamic patches, and (3) a long-term evaluation framework that ensures convergence and prevents schema bloat.

6.2.1. Differential Semantic Pool and Trigger Mechanism

To avoid erroneous evolution caused by sporadic anomalies and to ensure that genuine common or personalized needs are not missed, server agents MAY maintain a Differential Semantic Pool (DSP).

The DSP collects natural-language fragments from the "other" key that could not be mapped to existing schema keys. Its operation follows three stages:

Semantic Vector Clustering
Each fragment extracted from the "other" key MUST be encoded into a semantic vector. Server agents SHOULD use dense vector embeddings (e.g., sentence-transformer-based) to represent semantic meaning. Vectors are clustered using similarity thresholds (e.g., cosine similarity >= 0.85). Each cluster represents a distinct, recurring semantic intent that is not covered by the current schema.
Heat Decay Counting
Each cluster maintains a heat score that reflects the intensity of the corresponding semantic demand. When a new fragment joins a cluster, its heat score increases by a fixed increment. The heat score MUST decay over time (e.g., exponential decay with a half-life of 24 hours) to prevent obsolete or transient patterns from accumulating undue weight.
Threshold-Based Triggering
A schema evolution trigger fires when a cluster's heat score exceeds a configurable evolution threshold (e.g., 50 accumulated heat units) AND the cluster contains a minimum number of distinct source requests (e.g., >= 5 unique client agents or >= 10 total occurrences within a 7-day window). This dual-gate design ensures that evolution is triggered only when the same semantic appeal has gathered sufficient strength across multiple interactions, filtering out accidental outliers while capturing genuine collective needs.
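
The clustering stage can be illustrated with a simple greedy scheme. The following Python fragment is a non-normative sketch: it assumes that fragments have already been encoded as vectors by some embedding model, and the names cosine_similarity and assign_to_cluster, as well as the centroid-based assignment strategy, are illustrative choices rather than requirements of this document.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def assign_to_cluster(vector, clusters, threshold=0.85):
    """Add vector to the most similar cluster, or start a new one.

    Each cluster is a dict with a 'centroid' and its member 'vectors'.
    Returns the index of the cluster the vector joined.
    """
    best_index, best_score = None, threshold
    for index, cluster in enumerate(clusters):
        score = cosine_similarity(vector, cluster["centroid"])
        if score >= best_score:
            best_index, best_score = index, score
    if best_index is None:
        # No cluster is similar enough: the fragment starts a new cluster.
        clusters.append({"centroid": list(vector), "vectors": [list(vector)]})
        return len(clusters) - 1
    cluster = clusters[best_index]
    cluster["vectors"].append(list(vector))
    count = len(cluster["vectors"])
    # Recompute the centroid as the component-wise mean of all members.
    cluster["centroid"] = [sum(col) / count for col in zip(*cluster["vectors"])]
    return best_index
```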

Server agents MAY maintain separate DSP instances per scenario or per client-agent cohort, enabling both global schema evolution and personalized schema branching.
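
Heat decay counting and the dual-gate trigger might be implemented along the following lines. This Python fragment is a non-normative sketch that uses the example values given above (24-hour half-life, threshold of 50, at least 5 unique client agents). The class name ClusterHeat, the increment of 1.0 per fragment, and the use of hours as the time unit are assumptions. The alternative gate based on total occurrences within a 7-day window is omitted for brevity.

```python
from dataclasses import dataclass, field

HALF_LIFE_HOURS = 24.0   # example half-life from the text
HEAT_INCREMENT = 1.0     # assumed fixed increment per fragment
HEAT_THRESHOLD = 50.0    # example evolution threshold from the text
MIN_UNIQUE_CLIENTS = 5   # example distinct-source gate from the text


@dataclass
class ClusterHeat:
    heat: float = 0.0
    last_update_hours: float = 0.0
    client_ids: set = field(default_factory=set)

    def decay_to(self, now_hours: float) -> None:
        """Apply exponential decay for the time elapsed since the last update."""
        elapsed = now_hours - self.last_update_hours
        self.heat *= 0.5 ** (elapsed / HALF_LIFE_HOURS)
        self.last_update_hours = now_hours

    def add_fragment(self, client_id: str, now_hours: float) -> None:
        """Decay the score to the present, then add one fragment's increment."""
        self.decay_to(now_hours)
        self.heat += HEAT_INCREMENT
        self.client_ids.add(client_id)

    def should_trigger(self, now_hours: float) -> bool:
        """Both gates must pass: enough heat and enough distinct clients."""
        self.decay_to(now_hours)
        return (self.heat > HEAT_THRESHOLD
                and len(self.client_ids) >= MIN_UNIQUE_CLIENTS)
```

With these values, a cluster that stops receiving fragments loses half of its heat every 24 hours, so a burst of requests from a single client agent cannot trigger evolution on its own.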

6.2.2. Schema Self-Evolution Algorithm

Once the trigger mechanism identifies a mature semantic cluster, the server agent executes a self-evolution algorithm to synthesize a schema patch from the cluster's natural-language content.

Base Schema
The immutable, versioned schema template (see Section 4.1) that defines the stable contract. Base schema keys MUST NOT be removed or have their types changed by dynamic patches; only additive or semantic-description refinements are permitted in the patch layer.
Dynamic Patch Generation
For each triggered semantic cluster, the server agent MUST perform intent induction, conflict detection, and patch packaging:
  • Intent Induction: Use an internal LLM or rule-based extractor to summarize the cluster's natural-language fragments into a candidate key_name, key_type, required flag, default_value, and semantic_description. The semantic_description MUST include mapping examples derived from real fragments in the cluster.
  • Conflict Detection: Before finalizing the patch, the server agent MUST check that the candidate key_name does not collide with existing keys in the base schema or active patches. If a collision occurs, the server agent SHOULD merge the candidate into the existing key by refining its semantic_description rather than creating a duplicate.
  • Patch Packaging: The approved candidate is packaged as a schema patch containing new_keys, modified_keys, or both. The patch MUST carry a patch_id, a parent_schema_id, a timestamp, and an expiration date.
Online Update Delivery
The server agent MAY deliver active patches through inline suggestions in the schema_update_suggestion field of the execution result response or through a dedicated schema polling endpoint.
Personalized Adaptation
Server agents MAY maintain per-client patch stacks. When a specific client agent repeatedly submits personalized requests that fall into a unique semantic cluster (e.g., a frequent business traveller who always requests "extra legroom" and "quiet cabin"), the server agent MAY generate a client-specific patch that adds keys such as "seat_preference" and "cabin_zone" only for that client. This transforms standardized task flows into privately customized service responses without polluting the global schema.
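
The conflict detection and patch packaging steps of Dynamic Patch Generation might look like the following Python fragment. It is a non-normative sketch: the function name build_patch, the field names timestamp and expires_at, the 30-day patch lifetime, and the merge strategy of appending the candidate description are all assumptions. Only patch_id, parent_schema_id, new_keys, and modified_keys are named in this document.

```python
import uuid
from datetime import datetime, timedelta, timezone


def build_patch(base_schema: dict, active_patch_keys: dict, candidate: dict,
                lifetime_days: int = 30) -> dict:
    """Package a candidate key as a schema patch, merging on name collision.

    active_patch_keys maps key_name to the key definition of each key
    that is already carried by an active patch.
    """
    known = {k["key_name"]: k for k in base_schema["keys"]}
    known.update(active_patch_keys)
    now = datetime.now(timezone.utc)
    patch = {
        "patch_id": str(uuid.uuid4()),
        "parent_schema_id": base_schema["schema_id"],
        "timestamp": now.isoformat(),
        "expires_at": (now + timedelta(days=lifetime_days)).isoformat(),
        "new_keys": [],
        "modified_keys": [],
    }
    name = candidate["key_name"]
    if name in known:
        # Collision: refine the existing description instead of duplicating
        # the key. The existing key_type is left untouched.
        merged = (known[name]["semantic_description"] + " "
                  + candidate["semantic_description"])
        patch["modified_keys"].append(
            {"key_name": name, "semantic_description": merged})
    else:
        # New keys start in the experimental state (see Section 6.2.3).
        patch["new_keys"].append({**candidate, "experimental": True})
    return patch
```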

6.2.3. Long-Term Evaluation Mechanism

To prevent schema bloat and the accumulation of erroneous keys, server agents that implement self-evolution MUST employ a long-term evaluation mechanism that continuously assesses the quality and necessity of every patch key.

Evaluation Metrics: For each key introduced through a dynamic patch, the server agent SHOULD track at minimum the following metrics over a configurable observation window (default 30 days):

  • usage_frequency: The ratio of payloads in which the key is present to total payloads for the scenario.
  • semantic_alignment_accuracy: The ratio of payloads where the key's value matches the intent expressed in the original natural-language request, as judged by an internal LLM or manual audit sampler.
  • value_type_correctness: The ratio of payloads where the received value conforms to the declared key_type.
  • client_adoption_rate: The ratio of distinct client agents that have successfully adopted the key to those that received the patch.

Two-Phase Lifecycle: All new keys introduced via dynamic patches MUST begin in an experimental state and transition through two phases.

Trial Period and Stabilization Decision
The key remains experimental for a configurable duration (default 7 days). During this period, the server agent collects the metrics above. The key MUST be advertised with an "experimental: true" flag in the patch so that client agents know the key is provisional. Upon completion of the trial period, the server agent evaluates the aggregated metrics:
  • Promotion: If usage_frequency >= 15%, semantic_alignment_accuracy >= 80%, and value_type_correctness >= 90%, the key is promoted to stable. The "experimental" flag is removed, and the key becomes part of the long-term supported schema.
  • Deprecation: If usage_frequency < 5% OR semantic_alignment_accuracy < 60% OR value_type_correctness < 70%, the key is marked deprecated. Deprecated keys remain in the schema for a grace period (default 14 days) with a "deprecated: true" flag, after which they MAY be moved to a withdrawn state.
  • Withdrawal: A withdrawn key is removed from active patches. Server agents MUST still accept the key in incoming payloads for an additional backward-compatibility window (default 30 days) to avoid breaking legacy client agents, but SHOULD log a warning.
Schema Convergence Guard
The server agent MUST enforce a maximum limit on the number of active experimental keys per scenario (e.g., no more than 10). When the limit is reached, new triggers MUST be queued until existing experimental keys are either promoted or deprecated. This guard prevents runaway schema bloat and ensures that the protocol converges to a stable, high-signal key set over time.
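
The stabilization decision and the convergence guard reduce to a few threshold comparisons. The following Python fragment is a non-normative sketch using the thresholds stated above. The function names are hypothetical. The text does not say what happens to a key whose metrics fall between the promotion and deprecation thresholds; keeping such a key experimental is an assumption of this sketch.

```python
def stabilization_decision(usage_frequency: float,
                           semantic_alignment_accuracy: float,
                           value_type_correctness: float) -> str:
    """Return 'stable', 'deprecated', or 'experimental' for a patch key.

    All inputs are ratios in the range 0.0-1.0.
    """
    if (usage_frequency >= 0.15
            and semantic_alignment_accuracy >= 0.80
            and value_type_correctness >= 0.90):
        return "stable"
    if (usage_frequency < 0.05
            or semantic_alignment_accuracy < 0.60
            or value_type_correctness < 0.70):
        return "deprecated"
    # Assumption: metrics in the middle band leave the key experimental.
    return "experimental"


def admit_trigger(active_experimental: int, limit: int = 10) -> bool:
    """Convergence guard: admit a new key only below the limit, else queue."""
    return active_experimental < limit
```

For example, a key present in 10% of payloads with high accuracy meets neither the promotion nor the deprecation condition, which is the case the assumption above covers.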

7. Security Considerations

Structured key-value payloads may contain sensitive personal information (e.g., dietary preferences, biometric retouching parameters, location data). Implementations MUST protect this data in transit and at rest using mechanisms appropriate to their threat model.

Schema negotiation and update messages MUST be integrity-protected to prevent man-in-the-middle attacks that could inject malicious keys or semantic descriptions designed to exfiltrate data or trigger unauthorized actions.
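
This document does not mandate a particular integrity mechanism, and in many deployments the underlying transport (e.g., TLS) will provide it. Purely as an illustration, the following non-normative Python fragment shows message-level integrity protection with HMAC-SHA256. It assumes that the two agents already share a secret key, which this document does not specify, and the function names are hypothetical. Serializing with sorted keys is a simplification and not a full JSON canonicalization scheme.

```python
import hashlib
import hmac
import json


def sign_message(message: dict, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag over a deterministic JSON encoding."""
    encoded = json.dumps(message, sort_keys=True,
                         separators=(",", ":")).encode("utf-8")
    return hmac.new(key, encoded, hashlib.sha256).hexdigest()


def verify_message(message: dict, tag: str, key: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign_message(message, key), tag)
```

A client agent that receives a schema template with a tag would call verify_message before passing any semantic description to its LLM, and discard the template if verification fails.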

When the "other" key contains free-form natural language, server agents MUST apply the same input validation and sanitization practices as they would for any natural language input, preventing prompt injection or command injection attacks.

The OPTIONAL schema update mechanism MUST require authentication and authorization if it exposes new capabilities or modifies security-relevant keys (e.g., keys related to payment, identity, or access control).

8. IANA Considerations

This document has no IANA actions.

9. References

9.1. Normative References

[RFC9670]
Jenkins, N., Ed., "JSON Meta Application Protocol (JMAP) Sharing", RFC 9670, DOI 10.17487/RFC9670, <https://www.rfc-editor.org/info/rfc9670>.
[RFC9610]
Jenkins, N., Ed., "JSON Meta Application Protocol (JMAP) for Contacts", RFC 9610, DOI 10.17487/RFC9610, <https://www.rfc-editor.org/info/rfc9610>.
[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, <https://www.rfc-editor.org/info/rfc2119>.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, <https://www.rfc-editor.org/info/rfc8174>.

9.2. Informative References

[A2A]
Google, "Agent2Agent (A2A) Protocol", <https://a2a-protocol.org/latest>.
[MCP]
Anthropic, "Model Context Protocol (MCP)", <https://modelcontextprotocol.io/docs/getting-started/intro>.

Appendix A. Example Messages

This appendix provides complete, non-normative examples of schema negotiation, structured payload exchange, and optional schema update messages.

A.1. Schema Negotiation Request and Response

Client agent request:


{
  "method": "get_schema_template",
  "params": {
    "scenario": "flight_booking",
    "preferred_language": "en-US"
  }
}

Figure 6: Schema Negotiation Request

Server agent response:


{
  "schema_id": "flight_booking_v1",
  "scenario": "flight_booking",
  "keys": [
    {"key_name": "origin", "key_type": "string", "required": true, "semantic_description": "Departure city or airport code. Acceptable values: IATA airport codes (e.g., PEK, SHA, JFK) or city names. Example: 'from Beijing' -> 'PEK'."},
    {"key_name": "destination", "key_type": "string", "required": true, "semantic_description": "Arrival city or airport code. Example: 'to Shanghai' -> 'SHA'."},
    {"key_name": "departure_date", "key_type": "string", "required": true, "semantic_description": "Departure date in ISO 8601 format (YYYY-MM-DD). Example: 'next Monday' -> '2026-05-04'."},
    {"key_name": "cabin_class", "key_type": "string", "required": false, "default_value": "economy", "semantic_description": "Cabin class. Acceptable values: economy, premium_economy, business, first. Example: 'business class' -> 'business'."},
    {"key_name": "passenger_count", "key_type": "integer", "required": false, "default_value": 1, "semantic_description": "Number of passengers. Range: 1-9. Example: 'two people' -> 2."},
    {"key_name": "other", "key_type": "string", "required": false, "semantic_description": "Natural-language descriptions that cannot be mapped to existing keys. Example: 'window seat' -> 'window seat'."}
  ]
}

Figure 7: Schema Negotiation Response

A.2. Structured Payload Exchange

Client agent request payload:


{
  "schema_id": "flight_booking_v1",
  "payload": {
    "origin": "PEK",
    "destination": "SHA",
    "departure_date": "2026-05-04",
    "cabin_class": "business",
    "passenger_count": 1,
    "other": "window seat"
  }
}

Figure 8: Client Agent Request Payload

Server agent response:


{
  "booking_id": "BK-20260430-001",
  "status": "confirmed",
  "flight_number": "CA1234",
  "departure_time": "2026-05-04T09:00:00+08:00",
  "arrival_time": "2026-05-04T11:20:00+08:00",
  "seat": "12A",
  "total_amount": "CNY 2,450.00",
  "schema_update_suggestion": {
    "new_keys": [
      {
        "key_name": "seat_preference",
        "key_type": "string",
        "required": false,
        "default_value": "none",
        "semantic_description": "Seat preference. Acceptable values: window, aisle, none."
      }
    ]
  }
}

Figure 9: Server Agent Response
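
A client agent that receives a response such as Figure 9 may fold the suggested keys into its cached copy of the schema. The following non-normative Python fragment sketches one way to do so. The function name apply_update_suggestion and the policy of ignoring suggested keys whose names already exist are assumptions; this document leaves client-side handling of suggestions open.

```python
import copy


def apply_update_suggestion(schema: dict, response: dict) -> dict:
    """Return a copy of schema extended with any suggested new keys."""
    updated = copy.deepcopy(schema)
    suggestion = response.get("schema_update_suggestion") or {}
    existing = {k["key_name"] for k in updated["keys"]}
    for key_def in suggestion.get("new_keys", []):
        # Skip names that already exist so key names stay unique.
        if key_def["key_name"] not in existing:
            updated["keys"].append(copy.deepcopy(key_def))
            existing.add(key_def["key_name"])
    return updated
```

Returning a copy leaves the originally negotiated schema untouched, so the client agent can fall back to it if the suggestion is later withdrawn.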

A.3. Photo Editing Schema Example


{
  "schema_id": "photo_retouch_v2",
  "scenario": "photo_retouch",
  "keys": [
    {"key_name": "skin_smoothing", "key_type": "integer", "required": false, "default_value": 0, "semantic_description": "Skin smoothing intensity. Range: 0-10, where 0 means off. Example: 'light skin smoothing' -> 3."},
    {"key_name": "teeth_whitening", "key_type": "boolean", "required": false, "default_value": false, "semantic_description": "Whether teeth whitening is enabled. Example: 'make the teeth whiter' -> true."},
    {"key_name": "background_blur", "key_type": "boolean", "required": false, "default_value": false, "semantic_description": "Whether background blur is enabled. Example: 'blur the background' -> true."},
    {"key_name": "filter_style", "key_type": "string", "required": false, "default_value": "none", "semantic_description": "Filter style. Acceptable values: none, vintage, cinematic, warm, cool. Example: 'vintage style' -> 'vintage'."},
    {"key_name": "eye_enlargement", "key_type": "boolean", "required": false, "default_value": false, "semantic_description": "Whether eye enlargement is enabled. Note: this key is a boolean switch; use other for fine-grained adjustment requests."},
    {"key_name": "other", "key_type": "string", "required": false, "semantic_description": "Photo-editing requirements that cannot be mapped to existing keys. Example: 'make the eyes a little bigger' -> 'increase eye size proportionally'."}
  ]
}

Figure 10: Photo Editing Schema Example

Authors' Addresses

Fangtong Zhou
Huawei Technologies
Huawei Bld., No.156 Beiqing Rd.
Beijing
100095
China
Shuping Peng
Huawei Technologies
Huawei Bld., No.156 Beiqing Rd.
Beijing
100095
China