Internet-Draft | UNCO | April 2025 |
Li & Li | Expires 26 October 2025 | [Page] |
This draft introduces the Unified Network and Cloud Orchestration Framework (UNCO), a framework designed to enable real-time joint orchestration of network and computing resources in 5G and future-generation networks. The UNCO framework addresses inefficiencies in current resource scheduling mechanisms, resolves objective conflicts across domains, and provides unified policy and security management. It is applicable in emerging scenarios such as ultra-reliable low-latency communications (URLLC), multi-access edge computing (MEC), and network slicing, where service quality and operational efficiency are paramount.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 26 October 2025.¶
Copyright (c) 2025 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
As 5G and next-generation networks evolve to support latency-sensitive, compute-intensive, and highly dynamic applications, traditional siloed orchestration mechanisms are no longer sufficient. The integration of network and computing resources is essential to enable real-time, adaptive service provisioning. Current industry efforts such as ETSI NFV [NFV033], 3GPP MEC, and IETF service chaining [RFC8969] have made progress in specific domains, but a holistic orchestration framework that bridges network and computing domains with unified security and policy governance remains lacking.¶
In addition, Telecom Clouds introduce new operational complexities that differ significantly from public cloud deployments. Unlike public clouds, which rely on third-party network providers, Telecom Clouds operate under a single administrative domain where both network and cloud infrastructure are tightly coupled and managed by the same operator. This integration opens up opportunities for real-time coordination between cloud service scaling events and network policy adjustments. However, most existing network management systems lack visibility into dynamic cloud states, which can lead to inefficient load balancing, suboptimal routing, and SLA violations for critical services like AI/ML pipelines, video streaming, and 5G slice traffic.¶
To address these limitations, the UNCO framework introduces a telemetry-driven mechanism whereby cloud-side resource and service status can be abstracted and delivered to network controllers in near real-time. This mechanism enables the dynamic adjustment of network policies such as UCMP and load balancing, based on ongoing changes in cloud resource availability or service deployment state. Unlike existing IETF efforts (e.g., TEAS [draft-ietf-teas-ietf-network-slice-framework], OPSAWG [draft-ietf-opsawg-service-assurance-architecture], CATS [draft-ietf-cats-framework]), which offer valuable foundations for traffic engineering and service-aware routing, UNCO builds upon and extends them by incorporating real-time cloud-derived metrics directly into the orchestration logic. This approach ensures SLA-compliant, fine-grained orchestration of both network and compute infrastructure in multi-cloud and Telecom Cloud environments.¶
The Unified Network and Cloud Orchestration framework (UNCO) addresses these gaps by enabling:¶
Unified orchestration of computing and network resources.¶
Dynamic, SLA-driven scheduling of heterogeneous resources.¶
Cross-domain policy alignment and enforcement.¶
Real-time observability and security management across domains.¶
UNCO introduces a layered architectural model with well-defined functional modules and interfaces to facilitate standardization and interoperability among diverse vendor ecosystems.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119] [RFC8174].¶
The following terms are used in this draft:¶
UNCO: Unified Network and Cloud Orchestration Framework.¶
NS-OSS: Network Service Orchestration and Scheduling System.¶
MEC: Multi-access Edge Computing, a framework that extends cloud capabilities to the edge of the network.¶
URLLC: Ultra-Reliable Low-Latency Communications, a category of 5G use cases requiring high reliability and very low latency.¶
SLA: Service-Level Agreement, a formalized agreement on expected service performance metrics.¶
UCMP: Unequal-Cost Multi-Path routing, a technique that uses paths with different costs simultaneously.¶
TEAS: Traffic Engineering Architecture and Signaling, an IETF working group focused on traffic engineering mechanisms.¶
CATS: Computing-Aware Traffic Steering, an emerging framework for steering traffic based on computing availability.¶
NFV: Network Functions Virtualization, an architecture for virtualizing network functions previously implemented in hardware. [NFV033]¶
YANG: Yet Another Next Generation, a data modeling language used to model configuration and state data manipulated by the NETCONF protocol. [RFC8969]¶
RBAC: Role-Based Access Control, a policy-neutral access control mechanism defined around roles and privileges.¶
IAM: Identity and Access Management, the security discipline that enables the right individuals to access the right resources at the right times.¶
SDN: Software Defined Networking, an approach to networking that uses software-based controllers to direct traffic on the network.¶
API: Application Programming Interface, a set of definitions and protocols for building and integrating application software.¶
QoS: Quality of Service, the description or measurement of the overall performance of a service.¶
AR/VR/XR: Augmented Reality / Virtual Reality / Extended Reality, technologies for immersive digital experiences.¶
4.1 Real-Time and Dynamic Resource Scheduling¶
Modern applications, such as immersive reality, smart manufacturing, and vehicular communication systems, demand rapid provisioning and adjustment of both compute and network resources. Traditional orchestrators often pre-allocate resources statically or based on historical models, which are ill-suited to handle:¶
Sudden surges in user demands (e.g., traffic spikes in live streaming).¶
Elastic scaling requirements (e.g., AI inference workload offloading).¶
Edge-cloud resource handoff and failover scenarios.¶
These limitations lead to under-utilization of expensive infrastructure and inconsistent quality of experience (QoE).¶
4.2 Contradictions Among Different Objectives¶
Multiple stakeholders often have conflicting optimization goals. For instance:¶
Maximizing compute utilization may increase network path redundancy.¶
Reducing latency by routing over low-latency paths may overload specific compute clusters.¶
Minimizing operational costs may sacrifice redundancy and resilience.¶
A successful orchestration strategy must balance these trade-offs dynamically, based on service priorities and system state.¶
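One way to picture this kind of dynamic balancing is a weighted score over normalized objectives, with the weights selected by service priority. The following is a minimal sketch under stated assumptions: the objective names, the two weight profiles, and the functions `score_placement` and `choose` are hypothetical and introduced only for illustration; this draft does not define a scoring method.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Candidate:
    """Hypothetical placement option; every objective is normalized to [0, 1]."""
    name: str
    latency: float      # lower is better
    cost: float         # lower is better
    utilization: float  # higher is better
    resilience: float   # higher is better


# Assumed weight profiles keyed by service priority class (illustrative values).
WEIGHTS = {
    "urllc": {"latency": 0.6, "cost": 0.05, "utilization": 0.05, "resilience": 0.3},
    "best_effort": {"latency": 0.1, "cost": 0.5, "utilization": 0.3, "resilience": 0.1},
}


def score_placement(c: Candidate, priority: str) -> float:
    """Weighted sum; 'lower is better' objectives are inverted before weighting."""
    w = WEIGHTS[priority]
    return (w["latency"] * (1.0 - c.latency)
            + w["cost"] * (1.0 - c.cost)
            + w["utilization"] * c.utilization
            + w["resilience"] * c.resilience)


def choose(candidates: Iterable[Candidate], priority: str) -> Candidate:
    """Return the candidate with the highest score for the given priority."""
    return max(candidates, key=lambda c: score_placement(c, priority))


edge = Candidate("edge", latency=0.1, cost=0.8, utilization=0.4, resilience=0.7)
core = Candidate("core", latency=0.7, cost=0.2, utilization=0.9, resilience=0.5)
print(choose([edge, core], "urllc").name)
print(choose([edge, core], "best_effort").name)
```

With these assumed weights, the same two candidates rank differently for a latency-critical service and a cost-sensitive one, which is the trade-off the examples above describe.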
4.3 Lack of Joint Effectiveness Evaluation¶
Scheduling strategies are often evaluated independently in the context of either network performance (e.g., throughput, delay) or computing performance (e.g., CPU usage, task completion time). However, next-generation services require holistic metrics that combine:¶
End-to-end latency from user device to compute execution node.¶
Task success rate under constrained bandwidth and CPU cycles.¶
Adaptive resource reallocation under failure or congestion.¶
Such unified metrics are crucial for validating orchestration policies.¶
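As an illustration of a joint metric, the sketch below sums network and compute delay per task and reports the fraction of tasks that complete within a deadline. The record fields, the decomposition of latency into three terms, and the function names are assumptions made for this example; the draft does not prescribe a metric definition.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class TaskSample:
    """Hypothetical per-task measurement spanning both domains."""
    network_ms: float  # user device to compute node, network contribution
    queue_ms: float    # waiting time at the compute node
    exec_ms: float     # execution time on the compute node
    completed: bool    # False if the task failed or was dropped


def end_to_end_latency_ms(s: TaskSample) -> float:
    """Joint latency: network delay plus compute-side queueing and execution."""
    return s.network_ms + s.queue_ms + s.exec_ms


def task_success_rate(samples: Sequence[TaskSample], deadline_ms: float) -> float:
    """Fraction of tasks that completed within the end-to-end deadline."""
    if not samples:
        return 0.0
    ok = sum(1 for s in samples
             if s.completed and end_to_end_latency_ms(s) <= deadline_ms)
    return ok / len(samples)


samples = [
    TaskSample(10.0, 2.0, 5.0, True),
    TaskSample(30.0, 10.0, 15.0, True),
    TaskSample(8.0, 1.0, 4.0, False),
    TaskSample(12.0, 3.0, 6.0, True),
]
print(task_success_rate(samples, deadline_ms=25.0))
```

A task that finishes quickly on the compute node but arrives over a congested path still counts as a failure here, which a compute-only or network-only metric would miss.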
4.4 Security and Strategy Fragmentation¶
Network policy (e.g., firewalls, ACLs, segmentation) and cloud security policy (e.g., IAM, security groups) are traditionally managed in isolation. This results in:¶
Inconsistent access controls between compute and data planes.¶
Increased cross-domain attack surface.¶
Complexity in policy auditing, validation, and enforcement.¶
UNCO proposes a security-unified model to enforce coherent policies across cloud and network domains.¶
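The kind of inconsistency described above can be illustrated by comparing the flows permitted in each domain. This is a simplified sketch: real ACLs and security groups use prefixes, port ranges, and rule ordering, whereas the `Flow` tuple and `find_inconsistencies` function below are hypothetical and match on exact values only.

```python
from typing import Dict, List, NamedTuple, Set


class Flow(NamedTuple):
    """Hypothetical normalized flow: source group, destination service, port."""
    src: str
    dst: str
    port: int


def find_inconsistencies(network_allowed: Set[Flow],
                         cloud_allowed: Set[Flow]) -> Dict[str, List[Flow]]:
    """Report flows permitted in one domain but not in the other."""
    return {
        # Reaches the compute node but is dropped by the cloud policy.
        "network_only": sorted(network_allowed - cloud_allowed),
        # Accepted by the cloud policy but blocked in the network.
        "cloud_only": sorted(cloud_allowed - network_allowed),
    }


network_acl = {
    Flow("slice-a", "inference-svc", 443),
    Flow("slice-a", "inference-svc", 22),
}
security_group = {
    Flow("slice-a", "inference-svc", 443),
    Flow("slice-b", "inference-svc", 443),
}
report = find_inconsistencies(network_acl, security_group)
print(report)
```

Either list being non-empty indicates that the two domains disagree about a flow, which is a candidate for audit before enforcement.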
This section provides an overview of the UNCO framework and an introduction to its key components. The high-level framework overview of UNCO is shown in Figure 1.¶
UNCO is composed of three primary modules:¶
NS-OSS (Network Service Orchestration and Scheduling System): The central decision-making and coordination entity responsible for managing service deployment, orchestrating cross-domain resources, and enforcing global policies.¶
Cloud Manager: A cloud-native resource controller that abstracts heterogeneous computing resources (VMs, containers, GPUs, NPUs, etc.) across edge and central cloud domains. It acts as the compute-plane orchestrator, reporting availability and enforcing workload deployment.¶
Network Controller: A domain-specific SDN or legacy-compatible controller that governs routing, QoS, and telemetry. It programs the data plane and acts as a programmable policy agent for traffic forwarding, service chaining, and SLA-aware path selection.¶
These components are deployed in a logically centralized but physically distributed manner to support scalability and fault tolerance. They interact via well-defined interfaces and protocols to deliver seamless joint orchestration.¶
UNCO is designed to operate across hybrid infrastructures:¶
Public Cloud: Multi-cloud environments (e.g., AWS, Azure, Alibaba Cloud).¶
Private Cloud/Enterprise DC: Bare-metal and virtualized compute clusters.¶
Edge Computing: Regional micro-DCs or device-near nodes.¶
Transport and Access Networks: L2/L3 infrastructure supporting MPLS, SRv6, or P4-based forwarding.¶
+----------------+
|  Application   |
+----------------+
   |          |
 IN1.1      IN1.2
   |          |
+----------------+ --IN2.1-- +----------------+
|     NS-OSS     | --IN2.2-- | Cloud Manager  |
+----------------+           +----------------+
   |          |                      |
 IN3.1      IN3.2                    |
   |          |                      |
+-------------------+                |
|Network Controller |                |
+-------------------+                |
          |                          |
+------------------------+       +---------------------+
|      Public Cloud      |-------| Cloud(VM/containers,|
|         (WAN)          |       |   GPUs/NPUs,etc.)   |
+------------------------+       +---------------------+

Figure 1: The overall framework of UNCO¶
Each module can scale independently, supporting multi-tenancy, high availability, and flexible deployment topologies. NS-OSS typically includes a policy engine, resource graph model, service catalog, and intent resolution logic. It may integrate with external OSS/BSS systems for commercial service integration.¶
The NS-OSS (Network Service Orchestration and Scheduling System) serves as the brain of the UNCO framework. It is designed to perform centralized decision-making while maintaining awareness of service requirements, real-time resource availability, and policy enforcement across domains. NS-OSS is capable of translating high-level application intents into concrete actions such as workload placement, bandwidth allocation, and route optimization.¶
In doing so, it aims for efficient resource usage while maintaining SLA commitments. The NS-OSS also maintains a global topology and performance view of both computing and networking infrastructure, enabling end-to-end orchestration decisions. It closes the control loop by adapting orchestration actions to monitored outcomes: through coordination with both the Cloud Manager and the Network Controller, the NS-OSS can adjust deployments in response to failures, demand surges, or SLA violations.¶
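A single iteration of such a feedback loop might look like the sketch below. The observation fields, the action names, and the rule of acting on whichever domain contributes more delay are assumptions chosen for illustration, not behavior specified by this draft.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """Hypothetical monitoring snapshot for one service."""
    network_latency_ms: float
    compute_latency_ms: float
    node_healthy: bool = True


def decide_action(obs: Observation, sla_latency_ms: float) -> str:
    """Return an assumed corrective action for one monitoring cycle."""
    if not obs.node_healthy:
        return "migrate_workload"      # would be executed via the Cloud Manager
    total = obs.network_latency_ms + obs.compute_latency_ms
    if total <= sla_latency_ms:
        return "no_change"             # SLA met, keep the current deployment
    # SLA violated: act on the domain that contributes most of the delay.
    if obs.compute_latency_ms >= obs.network_latency_ms:
        return "scale_out_compute"     # would be executed via the Cloud Manager
    return "reroute_traffic"           # would be executed via the Network Controller


print(decide_action(Observation(5.0, 30.0), sla_latency_ms=20.0))
print(decide_action(Observation(30.0, 5.0), sla_latency_ms=20.0))
```

A production loop would also need damping (for example, hysteresis or cool-down timers) so that transient spikes do not trigger oscillating actions; that is omitted here for brevity.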
The NS-OSS is a logically centralized orchestrator with the following extended capabilities:¶
Service Parsing & Decomposition: Translates high-level service intents into fine-grained resource requirements.¶
Topology Awareness: Maintains a live graph of compute, storage, and network nodes with performance telemetry.¶
Feedback Loops & SLA Assurance: Continuously collects performance metrics to adapt placements and routing in real-time.¶
Security Federation: Validates policy consistency across cloud-native RBAC and network access lists.¶
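To make the service parsing and decomposition capability concrete, the sketch below turns a high-level intent into separate compute and network requirements. The intent fields, the sizing rule (a fixed number of users per instance), and the output layout are all hypothetical and serve only as an example.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ServiceIntent:
    """Hypothetical high-level intent submitted by an application."""
    service: str
    users: int
    max_latency_ms: float
    per_user_mbps: float
    users_per_instance: int  # assumed capacity of one service instance


def decompose(intent: ServiceIntent) -> Dict[str, dict]:
    """Split one intent into compute-side and network-side requirements."""
    # Ceiling division: enough instances to serve every user.
    instances = -(-intent.users // intent.users_per_instance)
    return {
        "compute": {"service": intent.service, "instances": instances},
        "network": {
            "bandwidth_mbps": intent.users * intent.per_user_mbps,
            "max_latency_ms": intent.max_latency_ms,
        },
    }


intent = ServiceIntent("xr-stream", users=250, max_latency_ms=20.0,
                       per_user_mbps=4.0, users_per_instance=100)
requirements = decompose(intent)
print(requirements)
```

The compute part would be handed to the Cloud Manager and the network part to the Network Controller, so that each domain receives only the requirements it can act on.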
The Cloud Manager is responsible for abstracting and managing the compute, storage, and specialized acceleration resources (e.g., GPU, TPU, NPU) across different cloud domains, including edge, regional, and centralized clouds. It serves as the execution agent for deployment decisions initiated by the NS-OSS and provides real-time feedback on resource utilization and availability.¶
Beyond resource abstraction, the Cloud Manager enables policy-compliant service instantiation, performance monitoring, and failure detection at the compute layer. It supports diverse virtualization and containerization technologies, offering a unified interface for NS-OSS to interact with heterogeneous platforms such as Kubernetes, OpenStack, or bare-metal clusters. In the UNCO framework, the Cloud Manager plays a critical role in edge computing scenarios, ensuring proximity-based service placement and maintaining low-latency, high-reliability requirements. Furthermore, it facilitates fine-grained scaling decisions that match service-level intents, contributing directly to elastic, resilient orchestration.¶
The Cloud Manager serves as the bridge between orchestration logic and actual compute substrates. Its extended functionalities include:¶
Compute Resource Abstraction: Normalizes capabilities of VMs, containers, and accelerators into a unified schema.¶
Edge-Aware Scheduling: Places latency-sensitive workloads near data sources using location tags and latency maps.¶
Dynamic Scaling: Triggers horizontal or vertical scaling of services based on telemetry and policy.¶
Secure Configuration Enforcement: Applies predefined templates and isolation profiles to newly instantiated services.¶
Failure Recovery & Migration: Performs live migration or re-instantiation in case of compute node degradation or failure.¶
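The edge-aware scheduling function in the list above can be sketched as a filter-then-minimize step over a latency map. The site names, the CPU-only capacity model, and the function `place_workload` are assumptions for illustration; an actual Cloud Manager would consider many more constraints.

```python
from typing import Dict, Optional


def place_workload(latency_map_ms: Dict[str, float],
                   free_cpu: Dict[str, int],
                   cpu_needed: int,
                   max_latency_ms: float) -> Optional[str]:
    """Pick the lowest-latency site that has capacity and meets the latency bound."""
    feasible = [
        site for site, latency in latency_map_ms.items()
        if latency <= max_latency_ms and free_cpu.get(site, 0) >= cpu_needed
    ]
    if not feasible:
        return None  # no site satisfies both constraints
    return min(feasible, key=lambda site: latency_map_ms[site])


# Hypothetical latency from the data source to each candidate site.
latency_map = {"edge-1": 4.0, "edge-2": 6.0, "regional": 15.0, "central": 40.0}
# Hypothetical free CPU cores per site.
free = {"edge-1": 2, "edge-2": 8, "regional": 32, "central": 128}

print(place_workload(latency_map, free, cpu_needed=4, max_latency_ms=10.0))
```

Returning `None` rather than the nearest overloaded site leaves the decision (relax the latency bound, scale, or reject) to the orchestrator.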
The Network Controller in UNCO serves as a programmable interface between orchestration logic and the physical or virtual network infrastructure. It is responsible for interpreting policies and traffic engineering directives from NS-OSS and translating them into actionable configurations on network devices or SDN agents.¶
As the network-facing component, the controller collects real-time metrics from the underlying transport and access networks, including traffic utilization, link health, congestion indicators, and routing anomalies. These insights feed back into NS-OSS to enable adaptive reconfiguration in response to network dynamics. The controller also supports integration with emerging technologies such as P4 programmable data planes and segment routing protocols, allowing fine-grained per-flow steering based on SLA metadata or service tags.¶
By interfacing with the Cloud Manager, the Network Controller becomes cloud-aware, enabling traffic paths to be optimized based on the location, health, and demand patterns of compute resources. This makes the UNCO framework especially suitable for distributed AI, AR/VR, and latency-sensitive applications. Additionally, it supports inter-domain coordination for multi-cloud and multi-vendor environments, ensuring robust, scalable service delivery across complex topologies.¶
The Network Controller performs programmable data-plane management and service-aware traffic engineering:¶
Telemetry-Driven Path Optimization: Continuously monitors link quality (bandwidth, jitter, RTT, congestion) and adjusts path selection accordingly.¶
Dynamic QoS Enforcement: Applies differentiated service policies (e.g., priority queues, rate limits, ECN) based on slice and service IDs.¶
Programmable Fabric Support: Interfaces with SDN controllers, P4 switches, or segment routing agents for granular traffic steering.¶
Inter-Domain Routing Federation: Coordinates with external network controllers (e.g., IP/MPLS, BGP peers) for path stitching across domains.¶
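As an example of cloud-aware traffic engineering, the sketch below derives UCMP weights from compute capacity reported for the site behind each next hop. The proportional rule, the health flag, and the names `ucmp_weights`, `pe1`, `pe2`, and `pe3` are assumptions for illustration; the draft does not define how weights are computed.

```python
from functools import reduce
from math import gcd
from typing import Dict


def ucmp_weights(free_capacity: Dict[str, int],
                 healthy: Dict[str, bool]) -> Dict[str, int]:
    """Map each next hop to an integer weight proportional to usable capacity."""
    # Unhealthy or exhausted sites receive no traffic.
    usable = {
        hop: cap if healthy.get(hop, False) and cap > 0 else 0
        for hop, cap in free_capacity.items()
    }
    # Reduce weights to the smallest equivalent integer ratio.
    divisor = reduce(gcd, usable.values(), 0)
    if divisor == 0:
        return {hop: 0 for hop in usable}  # nothing usable behind any hop
    return {hop: cap // divisor for hop, cap in usable.items()}


# Hypothetical free capacity units behind three provider-edge next hops.
capacity = {"pe1": 40, "pe2": 20, "pe3": 60}
health = {"pe1": True, "pe2": True, "pe3": False}
print(ucmp_weights(capacity, health))
```

When a cloud-side scaling event or failure changes the reported capacity, recomputing the weights shifts traffic toward sites that can absorb it without changing the set of paths.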
The UNCO framework defines standard interfaces between its components to support unified orchestration and closed-loop control across cloud and network domains. The interfaces are categorized as follows:¶
1) IN1: Application - NS-OSS Interface¶
This interface enables applications to interact with the orchestration system for service deployment and resource feedback.¶
IN1.1 Service Deployment Request (Application → NS-OSS)¶
IN1.2 Resource Allocation Result (NS-OSS → Application)¶
2) IN2: Cloud Manager - NS-OSS Interface¶
This interface enables the Cloud Manager to provide real-time cloud resource status to NS-OSS.¶
IN2.1 Resource Metrics Report (Cloud Manager → NS-OSS)¶
IN2.2 Service Status Report (Cloud Manager → NS-OSS)¶
Parameters:¶
Computing power requirements: computing power type (CPU/GPU/FPGA), resource quantity (number of CPU cores, memory, GPU model and quantity), and scenario (training/inference/storage/high-performance computing).¶
Network status: topology, bandwidth, latency, and other information.¶
Deployment configuration: available data center, image identification (operating system/preset image ID), network configuration (VPC ID/subnet ID/security group rule summary).¶
Resource pre-occupation: resource pool type (public cloud/private cloud/hybrid cloud), pre-occupation mode (on-demand/reserved instance), storage configuration (type/capacity/IOPS).¶
Purpose: Supports service lifecycle management, monitoring, and fault recovery.¶
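The parameters listed for this interface could be carried in a structured report such as the one sketched below. The class names, field names, units, and the choice of JSON are assumptions for illustration only; the draft does not define an encoding or a data model for IN2.1, and only a subset of the listed parameters is shown.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ComputeRequirement:
    power_type: str                  # "CPU", "GPU", or "FPGA"
    cpu_cores: int
    memory_gib: int
    scenario: str                    # e.g. "training", "inference", "storage"
    gpu_model: Optional[str] = None
    gpu_count: int = 0


@dataclass
class NetworkStatus:
    bandwidth_mbps: float
    latency_ms: float


@dataclass
class ResourceMetricsReport:
    """Hypothetical IN2.1 payload (Cloud Manager -> NS-OSS)."""
    data_center: str
    compute: ComputeRequirement
    network: NetworkStatus
    pool_type: str = "private"       # "public", "private", or "hybrid"
    reservation_mode: str = "on-demand"

    def to_json(self) -> str:
        # asdict() recurses into nested dataclasses.
        return json.dumps(asdict(self), sort_keys=True)


report = ResourceMetricsReport(
    data_center="edge-dc-1",
    compute=ComputeRequirement("GPU", cpu_cores=16, memory_gib=64,
                               scenario="inference", gpu_model="example-gpu",
                               gpu_count=2),
    network=NetworkStatus(bandwidth_mbps=1000.0, latency_ms=3.5),
)
print(report.to_json())
```

In a standardized form, a model of this kind would more likely be expressed in YANG than in ad hoc JSON; the Python classes here only show which fields group together.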
3) IN3: NS-OSS - Network Controller Interface¶
This interface allows the NS-OSS to dynamically program the network according to real-time cloud and service conditions.¶
IN3.1 Issuing of Network Control Policy (NS-OSS → Network Controller)¶
To ensure UNCO can support a wide range of networked applications across edge, cloud, and transport environments, it defines a set of functional requirements that guide its architectural design and interface behaviors. These requirements emphasize responsiveness, reliability, and compatibility across multi-vendor, multi-domain infrastructures. The following functions are essential to enable joint orchestration of computing and networking resources while preserving service quality, optimizing resource utilization, and maintaining policy consistency.¶
The functional requirements are as follows:¶
FR1: SLA-compliant orchestration for computing, network, and storage resources.¶
FR2: Elastic, demand-driven scheduling based on real-time data and service intent.¶
FR3: Inter-domain policy normalization and conflict mitigation across compute and network planes.¶
FR4: Observability and feedback mechanisms for orchestration decisions.¶
FR5: Unified access control, audit trails, and policy enforcement across domains.¶
Cloud computing has become a foundational component in the infrastructure of modern telecom operators. With the increasing deployment of cloud-based AI services and edge-native applications, it is essential to support integrated orchestration of cloud and network resources as well as end-to-end security management. UNCO addresses this need by providing mechanisms to incorporate cloud-related information into network control and policy decision-making, enabling dynamic, SLA-driven service management.¶
However, the lack of standardized interfaces and models for exchanging cloud telemetry across the network domain remains a key obstacle. Cross-domain collaboration is often hindered by proprietary APIs, inconsistent abstractions, and limited interoperability. These limitations result in delayed network adjustments and fragmented service delivery.¶
UNCO addresses these challenges by proposing a unified framework and standardized interfaces that bring real-time cloud awareness into network orchestration. Its ability to coordinate compute and network resources holistically enables more resilient, efficient, and SLA-compliant service delivery across public clouds, private datacenters, and edge platforms.¶
As UNCO continues to evolve, its ability to bridge these gaps through telemetry integration, policy abstraction, and multi-domain orchestration will be critical. Potential application scenarios include:¶
Elastic AI/ML service hosting at the edge and core, requiring workload-aware bandwidth and path adjustments.¶
Immersive applications (AR/VR/XR, cloud gaming, real-time collaboration) that rely on strict latency and jitter guarantees.¶
Dynamic multi-cloud interconnection for enterprise-grade network slicing and hybrid connectivity.¶
These emerging services demand orchestration frameworks like UNCO that go beyond siloed resource management and offer unified, programmable, and standards-aligned operational control.¶
UNCO presents a comprehensive framework for integrating computing and networking orchestration in modern networks. By addressing dynamic scheduling, multi-objective trade-offs, cross-domain policy harmonization, and end-to-end security, UNCO provides a strong foundation for enabling future-ready services.¶
TBD¶