An Architectural Framework for Monitoring Packet Loss Caused by Network Congestion

With the large-scale deployment of 5G networks, emerging services such as enhanced Mobile Broadband (eMBB) and Ultra-Reliable Low Latency Communication (uRLLC) impose stringent requirements on IP bearer network performance: significantly reduced latency, minimal jitter, and near-zero packet loss [_GPP_TS_22.261]. At the same time, the development of Big Data and Artificial Intelligence (AI) calls for intelligent computing network infrastructure whose goal is a lossless network characterized by "high throughput, low latency, and zero packet loss" [Adithya_Gangidi24][Kun_Qian24]. However, the statistical multiplexing inherent in IP networks produces bursty traffic patterns, which makes network congestion inevitable. Congestion degrades network performance and introduces uncertainty in service delivery; for example, loss leads to packet retransmission, and increased delay reduces throughput. Numerous studies have long concentrated on congestion control mechanisms and related algorithms [RFC9293][RFC9743] to improve network performance.

Network congestion can be roughly divided into two classes: long-lived congestion and short-lived congestion. Long-lived congestion is generally caused by persistent traffic growth and lasts from hours to days, so it is easy to observe through a Network Management System/Element Management System (NMS/EMS). Short-lived congestion, by contrast, is almost always caused by traffic bursts, among which microbursts are one of the major contributors. A microburst is a phenomenon in which a device port receives a considerable amount of burst data in a very short time (milliseconds or even microseconds), resulting in an instantaneous rate much higher than the average rate and possibly exceeding the port bandwidth [Microburst][Shuhei_Yoshida21]. A microburst readily causes packet loss but is difficult to detect in time.

Many investigations show that microbursts are the main culprit affecting latency-sensitive and loss-sensitive services. When a microburst occurs, queuing time increases rapidly and, in severe cases, packets are dropped, which is intolerable for applications such as Virtual Reality (VR). To reduce the uncertainty in service delivery caused by network congestion, it is essential to monitor congestion-induced packet loss in real time. Doing so lets network operators quickly locate congested nodes and links and optimize paths for the affected traffic flows, and it lets them evaluate the network congestion level to guide network planning, capacity expansion, and optimization. [I-D.he-ippm-congestion-loss-monitoring-problem] discusses the requirements for real-time monitoring of packet loss caused by congestion and presents the problems and challenges that existing monitoring and measurement techniques face in meeting them. This document describes an architectural framework for real-time monitoring of congestion-induced packet loss.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 when, and only when, they appear in all capitals, as shown here.

Abbreviations used in this document:

   AI: Artificial Intelligence
   CLI: Command Line Interface
   CPU: Central Processing Unit
   MPLS: Multi-Protocol Label Switching
   NTP: Network Time Protocol
   PLR: Packet Loss Ratio
   SNMP: Simple Network Management Protocol
   SLA: Service Level Agreement
   SLO: Service Level Objective
   SRv6: Segment Routing over IPv6
   VPN: Virtual Private Network

To monitor congestion-induced packet loss effectively, this document proposes a comprehensive packet loss monitoring architectural framework [Xioaming_He25]. The framework consists mainly of the network devices and a collection and analysis system. All network devices need to report loss events caused by congestion, cache the packets discarded due to queue overflow, and upload them to the collection and analysis system in real time. A telemetry interface with a subscription mechanism (e.g., YANG Push [RFC8641] or gRPC [gRPC]) is used to push loss data immediately when a loss event occurs, which avoids the inefficiency of traditional SNMP polling. The collection and analysis system is required to count the total number of discarded packets reported, parse the service types of the discarded packets, count the number of discarded packets for every traffic flow contained in the loss events, and calculate the packet loss ratio (PLR) of specified user flows. Furthermore, the real-time visibility of packet loss gained from the collection and analysis system can be fed into the NMS/EMS so that network operators can quickly pinpoint the congested nodes and the affected traffic flows. With the same visibility, the network controller can perform timely path optimization for affected traffic flows that are sensitive to latency and loss, improving user Quality of Experience (QoE). Figure 1 illustrates the proposed framework for monitoring packet loss caused by congestion.

+-------------+        +--------------------------+        +-------------+
|   Network   |<-------| Collection and Analysis  |------->|   NMS/EMS   |
| controller  |        |          system          |        |             |
+-------------+        +-------------^------------+        +-------------+
      |                              |
      | Timely path optimization     | Rapid troubleshooting
      | based on real-time           | based on real-time
      | loss visibility              | loss visibility
      |                              | Packet loss data reporting
+-----v--^-------------^-------------^-------------^-------------^-------+
|        |             |             |             |             |       |
|        |             |             |             |             |       |
|    +---+--+      +---+--+      +---+--+      +---+--+      +---+--+    |
|    | Node |------| Node |------| Node |------| Node |------| Node |    |
|    +------+      +------+      +------+      +------+      +------+    |
|                               IP Network                               |
+------------------------------------------------------------------------+

Figure 1: Framework for monitoring packet loss caused by congestion

In IP networks, network devices such as routers and switches mainly implement packet forwarding. Traditional network devices only record the number of packets discarded per port or queue due to overflow, and they send no prompt notification when packet loss occurs. The operator has to log in to the device (e.g., through the CLI) to search for loss events. Network devices therefore need the ability to detect congestion and packet loss in real time. Because querying counters with the CPU on the main control engine consumes considerable processing resources, the network device must leverage built-in dedicated hardware to detect packet loss in real time. In addition, existing forwarding devices do not cache packets that overflow a queue but simply drop them, so it is not clear which packets were dropped or which traffic flows contributed to the congestion or microburst. To capture the traffic flows related to the packet loss, a cache for discarded packets is needed. The proposed in-device packet loss detection architecture is shown in Figure 2.

+------------------------------------------------------------------+
|                          Network device                          |
|                                                                  |
| +---------------------------+    +--------------------------+    |
| |   Real-time packet loss   |--->| packet loss information  |    |
| |     detection module      |    |    reporting module      |    |
| +-------------|-------------+    +--------------------------+    |
|               |                                                  |
|               v                                                  |
| +---------------------------+        +-----+                     |
| |    packet loss counter    |<-------|queue| port1               |
| +---------------------------+        +-----+                     |
|                                      +-----+                     |
| +---------------------------+        |queue| port2               |
| |Cache module for discarded |<-------+-----+                     |
| |          Packets          |        +-----+                     |
| +-------------|-------------+        |queue| port3               |
|               |                      +-----+                     |
|               v                         :                        |
| +---------------------------+        +-----+                     |
| |  packet loss file Upload  |        |queue| portN               |
| |          module           |        +-----+                     |
| +---------------------------+        +-----+                     |
|                                      |queue| portM               |
|                                      +-----+                     |
+------------------------------------------------------------------+

Figure 2: In-device packet loss detection architecture

The in-device packet loss detection architecture adds four new functional modules, which are described as follows.

Real-time packet loss detection module: Leverages the built-in dedicated hardware to query the queue overflow packet loss counter of every port at millisecond intervals, and records the location and time of each loss occurrence.

Packet loss information reporting module: Sends loss information according to the subscription request, including the number of discarded packets, the timestamp of the loss occurrence, and the location of the packet loss (device ID, port ID, and queue ID).

Cache module for discarded packets: Caches packets dropped due to queue overflow and, optionally, records the number of discarded packets, the time of the loss occurrence, and the location of the packet loss (device ID, port ID, and queue ID). Only one shared cache is needed for all ports and queues. To save buffer space, cached packets should be cleared immediately after uploading.

Packet loss file upload module: Packages the cached discarded packets as a file or compressed file and uploads it to the collection and analysis system according to the specified rule.

To analyze packet drops caused by queue overflow, a cache mechanism is essential for capturing the discarded packets. However, because packet parsing and statistical analysis consume significant local resources (such as memory and computing power), these tasks are better handled by a remote central processing entity. Since packet headers typically contain all the necessary service type and flow attribute information, truncating discarded packets to a fixed length (e.g., the first 64 bytes) provides sufficient data for analysis while dramatically reducing the cache size needed. While a packet loss file is being uploaded and the discarded packets are being cleared, a new loss event may occur and find no buffer available for the newly dropped packets. To avoid this situation, the cache should be divided in an appropriate proportion into two separate spaces: a primary space and a spare space. The primary space caches the discarded packets for each upload, and the spare space caches the packets discarded while the current packaging and uploading operation is in progress.
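The two-space cache described above can be sketched as follows. This is a minimal illustrative model in Python, not device code: the class name `DropCache`, the capacities counted in packets, and the choice to move spare-space packets into the primary space once an upload completes are all assumptions made for the example. The 64-byte truncation length is the example value from the text.

```python
from dataclasses import dataclass, field

TRUNCATE_BYTES = 64  # example truncation length taken from the text


@dataclass
class DropCache:
    """Hypothetical two-space cache for discarded packets."""
    primary_capacity: int  # assumed unit: number of packets
    spare_capacity: int    # assumed unit: number of packets
    primary: list = field(default_factory=list)
    spare: list = field(default_factory=list)
    uploading: bool = False  # True while the primary space is being uploaded

    def store(self, packet: bytes) -> bool:
        """Cache the truncated packet; return False when the space is full."""
        head = packet[:TRUNCATE_BYTES]
        if self.uploading:
            target, capacity = self.spare, self.spare_capacity
        else:
            target, capacity = self.primary, self.primary_capacity
        if len(target) >= capacity:
            return False
        target.append(head)
        return True

    def begin_upload(self) -> list:
        """Freeze the primary space and return its contents for packaging."""
        self.uploading = True
        return list(self.primary)

    def finish_upload(self) -> None:
        """Clear uploaded packets; spare packets wait for the next upload.

        Assumption: the spare space is no larger than the primary space.
        """
        self.primary = self.spare
        self.spare = []
        self.uploading = False
```

Packets that arrive between `begin_upload()` and `finish_upload()` land in the spare space, so a loss event during an upload still has a buffer available.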

To support real-time uploading of the packet loss file, a file transfer protocol such as the Trivial File Transfer Protocol (TFTP) [RFC1350] should be used to transfer the file as soon as it is available. To minimize cache capacity, a smart uploading scheme for the packet loss file is proposed, as follows.

S1: If there is no discarded packet in either cache space, no packet loss file is uploaded, which saves processing resources.

S2: If discarded packets exist in either space (primary or spare) and neither space has reached the utilization threshold (e.g., 90%), the packet loss file is uploaded at the preset fixed cycle (e.g., 10 s), which needs to meet the real-time requirements for packet parsing and statistics.

S3: Otherwise, when either space reaches the utilization threshold because of a large number of dropped packets, the packet loss file is uploaded immediately without waiting for the next uploading cycle.
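The three rules S1 to S3 can be expressed as a single decision function. The sketch below is illustrative only: the function name, its parameters, and the returned labels are hypothetical, and the default threshold (90%) and cycle (10 s) are the example values given above.

```python
def upload_decision(primary_used, primary_size, spare_used, spare_size,
                    seconds_since_last_upload,
                    threshold=0.9, cycle_seconds=10.0):
    """Return 'none', 'immediate', 'cycle', or 'wait' (hypothetical labels)."""
    # S1: nothing cached in either space, so no file is uploaded.
    if primary_used == 0 and spare_used == 0:
        return "none"
    # S3: either space has reached the utilization threshold.
    if (primary_used / primary_size >= threshold
            or spare_used / spare_size >= threshold):
        return "immediate"
    # S2: packets are cached but below the threshold; follow the fixed cycle.
    if seconds_since_last_upload >= cycle_seconds:
        return "cycle"
    return "wait"
```

Checking S3 before S2 makes the threshold rule take precedence over the fixed cycle, which matches the "Otherwise ... uploaded immediately" wording of S3.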

The local device is also required to collect real-time data on loss caused by congestion. To capture loss events in real time, the network device needs to leverage built-in dedicated hardware, such as an Application Specific Integrated Circuit (ASIC), to read the packet loss counter of each port or queue at millisecond intervals and to send telemetry data about the loss according to the subscription request. To improve real-time awareness of packet loss in scenarios such as traffic optimization and congestion discovery, on-change updates are preferable to periodic updates; that is, a telemetry update is sent immediately when the packet loss counter value changes. When on-change updates are supported, a dampening period should be configurable to limit the amount of data sent. In addition, to measure the Packet Loss Ratio (PLR) caused by congestion, the network device is required to collect statistics on the monitored traffic flows and periodically send the corresponding telemetry data to the collection and analysis system. The ingress device, such as an access router or Provider Edge (PE) router, is required to be configured with a received-packet counter for the monitored traffic. The specified traffic flows may be identified as Layer 2 flows (e.g., based on source and/or destination Media Access Control (MAC) address, Virtual Local Area Network Identifier (VLAN ID), or Virtual eXtensible Local Area Network (VXLAN) Network Identifier (VNI)), as Layer 3 flows (e.g., identified by N-tuple and/or the Flow Label field of the IPv6 packet header), or by the Layer 2/3 VPN ID carried in the SR-MPLS label stack or the IPv6 Segment Routing Header (SRH), etc.
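The interaction between on-change updates and the dampening period can be illustrated with a small model. This is a sketch under stated assumptions rather than an implementation of any telemetry protocol: the class `OnChangeReporter`, its method `poll`, and the use of integer millisecond timestamps are hypothetical, and a change seen during the dampening period is simply held until a later poll.

```python
class OnChangeReporter:
    """Hypothetical on-change reporter with a dampening period."""

    def __init__(self, dampening_ms: int, initial_counter: int = 0):
        self.dampening_ms = dampening_ms
        self.last_reported = initial_counter  # last counter value sent
        self.last_sent_at = None              # time of the last update (ms)

    def poll(self, now_ms: int, counter: int):
        """Return the counter value to report now, or None to stay silent."""
        if counter == self.last_reported:
            return None  # no change since the last update
        if (self.last_sent_at is not None
                and now_ms - self.last_sent_at < self.dampening_ms):
            return None  # change is held until the dampening period passes
        self.last_reported = counter
        self.last_sent_at = now_ms
        return counter
```

Because the loss counter is read at millisecond intervals, a held change is picked up by the first poll after the dampening period expires, so the latest counter value is still reported without one update per counter change.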

Global time synchronization is also needed for accurate PLR calculation. For instance, suppose the ingress device periodically reports the statistics (packet counter value) of the received VPN traffic with a timestamp in the telemetry data, and during some report period this VPN traffic encounters packet loss caused by a microburst. The loss information is then reported immediately, carrying the timestamp of the loss occurrence. Figure 3 depicts the timing relationship between the reported traffic telemetry data and the reported loss occurrence.

  |   Report period   |   Report period   |
--|-------------------|---------|---------|--------> Synchronization time
  ^                   ^         ^         ^
  |                   |         |         |
  |                   |         |         |
                      Tp        Tl        Tc

Figure 3: Timing relationship between traffic telemetry reports and loss occurrence

Based on the respective timestamps, e.g., when the timestamp Tl of the loss occurrence falls between the timestamps Tp and Tc carried by two consecutive traffic telemetry reports, the collection and analysis system can correctly calculate the PLR of the specified VPN traffic for that period. The network device is required to support time synchronization techniques such as the Network Time Protocol (NTP) or IEEE 1588, which are widely deployed in operators' networks. Generally, NTP can achieve a precision of 50 ms and IEEE 1588 a precision of microseconds. In the proposed scheme, the required synchronization precision depends on the measurement period. For a normal measurement period of tens of seconds or even minutes, a synchronization precision of 50 ms, which is easy to achieve, is enough to satisfy the measurement requirement.

The proposed framework has to handle packet loss information under strict real-time requirements. An independent collection and analysis system is therefore better suited to monitoring packet loss caused by congestion in real time. The proposed structure of the collection and analysis system is shown in Figure 4.

+--------------------------------------------------------------------------+
|                      Collection and analysis system                      |
|                                                                          |
|  +------------------+                  +-------------------------+       |
|  | PLR measurement  |<-----------------| Packet loss statistics  |       |
|  |      module      |                  |         module          |       |
|  +--------^---------+                  +---^---------------^-----+       |
|           |                                |               |             |
|           |                                |               |             |
| +---------+--------------+     +----------------+    +-----------------+ |
| | Measured traffic flows |     | Packet parsing |    |Packet loss data | |
| |   collection module    |     |     module     |<---|collection module| |
| +------------------------+     +----------------+    +-----------------+ |
+--------------------------------------------------------------------------+

Figure 4: Structure of the collection and analysis system

The collection and analysis system mainly consists of five functional modules, which are described as follows.

Packet loss data collection module: Accepts the packet loss data from network devices, including the reported loss telemetry data and the uploaded loss files, and stores them for a specified time. For each report, it records the number of discarded packets, the timestamp, and the location ID carried in the telemetry data.

Measured traffic flows collection module: Accepts the telemetry data of the measured traffic flows reported by the network ingress devices and stores them for a specified time. For each report, it records the number of received packets and the timestamp carried in the telemetry data.

Packet parsing module: Leverages professional packet parsing tools to parse, in real time, the discarded packets contained in the uploaded packet loss files.

Packet loss statistics module: Based on the packet parsing results, counts the number of discarded packets belonging to each traffic flow. Based on the reported packet loss information, counts the total number of discarded packets for each device, port, and queue, and records the time and location of each loss occurrence.

PLR measurement module: Based on the periodically reported statistics of the measured traffic flows and the number of discarded packets of those flows, calculates the PLR of the measured traffic flows according to the requirements of network operators (e.g., periodic measurement).

The discarded packets should be parsed as soon as possible to meet the real-time requirement of packet loss statistics and measurement. To provide real-time visibility of packet loss statistics and online PLR measurement, the parsing time for each uploaded packet loss file should be as short as possible (e.g., 100 ms). The parsing of discarded packets should cover at least the measured traffic mentioned above, such as Layer 2/3 flows and Layer 2/3 VPN traffic.

The PLR measurement module obtains the number of received packets and the timestamps carried in the telemetry data of the measured traffic flow from the measured traffic flows collection module. It also obtains the number of discarded packets of the measured traffic flow, and the timestamps carried in the loss information or in the packet loss file, from the packet loss statistics module. Based on the timing relationship between the telemetry timestamps and the loss timestamp, together with the received and discarded packet counts, the PLR measurement module can calculate the PLR of the measured traffic flow during a specified measurement period. For example, the collection and analysis system receives the previous telemetry data of the measured traffic flow, carrying the number N1 of received packets and the timestamp T1, and the current telemetry data, carrying the number N2 of received packets and the timestamp T2. It also obtains the number N3 of discarded packets of the measured traffic flow and the timestamp T3 carried in the packet loss file. If the timestamp T3 lies between T1 and T2, the PLR of the measured traffic flow for the current measurement period (T2-T1) is calculated as:

   PLR = N3 / (N2 - N1)                                         (1)
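Equation (1) can be worked through in a few lines of Python. This is a minimal sketch: the function name `plr_for_period` and the representation of loss reports as (timestamp, dropped count) pairs are assumptions for illustration, as is the choice to treat the period as the half-open interval (T1, T2] and to sum several loss reports that fall inside it.

```python
def plr_for_period(n1, t1, n2, t2, loss_events):
    """Compute PLR = N3 / (N2 - N1) for the period between T1 and T2.

    loss_events is an iterable of (timestamp, dropped_count) pairs for the
    measured flow. Returns None when no packets were received in the period.
    """
    received = n2 - n1
    if received <= 0:
        return None  # PLR is undefined without received traffic
    # N3: discarded packets whose loss timestamp lies within (T1, T2].
    dropped = sum(count for ts, count in loss_events if t1 < ts <= t2)
    return dropped / received


# Example: 2000 packets received between T1 = 10 s and T2 = 20 s,
# with 20 packets dropped at T3 = 12.5 s; the later event is outside the period.
example = plr_for_period(1000, 10.0, 3000, 20.0, [(12.5, 20), (25.0, 7)])
```

In this example N2 - N1 = 2000 and N3 = 20, so the PLR for the period is 20/2000, i.e., 1%.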

In summary, to monitor packet loss caused by congestion in real time and obtain accurate packet loss ratio results, the proposed architectural framework needs to meet the following functional requirements.

[REQ-1] The network device is REQUIRED to support detecting packet loss caused by congestion at millisecond-level intervals or finer.

[REQ-2] The network device is REQUIRED to report packet loss events in real time, i.e., immediately upon detection. The reported telemetry data is REQUIRED to carry the timestamp of the packet loss occurrence, the number of discarded packets, and the packet loss location such as device ID, port ID, and queue ID.

[REQ-3] The network device is REQUIRED to support the capability to subscribe to periodic updates, e.g., to collect the statistical data of the monitored traffic flows and send the corresponding telemetry data to the collection and analysis system periodically. The subscription period shall be configurable as part of the subscription request. For periodic subscription, the network device is RECOMMENDED to support redundancy suppression, where a telemetry update is not generated unless the value of the subscribed data objects has changed.

[REQ-4] The network device is REQUIRED to support the capability to subscribe to on-change updates, i.e., updates sent whenever the values of the subscribed data objects change. For example, a telemetry update is sent immediately when the queue overflow packet loss counter value changes. For on-change subscription, the network device is REQUIRED to support a dampening period that needs to pass before subsequent on-change updates are sent. The dampening period should be configurable as part of the subscription request.

[REQ-5] The network device is REQUIRED to cache all packets discarded due to queue overflow. For the purpose of packet loss statistics and analysis, the network device is REQUIRED to record the time of the packet loss occurrence, the number of discarded packets, and the packet loss location such as device ID, port ID, and queue ID. To reduce cache capacity, it is RECOMMENDED to truncate discarded packets to a fixed length (e.g., the first 64 bytes).

[REQ-6] The network device is REQUIRED to upload all discarded packets as a file or compressed file in real time.

[REQ-7] The network device is REQUIRED to support time synchronization for measuring the packet loss ratio caused by congestion, and the time synchronization precision SHOULD be within 50 ms.

[REQ-8] The collection and analysis system is REQUIRED to support parsing the headers of all discarded packets to determine the flow attributes of every discarded packet, and counting the number of discarded packets of each traffic flow in real time.

[REQ-9] The collection and analysis system is REQUIRED to support periodic measurement of the PLR as the total number of discarded packets divided by the total number of sent packets. It is also REQUIRED to support periodic measurement of the PLR as the number of discarded packets divided by the number of sent packets for specified user traffic.

[REQ-10] The collection and analysis system is REQUIRED to support visualization of the analysis of discarded packets in the form of tables and figures that are easily understandable by operators.

In this section we consider three typical application scenarios to demonstrate the advantages of the proposed architectural framework for real-time monitoring of packet loss caused by congestion.

The real-time packet loss detection module uses the built-in dedicated hardware to read the queue overflow packet loss counter of every port at millisecond intervals and records the time and location of each loss occurrence. Once the loss counter value changes, the packet loss telemetry data is reported to the collection and analysis system, which becomes aware of the loss immediately. Based on the collected packet loss statistics, the operator (through an online analytical tool) can correlate the number of discarded packets with the time of loss occurrence and thus determine whether long-lived or short-lived congestion is causing the loss. For instance, if the number of lost packets increases only for a very short time (e.g., a few milliseconds to tens of milliseconds), the loss is very likely caused by a microburst. At the same time, the uploaded loss files can be parsed to find which traffic flows the discarded packets belong to and which flows lead to the microburst, so that action can be taken against the flows responsible. The network operator can therefore quickly pinpoint the congested node, improving the efficiency of fault diagnosis and root cause analysis. In addition, based on the congestion state and the trend of the packet loss statistics, timely actions can be taken, such as redirecting the affected traffic flows to a non-congested port or dynamically adjusting traffic to alleviate congestion.
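The correlation between loss counts and loss times described above can be sketched as a simple grouping of loss-event timestamps into episodes. Everything in this example is an illustrative assumption: the function name, the 100 ms gap that separates two episodes, the 100 ms duration limit below which an episode is treated as microburst-like, and the two labels. An operator's analytical tool would choose its own thresholds.

```python
def classify_episodes(loss_times_ms, gap_ms=100, short_max_ms=100):
    """Group loss timestamps (ms) into episodes and label each episode.

    Events separated by more than gap_ms start a new episode. An episode
    lasting at most short_max_ms is labeled 'microburst-like', otherwise
    'sustained'. Both thresholds are assumed values for this sketch.
    """
    episodes = []
    start = prev = None
    for t in sorted(loss_times_ms):
        if start is None:
            start = prev = t
        elif t - prev > gap_ms:
            episodes.append((start, prev))  # close the current episode
            start = prev = t
        else:
            prev = t
    if start is not None:
        episodes.append((start, prev))
    return [
        (s, e, "microburst-like" if e - s <= short_max_ms else "sustained")
        for s, e in episodes
    ]
```

For example, loss events at 0, 3, and 7 ms form one episode lasting 7 ms, which this sketch labels as microburst-like, while events recurring every 50 ms from 5000 ms to 5200 ms form a 200 ms episode labeled as sustained.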

Congestion evaluation is of significant value for subsequent network planning, capacity expansion, and optimization. The PLR is a classical indicator of network performance, but on its own it cannot accurately reflect the network congestion level, because the overall network packet loss caused by congestion is not exactly known. As mentioned above, existing monitoring techniques are not specifically designed to monitor packet loss caused by congestion. In the proposed scheme, the PLR caused by congestion can be accurately calculated as the total number of discarded packets divided by the total number of packets received by the network. No probe is required. In addition, the average frequency and duration of short-lived congestion across the entire network within a day can be obtained, and these can be used to evaluate the degree of traffic bursts and expand network capacity accordingly.

The PLR is also a key indicator for SLA compliance and should be verified. In the proposed scheme, by configuring packet counters on the ingress devices for the specified user flows and parsing the discarded packets of those flows in real time, tens of thousands of service traffic flows can be measured simultaneously. Because the proposed scheme uses a separate entity to handle packet parsing and loss statistics, the number of concurrently measured flows is not limited by network resources (e.g., computing, storage, or bandwidth). In addition, the data plane does not need to be modified to adapt to different transport protocols and monitoring techniques, as existing measurement methods require (e.g., the Alternate-Marking method defined in [RFC9343] for IPv6, [RFC9714] for MPLS, and [RFC9947] for SRv6).

This document has no IANA actions.

The congestion-induced loss monitoring system introduces additional traffic into the network. During network congestion, the monitoring system itself must not exacerbate the situation. Mechanisms such as rate limiting and traffic prioritization for congestion-related monitoring data should be considered. Appropriate defense measures against Distributed Denial of Service (DDoS) attacks are also necessary to protect the data plane and the control plane. This document does not specify security mechanisms, but it highlights that any solution must consider the trust boundaries for telemetry data subscriptions and telemetry data reporting, as well as the protection of potentially sensitive operational data. These aspects are expected to be addressed by solution proposals based on deployment requirements and threat models.