Chinese Optics Letters, Volume 23, Issue 5, 050001 (2025)

Virtualization-enabled dynamic functional split and resource mapping in optical data center networks [Invited]

Bo Tian1,*, Shanting Hu1,**, Qi Zhang2, Xiaofei Huang3, Lei Zhu2, Huan Chang1, Xiaolong Pan1, and Xiangjun Xin1
Author Affiliations
  • 1School of Information and Electronics, Beijing Institute of Technology, Beijing 100081, China
  • 2School of Electronic Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China
  • 3Beijing Institute of Electronic System Engineering, Beijing 100584, China

    The heterogeneity of applications and their divergent resource requirements lead to uneven traffic distribution and imbalanced resource utilization across data center networks (DCNs). We propose a fine-grained baseband function reallocation scheme in heterogeneous optical switching-based DCNs. A deep reinforcement learning-based functional split and resource mapping approach (DRL-BFM) is proposed to maximize throughput in high-load server racks by implementing load balancing in DCNs. The results demonstrate that DRL-BFM improves the throughput by 20.8%, 22.8%, and 29.8% on average compared to existing algorithms under different computational capacities, bandwidth constraints, and latency conditions, respectively.


    1. Introduction

    The tremendous increase in Internet traffic, driven mainly by large online applications such as high-resolution video streaming, virtual reality/augmented reality (VR/AR), and smart cities, leads to explosive demands on cloud computing in data center networks (DCNs)[1]. As the scale of DCNs increases, traffic exhibits strong burstiness and high dynamics. Traditional electrical switches, however, struggle to support high-throughput, low-latency interactions due to electronic bottlenecks and power constraints[2]. Many optical switching-enabled technologies, such as semiconductor optical amplifiers (SOAs)[3], tunable lasers with arrayed waveguide grating (AWG) routers[4], and wavelength-selective switches (WSSs)[5], have been used to implement optical circuit switching. The high bandwidth of optical switches substantially lowers latency in hierarchical switching architectures. Additionally, switching data in the optical domain greatly decreases power consumption by eliminating optical/electrical (OE) and electrical/optical (EO) conversions.

    The centralized radio access network (C-RAN) has been regarded as a promising architecture for ultradense cellular networks. The main innovation introduced by C-RAN is that the baseband functions of the legacy base station are consolidated into the baseband unit (BBU) pool, which is hosted by DCNs[6]. With massive radio access network (RAN) applications spanning diverse categories and varying resource requirements, computing tasks and traffic fluctuate significantly over time, causing uneven traffic distribution and imbalanced resource utilization across DCNs. For instance, some racks become overloaded, leading to degraded quality of service (QoS), while others remain underutilized. Additionally, the vast number of interconnect links and network elements in DCNs complicates network management. Therefore, an efficient network reconfiguration method is crucial for managing and optimally reallocating network resources to achieve workload balance.

    However, migrating full-stack baseband functions across server racks requires substantial optical bandwidth, which limits flexibility in allocating computing and bandwidth resources. Network function virtualization (NFV) has been proposed to address these challenges by using virtualization technology to consolidate various types of network equipment onto high-capacity servers, which can be deployed in data centers, at network edges, and on end-user premises. This enables the decoupling of virtual network functions (VNFs) from hardware devices, allowing them to operate as software applications on general-purpose hardware. VNFs can be instantiated and mapped at different network locations and are typically interconnected through virtual networks, offering a level of flexibility that traditional hardware-based networking cannot provide. Standardization bodies such as the 3rd Generation Partnership Project (3GPP) and the Next Generation Fronthaul Interface (NGFI) have highlighted the need to leverage virtualization-enabled functional splits to build scalable and cost-effective transport networks[7]. A virtual network embedding (VNE) algorithm has been employed to jointly minimize cell interference and optimize bandwidth utilization by dynamically selecting optimal functional split points[8]. A joint optimization scheme that selects functional split options and controls power has been proposed to maximize throughput[9]. An isolation-aware RAN slice mapping problem that accounts for various functional splits within a wavelength division multiplexing (WDM) metro-aggregation network has also been studied[10]. However, the use of integer linear programming (ILP) and heuristic algorithms makes it challenging to provide real-time and adaptive decisions.

    Machine learning and artificial intelligence (ML/AI) have been extensively investigated and successfully deployed to address critical challenges in optical networks, such as transmission quality diagnostics and network performance optimization. Within this framework, deep reinforcement learning (DRL) agents autonomously learn decision-making strategies by iteratively approximating value or policy functions through interactions with high-dimensional state spaces. A DRL-based method is proposed to jointly select and map VNFs to provide agile and flexible network services[11]. The routing, modulation, and spectrum assignment (RMSA) problem is modeled as a Markov decision process (MDP) for learning the optimal online policies in elastic optical networks[12]. A QoS-aware approach is presented based on a DRL agent and traffic prediction[13].

    In this paper, we model fine-grained baseband functions as a function chain (FC) and propose a DRL-based joint functional split and baseband function mapping algorithm (DRL-BFM) for DCNs. The objective is to maximize throughput in high-load racks by implementing load balancing through computational workload migration. Simulation results show that the proposed DRL-BFM algorithm improves average throughput by 20.8%, 22.8%, and 29.8% compared to benchmark algorithms under different computational capacities, bandwidth constraints, and latency requirements, respectively.

    2. System Model

    2.1 Heterogeneous optical switching-based DCN

    In this work, we consider a heterogeneous optical switching-based DCN (HOS-DCN) architecture[14]. As shown in Fig. 1, the HOS-DCN architecture comprises N clusters, and each cluster includes N server racks. Each rack is equipped with an optical top of rack (ToR) switch and hosts multiple servers. Intracluster traffic is routed through an N×N intracluster optical switch (IAOS), while intercluster traffic is routed through an N×N intercluster optical switch (IEOS). Both the IAOS and the IEOS are designed using SOA-based optical gates[15]. The ith IEOS is responsible for establishing optical connections between the ith racks across all clusters. Therefore, the HOS-DCN enables both intra- and intercluster traffic migration via a single hop.


    Figure 1. Heterogeneous optical switching-based DCN architecture.

    In the control plane, we propose utilizing DRL agents to determine the placement of fine-grained FCs. These DRL agents are trained within the IAOS layer. Specifically, each IAOS gathers real-time status information from the server racks in its cluster, including available computational and bandwidth resources, which serve as input for training. Once trained, the IAOS distributes the DRL agents to each ToR within the cluster, ensuring uniform decision-making across all ToRs. Upon task arrival, the DRL agent at the ToR determines the optimal processing location for a function unit (FU). Additionally, the DRL agents configure both intracluster and intercluster optical switching, sending switching signals to the IAOS and IEOS. As a result, deploying an FC may require coordinated decision-making among DRL agents across multiple server racks.

    Furthermore, an optical ToR should have the capability to support both intracluster and intercluster traffic migration. Its internal structure is illustrated in Fig. 2. As shown, an optical ToR consists of multiple OE/EO conversion devices, along with optical multiplexers and demultiplexers. Optical demultiplexers separate optical signals of different wavelengths and direct them to the OE devices. These OE devices detect downstream optical traffic and convert it into electrical signals, which are then transmitted to the servers for processing. In the upstream direction, Ethernet frames generated by the servers are temporarily stored in dedicated electrical buffers and forwarded to either IAOSs or IEOSs based on the switching forwarding table configured by the DRL agents.


    Figure 2. Internal structure of an optical ToR.

    2.2 Fine-grained function chain

    In this section, the baseband functions and the RAN application are modeled as an FC, in which the fine-grained FUs are processed sequentially, followed by the execution of the RAN application. Figure 3 presents the fine-grained functional split options. As shown, the full-stack RAN functions are divided into seven FUs $F=\{f_1,\ldots,f_7\}$, where $f_1$ includes the transmission/receiving functions and low-PHY functions, which are executed in the remote radio head (RRH) to reduce fronthaul bandwidth consumption, while $f_2$ to $f_7$ are executed in the BBU pool. Furthermore, these FUs are interconnected through virtual links $L=\{l_1,\ldots,l_6\}$.


    Figure 3. Fine-grained functional split options.

    The execution of each FU $f_i \in F$ of a RAN request $r$ requires a certain number of giga operations per second (GOPS), denoted $c_i^r$. We also denote $\rho_r$ as the processing load of the RAN application. The computational demand (i.e., GOPS) to execute the full-stack RAN functions for request $r$ is defined as
    $$C_r=\left(3N_{\mathrm{ant}}+N_{\mathrm{ant}}^2+\frac{1}{3}M_r c_r L_r\right)\frac{N_{RB}}{10},$$
    where $N_{\mathrm{ant}}$ refers to the number of antennas used for request $r$ in the RAN, $M_r$ indicates the modulation bits, $c_r$ represents the code rate, $L_r$ denotes the number of antenna layers, and $N_{RB}$ is the number of resource blocks. A scaling factor $\sigma_i$ is introduced to represent the proportion of the computational demand of $f_i$ relative to the full-stack RAN functions[16,17]. Thus, the computational demand (i.e., GOPS) of function $f_i$ is calculated as $c_i^r=\sigma_i C_r$.

    The execution of two consecutive FUs on two different racks imposes a specific bandwidth demand for mapping the virtual link between them. Let $w_{l_i}$ denote the data rate of the virtual link $l_i \in L$. The data rate of $l_1$ (in Mbps) is calculated as
    $$w_{l_1}=N_{\mathrm{SYM}}^{\mathrm{Data}}\cdot N_{\mathrm{sc}}^{\mathrm{RB}}\cdot N_{\mathrm{ant}}\cdot N_{\mathrm{IQ}}/1000,$$
    where $N_{\mathrm{SYM}}^{\mathrm{Data}}$ represents the data symbols per subframe, $N_{\mathrm{sc}}^{\mathrm{RB}}$ the subcarriers per resource block, and $N_{\mathrm{IQ}}$ the number of IQ bits. According to the 3GPP document[18], $w_{l_2}$ to $w_{l_6}$ follow a proportional relationship with $w_{l_1}$ and can be expressed as $w_{l_i}=\lambda_i w_{l_1}$, $2\le i\le 6$.

    2.3 Resource mapping in HOS-DCNs

    In this section, we present the resource mapping framework managed by the NFV management and orchestration (NFV-MANO) plane, as illustrated in Fig. 4. The framework consists of three layers: the service function layer, the physical infrastructure layer, and the virtualization layer. The service function layer defines radio processing functionalities as FUs, which are flexibly distributed across server racks. The physical infrastructure layer comprises computing, storage, and networking hardware, including servers, storage devices, and fiber links. The virtualization layer serves as a software platform, abstracting the underlying physical resources into isolated virtual environments that provide virtualized computing, networking, and storage resources. Virtual machines (VMs) and virtual fiber links (VFLs) constitute these isolated environments, hosting the FUs and virtual links, respectively. In our work, it is assumed that each FU $f_i \in F$ is instantiated on a single VM #$i$, so each rack contains six types of VMs. In addition, each rack has $N-1$ intracluster VFLs and $N-1$ intercluster VFLs. A virtual link $l_i \in L$ is instantiated on either an intracluster VFL (VL #A) or an intercluster VFL (VL #E). To deploy an FC, the fine-grained FUs and virtual links must be mapped together dynamically and in order.


    Figure 4. Resource mapping framework in HOS-DCNs.

    In Fig. 4, we consider two FCs (FC1 and FC2) that initially arrive at Rack 1. Given the high workload in Rack 1, these FCs must be reallocated to other racks to balance the computational load and optimize resource utilization. Based on the control signals from the DRL agents, FC1 is divided into two parts, with FU2 mapped onto VM #2 and processed in Rack 1, while FU3 to FU7 are mapped onto VM #3 to VM #7 and processed in Rack 2. The virtual link between FU2 and FU3 is mapped onto VL #A1 in Rack 1. FC2 is divided into three parts, with FU2 mapped onto VM #2 and processed in Rack 1, while FU3 and FU4 are mapped onto VM #3 and VM #4 for processing in Rack 3. Additionally, FU5 to FU7 are mapped onto VM #5 to VM #7 for processing in Rack 4. The virtual link l2 of FC2 is mapped onto VL #E1 in Rack 1 and l4 is mapped onto VL #A1 in Rack 3.

    3. Proposed Algorithm

    In this section, we first define the optimization problem and formulate the dynamic functional split and mapping process as a Markov decision process (MDP). Subsequently, we introduce the DRL-BFM approach to execute the FC mapping process, utilizing the proximal policy optimization (PPO) algorithm to train the DRL agents effectively.

    3.1 Problem definition

    The dynamic functional split and mapping problem can be defined as follows. Given: the DCN topology (including the locations of ToRs, IAOSs, and IEOSs), the set of high-load server racks, the set of service requests, the computational capacity of the server racks, and the bandwidth capacity of the fiber links interconnecting ToRs. Decide: the optimal mapping of FCs. Objective: maximize the throughput of high-load server racks by balancing the DCN workloads. Subject to: constraints such as QoS requirements, computational capacities, and bandwidth capacities. The mapping of fine-grained FUs and virtual links is a complex decision-making problem and is known to be NP-hard. In the following, we formulate an MDP and present a DRL-based approach to solve it.

    3.2 MDP formulation

    For a service request $r \in R$, the fine-grained baseband functions are modeled as an FC $L_r=\{f_{r,1},f_{r,2},\ldots,f_{r,K}\}$, where $f_{r,k}$ represents FU $k$. Let $a(r)$ denote the source rack of $r$. In our problem, $f_{r,1}$ is processed in the RRHs, while the placements of $f_{r,2}$ to $f_{r,K}$ are determined by DRL agents. Specifically, let $\Delta_{r,k}$ denote the rack where $f_{r,k}$ is processed; the placement of the subsequent FU $f_{r,k+1}$ is determined by the agent at $\Delta_{r,k}$. We formulate this process as an MDP.

    1) Observation: The observation should provide the DRL agents with enough information to accurately capture the state of the environment. The input state for allocating $f_{r,k}$ is denoted as $S_{r,k}$ and can be expressed as
    $$S_{r,k}=\{E_{r,k},C_{r,k},T_{r,k},\tau_{r,k},g_{r,k},c_{r,k}\},$$
    where $E_{r,k}$ represents the set of computational capacities of the $N$ intracluster racks and $N-1$ intercluster racks connected to $\Delta_{r,k-1}$; $C_{r,k}$ represents the set of bandwidth capacities of the fiber links interconnecting $\Delta_{r,k-1}$ with the $N-1$ intracluster ToRs and the $N-1$ intercluster ToRs; $T_{r,k}$ represents the set of latencies required for migrating $f_{r,k}$ from $\Delta_{r,k-1}$ to the $N-1$ intracluster ToRs and $N-1$ intercluster ToRs; $\tau_{r,k}$ denotes the latency threshold for placing $f_{r,k}$ and its subsequent FUs; and $g_{r,k}$ and $c_{r,k}$ represent the computational and bandwidth demands for allocating $f_{r,k}$. Note that $E_{r,k}$, $C_{r,k}$, and $T_{r,k}-\tau_{r,k}$ are each normalized individually.
    2) Action: The action space $A$ includes all potential actions an agent can take within a given environment. In DRL-BFM, an action represents the allocation of $f_{r,k}$ to a specific rack, either within the same cluster as $\Delta_{r,k-1}$ via the IAOS or in a different cluster via the IEOS. The set $A$ also includes the action $\varnothing$, which signifies the termination of an FC when the constraints on computational capacity, bandwidth, and latency are violated.
    3) State transition: After processing $f_{r,k}$, the MDP transitions to a new state in one of two scenarios:
      • If $f_{r,k}$ is not the last FU in $L_r$ and none of the constraints are violated, the action is valid. The new state $S_{r,k+1}$ is then constructed and used as input to the DRL agent at $\Delta_{r,k}$ for allocating $f_{r,k+1}$. The changes in computational capacity, bandwidth capacity, and latency incurred by processing $f_{r,k}$ are updated, after which $S_{r,k+1}$ can be calculated.
      • If $f_{r,k}$ is the last FU in $L_r$, a new FC $L_{r+1}$ begins to be allocated. Since $f_{r+1,1}$ in $L_{r+1}$ is processed directly in the source rack, the next action is to place $f_{r+1,2}$ based on a new state $S_{r+1,2}$. As in the previous scenario, the resource changes incurred by processing $f_{r,k}$ are updated.
    4) Reward function: Let $a_{r,k}$ denote the action that selects $\Delta_{r,k}$ to process $f_{r,k}$. The reward $R_{r,k}$ for processing $f_{r,k}$ by taking $a_{r,k}$ is defined as
    $$R_{r,k}=\alpha_0 e_{a_{r,k}}+\alpha_1 c_{a_{r,k}}+\alpha_2 p_t+\alpha_3 p_d,$$
    where $e_{a_{r,k}}$ represents the normalized value of selecting $\Delta_{r,k}$ in $E_{r,k}$, and $c_{a_{r,k}}$ represents the normalized value of the virtual link between $\Delta_{r,k-1}$ and $\Delta_{r,k}$ in $C_{r,k}$. If $\Delta_{r,k-1}=\Delta_{r,k}$, $c_{a_{r,k}}$ is set to the maximum value in $C_{r,k}$. Therefore, the greater the remaining computational capacity of $\Delta_{r,k}$, the larger the value of $e_{a_{r,k}}$. Similarly, a higher remaining bandwidth capacity on the selected virtual link results in a larger $c_{a_{r,k}}$. The third term, $p_t$, and the fourth term, $p_d$, represent penalties for exceeding the latency thresholds and bandwidth capacities, respectively.

    3.3 DRL-BFM algorithm

    PPO is a policy gradient algorithm based on the actor-critic (AC) architecture and has proven effective for training DRL agents[19]. The AC network consists of an actor module and a critic module. The actor module parameterizes the policy function to select optimal actions, while the critic module parameterizes the value function to evaluate the selected actions. In our work, PPO is used to train the DRL agents to dynamically select functional split options and allocate FCs.

    Algorithm 1 presents the pseudo-code of the PPO-based training process. Service requests arrive sequentially. The computational demand $g_{r,k}$ for processing $f_{r,k}$, the bandwidth demand $c_{r,k}$ for migrating $f_{r,k}$, and the latency threshold $t_r^{\mathrm{th}}$ serve as inputs to Algorithm 1. Each cluster trains a single agent, and the weights of the actor and critic networks are initialized before the algorithm begins. Line 1 starts the main loop of the training process, which is divided into two parts. The first part outlines the process of generating samples through interaction with the environment (lines 2–19). Training starts by initializing the experience replay buffer, and an iteration terminates once the computational resources of the servers are fully utilized (lines 2–3). To reallocate an FC $r$, $f_{r,1}$ is processed in the source rack (lines 4–6). Next, the cluster where request $r$ is currently located is identified. The network controller executes policy $\pi_\theta$ to choose an action and obtain the reward (lines 7–14). Notably, the mapping of different FUs within an FC $r$ can be carried out by agents in different clusters. The state transition tuple is then collected and stored in the replay buffer (lines 15–17). The second part illustrates the learning process (lines 20–35). Let $B_n$ denote the number of minibatches. For each minibatch $b$, generalized advantage estimation (GAE) is used to calculate the advantage function $\hat{A}_\theta(t)$ (line 23). The state-action probability $\pi_\theta(a_t|s_t)$ under the old policy is also calculated (line 24). Then, the clipped surrogate objective $L_t^{\mathrm{CLIP}}(\theta)$ and the value function estimator $L_t^{\mathrm{VF}}(\phi)$ are calculated to update the actor and critic module parameters, respectively (lines 26–31)[20]. When network configurations change significantly, the agents must be retrained to adapt to the new state or action spaces.


    Once training is complete, the DRL agents are deployed in the ToRs, and the DRL-BFM algorithm is applied to jointly manage the functional split and execute the mapping of fine-grained FUs in HOS-DCNs. The pseudo-code of the DRL-BFM algorithm is described in Algorithm 2 (Table 2). The service requests $R$ and the trained DRL agents serve as inputs. Before executing each FU $f_{r,k}$, the current cluster $i$ is identified (line 3). Then, the network state for processing $f_{r,k}$ is calculated and fed into the DRL agent $\pi_{\theta_i}$, which determines the rack $\Delta_{r,k}$ where $f_{r,k}$ should be executed (lines 4–5). Subsequently, $f_{r,k}$ is mapped onto VM #$k$ and processed in $\Delta_{r,k}$, and the remaining computational capacity of $\Delta_{r,k}$ is updated (lines 7–8). If $\Delta_{r,k}$ and $\Delta_{r,k-1}$ differ but belong to the same cluster, the virtual link between $f_{r,k-1}$ and $f_{r,k}$ is mapped onto an intracluster VFL in $\Delta_{r,k-1}$ (lines 9–12). On the other hand, if $\Delta_{r,k}$ and $\Delta_{r,k-1}$ belong to different clusters, the virtual link between $f_{r,k-1}$ and $f_{r,k}$ is mapped onto an intercluster VFL in $\Delta_{r,k-1}$ (lines 13–16). Additionally, if $a_{r,k}$ is not a valid action (i.e., the latency or bandwidth requirements are not met), $f_{r,k}$ continues to be executed in $\Delta_{r,k-1}$ (lines 18–20).

    • Table 2. DRL-BFM

      Input: request set $R$, trained DRL parameters $\pi_{\theta_i}$, $i=1,\ldots,N$
      1:  for $r \in R$ do
      2:    for $k = 2, 3, \ldots, 7$ do
      3:      identify the current cluster $i$
      4:      calculate the network state $S_{r,k}$
      5:      run policy $\pi_{\theta_i}$ to select an action $a_{r,k}$
      6:      if $a_{r,k}$ is valid then
      7:        map $f_{r,k}$ onto VM #$k$ and process it in $\Delta_{r,k}$
      8:        update the computational capacity of $\Delta_{r,k}$
      9:        if $\Delta_{r,k-1} \neq \Delta_{r,k}$ then
      10:         if $\Delta_{r,k-1}$ and $\Delta_{r,k}$ are in the same cluster then
      11:           map the virtual link between $f_{r,k-1}$ and $f_{r,k}$ onto VL #A$_n$ in $\Delta_{r,k-1}$
      12:           update the bandwidth capacity of VL #A$_n$
      13:         else
      14:           map the virtual link between $f_{r,k-1}$ and $f_{r,k}$ onto VL #E$_n$ in $\Delta_{r,k-1}$
      15:           update the bandwidth capacity of VL #E$_n$
      16:         end if
      17:       end if
      18:     else
      19:       map $f_{r,k}$ onto VM #$k$ and process it in $\Delta_{r,k-1}$
      20:     end if
      21:   end for
      22: end for

    In Algorithm 2, the FUs of each service are executed sequentially. An FU can either be processed on the same server as the previous FU or be migrated to another server within the same cluster or across clusters. When two consecutive FUs are executed on different servers, a functional split occurs, necessitating the mapping of their virtual link. This flexible scheme allows the chain to be split between any two consecutive FUs, enabling dynamic adaptation to resource availability. Once a split occurs, both the FU placement and the corresponding virtual link mapping must be optimized to ensure efficient resource utilization and seamless service execution, as sketched below.

    4. Performance Evaluation

    In this section, the learning environment is constructed using OpenAI Gym, and both the actor and critic networks are trained with PyTorch 2.0.1. Service requests are transmitted sequentially from the RRHs to the BBU pool (hosted at server racks within the DCN). We assume that the service arrival rate in high-load racks is five times that of the other racks. The number of resource blocks $N_{RB}$ per task is selected from [50, 100], and the number of antennas $N_{\mathrm{ant}}$ per task is randomly selected from [2, 4]. Several studies provide parameters for estimating the computational and bandwidth demands of different functional splits[9]. Initially, the workload in high-load racks is set between [2000, 2500] GOPS, while in the other racks it ranges between [1000, 1500] GOPS. The initial data rate of the fiber link interconnecting two ToRs is set between [0.5, 1] Gbps. The hyperparameters for training the policy and critic networks are summarized in Table 1.

    • Table 1. Hyperparameters in the Learning Process

      Parameter                                    Value
      Learning rate                                10⁻⁴/10⁻⁵
      Discount factor                              0.9
      Hidden layers                                3
      Neurons                                      256/layer
      Minibatch size                               128
      Number of epochs                             10
      Optimizer                                    Adam
      Activation function                          ReLU(·)
      Output function                              Softmax
      Coefficient of $L_t^{\mathrm{CLIP}}(\theta)$ 2
      Coefficient of $L_t^{\mathrm{VF}}(\phi)$     0.05
      Coefficient of $H(\pi_\theta, s_t)$          0.01

    We evaluate the performance of DRL-BFM by comparing it with the following heuristic algorithms: i) RBM (random-based mapping): Randomly selects a rack from those meeting the resource requirements for FU mapping. ii) LCBM (least-loaded computing-based mapping): Selects the least-loaded rack that satisfies the resource demands for processing an FU. iii) LBFO (least-loaded bandwidth-focused optimization): Chooses the rack that is connected via the least-loaded fiber link to the current rack while meeting the computing resource requirements.

    Figure 5 illustrates the throughput in high-load racks, measured by the number of successfully processed service requests as a function of increasing computational capacities using DRL-BFM, RBM, LCBM, and LBFO algorithms. As shown, DRL-BFM achieves an average throughput increase of 30.6%, 5.5%, and 26.4% compared to RBM, LCBM, and LBFO algorithms, respectively. Furthermore, as computational capacity increases, the performance gap between DRL-BFM and the other algorithms (RBM, LCBM, and LBFO) widens. This is because a higher computational capacity allows more FCs to migrate between ToRs, thereby amplifying the constraints imposed by fiber bandwidth capacities. These results demonstrate that our proposed DRL-BFM effectively improves throughput in high-load server racks under varying computational capacities.


    Figure 5. Comparison of throughput with increasing computational capacities using DRL-BFM, RBM, LCBM, and LBFO.

    Figure 6 shows the throughput in high-load racks with increasing bandwidth capacities using DRL-BFM, RBM, LCBM, and LBFO. Here, the computational capacity is set to 7000 GOPS. DRL-BFM achieves an average throughput improvement of 32%, 4.5%, and 32.1% compared to RBM, LCBM, and LBFO, respectively. It can be observed that as the bandwidth capacity increases to 6.5 Gbps, the throughput gap between LCBM and DRL-BFM gradually narrows. This is because the increased bandwidth capacity allows more FCs to migrate between ToRs, relaxing the constraints imposed by fiber bandwidth limitations. The results demonstrate that DRL-BFM effectively enhances throughput in high-load server racks under varying bandwidth capacities.


    Figure 6. Comparison of throughput with increasing bandwidth capacities using DRL-BFM, RBM, LCBM, and LBFO.

    High-latency services allow traffic to be reallocated among server racks. Figure 7 shows the throughput in high-load racks with increasing ratios of high-latency services. The computational and bandwidth capacities are set to 7000 GOPS and 4 Gbps, respectively. The results show that DRL-BFM improves the throughput by 38.1%, 11.5%, and 40.0% compared to RBM, LCBM, and LBFO algorithms, respectively. It can also be observed that as the ratio of high-latency services increases, the performance gap between DRL-BFM and the other algorithms widens. These results indicate that the proposed DRL-BFM effectively enables traffic migration for a larger number of services, enhancing throughput in high-load server racks.


    Figure 7. Comparison of throughput with increasing ratios of high-latency services using DRL-BFM, RBM, LCBM, and LBFO.

    We use the load balancing factor (LBF), which quantifies the average relative deviation of the workloads across all racks, to reflect the load-balancing performance in DCNs. Figure 8 shows the LBF values as the number of service requests increases using the DRL-BFM, RBM, LCBM, and LBFO algorithms. Each algorithm is run 10 times, and the average results are presented. As shown, LCBM achieves an average LBF that is 5.1%, 44.7%, and 47.3% lower than that of DRL-BFM, RBM, and LBFO, respectively. LCBM outperforms the other algorithms because it greedily selects low-load server racks for offloading FCs. The small gap between DRL-BFM and LCBM indicates that the proposed DRL-BFM approach effectively balances DCN workloads while improving throughput.


    Figure 8. Load balancing performance using DRL-BFM, RBM, LCBM, and LBFO.

    5. Conclusion

    In summary, we have developed a DRL-BFM approach to jointly select functional splits and map fine-grained baseband functions in a heterogeneous optical switching-based DCN. We conducted simulations to compare the throughput and load-balancing performance of DRL-BFM with other heuristic algorithms under varying computational capacities, bandwidth capacities, and latency conditions. The results demonstrate that the proposed DRL-BFM approach achieves higher throughput in high-load racks while effectively balancing the DCN workloads.

    [19] J. Schulman, F. Wolski, P. Dhariwal, et al., "Proximal policy optimization algorithms," arXiv:1707.06347 (2017).

    [20] J. M. Ziazet and B. Jaumard, "Deep reinforcement learning for network provisioning in elastic optical networks," in IEEE International Conference on Communications (ICC), p. 1 (2022).

    Paper Information

    Category: Special issue on optical interconnect and integrated photonic chip technologies for hyper-scale computing

    Received: Sep. 6, 2024

    Accepted: Mar. 26, 2025

    Published Online: May 9, 2025

    Author emails: Bo Tian (tianbo@bit.edu.cn), Shanting Hu (hushanting@bit.edu.cn)

    DOI: 10.3788/COL202523.050001

    CSTR: 32184.14.COL202523.050001
