Distributed Fronthaul-Constrained Joint Transmission Design and Selection Using Augmented Consensus-based Dual Decomposition

Mykola Servetnyk; Carrson C. Fung

doi:10.23919/JCN.2022.000030

ISSN: 1976-5541

Volume 24, No 4 (2022), pp. 419 - 437

10.23919/JCN.2022.000030

Mykola Servetnyk and Carrson C. Fung

Distributed Fronthaul-Constrained Joint Transmission Design and Selection Using Augmented Consensus-based Dual Decomposition

Abstract: User-centric coordinated multipoint (CoMP) joint transmission (JT) is a novel technique to manage interference and enhance system performance with single frequency reuse, where user equipment (UE) communicates with their closest transmission points (TPs). Unfortunately, in coherent JT, requirement of strict network synchronization accuracy makes it difficult and expensive to be practically deployed. The noncoherent JT has therefore received growing attention since it requires less strict network synchronization accuracy compared to its coherent counterpart as it does not require the signal to be phase-aligned at the receiver. Moreover, cell-free massive MIMO, which is regarded as combining CoMP with massive MIMO systems, has recently been touted as a solution for avoiding intercell interference and provide uniform coverage over a large area. However, the operational costs of CoMP, such as the associated control signaling and communication overhead and the increase of network complexity, could prevent the practical implementation of CoMP. A distributed joint transmission CoMP (JT-CoMP) scheme is proposed herein that allows distributed design and selection of cooperating transmission nodes. To maximize user capacity, the proposed distributed consensus optimization problem assumes spectrum underlay transmission is used so that the solution can achieve non-orthogonal multiple access (NOMA) for cell-free user centric JT-CoMP systems. The proposed algorithm is different from others in the literature because it solves a design problem that involves a coupling constraint that no existing algorithm can solve. Analytical results based on spectral graph theory are given to prove its convergence and characterize its rate of convergence. The more practical scenario is further considered, where limited fronthaul capacity is also included in the problem. A successive convex approximation (SCA) method is used to solve the resulting nonconvex problem, which is shown to maximize spectral efficiency. Simulation results are provided to show that the performance of both proposed distributed algorithms (that address problem without and with fronthaul constraint) is comparable to its centralized counterpart.

Keywords: Distributed consensus algorithm , fronthaul constraint , NOMA , transmission point selection , user centric cell-free cooperative MIMO

I. INTRODUCTION

AS spectral efficiency of point-to-point transmission in cellular networks approaches its theoretical limits, to further improve network capacity and ease network management, traditional cellular networks are evolving into cell-free massive MIMO systems, regarded as a combination between user-centric coordinated multipoint (CoMP) [1] and traditional massive multiple-input multiple-output (MIMO) systems [2]. In this scenario, smarter resource coordination among transmission points (TPs) for efficient interference management and transmission coordination can provide substantial gains in throughput and user experience. CoMP allows joint transmission (JT) from cooperating TPs, which gives the ability to simultaneously transmit data to each user equipment (UE) by sharing the same channel resources, such as time and frequency, to boost reception performance. Qualcomm has implemented a 5G CoMP testbed [3] which demonstrated that system capacity can increase by four-fold when CoMP spatial multiplexing was used, thus proving CoMP’s potential. However, finding an optimal solution for joint TP selection and transmission design problem is known to be NP-hard, because it requires search for the determination of the cooperating sets. Besides, the operational costs of JT-CoMP, prior phase alignment and tight synchronization to the target user requirements often hinder its implementation in industrial scale [4], not to mention communication overhead and the increase of network complexity.

Coordinated beamforming/scheduling and dynamic point selection have been investigated as alternatives which require less communication overhead compared to JT, but without achieving the same transmission performance as JT. There are two possible approaches to tackle the joint TP selection and transmission design problem: centralized and distributed. Among the advantages of the centralized approach is its relative simplicity in algorithm implementation and the possibility to directly use the interior point method (IPM) to solve the joint design problem. However, to take full advantage, the centralized server is required to handle high amount of computations with high throughput communication backhaul to every TPs. Among the data transmitted is the channel state information (CSI) from the TPs to the server and precoder coefficients from the server to the TPs. It is possible to satisfy such a requirement for small networks but this is not practical as the system scales up. Several joint transmit and TP selection design strategies have been proposed in the literature to reduce signaling overhead. [5] tackled the joint precoder and clustering problem by formulating the problem as a mixed-integer convex problem which minimizes the transmit energy subject to signal-to-interference-plus-noise quality-ofservice (SINR QoS) constraint, where convex formulation was obtained upon a reformulation of the constraint. However, the approach only applies to the case of a single receive antenna. Two algorithms were proposed in [6] with similar problem formulation, where one is based on iterative reweighed [TeX:] $$\ell_{1}-$$ norm minimization, and the other is based on solving the [TeX:] $$\ell_{2}-$$ norm relaxed problem and then iteratively removing the links that correspond to the smallest transmit power. [7] extended aforementioned designs by including per TP backhaul constraint and solved the design problem by customized branch and the bound algorithm applied to discrete monotonic optimization. [8] proposed to minimize a sum transmit signal power subject to the SINR QoS constraint similar to [5] with an additional fronthaul capacity constraint. Proposed problem formulation falls into a category of the mixed integer second order cone problem and can be solved with continuous relaxation, or the proposed inflation algorithm.

In addition, none of the aforementioned works are appropriate for large-sized networks due to computational complexity at the central unit that increases proportionally to the number of TPs in the network as was mentioned before. [9] solved the distributed precoder design problem by maximizing the weighted sum rate via minimization of the weighted sum meansquared- error (WMMSE) proposed in [10], [11]. [12] developed a CSI signaling technique for coordinated beamforming for time division duplex (TDD) multi-cell system by also solving the weighted sum rate problem by reformulating the problem into minimizing the weighted sum mean-squared-error, making implementation of this CoMP method feasible. A solution for intercell interference level management for weighted sum rate problem was proposed in [13] using primal decomposition for multiple-input single-output (MISO) systems with SINR constraints. The nonconvex problem was decomposed into a master problem with several subproblems which, unlike the scheme in [9], only involves coordination between cooperating base stations. However, solving for the solution of the master problem requires synchronous base stations. [14] proposed to use dual decomposition method to decentralize interference management among adjacent cells, which was extended by [15] using the alternating direction of method of multiplier based decentralized beamforming design. [16] used the WMMSE technique in [9] and extend the linear transceiver design problem for JT-CoMP. [17] has embedded rate QoS constraint into the problem formulation and was able to solve the problem in a way similar to [9], [16] via successive convex approximation to turn one of the subproblems into convex form. [18] proposed a distributed precoding design scheme for JT-CoMP that requires coordination among cooperating base stations using backhaul signaling where correlation information among base stations, known as cross-term information, are exchanged. Unfortunately, the signaling scheme does not scale well with system size. [19] proposed distributed precoding scheme for cell-free massive MIMO system by minimizing weighted sum mean-squared-error, where a new over-theair signaling scheme was proposed to exchange cross-term information that scales well with system size compared to the backhaul signaling scheme in [18]. Since the quality of overthe- air channel will usually be worse than that of the backhaul, this may adversely affect the quality of the precoder. Table I summarizes the methods above.

During the distributed design, agents in the network coordinate as a swarm to perform a certain task and reach consensus. [20] provided a theoretical framework for a wide range of consensus algorithms for multiagent networked systems. [21] and [22] have developed distributed algorithms based on consensus-based primal-dual decomposition and [23] proposed algorithm based on Newton’s method. However, [21]–[23] have limitation in the form of the constraint to be [TeX:] $$\sum_{q} g^{q}\left(x^{q}\right) \leq 0,$$ where q is the node index, [TeX:] $$x^{q}$$ is the variable at node q and [TeX:] $$g^{q}\left(x^{q}\right)$$ is local constraint function at the qth node. Recently, [24] has extended the algorithm in [22] to be applied to a constraint of the form [TeX:] $$\sum_{q} g^{q}\left(x^{q}\right)=L,$$ where author interprets L as a resource shared among nodes. Equality constraint implies that after optimization all resources must be used. However, iPt is more meaningful to replace equality with inequality [TeX:] $$\sum_{q} g^{q}\left(x^{q}\right) \leq L$$ to emphasize the amount of used resource can be flexible. Recently, [25] has considered a problem with the coupling constraint being an inequality and proposed an algorithm that converges to a fixed point. The main focus of that work is to solve a problem where the constraint graph does not necessarily have to equal to the physical network graph. However, global constraint functions [TeX:] $$g^{q}\left(x^{q}\right)$$ are limited to be an affine transformation. Moreover, problem formulation in [25] does not include local constraint, which significantly simplifies algorithm and convergence analysis.

In this work, a distributed consensus algorithm using dual decomposition, which circumvents the previously mentioned limitations using an adaptive “resource” allocation scheme, is proposed. The proposed scheme can be effectively applied to the distributed joint transmission and selection problem where each TP is modeled as a processing node in the network. Moreover, the amount of communication overhead does not scale with the number of transmission points. The problem of CSI overhead and data sharing among transmission points is not considered herein. The internode communications are modeled as an undirected graph, where the resulting graph Laplacian is used to prove the convergence of the proposed technique. This proof augments the ones from [21], [22], [24], [26], [27] and extends them with the incorporation of the analysis of the resource allocation scheme. In addition, none of the distributed CoMP schemes above considered nonorthogonal multiple access (NOMA) where transmitters can transmit simultaneously using the same time, frequency and space. This allows more users to be served and enhanced spectrum efficiency. Spectrum underlay techniques [28]–[30] are used herein where transceivers behave altruistically such that transmission between one pair of transceiver does not cause noticeable interference to other users (considered as victims). The interference can be managed by constraining the transmit power emitted by the transmitter toward its victims such that victims are oblivious to other transmissions. Finally, to relax the stringent requirement of coherent JT, noncoherent JT is considered. The main contributions and novelties of this work are as follows:

This work attempts to jointly optimize the coordination clusters and linear precoders in the network by solving a nonconvex programming problem. Solving the problem allows users to transmit non-orthogonally, enabling cellfree user centric JT-CoMP systems [2], thus increasing user capacity compared to orthogonal multiple access techniques. The resulting solution maximizes received signal power at the UEs, which in return increases the overall system sum rate.

Table I

COMPARISON OF DIFFERENT COMP DESIGNS.

	Objective	Limitations
[5]	Transmit power	Limited to single antenna receivers
[6]	Reweighted [TeX:] $$\ell_{1} \text { and } \ell_{2}$$ relaxed of transmit power	Limited to single antenna receivers
[7]	Sum rate	Requires branch and bound, slow convergence
[8]	Total power consumption	Use of MISOCP solver (high computational complexity) or continuous relaxation
[9]	Weighted sum rate	Design for coordinated beamforming, misestimation of received signal covariance matrix can degrade performance
[12]	Weighted sum rate	Design for coordinated beamforming, guarantees to only obtain a stationary solution of the WMMSE problem
[13]	Weighted sum rate	Design for coordinated beamforming, requires synchronous base stations to solve master problem
[14]	Total transmit power	Requires consistency among interference at different base stations, requires single antenna receivers
[15]	Weighted total transmit power	Design for coordinated beamforming, signaling between base stations may be high for large system
[16]	Weighted sum rate, proportional fairness	Guarantees to only obtain a stationary solution of the WMMSE problem
[17]	Weighted sum-rate	Local optimal solution is obtained using SCA. QoS constraint may not be satisfied for fast-fading channels
[18]	Weighted sum rate	Requires backhaul signaling that does not scale well with system size
[19]	Weighted sum mean squared error	Quality of over-the-air channel may adversely impact precoder design

The transformed problem can be regarded as a multiagent convex problem with coupling constraints including one of the form [TeX:] $$\sum_{q} g^{q}\left(x^{q}\right) \leq L.$$ An efficient distributed algorithm that computes a stationary solution to the transformed problem is proposed.

The proposed distributed algorithm generalizes the works in [21], [22], [24] as they failed to handle [TeX:] $$\sum_{q} g^{q}\left(x^{q}\right) \leq L$$ type of constraints. Analytical results prove the convergence property of the proposed algorithm to a stationary solution of the transformed problem. It is important to solve the problem distributively with inequality constraints as the considered problem with equality constraints is often infeasible.

Algorithm is further extended to incorporate the fronthaul constraint as part of the design. Proposed consensus based primal dual decomposition method is augmented with successive convex approximation (SCA) and alternating direction method of multipliers (ADMM) and allows for sophisticated fronthaul resource allocation.

The convergence and effectiveness of the proposed algorithm have been theoretically validated and evaluated via extensive simulations.

The rest of the paper is organized as follows. In Section II, system model is presented. The joint TP selection and precoder design problem is formulated and then transformed into convex form in Section III. The proposed efficient distributed algorithm is proposed in Section IV to solve the transformed problem. The algorithm convergence properties and underlying assumptions are stated in Section V. Extension of the proposed distributed algorithm to fronthaul-constrained transmission nodes is presented in Section VI. Discussion about proposed algorithms’ complexity and system overhead is given in Section VII, followed by numerical examples and analytical results to validate the proposed algorithms in Section IX. The paper is concluded in Section X and proofs of the theorems are given in the Appendix.

Table II

NOTATIONS.

Symbol	Definition
B	Number of spatial streams.
[TeX:] $$c^{(n)}$$	Dual consensus/objective balance coefficient.
[TeX:] $$C^{q}$$	qth TP’s fronthaul capacity threshold.
[TeX:] $$C^{q} / \widetilde{C} q$$	Local constraint set of qth TP.
[TeX:] $$f^{q}(\cdot)$$	qth TP’s objective function.
[TeX:] $$\mathbf{F}_{i}^{q}$$	Precoder from qth TP to ith UE.
[TeX:] $$\mathrm{g}^{q}(\cdot)$$	qth TP’s coupling constraint term.
[TeX:] $$\mathbf{H}_{i}^{q}$$	Channel from qth TP to ith UE.
i,j	UEs subscript indices.
I	Total number of users
[TeX:] $$I_{t h}$$	Instantaneous interference leakage power constraint threshold
[TeX:] $$\mathrm{Q}_{i}^{q}$$	Precoder variable from qth TP to ith UE.
[TeX:] $$n_{T} / n_{R}$$	Number of Tx/Rx antennas.
[TeX:] $$L_{i j}^{q(n)}$$	Instantaneous leakage interference power at nth iteration.
[TeX:] $$\mathcal{N}^{q}$$	qth TP’s neighborhood set.
[TeX:] $$\mathcal{Q}$$	TPs cooperating set, i.e., all TPs.
q,r	TPs superscript indices.
[TeX:] $$S_{i j}^{q}$$	Slack leakage interference power for qth TP on ij interference link.
W	Consensus matrix.
[TeX:] $$[\mathbf{W}]_{r q}$$	Weight between TPs q and r.
[TeX:] $$\alpha$$	Fronthaul penalty parameter.
[TeX:] $$\lambda^{q}$$	Dual variable for interference leakage constraint.
[TeX:] $$\ell^{q}$$	Consensus dual variable for interference leakage constraint.
[TeX:] $$\sigma^{2}$$	Noise power
[TeX:] $$\varphi$$	Number of consensus iterations in [TeX:] $$W^{\varphi}$$

Notations: Uppercase (lowercase) bold face letters indicate matrices (column vectors). [TeX:] $$\text { Superscript }^{H}$$ designates A as a symmetric positive semidefinite matrix, [TeX:] $$\mathbf{1}_{M}$$ denotes an [TeX:] $$M \times 1$$ vector, containing 1 in all of its entries. [TeX:] $$[\mathbf{A}]_{m n}$$ denotes the (m, n)th element of [TeX:] $$\text { A. }\|\cdot\|_{0},\|\cdot\|_{2} \text {, and }\|\cdot\|_{F} \text { denote } \ell_{0}, \ell_{2}$$ and Frobenius norm, respectively. [TeX:] $$|\mathbf{A}|$$ denotes the elementwise magnitude value of A. [TeX:] $$\mathcal{P}_{+}[\cdot]$$ denotes the projection operator that projects the argument onto [TeX:] $$\mathbb{R}_{+}.$$ The superscripts q(n),r(n) on [TeX:] $$\mathbf{Q}_{i}^{q(n)}, \mathbf{Q}_{j}^{r(n)}$$ are used to denote the matrix Q at the q, rth TP and at the nth global iteration.

Fig. 1.

System model.

II. SYSTEM MODEL

A depiction of the system model under consideration is shown in Fig. 1 and described below. Assume the network consists of a set of TPs [TeX:] $$\mathcal{Q}=\{1, \cdots, Q\},$$ known as the cooperating set, each having [TeX:] $$n_{T}$$ transmit antennas. [TeX:] $$\mathcal{I}=\{1, \cdots, I\}$$ denotes a set of UEs that should be served by subset of [TeX:] $$\mathcal{Q},$$ with each user equipped with [TeX:] $$n_{R}$$ receive antennas. Data from UEs are delivered to each TP from a serving gate through a fronthaul link (labeled in yellow) with limited capacity and transmitted through a serving link (green and blue arrows). The fundamental idea of JT-CoMP is to simultaneously serve multiple users over the same spectrum resources (time, frequency, and space) at the expense of inter-user interference (red beams). It is assumed that TPs can estimate channel through uplink pilot sequence and share it with cooperating set through X2 interface [31]. Denote channel between the qth TP and the ith user as [TeX:] $$\mathbf{H}_{i}^{q} \in \mathbb{C}^{n_{R} \times n_{T}}.$$ Furthermore, precoder matrix from the qth TP to the ith user is defined as [TeX:] $$\mathbf{F}_{i}^{q} \in \mathbb{C}^{n_{T} \times B} \text {, where } B \leq n_{T}$$ denotes number of spatial streams. The received signal for user i can thus be written as [TeX:] $$\mathbf{y}_{i}=\sum_{q} \mathbf{H}_{i}^{q} \mathbf{F}_{i}^{q} \mathbf{s}_{i}+\sum_{j \neq i} \sum_{q} \mathbf{H}_{i}^{q} \mathbf{F}_{j}^{q} \mathbf{s}_{j}+\mathbf{n}_{i},$$ where the second term represents interference and [TeX:] $$\mathbf{n}_{i}$$ denotes additive white Gaussian noise (AWGN) with known variance [TeX:] $$\sigma_{i}^{2}.$$ It is assumed that the noise is uncorrelated with the data signal [TeX:] $$\mathbf{s}_{i} \in \mathbb{C}^{B} \text { with } \mathbb{E}\left[\mathbf{s}_{i}^{H} \mathbf{s}_{i}\right]=1.$$ Major notations used throughout this paper can be found in Table II.

In coherent JT, based on the CSI shared among all cooperating set, all TPs transmit signals that are jointly precoded with prior phase alignment and tight synchronization to the target user [32]. The requirement of strict network synchronization accuracy makes it difficult and expensive to be practically deployed. The noncoherent JT has therefore received growing attention since it requires less strict network synchronization accuracy compared to its coherent counterpart. It, however, requires more complex operation at the receiver because the UEs need to successfully decode all received signals [33] and combine them such that the overall received signal power is equal to the sum of received signal powers from each TP. Nevertheless, due to its advantages, the noncoherent JT scenario is considered, which led to the problem formulation below.

III. PROBLEM FORMULATION

A. Joint TP Selection and Precoder Design Objective

In the system model considered hereafter, a user receives a noncoherent sum of multiple data streams of the useful signal transmitted by the cooperating TPs (i.e., noncoherent JT). Therefore, the UEs’ rate equals to a sum of rates from individual TPs. Due to the monotonic relationship between received signal power and SINR, direct dependence of achievable throughput on SINR, and the use of spectrum underlay transmission, the problem is formulated as maximization of a sum of the received signal power subject to the instantaneous leakage interference power being constrained below the threshold parameter [TeX:] $$I_{t h},$$ which is usually obtained from long-term statistical measures or link budget to guarantee successful communication between TP and UE [34] for all users. Notably, for noncoherent JT-CoMP, received signal power is represented as sum of powers of each TP-UE link. In contrast, for the coherent version, the power would be written as power of sum of the signals as signals are phase-aligned. However, the leakage interference constraint is still not amenable to distributed design, as will be shown in the sequel. Finally, the transmit power of the [TeX:] $$q \in \mathcal{Q}$$ TP is limited below [TeX:] $$P^{q}.$$ However, an increasing number of cooperating TPs will impact operational complexity and create extra load on the fronthaul infrastructure, which has to deliver each UEs’ data to each TP in the cooperating set. This can preclude usage of the cooperative scheme in practice. The number of precoding elements at all TPs selected to be in the cooperation set can be done by including a sparsity inducing [TeX:] $$\ell_{0}$$ regularization term in the objective. Hence, the problem can be formulated as

(1a)

[TeX:] $$\max _{\mathbf{F}_{i}^{q}, i \in \mathcal{I}, q \in \mathcal{Q}} \sum_{i} \sum_{q}\left\|\mathbf{H}_{i}^{q} \mathbf{F}_{i}^{q}\right\|^{2}-\alpha\left\|\mathbf{F}_{i}^{q}\right\|_{0}^{2}$$

(1b)

[TeX:] $$\text { s.t. } \sum_{q}\left\|\mathbf{H}_{j}^{q} \mathbf{F}_{i}^{q}\right\|^{2} \leq I_{t h}, i, j \in \mathcal{I}: i \neq j$$

(1c)

[TeX:] $$\sum_{i}\left\|\mathbf{F}_{i}^{q}\right\|^{2} \leq P^{q}, q \in \mathcal{Q},$$

where [TeX:] $$\alpha \in \mathbb{R}_{+}$$ is a regularization parameter that represents the associated cost for assigning TP to a UE and shall be referred to as the fronthaul penalty parameter. An increase in [TeX:] $$\alpha$$ will undoubtedly promote sparsity in the precoder vector, thus lowering operating cost. Although many NOMA techniques have been proposed in literature, only spectrum underlay is considered as the focus of this work is on distributed usercentric noncoherent JT-CoMP.

B. Problem Reformulation

It can be observed that (1) is not a convex problem because the objective consists of sum of convex and nonconvex terms and optimal solution cannot be found in polynomial time. This can be easily circumvented by defining [TeX:] $$\mathbf{Q}_{i}^{q} \triangleq \mathbf{F}_{i}^{q} \mathbf{F}_{i}^{q H}$$ such that received signal power for user i becomes [TeX:] $$\sum_{q} \operatorname{tr}\left(\mathbf{H}_{i}^{q} \mathbf{Q}_{i}^{q} \mathbf{H}_{i}^{q H}\right)$$ and all terms inside the constraints can be rewritten in similar fashion. Finally, relaxing the [TeX:] $$\ell_{0}$$ norm by replacing it with the [TeX:] $$\ell_{1}$$ norm and using rank relaxation on [TeX:] $$\mathbf{Q}_{i}^{q}$$ [35], i.e., removing the constraint [TeX:] $$\operatorname{rank}\left(\mathbf{Q}_{i}^{q}\right)=B,$$ the complete design problem can be written as

(2a)

[TeX:] $$\max _{\boldsymbol{\Omega}} \sum_{i} \sum_{q} \operatorname{tr}\left(\mathbf{H}_{i}^{q} \mathbf{Q}_{i}^{q} \mathbf{H}_{i}^{q H}\right)-\alpha \mathbf{1}_{n_{T}}^{T}\left|\mathbf{Q}_{i}^{q}\right| \mathbf{1}_{n_{T}}$$

Algorithm 1

Distributed consensus-based dual decomposition joint TP design and selection.

(2b)

[TeX:] $$\text { s.t. } \sum_{q} \operatorname{tr}\left(\mathbf{H}_{j}^{q} \mathbf{Q}_{i}^{q} \mathbf{H}_{j}^{q H}\right) \leq I_{t h}, i \in \mathcal{I}: j \neq i$$

(2c)

[TeX:] $$\sum_{i} \operatorname{tr}\left(\mathbf{Q}_{i}^{q}\right) \leq P^{q}, \mathbf{Q}_{i}^{q} \succeq \mathbf{0}, q \in \mathcal{Q}, i \in \mathcal{I},$$

where maximization with respect to [TeX:] $$Q$$ denotes maximization with respect to [TeX:] $$\mathbf{Q}_{i}^{q}, \forall i \in \mathcal{I}, \forall q \in \mathcal{Q}.$$ (2) is a convex programming problem and global optimal solution to it can be found using interior-point method [35]. After [TeX:] $$\mathbf{Q}_{i}^{q}$$ is obtained, [TeX:] $$\mathbf{F}_{i}^{q}$$ can be obtained via randomization procedure as described in Section VIII, which is asymptotically optimal approach for the rank recovery.

Notice that (2b) contains [TeX:] $$|\mathcal{I}|(|\mathcal{I}|-1)$$ number of constraints and [TeX:] $$|\mathcal{Q}|$$ variables, with the former usually greater than the latter. Thus, if these constraints are converted to equality constraints, there is no way to guarantee feasibility in (2), hence, this is not considered in this work.

IV. DISTRIBUTED CONSENSUS BASED ALGORITHM WITH UNLIMITED FRONTHAUL

The proposed distributed algorithm is summarized in Algorithm 1 and developed based on a combination of dual decomposition and proximal minimization with inequalitycoupled constraints that are different from the one in [21] and [22], which is of the form [TeX:] $$\sum_{q} g^{q}\left(x^{q}\right) \leq 0$$,where only dual variables are exchanged between agent (TP) q and its one-hop neighbors, i.e., members of [TeX:] $$\mathcal{N}^{q} \subseteq \mathcal{Q}.$$ This increases privacy protection as it is difficult, in the context of the current problem, to convert the dual variables to the primal ones even if someone is eavesdropping on the network. Local copies of the dual variable and auxiliary variable will have to be created and exchanged to achieve consensus amongst all the nodes.

Note that the leakage interference power constraint (2b) can only be written in this form [TeX:] $$\sum_{q} \operatorname{tr}\left(\mathbf{H}_{j}^{q} \mathbf{Q}_{i}^{q} \mathbf{H}_{j}^{q H}\right)-I_{t h} \leq 0,$$ if processing is done in a centralized manner, but not in a distributed manner, as there is no apparent way to optimally distribute [TeX:] $$I_{t h}$$ amongst all the agents without impacting system performance such as sum rate.

It is also possible to argue, since [TeX:] $$I_{t h}$$ is the interference threshold, each agent should have a priori information about [TeX:] $$I_{t h}$$ at each node. However, the goal here is to obtain a solution that is close, or identical, to the one obtained by centralized processing. Hence, being able to obtain the optimal interference threshold for each node is required. Moreover, even though it is possible for each node to determine its interference leakage power threshold, this will most likely require an exchange of location information amongst cooperating TPs, which will raise privacy concerns.

The proposed algorithm is separated into three steps, namely, local processing, communications, and a consensus step. All steps are summarized in Algorithm 1 and are described in detail in following sections.

A. Local Processing - Computing [TeX:] $$\mathbf{Q}_{j}^{q}, \lambda^{q}, \text { and } \ell^{q}$$

Define [TeX:] $$\mathcal{C}^{q} \triangleq\left\{\mathbf{Q}_{i}^{q}: \sum_{i} \operatorname{tr}\left(\mathbf{Q}_{i}^{q}\right) \leq P^{q}, \mathbf{Q}_{i}^{q} \succeq \mathbf{0}, i \in \mathcal{I}, q \in \mathcal{Q}\right\},$$ then (2) can be written as

(3a)

[TeX:] $$\max _{\mathbf{Q}} \sum_{q} \sum_{i} \operatorname{tr}\left(\mathbf{H}_{i}^{q} \mathbf{Q}_{i}^{q} \mathbf{H}_{i}^{q H}\right)-\alpha \mathbf{1}_{n_{T}}^{T}\left|\mathbf{Q}_{i}^{q}\right| \mathbf{1}_{n_{T}}$$

(3b)

[TeX:] $$\text { s.t. } \sum_{q} \operatorname{tr}\left(\mathbf{H}_{j}^{q} \mathbf{Q}_{i}^{q} \mathbf{H}_{j}^{q H}\right) \leq I_{t h}, i, j \in \mathcal{I}: j \neq i$$

(3c)

[TeX:] $$\mathbf{Q}_{i}^{q} \in \mathcal{C}^{q}, i \in \mathcal{I}, q \in \mathcal{Q}.$$

(3) is a convex optimization problem, but the existence of the coupling constraints (3b) makes it impossible to solve (3) in a distributive manner as solution at each node depends on precoder from other nodes, resulting in an inequality-coupled problem [22]. In such a problem, each agent aims to optimize a local performance criterion subject to local constraints and yet, the decision variables from each agent need to agree. In the case of (3), [TeX:] $$I_{t h}$$ is regarded as a resource that is available to all agents but its exact value is unknown at each node. [21] and [22] considered a similar problem, however, assumed resource is equally distributed. For problem (3), it means that [TeX:] $$I_{t h}$$ is evenly distributed amongst all agents as [TeX:] $$I_{t h, i j}^{q}=I_{t h} / Q, \forall q, i \neq j,$$ which represents the leakage interference power threshold for the qth agent that is used to bound the instantaneous leakage interference power to the jth user when signal is transmitted to the ith user. However, equal distribution of the resource is not optimal as shown in Fig. 3 of [36]. Therefore, an adaptive algorithm for interference leakage resource allocation is proposed herein to obtain a suboptimal [TeX:] $$I_{t h, i j}^{q},$$ which is presented in detail in Section IV-B.

Before presenting distributed optimization framework, (3) shall be rewritten in shorter form for clarity. First, denote the objective function as

[TeX:] $$f^{q}\left(\mathbf{Q}^{q}\right) \triangleq \sum_{i} \operatorname{tr}\left(\mathbf{H}_{i}^{q} \mathbf{Q}_{i}^{q} \mathbf{H}_{i}^{q H}\right)-\alpha \mathbf{1}_{n_{T}}^{T}\left|\mathbf{Q}_{i}^{q}\right| \mathbf{1}_{n_{T}}$$

with [TeX:] $$\mathbf{Q}^{q} \triangleq \operatorname{Diag}\left(\left[\mathbf{Q}_{1}^{q} \cdots \mathbf{Q}_{I}^{q}\right]\right).$$ Second, the constraint function rewritten as

[TeX:] $$\sum_{q} \operatorname{tr}\left(\mathbf{H}_{j}^{q} \mathbf{Q}_{i}^{q} \mathbf{H}_{j}^{q H}\right)-I_{t h, i j}^{q} \triangleq \sum_{q} g_{i j}^{q}\left(\mathbf{Q}_{i}^{q}\right) \leq 0.$$

Now, (3) can be written as

(4a)

[TeX:] $$\max _{\mathbf{Q}} \sum_{q} f^{q}\left(\mathbf{Q}^{q}\right)$$

(4b)

[TeX:] $$\text { s.t. } \sum_{q} g_{i j}^{q}\left(\mathbf{Q}_{i}^{q}\right) \leq 0, \forall i, j \in \mathcal{I}, j \neq i$$

(4c)

[TeX:] $$\mathbf{Q}_{i}^{q} \in \mathcal{C}^{q}, i \in \mathcal{I}, \forall q \in \mathcal{Q}.$$

To begin with distributed algorithm for (4), the problem with coupling constraint (4b) should be circumvented. This is done by forming a Lagrangian function that includes the objective and the coupling constraint. Dual variables for user pairs (i, j) of node q are denoted as [TeX:] $$\lambda^{q} \triangleq\left[\lambda_{12}^{q}, \cdots, \lambda_{1 I}^{q}, \cdots, \lambda_{i j}^{q}, \cdots, \lambda_{I(I-1)}^{q}\right]^{T}$$ and correspond to constraint functions [TeX:] $$\mathbf{g}^{q}\left(\mathbf{Q}^{q}\right) \triangleq \left[g_{12}^{q}, g_{13}^{q}, \cdots, g_{1 I}^{q}, \cdots, g_{i j}^{q}, \cdots, g_{I(I-1)}^{q}\right]^{T} \in \mathbb{R}^{I(I-1)}.$$ Therefore, the Lagrangian can be written as

(5)

[TeX:] $$\begin{aligned} \mathcal{L}(\mathbf{Q}, \boldsymbol{\lambda}) &=\sum_{q} f^{q}\left(\mathbf{Q}^{q}\right)+\lambda^{q T} \mathbf{g}^{q}\left(\mathbf{Q}^{q}\right) \\ &=\sum_{q} \mathcal{L}^{q}\left(\mathbf{Q}^{q}, \boldsymbol{\lambda}^{q}\right) \end{aligned}$$

with requirement that [TeX:] $$\boldsymbol{\lambda}_{i j}^{q}=\boldsymbol{\lambda}_{i j}^{r}, \forall q, r \in \mathcal{Q}$$ for each (i, j) user pair, so that constraint (4b) is satisfied. Notably, terms inside (5) are independent for different q and can be computed by different node independently and in parallel. In order for all nodes to reach consensus in the Lagrangian function above, [TeX:] $$\lambda^{q}$$ will be broadcasted from each node [TeX:] $$q \in \mathcal{Q},$$ to its neighboring set [TeX:] $$\mathcal{N}_{q}.$$ An auxiliary dual variable [TeX:] $$\ell^{q},$$ computed as [TeX:] $$\ell^{q} \triangleq \sum_{r \in \mathcal{N} q}\left[\mathbf{W}^{\varphi}\right]_{r q} \lambda^{r},$$ is used to achieve consensus on dual variable [TeX:] $$\lambda^{q}.$$ First, [TeX:] $$\ell^{q}$$ is used as a surrogate for [TeX:] $$\lambda^{q}$$ when solving for [TeX:] $$Q_{i}^{q}.$$ It is then used as a proximal term when solving for [TeX:] $$\lambda^{q}$$ to promote consensus. These two steps will be elaborated clearly later. The update equation for [TeX:] $$\ell^{q}$$ is a weighted sum of [TeX:] $$\lambda^{q},$$ where the weights are contained inside the consensus matrix [26] [TeX:] $$\mathbf{W} \in \mathbb{R}^{Q \times Q},$$ with properties

(6)

[TeX:] $$[\mathbf{W}]_{r q}=0 \text { if } r \notin \mathcal{N}^{q}, \mathbf{W}=\mathbf{W}^{T}, \mathbf{W} 1_{Q}=1_{Q},$$

(7)

[TeX:] $$\lim _{\varphi \rightarrow \infty} \mathbf{W}^{\varphi}=\frac{1_{Q} \mathbf{1}_{Q}^{T}}{Q}, \text { and } \rho\left(\mathbf{W}-\frac{\mathbf{1}_{Q} \mathbf{1}_{Q}^{T}}{Q}\right) \leq \nu<1 \text {, }$$

where [TeX:] $$\rho(\cdot)$$ is the spectral radius operator and ν is spectral radius upper bound value. [TeX:] $$\varphi$$ denotes the number of consensus iterations, which reflects how far messages from one node are passed to next-hop nodes. W is used to model the TP network as a connected undirected graph. As indicated in Algorithm 1, the update for [TeX:] $$\ell^{q}$$ is carried out after that of [TeX:] $$\lambda^{q}.$$

Since consensus for [TeX:] $$\lambda^{q(n)}$$ is obtained through [TeX:] $$\ell^{q},$$ hence, the Lagrangian function in (5) can be approximated at point [TeX:] $$b \partial$$ as

(8)

[TeX:] $$\mathcal{L}^{q}\left(\mathbf{Q}^{q}, \ell^{q}\right)=f^{q}\left(\mathbf{Q}^{q}\right)+\ell^{q T} \mathbf{g}^{q}\left(\mathbf{Q}^{q}\right)$$

and it will be used to find the precoder matrix. Before proceeding, it should be indicated that when iteration [TeX:] $$\operatorname{index}^{(n)}$$ appears above certain mathematical quantity, it means that this variable is fixed and obtained after corresponding optimization iteration n. When the index is absent, the corresponding quantity is a variable or contains a variable that should be optimized. For instance, [TeX:] $$\mathbf{g}^{q}$$ in (9) contains [TeX:] $$\mathrm{Q}^{q}, \text { while } \mathrm{g}^{q(n)}$$ in (11) is a quantity evaluated at iteration n at [TeX:] $$\mathrm{Q}^{q(n+1)}.$$

The local processing step is stated as follows. At the nth iteration of the proposed algorithm, part of the local processing stage will be solving for [TeX:] $$\mathbf{Q}_{i}^{q}$$ in the next iteration and will require maximizing [TeX:] $$\mathcal{L}^{q}\left(\mathbf{Q}^{q}, \ell^{q}\right)$$ at a fixed point [TeX:] $$\ell^{q(n)}$$ to obtain

(9)

[TeX:] $$\mathbf{Q}^{q(n+1)}=\underset{\mathbf{Q}^{q} \in \mathcal{C}^{q}}{\arg \max } f^{q(n)}\left(\mathbf{Q}^{q}\right)+\ell^{q(n) T} \mathrm{~g}^{q}\left(\mathbf{Q}^{q}\right).$$

In the simulation, [TeX:] $$\ell^{q(0)}$$ was initialized to be 4. The other part of the local processing involves solving for the dual variable [TeX:] $$\lambda^{q}$$ which can be done by minimizing the Lagrange dual function [TeX:] $$d\left(\boldsymbol{\lambda}^{q}\right) \triangleq \max _{\mathbf{Q}^{q} \in \mathcal{C}^{q}} \mathcal{L}^{q}\left(\mathbf{Q}^{q}, \boldsymbol{\lambda}^{q}\right)$$ with respect to [TeX:] $$\lambda^{q}.$$ This dual function will be augmented with a proximal term to incentivize the consensus of the dual variables. Specifically, the new dual problem is formulated as

(10)

[TeX:] $$\min _{\boldsymbol{\lambda}^{q} \succeq 0} \max _{\mathbf{Q}^{q} \in \mathcal{C}^{q}}\left\{\mathcal{L}^{q}\left(\mathbf{Q}^{q}, \boldsymbol{\lambda}^{q}\right)+\frac{\left\|\boldsymbol{\lambda}^{q}-\ell^{q(n)}\right\|_{2}^{2}}{2 c^{(n)}}\right\}$$

for each TP, where the second term in the inner objective is a proximal term. [TeX:] $$c^{(n)}$$ is a scalar that balances emphasis between objective and consensus on [TeX:] $$\lambda^{q}.$$ The update of [TeX:] $$\lambda^{q(n)},$$ denoted as [TeX:] $$\boldsymbol{\lambda}^{q(n+1)},$$ can be obtained as

(11)

[TeX:] $$\lambda^{q(n+1)}=\underset{\boldsymbol{\lambda}^{q} \geq 0}{\arg \min } \boldsymbol{\lambda}^{q T} \mathrm{~g}^{q(n)}+\frac{\left\|\boldsymbol{\lambda}^{q}-\ell^{q(n)}\right\|_{2}^{2}}{2 c^{(n)}},$$

which requires the computation of [TeX:] $$\mathrm{g}^{q(n)} \triangleq \operatorname{tr}\left(\mathbf{H}^{q} \mathbf{Q}^{q(n+1)} \mathbf{H}^{q H}\right)-I_{t h, i j}^{q(n)}.$$ Using the fact that [TeX:] $$\lambda^{q} \succeq 0,$$ then [TeX:] $$\lambda^{q}$$ for the next iteration is obtained via differentiation as

(12)

[TeX:] $$\lambda^{q(n+1)}=\mathcal{P}_{+}\left[\ell^{q(n)}-e^{(n)} \mathrm{g}^{q(n)}\right],$$

where [TeX:] $$c^{(n)}$$ can be regarded as an (adaptive) step size at the nth iteration and [TeX:] $$\mathbf{g}^{q(n)}$$ is a subgradient of the objective function in (11). Finally, [TeX:] $$\ell^{q(n)}$$ is updated according to

(13)

[TeX:] $$\ell^{q(n+1)}=\sum_{r \in \mathcal{N}^{q}}\left[\mathbf{W}^{\varphi}\right]_{r q} \lambda^{r(n+1)}.$$

The local updates of [TeX:] $$\mathbf{Q}_{i}^{q}, \lambda_{i}^{q} \text { and } \ell_{i}^{q}$$ are summarized in Step 1 and 3 of Algorithm 1, respectively.

B. Adaptive Strategy for Computing [TeX:] $$I_{t h, i j}^{q}$$

Since the coupling constraint [TeX:] $$\sum_{q} g_{i j}^{q}\left(\mathbf{Q}_{i}^{q}\right) \leq 0$$ must have prior knowledge about [TeX:] $$I_{t h, i j}^{q}$$ and even distribution of [TeX:] $$I_{t h}$$ is not optimal, adaptive resource allocation algorithm is proposed to allocate [TeX:] $$I_{t h}$$ across all agents which achieves better performance than the even distribution scheme in [21], [22], [24]. The issue of distributing resource for the coupling sum constraint was considered in [37], in which knowledge about the lower and upper bound of the second-order derivative of the objective is required. However, this approach cannot be applied to (4) as the objective is not differentiable. Even if the penalty term in (3a) is removed, the second-order derivative equals to zero. Besides, it is unclear how the scheme in [37] can be applied with additional coupling constraints as is the case in Section VI.

To address problem above, novel resource allocation proposed in [36] is used, while the scheme’s optimality is proven in this work. The procedure is illustrated as follows. Initialization is done with [TeX:] $$I_{t h, i j}^{q(0)}=I_{t h} / Q,$$ i.e., equal distribution. Define the slack leakage interference power as [TeX:] $$S_{i j}^{q(n+1)} \triangleq I_{t h, i j}^{q(n)}-L_{i j}^{q(n+1)}$$ where [TeX:] $$L_{i j}^{q(n+1)} \triangleq \operatorname{tr}\left(\mathbf{H}_{j}^{q} \mathbf{Q}_{i}^{q(n+1)} \mathbf{H}_{j}^{q H}\right)$$ is the instantaneous leakage interference power. [TeX:] $$S_{i j}^{q(n+1)}$$ can be viewed as the amount of excess leakage interference power (resource) that is not used by the qth TP while serving the ith user. Then a new value for the leakage threshold can be updated as

(14)

[TeX:] $$I_{t h, i j}^{q(n+1)} \triangleq L_{i j}^{q(n)}+\sum_{r \in \mathcal{N} q}[\mathbf{W}]_{r q} S_{i j}^{r(n+1)},$$

which corresponds to Step 3 of Algorithm 1. Notice that the slack leakage interference power [TeX:] $$S_{i j}^{q(n+1)}$$ is transmitted from the qth agent to all members of [TeX:] $$\mathcal{N}^{q}$$ in Step 2 in Algorithm 1. After [TeX:] $$I_{t h, i j}^{q(n+1)}$$ is computed at the nth iteration, it will be used in (9) to compute [TeX:] $$g_{i j}^{q}\left(\mathbf{Q}_{i}^{q}\right)=\operatorname{tr}\left(\mathbf{H}_{i}^{q} \mathbf{Q}_{i}^{q} \mathbf{H}_{j}^{q H}\right)-I_{t h, i j}^{q(n+1)},$$ which will be used to obtain [TeX:] $$\mathbf{Q}^{q(n+2)}$$ at the next iteration. Notice that all nodes [TeX:] $$q \in \mathcal{Q}$$ will be running the above steps in parallel. Define [TeX:] $$\mathcal{L}_{n a}^{(n)} \triangleq\left\{S_{i j}^{q(n)} \mid S_{i j}^{q(n)}>0\right\} \text { and } \mathcal{L}_{a}^{(n)} \triangleq\left\{S_{i j}^{q(n)} \mid S_{i j}^{q(n)}=0\right\}.$$ The optimality of the proposed adaptive resource allocation scheme is stated in the theorem below.

Theorem 1. Adaptive interference leakage allocation scheme converges, i.e.,

[TeX:] $$\lim _{n \rightarrow \infty} \sum_{q} S_{i j}^{q(n)}=S_{i j}^{\infty} \quad \forall i, j \in \mathcal{I}, j \neq i.$$

Moreover, if leakage interference constraint is active at the optimal solution of (3), then

[TeX:] $$\lim _{n \rightarrow \infty} \sum_{q} S_{i j}^{q(n)}=0 \quad \forall i, j \in \mathcal{I}, j \neq i.$$

That is, [TeX:] $$\lim _{n \rightarrow \infty} \mathcal{L}_{n a}^{(n)}$$ will be empty.

See Appendix A for proof.

V. ALGORITHM CONVERGENCE

In this section, convergence of the proposed primal-dual decomposition algorithm is presented. Theorems 2–4 shown herein are similar to those in [21] and state the dual convergence and optimality (Theorem 2 and Theorem 3, respectively), and then, primal objective optimality (Theorem 4). Notably, these optimality bounds are different from [21] due to two reasons. First, the proposed algorithm uses an adaptive step size [TeX:] $$c^{(n)}$$ instead of fixed step size. Second, in [21], consensus step is done prior to projection, which requires computation of sets [TeX:] $$D_{\mu} \text { and } D_{\mathbf{G}}$$over which projections are done. For definitions of these sets, please refer to the original work. On the contrary, in the proposed algorithm, projection is done prior to consensus and hence avoids the difficulty and requires a new proof.

Convergence results described in a sequel hold under the assumptions below.

Assumption 1. Between any two nodes [TeX:] $$q, r \in \mathcal{Q}$$ in a network there exists a path of edges.

Assumption 2. [TeX:] $$\left\{c^{(n)}\right\}_{n \geq 0}$$ is a nonincreasing sequence of positive reals and satisfies

[TeX:] $$\lim _{n \rightarrow \infty} c^{(n)}=0 ; \quad \sum_{n=0}^{\infty} c^{(n)}=\infty ; \quad \sum_{n=0}^{\infty}\left(c^{(n)}\right)^{2}<\infty .$$

One possible choice for [TeX:] $$e^{(n)}$$ satisfying Assumption 2 is [TeX:] $$c^{(n)}=\gamma /(n+1)$$ for some [TeX:] $$\gamma>0.$$ Notice that choice of [TeX:] $$e^{(n)}$$ that satisfies Assumption 2 guarantees [TeX:] $$\lim _{n \rightarrow \infty}\left\|\boldsymbol{\lambda}^{(n)}-\boldsymbol{\ell}^{(n)}\right\|_{2}^{2}=0.$$

Assumption 3. The difference between consensus dual variables after the first iteration of Algorithm 1 is bounded as [TeX:] $$\left\|\ell^{q(1)}-\ell^{r(1)}\right\|_{2} \leq \beta^{(1)}, \forall q, r \in \mathcal{Q} .$$

Assumption 4. Constraint functions [TeX:] $$g_{i j}^{q}\left(\mathbf{Q}_{i}^{q}\right), \forall \mathbf{Q}_{i}^{q},$$ are finite, i.e.,

[TeX:] $$\left\|g_{i j}^{q}\left(\mathbf{Q}_{i}^{q}\right)\right\|_{2} \leq G, \text { for any } \mathbf{Q}_{i}^{q} \in \mathcal{C}^{q}.$$

Assumption 4 defines a bound on the subgradient of the qth Lagrange dual function in (5), that exists due to convexity nature of the dual problem. From (3), the value of G can be upper bounded by [TeX:] $$P^{q}\left\|\mathbf{H}_{j}^{q}\right\|_{F}^{2}.$$

Assumption 5. Lagrangian multipliers and related variables are bounded as following

(15)

[TeX:] $$\left\|\boldsymbol{\lambda}^{q}\right\|_{2}^{2} \leq \Lambda,\left\|\ell^{q}\right\|_{2}^{2} \leq \Lambda,\|\mathbf{y}\|_{2}^{2} \leq \Lambda,$$

for some nonnegative constant Λ. This assumption follows naturally from the assumption of the bounded dual function. y is an auxiliary dual variable and is defined in Appendix D to be used for the convergence proof.

Theorem 2. Dual variable convergence

Define [TeX:] $$\bar{\ell}^{(n)} \triangleq \frac{1}{Q} \sum_{q} \ell^{q(n)}$$ as the mean value of the consensus dual variables. Assume Assumptions 1-5 hold, W satisfies the conditions in (6) and ν is defined as in (7). Let [TeX:] $$p:=\frac{\nu^{\delta} \beta^{(1)}}{\beta^{(1)}+c^{(n)} G}$$ There exists a number of consensus iterations [TeX:] $$\bar{\varphi},$$ such that if [TeX:] $$\varphi \geq \bar{\varphi}+\delta, \delta \geq 0, \text { then } \ell^{q}$$ reaches consensus as

[TeX:] $$\left\|\ell^{q(n)}-\bar{\ell}^{(n)}\right\|_{2} \leq 2 p^{n-1} \nu^{\delta} \sigma \beta^{(1)}+2 p c^{(n)} G \frac{1-p^{n-1}}{1-p}, \forall q \in \mathcal{Q}.$$

Moreover, [TeX:] $$\bar{\varphi}=\frac{\log \left(\beta^{(1)}\right)-\log \left(4 Q\left(\beta^{(1)}+c^{(n)} G\right)\right)}{\log (\nu)}+\delta \triangleq \bar{\varphi}+\delta.$$

Theorem 3. Dual objective optimality

Let [TeX:] $$\lambda$$ be generated with Algorithm 1, which corresponds to [TeX:] $$\epsilon$$subgradient algorithm [38]. Given that the Lagrange dual function [TeX:] $$d(\boldsymbol{\lambda}) \triangleq \arg \max _{\mathbf{Q}} \mathcal{L}(\mathbf{Q}, \lambda)$$ is bounded, then

[TeX:] $$\liminf _{n \rightarrow \infty} d\left(\boldsymbol{\lambda}^{(n)}\right)=d^{\star}+Q \zeta,$$

where d⋆ denotes the optimal value of [TeX:] $$d(\boldsymbol{\lambda}) \text { and } \zeta$$ is a constant inside the [TeX:] $$\epsilon-$$subgradient algorithm.

Theorem 4. Primal objective convergence

(a) A lower bound on the primal cost is given by

[TeX:] $$f\left(\mathrm{Q}^{q(n)}\right) \geq f^{\star}-\frac{\Lambda^{2}}{2 n c^{(n)} / Q}-e^{(n)};$$

(b) An upper bound on the primal cost is given by

[TeX:] $$\begin{aligned} &\quad f\left(\mathbf{Q}^{q(n)}\right) \leq f^{\star}+\frac{5 \Lambda^{2}}{2 n c^{(n)} / Q}+e^{(n)}, \text { where } \\ &\begin{aligned} e^{(n)}=\frac{c^{(n)} Q(G+\tau)^{2}}{2} &+Q \tau \Lambda \\ &+Q\left(\beta^{(1)}(4 G+2 \tau)+\zeta\right). \end{aligned} \end{aligned}$$

Proofs of Theorems 1, 2, 3, and 4 can be found in Appendices A, C, D, and E, respectively.

Algorithm 2

Distributed augmented consensus-based dual decomposition for joint fronthaul-constrained TP design and selection.

VI. EXTENSION TO LIMITED FRONTHAUL SCENARIO

The distributed design algorithm described below extends the one proposed in [39] for the case of centralized joint precoder and selection design. Wireless and fiber optics are expected to serve as fronthaul between TPs and backhaul between the cloud processing units, with the fronthaul capacity usually lower than that of the backhaul, which will bound the transmission rate to the users. Thus, it is imperative to include this constraint in the joint TP selection and precoder design problem. The proposed distributed algorithm is summarized in Algorithm 2 and derivations are explained below. Note that the main differences between Algorithm 1 and 2 are that Step 1 is modified and an extra step is added to solve the limited-fronthaul problem. Fronthaul capacity constraint can be mathematically formulated as constraining the sum of rates offered to different users by the fronthaul capacity bound [TeX:] $$C^{q}.$$ That is, [TeX:] $$\sum_{i} \log \left(1+\operatorname{SINR}_{i}^{q}\right) \leq C^{q}, \text { where } \operatorname{SINR}_{i}^{q}$$ denotes the signal-to-interference-plus-noise ratio in the link between the qth TP and ith user. Writing the SINR explicitly, the constraint becomes

(16)

[TeX:] $$\sum_{i} \log \left(1+\frac{\operatorname{tr}\left(\mathbf{H}_{i}^{q} \mathbf{Q}_{i}^{q} \mathbf{H}_{i}^{q H}\right)}{\sigma^{2}+\sum_{j \neq i} \sum_{r} \operatorname{tr}\left(\mathbf{H}_{i}^{r} \mathbf{Q}_{j}^{r} \mathbf{H}_{i}^{r H}\right)}\right) \leq C^{q},$$

[TeX:] $$\forall q \in \mathcal{Q}.$$ To shorten notations, define [TeX:] $$R_{i}^{q} \triangleq \operatorname{tr}\left(\mathbf{H}_{i}^{q} \mathbf{Q}_{i}^{q} \mathbf{H}_{i}^{q H}\right)$$ and [TeX:] $$I_{i j}^{r} \triangleq \operatorname{tr}\left(\mathbf{H}_{i}^{r} \mathbf{Q}_{j}^{r} \mathbf{H}_{i}^{r H}\right),$$ so that (16) becomes

(17)

[TeX:] $$\sum_{i} \log \left(1+R_{i}^{q} /\left(\sigma^{2}+\sum_{j \neq i} \sum_{r} I_{i j}^{r}\right)\right) \leq C^{q},$$

[TeX:] $$\forall q \in \mathcal{Q}.$$ The SINR term in (17) can be rewritten using exponential function as

(18)

[TeX:] $$\sum_{i} \log \left\{1+\exp \left[\log \left(R_{i}^{q}\right)-\log \left(\sigma^{2}+\sum_{i \neq i} \sum_{r} I_{i j}^{r}\right)\right]\right\} \leq C^{q},$$

where exp and log are inserted into the numerator. Notice that the expression is still nonconvex because the exponential function is nondecreasing and it requires its argument to be convex for the constraint to be convex, which is not the case. To circumvent this problem, affine approximation around an arbitrary point [TeX:] $$x_{0} \text { of } \log (x)$$ is used so that [TeX:] $$\log (x) \approx \log \left(x_{0}\right)+\left(x-x_{0}\right) / x_{0}$$ can be used as a convex upper bound of the original constraint. [TeX:] $$x_{0}$$ can be chosen to be [TeX:] $$\mathbf{Q}_{i}^{q(m-1)},$$ i.e., the variable [TeX:] $$Q_{i}^{q}$$ at the previous (m − 1)th SCA iteration, which can be regarded as a constant in the mth SCA iteration. Hence, the fronthaul constraint at the mth SCA iteration becomes

(19)

[TeX:] $$\begin{aligned} \sum_{i} \log \{1+\exp & {\left[\log \left(\epsilon+R_{i}^{q(m-1)}\right)+\frac{\left(R_{i}^{q}-R_{i}^{q(m-1)}\right)}{\epsilon+R_{i}^{q(m-1)}}\right.} \\ &\left.\left.-\log \left(\sigma^{2}+\sum_{j \neq i} \sum_{r} I_{i j}^{r}\right)\right]\right\} \leq C^{q}, \end{aligned}$$

where [TeX:] $$R_{i}^{q(m-1)}=\operatorname{tr}\left(\mathbf{H}_{i}^{q} \mathbf{Q}_{i}^{q(m-1)} \mathbf{H}_{i}^{q T}\right) \text { and } \epsilon=10^{-8}$$ is added to avoid numerical problem since the algorithm initializes with [TeX:] $$\mathbf{Q}_{i}^{q(0)}=\mathbf{0}, \forall i \in \mathcal{I}, \forall q \in \mathcal{Q}.$$ Since the argument inside exp is convex, the function is log-convex, hence, the left-hand side of (19) is convex. Moreover, the proposed SCA results in an upper bound for the original constraint, hence, it is a valid substitute. The proposed scheme converges because at each iteration, the feasible region of the problem shrinks, leading to monotonic increase in the objective.

However, in the distributed optimization scenario, (19) is a coupling constraint since the interference term in (19) forces the constraint for the qth TP to be dependent on the rth precoder, for [TeX:] $$\forall r \neq q.$$ This problem can be circumvented by replacing the interference term with the one from the previous (global) (n − 1)th iteration. To ensure feasibility and convergence, the ADMM scheme is applied, and the constraint [TeX:] $$\mathbf{Q}_{i}^{q}=\mathbf{Q}_{i}^{q(n-1)}$$ is added, thus converting the problem into

(20)

[TeX:] $$\begin{aligned} &\max _{\mathbf{Q}} \sum_{q} \sum_{i} \operatorname{tr}\left(\mathbf{H}_{i}^{q} \mathbf{Q}_{i}^{q} \mathbf{H}_{i}^{q H}\right)-\alpha \mathbf{1}_{n_{T}}^{T}\left|\mathbf{Q}_{i}^{q}\right| \mathbf{1}_{n_{T}} \\ &\text { s.t. } \sum_{q} \operatorname{tr}\left(\mathbf{H}_{j}^{q} \mathbf{Q}_{i}^{q} \mathbf{H}_{j}^{q H}\right) \leq I_{t h}, i, j \in \mathcal{I}: j \neq i \\ &\sum_{i} \log \left\{1+\exp \left[\log \left(\epsilon+R_{i}^{q(m-1)}\right)+\frac{\left(R_{i}^{q}-R_{i}^{q(m-1)}\right)}{\epsilon+R_{i}^{q(m-1)}}\right.\right. \\ &\left.\left.\quad-\log \left(\sigma^{2}+\sum_{j \neq i} \sum_{r} I_{i j}^{r(n-1)}\right)\right]\right\} \leq C^{q}, \\ &\mathbf{Q}_{i}^{q}=\mathbf{Q}_{i}^{q(n-1)}, \mathbf{Q}_{i}^{q} \in \mathcal{C}^{q}, i \in \mathcal{I}, q \in \mathcal{Q}, \end{aligned}$$

where [TeX:] $$I_{i j}^{r(n-1)} \triangleq \operatorname{tr}\left(\mathbf{H}_{i}^{q} \mathbf{Q}_{j}^{r(n-1)} \mathbf{H}_{i}^{q H}\right)$$ denotes the interference from the rth precoder to the ith user while transmitting to the jth user at the (n − 1)th consensus iteration. Similar to [36], the above problem can be modified by adding a quadratic penalty term into the objective and express (20) as

(21)

[TeX:] $$\begin{aligned} &\max _{\mathbf{Q}} \sum_{q} f^{q}\left(\mathbf{Q}^{q}\right)+\rho\left\|\mathbf{Q}_{i}^{q}-\mathbf{Q}_{i}^{q(n-1)}\right\|_{F}^{2} \\ &\text { s.t. } \sum_{q} g_{i j}^{q}\left(\mathbf{Q}_{i}^{q}\right) \leq 0, \forall i, j \in \mathcal{I}, j \neq i \\ &\mathbf{Q}_{i}^{q} \in \tilde{\mathcal{C}}^{q}, \mathbf{Q}_{i}^{q}=\mathbf{Q}_{i}^{q(n-1)}, i \in \mathcal{I}, \forall q \in \mathcal{Q}. \end{aligned}$$

Notice that the fronthaul constraint in (19) has now be absorbed into the local constraint set [TeX:] $$\mathcal{C}^{q},$$ which is denoted as [TeX:] $$\widetilde{\mathcal{C}}^{q}.$$ (21) can be solved using a scheme similar to Algorithm 1. That is, [TeX:] $$\mathbf{Q}^{q(n+1)}$$ can be computed as

(22)

[TeX:] $$\begin{aligned} &\mathbf{Q}^{q(n+1)}=\arg \max _{\mathbf{Q}^{q} \in \overline{\mathcal{C}}^{q}} f^{q}\left(\mathbf{Q}^{q}\right)+\ell^{q(n) T} \mathbf{g}^{q(n)}\left(\mathbf{Q}^{q}\right)+ \\ &\sum_{i} \operatorname{tr}\left(\mathbf{Q}_{i d}^{q H}\left(\mathbf{Q}_{i}^{q}-\mathbf{Q}_{i}^{q(n-1)}\right)\right)+\rho\left\|\mathbf{Q}_{i}^{q}-\mathbf{Q}_{i}^{q(n-1)}\right\|_{F}^{2} \end{aligned}$$

using the SCA method, where [TeX:] $$\mathrm{Q}_{i d}^{q} \succeq 0$$ is the Lagrangian multiplier associated with equality constraint in (20). The Lagrange multiplier can be updated as [40]

(23)

[TeX:] $$\mathbf{Q}_{i d}^{q(n+1)}=\mathbf{Q}_{i d}^{q(n)}+\rho\left(\mathbf{Q}_{i}^{q(n)}-\mathbf{Q}_{i}^{q(n-1)}\right)$$

The overall proposed scheme is summarized in Algorithm 2.

VII. ALGORITHM OVERHEAD AND COMPLEXITY

To justify the proposed distributed design schemes outlined in Algorithm 1 and 2, a description of complexity and system overhead is provided. For the proposed iterative schemes, it is assumed that the channels can remain static during a transmission time interval (TTI), which can be expected for static environments. The timing bottleneck for Algorithm 1 is processing delay in the local optimization step, which consists of computation time of SDP convex programming problem of size equals to the number of transmit antennas multiplied by the number of UEs. The feasibility of time complexity of the convex programming techniques for signal processing and mobile communications is further supported in [42], which states that it is possible to solve modest-sized convex optimization problems on microsecond time scales. Despite strong dependence on the solver implementation, it is important to understand the advantages of the proposed method over the centralized one proposed in [39]. For centralized approach, the computational complexity of the IPM for SDP grows at complexities [TeX:] $$O\left(n^{3}\right) \sim O\left(n^{6}\right)$$ [41] per iteration depending on the problem structure. The amount of transferred data is [TeX:] $$O(n) \text {, where } n=n_{T}^{2} Q I \text { with } n_{T}$$ being the number of transmit antennas, Q is the number of TPs, and I is the number of UEs. Using the distributed method, the computational complexity per iteration is [TeX:] $$O\left(m^{3}\right) \sim O\left(m^{6}\right), \text { with } m=n_{T}^{2} I,$$ and the amount of data transferred is O(I). As a result, the computational complexity and amount of transferred data of the algorithm do not scale with the number of TPs. Even though the distributed algorithm may take more iterations to converge than IPM, this difference in practice is less than an order of magnitude (30–40 iterations for centralized IPM, 80–300 iterations of the distributed algorithm). Therefore, there is a clear tradeoff between the two, with distributed algorithm suited more for cell-free cooperative MIMO systems.

The aforementioned analysis also holds for Algorithm 2 with the added complexity that an SDP problem needs to be solved during each SCA iteration. Hence, the complexity of local optimization in Algorithm 2 is multiplied by the number of SCA approximations being done compared with Algorithm 1.

Notably, the proposed distributed algorithms require nodes exchange information in Step 2 of Algorithm 1 and 2, which creates some overhead. Particularly, each node sends a realvalued dual variable of size I(I − 1) to its’ neighbors. Given that the X2 infrastructure existing between TPs being a high throughput optical cable, which is a must to support a feasible CoMP system, expected transmission delay is negligible compared to other steps of the algorithm.

VIII. RANDOMIZATION PROCEDURE

The randomization procedure described herein is invoked in Step 4 and 5 in Algorithm 1 and 2, respectively. For Algorithm 1, [TeX:] $$\mathbf{Q}_{i}^{q}$$ is obtained from (4), [TeX:] $$\mathbf{F}_{i}^{q}$$ with rank B is recovered distributively by using the randomization procedure similar to the one in [28]. Specifically, Nrand number of [TeX:] $$\mathbf{F}_{i}^{q}=1 / B \mathbf{U}_{\mathbf{Q}_{i}^{q}} \mathbf{\Lambda}_{\mathbf{Q}_{i}^{q}}^{1 / 2} \mathbf{E}$$ are generated. [TeX:] $$\mathbf{U}_{\mathbf{Q}_{i}^{q}} \text { and } \Lambda_{\mathbf{Q}_{i}^{q}}^{1 / 2}$$ are the eigenvector and eigenvalue matrix of [TeX:] $$\mathrm{Q}_{i}^{q},$$ respectively. [TeX:] $$[\mathbf{E}]_{m l}=e^{j \theta_{m l}} \text { and } \theta_{m l}$$ is independent and identically distributed uniformly on [TeX:] $$[0,2 \pi],$$ with [TeX:] $$m=1, \cdots, n_{T}$$ and [TeX:] $$l=1, \cdots, B.$$ Only those [TeX:] $$\mathbf{F}_{i}^{q} \mathrm{~s}$$ that satisfy the leakage interference constraints, i.e., [TeX:] $$\left\|\mathbf{H}^{q} \mathbf{F}^{q(n)}\right\|-I_{t h, i j}^{q(n)} \leq 0,$$ are retained. The remaining [TeX:] $$\mathbf{F}_{i}^{q} \text { 's }$$ are compared, and the one which maximizes the objective is the optimal precoder.

It is, however, more difficult to retrieve [TeX:] $$\mathbf{F}_{i}^{q}$$ distributively from [TeX:] $$\mathbf{Q}_{i}^{q(n)}$$ in Algorithm 2 after convergence because [TeX:] $$\mathbf{F}_{i}^{q}$$ has to satisfy the limited fronthaul constraint, which requires knowing the interference power terms from of all agents at each agent. This procedure can be done distributively similar to Algorithm 1. It, however, requires additional system overhead between TPs to exchange fronthaul constraint related data. Fortunately, in simulation, the limited fronthaul capacity constraint will often be satisfied after [TeX:] $$\mathbf{F}_{i}^{q}$$ is obtained because the SINR will remain relatively unchanged since both received signal and interference power will decrease by relatively the same amount. Although extra communication overhead will be incurred by the exchange of precoders to satisfy the limited fronthaul constraint, only a few trials of randomization are required.

IX. NUMERICAL RESULTS

A. Simulation Environment

An example of the simulated network is shown in Fig. 2. Each service sector contains a TP (green triangle) in the middle, surrounded by three or five UEs (red squares). A service sector herein looks similar to a cell, but it is an area where users can be serviced by multiple TPs in a cooperating set, and a single TP can serve multiple users in different sectors. UEs are placed at the edge of each of these service sectors to promote JT and to emulate cell-free users in 5G NR networks using JT-CoMP. The channel is constructed as [TeX:] $$\mathbf{H}_{i}^{q}=\mathbf{H}_{w}^{i} \mathbf{G}_{q}^{i}, \text { where } \mathbf{H}_{w}^{i} \in \mathbb{C}^{n_{T} \times n_{T}}$$ denotes the channel matrix with i.i.d. zero-mean complex Gaussian entries, [TeX:] $$\mathbf{G}_{q}^{i} \text { is an } n_{T} \times n_{T}$$ diagonal matrix whose nth diagonal element equals [TeX:] $$\sqrt{10^{L_{0} / 10}\left(d / d_{0}\right)^{-\alpha} 10 \leqslant n / 10}, \text { with } L_{0}$$ denoting the reference path loss, d and [TeX:] $$d_{0}$$ denoting the distance between the transmitter and receiver and the reference distance, respectively, [TeX:] $$\alpha$$ is the path loss exponent, and the shadow fading [TeX:] $$10^{\xi_{n} / 10}$$ is independent of the path loss [TeX:] $$\left(d / d_{0}\right)^{-\alpha}$$ and is a log-normal distributed random variable, with [TeX:] $$\xi_{n}$$ being a normal distributed random variable with zero mean and variance equals to 10 dB as indicated in Table III. The transmit power and instantaneous interference leakage power threshold were chosen similar to [39] and [28]. The choice of [TeX:] $$I_{t h}=10^{-4}$$ W is because the average SINR in cellular systems varies from −5 to 20 dB. The average power received from a reference signal has a typical range from −40 dBm to −140 dBm, which equals [TeX:] $$10^{-4} \text { to } 10^{-14} \mathrm{~W}.$$ Therefore, choosing [TeX:] $$I_{t h}=10^{-4} \mathrm{~W}$$ and having received signal power around [TeX:] $$10^{-4} \mathrm{~W}$$ will make the SINR to be around 0 dB, which is typical SINR QoS value in the references. Other network and algorithm parameters are summarized in Table III and are selected according to the 3GPP 5G NR user equipement radio transmission and reception specification document [43]. CVX [47] with MOSEK [48] was used to obtain solution to (3) and (20) on an Intel Core i7 4790 4GHz machine with 32GB RAM. Centralized algorithm in [39] is used as a baseline for the proposed algorithm. In addition, joint precoder design and user selection works in [22], [24], and [46] are used for performance comparison.

Fig. 2.

Example of simulated networks: 3 (left) and 5 (right) UEs per sector.

Table III

SIMULATED NETWORK PARAMETERS.

# of TPs/UEs	Q=7, I=21, 30
# of Antennas	[TeX:] $$n_{T}=4, n_{R}=2$$
[TeX:] $$I_{t h}$$	[TeX:] $$10^{-4} \mathrm{~W}$$
[TeX:] $$P^{q}$$	1 W
[TeX:] $$\sigma^{2}$$	[TeX:] $$-33 dB$$
[TeX:] $$L_{0} \text { Ref loss (dB) }$$	60
[TeX:] $$d_{0} \text { Ref distance }(\mathrm{m})$$	10
PL exponent	3.76
Shadowing	10 dB
Tx antenna gain	10 dB
Carrier frequency	2.5 GHz

With the exception of Fig. 6, all simulation results below are based on the network with 3 UEs per sector shown in Fig. 2.

Fig. 3.

Two instances of Algorithm 1 convergence for different connection density with unlimited fronthaul.

B. Convergence Behavior

In particular interest of this work is to show the impact that network connectivity and the number of message passing have on algorithm convergence for unlimited fronthaul capacity. Fig. 3 shows convergence behavior when each TP is connected to two and three nearest neighbors, labeled as 2NN and 3NN, respectively. A fully connected network is used as a performance benchmark, and is labeled as FC in Fig. 3. For each node q, weights [TeX:] $$[\mathbf{W}]_{g r}, r \in \mathcal{N}^{q}$$ are selected to be equal [TeX:] $$1 /\left|\mathcal{N}^{q}\right|.$$ Number of consensus iterations was set to [TeX:] $$\varphi=1.$$

It can be seen from the results that the algorithm converges faster for a more densely connected network, as stated in Theorem 1. To highlight the effect of limited fronthaul capacity, [TeX:] $$C^{q}=1,2,3,4$$ bits/Hz/sec were simulated. The first value corresponds to worst-case microwave fronthaul capacity in [44] and the last corresponds to the situation when the fronthaul constraint becomes looser than the leakage interference power constraint so that the latter becomes active. Convergence results are shown in Fig. 4. Comparing to the unlimited fronthaul capacity scenario, it can be observed from Fig. 4 that the objective value of the distributed design, regardless of the density of the connection, will exceed that of the centralized in the first few iterations, which does not occur in the unlimited fronthaul case. This is due to the fact that before convergence is reached, interference power tends to oscillate a lot due to the use of the SCA in the interference term in (20), making fronthaul approximation imprecise and producing infeasible solutions. However, as the algorithm proceeds, a solution close to the centralized one can be achieved. Next, algorithm convergence was tested for a different number of consensus iterations. Fig. 5 shows convergence behavior when 1, 2, and 13 number of consensus iterations are used with different connection density. The latter case corresponds to the situation where information of each node is passed to all other nodes in the network. The centralized and distributed design with [TeX:] $$\varphi=\infty$$ are used as performance benchmarks. The latter is equivalent to the FC case (with equal edge weights) in the previous experiment, which holds from (7). From experimental data, it can be observed that convergence becomes faster as the number of consensus iterations increases.

Fig. 4.

Two instances of Algorithm 2 convergence for different connection density, [TeX:] $$C q=2 \mathrm{bits} / \mathrm{sec} / \mathrm{Hz}.$$

Fig. 5.

Two instances of Algorithm 1 convergence for different number of consensus iterations (message passing).

C. Performance with Different [TeX:] $$\alpha \text { and } C^{q}$$

Next, the performance of the proposed algorithms in terms of average objective value and sum rate (average over the number of channel realizations) are compared to the previously proposed centralized design in [39]. Moreover, in order to show performance degradation of the proposed algorithms due to unknown interference leakage threshold distribution per TP, i.e., [TeX:] $$I_{t h, i j}^{q}, q \in \mathcal{Q}, i, j \in \mathcal{I}: i \neq j,$$ the proposed algorithm with a priori knowledge of [TeX:] $$I_{t h, i j}^{q}$$ is simulated and is labeled as “GNDTR” (ground truth) in the subsequent figures. The ground truth value of [TeX:] $$I_{t h, i j}^{q}$$ is obtained from the centralized solution. Fig. 6 shows the performance difference for FC, 2NN, and 3NN topologies and FC with ground truth [TeX:] $$I_{t h, i j}^{q}.$$ Also, performance for different network densities is assessed, for 3 and 5 UEs per sector, respectively. Each point is obtained based on 500 channel realizations with random user placement in each iteration. It can be observed that the average performance degrades for less densely connected networks and there is also some degradation due to the use of the adaptive resource allocation scheme proposed in Section IV-B, which supports the theoretical results above. It can also be seen that the proposed adaptive resource adaptive scheme performs well compared to the ground truth case and causes relatively small degradation. Even though ground truth [TeX:] $$I_{t h, i j}^{q}$$ is obtained using the solution from the centralized design, it slightly underperforms the centralized design due to the suboptimality of the primal solution as stated in Theorem 4. It is also important to mention that when the number of users per sector increases, overall performance degrades, as there are more constraints regarding interference that need to be satisfied. Next, Fig. 7 shows performance gain in terms of average objective and normalized sum rate for the proposed adaptive resource allocation in comparison to existing schemes in [22] and [24], that considered equal resource allocation. The gain obtained in the proposed method is due to the fact that [TeX:] $$I_{t h, i, j}^{q}$$ is computed more precisely such that “victims” that can sustain higher leakage interference will lead to higher received signal power allocated to the desired users, resulting in higher rate.

Performance results for the fronthaul-constrained network are shown in Figs. 8 and 9. The performance loss is noticeably bigger for a weakly connected network, due to the combination of dual decomposition, resource allocation, and ADMM algorithms. The average sum rate performance vs. [TeX:] $$C^{q}$$ with different values of of the proposed algorithm is also compared with the joint beamforming-semidefinite relaxation (JB-SDR) algorithm in [46] in Fig. 9, which shows the proposed algorithm can outperform JB-SDR when [TeX:] $$C^{q}$$ is relatively small.

Fig. 6.

Average objective . Average normalized rate. Average objective value and average normalized rate vs. of Algorithm 1 without fronthaul constraint for different connection density

Fig. 7.

Average objective. Average normalized rate. Average objective value and average normalized rate vs. of Algorithm 1 without fronthaul constraint comparison for optimal resource resource allocation, proposed resource allocation scheme and equal resource allocation in [ 22] and [ 24].

Finally, the run time performance vs. number of transmit antennas for the proposed distributed algorithms are compared to its centralized counterpart and [22] with and without limited fronthaul in Fig. 10. As expected, the run time grows as the system size increases. Also, the run time for the algorithms that need to solve problem with limited fronthaul is higher than those which consider unlimited fronthaul because SCA needs to be employed in both the centralized and distributed cases when fronthaul constraint exists.

X. CONCLUSION

A joint TP selection and precoding scheme for cell-free cooperative MIMO using underlay transmission is proposed. The original nonconvex problem is converted into a convex one such that a unique global solution can be found efficiently. Novel consensus-based dual decomposition distributed algorithm for unconstrained fronthaul has been proposed to obtain a solution to the reformulated problem where only a small amount of information needs to be exchanged between different agents. Also, data privacy is ensured as it is difficult to reconstruct the primal solution from the exchanged information. This algorithm extended previous works by not requiring a priori knowledge about the interference constraint threshold, i.e., the availability of an oracle is not needed. Convergence has been proven and the convergence rate equation has been shown. Next, the proposed scheme is extended to a limited fronthaul capacity scenario. A combination of successive convex approximation and ADMM are used to resolve the nonconvexity and constraint coupling problems to make distributed optimization possible.

Fig. 8.

Average objective. Average sum rate. Average objective value and average normalized rate vs. of Algorithm 2 with fronthaul constraint for different values of [TeX:] $$C^{q}.$$

Fig. 9.

Average sum rate for [TeX:] $$\alpha=0 \text { and } \alpha=5 e-4.$$ Average sum rate for [TeX:] $$\alpha=2.5 e-4 \text { and } \alpha=1 e-3.$$ Average sum rate vs. [TeX:] $$C^{q}$$ of Algorithm 2 with fronthaul constraint for different

APPENDIX A PROOF OF THEOREM 1

Proof. First, from the proposed resource allocation scheme, the sequence [TeX:] $$L_{i j}^{q(n)}$$ is nonincreasing. This is true because if the constraint is active, the threshold will not decrease. Denote the overall amount of slack for pair [TeX:] $$i j \text { as } S_{i j}^{(n)}=\sum_{q} S_{i j}^{q(n)}.$$ Using (14), [TeX:] $$S_{i j}^{(n)}$$ can be written as

[TeX:] $$\begin{aligned} S_{i j}^{(n)} &=\sum_{q}\left(I_{t h, i j}^{q(n)}-L_{i j}^{q(n)}\right) \\ &=\sum_{q}\left(L_{i j}^{q(n-1)}+\sum_{r \in \mathcal{N}^{q}}[\mathbf{W}]_{r q} S_{i j}^{r(n-1)}-L_{i j}^{q(n)}\right) \\ &=\sum_{q}\left(L_{i j}^{q(n-1)}-L_{i j}^{q(n)}+\sum_{r \in \mathcal{L}^{q}}[\mathbf{W}]_{r q} S_{i j}^{r(n-1)}\right) \end{aligned} \\ =S_{i j}^{(n-1)}+\sum_{q} \underbrace{\left(L_{i j}^{q(n-1)}-L_{i j}^{q(n)}\right)}_{\begin{array}{c} \text { some slack that has been consumed } \\ \text { by nodes with active constraints },<0 \end{array}} \text {, } $$

where the last equality is obtained using (6). As explained at the beginning of the proof [TeX:] $$L_{i j}^{q(n)} \leq L_{i j}^{q(n-1)} \text {, hence } S_{i j}^{(n)}$$ is also non-increasing and first statement of the theorem holds. Further, if constraint (3b) is active at optimal point, [TeX:] $$\lim _{n \rightarrow \infty} S_{i j}^{(n)}=0$$ and it can be concluded that [TeX:] $$\inf S_{i j}^{(n)}=0$$ due to the fact that if a sequence of real numbers is decreasing and bounded below, then its infimum is the limit.

APPENDIX B BASIC RELATIONS FOR THEOREMS PROOFS

The first two lemmas are general results from consensus theory in [26] and are repeated here for completeness. Other lemmas and theorems are new and are written and proven to establish the convergence properties of the proposed distributed algorithm.

Fig. 10.

Run time (in sec) vs. number of transmit antennas for centralized precoder design algorithm, proposed distributed precoder design algorithm and [ 22], with [TeX:] $$\left(C^{q}=1\right)$$ and without limited fronthaul. Results are averaged over 20 Monte Carlo experiments.

Lemma 1 ([26] Lemma 1).

Define [TeX:] $$\mathbf{x}^{q} \in \mathbb{R}^{I(I-1)}$$ as a subvector of [TeX:] $$\mathbf{x}^{(n)} \in \mathbb{R}^{I(I-1) Q} \text {, i.e., } \mathbf{x}^{(n)} \triangleq\left[\mathbf{x}^{1 T} \cdots \mathbf{x}^{Q T}\right]^{T}.$$ Denote [TeX:] $$\overline{\mathrm{X}}$$ as the mean of [TeX:] $$\mathbf{x}^{q} \text {, i.e., } \overline{\mathbf{x}}= \frac{1}{Q} \sum_{q=1}^{Q} \mathbf{x}^{q} .$$ Then for [TeX:] $$q, r=1, \cdots, Q \text { and } q \neq r$$

[TeX:] $$\text { (a) if }\left\|\mathrm{x}^{q}-\mathbf{x}^{r}\right\|_{2} \leq \beta \text {, then }\left\|\mathbf{x}^{q}-\overline{\mathbf{x}}\right\|_{2} \leq \frac{Q-1}{Q} \beta \\ \text { (b) if }\left\|\mathbf{x}^{q}-\overline{\mathbf{x}}\right\|_{2} \leq \beta \text {, then }\left\|\mathbf{x}^{q}-\mathbf{x}^{r}\right\|_{2} \leq 2 \beta \text {. }$$

Proof.

[TeX:] $$\text { (a) }\left\|\mathrm{x}^{q}-\overline{\mathbf{x}}\right\|_{2}=\frac{1}{Q}\left\|Q \mathbf{x}^{q}-\sum_{q} \mathbf{x}^{q}\right\|_{2} \\ \begin{aligned} &=\frac{1}{Q}\left\|\sum_{q} \mathrm{x}^{q}-Q \mathbf{x}^{q}\right\|_{2} \\ &=\frac{1}{Q}\left\|\sum_{q \neq r}\left(\mathbf{x}^{q}-\mathbf{x}^{r}\right)\right\|_{2} \end{aligned} \\ \leq \frac{1}{Q} \sum_{q \neq r}\left\|\mathrm{x}^{q}-\mathrm{x}^{r}\right\|_{2} \leq \frac{Q-1}{Q} \beta. \\ \text { (b) }\left\|\mathrm{x}^{q}-\mathrm{x}^{r}\right\|_{2}=\left\|\mathrm{x}^{q}-\overline{\mathrm{x}}+\overline{\mathrm{x}}-\mathrm{x}^{r}\right\|_{2} \\ \leq\left\|\mathrm{x}^{q}-\overline{\mathrm{x}}\right\|_{2}+\left\|\overline{\mathrm{x}}-\mathrm{x}^{r}\right\|_{2} \leq 2 \beta .$$

Lemma 2 ([26] Lemma 2). Suppose [TeX:] $$\mathbf{x}^{(n+1)}=\left(\mathbf{W}^{\varphi} \otimes \mathbf{I}_{I(I-1)}\right) \mathbf{x}^{(n)},$$ with W fulfilling the conditions in (6) and (7). Let [TeX:] $$\left\|\mathrm{x}^{q(n)}-\mathrm{x}^{r(n)}\right\|_{2} \leq \sigma \leq \infty, q, r=1, \cdots, Q.$$ Then

(24)

[TeX:] $$\left\|\mathbf{x}^{q(n+1)}-\mathbf{x}^{r(n+1)}\right\|_{2} \leq 2 \nu^{\varphi} Q \sigma.$$

Proof. Let [TeX:] $$\mathbf{x}^{(n)}=\tilde{\mathbf{x}}^{(n)}+\mathbf{a}^{(n)} \text { with } \tilde{\mathbf{x}}^{(n+1)}= \left(\frac{1_{Q} \mathbf{1}_{Q}^{T}}{Q} \otimes \mathbf{I}_{I(I-1)}\right) \mathbf{x}^{(n)} \text { and } 1_{I(I-1) Q}^{T} \mathbf{a}^{(n)}=0 . \mathbf{a}^{(n)} \in \mathbb{R} I(I-1) Q$$ also contains subvectors [TeX:] $$\mathbf{a}^{q(n)},$$ similarly to [TeX:] $$\mathbf{x}^{(n)}$$ and [TeX:] $$\mathbf{x}^{q(n)}.$$ The results of Lemma 1 shows that [TeX:] $$\left\|a^{q(n)}\right\|_{2} \leq Q-1 / Q \sigma<\sigma, \text { for } Q>1.$$ Hence, [TeX:] $$\left\|\mathbf{a}^{(n)}\right\|_{2} \leq Q \sigma.$$ Furthermore, [TeX:] $$\mathbf{x}^{(n+1)}=\left(\mathbf{W}^{\varphi} \otimes \mathbf{I}_{I(I-1)}\right)\left(\overline{\mathbf{x}}^{(n)}+\mathbf{a}^{(n)}\right)=\overline{\mathbf{x}}^{(n)}+ \left(\mathbf{W}^{\varphi} \otimes \mathbf{I}_{I(I-1)}\right) \mathbf{a}^{q(n)}.$$ Using this relationship,

[TeX:] $$\left\|\mathbf{x}^{q(n+1)}-\overline{\mathbf{x}}^{(n+1)}\right\|_{2} \leq\left\|\mathbf{x}^{(n+1)}-\widetilde{\mathbf{x}}^{(n+1)}\right\|_{2} // \begin{aligned} &=\left\|\left(\mathbf{W}^{\varphi} \otimes \mathbf{I}_{I(I-1)}\right)\left(\mathbf{a}^{(n)}-\mathbf{0}_{Q}\right)\right\|_{2} \\ &=\left\|\left[\left(\mathbf{W}^{\varphi}-\mathbf{1}_{Q} \mathbf{1}_{Q}^{T} / Q\right) \otimes \mathbf{I}_{I(I-1)}\right] \mathbf{a}^{(n)}\right\|_{2} \\ &\leq\left\|\left[\left(\mathbf{W}^{\varphi}-\left(1_{Q} \mathbf{1}_{Q}^{T} / Q\right)^{\varphi}\right) \otimes \mathbf{I}_{I(I-1)}\right] \mathbf{a}^{(n)}\right\|_{2} \\ &\leq\left\|\left(\mathbf{W}-1_{Q} \mathbf{1}_{Q}^{T} / Q\right)\right\|_{2}^{\varphi}\left\|\mathbf{I}_{I(I-1)}\right\|_{2}\left\|\mathbf{a}^{(n)}\right\|_{2} \leq \nu^{\varphi} Q \sigma, \end{aligned}$$

where the last inequality is true because [TeX:] $$\left\|\mathbf{W}-1_{Q} \mathbf{1}_{Q}^{T} / Q\right\|_{2}= \rho\left(\mathbf{W}-\mathbf{1}_{Q} \mathbf{1}_{Q}^{T} / Q\right) \leq \nu$$ according to (7). Then, according to part (b) of Lemma 1,

[TeX:] $$\left\|\mathbf{x}^{q(n+1)}-\mathbf{x}^{r(n+1)}\right\|_{2} \leq 2 \nu^{\varphi} Q \sigma.$$

Before presenting the next lemma, define [TeX:] $$\ell \underline{\underline{\Delta}} \left[\ell^{1 T} \cdots \ell^{Q T}\right]^{T}$$ and [TeX:] $$\mathrm{g} \triangleq\left[\mathrm{g}^{1 T} \cdots \mathrm{g}^{Q T}\right]^{T},$$ then from (13) and (12), the update of [TeX:] $$\ell$$ becomes

(25)

[TeX:] $$\boldsymbol{\ell}^{(n+1)}=\left(\mathbf{W}^{\varphi} \otimes \mathbf{I}_{I(I-1)}\right) \mathcal{P}_{+}\left[\ell^{(n)}-c^{(n)} \mathbf{g}^{(n+1)}\right].$$

Also define [TeX:] $$\bar{\ell} \triangleq \frac{1}{Q} \sum_{q} \ell^{q},$$ i.e., average of all [TeX:] $$\ell^{q} s \forall q \in \mathcal{Q}.$$ Lemma below establishes important bound for iterates of [TeX:] $$\ell^{q}.$$ Also, in order to shorten notations, [TeX:] $$(x x) \\ \leq$$ is used to denote “inequality follows from equation (xx)”.

Lemma 3.

Suppose [TeX:] $$\ell,$$ is generated according to (25) and ν is defined in (7). Assuming [TeX:] $$\left\|\ell^{q(n)}-\bar{\ell}^{(n)}\right\|_{2} \leq \beta,$$ there exists a [TeX:] $$\bar{\varphi} \geq 1,$$ such that if [TeX:] $$\varphi \geq \bar{\varphi}+\delta \text { with } \bar{\varphi} \geq 0 \text { and } \delta \geq 0, \text { then } \forall q \in \mathcal{Q}$$

[TeX:] $$\left\|\ell^{q(n+1)}-\bar{\ell}^{(n+1)}\right\|_{2} \leq \nu^{\delta} \beta,$$

Proof. Let [TeX:] $$\mathbf{u}^{q} \in \mathbb{R}^{I(I-1)} \text { and } \mathbf{v}^{q} \in \mathbb{R}^{I(I-1)}, \forall q \in \mathcal{Q},$$ at the nth iteration, be defined as

(26)

[TeX:] $$\mathbf{u}^{q(n)} \triangleq \ell^{q(n)}-c^{(n)} \mathbf{g}^{q(n)},$$

with [TeX:] $$\mathrm{g}^{q(n)}$$ being the subgradient of the objective function in (9) at the nth global iteration after [TeX:] $$\mathbf{Q}_{i}^{q(n+1)}$$ is obtained. Then the distance between iterates of [TeX:] $$\mathbf{u}^{q(n)}$$ can be bounded as

(27)

[TeX:] $$\begin{aligned} \left\|\mathbf{u}^{q(n)}-\mathbf{u}^{r(n)}\right\|_{2} &=\left\|\ell^{q(n)}-c^{(n)} \mathrm{g}^{q(n)}-\ell^{r(n)}+c^{(n)} \mathrm{g}^{r(n)}\right\|_{2} \\ & \leq\left\|\ell^{q(n)}-\ell^{r(n)}\right\|_{2}+\left\|c^{(n)}\left(\mathrm{g}^{r(n)}-\mathrm{g}^{q(n)}\right)\right\|_{2} \end{aligned} \begin{aligned} &\leq\left\|\ell^{q(n)}-\ell^{r(n)}\right\|_{2}+2 c^{(n)} G \\ &\stackrel{\text { Lemma1(b) }}{\leq} 2\left(\beta+c^{(n)} G\right) . \end{aligned}\\ $$

Next, the non-expansive property of [TeX:] $$\mathcal{P}_{+}$$ can be used to obtain

(28)

[TeX:] $$\begin{aligned} \left\|\mathcal{P}_{+}\left[\mathbf{u}^{q(n)}\right]-\mathcal{P}_{+}\left[\mathbf{u}^{r(n)}\right]\right\|_{2} & \leq\left\|\mathbf{u}^{q(n)}-\mathbf{u}^{r(n)}\right\|_{2} \\ & \leq 2\left(\beta+c^{(n)} G\right). \end{aligned}$$

Next, (25) implies that [TeX:] $$\ell^{q(n+1)}=\sum_{p \in \mathcal{N}^{q}}\left[\mathbf{W}^{\varphi}\right]_{p q} \mathcal{P}_{+}\left[\mathbf{u}^{p(n+1)}\right],$$ then after consensus step

(29)

[TeX:] $$\begin{aligned} &\left\|\ell^{q(n+1)}-\ell^{r(n+1)}\right\|_{2}= \\ &\left\|\sum_{p \in \mathcal{N}^{q}}\left[\mathbf{W}^{\varphi}\right]_{p q} \mathcal{P}_{+}\left[\mathbf{u}^{p(n+1)}\right]-\sum_{p \in \mathcal{N}^{r}}\left[\mathbf{W}^{\varphi}\right]_{p r} \mathcal{P}_{+}\left[\mathbf{u}^{p(n+1)}\right]\right\|_{2}= \end{aligned} \\ \left\|\left[\mathbf{W}^{\varphi} \otimes \mathbf{I}_{I(I-1)} \mathcal{P}_{+}\left[\mathbf{u}^{(n+1)}\right]\right]_{q}-\left[\mathbf{W}^{\varphi} \otimes \mathbf{I}_{I(I-1)} \mathcal{P}_{+}\left[\mathbf{u}^{(n+1)}\right]\right]_{r}\right\|_{2} \\ \stackrel{(28),(24)}{\leq} 4 \nu^{\varphi}(Q-1)\left(\beta+c^{(n)} G\right) \leq 4 \nu^{\varphi} Q\left(\beta+c^{(n)} G\right),$$

where [TeX:] $$\mathbf{u} \triangleq\left[\mathbf{u}^{1 T} \cdots \mathbf{u}^{Q T}\right]^{T}.$$ Furthermore, by Lemma 1 (a) [TeX:] $$\left\|\ell^{q(n+1)}-\bar{\ell}^{(n+1)}\right\|_{2} \leq 4 \nu^{\varphi} Q\left(\beta+c^{(n)} G\right).$$ Next, substituting

(30)

[TeX:] $$\varphi \geq \frac{\log (\beta)-\log \left(4 Q\left(\beta+c^{(n)} G\right)\right)}{\log (\nu)}+\delta \triangleq \bar{\varphi}+\delta, \delta \geq 0$$

into previous equation produces [TeX:] $$\left\|\ell^{q(n+1)}-\bar{\ell}^{(n+1)}\right\|_{2} \leq \nu^{\delta} \beta$$ and the lemma is proven.

APPENDIX C PROOF OF THEOREM 2

The quantity [TeX:] $$\left\|\ell^{q(1)}-\bar{\ell}^{(1)}\right\|_{2}$$ is upper bounded by [TeX:] $$\beta^{(1)}$$ from Assumption 3. Choose [TeX:] $$\varphi \geq \bar{\varphi}+\delta, \delta \geq 0, \text { with } \bar{\varphi}$$ stated in Theorem 2. Then, by Lemma 3 it follows that,

[TeX:] $$\begin{aligned} \left\|\ell^{q(2)}-\bar{\ell}^{(2)}\right\|_{2} \leq \nu^{\delta} \beta^{(1)} \\ \left\|\ell^{q(3)}-\bar{\ell}^{(3)}\right\|_{2} \leq 4 \nu^{\varphi} Q\left(\nu^{\delta} \beta^{(1)}+c^{(n)} G\right) \stackrel{(30)}{=} \\ & \nu^{\delta} \beta^{(1)} \frac{\nu^{\delta} \beta^{(1)}+c^{(n)} G}{\beta^{(1)}+c^{(n)} G} \end{aligned} \\ \begin{aligned} &\left\|\ell^{q(4)}-\bar{\ell}^{(4)}\right\|_{2} \leq \\ &\frac{\nu^{\delta} \beta^{(1)}}{\beta^{(1)}+c^{(n)} G}\left(\nu^{\delta} \beta^{(1)} \frac{\nu^{\delta} \beta^{(1)}+c^{(n)} G}{\beta^{(1)}+c^{(n)} G}+c^{(n)} G\right) \end{aligned} \\ \begin{aligned} \left\|\ell^{q(n)}-\bar{\ell}^{(n)}\right\|_{2} \leq \nu^{\delta} \beta^{(1)}\left(\frac{\nu^{\delta} \beta^{(1)}}{\beta^{(1)}+c^{(n)} G}\right)^{n-1} \\ +c^{(n)} G\left(-1+\sum_{t=0}^{n-1}\left(\frac{\nu^{\delta} \beta^{(1)}}{\beta^{(1)}+c^{(n)} G}\right)^{t}\right). \end{aligned}$$

Defining [TeX:] $$p \triangleq \frac{\nu^{\delta} \beta^{(1)}}{\beta^{(1)}+c^{(n)} G}, \text { since } p<1,$$ then sum of geometric series formula can be used to obtain

(31)

[TeX:] $$\left\|\ell^{q(n)}-\bar{\ell}^{(n)}\right\|_{2} \leq p^{n-1} \nu^{\delta} \beta^{(1)}+p c^{(n)} G \frac{1-p^{n-1}}{1-p} \triangleq \beta^{(n)}$$

and the theorem is proved.

APPENDIX D PROOF OF THEOREM 3

Definition 1. A vector g is subgradient of a convex function d at x if

(32)

[TeX:] $$d(\mathbf{y}) \geq d(\mathbf{x})+\mathbf{g}^{T}(\mathbf{y}-\mathbf{x}), \forall \mathbf{y} \in \operatorname{dom} f.$$

Definition 2. The set of all subgradients of a convex function d is called subdifferential of d at x, and is denoted by [TeX:] $$\partial d(\mathrm{x}):$$

(33)

[TeX:] $$\partial d(\mathrm{x})=\left\{\mathrm{g} \mid d(\mathrm{y}) \geq d(\mathrm{x})+\mathrm{g}^{T}(\mathrm{y}-\mathrm{x}), \forall \mathrm{y} \in \operatorname{dom} f\right\}.$$

Definition 3. The [TeX:] $$\epsilon$$ -subdifferential set for [TeX:] $$\epsilon \geq 0$$ of a convex function d at x is the collection of [TeX:] $$\epsilon$$-subgradients:

(34)

[TeX:] $$\partial_{\epsilon} d(\mathbf{x})=\left\{\mathbf{g} \mid d(\mathbf{y}) \geq d(\mathbf{x})+\mathrm{g}^{T}(\mathbf{y}-\mathbf{x})-\epsilon, \forall \mathbf{y} \in \operatorname{dom} f\right\}.$$

Next lemma shows important property of [TeX:] $$\epsilon$$-subdifferential.

Lemma 4. Let [TeX:] $$d(\mathrm{x}): X \rightarrow \mathbb{R}$$ be a convex function. Let the set [TeX:] $$X \subset \mathbb{R}^{n}$$ be convex and compact and in particular [TeX:] $$\max _{\mathbf{x} \in X}\|\mathrm{x}\|_{2} \leq \eta.$$ There exist two finite scalars [TeX:] $$\epsilon>0$$ and [TeX:] $$\tau>0$$ such that, [TeX:] $$\forall \mathrm{x} \in X, \forall \mathrm{g}(\mathrm{x}) \in \partial d(\mathrm{x}) \text {, and } \forall \nu, \text { with } \|\nu\|_{2} \leq \tau,$$

[TeX:] $$\mathrm{g}(\mathrm{x})+\nu \in \partial_{\epsilon} d(\mathrm{x})$$

holds.

Proof. The claim is proven by directly using the definition of [TeX:] $$\epsilon$$-subgradient. Since q is convex function [TeX:] $$\forall \mathrm{x}, \mathrm{y} \in X,$$ by (32),

[TeX:] $$\begin{aligned} d(\mathrm{y})-d(\mathrm{x}) & \geq\langle\mathrm{g}(\mathrm{x}), \mathrm{y}-\mathrm{x}\rangle \\ &=\langle\mathrm{g}(\mathrm{x})+\nu, \mathrm{y}-\mathrm{x}\rangle-\langle\nu, \mathrm{y}-\mathrm{x}\rangle \\ & \geq\langle\mathrm{g}(\mathrm{x})+\nu, \mathrm{y}-\mathrm{x}\rangle-\|\nu\|_{2}\|\mathrm{y}-\mathrm{x}\|_{2} \\ & \geq\langle\mathrm{g}(\mathrm{x})+\nu, \mathrm{y}-\mathrm{x}\rangle-2 \tau \eta \end{aligned}$$

holds, which has the form of (34) for [TeX:] $$\tau \leq \epsilon /(2 \eta).$$

Next, define [TeX:] $$\mathbf{y}^{(n)} \in \mathbb{R}^{I(I-1)} \text { and } \mathbf{d}^{(n)} \in \mathbb{R}^{I(I-1)}$$ as

(35)

[TeX:] $$\mathbf{y}^{(n)} \triangleq \mathcal{P}_{+}\left[\overline{\mathbf{u}}^{(n-1)}\right] \text { and } \mathbf{d}^{(n)} \triangleq \bar{\ell}^{(n)}-\mathbf{y}^{(n)},$$

where [TeX:] $$\overline{\mathbf{u}}^{(n-1)} \triangleq 1 / Q \sum_{q} \mathbf{u}^{q(n-1)}, \text { with } \mathbf{u}^{q(n)}$$ defined in (26). To prove convergence, [TeX:] $$\left\|\bar{\ell}^{(n)}-\mathbf{y}^{(n)}\right\|_{2}$$ will be shown to be bounded, where y is updated via an ϵ-subgradient method. Therefore, Proposition 4.1 in [45] can be used to establish a basis for the convergence rate. Then the combination of [TeX:] $$\| \bar{\ell}^{(n)}-\mathrm{y}^{(n)} \|_{2}$$ and convergence rate of the [TeX:] $$\epsilon$$-subgradient method can then directly prove Theorem 3. To see this, the following lemmas need to be established.

Lemma 5. Let [TeX:] $$\mathbf{y}^{(n)}$$ be defined as in (35). Under the same assumptions as in Theorem 2, [TeX:] $$\forall n \geq 1$$

(a) The quantity [TeX:] $$\left\|\mathrm{d}^{(n)} / e^{(n)}\right\|$$ is upper bounded by [TeX:] $$\beta^{(n)} / c^{(n)}$$

(b) The following inequalities are true for all [TeX:] $$q \in \mathcal{Q}$$

(36)

[TeX:] $$d\left(\ell^{q(n)}\right) \geq d\left(\mathbf{y}^{(n)}\right)-2 Q G \beta^{(n)},$$

(37)

[TeX:] $$d^{q}(\mathbf{y}) \geq d^{q}\left(\mathbf{y}^{(n)}\right)+\left\langle\mathbf{g}^{q(n)}+\nu, \mathrm{y}-\mathbf{y}^{(n)}\right\rangle-\frac{\epsilon^{(n)}}{Q}, \mathbf{y} \in \mathbb{R}^{+},$$

where [TeX:] $$d\left(\mathbf{y}^{(n)}\right) \triangleq \sum_{q} d^{q}\left(\mathbf{y}^{(n)}\right) \text { and } \epsilon^{(n)}=Q\left(\beta^{(n)}(4 G+2 \tau)+\zeta\right).$$

(c) The quantity [TeX:] $$\mathbf{h}^{(n)} \triangleq \sum_{q}\left(\mathrm{~g}^{q(n)}+\frac{\mathbf{d}^{(n)}}{c^{(n)}}\right) $$ is an [TeX:] $$\epsilon-$$subgradient of d(y) with respect to y.

(d) The variable [TeX:] $$\mathbf{y}^{(n)}$$ is updated according to the [TeX:] $$\epsilon$$-subgradient method

(38)

[TeX:] $$\mathbf{y}^{(n+1)}=\mathcal{P}_{+}\left[\mathbf{y}^{(n)}-\frac{c^{(n)}}{Q} \mathbf{h}^{(n)}\right] .$$

Proof.

(a) The bound can be proven by proving [TeX:] $$\left\|\mathrm{d}^{(n)}\right\|_{2}<\beta^{(n)}.$$ Using non-expansive property of [TeX:] $$\mathcal{P}_{+},$$

[TeX:] $$\begin{aligned} \left\|\mathbf{d}^{(n)}\right\|_{2} &=\frac{1}{Q}\left\|\sum_{q} \mathcal{P}_{+}\left[\mathbf{u}^{q(n)}\right]-\mathcal{P}_{+}\left[\sum_{q} \overline{\mathbf{u}}^{(n)}\right]\right\|_{2} \\ & \leq \frac{1}{Q} \sum_{q}\left\|\mathbf{u}^{q(n)}-\overline{\mathbf{u}}^{(n)}\right\|_{2} \leq \beta^{(n)}. \end{aligned}$$

(b) By the convexity of [TeX:] $$d^{q}\left(\ell^{q(n)}\right) \text { for } q, r \in \mathcal{Q}$$ it can be written

(39)

[TeX:] $$\begin{aligned} d^{q}\left(\ell^{q(n)}\right) & \stackrel{(33)}{\geq} d^{q}\left(\mathbf{y}^{(n)}\right)+\left\langle\mathbf{g}, \ell^{q(n)}-\mathbf{y}^{(n)}\right\rangle, \text { where } \mathbf{g} \in \partial_{\epsilon^{(n)}} d\left(\mathbf{y}^{(n)}\right) \\ &=d^{q}\left(\mathbf{y}^{(n)}\right)-\left\langle\mathbf{g}, \mathbf{y}^{(n)}-\ell^{q(n)}\right\rangle \\ &=d^{q}\left(\mathbf{y}^{(n)}\right)-\left\langle\mathbf{g}, \mathbf{y}^{(n)}-\bar{\ell}+\bar{\ell}-\ell^{q(n)}\right\rangle \end{aligned} \\ \begin{aligned} &\geq d^{q}\left(\mathbf{y}^{(n)}\right)-\|\mathrm{g}\|_{2}\left\|\mathbf{y}^{(n)}-\ell^{q(n)}\right\|_{2} \\ &\geq d^{q}\left(\mathbf{y}^{(n)}\right)-G\left(\left\|\bar{\ell}^{(n)}-\ell^{q(n)}\right\|_{2}+\left\|\mathbf{d}^{(n)}\right\|_{2}\right) \\ &\geq d^{q}\left(\mathbf{y}^{(n)}\right)-G\left(\beta^{(n)}+\beta^{(n)}\right) \\ &=d^{q}\left(\mathbf{y}^{(n)}\right)-2 G \beta^{(n)}, \end{aligned}$$

where the last inequality is obtained from (31). Then, summing the last equation over [TeX:] $$0 \ni b$$ equals (36). Moreover, for any [TeX:] $$\mathrm{y} \in \mathbb{R}^{+},$$ from Lemma 4 and (39), [TeX:] $$d^{q}(\mathbf{y})$$ is upper bounded as

(40)

[TeX:] $$\begin{aligned} &d^{q}(\mathbf{y}) \stackrel{(33)}{\geq} d^{q}\left(\ell^{q(n)}\right)+\left\langle\mathbf{g}^{q(n)}+\nu, \mathbf{y}-\ell^{q(n)}\right\rangle-\zeta \\ &\quad(39) \\ &\quad \geq d^{q}\left(\mathbf{y}^{(n)}\right)+\left\langle\mathbf{g}^{q(n)}+\nu, \mathbf{y}-\ell^{q(n)}\right\rangle-2 G \beta^{(n)}-\zeta \\ &\quad=d^{q}\left(\mathbf{y}^{(n)}\right)+\left\langle\mathbf{g}^{q(n)}+\nu, \mathbf{y}-\mathbf{y}^{(n)}+\mathbf{y}^{(n)}-\ell^{q(n)}\right\rangle \end{aligned} \\ \begin{aligned} &-2 G \beta^{(n)}-\zeta \\ \geq & d^{q}\left(\mathbf{y}^{(n)}\right)+\left\langle\mathrm{g}^{q(n)}+\nu, \mathrm{y}-\mathrm{y}^{(n)}\right\rangle \\ &-\left\|\mathrm{g}^{q(n)}+\nu\right\|_{2}\left\|\ell^{q(n)}-\mathrm{y}^{(n)}\right\|_{2}-2 G \beta^{(n)}-\zeta. \end{aligned}$$

Since[TeX:] $$\left.\|\nu\|_{2} \leq \tau, \| \mathrm{g}^{q(n)}\right)\left\|_{2} \leq G,\right\| \ell^{q(n)}-\bar{\ell}^{(n)} \|_{2} \leq \beta^{(n)}$$ and [TeX:] $$\left\|\mathbf{d}^{(n)}\right\|_{2} \leq \beta^{(n)},$$ and using (35), it is possible to bound following quantities as

[TeX:] $$\begin{aligned} &\left\|\mathrm{g}^{q(n)}+\nu\right\|_{2} \leq G+\tau \\ &\left\|\ell^{q(n)}-\mathbf{y}^{(n)}\right\|_{2}=\left\|\ell^{q(n)}-\bar{\ell}^{q}+\mathbf{d}^{(n)}\right\|_{2} \leq 2 \beta^{(n)} \text { to obtain } \\ &d^{q}(\mathbf{y}) \geq d^{q}\left(\mathbf{y}^{(n)}\right)+\left\langle\mathbf{g}^{q(n)}+\nu, \mathbf{y}-\mathbf{y}^{(n)}\right\rangle-\underbrace{\left(\beta^{(n)}(4 G+2 \tau)+\zeta\right)}_{\triangleq \epsilon^{(n)} / Q}, \end{aligned}$$

which is (37).

(c) From (33), the inequality in (37) implies [TeX:] $$\left(\mathrm{g}^{q(n)}+\nu\right) \in \partial_{\epsilon^{(n)}} / Q q^{q}(\mathrm{y}) \text { with } \epsilon^{(n)} / Q=\left(\beta^{(n)}(4 G+2 \tau)+\zeta\right).$$ Summing over q yields

[TeX:] $$d(\mathbf{y}) \geq d\left(\mathbf{y}^{(n)}\right)+\left\langle\sum_{q} \mathbf{g}^{q(n)}+\nu, \mathbf{y}-\mathbf{y}^{(n)}\right\rangle-Q\left(\beta^{(n)}(4 G+2 \tau)+\zeta\right),$$

where the inequality holds for any ν satisfying [TeX:] $$\|\nu\| \leq \tau.$$ Since [TeX:] $$\tau$$ can be selected such that [TeX:] $$\beta^{(n)} / c^{(n)} \leq \tau,$$ then it is possible to choose [TeX:] $$\nu=\mathbf{d}^{(n)} / c^{(n)},$$ which proves part (c).

(d) It is sufficient to write explicitly the update rule for [TeX:] $$\mathbf{y}^{(n)}.$$ Using the definition of [TeX:] $$\mathbf{y}^{(n+1)}$$ in (35) and (35), [TeX:] $$\mathbf{y}^{(n+1)}$$ can be written as

[TeX:] $$\begin{aligned} \mathbf{y}^{(n+1)} &=\mathcal{P}_{+}\left[\frac{1}{Q} \sum_{q}\left(\ell^{q(n)}-c^{(n)} \mathrm{g}^{q(n)}\right)\right] \\ &=\mathcal{P}_{+}\left[\bar{\ell}^{(n)}-\frac{c^{(n)}}{Q} \sum_{q} \mathrm{~g}^{q(n)}\right] \end{aligned} \\ \begin{aligned} &=\mathcal{P}_{+}\left[\mathrm{y}^{(n)}+\mathbf{d}^{(n)}-\frac{c^{(n)}}{Q} \sum_{q} \mathrm{~g}^{q(n)}\right] \\ &=\mathcal{P}_{+}\left[\mathrm{y}^{(n)}-\frac{c^{(n)}}{Q} \sum_{q}\left(\mathrm{~g}^{q(n)}+\frac{\mathbf{d}^{(n)}}{c^{(n)}}\right)\right] \end{aligned}$$

which proves part (d).

Proof. (Theorem 3) Given Lemma 5, the sequence [TeX:] $$\left\{\mathbf{y}^{(n)}\right\}$$ is generated with an [TeX:] $$\epsilon^{(n)}$$ -subgradient algorithm to maximize d(y). For [TeX:] $$n \geq 1$$

[TeX:] $$\mathbf{y}^{(n+1)}=\mathcal{P}_{+}\left[\mathbf{y}^{(n)}-\left(c^{(n)} / Q\right) \mathbf{h}^{q(n)}\right].$$

Therefore, by using Proposition 4.2 in [45],

[TeX:] $$\begin{aligned} \lim _{n \rightarrow \infty} \sup d\left(\mathbf{y}^{(n)}\right) &=d^{*}+\lim _{n \rightarrow \infty} \sup Q\left(\beta^{(n)}(4 G+2 \tau)+\zeta\right) \\ &=d^{*}+Q \zeta \end{aligned}$$

holds, which concludes proof of the theorem.

APPENDIX E PROOF OF THEOREM 4

Lemma 6. Let [TeX:] $$\mathbf{y}^{(n)}$$ be defined as (35). Under the same assumptions as Theorem 3

(a) For any [TeX:] $$y \in \mathbb{R}^{+},$$

(41)

[TeX:] $$\sum_{t=1}^{n}\left\langle\mathbf{h}^{(t)}, \mathbf{y}-\mathbf{y}^{(t)}\right\rangle \geq-\frac{\left\|\mathbf{y}^{(1)}-\mathbf{y}\right\|_{2}^{2}}{2 c^{(n)} / Q}-n \frac{c^{(n)} Q(G+\tau)^{2}}{2} \\ \text { (b) For any } \mathrm{y} \in \mathbb{R}^{+} \text {, } $$

(42)

[TeX:] $$\sum_{t=1}^{n}\left\langle\mathbf{h}^{(n)}, \mathbf{y}-\mathrm{y}^{*}\right\rangle \geq-\frac{\left\|\mathrm{y}^{(1)}-\mathbf{y}\right\|_{2}^{2}}{2 c^{(n)} / Q}-n \frac{c^{(t)} Q(G+\tau)^{2}}{2}-\epsilon^{(n)},$$

where [TeX:] $$\epsilon^{(n)}=Q\left(\beta^{(n)}(4 G+2 \tau)+\zeta\right).$$

Proof.

(a) Since [TeX:] $$\left\|\mathrm{g}^{q(n)}+\nu\right\|_{2} \leq G+\tau,$$ using the update rule in (38), for any [TeX:] $$\mathbf{y} \in \mathbb{R}^{+},$$

[TeX:] $$\begin{aligned} &\left\|\mathbf{y}^{(t+1)}-\mathbf{y}\right\|_{2}^{2}=\left\|\mathbf{y}^{(t)}-\frac{c^{(t)}}{Q} \mathbf{h}^{(t)}-\mathbf{y}\right\|_{2}^{2} \\ &=\left(\frac{c^{(t)}}{Q}\right)^{2}\|\mathbf{h}(t)\|_{2}^{2}-\frac{2 c^{(t)}}{Q}\left\langle\mathbf{h}(t), \mathbf{y}^{(t)}-\mathbf{y}\right\rangle+\left\|\mathbf{y}^{(t)}-\mathbf{y}\right\|_{2}^{2} \\ &\leq\left\|\mathbf{y}^{(t)}-\mathbf{y}\right\|_{2}^{2}-\frac{2 c^{(t)}}{Q}\left\langle\mathbf{h}^{(t)}, \mathbf{y}^{(t)}-\mathbf{y}\right\rangle+c^{(t) 2}(G+\tau)^{2}, \end{aligned}$$

holds, where [TeX:] $$\left.\left\|\mathbf{h}^{(t)}\right\|_{2}=\| \sum_{q} \mathbf{g}^{q(t)}+\mathbf{d}^{(t)} / c^{(t)}\right) \|_{2} \leq Q(G+\tau).$$ Therefore, for any [TeX:] $$\mathbf{y} \in \mathbb{R}^{+}$$

(43)

[TeX:] $$\left\langle\mathbf{h}^{(t)}, \mathbf{y}-\mathbf{y}^{(t)}\right\rangle \geq \frac{\left\|\mathbf{y}^{(t+1)}-\mathbf{y}\right\|_{2}^{2}-\left\|\mathbf{y}^{(t)}-\mathbf{y}\right\|_{2}^{2}}{2 c^{(t)} / Q}-\frac{c^{(t)} Q(G+\tau)^{2}}{2}.$$

Summing across t leads to

[TeX:] $$\sum_{t=1}^{n}\left\langle\mathbf{h}^{(t)}, \mathbf{y}-\mathbf{y}^{(t)}\right\rangle \geq \frac{-\left\|\mathbf{y}^{(1)}-\mathbf{y}\right\|_{2}^{2}}{2 c^{(n)} / Q}-n \frac{c^{(n)} Q(G+\tau)^{2}}{2},$$

which proves part (a).

(b) Since [TeX:] $$\mathbf{h}^{(t)}$$ is an [TeX:] $$\epsilon^{(t)}$$-subgradient of the dual function [TeX:] $$d(\cdot)$$ at [TeX:] $$\mathbf{y}^{(t)},$$ using the [TeX:] $$\epsilon$$-subgradient inequality (34)

[TeX:] $$\begin{aligned} d\left(\mathbf{y}^{*}\right) & \geq d\left(\mathbf{y}^{(t)}\right)+\left\langle\mathbf{h}^{(t)}, \mathbf{y}^{*}-\mathbf{y}^{(t)}\right\rangle-\epsilon^{(t)} \\ \Rightarrow\left\langle\mathbf{h}^{(t)}, \mathbf{y}^{(t)}-\mathbf{y}^{*}\right\rangle & \geq d\left(\mathbf{y}^{(t)}\right)-d\left(\mathbf{y}^{*}\right)-\epsilon^{(t)} \geq-\epsilon^{(t)}, \end{aligned}$$

where the last inequality comes from the optimality condition [TeX:] $$d\left(\mathbf{y}^{(t)}\right) \geq d\left(\mathbf{y}^{*}\right).$$ Then, it follows

(44)

[TeX:] $$\begin{aligned} \left\langle\mathbf{h}^{(t)}, \mathbf{y}-\mathbf{y}^{*}\right\rangle &=\left\langle\mathbf{h}^{(t)}, \mathbf{y}-\mathbf{y}^{(t)}\right\rangle+\left\langle\mathbf{h}^{(t)}, \mathbf{y}^{(t)}-\mathbf{y}^{*}\right\rangle \\ & \geq\left\langle\mathbf{h}^{(t)}, \mathbf{y}-\mathbf{y}^{(t)}\right\rangle-\epsilon^{(t)}. \end{aligned}$$

Substituting the right-hand side of (43) into the right-hand side of (44), it follows

[TeX:] $$\begin{aligned} \left\langle\mathbf{h}^{(t)}, \mathbf{y}-\mathbf{y}^{*}\right\rangle \geq & \frac{\left\|\mathbf{y}^{(t+1)}-\mathbf{y}\right\|_{2}^{2}-\left\|\mathbf{y}^{(t)}-\mathbf{y}\right\|_{2}^{2}}{2 c^{(t)} / Q} \\ &-\frac{c^{(t)} Q(G+\tau)^{2}}{2}-\epsilon^{(t)}. \end{aligned}$$

Summing the above expression across t to get

[TeX:] $$\begin{aligned} \sum_{t=1}^{n}\left\langle\mathbf{h}^{(t)}, \mathbf{y}-\mathbf{y}^{*}\right\rangle \geq &-\frac{\left\|\mathbf{y}^{(1)}-\mathbf{y}\right\|^{2}}{2 c^{(t)} / Q}-n \frac{c^{(t)} Q(G+\tau)^{2}}{2} \\ &-\sum_{t=1}^{n} \epsilon^{(t)}, \end{aligned}$$

which proves part (b).

Proof. (Theorem 4)

(a) By concavity of the primal cost function [TeX:] $$f(n) \triangleq \sum_{q} f q(n)$$ and definition of [TeX:] $$\mathrm{Q}^{q(n+1)}$$ as a local Lagrangian maximizer of local Lagrangian function [TeX:] $$\mathcal{L}^{q}\left(\mathbf{Q}^{q}, \ell^{q}\right),$$

(45)

[TeX:] $$\begin{aligned} f^{(n)} &=\sum_{q} d^{q}\left(\ell^{q(n)}\right)-\left\langle\mathrm{g}^{q(n)}, \ell^{q(n)}\right\rangle \\ & \geq \frac{1}{n} \sum_{t=1}^{n} \sum_{q} d^{q}\left(\ell^{q(t)}\right)-\left\langle\mathrm{g}^{q(t)}, \ell^{q(t)}\right\rangle. \end{aligned}$$

is true. By (37) in Lemma 5 with [TeX:] $$\mathbf{y}=\ell^{q(t)}$$

[TeX:] $$\begin{aligned} d^{q}\left(\ell^{q(t)}\right) \geq & d^{q}\left(\mathbf{y}^{(t)}\right)+\left\langle\mathrm{g}^{q(t)}, \ell^{q(t)}\right\rangle+\left\langle\nu, \ell^{q(t)}\right\rangle \\ &-\left\langle\mathrm{g}^{q(t)}+\nu, \mathrm{y}^{(t)}\right\rangle-\epsilon^{(t)} / Q \end{aligned}$$

with [TeX:] $$\epsilon^{(t)} / Q=\beta^{(t)}(4 G+2 \tau)+\zeta.$$ Summing over q leads to

(46)

[TeX:] $$\begin{aligned} \sum_{q} d^{q}\left(\ell^{q(t)}\right) \geq & \sum_{q}\left(d^{q}\left(\mathbf{y}^{(t)}\right)+\left\langle\ell^{q(t)}, \mathbf{g}^{q(t)}\right\rangle\right.\\ &\left.+\left\langle\nu, \ell^{q(t)}\right\rangle-\left\langle\mathbf{g}^{q(t)}+\nu, \mathbf{y}^{(t)}\right\rangle-\epsilon^{(t)} / Q\right). \end{aligned}$$

Replacing [TeX:] $$d^{q}\left(\ell^{q(t)}\right)$$ in (45) with the lower bound in the preceding equation, (45) becomes

(46)

[TeX:] $$f^{(n)} \geq \frac{1}{n} \sum_{t=1}^{n} \sum_{q} d^{q}\left(\mathbf{y}^{(t)}\right)+\left\langle\nu, \ell^{q(t)}\right\rangle-\left\langle\mathbf{g}^{q(t)}+\nu, \mathbf{y}^{(t)}\right\rangle-\frac{\epsilon^{(t)}}{Q} .$$

Since [TeX:] $$\mathbf{h}^{(t)}=\sum_{q} \mathbf{g}^{q(t)}+\nu,$$ the third term on the righthand side in (46) can be replaced by the lower bound in Lemma 6(a) by setting y=0. Furthermore, it is possible to bound [TeX:] $$\left\langle\nu, \ell^{q(t)}\right\rangle \geq-\tau \Lambda,$$ which comes by construction with [TeX:] $$\|\nu\|_{2} \leq \tau \text { and }\left\|\ell^{q(t)}\right\|_{2} \leq \Lambda.$$ Hence, (46) can be rewritten as

[TeX:] $$f^{(n)} \geq d\left(\mathbf{y}^{(n)}\right)-Q \tau \Lambda-\frac{\left\|\mathbf{y}^{(1)}\right\|_{2}^{2}}{2 n c^{(n)} / Q}-\frac{c^{(n)} Q(G+\tau)^{2}}{2}-\frac{1}{n} \sum_{t=1}^{n} \epsilon^{(t)}.$$

By strong duality [TeX:] $$d^{*}=f^{*} \text { and } d(\mathbf{y}) \geq d^{*}, \forall \mathrm{y} \in \mathbb{R}^{+},$$ so that the equation above can be rewritten as

[TeX:] $$f^{(n)} \geq f^{*}-Q \tau \Lambda-\frac{\left\|\mathbf{y}^{(1)}\right\|_{2}^{2}}{2 n c^{(n)} / Q}-\frac{c^{(n)} Q(G+\tau)^{2}}{2}-\frac{1}{n} \sum_{t=1}^{n} \epsilon^{(t)}.$$

In addition, from Assumption 5 [TeX:] $$\frac{1}{n} \sum_{t=1}^{n} \epsilon^{(t)} \leq \epsilon^{(1)}$$ and [TeX:] $$\left\|\mathrm{y}^{(1)}\right\|_{2}^{2} \leq \Lambda^{2} \text { since } \epsilon^{(n)}$$ is a nondecreasing sequence (because [TeX:] $$\beta^{(n)}$$ is an nondecreasing sequence) with [TeX:] $$\epsilon^{(1)} \geq \cdots \geq \epsilon^{(n)},$$ for [TeX:] $$n>1.$$ The expression above can be further rewritten as

[TeX:] $$\begin{aligned} f^{(n)} \geq & f^{*}-Q \tau \Lambda-\frac{\Lambda^{2}}{2 n c^{(n)} / Q}-\frac{c^{(n)} Q(G+\tau)^{2}}{2} \\ &-Q\left(\beta^{(1)}(4 G+2 \tau)+\zeta\right) \end{aligned}$$

and part (a) of the theorem is proven.

(b) Given any dual optimal solution y∗,

(47)

[TeX:] $$f^{(n)}=f^{(n)}+\left\langle\sum_{q} \mathrm{~h}^{q(n)}, \mathbf{y}^{*}\right\rangle-\left\langle\sum_{q} \mathbf{h}^{q(n)}, \mathbf{y}^{*}\right\rangle$$

is true. Also the first two terms on the right-hand side can be rewritten as

[TeX:] $$\begin{aligned} f^{(n)}+\left\langle\sum_{q} \mathrm{~h}^{q(n)}, \mathrm{y}^{*}\right\rangle=& f^{(n)}+\left\langle\sum_{q} \mathrm{~g}^{q(n)}, \mathbf{y}^{*}\right\rangle \\ &+Q\left\langle\mathbf{d}^{(n)} / c^{(n)}, \mathbf{y}^{*}\right\rangle \\ \leq & f^{(n)}+\left\langle\sum_{q} \mathrm{~g}^{q(t)}, \mathrm{y}^{*}\right\rangle+Q \Lambda \tau, \end{aligned}$$

where the Cauchy-Schwarz inequality was used to bound

[TeX:] $$\left\langle\mathrm{d}^{(n)} / c^{(n)}, \mathrm{y}^{*}\right\rangle \leq\left\|\mathrm{d}^{(n)} / c^{(n)}\right\|_{2}\left\|\mathrm{y}^{*}\right\|_{2} \leq \tau \Lambda.$$

Next, from the saddle point property, i.e., [TeX:] $$\mathcal{L}\left(\mathrm{Q}^{(n)}, \mathrm{y}^{*}\right) \leq \mathcal{L}\left(\mathbf{Q}^{*}, \mathbf{y}^{*}\right)=f^{*},$$ is used to further rewrite the bound as

(48)

[TeX:] $$\begin{aligned} f^{(n)}+\left\langle\sum_{q} \mathrm{~g}^{q(n)}, \mathrm{y}^{*}\right\rangle+Q \tau \Lambda &=\mathcal{L}\left(\mathbf{Q}^{(n)}, \mathrm{y}^{*}\right)+Q \tau \Lambda \\ & \leq f^{*}+Q \tau \Lambda. \end{aligned}$$

Also, it is possible to upper bound [TeX:] $$-\left\langle\mathrm{y}^{*}, \mathbf{h}^{(n)}\right\rangle$$ by substituting [TeX:] $$\mathrm{y}=2 \mathrm{y}^{*} \in \mathbb{R}^{+}$$ inside part (b) of Lemma 6,

(49)

[TeX:] $$-\left\langle\mathbf{h}^{(n)}, \mathbf{y}^{*}\right\rangle \leq \frac{\left\|\mathbf{y}^{(n)}-2 \mathbf{y}^{*}\right\|_{2}^{2}}{2 c^{(n)} / Q}+\frac{c^{(n)} Q(G+\tau)^{2}}{2}+\epsilon^{(n)} \\ \leq \frac{5 \Lambda^{2}}{2 c^{(n)} / Q}+\frac{c^{(n)} Q\left((G+\tau)^{2}\right.}{2}+\epsilon^{(n)},$$

where the last inequality was obtained due to the fact that [TeX:] $$\left\|\mathbf{y}^{(n+1)}-2 \mathbf{y}^{*}\right\|_{2}^{2} \leq\left\|\mathbf{y}^{(n)}\right\|_{2}^{2}+\left\|2 \mathbf{y}^{*}\right\|_{2}^{2} \leq 5 \Lambda^{2}$$ by Assumption 5. Therefore, by putting (48) and (49) into (47), an upper bound on [TeX:] $$f^{(n)}$$ can be obtained as

[TeX:] $$\begin{aligned} f^{(n)} \leq & f^{*}+Q \tau \Lambda+\frac{5 \Lambda^{2}}{2 n c^{(n)} / Q}+\frac{c^{(n)} Q(G+\tau)^{2}}{2} \\ &+Q\left(\beta^{(1)}(4 G+2 \tau)+\zeta\right), \end{aligned}$$

which concludes the proof.

Biography

Mykola Servetnyk

Mykola Servetnyk (S’17-M’20) received his B.Sc. degree in Electronics Engineering in 2015 and Ph.D. degree in Electronics Engineerin & Computer Science in 2020, both from National Chiao Tung University, Hsinchu, Taiwan. He also received a B.Sc. degree in 2017 in Applied Mathematics from Taras Shevchenko National University of Kyiv, Kyiv, Ukraine. He was a Visiting Research Assistant in 2019 at the University of Houston, TX, USA. He is currently working as a Software Engineer in tests for wireless communications systems. His research interests include mobile communications, distributed algorithm design, and data science.

Biography

Carrson C. Fung

Carrson C. Fung received his B.S. degree from Carnegie Mellon University, Pittsburgh, PA, USA, in 1994, M.S. degree from Columbia University, New York City, NY , USA, in 1996 and Ph.D. degree at The Hong Kong University of Science and Technology in 2005, all in Electrical Engineering. He was the Recipient of the prestigious Sir Edward Youde Ph.D. Fellowship in 2001-2002. He was the Recipient of the National Chiao Tung University College of Electrical & Computer Engineering Excellent Teaching Award in 2011, 2017, and 2018. He received the National Chiao Tung University College of Electrical & Computer Engineering Outstanding Teaching Award in 2020. He also received the National Chiao Tung University Excellent Teaching Award in 2019. He became a Fellow of the Advance Higher Education Academy in 2020. He was a Member of Technical Staff at AT&T and Lucent Technologies Bell Laboratories, Holmdel, NJ, USA, from 1994-1999, where he worked on video and audio coding. He was also a Researcher at the Hong Kong Applied Science and Technology Research Institute (ASTRI) in 2005, where he worked on MIMO-OFDM systems and a Senior DSP Engineer at Sennheiser Research Lab in Palo Alto, CA, USA, in 2006, where he worked on microphone and microphone array technologies. He is currently an Associate Professor at the National Yang Ming Chiao Tung University in Hsinchu, Taiwan. His research interests include graph signal processing, data science, and optimization.

References

1 3GPP, "TR 36.819 coordinated multi-point operation for LTE physical layer aspects", 2013.custom:[[[https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=2498]]]
2 G. Interdonato, E. Bj¨ ornson, H.Q. Ngo, P. Frenger, and E.G. Larsson, "Ubiquitous cell-free massive MIMO communications," EURASIP J. Wireless Commun. and Netw., vol. 197, pp. 1-13, 2019.doi:[[[10.1186/s13638-019-1507-0]]]
3 Qualcomm: How can comp extend 5g nr to high capacity and ultra-reliable communications? (Online). Available: https://www.qualcomm.com/media/documents/files/how-comp-canextend-5g-nr-to-high-capacity-and-ultra-reliable-communications.pdf.custom:[[[https://www.qualcomm.com/media/documents/files/how-comp-canextend-5g-nr-to-high-capacity-and-ultra-reliable-communications.pdf]]]
4 R. Irmer et al., "Coordinated multipoint: Concepts, performance, and field trial results," IEEE Commun. Mag., vol. 49, pp. 102-111, Feb. 2011.doi:[[[10.1109/MCOM.2011.5706317]]]
5 Y . Cheng, S. Drewes, A. Philipp, and M. Pesavento, "Joint network topology optimization and multicell beamforming using mixed integer programming," IEEE ITG WSA, pp. 187-192, Mar. 2012.doi:[[[10.1109/wsa.2012.6181204]]]
6 J. Zhao, T.Q.S. Quek, and Z. Lei, "Coordinated multipoint transmission with limited backhaul data transfer," IEEE Trans. Wireless Commun., vol. 12, pp. 2762-2775, Jun. 2013.doi:[[[10.1109/TWC.2013.050613.120825]]]
7 K.-G. Nguyen, Q.-D. Vu, M. Juntti, and L.-N. Tran, "Globally optimal beamforming design for downlink CoMP transmission with limited backhaul capacity," in Proc. IEEE ICASSP, Mar. 2017.doi:[[[10.1109/icassp.2017.7952837]]]
8 P. Luong, L.-N. Tran, C. Despins and G. Gagnon, "Joint beamforming and remote radio head selection in limited fronthaul C-RAN," in Proc. IEEE VTC, Sep. 2016.doi:[[[10.1109/vtcfall.2016.7881055]]]
9 Q. Shi, M. Razaviyayn, Z.-Q. Luo and C. He, "An iteratively weighted MMSE approach to distributed sum-utility maximization for a MIMO interfering broadcast channel," IEEE Trans. Signal Process., vol. 59, pp. 4331-4340, Sep. 2011.doi:[[[10.1109/TSP.2011.2147784]]]
10 S.S. Christensen, R. Agarwal, E. Carvalho, and J.M. Cioffi, "Weighted sum-rate maximization using weighted MMSE for MIMO-BC beamforming design," IEEE Trans. Wireless Commun., vol. 7, no. 12, pp. 4792-4799, Dec. 2008.doi:[[[10.1109/T-WC.2008.070851]]]
11 R. Agarwal and J.M. Cioffi, "Beamforming design for the MIMO downlink for maximizing weighted sum-rate," ISITA, pp. 1-6, Dec. 2008.doi:[[[10.1109/isita.2008.4895555]]]
12 P. Komulainen, A. Tolli, and M. Juntti, "Effective CSI signaling and decentralized beam coordination in TDD multi-cell MIMO systems," IEEE Trans. Signal Process., vol. 61, no. 9, pp. 2204-2218, May 2013.doi:[[[10.1109/TSP.2013.2247597]]]
13 P.C. Weeraddana, M. Codreanu, M. Latva-aho, and A. Ephremides, "Multicell MISO downlink weighted sum-rate maximization: A distributed approach," IEEE Trans. Signal Process., vol. 61, no. 3, pp. 556-570, Feb. 2013.doi:[[[10.1109/TSP.2012.2225060]]]
14 A. T¨ olli, H. Pennanen, and P. Komulainen, "Decentralized minimum power multi-cell beamforming with limited backhaul signaling," IEEE Trans. Wireless Commun., vol. 10, no. 2 pp. 570-580, Feb. 2011.doi:[[[10.1109/TWC.2011.120810.100363]]]
15 C. Shen, T.-H.Chang, K.-Y . Wang, Z. Qiu, and C.-Y . Chi, "Distributed robust multicell coordinated beamforming with imperfect CSI: An ADMM approach," IEEE Trans. Signal Process., no.6, vol. 60, pp. 2988-3003, Jun. 2012.doi:[[[10.1109/TSP.2012.2188719]]]
16 M. Hong, R. Sun, H. Baligh and Z.-Q. Luo, "Joint base station clustering and beamformer design for partial coordinated transmission in heterogeneous networks," IEEE J. Sel. Areas Commun., no.2, vol. 31, pp. 226-240, Feb. 2013.doi:[[[10.1109/JSAC.2013.130211]]]
17 J. Kaleva et al., "Decentralized sum rate maximization with QoS constraints for interfering broadcast channel via successive convex approximation," IEEE Trans. Signal Process., vol. 64, no. 11, pp. 2788-2802, Jun. 2016.doi:[[[10.1109/TSP.2016.2531620]]]
18 J. Kaleva et al., "Decentralized joint precoding with pilot-aided beamformer estimation," IEEE Trans. Signal Process., no.9, vol. 66, pp. 23302341, May 2018.doi:[[[10.1109/TSP.2018.2812750]]]
19 I. Atzeni, I., B. Gouda, and A. T¨ olli, "Distributed precoding design via over-the-air signaling for cell-free massive MIMO," IEEE Trans. Wireless Commun., no.2, vol. 20, pp. 1201-1216, Feb. 2021.doi:[[[10.1109/twc.2020.3031807]]]
20 R. Olfati-Saber and R.M. Murray, "Consensus problems in networks of agents with switching topology and time-delays," IEEE Trans. Automatic Control, vol. 49, no. 9, pp. 1520-1533, Sep. 2004.doi:[[[10.1109/TAC.2004.834113]]]
21 A. Simonetto and H. Jamali-Rad, "Primal recovery from consensus-based dual decomposition for distributed convex optimization," J. Optim. Theory Appl., vol. 168, no. 1, pp. 172-197, May 2015.doi:[[[10.1007/s10957-015-0758-0]]]
22 A. Falsone, K. Margellos, S. Garatti, and M. Prandini, "Dual decomposition for multi-agent distributed optimization with coupling constraints," Automatica, vol. 84, pp. 149-158, Oct. 2017.doi:[[[10.1016/j.automatica.2017.07.003]]]
23 S. Liang, L.Y . Wang, and G. Yin, "Distributed smooth convex optimization with coupled constraints," IEEE Trans. Autom. Control, vol. 65, no. 1, pp. 347-353, Jan. 2020.doi:[[[10.1109/tac.2019.2912494]]]
24 A. Falsone, I. Notarnicola, G. Notarsefano, and M. Prandini, "Tracking ADMM for distributed constraint-coupled optimization," Automatica, vol. 117, Jul. 2020.doi:[[[10.1016/j.automatica.2020.108962]]]
25 T. Sherson, R. Heusden, and W.B. Kleijn, "On the distributed method of multipliers for separable convex optimization problems," IEEE Trans. Signal Inf. Proces. Over Netw., vol. 5, no. 3, pp. 495-510, Mar. 2019.doi:[[[10.1109/tsipn.2019.2901649]]]
26 B. Johansson, T. Keviczky, M. Johansson, and K.H. Johansson, "Subgradient methods and consensus algorithms for solving convex optimization problems," in Proc. IEEE CDC, Dec. 2008.doi:[[[10.1109/cdc.2008.4739339]]]
27 J.C. Duchi, A. Agarwal, and M.J. Wainwright, "Dual averaging for distributed optimization: Convergence analysis and network scaling," IEEE Trans. Automatic Control, no.3, vol. 57, pp. 592-606, Mar. 2012.doi:[[[10.1109/TAC.2011.2161027]]]
28 C.-Y . Chang and C.C. Fung, "Sparsity enhanced mismatch model for robust spatial intercell interference cancelation in heterogeneous networks," IEEE Trans. Commun., vol. 63, no. 1, pp. 125-139, Jan. 2015.doi:[[[10.1109/TCOMM.2014.2369037]]]
29 C.-Y . Chang and C.C. Fung, "Sparsity enhanced mismatch model for robust spatial intercell interference management in heterogeneous networks with doubly-selective fading channels," IEEE Trans. Commun., vol. 63, no. 7, pp. 2671-2684, Jul. 2015.doi:[[[10.1109/TCOMM.2015.2427160]]]
30 C.C. Fung and S.-T. Tai, "Orthogonalized" sparsity enhanced mismatch models for wireless heterogeneous networks," Physical Commun., vol. 51, pp. 1-14, 2022.doi:[[[10.1016/j.phycom.2021.101520]]]
31 A. Nagate, S. Nabatame, D. Ogata, K. Hoshino and T. Fujii, "Field experiment of CoMP joint transmission over x2 interface for LTE-advanced," in Proc. IEEE VTC, Jun. 2013.doi:[[[10.1109/vtcspring.2013.6692564]]]
32 X. Yu, Q. Cui, and M. Haenggi, "Coherent joint transmission in downlink heterogeneous cellular networks," IEEE Wireless Commun. Lett., vol. 7, no. 2, pp. 274-277, Apr. 2018.doi:[[[10.1109/LWC.2017.2771517]]]
33 J. Sun, W. Chen, P. Gaal, and J. Montojo, "Non-coherent joint transmission techniques," World International Property Organization WO 2018/031872 A1, 2018.custom:[[[https://patents.google.com/patent/WO2018031872A1/en]]]
34 3GPP, TR 38.913 v14.3.9 5g: Study on scenarios and requirements for next generation access technologies, 2018.custom:[[[-]]]
35 C. Helmberg, F. Rendl, R.J. Vanderbei, and H. Wolkowicz, "An interiorpoint method for semidefinite programming," SIAM J. Optimization, vol. 6, no 2, pp. 342-361, 1996.doi:[[[10.1007/978-0-387-74759-0_292]]]
36 M. Servetnyk and C.C. Fung, "Distributed joint transmitter design and selection using augmented ADMM," in Proc. IEEE ICASSP, May 2019.doi:[[[10.1109/icassp.2019.8682789]]]
37 L. Xiao and S. Boyd, "Optimal scaling of a gradient method for distributed resource allocation," J. of Optim. Theory Appl., vol. 129, no 3, pp. 469-488, Nov. 2006.doi:[[[10.1007/s10957-006-9080-1]]]
38 S.M. Robinson, "Linear convergence of epsilon-subgradient descent methods for a class of convex functions," Mathematical Programming, vol. 86, no. 1, pp. 41-50, 1999.doi:[[[10.1007/s101070050078]]]
39 M. Servetnyk and C.C. Fung, "Precoding and selection for coordinated multipoint transmission in fronthaul-constrained cloud-RAN," IEEE Wireless Commun. Lett., vol. 9, no. 1, pp. 51-55, Jan. 2020.doi:[[[10.1109/lwc.2019.2941209]]]
40 S. Boyd, N. Parikh, E. Chu, B. Peleato and J. Eckstein, "Distributed optimization and statistical learning via the alternating direction of method of multipliers," Foundations Trends Machine Learning, vol. 3, no. 1, pp. 1-122, 2010.doi:[[[10.1561/2200000016]]]
41 L. Vandenberghe, V .R. Balakrishnan, R. Wallin, A. Hansson, and T. Roh, "Interior-point algorithms for semidefinite programming problems derived from the KYP lemma," Henrion D., Garulli A. (eds) Positive Polynomials in Control, Lecture Notes in Control and Information Science, Springer, Berlin, Heidelberg, vol. 312, pp. 195-238, Aug. 2005.doi:[[[10.1007/10997703_12]]]
42 J. Mattingley and S. Boyd, "Real-time convex optimization in signal processing," IEEE Signal Process. Mag., vol. 27, no. 3, pp. 50-61, May 2010.doi:[[[10.1109/MSP.2010.936020]]]
43 3GPP TS 38.101-1 v17.6.0, "3GPP Technical Specification Group Radio Access Network; NR; User Equipment (UE) radio transmission and reception; Part 1: Range 1 Standalone (Release 17)", 2017.custom:[[[-]]]
44 J. Hansryd and J. Edstam, "Microwave capacity evolution," Ericsson Review, vol. 1, pp. 22-27, 2011.custom:[[[-]]]
45 A. Nedic, "Subgradient methods for convex minimization," Ph.D. Thesis, Massachusetts Institute of Technology, 2002.custom:[[[-]]]
46 Q.D. Vu, K.G. Nguyen, and M. Juntti, "Weighted max-min fairness for C-RAN multicasting under limited fronthaul constraints," IEEE Trans. Commun., vol. 66, no. 4, pp. 1534-1548, Apr. 2018.doi:[[[10.1109/TCOMM.2017.2786664]]]
47 M. Grant, and S. Boyd, "CVX: Matlab software for disciplined convex programming, version 2.1," 2014.custom:[[[http://cvxr.com/cvx/citing/]]]
48 ApS, M. O. S. E. K., MOSEK optimization suite, 2017.doi:[[[10.1007/978-3-211-79280-3_73]]]

Received: January 30 2022

Revision received: June 16 2022

Accepted: July 21 2022

Published (Electronic): August 31 2022

Corresponding Author: Carrson C. Fung , c.fung@ieee.org.

Mykola Servetnyk, Institute of Electronics, National Yang Ming Chiao Tung University, Hsinchu 300, Taiwan, rusly.eic04g@nctu.edu.tw

Carrson C. Fung, Institute of Electronics, National Yang Ming Chiao Tung University, Hsinchu 300, Taiwan, c.fung@ieee.org.