Dynamic Power Optimization of Pilot and Data for Downlink OFDMA Systems

Yong Liu; Fei Liu; Guang Zhu; Xiaolei Wang; Yuechao Jiao

doi:10.23919/JCN.2021.000023

ISSN: 1976-5541

Volume 23, No 4 (2021), pp. 250 - 259

10.23919/JCN.2021.000023

Yong Liu , Fei Liu , Guang Zhu , Xiaolei Wang and Yuechao Jiao

Dynamic Power Optimization of Pilot and Data for Downlink OFDMA Systems

Abstract: In this paper, we investigate the power allocation problem in downlink orthogonal frequency-division multiple access (OFDMA) networks. Different from previous researches on power allocation, we take into account various practical factors, such as the stochastic traffic arrival, the time-varying channel, the queue stability requirements of all users, the channel estimation cost and the corresponding effect of imperfect channel state information (CSI) on data transmission rate. The power allocation problem is formulated as maximizing the time-averaged data transmission rate by optimizing pilot and data power allocation subject to the queue stability and the maximum transmit power constraints. The data transmission rate is defined in terms of the pilot transmit power, the data transmit power and the channel estimation error, which is non-concave. To solve the non-concave and stochastic optimization problem, a dynamic pilot and data power allocation (DPDPA) algorithm is proposed with the aids of approximate transformation, Lyapunov optimization and Lagrange dual formulation. Moreover, we derive the bounds of performances, in terms of the time-averaged data transmission rate and queue length.

Keywords: Channel estimation , dynamic power control , imperfect CSI , queue stability

I. Introduction

The tremendous growth of applications, such as entertainment, online shopping, video conferencing and social media, lends to strong demands for high-data rate and low latency services in future wireless mobile communication [1]. Due to its ability of supporting high data rate, the orthogonal frequencydivision multiple access (OFDMA) has been adopted as the major access scheme for the next generation. Meanwhile, appropriate algorithm design for resource allocation can further improve the data transmission rate, while reducing delay for data transmission by efficiently used the limited wireless resource. At present, there exist extensive researches to maximize the overall throughput of OFDMA-based systems by devising appropriate resource allocation policies [2]–[6]. However, these works are based a common assumption of perfect channel state information (CSI), which is impossible obtained in practice. In practice wireless communication systems, pilot symbols are usually in- serted into data stream to facilitate channel estimation. The estimation accuracy increases with the resource cost used to send pilot symbols such as power, subcarrier, and time. While excessive estimation cost necessarily leads to the reduction of communication resource for data transmission, further degrading the data transmission rate. Hence, the resource allocation problem should take into account the resource balance between pilot and data.

In addition, data traffic arrives randomly in wireless radio access network, and is stored in a separate queue corresponding to each user device in base station before transmission. The wireless radio access network is in charge of transmitting these stored data to user devices through wireless channel. The wireless channel is usually fluctuant in practice wireless networks due to various factors such as path loss and user mobility. To increase system throughput, the operator allocates resource to the users with good channel conditions. Thus, the event of queue length (i.e. delay) increase will occur, if the channel condition degrades to the event that the instantaneous transmission rate from BS is smaller than (or cannot support) data arrival rate. However, in practice application, queue length is one of the most important metrics for user perceived application performance. The longer queue lends to poor user perceived application performance, and users even quit the application or service. Thus, in practical wireless systems, the resource allocation algorithm should be designed with considering not only CSI, but also queue state information.

In recent years, many investigations on resource allocation have been performed with practical factors. The optimal power allocation and subcarrier assignment between pilot and data were proposed to maximize the system capacity in [7] and [8], respectively. In [9], the authors investigated the power optimization problem to maximize the system EE with considering the CSI estimation cost for downlink OFDMA multi-user systems. In [10], the joint power and subcarrier allocation problem between pilot and data was studied to maximize each user’s EE in multi-user OFDMA wireless networks. Note that the works [7]– [10] assume infinite backlog at the transmitter and the stationary channel conditions, while disregarding the time-varying channel conditions, the stochastic data arrivals, and the associated queue stability demands of all user devices. For matching time-varying channel conditions, stochastic data arrivals, and queue stability requirements of all user devices, many resource allocation polices have been designed with various optimization objectives, i.e., power consumption minimization [11], rate maximization [12], and energy efficiency maximization [13]. However, the authors in [11]–[13] devised dynamic resource allocation policies relying on the assumption of perfect CSI, which doesn’t occur in reality. Comparatively, the paper [14] proposed a policy to simultaneously perform optimal subband assignment and the rate allocation under imperfect feedback CSI and queue stability constraints of each user. With taking into account channel estimation error and queue stability constraints, the dynamic power and subcarrier allocation problem was investigated in [15]. Although [14], [15] took into account both imperfect CSI and queue stability demands, they fixed pilot resource without considering resource couple between pilot and data. To the best of our knowledge, the existing researches in resource allocation, based on one or more constraints: Time-varying channel, imperfect CSI, resource tradeoff between pilot and data, and queue stability, do not jointly consider these practical factors.

In this paper, we investigate the power allocation in downlink OFDMA networks with respect to the channel estimation cost and the corresponding effect of imperfect CSI on data transmission rate as well as the time-varying channel and the queue stability constraints. The data transmission rate is defined as a function of the pilot transmit power, the data transmit power and the channel estimation error. We formulate the problem as a stochastic optimization model for maximizing the timeaveraged data transmission rate under the queue stability and the maximum transmit power constraints. This problem is extremely challengeable due to the inherently non-concave definition of data transmission rate. To tackle the tough non-concave and stochastic optimization model, we propose a dynamic power allocation algorithm (DPDPA), by exploiting approximate transformation, Lyapunov optimization technique and Lagrange dual formulation. By theoretical analysis, the bounds of performances in terms of the time-averaged data transmission rate and queue length are determined.

The rest of the paper is organized as follows. In Section II, we describe the system scenario, introduce the definition of queue stability, and formulate the resource optimization problem with imperfect CSI. The DPDPA is devised to solve this optimization problem in Section III, and its performance is analyzed in Section IV. Section V presents and comments simulation results. Finally, we conclude the paper in Section VI.

II. SYSTEM MODEL

A. Description of the System Scenario

Consider a downlink OFDMA system consisting of one BS and user set [TeX:] $$\mathbf{M}=\{1,2, \cdots, M\}.$$ Denote [TeX:] $$\mathrm{g}_{i}(\tau)= \left[g_{0, i}(\tau), g_{1, i}(\tau), \cdots, g_{l, i}(\tau), \cdots, g_{L_{i}, i}(\tau)\right]$$ as the channel vector between the BS and user i, where [TeX:] $$g_{l, i}(\tau)$$ is the channel impulse response of the lth path at time-slot [TeX:] $$\tau \text { and } L_{i}+1$$ is the number of discrete paths. For simplicity, we assume that [TeX:] $$g_{l, i}(\tau)$$ is Gaussian random variable with zero mean and variance [TeX:] $$1 /\left(L_{i}+1\right),$$ and stays constant in each slot and changes independently from one time-slot to another. To obtain the CSI of all users, the BS transmits pilot symbols, from which each user estimates its own CSI by utilizing the minimum-mean squared-error (MMSE) estimator and feeds it back to BS. Pilot symbols are periodically placed in the frequency domain (shown in Fig. 1) and shared by different users for channel estimation. Let [TeX:] $$x_{i j}(\tau)$$ denote the transmit signal of user i on subcarrier j at time-slot [TeX:] $$\tau$$, which satisfies [TeX:] $$\mathbb{E}\left[\left|x_{i j}(\tau)\right|^{2}\right]=1,$$ then the corresponding received signal

Fig. 1.

System model.

is expressed as

(1)

[TeX:] $$y_{i j}(\tau)=\hat{G}_{i j}(\tau) \sqrt{P_{i j}(\tau)} x_{i j}(\tau)+\tilde{G}_{i j}(\tau) \sqrt{P_{i j}(\tau)} x_{i j}(\tau)+n_{i j}(\tau),$$

where [TeX:] $$G_{i j}(\tau)$$ is channel gain, [TeX:] $$\hat{G}_{i j}(\tau) \text { and } \tilde{G}_{i j}(\tau)$$ represent the estimation and the estimation error of [TeX:] $$G_{i j}(\tau), P_{i j}(\tau)$$ is data transmit power, and [TeX:] $$n_{i j}(\tau)$$ is zero mean Gaussian noise with variance [TeX:] $$\mathbb{E}\left[\left|n_{i j}(\tau)\right|^{2}\right]=N_{0}.$$ According to MMSE estimator, [TeX:] $$\hat{G}_{i j}(\tau) \text { and } \tilde{G}_{i j}(\tau)$$ are uncorrelated mean-zero Gaussian random vectors with variances [TeX:] $$\sigma_{\hat{G}_{i j}}^{2}(\tau)=\frac{\gamma_{t}(\tau)}{1+\gamma_{t}(\tau)}$$ and [TeX:] $$\sigma_{\tilde{G}_{i y}}^{2}(\tau)=\frac{1}{1+\gamma_{t}(\tau)},$$ respectively, where [TeX:] $$\gamma_{t}(\tau)=\frac{P_{t}(\tau)}{\left(L_{i}+1\right) N_{0}}$$ is the signal-to-noise ratio (SNR) for pilot transmission, and [TeX:] $$P_{t}(\tau)$$ is the pilot power which uniformly spreads in frequency domain. The data transmission rate in terms of estimation error is given by [16]:

(2)

[TeX:] $$\begin{aligned} R_{i j}(\tau) &=B_{0} \log _{2}\left(1+\frac{P_{i j}(\tau)\left|\hat{G}_{i j}(\tau)\right|^{2}}{P_{i j}(\tau) \sigma_{\tilde{G}_{i j}(\tau)}^{2}+N_{0}}\right) \\ &=B_{0} \log _{2}\left(1+\frac{P_{i j}(\tau) \sigma_{\hat{G}_{i j}(\tau)}^{2}\left|\nu_{i j}(\tau)\right|^{2}}{P_{i j}(\tau) \sigma_{\bar{G}_{i j}(\tau)}^{2}+N_{0}}\right), \end{aligned}$$

where [TeX:] $$\nu_{i j}(\tau)$$ is a Gaussian random variable with zero mean and unit variance. The second equality is true due to the fact that an arbitrary random variable, which follows a Gaussian distribution with zero-mean, can be expressed as the product of its standard deviation and a standard Gaussian random variable.

B. Definitions of Stability

A separate queue corresponding to each user is maintained at BS, which temporarily stores the arrived data before transmission, as shown in Fig. 1. We use the vector [TeX:] $$\mathbf{A}(\tau)=\left[A_{i}(\tau) ; i \in\right. \mathbf{M}]$$ denoting the arrival process of queue, where [TeX:] $$A_{i}(\tau)$$ is the amount of new data that enters the queue of user i during timeslot [TeX:] $$\mathcal{T}.$$ Assume that [TeX:] $$\mathbf{A}(\tau)$$ is independently and identically distributed (i.i.d.) over time-slots with mean [TeX:] $$\boldsymbol{\lambda}=\left[\lambda_{i} ; i \in \mathbf{M}\right].$$ If user i occupies data subcarrier set [TeX:] $$\Omega_{i}$$ for transmission, the data transmission rate [TeX:] $$S_{i}(\tau)$$ from queue i to its corresponding user is expressed as

(3)

[TeX:] $$S_{i}(\tau)=\sum_{j \in \Omega_{i}} R_{i j}(\tau).$$

Let [TeX:] $$\mathbf{Q}(\tau)=\left[Q_{i}(\tau) ; i \in \mathbf{M}\right]$$ be the current queue length, where [TeX:] $$Q_{i}(\tau)$$ represents the number of data stored in queue i during time-slot [TeX:] $$\tau.$$ Based on the arrival rate [TeX:] $$A_{i}(\tau)$$ and the departure rate [TeX:] $$S_{i}(\tau),$$ the dynamic of queue length [TeX:] $$Q_{i}(\tau)$$ can be modeled as

(4)

[TeX:] $$Q_{i}(\tau+1)=\max \left[Q_{i}(\tau)-S_{i}(\tau), 0\right]+A_{i}(\tau)$$

If the queue length satisfies

(5)

[TeX:] $$\lim _{\tau \rightarrow \infty} \frac{\mathbb{E}\left[Q_{i}(\tau)\right]}{\tau}=0,$$

it is mean rate stable [19]. That is, the mean rate stability of the queue is guaranteed if its length is finite. In the following discussion, we will use the term “stable" to refer to the mean rate stable.

C. Problem Formulation

Our objective is to maximize the long-term data transmission rate of the system by optimizing pilot and data transmit power subject to the queue stability and the total transmit power constraints. We define the long-term data transmission rate of the system [TeX:] $$R_{\text {tot }}$$ as

(6)

[TeX:] $$R_{t o t}=\lim _{T \rightarrow \infty} \frac{1}{T} \sum_{\tau=0}^{T-1} \mathbb{E}\left[\sum_{i=1}^{M} S_{i}(\tau)\right].$$

The overall transmit power [TeX:] $$P_{\text {tot }}(\tau)$$ during time-slot [TeX:] $$\tau$$ consists of two parts: data power and pilot power. Mathematically,

(7)

[TeX:] $$P_{t o t}(\tau)=P_{t}(\tau)+\sum_{i=1}^{M} \sum_{j \in \Omega_{i}} P_{i j}(\tau).$$

Then the resource optimization problem is formulated as the following stochastic optimization model.

(8)

[TeX:] $$\text { P1: } \max _{P_{t}(\tau), P_{i j}(\tau)} \lim _{T \rightarrow \infty} \frac{1}{T} \sum_{\tau=0}^{T-1} \mathbb{E}\left[\sum_{i=1}^{M} S_{i}(\tau)\right]$$

(9)

[TeX:] $$\text { P1: } \max _{P_{t}(\tau), P_{i j}(\tau)} \lim _{T \rightarrow \infty} \frac{1}{T} \sum_{\tau=0}^{T-1} \mathbb{E}\left[\sum_{i=1}^{M} S_{i}(\tau)\right],$$

(10)

[TeX:] $$P_{t o t}(\tau) \leq P_{\max }, \forall \tau,$$

(11)

[TeX:] $$P_{i j}(\tau) \geq 0, \quad \forall i, j, \tau,$$

(12)

[TeX:] $$P_{t}(\tau) \geq 0, \quad \forall \tau,$$

Queue stability constraint (9) guarantees all arrived data leaving the queues in a finite time. Constraint (10) limits the maximum transmit power at BS. Constraints (11) and (12) ensure that the instantaneous powers for pilot and data are nonnegative.

Note that the objective function is non-concave with respect to [TeX:] $$P_{t}(\tau) \text { and } P_{i j}(\tau)$$ (the proof as shown in Appendix A). Hence, P1 is the non-convex and stochastic optimization problem, which generally has no feasible solution due to its NPhardness. Additionally, the high level of correlation among the time-averaged constraint (i.e., (9) and the instantaneous constraints (i.e., (10)–(12)) further increases the complexity of the problem. However, in the next section, we shall develop an effective data and pilot power allocation algorithm to solve abovementioned difficulties in problem P1.

III. DYNAMIC JOINT RESOURCE ALLOCATION

In this section, we solve problem P1 by three steps. Firstly, an approximate transformation is implemented for problem P1. Next, using the Lyapunov optimization technology, the approximate transformed problem is converted into a series of instantaneous optimization problems. Finally, an effective pilot and data power allocation algorithm is proposed to solve these instantaneous optimization problems based on the Lagrange dual formulation.

A. Approximate Transformation

Due to the non-concavity of objective function (8), problem P1 falls into the scope of non-convex stochastic optimization programming, whose solution is difficult to find. However, it is clear that the optimal power allocation of problem P1 satisfies [TeX:] $$P_{t o t}(\tau)=P_{\max },$$ since the objective function increases with data power [TeX:] $$P_{i j}(\tau)$$ and pilot power [TeX:] $$P_{t}(\tau).$$ In realistic cellular system, the maximum transmit power of BS should guarantee that all users in the system can receive the pilots with high SNR [TeX:] $$\left(P_{t} \gg\left(L_{i}+1\right) N_{0}\right)$$ [9], [17]. Then, problem P1 can be approximately converted into

(13)

[TeX:] $$\begin{aligned} \mathbf{P} 2: & \max _{P_{t}(\tau), P_{i j}(\tau)} \dot{R}_{t o t}=\lim _{T \rightarrow \infty} \frac{1}{T} \sum_{\tau=0}^{T-1} \mathbb{E}\left[\sum_{i=1}^{M} \dot{S}_{i}(\tau)\right] \\ \text { s.t. } \quad(9),(10),(11),(12), \end{aligned}$$

where

(14)

[TeX:] $$\begin{aligned} \dot{S}_{i}(\tau) &=\sum_{j \in \Omega_{i}} \dot{R}_{i j}(\tau) \\ &=\sum_{j \in \Omega_{\imath}} B_{0} \log _{2}\left(1+\frac{P_{i j}(\tau) P_{t}(\tau)\left|\nu_{i j}(\tau)\right|^{2}}{\left(\left(L_{i}+1\right) P_{i j}(\tau)+P_{t}(\tau)\right) N_{0}}\right) \end{aligned}.$$

In Appendix C, we prove that [TeX:] $$\dot{R}_{i j}(\tau)$$ is a jointly concave function of [TeX:] $$P_{t}(\tau) \text { and } P_{i j}(\tau).$$ Hereto, by approximate transformation, the original problem is converted into a convex stochastic optimization problem P2.

B. Lyapunov Optimization

In this subsection, the Lyapunov optimization method is employed to solve problem P2, because it allows us to consider the joint problem of stabilizing queues and optimizing timeaveraged throughput. By using the Lyapunov optimization technology, problem P2 is converted into a series of instantaneous optimization problems. Then, these instantaneous problem are solved in its corresponding time-slot.

According to Lyapunov method [18]–[20], the transformation process for problem P2 is detailed in this subsection. To this end, we first define the Lyapunov function as

(15)

[TeX:] $$L(\mathbf{Q}(\tau))=\frac{1}{2} \sum_{i=1}^{M} Q_{i}(\tau)^{2}.$$

Then, the conditional Lyapunov drift [TeX:] $$\Delta(\mathbf{Q}(\mathbf{t}))$$ is given by

(16)

[TeX:] $$\Delta(\mathbf{Q}(\tau))=\mathbb{E}\{L(\mathbf{Q}(\tau+1))-L(\mathbf{Q}(\tau)) \mid \mathbf{Q}(\tau)\}.$$

In [19], it has proved that the data and pilot power allocation implemented in each time-slot to minimizing the Lyapunov drift can always pushed towards a lower queue length (i.e. all the data queues’ stability can be maintained). Although greedily minimizing queue length ensures the queue stability constraints (9) can be satisfied, but it may cause the objective of problem is deviated. In order to totally solve problem P2, instead of greedily minimizing the Lyapunov drift only, drift-plus-penalty method is employed to minimize the drift-plus-penalty expression [20], which is defined as following.

(17)

[TeX:] $$\Delta(\mathbf{Q}(\tau))-V \sum_{i=1}^{M} \mathbb{E}\left\{\dot{S}_{i}(\tau)\right\},$$

where V is a nonnegative control parameter characterizing the tradeoff between the system throughput and delay. A greater value of V indicates a higher priority allocation to optimize objective of problem P2 at the expense of greater queue lengths, and vice versa. The upper bound of drift-plus-penalty term is given in the following lemma.

Lemma 1: In every time-slot [TeX:] $$\tau,$$ for all possible values of [TeX:] $$\mathrm{Q}(\tau)$$ and V , as well as any pilot and data transmission power allocation, the upper bound of “drift-plus-penalty" is determined by

(18)

[TeX:] $$\begin{aligned} &\Delta(\mathbf{Q}(\tau))-V \mathbb{E}\left\{\sum_{i=1}^{M} \dot{S}_{i}(\tau) \mid \mathbf{Q}(\tau)\right\} \\ &\leq C_{1}-\mathbb{E}\left\{\sum_{i=1}^{M}\left(Q_{i}(\tau)+V\right) \dot{S}_{i}(\tau)-Q_{i}(\tau) A_{i}(\tau) \mid \mathbf{Q}(\tau)\right\} \end{aligned},$$

where [TeX:] $$C_{1}$$ is a positive constant that satisfies the following inequality

(19)

[TeX:] $$C_{1} \geq \frac{1}{2} \mathbb{E}\left\{\sum_{i=1}^{M} \dot{S}_{i}(\tau)^{2}+A_{i}(\tau)^{2} \mid \mathbf{Q}(\tau)\right\}.$$

Proof: See Appendix B.

Based on the design principle Lyapnov optimization [19], [20], the stochastic optimization problem can be solved by minimizing the upper bound of its “drift-plus-penalty" term subject to the same constrains except the queue stability ones. Therefore, the design goal of joint pilot and data power allocation strategy is transformed to adaptively minimizing the right hand side (RHS) of (18) in each time-slot given [TeX:] $$\mathrm{Q}(\tau)$$ subject to constraints (10)–(12). We can remove the [TeX:] $$Q_{i}(\tau) A_{i}(\tau)$$ term from the RHS of (18) since [TeX:] $$Q_{i}(\tau) \text { and } A_{i}(\tau)$$ are observable in each time slot. Furthermore, the expectation operation can be also removed because minimizing [TeX:] $$f(\tau)$$ ensures that [TeX:] $$\mathbb{E}\{f(\tau) \mid \mathbf{Q}(\tau)\}$$ is minimized based on the principle of opportunistically minimizing an expectation [19]. Thus, problem P2 can be reformulated as:

(20)

[TeX:] $$\begin{gathered} \mathrm{P} 3: \max _{P_{t}(\tau), P_{i j}(\tau)} U_{1}=\sum_{i=1}^{M} \sum_{j \in \Omega_{i}}\left(Q_{i}(\tau)+V\right) \dot{R}_{i j}(\tau) \\ \text { s.t. }(10),(11),(12). \end{gathered}$$

Hereto, by utilizing Lyapunov optimization technology, problem P2 is transformed into a series of optimization problems P3, each of which is solved in each time-slot.

C. Pilot and Data Power Allocation Algorithm

In this subsection, an efficient pilot and data transmission power allocation algorithm is designed to solve the transformed instantaneous optimization problem P3. It is proved in Appendix C, [TeX:] $$\dot{R}_{i j}(\tau)$$ is a jointly concave function of [TeX:] $$P_{t}(\tau)$$ and [TeX:] $$P_{i j}(\tau),$$ based on which the objective function of problem P3 is also concave from the composition rule that preserves concavity [21]. Furthermore, the constraints (10)–(12) are linear. Thus, P3 is a convex optimization problem. To solve this convex optimization problem, we first define the Lagrangian function for problem P3 as

(21)

[TeX:] $$\begin{aligned} J\left(P_{t}(\tau), P_{i j}(\tau), \mu\right)=& \sum_{i=1}^{M} \sum_{j \in \Omega_{i}}\left(Q_{i}(\tau)+V\right) \dot{R}_{i j}(\tau) \\ &-\mu \sum_{i=1}^{M} \sum_{j \in \Omega_{i}} P_{i j}(\tau)-\mu P_{t}+\mu P_{\max }, \end{aligned}$$

where [TeX:] $$\mu$$ is the Lagrange multiplier for constraint (10). Then the dual problem is formulated as

(22)

[TeX:] $$\min _{\mu>0} \max _{P_{t}(\tau) \geq 0, P_{i j}(\tau) \geq 0} J\left(P_{t}(\tau), P_{i j}(\tau), \mu\right) .$$

The difference between the optimal objective of problem P3 and that of its dual problem (22) is referred as the duality gap. According to the convex optimization theory [21], the duality gap reduces to zero at the optimum if problem P3 is convex. Hence, problem P3 can be equivalently transformed into the dual problem (22).

The optimal solution of dual problem (22) can be found by optimizing variables [TeX:] $$P_{t}(\tau), P_{i j}(\tau) \text { and } \mu$$ alternately. First, fixed Lagrange multiplier [TeX:] $$\mu$$ and pilot power [TeX:] $$P_{t}(\tau),$$ we derive optimal data power [TeX:] $$P_{i j}^{*}(\tau)$$ (shown in (23)) by using standard convex optimization technique. Next, with the solution and the fixed Lagrange multiplier [TeX:] $$\mu$$ of the first step, the optimal [TeX:] $$P_{t}^{*}(\tau)$$ is obtained to maximize the Lagrange function [TeX:] $$J\left(P_{t}(\tau), P_{i j}(\tau), \mu\right).$$ Specifically, if [TeX:] $$\frac{\partial J\left(P_{t}(\tau), P_{i j}^{*}(\tau), \mu\right)}{\partial P_{t}(\tau)} \mid P_{t}(\tau)=0<0,$$ the optimal pilot power [TeX:] $$P_{t}^{*}(\tau)=0 ; \text { if } \frac{\partial J\left(P_{t}(\tau), P_{i j}^{*}(\tau), \mu\right)}{\partial P_{t}(\tau)} \mid P_{t}(\tau)=P_{\max }>0,$$ otherwise, the [TeX:] $$P_{t}^{*}(\tau)$$ is found by solving [TeX:] $$\frac{\partial J\left(P_{t}(\tau), P_{i j}^{*}(\tau), \mu\right)}{\partial P_{t}(\tau)}=0,$$ i.e., (24), with the bisection method. Finally, the Lagrange multiplier is updated by using subgradient method, i.e.,

(25)

[TeX:] $$\mu^{(\kappa+1)}=\left[\mu^{(\kappa)}-\delta\left(P_{\max }-\sum_{i=1}^{M} \sum_{j \in \Omega_{i}} P_{i j}^{*}(\tau)-P_{t}^{*}(\tau)\right)\right]^{+},$$

where [TeX:] $$\delta$$ is the positive step size and is the iteration index. The iteration procedure will not be stopped until the results converge.

(23)

[TeX:] $$\begin{aligned} P_{i j}^{*}(\tau) \\ =\left\{\begin{array}{l} 0, z_{i j}(\tau) \leq 0 ; \\ \frac{P_{t}(\tau)\left[B_{0} n_{0}\left(L_{i}+1\right)+\alpha_{i j}(\tau)\right]}{2\left(L_{i}+1\right) \alpha_{1 j}(\tau)}\left(-1+\sqrt{1+4 z_{i j}(\tau)\left(L_{i}+1\right) \frac{\alpha_{i j}(\tau)}{\left[B_{0} N_{0}\left(L_{i}+1\right)+\alpha_{i j}(\tau)\right]^{2}}}\right), \\ \text { otherwise } \end{array}\right. \\ \text { where } \mathrm{z}_{\mathrm{ij}}(\tau)=\frac{\left|\nu_{\mathrm{ij}}(\tau)\right|{ }^{2} \mathrm{~B}_{0}\left(\mathrm{Q}_{\mathrm{i}}(\tau)+\mathrm{V}\right) \log _{2} \mathrm{e}}{\mu}-\mathrm{N}_{0}, \quad \alpha_{\mathrm{ij}}(\tau)=\mathrm{N}_{0}\left(\mathrm{~L}_{\mathrm{i}}+1\right)+\left|\nu_{\mathrm{ij}}(\tau)\right|^{2} \mathrm{P}_{\mathrm{t}}(\tau). \end{aligned}$$

(24)

[TeX:] $$\sum_{i=1}^{M} \sum_{j \in \Omega_{i}} \frac{\left(Q_{i}(\tau)+V\right) B_{0}\left|\nu_{d i j}(\tau)\right|^{2} \log _{2} e}{P_{i j}^{*}(\tau) P_{t}(\tau)\left|\nu_{i j}(\tau)\right|^{2}+\left[\left(L_{i}+1\right) P_{i j}^{*}(\tau)+P_{t}(\tau)\right] N_{0}} \frac{\left(L_{i}+1\right)\left(P_{i j}^{*}(\tau)\right)^{2}}{\left(L_{i}+1\right) P_{i j}^{*}(\tau)+P_{t}(\tau)}=\mu$$

Based on all above discussion, the algorithm DPDPA is outlined in Algorithm 1.

Given a time-slot, the computational complexity of Algorithm 1 is mainly dominated by the iterative process designed to achieve the global optimum of P3. In each iteration, the optimal data transmit power [TeX:] $$P_{i j}(\tau)$$ is determined according to (23) and pilot transmission power [TeX:] $$P_{t}(\tau)$$ is obtained by the bisection method. The computational complexity of the bisection method is [TeX:] $$O\left(\left\lceil\log _{2}(1 / \varepsilon)\right\rceil\right),$$ where [TeX:] $$\varepsilon$$ is the required relative accuracy, and [TeX:] $$\lceil x\rceil$$ represents the smallest integer bigger than or equal to x. Furthermore, the subgradient method converges to the desired state after [TeX:] $$O\left(1 / \xi^{2}\right),$$ where [TeX:] $$\xi$$ is the maximum tolerance deviation from the optimal value [TeX:] $$\mu^{*}.$$ Therefore the total complexity of Algorithm 1 is [TeX:] $$O\left(\frac{N+\log _{2}(1 / \varepsilon)}{\xi^{2}}\right)$$ in each time-slot.

IV. PERFORMANCE ANALYSIS

In this section, we mathematically analyze the performance of the proposed DPDPA. A necessary and practical boundedness assumption is given first, followed by discussing the performance bounds of DPDPA.

With any pilot and data power allocation police, the total data transmission rate [TeX:] $$\sum_{i}^{M} \dot{S}_{i}(\tau)$$ is assumed to guarantee the following bound:

(26)

[TeX:] $$S_{t o t}^{\min } \leqslant \mathbb{E}\left[\sum_{i}^{M} \dot{S}_{i}(\tau)\right] \leqslant \dot{S}_{t o t}^{\max },$$

where [TeX:] $$S_{\text {tot }}^{\text {min }} \text { and } \dot{S}_{\text {tot }}^{\max }$$ are nonnegative constant. The assumption is very reasonable, since departure rate is bounded from above and below with limited resource in realistic systems [22]. With the help of the assumption (26) and Lemma 1, the performance of our proposed DPDPA is revealed in the following theorem.

Theorem 1: If [TeX:] $$\lambda$$ is strictly interior to the network capacity region [TeX:] $$\Lambda$$¹ and [TeX:] $$\mathbb{E}\{L(\mathbf{Q}(\mathbf{0}))\}\infty,$$ the proposed DPDPA with

¹ Define the network capacity region [TeX:] $$\Lambda$$ as all the set of data arrival rate that can be stably supported by the network, considering all possible pilot and data power allocation decisions. In other words, there at lest exists a policy that stabilizes the network under this arrival rate [23].

Algorithm 1

Dynamic pilot and data power allocation algorithm (DPDPA)

any V > 0 provides the following performance:

(a) All queues are stable in BS.

(b) The time-averaged data rate of the system is lower bounded by

(27)

[TeX:] $$\dot{S}_{t o t} \geq \dot{S}_{\text {tot }}^{\text {opt }}-\frac{C_{1}}{V}.$$

(28)

[TeX:] $$\lim _{T \rightarrow \infty} \frac{1}{T} \sum_{\tau=0}^{T-1} \sum_{i=1}^{M} \mathbb{E}\left\{Q_{i}(\tau)\right\} \leq \frac{C_{1}+V\left(\dot{S}_{\text {tot }}^{\max }-\dot{S}_{\text {tot }}^{o p t}\right)}{\vartheta}.$$

Here, [TeX:] $$\dot{S}_{\text {tot }}^{o p t}$$ represents the maximum time-averaged rate incurred over all possible pilot and data power control decisions for solving problem P2.

Proof: See Appendix D.

Since the average delay is proportional to the average queue length from Little’s Theorem [24], we can depict the average delay by the average queue length. The bound of average queue length increases with V according to (28). Meanwhile, from (27), if V is sufficiently large, which results in the arbitrarily small [TeX:] $$C_{1} / V, \dot{S}_{t o t}$$ arbitrarily approaches [TeX:] $$\dot{S}_{\text {tot }}^{o p t}.$$ As above analysis, it reveals that an increasing V results in an improvement for [TeX:] $$\dot{S}_{t o t}$$ and as well an increasing delay.

V. SIMULATION RESULTS

In this section, simulation results are provided to illustrate the performance of the proposed DPDPA algorithm.We consider an example of the network consists of one BS, three users, three pilot subcarriers, and six data subcarriers that are equally allocated to all users. The number of resolvable paths for user i, [TeX:] $$L_{i},$$ is set to 2. The data arrival amount to the queue i during each timeslot follows the poisson distribution with the mean [TeX:] $$\lambda_{i},$$ which is assumed to the same for all users. For simplicity, the noise power is normalized to one. We set [TeX:] $$P_{\max }=100 \text { and } T=2000$$ time-slots, that is, each point of the following curves is averaged over 2000 runs.

Fig. 2 shows the queue length against the time-solt with V = 40. As it can be observed from Fig. 2, the queue length of each user is determinately upper bounded by 100. This demonstrates all queues are stabilized in the system, which matches the theoretical analysis in Theorem 1(a).

Fig. 3 evaluates the impact of control parameter V on averaged data rate and queue length. In Fig. 3, with an increasing V , we observe that the average data rate and average queue length are slightly increased. This is due to the fact that an increasing V leads to an ever-increasing concern on maximizing objective instead of minimizing queue length. Therefore, the average data rate and queue length increase with the growth of V . Based on above analysis, by setting appropriate control parameter V , the network can operate in a predefined state. Specifically, if the system pursues a high time-averaged data rate, a large V is required. On the contrary, if the system prefers to a stricter delay, then a smaller V is desired.

Fig. 4 shows the network performance of DPDPA with respect to different [TeX:] $$\lambda_{i}.$$ We observe that average data rate and queue length grow with an increasing average arrival data rate [TeX:] $$\lambda_{i}.$$ The reason for this phenomena is that a larger [TeX:] $$\lambda_{i}$$ requires higher data rate to ensure a finite queue length, i.e., guarantee queue stability. Additionally, it is obvious that once the arrival data rate [TeX:] $$\lambda_{i}$$

Fig. 2.

Queue length vs. time-slot with [TeX:] $$V=40, \lambda_{i}=4$$

Fig. 3.

Network performance of DPDPA vs. V with [TeX:] $$\lambda_{i}=4.$$

varies, then the time data rate changes as well, which means the DPDPA is adapted to the average arrival data rate [TeX:] $$\lambda_{i}.$$

Next, we evaluate the performance of the proposed algorithm by comparing it with the following algorithms: 1) A police in [26] that maximizes the time-averaged data rate with perfect CSI and queue stability constraint (labelled “perfect CSI"); 2) a police that maximizes the time-averaged data rate with fixed pilot power [TeX:] $$P_{\text {cont }}$$ and queue stability constraint (labelled [TeX:] $$" P_{t}=P_{\text {cont }} \text { ); }$$ as shown in Fig. 5, the algorithm “perfect CSI" achieves the best performance among the three policies, i.e. the highest average data rate and the shortest queue length. This is because no power is required to transmit pilot symbols for CSI estimation. Additionally, whatever the value of [TeX:] $$P_{t}$$ is set in

Fig. 4.

Network performance of DPDPA vs. [TeX:] $$\lambda_{i} \text { with } V=0 \text {. }$$

Algorithm 2

The power allocation algorithm for problem P1

algorithm [TeX:] $$" P_{t}=P_{\text {cont }}^{\prime \prime} \text {, }$$ our proposed algorithm has much better performance, i.e., higher average data rate and shorter queue length, since the algorithm [TeX:] $${ }^{"} P_{t}=P_{\text {cont }} "$$ only optimize data transmit power without taking into account channel estimation cost [TeX:] $$P_{t}.$$

Our algorithm is proposed based on how to solve problem P2 which is an approximation of the original problem P1. To demonstrate the rationality of the approximate transform method, we compare the performance of Algorithm 1 with that of the algorithm proposed for P1. Similarly, according to the formulation of problem P1, the Lyapunov optimization method is employed to tackle it. The solving process for P1 is described in Algorithm 2.

Due to the non-concavity of objective function (29), problem P4 falls into the scope of nonconvex programming, whose solution is difficult to find. We employ exhaustive searching to obtain optimal power allocation for problem P4. Fig. 6 shows the average data rate versus V under Algorithm 1 and Algorithm 2. The curve with the square marks represents the performance of Algorithm 1 while the curve with the star marks denotes the performance of Algorithm 2. It is obvious that the performance obtained by Algorithm 1 almost overlaps with that obtained by Algorithm 2.

Fig. 5.

Network performance comparison with [TeX:] $$\lambda_{i}=4.$$

VI. CONCLUSIONS

In this paper, we have investigated the dynamic power allocation problem in downlink OFDMA networks while considering the estimation cost, the time varying channel, and the queue stability constraints of all users. The DPDPA has been developed to solve the problem by adopting approximate transformation, Lyapunov optimization and Lagrange dual formulation. The bounds of its performances, in terms of time-averaged data transmission rate and queue length, were derived, which are increasing functions of the control parameter V .

APPENDIX A

Fig. 6.

Verification for the performance of approximate transform method with [TeX:] $$\lambda_{i}=4.$$

PROOF OF CONCAVITY OF [TeX:] $$R_{I J}(\tau)$$

Proof : The second-order partial derivative of [TeX:] $$R_{i j}(\tau)$$ with respect to [TeX:] $$P_{i j}(\tau)$$ is

(30)

[TeX:] $$\begin{aligned} \frac{\mathrm{d}^{2} R_{i j}(\tau)}{\mathrm{d} P_{i j}(\tau)^{2}} =-B_{0} \frac{\log _{2} e}{\left(1+\eta_{i j}(\tau)\right)^{2}} \frac{\mathrm{d} \eta_{i j}(\tau)}{\mathrm{d} P_{i j}(\tau)} \frac{\mathrm{d} \eta_{i j}(\tau)}{\mathrm{d} P_{i j}(\tau)} \\ +B_{0} \frac{\log _{2} e}{1+\eta_{i j}(\tau)} \frac{\mathrm{d}^{2} \eta_{i j}(\tau)}{\mathrm{d} P_{i j}(\tau)^{2}}, \end{aligned}$$

where

(31)

[TeX:] $$\begin{aligned} \eta_{i j}(\tau) \\ =\frac{\left|\nu_{i j}(\tau)\right|^{2}}{N_{0}} \frac{P_{t}(\tau) P_{i j}(\tau)}{P_{i j}(\tau)\left(L_{i}+1\right)+\left(L_{i}+1\right) N_{0}+P_{t}(\tau)}, \end{aligned}$$

(32)

[TeX:] $$\begin{aligned} \frac{\mathrm{d} \eta_{i j}(\tau)}{\mathrm{d} P_{i j}(\tau)} \\ =\frac{\left|\nu_{i j}(\tau)\right|^{2}}{N_{0}} \frac{P_{t}(\tau)\left[\left(L_{i}+1\right) N_{0}+P_{t}(\tau)\right]}{\left[P_{i j}(\tau)\left(L_{i}+1\right)+\left(L_{i}+1\right) N_{0}+P_{t}(\tau)\right]^{2}}, \end{aligned}$$

and

(33)

[TeX:] $$\begin{aligned} \frac{\mathrm{d}^{2} \eta_{i j}(\tau)}{\mathrm{d} P_{i j}(\tau)^{2}} \\ =\frac{-2\left|\nu_{i j}(\tau)\right|^{2}}{N_{0}} \frac{\left(L_{i}+1\right) P_{t}(\tau)\left[\left(L_{i}+1\right) N_{0}+P_{t}(\tau)\right]}{\left[P_{i j}(\tau)\left(L_{i}+1\right)+\left(L_{i}+1\right) N_{0}+P_{t}(\tau)\right]^{3}}. \end{aligned}$$

We can easily derive [TeX:] $$\frac{\mathrm{d}^{2} R_{i j}(\tau)}{\mathrm{d} P_{i j}(\tau)^{2}}0,$$ which indicates that [TeX:] $$R_{i j}(\tau)$$ is a concave function with respect to [TeX:] $$P_{i j}(\tau).$$ Similarly, we have [TeX:] $$\frac{\mathrm{d}^{2} R_{i j}(\tau)}{\mathrm{d} P_{t}(\tau)^{2}}0.$$ Therefore [TeX:] $$R_{i j}(\tau)$$ is a concave function of [TeX:] $$P_{t}(\tau).$$ Additionally, the Hessian matrix of [TeX:] $$R_{i j}(\tau)$$ is expressed as

(34)

[TeX:] $$\nabla^{2} R_{i j}(\tau)=\left(\begin{array}{cc} \frac{\mathrm{d}^{2} R_{i j}(\tau)}{\mathrm{d} P_{t}(\tau)^{2}} \frac{\mathrm{d}^{2} R_{i_{j}}(\tau)}{\mathrm{d} P_{t}(\tau) \mathrm{d} P_{i j}(\tau)} \\ \frac{\mathrm{d}^{2} R_{i j}(\tau)}{\mathrm{d} P_{i \jmath}(\tau) \mathrm{d} P_{t}(\tau)} \frac{\mathrm{d}^{2} R_{i j}(\tau)}{\mathrm{d} P_{\imath j}(\tau)^{2}} \end{array}\right).$$

It is no difficult to validate that [TeX:] $$\nabla^{2} R_{i j}(\tau)$$ is not always negative semi-definite for any [TeX:] $$P_{t}(\tau)>0 \text { and } P_{i j}(\tau)>0.$$ Then we prove that [TeX:] $$R_{i j}(\tau)$$ is not a concave function with both [TeX:] $$P_{t}(\tau)$$ and [TeX:] $$P_{i j}(\tau).$$

APPENDIX B PROOF OF LEMMA 1

From (4) and (15), we get

(35)

[TeX:] $$\begin{aligned} L(\mathbf{Q}(\tau+1))=\frac{1}{2} \sum_{i=1}^{M}\left\{\max \left[Q_{i}(\tau)-\dot{S}_{i}(\tau), 0\right]+A_{i}(\tau)\right\}^{2} \\ \leq \frac{1}{2} \sum_{i=1}^{M}\left\{Q_{i}^{2}(\tau)+\dot{S}_{i}^{2}(\tau)+A_{i}^{2}(\tau)+2 Q_{i}(\tau)\left[A_{i}(\tau)-\dot{S}_{i}(\tau)\right]\right\} . \end{aligned}$$

Substitute (35) into (16), [TeX:] $$\Delta(\mathrm{Q}(\tau))$$ is expressed as

(36)

[TeX:] $$\begin{aligned} \Delta(\mathbf{Q}(\tau)) \leq \frac{1}{2} \mathbb{E}\left\{\sum_{i=1}^{M}\left[\dot{S}_{i}^{2}(\tau)+A_{i}^{2}(\tau)\right] \mid \mathbf{Q}(\tau)\right\} \\ +\mathbb{E}\left\{\sum_{i=1}^{M} Q_{i}(\tau)\left[A_{i}(\tau)-\dot{S}_{i}(\tau)\right] \mid \mathbf{Q}(\tau)\right\} \\ \leq C_{1}-\mathbb{E}\left\{\sum_{i=1}^{M} Q_{i}(\tau)\left[\dot{S}_{i}(\tau)-A_{i}(\tau)\right] \mid \mathbf{Q}(\tau)\right\}, \end{aligned}$$

where [TeX:] $$C_{1}$$ is a positive constant that satisfies the following inequality

[TeX:] $$C_{1} \geq \frac{1}{2} \mathbb{E}\left\{\sum_{i=1}^{M} \dot{S}_{i}(\tau)^{2}+A_{i}(\tau)^{2} \mid \mathbf{Q}(\tau)\right\}.$$

Subtracting [TeX:] $$V \mathbb{E}\left\{\sum_{i=1}^{M} \dot{S}_{i}(\tau) \mid \mathbf{Q}(\tau)\right\}$$ from both sides of inequality (36), we obtain Lemma1.

APPENDIX C PROOF OF CONCAVITY OF [TeX:] $$\dot{R}_{I J}(\tau)$$

The Hessian matrix of [TeX:] $$\dot{R}_{i j}(\tau)$$ is given by

(37)

[TeX:] $$\nabla^{2} \hat{R}_{i j}(\tau)=\left(\begin{array}{cc} \frac{\mathrm{d}^{2} \dot{R}_{i j}(\tau)}{\mathrm{d} P_{t}(\tau)^{2}} \frac{\mathrm{d}^{2} \dot{R}_{i j}(\tau)}{\mathrm{d} P_{t}(\tau) \mathrm{d} P_{i j}(\tau)} \\ \frac{\mathrm{d}^{2} \dot{R}_{i j}(\tau)}{\mathrm{d} P_{i j}(\tau) \mathrm{d} P_{t}(\tau)} \frac{\mathrm{d}^{2} R_{i j}(\tau)}{\mathrm{d} P_{i j}(\tau)^{2}} \end{array}\right).$$

It is easy to prove that [TeX:] $$\nabla^{2} \dot{R}_{i j}(\tau)$$ is a negative semidefinite matrix. Therefore, [TeX:] $$\dot{R}_{i j}(\tau)$$ is a concave function with respect to [TeX:] $$P_{t}(\tau) \text { and } P_{i j}(\tau).$$

APPENDIX D PROOF OF THEOREM 1

According to (18), for the proposed dynamic power allocation scheme, we have

(38)

[TeX:] $$\begin{aligned} \Delta(\mathbf{Q}(\tau))-V \mathbb{E}\left\{\sum_{i=1}^{M} \sum_{j \in \Omega_{i}} \dot{R}_{i j}\left(P_{t}^{*}(\tau), P_{i j}^{*}(\tau)\right) \mid \mathbf{Q}(\tau)\right\} \\ \leq C_{1}-V \mathbb{E}\left\{\sum_{i=1}^{M} \sum_{j \in \Omega_{i}} \dot{R}_{i j}\left(P_{t}^{*}(\tau), P_{i j}^{*}(\tau)\right) \mid \mathbf{Q}(\tau)\right\} \\ -\mathbb{E}\left\{\sum_{i=1}^{M} \sum_{j \in \Omega_{i}} Q_{i}(\tau)\left[\dot{R}_{i j}\left(P_{t}^{*}(\tau), P_{i j}^{*}(\tau)\right)-A_{i}(\tau)\right] \mid \mathbf{Q}(\tau)\right\} \\ \leq C_{1}-V \mathbb{E}\left\{\sum_{i=1}^{M} \sum_{j \in \Omega_{i}} \dot{R}_{i j}\left(P_{t}^{\prime}(\tau), P_{i j}^{\prime}(\tau)\right) \mid \mathbf{Q}(\tau)\right\} \\ -\mathbb{E}\left\{\sum_{i=1}^{M} \sum_{j \in \Omega_{i}} Q_{i}(\tau)\left[\dot{R}_{i j}\left(P_{t}^{\prime}(\tau), P_{i j}^{\prime}(\tau)\right)-A_{i}(\tau)\right] \mid \mathbf{Q}(\tau)\right\}, \end{aligned}$$

where the power allocation decisions [TeX:] $$P_{t}^{\prime}(\tau) \text { and } P_{i j}^{\prime}(\tau)$$ are implemented with any stationary randomized policy. The second inequality sign of (38) holds based on the fact that the proposed power allocation scheme is optimal to minimize the RHS of the bounds in (18) compared with any other power control policies.

According to the stochastic network optimization theory [19], [25], if [TeX:] $$\lambda$$ is strictly interior to the capacity region [TeX:] $$\boldsymbol{\Lambda}$$, then there exists the positive [TeX:] $$\vartheta$$ satisfying [TeX:] $$\lambda+\vartheta \in \Lambda$$ and a stationary randomized policy satisfying

(39)

[TeX:] $$\mathbb{E}\left[\sum_{j \in \Omega_{i}} \dot{R}_{i j}\left(P_{t}^{\prime}(\tau), P_{i j}^{\prime}(\tau)\right)\right] \geq \lambda_{i}+\vartheta.$$

From (39) and (38) can be further simplified as

(40)

[TeX:] $$\begin{aligned} \triangle(\mathbf{Q}(\tau))-V \mathbb{E}\left\{\sum_{i=1}^{M} \sum_{j \in \Omega_{i}} \dot{R}_{i j}\left(P_{t}^{*}(\tau), P_{i j}^{*}(\tau)\right) \mid \mathbf{Q}(\tau)\right\} \\ \leq C_{1}-V \mathbb{E}\left\{\sum_{i=1}^{M} \sum_{j \in \Omega_{i}} \dot{R}_{i j}\left(P_{t}^{\prime}(\tau), P_{i j}^{\prime}(\tau)\right) \| \mathbf{Q}(\tau)\right\} \\ -\sum_{i=1}^{M} \vartheta \mathbb{E}\left\{Q_{i}(\tau)\right\} . \end{aligned}$$

(a) From (40), it is easily proved that there exist some finite positive constants C satisfying

(41)

[TeX:] $$\triangle(\mathbf{Q}(\tau)) \leq C.$$

Substituting (15) and (16) into above inequality and summing over [TeX:] $$\tau \in\{0,1, \cdots, T-1\},$$ we obtain

(42)

[TeX:] $$\mathbb{E}\left\{Q_{i}(T)^{2}\right\} \leq 2 T C+2 \mathbb{E}\{L(\mathbf{Q}(\mathbf{0}))\}.$$

From the variance formula [TeX:] $$D\left\{Q_{i}(T)\right\}=\mathbb{E}\left\{Q_{i}^{2}(T)\right\}- \mathbb{E}^{2}\left\{Q_{i}(T)\right\},$$ we have [TeX:] $$\mathbb{E}\left\{Q_{i}^{2}(T)\right\} \geq \mathbb{E}^{2}\left\{Q_{i}(T)\right\},$$ since [TeX:] $$D\left\{Q_{i}(T)\right\}>0.$$ Thus

(43)

[TeX:] $$\mathbb{E}\left\{Q_{i}(T)\right\} \leq \sqrt{2 T C+2 \mathbb{E}\{L(\mathbf{Q}(0))\}}.$$

Dividing the inequality (43) by T and taking a limit as [TeX:] $$T \rightarrow \infty \text {, }$$ we have

(44)

[TeX:] $$\lim _{T \rightarrow \infty} \frac{\mathbb{E}\left\{Q_{i}(T)\right\}}{T}=0,$$

with the assumption [TeX:] $$\mathbb{E}\{L(\mathbf{Q}(0))\infty.$$ Therefore, all queues [TeX:] $$Q_{i}(\tau)$$ are stable in BS.

(b) Let [TeX:] $$\dot{S}_{\text {tot }}^{\text {opt }}$$ represent the maximum time-average rate incurred over all possible power control decisions for solving problem P2. There exists stationary randomized policy that obtains [TeX:] $$\dot{S}_{\text {tot }}^{o p t}.$$ Thus, the inequality (40) can be rewritten as

(45)

[TeX:] $$\begin{aligned} \triangle(\mathbf{Q}(\tau))-V \mathbb{E}\left\{\sum_{i=1}^{M} \sum_{j \in \Omega_{i}} \dot{R}_{i j}\left(P_{t}^{*}(\tau), P_{i j}^{*}(\tau)\right) \mid \mathbf{Q}(\tau)\right\} \\ \leq C_{1}-V \dot{S}_{t o t}^{\circ o p t}-\sum_{i=1}^{M} \vartheta \mathbb{E}\left\{Q_{i}(\tau)\right\}. \end{aligned}$$

By substituting (16) into (45) and summing over [TeX:] $$\tau \in \ \{0,1, \cdots, T-1\},$$ we obtain

(46)

[TeX:] $$\begin{aligned} \mathbb{E}\{L(\mathbf{Q}(T))\}-\mathbb{E}\{L(\mathbf{Q}(0))\} \\ -V \sum_{\tau=0}^{T-1} \mathbb{E}\left\{\sum_{i=1}^{M} \sum_{j \in \Omega_{i}} \dot{R}_{i j}\left(P_{t}^{*}(\tau), P_{i j}^{*}(\tau)\right) \mid \mathbf{Q}(\tau)\right\} \\ \leq T C_{1}-T V \dot{S}_{\text {tot }}^{o p t}-\sum_{\tau=0}^{T-1} \sum_{i=1}^{M} \vartheta \mathbb{E}\left\{Q_{i}(\tau)\right\}. \end{aligned}$$

Since [TeX:] $$Q_{i}(\tau) \geq 0,$$ the above inequality is further simplified as

(47)

[TeX:] $$\begin{aligned} V \sum_{\tau=0}^{T-1} \mathbb{E}\left\{\sum_{i=1}^{M} \sum_{j \in \Omega_{i}} \dot{R}_{i j}\left(P_{t}^{*}(\tau), P_{i j}^{*}(\tau)\right) \mid \mathbf{Q}(\tau)\right\} \\ \geq T V \dot{S}_{t o t}^{o p t}-T C_{1}-\mathbb{E}\{L(\mathbf{Q}(0))\}. \end{aligned}$$

Dividing the inequality (47) by V T and taking a limit as [TeX:] $$T \rightarrow \ \infty,$$ we have

(48)

[TeX:] $$\dot{S}_{t o t} \geq \dot{S}_{t o t}^{o p t}-\frac{C_{1}}{V}.$$

(c) From inequality (46), dividing by [TeX:] $$\vartheta T,$$ rearranging terms and using the fact that [TeX:] $$Q_{i}(\tau) \geq 0,$$ we obtain

(49)

[TeX:] $$\begin{aligned} \frac{1}{T} \sum_{\tau=0}^{T-1} \sum_{i=1}^{M} \mathbb{E}\left\{Q_{i}(\tau)\right\} \leq \frac{\mathbb{E}\{L(\mathbf{Q}(0))\}}{\vartheta T}+\frac{C_{1}}{\vartheta}-\frac{V}{\vartheta} \dot{S}_{t o t}^{o p t} \\ +\frac{V}{\vartheta T} \sum_{\tau=0}^{T-1} \mathbb{E}\left\{\sum_{i=1}^{M} \sum_{j \in \Omega_{i}} \dot{R}_{i}\left(P_{t}^{*}(\tau), P_{i j}^{*}(\tau)\right) \mid \mathbf{Q}(\tau)\right\}. \end{aligned}$$

Based on the assumption (26) and [TeX:] $$\mathbb{E}\{L(\mathbf{Q}(0))\infty,$$ and taking a limit as [TeX:] $$T \rightarrow \infty$$ for above inequality, we have

(50)

Biography

Yong Liu

Yong Liu received the B.S. degree in Electronic Information Science and Technology from Information Engineering University, China, in 2008, the M.S. degree in Electromagnetic Field and Microwave Technology from Zhengzhou University, China, in 2012, and the Ph.D. degree in Electrical Engineering from the Xi’an Jiaotong University, in 2017. Since 2018, he has been with Zhongyuan University of Technology, China. His present research works are wireless resource management and ubiquitous electric internet of things as well as electrical equipment condition monitoring and fault diagnosis technology.

Biography

Fei Liu

Fei Liu received the B.S. degree from Henan University in 2009, the M.S. degree from Zhengzhou University in 2012, and the Ph.D. degree in Communication and Information Systems at Xidian University, in 2017. She is currently with the Zhongyuan University of Technology. Her research interests include resource allocation and information management.

Biography

Guang Zhu

Guang Zhu received the B.S. degree from Henan University of Technology, China, in 2013, the M.S. degree from Information Engineering University, China, in 2016. He is currently with Zhongyuan University of Technology. His research interests include resource allocation and information Security.

Biography

Xiaolei Wang

Xiaolei Wang received the B.S. degree from Huazhong University of Science and Technology, China, in 1985, the M.S. degree and the Ph.D., degree in Electric Machines and Electric Apparatus from Xi’an Jiaotong University, in 1990 and 2003, respectively. Since 1993, he has been with Zhongyuan University of Technology, China. His present research works are smart grid, aircraft electric power source, network security and micro-grid technology.

Biography

Yuechao Jiao

Yuechao Jiao received the B.S. degree, the M.S. degree, and the Ph.D. degree from Zhengzhou University, China, in 2005, 2008, and 2013, respectively. Since 2013, he has been with Zhongyuan University of Technology, China. His research interests include power quality analysis and control, and intelligent optimization algorithm.

References

1 F. Raphel, S. M. Sameer, "A Speed adaptive joint subcarrier and power allocation technique for downlink OFDMA video transmission over doubly selective channels," IEEE Trans. Veh. Technol., vol. 69, no. 2, pp. 1879-1887, Dec, 2019.custom:[[[-]]]
2 Y. Kai, J. Wang, H. Zhu, J. Wang, "Resource allocation and performance analysis of cellular-assisted OFDMA device-to device communications," IEEE Trans. Wireless Commun., vol. 18, no. 1, pp. 416-431, Jan, 2019.custom:[[[-]]]
3 S. Gautam, E. Lagunas, S. Chatzinotas, B. Ottersten, "Relay selection and resource allocation for SWIPT in multi-user OFDMA systems," IEEE Trans. Wireless Commun., vol. 18, no. 5, pp. 2493-2508, May, 2019.custom:[[[-]]]
4 Y. Liu, L. Lu, G. Li, Q. Cui, W. Han, "Joint user association and spectrum allocation for small cell networks with wireless backhauls," IEEE Wireless Commun. Lett., vol. 5, no. 5, pp. 496-499, Oct, 2016.doi:[[[10.1109/LWC.2016.2593465]]]
5 R. Sultan, L. Song, K. G. Seddik, Z. Han, "Joint resource management with distributed uplink power control in full-duplex OFDMA networks," IEEE Trans. Veh. Technol., vol. 69, no. 3, pp. 2850-2863, Mar, 2020.custom:[[[-]]]
6 F. Wang, W. Chen, H. Tang, Q. Wu, "Joint optimization of user association, subchannel allocation, and power allocation in multi-cell multiassociation OFDMA heterogeneous networks," IEEE Trans. Commun., vol. 65, no. 6, pp. 2672-2684, June, 2017.custom:[[[-]]]
7 M. C. Gursoy, "On the capacity and energy efficiency of the training-base transmissions over fading channels," IEEE Trans. Inf. Theroy, vol. 55, no. 10, pp. 4543-4567, Oct, 2009.custom:[[[-]]]
8 S. Adireddy, L. Tong, H. Viswanathan, "Optimal placement of training for frequency-selective block-fading channels," IEEE Trans. Inf. Theory, vol. 48, no. 8, pp. 2338-2353, Aug, 2002.doi:[[[10.1109/TIT.2002.800466]]]
9 Z. Xu et al., "Energy-efficient power allocation for pilots in trainingbased downlink OFDMA systems," IEEE Trans. Commun., vol. 60, no. 10, pp. 3047-3058, Oct, 2012.custom:[[[-]]]
10 F. Liu, Q. Yang, P. Gong, K. Kwak, "Subcarrier and power allocation for multi-user OFDMA wireless networks under imperfect channel state information," IET Commun., vol. 10, no. 8, pp. 873-881, May, 2016.doi:[[[10.1049/iet-com.2015.0711]]]
11 M. J. Neely, "Energy optimal control for time-varying wireless networks," IEEE Trans. Inf. Theory, vol. 52, no. 7, pp. 2915-2934, July, 2006.doi:[[[10.1109/TIT.2006.876219]]]
12 R. Urgaonkar, M. J. Neely, "Opportunistic scheduling with reliability guarantees in cognitive radio networks," IEEE Trans. Mobile Computing, vol. 8, no. 6, pp. 766-777, June, 2009.doi:[[[10.1109/TMC.2009.38]]]
13 Y. Li, M. Sheng, Y. Shi, X. Ma, W. Jiao, "Energy efficiency and delay tradeoff for time varying and interference-free wireless networks," IEEE Trans. Wireless Commun., vol. 13, no. 11, pp. 5921-5931, Nov, 2014.custom:[[[-]]]
14 H. Ahmed, K. Jagannathan, S. Bhashyam, "Queue-aware optimal resource allocation for the LET downlink with best M subband feedback," IEEE Trans. Wireless Commun., vol. 14, no. 9, pp. 4923-4933, Sept, 2015.custom:[[[-]]]
15 F. Liu, Q. Yang, Q. He, D. Park, K. Kwak, "Dynamic power and subcarrier allocation for downlink OFDMA systems under imperfect CSI," Wireless Networks, vol. 25, no. 2, pp. 545-558, Feb, 2019.custom:[[[-]]]
16 Y. Wu, R. H. Y. Louie, M. R. McKay, "Analysis and design of wireless ad hoc networks with estimation errors," IEEE Trans. Signal Commun.ch, vol. 61, no. 6, pp. 1447-1459, Mar, 2013.custom:[[[-]]]
17 Y. Li, B. Vucetic, Z. Zhou, M. Dohler, "Distributed adaptive power allocation for wireless relay networks," IEEE Trans. Wireless Commun., vol. 6, no. 3, pp. 948-958, Mar, 2007.doi:[[[10.1109/TWC.2007.05256]]]
18 L. Huang, M. J. Neely, "Utility optimal scheduling in energy harvesting networks," IEEE /ACM Trans. Netw., vol. 21, no. 4, pp. 1117-1130, Aug, 2013.doi:[[[10.1145/2107502.2107531]]]
19 M. J. Neely, Stochastic network optimization with application to communication and queueing systems, San Rafael CA USA: Morgan Claypool, 2010.custom:[[[-]]]
20 L. Georgiadis, M. J. Neely, L. Tassiulas, "Resource allocation and cross-layer control in wireless networks," Foundations and Trends in Networking, vol. 1, no. 1, pp. 1-144, Apr, 2006.doi:[[[10.1561/1300000001]]]
21 S. Boyd, L. vandenberghe, Convex optimization, Cambridge U.K.: Cambridge University Press, 2004.custom:[[[-]]]
22 W. Wu, Q. Yang, B. Li, K. Kwak, "Adaptive cross-layer resource optimization in heterogeneous wireless networks with multi-homing user equipemnts," J. Commun. Netw., vol. 18, no. 5, pp. 784-795, Oct, 2016.custom:[[[-]]]
23 M. Neely, "Energy optimal control for time-varying wireless networks," IEEE Trans. Inf. theory, vol. 52, no. 7, pp. 2915-2934, July, 2006.doi:[[[10.1109/TIT.2006.876219]]]
24 D. Bertsekas, R. Gallager, "Data Networks," Prentice-Hall, 1987.custom:[[[-]]]
25 M. J. Neely, "Dynamic optimization and learning for renewal systems," IEEE Trans. Automatic Control, vol. 58, no. 1, pp. 32-46, Jan, 2013.doi:[[[10.1109/TAC.2012.2204831]]]
26 H. Ju, B. Liang, J. Li, X. Yang, "Dynamic power allocation for throughput utility maximization in interference-limited networks," IEEE Wireless Commun. Lett., vol. 2, no. 1, pp. 22-25, Feb, 2013.doi:[[[10.1109/WCL.2012.100912.120512]]]

Received: January 9 2020

Revision received: April 12 2021

Accepted: May 7 2021

Published (Electronic): August 31 2021

Corresponding Author: Yong Liu , liuyongly201@sina.com

Yong Liu, School of Electronic and Information Engineering, Zhongyuan University of Technology, No.41 Zhongyuan Road, Zhengzhou, 450007, Henan, China, liuyongly201@sina.com

Fei Liu, School of Computer Science, Zhongyuan University of Technology, No.41 Zhongyuan Road, Zhengzhou, 450007, Henan, China, liufeilf201@foxmail.com

Guang Zhu, School of Computer Science, Zhongyuan University of Technology, No.41 Zhongyuan Road, Zhengzhou, 450007, Henan, China, 6672@zut.edu.cn

Xiaolei Wang, School of Electronic and Information Engineering, Zhongyuan University of Technology, No.41 Zhongyuan Road, Zhengzhou, 450007, Henan, China, 418423314@11.com

Yuechao Jiao, School of Electronic and Information Engineering, Zhongyuan University of Technology, No.41 Zhongyuan Road, Zhengzhou, 450007, Henan, China, 6671@zut.edu.cn