Downlink Spectrum Sharing of Heterogeneous Communication Systems in LEO Satellite Networks

Jihyeon Yun; Taegun An; Haesung Jo; Bon-Jun Ku; Daesub Oh; Changhee Joo

doi:10.23919/JCN.2022.000031

ISSN: 1976-5541

Volume 24, No 6 (2022), pp. 722 - 729

Abstract: We study downlink spectrum sharing in low earth orbit (LEO) satellite networks, where wide beam coverage and side-lobe beams make it hard to manage mutual interference between satellites. As a first step, we consider a two-satellite system consisting of a control satellite and an interference satellite, which share the same frequency spectrum range and are located close enough to interfere with each other. We investigate a resource allocation strategy for the control satellite to maximize its throughput performance while satisfying interference constraints. Assuming that the control satellite has no prior knowledge about the behavior of the interference satellite, we develop online frequency allocation algorithms that successfully manage the interference by employing a learning-based design with the UCB index. Under the proposed algorithms, we can maximize the throughput of the control satellite while constraining interference at the interference satellite as well as at the control satellite, without any direct information exchange with the interference satellite. Through simulations, we demonstrate that the proposed schemes achieve high throughput performance while satisfying the interference constraints.

Keywords: Interference management , LEO satellite , spectrum sharing

I. INTRODUCTION

Current 5G cellular systems face inherent limitations in providing ubiquitous instant communications, which are essential to realize virtual and augmented reality (VR/AR), the Internet of things (IoT), and super-intelligence. The next-generation 6G networks are expected to remedy these drawbacks by providing significantly faster and seamless service [1]. Satellite communication is one of the key elements in 6G networks, since terrestrial cellular networks alone cannot manage massive data traffic due to geographical and economic limitations. To this end, low earth orbit (LEO) satellites have attracted much attention, since they can be positioned in orbit flexibly, require low energy for placement, and have low communication latency compared with other types of satellites. However, relatively new LEO satellite networks often suffer from insufficient frequency resources, since most of the spectrum is already allocated to other communication systems. This leads to high demand for efficient spectrum sharing mechanisms for LEO satellite networks.

In our work, we consider the downlink spectrum allocation problem in a LEO satellite network with an interference constraint. In the network, a LEO satellite provides users in its main-lobe area (or main beam area) with downlink service by sending signals to the ground. However, the main-lobe area of the satellite may overlap with the main-lobe or side-lobe (beam) areas of other nearby satellites, causing significant mutual interference. The problem worsens as more spectrum resources are reused in geographical proximity, which is common in satellite communication systems due to scarce resources [2]. We aim to control the LEO satellite so that it shares the same frequency spectrum with the interference satellites and allocates time-frequency channels for maximum throughput under interference constraints. In a heterogeneous environment, however, the statistics of frequency channel usage by the other communication networks are unknown in general. To this end, we exploit online learning techniques to find the best frequency allocation by inferring the channel statistics from the communication history.

Among related works in this domain, a high-throughput satellite (HTS) system architecture has been proposed for interference-aware resource management in multi-beam satellite systems [3]. The impact of interference management under a coordinated access scheme and a random access scheme has been compared in unmanned aerial vehicle (UAV) relay satellite networks [4]. An AI-based approach, Q-learning, has been adopted to provide an optimal dynamic channel allocation strategy for LEO satellite networks and to accelerate the convergence speed [5]. An efficient resource allocation system for satellite IoT (SIoT) that consists of LEO satellites has been studied using deep reinforcement learning in [6]. It formulates the problem as a Markov decision process (MDP) while considering LEO-specific concerns. In [7], the authors extended single-agent deep reinforcement learning to a multi-agent model to reduce complexity, which helps to find an optimal bandwidth allocation scheme. A non-orthogonal multiple access (NOMA) scheme for the downlink service of satellite networks has been investigated to enhance the resource efficiency, taking into consideration quality of service (QoS) requirements [8].

We consider the problem of frequency channel allocation in a LEO satellite network under interference constraints. For ease of explanation, we refer to the LEO satellite under our control as the control satellite, and to satellites of other networks as interference satellites. An interference satellite can be a LEO satellite, a GSO satellite, or even a terrestrial communication system. When the control satellite and the interference satellite transmit signals over the same frequency channel at the same time, the corresponding user of either system may not receive signals successfully due to the interference. We investigate the problem of efficient resource allocation of the control satellite without prior knowledge about the behavior of the interference satellite, while satisfying interference constraints. Different from the aforementioned works, we make use of the UCB index, which has been used in cognitive radio networks [9], [10]. Further, we satisfy the interference constraint at the interference satellite without explicit information exchange, as well as the interference constraint at the control satellite.

The remainder of the paper is organized as follows. We describe the system model and our objective in Section II. We develop three UCB-based algorithms under different interference constraints in Section III. In Section IV, we evaluate the performance of the proposed algorithms through simulations. Finally, we conclude our paper in Section V.

II. SYSTEM MODEL

We consider a downlink scenario with two interfering satellites. We aim to allocate time-frequency resources to the control satellite, while managing the interference from/to the other satellite, denoted as the interference satellite, which can be either another LEO satellite or a GSO satellite. We assume that the two networks provide downlink services for their own users through a shared frequency spectrum. We also assume that the resource allocation decision of the interference satellite is unknown and made independently. The two satellites travel around the earth in their own orbits, and interference occurs when the main-lobe area of the control satellite and the main-lobe or side-lobe area of the interference satellite overlap. Since the time scale of signal transmission is much smaller than that of changes in the interference relationship due to satellite orbiting, we assume quasi-static satellite positions and investigate an effective frequency allocation strategy. Thus, we focus on a static interference environment in this work.

The detailed model is as follows. We assume that the main-lobe area of the control satellite is located in the side-lobe area of the interference satellite. We mainly pay attention to the first side-lobe area due to its relatively higher interference level than the other side-lobes. Overlapping with the main lobe or the second/third side-lobe can be considered similarly, with a different impact on the received signal. We assume that time is slotted and the shared frequency range is equally quantized into N blocks. At each time slot, a satellite can provide its service for a user in its main-lobe area using one of the N frequency blocks. We assume that the control satellite can provide its service for up to one user at each time, and the interference satellite can provide service for up to N users at a time. The extension to a multi-user control satellite is straightforward.

We consider the following procedure of satellite transmission and feedback.

1) At each time t, the interference satellite selects a set of frequency blocks for data transmission: each block i is chosen with probability [TeX:] $$p^i$$ in an i.i.d. manner across blocks and times, where the [TeX:] $$p^i \text { 's }$$ are unknown to the control satellite. At the same time t, the control satellite independently selects one block for its data transmission¹. The task of the control satellite is to select the frequency block that maximizes the throughput while conforming to the given interference constraints. The control satellite also has the option of 'no transmission at the time'. Once both satellites have made their decisions, they transmit signals during the time slot.

¹ In the case of K users, the control satellite will select K blocks. We assume K = 1 for ease of exposition.

2) The corresponding users for the transmitted data decode the received signals during the time slot. When the two satellites use the same frequency block, the control-satellite user (and the interference-satellite user) experiences substantial interference due to the side-lobe signal of the other satellite, and may or may not successfully decode the data, depending on the received signal strength as described below.

3) The satellites will receive the feedback of whether the transmitted data is successfully decoded (transmission success) or not (transmission failure), which can be sent through direct uplink transmission by the user or through a separate feeder-path transmission. We assume that the feedback information is delivered without error.

The transmission success or failure is determined by the CINR (Carrier to Interference and Noise Ratio) of the received signal at the user, which can be computed by [TeX:] $$C I N R= P_w /\left(P_u+P_n\right) \text {, where } P_w, P_u, P_n$$ denote the power of the wanted signal, the power of unwanted signals, and the noise power at the user, respectively. The signal strengths are determined by several design parameters, e.g., effective isotropic radiated power (EIRP) density, antenna gains, attenuation, etc. [11]. We assume that the user successfully decodes the data if the CINR exceeds a certain threshold [TeX:] $$\gamma,$$ and fails to decode it otherwise. In the latter case, we say that a signal collision occurs and the frequency block at the time slot is wasted.
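The decoding rule can be sketched in a few lines of Python. This is only an illustration of the CINR formula and the threshold test: the function names and the received powers below are hypothetical values chosen for the example, not parameters from the paper.

```python
import math


def cinr_db(p_wanted, p_unwanted, p_noise):
    """Return CINR = P_w / (P_u + P_n) in dB; all powers are linear (watts)."""
    return 10.0 * math.log10(p_wanted / (p_unwanted + p_noise))


def is_decoded(cinr_value_db, gamma_db):
    """A slot succeeds only if the CINR strictly exceeds the threshold gamma."""
    return cinr_value_db > gamma_db


# Hypothetical received powers, chosen only for illustration.
P_W, P_N, P_U = 1e-12, 2e-14, 3e-14
GAMMA_DB = 15.0

clean = cinr_db(P_W, 0.0, P_N)        # no side-lobe interference
interfered = cinr_db(P_W, P_U, P_N)   # both satellites use the same block
print(round(clean, 2), is_decoded(clean, GAMMA_DB))
print(round(interfered, 2), is_decoded(interfered, GAMMA_DB))
```

With these illustrative numbers the clean slot lies above the threshold and the interfered slot below it, which is the regime in which a shared block counts as a collision.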

We now formally formulate our problem. Let [TeX:] $$\mathbf{E}_t= \left[e_t^1, \cdots, e_t^N\right]$$ be the usage vector of frequency blocks for the interference satellite at time t, where [TeX:] $$e_t^i=1$$ if the interference satellite uses frequency block i at time t and [TeX:] $$e_t^i=0$$ otherwise. At each time, the interference satellite uses frequency block i with probability [TeX:] $$p^i,$$ and we let [TeX:] $$\mathbf{P}=\left[p^1, \cdots, p^N\right]$$ be the vector of these probabilities. Note that the [TeX:] $$p^i \text { 's }$$ can differ depending on the service type or resource allocation strategy of the interference satellite. In this work, we consider only static P. For the transmission of the control satellite, we let [TeX:] $$\mathbf{U}_t=\left[u_t^1, \cdots, u_t^N\right]$$ denote the usage vector of the control satellite at time t. We have the constraints [TeX:] $$\sum_i u_t^i \leq 1$$ and [TeX:] $$\sum_i e_t^i \leq N$$ due to the aforementioned assumption on the number of users served by each satellite.

For frequency block i, if only one of the two satellites transmits a signal over the block, the corresponding user will successfully decode the signal. In contrast, if both satellites transmit signals over the same block [TeX:] $$i \text {, i.e., } e_t^i=1 \text { and } u_t^i=1 \text {, }$$ there is a signal collision due to the side-lobe signal interference, and the corresponding user can successfully decode the signal only if the received CINR is greater than the predetermined threshold [TeX:] $$\gamma.$$ Let [TeX:] $$s_t$$ denote a binary indicator of successful transmission at time t, written as

(1)

[TeX:] $$s_t= \begin{cases}0, & \text { if CINR } \leq \gamma, \\ 1, & \text { if CINR }>\gamma\end{cases}.$$

Our objective is to maximize throughput of the control satellite

(2)

[TeX:] $$\operatorname{maximize} \quad \lim _{T \rightarrow \infty} \mathbb{E}\left[\frac{1}{T} \sum_{t=1}^T s_t\right] \\ \text { subject to interference constraints. } $$

We will consider two different interference constraints: one is to limit the rate of collision (i.e., transmission failure) at the control satellite, and the other to limit the collision rate at the interference satellite. In both cases, we develop efficient frequency allocation algorithms for the control satellite without explicitly exchanging the information with the interference satellite.
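To make the model concrete, the following minimal sketch draws the usage vector of the interference satellite and estimates the objective in (2) for a control satellite that always transmits on one fixed block. It assumes that a shared block always fails to decode (CINR at or below the threshold), as in the simulation setting of Section IV; the function names are illustrative.

```python
import random


def sample_usage(p, rng):
    """Draw E_t: block i is used by the interference satellite w.p. p[i]."""
    return [1 if rng.random() < p_i else 0 for p_i in p]


def fixed_block_throughput(p, block, horizon, seed=0):
    """Empirical (1/T) * sum_t s_t when the control satellite always uses `block`."""
    rng = random.Random(seed)
    successes = 0
    for _ in range(horizon):
        e_t = sample_usage(p, rng)
        # Assumption: a shared block always drives the CINR below gamma,
        # so s_t = 1 exactly when the interference satellite leaves it idle.
        successes += 1 - e_t[block]
    return successes / horizon
```

Under this assumption the long-run throughput of a fixed block i approaches 1 - p^i, so a control satellite that knew P would simply pick the block with the smallest p^i. The learning algorithms of Section III address the case where P is unknown.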

III. ONLINE LEARNING ALGORITHMS UNDER INTERFERENCE CONSTRAINTS

The control satellite should select the frequency block that yields the maximum throughput. To this end, it infers the transmission probability P of the interference satellite using the feedback of past transmissions. This problem can be cast as a stochastic multi-armed bandit (MAB) problem. We first focus on an efficient online learning algorithm for the control satellite's frequency allocation that applies the well-known UCB algorithm [12] with a negative reward for a collision. However, it turns out that this basic approach is not suitable for satisfying a given interference constraint. We extend the algorithm by adding conditions that achieve high throughput while conforming to the interference constraints.

A. Basic UCB Algorithm with Negative Reward

We first reformulate (2) into a stochastic MAB problem as follows. At each time t, the control satellite selects block [TeX:] $$a_t \in[N]$$ for data transmission, or selects no block [TeX:] $$\left(a_t=N+1\right)$$ for not transmitting. Using block [TeX:] $$a_t,$$ the control satellite transmits a signal and gets a reward [TeX:] $$r_t.$$ We have [TeX:] $$u_t^i=0$$ for all [TeX:] $$i \in[N] \backslash\left\{a_t\right\}, \text { and } u_t^{a_t}=1 \text { if } a_t \in[N].$$

When the reward [TeX:] $$r_t$$ is stochastic, it is not easy to identify the block with the highest average reward without losing transmission opportunities. One of the most well-known methods to quickly find the best-performing block is the UCB index algorithm [9]. It achieves asymptotically optimal performance by balancing exploration and exploitation, and keeps the worst-case regret within a constant factor of the minimax regret lower bound [9], [12]. In the MAB setting, an option available to the controller (a frequency block in our problem) is often called an arm. We use the terms block and arm interchangeably.

Algorithm 1

basic-UCB with negative reward

The UCB index of arm i at time t is defined as

(3)

[TeX:] $$\mathrm{UCB}_t^i=\eta_{t-1}^i+\sqrt{\frac{2 \log t}{\tau_{t-1}^i}},$$

where [TeX:] $$\eta_t^i$$ is the empirical average reward of arm i and [TeX:] $$\tau_t^i$$ is the number of selections of arm i. At the beginning of each time t, we evaluate the UCB index of each arm and select the arm with the highest UCB index value, i.e., [TeX:] $$a_t=\operatorname{argmax}_i \mathrm{UCB}_t^i.$$
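The index in (3) is straightforward to compute. The small helper below is a sketch of that formula; returning infinity for an arm that has never been selected is an assumption made here so that every arm is tried at least once, and the function name is illustrative.

```python
import math


def ucb_index(mean_reward, selections, t):
    """UCB index of (3): empirical mean plus the exploration bonus sqrt(2 log t / tau)."""
    if selections == 0:
        # Assumption: an untried arm gets an infinite index so it is explored first.
        return math.inf
    return mean_reward + math.sqrt(2.0 * math.log(t) / selections)


# The bonus shrinks as an arm accumulates selections and grows slowly with t.
few = ucb_index(0.5, 4, 100)
many = ucb_index(0.5, 400, 100)
print(few > many)
```

The exploration bonus is what lets a rarely used block be revisited: for a fixed empirical mean, the index of an arm with few selections keeps rising with log t until the arm is selected again.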

A straightforward extension of the UCB algorithm to our problem is to set the reward such that it has a unit positive value for a successful transmission and a negative value for a collision (i.e., an interference), since we aim to maximize throughput while avoiding the interference. To this end, we can design the reward as

(4)

[TeX:] $$r_t= \begin{cases}0, & \text { if } a_t \notin[N], \\ 1, & \text { if } a_t \in[N], u_t^{a_t}=1 \text { and } e_t^{a_t}=0 \\ -\theta, & \text { if } a_t \in[N], u_t^{a_t}=1 \text { and } e_t^{a_t}=1\end{cases}$$

with some constant [TeX:] $$\theta>0.$$ Then, we find an optimal arm [TeX:] $$a^*$$ that leads to the maximum average reward [TeX:] $$\lim _{T \rightarrow \infty} \mathbb{E}\left[\frac{1}{T} \sum_{t=1}^T r_t\right].$$ Combining it with the UCB algorithm, we develop the basic-UCB scheme shown in Algorithm 1, in which DECISION(t) is used in step 1) of the transmission-feedback procedure in Section II and UPDATE [TeX:] $$\left(\mathbf{U}_t, r_t\right)$$ is used in step 3) of the procedure in Section II.

basic-UCB:

In DECISION(t), it initially selects each arm (block) once, and then, for [TeX:] $$t>N+1,$$ it selects the arm with the largest UCB index.

In UPDATE [TeX:] $$\left(\mathbf{U}_t, r_t\right),$$ it updates the parameters that are used to compute the UCB index.
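The two steps can be sketched as a short simulation. This is a minimal illustration under stated assumptions, not the authors' Algorithm 1: the function name is hypothetical, a collision is assumed to always cause a decoding failure, and only the interference on the chosen block is sampled, which is statistically equivalent because block usage is independent across blocks.

```python
import math
import random


def run_basic_ucb(p, theta, horizon, seed=0):
    """Simulate basic-UCB with reward (4); arm len(p) means 'no transmission'."""
    rng = random.Random(seed)
    n_arms = len(p) + 1
    pulls = [0] * n_arms      # tau^i: number of selections of arm i
    mean = [0.0] * n_arms     # eta^i: empirical average reward of arm i
    successes = 0
    for t in range(1, horizon + 1):
        # DECISION(t): try each arm once, then pick the largest UCB index.
        if t <= n_arms:
            arm = t - 1
        else:
            arm = max(
                range(n_arms),
                key=lambda i: mean[i] + math.sqrt(2.0 * math.log(t) / pulls[i]),
            )
        # Transmission and feedback: reward 0, 1, or -theta as in (4).
        if arm == len(p):
            reward = 0.0
        else:
            busy = rng.random() < p[arm]   # interference satellite uses the block
            reward = -theta if busy else 1.0
            successes += int(not busy)
        # UPDATE: refresh the statistics behind the UCB index.
        pulls[arm] += 1
        mean[arm] += (reward - mean[arm]) / pulls[arm]
    return pulls, successes / horizon
```

Calling, for instance, `run_basic_ucb([0.3, 0.5, 0.7], 0.5, 10000)` returns the per-arm selection counts and the empirical throughput; in such a run the block with the smallest usage probability should receive most selections, since nothing in the index penalizes collisions beyond the reward itself.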

Without any information about P, the basic-UCB algorithm quickly finds the best arm that achieves the largest average reward, and achieves asymptotically optimal performance with algorithm complexity of O(N) [9], [12]. However, by construction, it is limited to finding the best-performing arm, including the arm of 'no transmission'. As a result, after an initial period of learning, the control satellite will continuously select the best-performing arm, which may lead to a substantial number of collisions, as shown in Section IV. Also, if the negative reward value [TeX:] $$\theta$$ is set too large, it will eventually select the no-transmission arm [TeX:] $$\text { (i.e., } a^*=N+1 \text { ) },$$ which results in zero throughput. To avoid such non-intuitive behavior, we introduce additional conditions on collision rates in the following subsections.
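How large is "too large" can be seen from a short calculation, offered here as a worked check and assuming, as in (4), that every collision yields the reward [TeX:] $$-\theta.$$ The expected reward of each arm is

[TeX:] $$\mathbb{E}\left[r_t \mid a_t=i\right]=\left(1-p^i\right)-\theta p^i=1-(1+\theta) p^i \text { for } i \in[N], \quad \mathbb{E}\left[r_t \mid a_t=N+1\right]=0 .$$

Hence the no-transmission arm has the largest average reward once [TeX:] $$1-(1+\theta) \min _i p^i<0,$$ that is, once

[TeX:] $$\theta>\frac{1-\min _i p^i}{\min _i p^i}.$$

For the values used in Section IV, [TeX:] $$\mathbf{P}=[0.3,0.5,0.7]$$ and [TeX:] $$\theta=0.5,$$ the expected rewards of blocks 1, 2, and 3 are 0.55, 0.25, and −0.05, so block 1 is the best arm, and the no-transmission arm would take over only for [TeX:] $$\theta>0.7 / 0.3 \approx 2.33.$$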

B. Constraint on Collision Rate of the Control Satellite

In this section, we consider problem (2) with a constraint on the collision rate. There are two different ways to constrain the collision rate: constrain the total collision rate over all the frequency blocks, or constrain the per-block collision rate. The design choice depends on the demand for protection from the interference. In this work, we consider the latter per-block constraint, since, under the former total collision constraint, a particular block may suffer from severe interference even though the overall collision rate is kept below a certain level. It is not difficult to extend our algorithm to the constraint on the total collision rate.

Let [TeX:] $$\mathbf{C}_t=\left\{c_t^1, \cdots, c_t^N\right\} \text { and } \overline{\mathbf{C}}=\left\{\bar{c}^1, \cdots, \bar{c}^N\right\}$$ denote the collision rate of the control satellite at time t and the maximum allowable collision rate for each block i, respectively. On (2), we impose the hard constraint [TeX:] $$c_t^i \leq \bar{c}^i$$ for all blocks i.

We now develop a UCB-based solution that conforms to the collision constraint by modifying the basic-UCB algorithm as follows.

Control_Sat-Constrained-UCB:

Replace line 7 of Algorithm 1 with [TeX:] $$u_t^i=\mathbb{1}\left\{a_t=i \text { and } c_{t-1}^i<\bar{c}^i\right\}$$ for each [TeX:] $$i \in[N].$$

In UPDATE [TeX:] $$\left(\mathbf{U}_t, r_t\right),$$ after line 11, add the following: compute collision rate [TeX:] $$c_t^i$$ for each [TeX:] $$i \in[N].$$

The idea of Control_Sat-Constrained-UCB is that it (i) monitors the collision rate of each block i (the total number of collisions at block i divided by the number of times block i has been selected, i.e., [TeX:] $$i=\operatorname{argmax}_j \mathrm{UCB}_t^j,$$ up to time t) and (ii) withholds the signal transmission if the collision rate has reached the predetermined target collision rate [TeX:] $$\bar{c}^i.$$ In this manner, we can refrain from transmission and satisfy the constraint [TeX:] $$\overline{\mathbf{C}}$$ for all frequency blocks, which will be confirmed through simulations in Section IV.
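The monitoring and gating steps can be sketched as follows. This is an illustration under several assumptions that are not stated in the text: a slot in which transmission is withheld still counts as a selection of the block (so the measured collision rate decays), such a slot leaves the UCB statistics untouched, every cap is strictly positive, and a collision always causes a decoding failure. The function name is hypothetical.

```python
import math
import random


def run_control_constrained_ucb(p, theta, cap, horizon, seed=0):
    """UCB selection plus the per-block gate c_{t-1}^i < cap[i] (caps must be > 0)."""
    rng = random.Random(seed)
    n = len(p)                  # arm n means "no transmission"
    pulls = [0] * (n + 1)       # tau^i: samples behind the UCB index
    mean = [0.0] * (n + 1)      # eta^i: empirical average reward
    selected = [0] * n          # times block i was the chosen arm
    collisions = [0] * n        # collisions observed on block i
    successes = 0
    for t in range(1, horizon + 1):
        if t <= n + 1:
            arm = t - 1
        else:
            arm = max(
                range(n + 1),
                key=lambda i: mean[i] + math.sqrt(2.0 * math.log(t) / pulls[i]),
            )
        if arm == n:
            pulls[arm] += 1     # reward is 0, so the mean stays 0
            continue
        rate = collisions[arm] / selected[arm] if selected[arm] else 0.0
        selected[arm] += 1
        if rate >= cap[arm]:
            continue            # gate closed: withhold transmission this slot
        busy = rng.random() < p[arm]
        reward = -theta if busy else 1.0
        collisions[arm] += int(busy)
        successes += int(not busy)
        pulls[arm] += 1
        mean[arm] += (reward - mean[arm]) / pulls[arm]
    rates = [collisions[i] / selected[i] if selected[i] else 0.0 for i in range(n)]
    return successes / horizon, rates
```

Because the gate reacts only after a collision has been recorded, the measured rate of a block can briefly exceed its cap early on and then settles near the cap as selections accumulate; rarely selected blocks may take longer to settle.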

Although Control_Sat-Constrained-UCB provides the per-block QoS guarantee in terms of collision rate, it constrains the collision rate only from the control satellite's perspective. However, the interference satellite may well have a higher priority on the shared frequency blocks, in which case the regulation should be made from the perspective of the interference satellite [13]. This motivates us to extend our results to constrain the collision rate of the interference satellite, and we achieve this without explicit information exchange.

C. Constraint on Collision Rate of the Interference Satellite

We now develop a resource allocation scheme for the control satellite that maximizes throughput performance under a constraint on the collision rate at the interference satellite. Let [TeX:] $$\mathbf{C}_t^e=\left\{c_t^{e, 1}, \cdots, c_t^{e, N}\right\} \text { and } \overline{\mathbf{C}}^e=\left\{\bar{c}^{e, 1}, \cdots, \bar{c}^{e, N}\right\}$$ denote the collision rate of the interference satellite at time t and the maximum allowable collision rate for each block i, respectively. The problem can be formulated as in Section III-B but with [TeX:] $$c_t^{e, i} \leq \bar{c}^{e, i},$$ and the solution can be obtained by replacing [TeX:] $$c_t^i$$ and [TeX:] $$\bar{c}^i \text { with } c_t^{e, i} \text { and } \bar{c}^{e, i} \text {, }$$ respectively. However, this approach requires knowledge of [TeX:] $$\mathbf{C}_t^e,$$ which cannot be obtained by the control satellite unless it directly communicates with the interference satellite. Thus, we need to estimate the collision rate of the interference satellite, [TeX:] $$\mathbf{C}_t^e,$$ which is the key to this approach.

Fig. 1.

CINR of received signal with and without side-lobe interference.

At time t, let [TeX:] $$p_a^i \text { and } p_e^i$$ be the probabilities that the control satellite and the interference satellite, respectively, transmit their signal using frequency block i. Since the two satellites transmit independently, the collision probability [TeX:] $$c_t^{e, i}$$ at the interference satellite (i.e., the conditional collision probability at block i given that the interference satellite transmits over it) is

(5)

[TeX:] $$c_t^{e, i}=\frac{p_a^i \cdot p_e^i}{p_e^i}=p_a^i .$$

Hence, we can control the per-block collision rate of the interference satellite below the given level by satisfying

(6)

[TeX:] $$t \cdot p_a^i=\sum_{x=1}^t u_x^i \leq t \cdot \bar{c}^{e, i}.$$

The following describes the proposed Interference_Sat-Constrained-UCB algorithm.

Interference_Sat-Constrained-UCB:

Replace line 7 of Algorithm 1 with [TeX:] $$u_t^i=\mathbb{1}\left\{a_t=i \text { and } \frac{1}{t-1} \sum_{x=1}^{t-1} u_x^i<\bar{c}^{e, i}\right\} \text { for } i \in[N].$$
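The gate itself, and the reasoning of (5) and (6) behind it, can be checked in isolation. The sketch below does not run the full UCB selection: it assumes a control satellite that always wants the same block, so that only the usage-fraction gate decides whether it transmits, and it measures the resulting conditional collision rate seen by the interference satellite. The function names are illustrative.

```python
import random


def allowed_to_transmit(uses_so_far, t, cap):
    """Gate: use the block only while its past usage fraction is below the cap."""
    past_slots = t - 1
    fraction = uses_so_far / past_slots if past_slots > 0 else 0.0
    return fraction < cap


def simulate_gate(p_block, cap, horizon, seed=0):
    """Return (usage fraction of the block, collision rate at the interference satellite)."""
    rng = random.Random(seed)
    uses = 0                    # slots in which the control satellite used the block
    interferer_tx = 0           # slots in which the interference satellite used it
    interferer_collisions = 0
    for t in range(1, horizon + 1):
        u_t = allowed_to_transmit(uses, t, cap)   # control always wants this block
        e_t = rng.random() < p_block              # independent of the control decision
        uses += int(u_t)
        interferer_tx += int(e_t)
        interferer_collisions += int(u_t and e_t)
    return uses / horizon, interferer_collisions / max(interferer_tx, 1)
```

The gate needs only the control satellite's own transmission history, which is why no message exchange is required: keeping the usage fraction of a block at the cap keeps the conditional collision rate of the interference satellite near the same value, whatever its unknown usage probability is.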

Note that we can also control the collision rate of the interference satellite over the entire blocks (i.e., not per-block collision rate). In this case, the constraint is given as overall collision rate for all N frequency blocks, which leads to the following condition:

(7)

[TeX:] $$\frac{\sum_{i=1}^N\left(p_a^i\left(1-p_c^i\right)\right)}{1-\prod_{i=1}^N\left(1-p_c^i\right)} \leq \bar{c}^e.$$

where [TeX:] $$\bar{c}^e$$ is the collision-rate constraint. We omit the details due to the space limit.

IV. SIMULATION

We evaluate the performance of the proposed algorithms through simulations. In the simulations, we employ a LEO satellite as the control satellite and a GSO satellite as the interference satellite. We set the radius of the main-lobe area to 400 km for GSO and 50 km for LEO. We assume that the beam of GSO is centered at (0, 0) and that of LEO at (450, 0) on the ground, where the unit is km, which means that the main-lobe area of the control satellite is located in the side-lobe area of the interference satellite. For satellite communications, we consider the weather loss and scintillation loss as the atmospheric losses in radio propagation. For ease of presentation, we consider only three downlink frequency blocks from 7.055 GHz that are shared by the two satellites. The extension to many blocks is straightforward. Each frequency block takes 1.23 MHz of communication bandwidth. We use the typical parameters of satellite systems shown in Table I [14]. We also refer to [14] for the parameters used to calculate the CINR.

TABLE I

SIMULATION PARAMETERS.

Parameter	LEO	GSO
height (km)	1,414	35,786
downlink EIRP (dBW)	−5	3.8
side lobe attenuation (dB)	−14	−14
the number of beams	1	1
beam radius (km)	50	400
weather loss (dB)	0.5	0.5
scintillation loss (dB)	0.3	0.3

Fig. 2.

Frequency block selection under basic-UCB.

The interference satellite has three users located in its main-lobe area and also in the side-lobe area of the control satellite. We consider only the first side-lobe area, assuming that the interference signal of the subsequent side-lobes is negligible. We assume that the first side-lobe signal is attenuated by [TeX:] $$\alpha$$ dB with respect to the main-lobe signal. For simulations, we set [TeX:] $$\alpha=14,$$ which is a typical value according to [14]. At each time, the interference satellite transmits data to user i through block i with probability [TeX:] $$p^i.$$ We set the transmission probability vector to [TeX:] $$\mathbf{P}=\left[p^1, p^2, p^3\right]=[0.3,0.5,0.7].$$ The control satellite has one user, and at each time it selects one frequency block for data transmission. If both the control satellite and the interference satellite simultaneously transmit signals using the same frequency block, the corresponding users suffer from interference (or a collision) due to the side-lobe signal from the other satellite. The goal of the control satellite is to maximize its throughput while avoiding interference, by selecting the best frequency block, including the option of no transmission.

The overall procedure of frequency block allocation by the control satellite and the interference satellite, signal transmission, and feedback follows the description in Section II. We first observe the significance of the side-lobe interference to the other satellite system. Then, the control satellite and the interference satellite transmit signals. The CINR of the received signals is fed back to the corresponding transmitter satellite, and the procedure repeats. Fig. 1 shows the CINRs of received signals at the user associated with the control satellite. The CINR is about 17.16 dB without interference, and about 13.28 dB with interference. We set the threshold [TeX:] $$\gamma$$ to 15 dB, with which a user fails to decode its wanted signal when the other satellite's signal is present. The actual value of [TeX:] $$\gamma$$ may change according to the level of coding and the target block error rate [15].

We run the developed algorithms for 10,000 time slots and measure the throughput and collision rate of the control satellite and the interference satellite. We first evaluate the performance of basic-UCB with negative reward [TeX:] $$\theta=0.5.$$ Fig. 2 shows the frequency block selection under the basic-UCB algorithm, where the x-axis denotes the time step. A bar between i and i + 1 on the y-axis means that a satellite selects block i + 1 at that time, where the selection of block 4 implies no transmission [TeX:] $$\text { (i.e., } \sum_{i=1}^3 u_t^i=0 \text { ). }$$ The control satellite's block selection is denoted by a green bar (c_sat) and the interference satellite's block selection by a blue bar (i_sat). A red bar indicates a collision (interference), i.e., the control satellite and the interference satellite choose the same frequency block simultaneously. The result shows that, despite the high collision rate, the control satellite mostly chooses block 1 due to its lowest collision probability [TeX:] $$\left(p^1=0.3\right),$$ which is also the optimal block in this setting.

Fig. 3.

Performance of basic-UCB.

Fig. 3(a) presents the overall throughput of the control satellite, and the overall collision rates of the control satellite (col_c) and the interference satellite (col_i). It shows that, over 10,000 time slots, the control satellite achieves an average throughput of about 0.69 and a collision rate of about 0.3, and the interference satellite experiences a collision rate of about 0.34. Fig. 3(b) shows the per-block throughput of the control satellite. Since the control satellite mostly uses frequency block 1 as shown in Fig. 2, the throughput for block 1 almost coincides with the overall throughput. The throughput for the other blocks becomes zero soon after the beginning. Fig. 3(c) and 3(d) show the per-block collision rates of the control satellite and the interference satellite, respectively. We can observe that, for the most frequently selected block 1, both satellites suffer from high collision rates of 0.3 for the control satellite and 0.97 for the interference satellite. Note that the high collision rates of blocks 2 and 3 for the control satellite in Fig. 3(d) are artificial. Since we define the collision rate based on the number of selections, if a block is rarely selected, its collision rate is hardly updated.

Our results show that, although the basic-UCB algorithm successfully finds the optimal block for the maximum throughput (i.e., block 1), it cannot manage the interference and may fail to control the collision rates of the control and interference satellites.

Fig. 4.

Freq. block selection under Control_Sat-Constrained-UCB.

Fig. 5.

Performance of Control_Sat-Constrained-UCB.

Next, we evaluate the performance of the Control_Sat-Constrained-UCB algorithm, which aims to manage the collision rate of the control satellite for each block. The collision constraint is set to [TeX:] $$\overline{\mathbf{C}}=\{0.2,0.2,0.2\}.$$ Fig. 4 illustrates that, most of the time, the control satellite (c_sat) either selects block 1 for high throughput or chooses no transmission to satisfy the collision constraint. As a result, the collision (interference) at block 1 is reduced significantly, which is also confirmed in Fig. 5. In Fig. 5(a), we can observe that the overall throughput of the control satellite is about 0.45, which is smaller than that of basic-UCB since the control satellite transmits signals less frequently. Also, the collision rate of the control satellite is about 0.2, and the collision rate of the interference satellite is about 0.22 at t = 10,000, as shown in Fig. 5(a). The throughput of the control satellite for each block is shown in Fig. 5(b), which is similar to Fig. 3(b) except for the lower throughput for block 1 (about 0.45). Fig. 5(c) illustrates that Control_Sat-Constrained-UCB successfully manages the collision constraint for each block. We observe that the collision rates for blocks 1, 2, and 3 satisfy the constraint of 0.2 at t = 10,000. On the other hand, Fig. 5(d) shows that Control_Sat-Constrained-UCB still suffers from a high collision rate at the interference satellite, although it is much smaller than that of basic-UCB. The collision rate for block 1 at the interference satellite is as high as 0.65.
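These figures are consistent with a rough back-of-the-envelope estimate, given here only as a sanity check under two assumptions: the control satellite selects block 1 in nearly every slot, and a slot in which transmission is withheld still counts as a selection in the collision rate. Let x be the fraction of block-1 selections in which the control satellite actually transmits. Each transmission collides with probability [TeX:] $$p^1=0.3,$$ so the measured collision rate is about 0.3x, and holding it at the cap gives

[TeX:] $$0.3 x \approx \bar{c}^1=0.2 \Rightarrow x \approx \frac{2}{3}, \quad \text { throughput } \approx\left(1-p^1\right) x=0.7 \cdot \frac{2}{3} \approx 0.47 .$$

By (5), the collision rate of the interference satellite at block 1 is then about [TeX:] $$x \approx 0.67.$$ Both estimates are close to the reported values of about 0.45 and 0.65, with the small gap plausibly due to slots spent on other arms.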

Fig. 6.

Freq. block selection under Interference_Sat-Constrained-UCB.

Fig. 7.

Performance of Interference_Sat-Constrained-UCB.

In a nutshell, our results show that Control_Sat-Constrained-UCB achieves its goal and successfully controls the collision rate for each block at the control satellite. However, it may still suffer from a high collision rate at the interference satellite.

Finally, we evaluate the performance of Interference_Sat-Constrained-UCB, which aims to handle the collision constraint of the interference satellite. We set the collision constraint at the interference satellite to [TeX:] $$\overline{\mathbf{C}}^{\mathrm{e}}=\{0.2,0.2,0.2\}.$$ Fig. 6 illustrates the block selection of Interference_Sat-Constrained-UCB, in which blocks 2 and 3 are selected more frequently than under Control_Sat-Constrained-UCB and the collisions are distributed over all blocks. In Fig. 7(a), we can observe that the overall throughput of the control satellite is about 0.3, the collision rate of the control satellite is about 0.3, and the collision rate of the interference satellite is about 0.34 at t = 10,000. The throughput of the control satellite for each block is shown in Fig. 7(b). Different from basic-UCB and Control_Sat-Constrained-UCB, the control satellite under Interference_Sat-Constrained-UCB achieves higher throughput for blocks 2 and 3. For blocks 1, 2, and 3, the throughput is about 0.14, 0.1, and 0.06, respectively. Fig. 7(c) shows the collision rate of the control satellite for each block. At the end, the collision rates for blocks 1, 2, and 3 are about 0.14, 0.3, and 0.67, respectively, which is strongly related to the interference satellite's transmission probability P. Finally, Fig. 7(d) shows that Interference_Sat-Constrained-UCB satisfies the collision constraint for each block as intended. At t = 10,000, the collision rates of the interference satellite for blocks 1, 2, and 3 are all about 0.2.
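The per-block throughput values can be cross-checked with a simple estimate, assuming that the control satellite ends up using each block in a fraction of slots equal to its cap [TeX:] $$\bar{c}^{e, i}=0.2$$ and that a transmission on block i succeeds with probability [TeX:] $$1-p^i$$:

[TeX:] $$\bar{c}^{e, i}\left(1-p^i\right)=0.2 \cdot 0.7=0.14, \quad 0.2 \cdot 0.5=0.10, \quad 0.2 \cdot 0.3=0.06, \quad \text { total }=0.30 .$$

These match the reported per-block throughput of about 0.14, 0.1, and 0.06 and the overall throughput of about 0.3.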

In summary, the simulation results show that the control satellite under Interference_Sat-Constrained-UCB successfully manages the interference at the interference satellite without any direct information exchange, satisfying the per-block collision constraint of the interference satellite. This is a significant advantage, since information exchange may be slow and costly in space communication systems.

Remark: It is hard to directly compare the throughput of the aforementioned algorithms, since they operate under different constraints. In general, the control satellite achieves less throughput under a tighter interference constraint. In this work, we show that a learning-based scheme can achieve efficient resource utilization while satisfying different interference constraints without explicit message exchange between different communication systems.
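A back-of-the-envelope calculation illustrates why a tighter constraint costs throughput. It rests on assumptions that are not taken from the paper: a single block, an interferer that is active in each slot independently with probability p, a control satellite that transmits in a fraction x of the slots, and a collision rate counted per elapsed slot with cap c. Under these assumptions,

$$x\,p \le c \;\Longrightarrow\; x \le \min\{1,\, c/p\}, \qquad \text{throughput} = x\,(1-p) \le \min\{1,\, c/p\}\,(1-p).$$

For example, with a hypothetical p = 0.3, a cap of c = 0.2 allows a throughput of at most (0.2/0.3)(0.7) ≈ 0.47, whereas a cap of c = 0.1 allows at most (0.1/0.3)(0.7) ≈ 0.23. Halving the cap halves the achievable throughput whenever the cap, rather than the channel, is the binding limit.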

V. CONCLUSION

We investigated the resource allocation problem for LEO satellite networks, aiming to manage side-lobe interference under different constraints. We took a learning approach to achieve high performance without explicit information exchange between satellites. We first formulated the problem as a stochastic MAB problem that treats the available frequency blocks as arms, and developed basic-UCB with negative reward to find the best arm quickly. Although basic-UCB allocates resources efficiently for the control satellite, it makes the interference level hard to control. We then developed two interference-constrained schemes, Control_Sat-Constrained-UCB and Interference_Sat-Constrained-UCB, which constrain the per-block collision rate of the control satellite and of the interference satellite, respectively. Through simulations, we demonstrated that our algorithms successfully manage the interference and satisfy the constraints while maximizing the throughput of the control satellite.

Biography

Jihyeon Yun

Jihyeon Yun received the M.S. degree from the School of ECE, Ulsan National Institute of Science and Technology (UNIST), in 2019. She is currently working toward the Ph.D. degree in the Department of CSE, Korea University. Her research interests include remote estimation and sensor networks.

Biography

Taegun An

Taegun An is currently a Ph.D. student at Korea University. He received the B.S. degree in Computer Science from Ulsan National Institute of Science and Technology (UNIST). His research interests are reinforcement learning (RL) and neural architecture search (NAS).

Biography

Haesung Jo

Haesung Jo is an M.S. student at Korea University. He received the B.S. degree in Computer Engineering from Konkuk University in 2021. His research interests are machine learning and reinforcement learning.


Received: January 15 2021

Revision received: January 15 2021

Accepted: July 25 2022

Published (Electronic): December 31 2022

Corresponding Author: Changhee Joo, changhee@korea.ac.kr

Jihyeon Yun, Department of Computer Science and Engineering, Korea University, Seoul, Korea, jihyeonyoon@korea.ac.kr

Taegun An, Department of Computer Science and Engineering, Korea University, Seoul, Korea, antaegun20@korea.ac.kr

Haesung Jo, Department of Computer Science and Engineering, Korea University, Seoul, Korea, ategry@korea.ac.kr

Bon-Jun Ku, Radio and Satellite Research Division, ETRI, South Korea, bjkoo@etri.re.kr

Daesub Oh, Radio and Satellite Research Division, ETRI, South Korea, trap@etri.re.kr

Changhee Joo, Department of Computer Science and Engineering, Korea University, Seoul, Korea, changhee@korea.ac.kr