Decentralized Linear Transceiver Design in Multicell MIMO Broadcast Channels

This paper studies linear transceiver design in multicell MIMO Broadcast Channels (BCs). In this context, previous works have tried to enhance the conventional Block Diagonalization (cBD) algorithm, for example through iterative BD (iBD), which has fewer dimensionality restrictions and accounts for the presence of inter-cell interference (ICI). However, both approaches become interference-limited when the ICI is strong. In this paper, we take a different direction by using the Weighted Sum-Rate (WSR) as the transceiver design criterion. To that end, three novel algorithms are proposed, which are based on the alternating optimization technique and are guaranteed to converge to a local WSR-optimum. The first algorithm is an interference pricing approach, where each cell maximizes its own utility, which is formed by the local users' WSR minus the priced ICI leakage. The second algorithm designs transceivers that maximize the network-wide WSR. Interestingly, we prove that the WSR maximization via interference pricing can be made equivalent to the network-wide WSR maximization whenever the mobile stations are equipped with a single antenna. The third algorithm is an implicit interference pricing approach, where each cell self-prices its ICI leakage and, thus, does not require the feedback of variables from other cells. To facilitate the algorithms' implementation, a novel Over-the-Air (OTA) signaling scheme based on Time Division Duplex (TDD) mode is proposed, which, compared to existing schemes, reduces the signaling overhead and requires no backhaul feedback.


I. INTRODUCTION
Multiple-Input Multiple-Output (MIMO) technology has great potential to eliminate or manage interference, achieve higher throughput, and enhance system capacity [1]. Using multiple antennas, the Base Stations (BSs) can transmit to multiple users simultaneously using linear or non-linear transmission techniques [2], so that the system throughput increases linearly with the number of BS antennas. In single-cell networks, the non-linear Dirty Paper Coding (DPC) technique [3] is known to achieve the channel capacity. However, DPC is widely considered to have limited practical applications due to its high complexity. Therefore, linear transmission techniques (also called beamforming) have gained more interest and were proven to achieve the same sum rate scaling law as DPC [2], while maintaining low complexity.
A notable scheme in this area is Block Diagonalization (BD) [4]. In single-cell networks, conventional Block Diagonalization (cBD) completely eliminates intra-cell interference by forcing each user to transmit on the null space of the other users. However, in multicell networks, cBD ignores the inter-cell interference (ICI), which degrades the users' performance. To address this, the authors in [5] proposed enhanced BD (eBD), which uses a whitening filter to reduce ICI effects. Nevertheless, both cBD and eBD have strict dimensionality restrictions, since they both rely on transmit beamforming alone to eliminate intra-cell interference and ignore the receive beamforming. Motivated by this observation, we proposed iterative BD (iBD) in [6], which eliminates the intra-cell interference by jointly optimizing the transmit and receive beamforming matrices and also accounts for the ICI presence. We have shown that iBD has better sum rate performance than both cBD and eBD, while significantly reducing dimensionality restrictions.
However, it is also shown in [6] that all BD approaches become interference-limited in the presence of high ICI power. The main limitation of BD is that each user behaves altruistically toward other users in the same cell (since the intra-cell interference is completely eliminated) and egoistically toward users in adjacent cells (since nothing is done to reduce the ICI). Thus, the BD approach cannot balance these two beamforming behaviors, which prevents it from achieving the optimum in terms of sum rate [7].
An alternative approach is to jointly design the transmit beamforming of all users in all cells. This approach is known as Coordinated Multi-Point (CoMP) in the literature and can be classified into Joint Processing (JP) [8] and Coordinated Beamforming (CBF) techniques [9]. In contrast to JP, each user in a CBF system is served by a single BS and thus, the BSs do not need to share the users' data or to be time and phase synchronized. Therefore, CBF has gained a lot of attention and has been extensively studied in the literature with different optimization criteria. Examples include sum-power minimization [9]-[11], Signal-to-Interference-plus-Noise Ratio (SINR) balancing [12], sum Mean-Square Error (MSE) minimization [13], and Weighted Sum-Rate (WSR) maximization [14]-[25].
In this paper, we are particularly interested in the WSR maximization problem. The problem is non-convex and NP-hard [20], so that only local optima can be guaranteed via practical methods. Nevertheless, it has some desirable properties: 1) it can prioritize the users and achieve some fairness among them by adjusting the weights, 2) it performs an implicit user and stream selection, since, at convergence, the number of active streams is almost always less than or equal to the number of BS antennas, and 3) it is always feasible when only constrained by transmit power. Therefore, the WSR maximization problem has gained a lot of attention and has been extensively studied in the literature, considering different optimization tools and system models.
For single-cell MIMO Broadcast Channels (BCs), the authors in [14] reformulate the problem into an equivalent problem that incorporates a weighted sum-MSE and establish a weighted sum-MSE duality that is solved iteratively using a Geometric Program (GP) formulation. For the multicell Multiple-Input Single-Output (MISO) BC, the problem was addressed in [16], [18], [19]. In [16], the authors derived the Karush-Kuhn-Tucker (KKT) conditions of the problem and then devised an iterative algorithm to solve them, without resorting to convex optimization methods. In [18], an iterative pricing algorithm was proposed based on game theory, which is guaranteed to converge to an interference equilibrium that corresponds to a KKT point of the original WSR maximization problem. Among these works, the globally optimal solution is guaranteed only in [19], where the problem was solved using a branch-reduce-and-bound algorithm.
On the other hand, the authors in [17] considered the multicell MISO-Interference Channel (IC) and established a relation between the WSR and the Virtual Signal-to-Interference-plus-Noise Ratio (VSINR) by applying the KKT conditions, which led to a distributed and iterative algorithm. Recently, the authors in [20] considered a single-cell MIMO-BC and established a relation between the WSR maximization problem and the Weighted Minimum-Mean-Square Error (WMMSE) minimization problem by applying the KKT conditions. As a result, an iterative algorithm called WSR-WMMSE was proposed, which is based on the alternating optimization technique [26] and solves the hard WSR maximization problem indirectly by solving the easier WMMSE minimization problem. This latter relation has inspired many extensions, such as to the multicell MIMO-IC [21] and to the multicell MIMO-BC [22]-[25].
In this paper, we consider a multicell MIMO-BC system model and propose three novel decentralized WSR maximization algorithms, which are based on the alternating optimization technique [26] and are guaranteed to converge to a local WSR-optimum. The proposed algorithms are summarized as follows. The first algorithm uses an interference pricing approach, as in [18], where each BS maximizes its own utility, which is formed by the local users' WSR minus the priced ICI leakage. In [18], the authors assumed single-antenna users and formulated the problem as a relaxed Semidefinite Program (SDP), whose solution requires each BS to first obtain the transmit covariance matrices, followed by an operation to guarantee and extract the rank-one transmit beamforming vectors. Different from [18], we consider multi-antenna users, and the transmit beamforming matrices are obtained directly by investigating the KKT conditions of the problem. The main ingredient is given by Lemma 1, which makes it possible to solve for the transmit beamforming directly from the problem cost function, in contrast to the WSR-WMMSE approach of [20]-[25]. Through computer simulations, it is shown that the proposed algorithm can achieve a sum rate performance comparable to WSR-WMMSE, while using fewer iterations.
The second algorithm designs the transmit beamforming that maximizes the network-wide WSR by generalizing the solution steps of the first algorithm. Interestingly, it is proven that the WSR maximization via interference pricing can be made equivalent to the network-wide WSR maximization whenever the Mobile Stations (MSs) have a single antenna, i.e., in the multicell MISO BC. However, the interference pricing approach is shown to have some performance loss when the MSs have multiple antennas, as compared to the network-wide approach.
The third algorithm is an implicit interference pricing approach, where each BS self-prices its ICI leakage to other cells. Through computer simulations, it is shown that the self-pricing approach has negligible performance loss, as compared to the network-wide approach, when the BSs have enough Degrees of Freedom (DoF). In this case, the self-pricing approach is more appealing for practical systems, since it does not require feedback of variables from other cells.
The proposed algorithms are decentralized in the sense that each BS can solve for its transmit beamforming independently, as soon as it has the required information. Here, we assume that each BS can acquire the local Channel State Information (CSI) between itself and all the MSs in the system, as in [20]-[25]. An effective technique for obtaining this CSI is the Time Division Duplex (TDD) operation, where uplink training in conjunction with reciprocity simultaneously provides the BSs with downlink and uplink channel estimates [27]. Furthermore, the TDD mode is more applicable than Frequency Division Duplex (FDD) to local area deployments and small cells, where the transmit powers, mobile speeds, and channel propagation delays are relatively low [27]-[29]. In this paper, we further propose a novel Over-the-Air (OTA) signaling scheme based on TDD mode to facilitate the algorithms' implementation. In contrast to some existing signaling schemes in [23]-[25], the proposed scheme reduces the signaling overhead and requires no feedback of variables between BSs.
The rest of this paper is organized as follows. In section II we present the system model. In section III we review the Block Diagonalization (BD) approach from [6]. The proposed algorithms and the Over-the-Air (OTA) signaling scheme for WSR maximization are presented in sections IV and V, respectively. Finally, in section VI we present numerical results and then conclude the paper in section VII.
Notations: Upper/lower boldface letters are used for matrices/vectors. The notations $(\cdot)^H$, $\|\cdot\|$, $(\cdot)^{\dagger}$, $\mathrm{Tr}(\cdot)$, $\log(\cdot)$, and $|\cdot|$ denote the complex conjugate transpose, the standard Euclidean norm, the pseudo-inverse, the trace, the logarithm of base 2, and the determinant, respectively. $E(\cdot)$ denotes the statistical expectation. $\mathrm{Bdiag}\{\cdot\}$ denotes the block-diag operator of a given vector/matrix. $[\mathbf{A}]_{[1:N]}$ selects the first $N$ vectors of a matrix $\mathbf{A}$, while $[\mathbf{A}]_{[i]}$ selects the $i$-th vector.

II. SYSTEM MODEL
We consider a multicell MIMO BC wireless network consisting of $M$ cells, as in Fig. 1. In each cell, there is one BS equipped with $N_t$ antennas and $K$ MSs, each equipped with $N_r$ antennas. We denote the BS of the $n$-th cell as $\mathrm{BS}_n$ and the $k$-th MS in the $n$-th cell as $\mathrm{MS}_{nk}$. Let $\mathcal{M} \triangleq \{1, \ldots, M\}$ and $\mathcal{K} \triangleq \{\mathcal{K}_1, \ldots, \mathcal{K}_M\}$ denote the sets of all BSs and MSs, respectively, whereas $\mathcal{K}_n$ denotes the set of MSs associated with $\mathrm{BS}_n$. The $M$ BSs are assumed to operate over a common frequency channel and communicate with their $K$ respective MSs using linear transmit beamforming. The scenario under consideration assumes that each MS is served by only one BS. The received signal at $\mathrm{MS}_{nk}$ is given as
$$\mathbf{y}_{nk} = \sum_{m \in \mathcal{M}} \sum_{j \in \mathcal{K}_m} \mathbf{H}_{mnk} \mathbf{T}_{mj} \mathbf{s}_{mj} + \mathbf{z}_{nk}, \tag{1}$$
where $\mathbf{H}_{mnk} \in \mathbb{C}^{N_r \times N_t}$ denotes the MIMO channel matrix from $\mathrm{BS}_m$ to $\mathrm{MS}_{nk}$, whose coefficients are independent and identically distributed (i.i.d.) complex Gaussian random variables, $\mathbf{T}_{nk} \in \mathbb{C}^{N_t \times N_s}$ denotes the transmit beamforming, with $N_s$ being the number of data streams, $\mathbf{s}_{nk} \in \mathbb{C}^{N_s}$ denotes the transmitted data vector that is statistically independent with zero mean and $E(\mathbf{s}_{nk}\mathbf{s}_{nk}^H) = \mathbf{I}$, $\forall k \in \mathcal{K}$, and $\mathbf{z}_{nk} \in \mathbb{C}^{N_r}$ denotes the i.i.d. complex Gaussian noise vector with zero mean and variance $\sigma_{nk}^2$. To decode the desired signal, each $\mathrm{MS}_{nk}$ multiplies its received signal vector $\mathbf{y}_{nk}$ by the receive beamforming matrix $\mathbf{R}_{nk} \in \mathbb{C}^{N_s \times N_r}$. Thus, the received data vector $\hat{\mathbf{s}}_{nk}$ at $\mathrm{MS}_{nk}$ is given as $\hat{\mathbf{s}}_{nk} = \mathbf{R}_{nk}\mathbf{y}_{nk}$.
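As an illustration of the system model, the following minimal NumPy sketch draws i.i.d. complex Gaussian channels, superimposes the transmissions of all BSs at one MS, and forms the ICI-plus-noise covariance seen by that MS. The dimensions, the noise variance `sigma2`, the random unit-norm beamformers, and the helper names (`crandn`, `received_signal`, `ici_plus_noise_cov`) are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
M, K, Nt, Nr, Ns = 2, 2, 4, 2, 1   # cells, users per cell, BS antennas, MS antennas, streams
sigma2 = 0.1                        # noise variance (assumed equal for all users)

def crandn(*shape):
    """i.i.d. circularly symmetric complex Gaussian entries with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

# H[m][n][k]: channel from BS m to MS k of cell n (Nr x Nt)
H = [[[crandn(Nr, Nt) for _ in range(K)] for _ in range(M)] for _ in range(M)]
# T[n][k]: transmit beamformer of MS nk (Nt x Ns), random with unit Frobenius norm
T = [[crandn(Nt, Ns) for _ in range(K)] for _ in range(M)]
T = [[t / np.linalg.norm(t) for t in row] for row in T]
# s[n][k]: data vector of MS nk
s = [[crandn(Ns) for _ in range(K)] for _ in range(M)]

def received_signal(n, k):
    """Superposition of all BS transmissions observed at MS nk, plus noise."""
    y = np.sqrt(sigma2) * crandn(Nr)
    for m in range(M):
        for j in range(K):
            y = y + H[m][n][k] @ T[m][j] @ s[m][j]
    return y

def ici_plus_noise_cov(n, k):
    """Covariance of the inter-cell interference plus noise at MS nk."""
    cov = sigma2 * np.eye(Nr, dtype=complex)
    for m in range(M):
        if m != n:
            for j in range(K):
                G = H[m][n][k] @ T[m][j]
                cov += G @ G.conj().T
    return cov

y = received_signal(0, 0)
Upsilon = ici_plus_noise_cov(0, 0)
```

The covariance returned by `ici_plus_noise_cov` is Hermitian and its eigenvalues are bounded below by the noise variance, which is what makes whitening and MMSE filtering at the MS well defined.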

III. BLOCK DIAGONALIZATION APPROACH
Theoretically, the cBD algorithm from [4] can be interpreted as the equivalent Zero-Forcing (ZF) algorithm for the MIMO system. The main objective is to completely eliminate intra-cell interference by forcing each user to transmit on the null space of all other users in the same cell. The cBD optimization problem of $\mathrm{BS}_n$, $\forall n \in \mathcal{M}$, can be written as where $p_n$ is the transmit power threshold, As one can notice, problem $\mathcal{P}_{\mathrm{cBD}}$ does nothing to deal with the ICI that is being received from the other cells or is leaking to the other cells, since its main objective is to maximize each cell's achievable rate such that all intra-cell interference is eliminated. The main advantage, though, is that $\mathcal{P}_{\mathrm{cBD}}$ is completely distributed among the $M$ cells and has a closed-form solution as follows. The transmit beamforming matrix of $\mathrm{MS}_{nk}$ is given as where $\mathbf{P}_{nk}$ holds on its diagonal the power allocation, $\mathbf{G}_{nk}$ holds the orthogonal basis vectors of the null space of the intra-cell users' channels, and $\mathbf{F}_{nk}$ holds the right singular vectors of the effective channel of $\mathrm{MS}_{nk}$. Let the intra-cell users' channels of $\mathrm{MS}_{nk}$ be given as Then, to calculate $\mathbf{G}_{nk}$, define the Singular Value Decomposition (SVD) of $\mathbf{H}_{nk}^{-k}$ as where $\mathbf{G}_{nk}$ holds the last $(N_t - l_{nk}^{-k})$ right singular vectors and $l_{nk}^{-k}$ is the rank of $\mathbf{H}_{nk}^{-k}$. Further, let the effective channel of $\mathrm{MS}_{nk}$ be given as Then, to calculate $\mathbf{F}_{nk}$, define the SVD of $\mathbf{H}_{nk}^{e}$ as where $\boldsymbol{\Sigma}_{nk}^{e}$ is an $[l_{nk}^{e} \times l_{nk}^{e}]$ diagonal matrix, $\mathbf{V}_{nk}^{e(1)}$ contains the first $l_{nk}^{e}$ singular vectors, and $l_{nk}^{e}$ is the rank of $\mathbf{H}_{nk}^{e}$. Therefore, assuming the values of $\boldsymbol{\Sigma}_{nk}^{e}$ are in decreasing order, we choose $\mathbf{F}_{nk}$ and $\mathbf{R}_{nk}$ to be the first $N_s$ vectors of $\mathbf{V}_{nk}^{e(1)}$ and $\mathbf{U}_{nk}^{e}$, respectively, i.e., With transmit and receive beamforming matrices calculated as above, the $\mathrm{BS}_n$ rate function is reduced to where $\boldsymbol{\Sigma}_n = \mathrm{Bdiag}(\boldsymbol{\Sigma}_{n1}^{e}, \ldots, \boldsymbol{\Sigma}_{nK}^{e})$ and $\mathbf{P}_n$ is a diagonal matrix that holds the optimal power loading found by applying the waterfilling method [30] to the diagonal elements of $\boldsymbol{\Sigma}_n$.
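The two SVD stages described above (null-space projection followed by an SVD of the effective channel) can be sketched as follows for a single cell. This is a minimal illustration under assumed dimensions; the function names `null_space_basis` and `cbd_beamformers` are hypothetical, and the waterfilling power loading is omitted, so every stream is sent with unit gain.

```python
import numpy as np

def null_space_basis(A, tol=1e-10):
    """Orthonormal basis of the right null space of A, taken from the SVD."""
    _, sv, Vh = np.linalg.svd(A)            # full_matrices=True by default
    rank = int(np.sum(sv > tol))
    return Vh[rank:].conj().T               # last (Nt - rank) right singular vectors

def cbd_beamformers(H_cell, Ns):
    """H_cell: list of K channels (Nr x Nt) from one BS to its own users.
    Returns precoders T_k (Nt x Ns) and receivers R_k (Ns x Nr), without power loading."""
    T, R = [], []
    for k, Hk in enumerate(H_cell):
        # Stack the channels of all other users served by the same BS
        H_others = np.vstack([Hj for j, Hj in enumerate(H_cell) if j != k])
        G = null_space_basis(H_others)      # stage 1: null intra-cell interference
        U, _, Vh = np.linalg.svd(Hk @ G)    # stage 2: SVD of the effective channel
        F = Vh[:Ns].conj().T                # strongest Ns right singular vectors
        T.append(G @ F)
        R.append(U[:, :Ns].conj().T)        # matching left singular vectors
    return T, R

rng = np.random.default_rng(1)
K, Nt, Nr, Ns = 2, 4, 2, 2
H_cell = [(rng.standard_normal((Nr, Nt)) + 1j * rng.standard_normal((Nr, Nt))) / np.sqrt(2)
          for _ in range(K)]
T, R = cbd_beamformers(H_cell, Ns)

# Largest intra-cell leakage over all (victim j, interferer k) pairs; numerically zero
leak = max(np.linalg.norm(H_cell[j] @ T[k]) for k in range(K) for j in range(K) if j != k)
```

With these choices the product of receiver, channel, and precoder for each user is diagonal, so the streams of one user do not interfere with each other either. The example uses $N_t = K N_r$, which is the tightest antenna configuration for which the null space still has $N_s = N_r$ dimensions left.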
From the above, it can be seen that one of the main issues with cBD is that it does nothing to reduce the effects of the ICI each user is receiving. This issue has been considered in [5], where the authors proposed the eBD algorithm to account for the ICI presence. The eBD algorithm is summarized as follows. First, to suppress the ICI effects, $\mathrm{MS}_{nk}$ uses the whitening matrix Then, the $\mathrm{BS}_n$ rate function $r_n^{\mathrm{cBD}}$ can be written as where As in the case of cBD, the transmit beamforming matrix of $\mathrm{MS}_{nk}$ is given as where $\mathbf{G}_{nk}$ is calculated similar to (5) from The $\mathbf{F}_{nk}$ and $\mathbf{R}_{nk}$ matrices are calculated similar to (8) from the $\mathrm{MS}_{nk}$ effective channel Consequently, the $\mathrm{BS}_n$ rate function $r_n^{\mathrm{eBD}}$ given by (10) is reduced to

From the above, one can see that both BD approaches, cBD and eBD, have the same dimensionality restrictions. The expressions given by (4) and (11) Moreover, it is important to note that both approaches use only the transmit beamforming $\mathbf{T}_n$ to eliminate the intra-cell interference, i.e., the receive beamforming $\mathbf{R}_n$ is not utilized. Motivated by the last observation, one possible way to reduce the dimensionality restrictions is to utilize the receive beamforming matrix when calculating the transmit beamforming matrix [6]. To achieve this end, the receive beamforming matrix $\mathbf{R}_{nk}$ can be included in (11), then we have Note that which is no longer a function of $N_r$. Calculating the null space from $\mathbf{H}_{nk}$ is always possible if, and only if, the number of data streams transmitted by a BS is less than or equal to its number of transmit antennas, i.e., the condition $[N_t - (K-1)N_s] \geq N_s$ should be satisfied. The following steps are very similar to the ones above. The transmit beamforming is given as where $\mathbf{G}_{nk}$ is calculated similarly from $\mathbf{H}_{nk}^{-k}$. The $\mathbf{F}_{nk}$ and $\mathbf{R}_{nk}$ matrices are calculated from the $\mathrm{MS}_{nk}$ effective channel Since the transmit and receive beamforming matrices are now coupled, the BS is required to conduct some iterations in order to achieve BD. Therefore, we refer to this approach as iterative BD (iBD) and summarize it in Algorithm 1.

Algorithm 1 iterative BD (iBD). ... 5: Repeat steps 2-4 (until convergence)
At the first step, Algorithm 1 initializes the transmit and receive beamforming matrices for all users. For instance, $\mathbf{T}_{nk}^{(1)}$ can be initialized using the Maximum Ratio Transmission (MRT) approach and $\mathbf{R}_{nk}^{(1)} = \mathbf{I}$. At the $t$-th iteration, each BS transmits pilot signals precoded with $\mathbf{T}_{nk}^{(t)}$ at step 2 so that each $\mathrm{MS}_{nk}$ can calculate the ICI covariance matrix, i.e., $\boldsymbol{\Upsilon}_{nk}^{(t)}$, and feed it back to its serving BS via feedback channels. After that, each BS updates the transmit and receive beamforming of its users at step 4. The aforementioned steps are repeated until convergence. Note that the transmit and receive beamforming matrices of all users are calculated at the BSs. Therefore, at convergence, each BS forwards the receive beamforming matrices to its users using the feedforward channels. With the transmit and receive beamforming matrices calculated as given by Algorithm 1, the $\mathrm{BS}_n$ rate function $r_n^{\mathrm{eBD}}$ given by (10) is reduced to Note that both equations (12) and (15) have the same structure. The following theorem indicates their relation.
Theorem 1: If the number of data streams transmitted to any user is equal to the number of its receive antennas, i.e., if $N_s = N_r$, then both eBD and iBD are equivalent and have exactly the same performance.
Proof 1: Please refer to Appendix A.

It is worth noting that if the system has only one cell, then the eBD algorithm is equivalent to cBD. In this case, the iBD algorithm is also equivalent to cBD, only if $N_s = N_r$, which is a straightforward result of Theorem 1.

IV. WEIGHTED SUM RATE MAXIMIZATION APPROACH
In this section, we use WSR as the transceiver design criterion. We assume that each MS employs single-user detection by treating the interference as additive noise. Therefore, the achievable rate of $\mathrm{MS}_{nk}$ can be written as where $\boldsymbol{\Phi}_{nk}$ denotes the received interference-plus-noise covariance matrix for $\mathrm{MS}_{nk}$, which is given as whereas $\boldsymbol{\Upsilon}_{nk}$ is given by (2), which denotes the ICI-plus-noise covariance matrix of $\mathrm{MS}_{nk}$. Here, we assume that each $\mathrm{MS}_{nk}$ uses Minimum-Mean-Square Error (MMSE) receive beamforming, which is given as [20] where (19), the MSE-matrix of $\mathrm{MS}_{nk}$ is given as [23] which can be equivalently expressed as [20] The latter form of $\mathbf{E}_{nk}$ in (21) shows that the rate function given by (17) can be equivalently expressed as Note that $\mathbf{E}_{nk}$ must be Hermitian, since from (21), $\mathbf{E}_{nk}$ equals a quantity (right-hand side) that is Hermitian, which means that $\mathbf{E}_{nk} = \mathbf{E}_{nk}^H$. Furthermore, the following lemma is needed throughout the rest of the paper.
Lemma 1: Given the MSE-matrix $\mathbf{E}_{nk}$ as in (20), or equivalently as in (21), the receive beamforming matrix $\mathbf{R}_{nk}$ can be written as
Proof 2: Please refer to Appendix B.
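The relations among the MMSE receiver, the MSE-matrix, and the rate can be checked numerically. The sketch below assumes the standard textbook forms: the MMSE receiver $\mathbf{G}^H(\mathbf{G}\mathbf{G}^H + \boldsymbol{\Phi})^{-1}$ with $\mathbf{G} = \mathbf{H}\mathbf{T}$, the MSE-matrix $(\mathbf{I} + \mathbf{G}^H\boldsymbol{\Phi}^{-1}\mathbf{G})^{-1}$, and the identity $\mathbf{R} = \mathbf{E}\mathbf{G}^H\boldsymbol{\Phi}^{-1}$ as one form that is consistent with Corollary 1. These expressions are assumptions of this example; the exact statements of (19)-(23) should be taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
Nt, Nr, Ns = 4, 3, 2

def crandn(*shape):
    """i.i.d. complex Gaussian entries with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

H = crandn(Nr, Nt)                          # serving channel
T = crandn(Nt, Ns)                          # transmit beamformer
B = crandn(Nr, Nr)
Phi = B @ B.conj().T + 0.1 * np.eye(Nr)     # interference-plus-noise covariance (Hermitian PD)

G = H @ T                                   # effective channel
I = np.eye(Ns)

# MMSE receiver (assumed standard form)
R = G.conj().T @ np.linalg.inv(G @ G.conj().T + Phi)

# MSE-matrix, general expression evaluated at the MMSE receiver ...
E_direct = (I - R @ G) @ (I - R @ G).conj().T + R @ Phi @ R.conj().T
# ... and its compact equivalent
E_compact = np.linalg.inv(I + G.conj().T @ np.linalg.solve(Phi, G))

# Receiver expressed through the MSE-matrix (assumed form of the lemma)
R_lemma = E_compact @ G.conj().T @ np.linalg.inv(Phi)

# Rate in bits computed from the SINR form and from the MSE-matrix
rate_direct = np.log2(np.linalg.det(I + G.conj().T @ np.linalg.solve(Phi, G)).real)
rate_mse = -np.log2(np.linalg.det(E_compact).real)
```

Both MSE expressions, both receiver expressions, and both rate expressions agree to numerical precision, which is the property the alternating updates rely on.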

A. Per-Cell WSR maximization via Interference Pricing
In this section, we consider an interference pricing approach for designing the transmit beamforming. The main idea is to manage the ICI received by a user by pricing the interfering BSs. Similar to [18], [31], we define the interference price as the marginal decrease in the user rate due to a marginal increase in the received interference. Mathematically, the $\mathrm{MS}_{nk}$ interference price is given as Using the result $\nabla \log|\mathbf{X}| = \mathrm{Tr}(\mathbf{X}^{-1}\nabla\mathbf{X})$, where $\mathbf{X}$ is a matrix [32], then $\pi_{nk}$ is given as By observing (25) and the Lemma 1 result, we have the following corollary.
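Corollary 1: The $\mathrm{MS}_{nk}$ interference price $\pi_{nk}$ given by (25) can be equivalently written as
$$\pi_{nk} = \mathrm{Tr}\left(\mathbf{R}_{nk}^H \mathbf{E}_{nk}^{-1} \mathbf{R}_{nk}\right). \tag{26}$$
Proof 3: According to Lemma 1, the receive beamforming can be written as Then, the $\mathrm{MS}_{nk}$ interference price $\pi_{nk}$ given by (25) is reduced to where (a) is obtained by substituting $\mathbf{R}_{nk}$ into the first equality and (b) is obtained by using the result $\mathrm{Tr}(\mathbf{XYZ}) = \mathrm{Tr}(\mathbf{YZX}) = \mathrm{Tr}(\mathbf{ZXY})$ [32], which completes the proof.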
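The interpretation of the price as a marginal rate loss can be verified with a finite difference. The sketch below is an illustration under two assumptions that are not taken from the paper: the marginal interference is modeled as a small isotropic increase $\epsilon\mathbf{I}$ of the interference-plus-noise covariance, and the rate is measured in nats (with a base-2 logarithm the price is scaled by $1/\ln 2$).

```python
import numpy as np

rng = np.random.default_rng(3)
Nt, Nr, Ns = 4, 2, 2

def crandn(*shape):
    """i.i.d. complex Gaussian entries with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

G = crandn(Nr, Nt) @ crandn(Nt, Ns)          # effective serving channel H T
B = crandn(Nr, Nr)
Phi = B @ B.conj().T + 0.5 * np.eye(Nr)      # interference-plus-noise covariance

def rate_nats(Phi_mat):
    """Achievable rate in nats for effective channel G and covariance Phi_mat."""
    X = np.eye(Ns) + G.conj().T @ np.linalg.solve(Phi_mat, G)
    return np.linalg.slogdet(X)[1]           # log|det X|

# MSE-matrix and MMSE receiver (standard forms, assumed)
E = np.linalg.inv(np.eye(Ns) + G.conj().T @ np.linalg.solve(Phi, G))
R = E @ G.conj().T @ np.linalg.inv(Phi)

# Price in trace form: Tr(R^H E^{-1} R)
price = np.trace(R.conj().T @ np.linalg.solve(E, R)).real

# Central finite difference of the rate with respect to isotropic extra interference
eps = 1e-6
fd = -(rate_nats(Phi + eps * np.eye(Nr)) - rate_nats(Phi - eps * np.eye(Nr))) / (2 * eps)
```

The trace expression and the negative finite-difference slope coincide up to discretization error, so a larger price indicates a user whose rate is more sensitive to additional interference.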
Let $\boldsymbol{\pi}_n = \{\pi_{mj}, \forall m \in \mathcal{M}\setminus n, \forall j \in \mathcal{K}_m\}$ denote the vector that collects the interference prices of all users in the system except the $\mathrm{BS}_n$ users. Then, define the following MS-specific function where $\mu_{nk} > 0$ denotes the weight associated with $\mathrm{MS}_{nk}$ and $L_{nk}$ defines the priced ICI caused by the $\mathrm{MS}_{nk}$ beamforming $\mathbf{T}_{nk}$, which is given as Afterwards, each $\mathrm{BS}_n$, $\forall n \in \mathcal{M}$, updates its transmit beamforming $\mathbf{T}_{nk}$, $\forall k \in \mathcal{K}_n$, as the solution to the following interference-priced WSR maximization problem where we have used an equality power constraint rather than the often used $\sum_{k \in \mathcal{K}_n} \mathrm{Tr}(\mathbf{T}_{nk}\mathbf{T}_{nk}^H) \leq p_n$, since the WSR optimum is reached at maximum transmit power [20]. From problem $\mathcal{P}_{\mathrm{WSRP}}$, one can see that this approach is different from the BD approach, in the sense that the ICI received by a user is managed by the interfering BSs and not by the serving BS.
In [18], the authors addressed problem $\mathcal{P}_{\mathrm{WSRP}}$ from a game-theoretic viewpoint assuming single-antenna users, where the function $f_{nk}(\boldsymbol{\pi}_n)$ is interpreted as a user utility function that penalizes the user rate by the ICI that the user is leaking. The problem in [18], however, was formulated as a relaxed SDP and its solution would require each BS to first obtain the transmit covariance matrices, i.e., $\mathbf{Q}_{nk} \triangleq \mathbf{t}_{nk}\mathbf{t}_{nk}^H \succeq 0$, followed by operations to guarantee and extract the rank-one transmit beamforming vectors, i.e., $\mathbf{t}_{nk}$. It was proven in [18] that the pricing algorithm is guaranteed to converge to an equilibrium point that corresponds to a KKT point of the original WSR maximization problem. In the following, we present a different solution to problem $\mathcal{P}_{\mathrm{WSRP}}$. The solution is obtained by investigating the KKT conditions of problem $\mathcal{P}_{\mathrm{WSRP}}$ and with the help of the Lemma 1 result. The solution of $\mathcal{P}_{\mathrm{WSRP}}$ w.r.t. the transmit beamforming for $\mathrm{MS}_{nk}$, $\forall k \in \mathcal{K}$, is given by Proposition 1.
Proposition 1: Let the receive beamforming $\mathbf{R}_{nk}$ and MSE-matrix $\mathbf{E}_{nk}$ for $\mathrm{MS}_{nk}$ be given by (19) and (20), respectively. Then, by utilizing the Lemma 1 result, the solution of problem $\mathcal{P}_{\mathrm{WSRP}}$ w.r.t. the transmit beamforming $\mathbf{T}_{nk}$ for $\mathrm{MS}_{nk}$, $\forall k \in \mathcal{K}_n$, is given as where $\lambda_n$, $\forall n \in \mathcal{M}$, are the Lagrange multipliers associated with the $\mathcal{P}_{\mathrm{WSRP}}$ constraint functions, $\mathbf{A}_{nk}$ and $\mathbf{B}_{nk}$ are given as
Proof 4: Please refer to Appendix C.

In (29), the $\lambda_n$, $\forall n \in \mathcal{M}$, are calculated to satisfy the power constraint at $\mathrm{BS}_n$, $\forall n \in \mathcal{M}$, by using the KKT condition $\lambda_n \left( \sum_{k \in \mathcal{K}_n} \mathrm{Tr}(\mathbf{T}_{nk}\mathbf{T}_{nk}^H) - p_n \right) = 0$ and by utilizing the fact that the transmit power is monotonically decreasing with respect to increasing $\lambda_n$ [21]. The closed-form solution (32) can be obtained by readapting the approach shown in [20].
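Because the transmit power decreases monotonically with the multiplier, the multiplier can also be found by a simple bisection instead of a closed form. The sketch below swaps in this bisection as an illustration; it assumes an update of the generic form $\mathbf{T}_k(\lambda) = (\mathbf{M} + \lambda\mathbf{I})^{-1}\mathbf{g}_k$, where the matrix `Mmat` stands in for the sum of the positive semidefinite terms and `rhs` for the right-hand sides. The names `power_used` and `solve_lambda` and the random test data are hypothetical.

```python
import numpy as np

def power_used(lam, Mmat, rhs):
    """Total transmit power sum_k ||(Mmat + lam I)^{-1} g_k||_F^2."""
    Nt = Mmat.shape[0]
    A = Mmat + lam * np.eye(Nt)
    return sum(np.linalg.norm(np.linalg.solve(A, g)) ** 2 for g in rhs)

def solve_lambda(Mmat, rhs, p, tol=1e-12, max_iter=200):
    """Bisection on lam so that power_used(lam) matches the budget p.
    If the budget is already met for a vanishing multiplier, that value is returned
    (in that case an equality power constraint cannot be met with lam >= 0)."""
    lo = 1e-9                                # tiny offset keeps the solve well-posed
    if power_used(lo, Mmat, rhs) <= p:
        return lo
    hi = 1.0
    while power_used(hi, Mmat, rhs) > p:     # grow the bracket until the power fits
        hi *= 2.0
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if power_used(mid, Mmat, rhs) > p:
            lo = mid                         # too much power: increase the multiplier
        else:
            hi = mid
        if hi - lo < tol * max(1.0, hi):
            break
    return 0.5 * (lo + hi)

rng = np.random.default_rng(4)
Nt = 4
C = rng.standard_normal((Nt, 2)) + 1j * rng.standard_normal((Nt, 2))
Mmat = C @ C.conj().T                        # rank-deficient PSD stand-in matrix
rhs = [rng.standard_normal((Nt, 1)) + 1j * rng.standard_normal((Nt, 1)) for _ in range(2)]
p = 1.0
lam = solve_lambda(Mmat, rhs, p)
```

In this example the stand-in matrix is rank-deficient, so the power constraint is necessarily active and the returned multiplier is strictly positive.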

B. Network-Wide WSR Maximization
In this section, we consider the general network-wide WSR maximization problem. Mathematically, the WSR maximization problem can be written as Problem $\mathcal{P}_{\mathrm{WSRM}}$ has been addressed in [20]-[25]. For all the algorithms presented in these references, $\mathcal{P}_{\mathrm{WSRM}}$ was solved by exploiting its relation to the WMMSE minimization problem, which was initially shown in [20]. Different from all of these, in the following, we propose a novel solution that directly solves $\mathcal{P}_{\mathrm{WSRM}}$. Similar to $\mathcal{P}_{\mathrm{WSRP}}$, the solution is obtained by investigating the KKT conditions of problem $\mathcal{P}_{\mathrm{WSRM}}$ and with the help of the Lemma 1 result. The solution of $\mathcal{P}_{\mathrm{WSRM}}$ w.r.t. the transmit beamforming for $\mathrm{MS}_{nk}$, $\forall n \in \mathcal{M}$, $\forall k \in \mathcal{K}_n$, is given by Proposition 2.
Proposition 2: Let the receive beamforming $\mathbf{R}_{nk}$ and the MSE-matrix $\mathbf{E}_{nk}$ for $\mathrm{MS}_{nk}$ be given by (19) and (20), respectively. Then, by utilizing the Lemma 1 result, the solution to $\mathcal{P}_{\mathrm{WSRM}}$ w.r.t. the transmit beamforming for $\mathrm{MS}_{nk}$, $\forall n \in \mathcal{M}$, $\forall k \in \mathcal{K}_n$, is given as where $\lambda_n$, $\forall n \in \mathcal{M}$, are the Lagrange multipliers associated with the $\mathcal{P}_{\mathrm{WSRM}}$ constraint functions, $\mathbf{A}_{nk}$ is given by (30), and $\mathbf{C}_n$ is given as
Proof 5: The proof can be shown by generalizing the derivation steps shown in Appendix C and thus we omit them here for brevity.
By observing (33), we can see that it is closely related to (29). Theorem 2 shows the connection between both equations in a special case.
Theorem 2: Both equations $\mathbf{T}_{nk}^{\mathrm{WSRP}}$ and $\mathbf{T}_{nk}^{\mathrm{WSRM}}$, given by (29) and (33), respectively, are equal if $N_r = 1$ and the $\mathrm{MS}_{nk}$ interference price $\pi_{nk}$ given by (26) is replaced by $\bar{\pi}_{nk}$ that is given as
$$\bar{\pi}_{nk} = \mu_{nk}\pi_{nk}. \tag{35}$$
Proof 6: When $N_r = 1$, the interference price $\pi_{nk}$ in (26) reduces to $\pi_{nk} = \mathbf{R}_{nk}^H \mathbf{E}_{nk}^{-1} \mathbf{R}_{nk}$, since both terms $\mathbf{R}_{nk}$ and $\mathbf{E}_{nk}$ are scalars. By substituting $\pi_{nk}$ into the $\mathbf{A}_{nk}$ and $\mathbf{C}_n$ terms, we have Since $\mathbf{A}_{nk}$ is common to both, the only difference is between the $\mathbf{B}_n$ and $\mathbf{C}_n$ terms. Now, comparing $\mathbf{B}_n$ to $\mathbf{C}_n$, we can see that both terms are equal if each interference price in $\mathbf{B}_n$ is replaced by $\bar{\pi}_{mj} = \mu_{mj}\pi_{mj}$, which completes the proof.
The result of Theorem 2 establishes a relation between problems $\mathcal{P}_{\mathrm{WSRM}}$ and $\mathcal{P}_{\mathrm{WSRP}}$. When $N_r = 1$, the problems $\mathcal{P}_{\mathrm{WSRP}}$ and $\mathcal{P}_{\mathrm{WSRM}}$ are exactly equivalent. In this case, the receive beamforming and MSE terms are scalars and directly specify the interference prices of the MSs. However, when $N_r > 1$, $\mathcal{P}_{\mathrm{WSRP}}$ provides a suboptimal solution to $\mathcal{P}_{\mathrm{WSRM}}$. In this latter case, the interference prices cannot exploit the spatial dimension that the receive beamforming brings, since $\mathbf{R}_{nk}^H \mathbf{E}_{nk}^{-1} \mathbf{R}_{nk}$ has a dimension of $N_r \times N_r$, irrespective of the number of data streams $N_s$, whereas the interference price $\pi_{nk} = \mathrm{Tr}(\mathbf{R}_{nk}^H \mathbf{E}_{nk}^{-1} \mathbf{R}_{nk})$ is a scalar. Therefore, when $N_r > 1$, the $\mathbf{C}_n$ term given by (34) contains extra information, as compared to $\mathbf{B}_n$ given by (31), which can be exploited by the BSs to reshape the interference.
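The difference between a scalar price and the full $N_r \times N_r$ weighting can be made concrete with a small numerical comparison. The sketch below assumes that the contribution of one victim user to the pricing term has the form $\mu\,\pi\,\mathbf{H}^H\mathbf{H}$ and that its contribution to the network-wide term has the form $\mu\,\mathbf{H}^H\mathbf{R}^H\mathbf{E}^{-1}\mathbf{R}\mathbf{H}$, with $\mathbf{H}$ the cross channel from the interfering BS. These forms follow the structure described in the proof of Theorem 2 but are assumptions of this example, and the function name `pricing_terms` is hypothetical.

```python
import numpy as np

def pricing_terms(Nr, Ns, Nt=4, mu=2.0, seed=6):
    """Return (scalar-priced term, matrix-weighted term) for one victim user."""
    rng = np.random.default_rng(seed)

    def crandn(*shape):
        return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

    G = crandn(Nr, Ns)                           # victim's effective serving channel H T
    B = crandn(Nr, Nr)
    Phi = B @ B.conj().T + 0.5 * np.eye(Nr)      # victim's interference-plus-noise covariance
    E = np.linalg.inv(np.eye(Ns) + G.conj().T @ np.linalg.solve(Phi, G))
    R = E @ G.conj().T @ np.linalg.inv(Phi)      # MMSE receiver (standard form, assumed)

    W = R.conj().T @ np.linalg.solve(E, R)       # Nr x Nr weighting R^H E^{-1} R
    price = np.trace(W).real                     # scalar interference price
    Hx = crandn(Nr, Nt)                          # cross channel: interfering BS -> victim

    scalar_term = mu * price * Hx.conj().T @ Hx  # pricing approach (weighted price mu * pi)
    matrix_term = mu * Hx.conj().T @ W @ Hx      # network-wide approach
    return scalar_term, matrix_term

B1, C1 = pricing_terms(Nr=1, Ns=1)               # single-antenna MS: terms coincide
B2, C2 = pricing_terms(Nr=2, Ns=1)               # multi-antenna MS: terms differ
```

For a single receive antenna the two terms are identical, whereas for two receive antennas and one stream the matrix-weighted term has rank one and therefore penalizes only the interference direction that the receiver actually picks up.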
Similar to (29), the $\lambda_n$, $\forall n \in \mathcal{M}$, in (33) are calculated to satisfy the power constraint at $\mathrm{BS}_n$, $\forall n \in \mathcal{M}$. The closed-form solution of (33) can be obtained similar to (32), by replacing the $\mathbf{B}_n$ term with $\mathbf{C}_n$.

C. WSR Maximization Based on Self-pricing
In this approach, we consider a different strategy in which each BS self-prices the ICI it is leaking to other cells. As a result, when compared to $\mathcal{P}_{\mathrm{WSRP}}$, the BSs do not need to collect the interference prices from the users when calculating the transmit beamforming.
In order to show this, let us assume for a moment that the rate function of a given user, say $\mathrm{MS}_{mj}$, $j \in \mathcal{K}_m$, is mostly degraded by the transmit beamforming from a single interfering BS, say $\mathrm{BS}_n$, $n \neq m$. This covers many scenarios, for example when the BSs other than $\mathrm{BS}_n$ cause negligible mutual interference, or when they use a transmit beamforming strategy that eliminates ICI by any means. Therefore, the interference-plus-noise covariance matrix of $\mathrm{MS}_{mj}$ can be approximated as where the approximation is used due to the assumption that the intra-cell interference as well as the ICI from interfering BSs other than $\mathrm{BS}_n$ are negligible. By (36), the achievable rate of $\mathrm{MS}_{mj}$ is given as Considering the high Signal-to-Noise Ratio (SNR) regime, the rate function of $\mathrm{MS}_{mj}$ in (37) can be approximated as From (38), it can be noticed that the second term on the right-hand side represents the performance loss of $\mathrm{MS}_{mj}$ in terms of rate due to the beamforming at $\mathrm{BS}_n$. An important point to observe is that this rate loss at $\mathrm{MS}_{mj}$ is already known to $\mathrm{BS}_n$, as it denotes the interference leakage from $\mathrm{BS}_n$ to $\mathrm{MS}_{mj}$. Therefore, $\mathrm{BS}_n$ can consider an implicit interference pricing approach to reduce the interference leakage towards the MSs of other cells. Let $\boldsymbol{\Psi}_n = \{\boldsymbol{\Psi}_{mj}, \forall m \in \mathcal{M}\setminus n, \forall j \in \mathcal{K}_m\}$ and define the following $\mathrm{BS}_n$-specific function Using (39), each $\mathrm{BS}_n$, $\forall n \in \mathcal{M}$, updates the transmit beamforming $\mathbf{T}_{nk}$, $\forall k \in \mathcal{K}_n$, as the solution to the following optimization problem The solution of $\mathcal{P}_{\mathrm{WSRH}}$ w.r.t. the transmit beamforming for $\mathrm{MS}_{nk}$, $\forall n \in \mathcal{M}$, $\forall k \in \mathcal{K}_n$, is given by Proposition 3.
Proposition 3: Let the receive beamforming $\mathbf{R}_{nk}$ and MSE-matrix $\mathbf{E}_{nk}$ for $\mathrm{MS}_{nk}$ be given by (19) and (20), respectively. Then, by utilizing the Lemma 1 result, the solution to $\mathcal{P}_{\mathrm{WSRH}}$ w.r.t. the transmit beamforming for $\mathrm{MS}_{nk}$, $\forall n \in \mathcal{M}$, $\forall k \in \mathcal{K}_n$, is given as where $\mathbf{A}_{nk}$ is given by (30) and
Proof 7: The derivation steps are similar to the ones shown in Appendix C and thus omitted here for brevity.
Similar to (29), $\lambda_n$, $\forall n \in \mathcal{M}$, in (40) are calculated to satisfy the power constraint at $\mathrm{BS}_n$, $\forall n \in \mathcal{M}$. The closed-form solutions of (40) can be obtained similar to (32), by replacing the $\mathbf{B}_n$ term with $\mathbf{D}_n$ given by (41).

D. Algorithm Design and Convergence Analysis
To solve any of the problems $\mathcal{P}_{\mathrm{WSRP}}$, $\mathcal{P}_{\mathrm{WSRM}}$, or $\mathcal{P}_{\mathrm{WSRH}}$, an algorithm based on alternating optimization can be used [21]-[25]. The basic idea is to optimize each problem with respect to one variable at a time, while keeping the rest of the variables fixed. The proposed algorithm to solve any of these optimization problems is summarized in Algorithm 2. We refer to this algorithm as WSRP when solving $\mathcal{P}_{\mathrm{WSRP}}$, as WSRM when solving $\mathcal{P}_{\mathrm{WSRM}}$, and as WSRH when solving $\mathcal{P}_{\mathrm{WSRH}}$.
Algorithm 2 WSR Max. via Alternating Optimization. ... $\forall k \in \mathcal{K}_n$, using (40). 12: end if 13: Repeat steps 2-12 (until convergence)

In step 1, Algorithm 2 initializes the transmit beamforming matrices for all users in the system by any means. Afterwards, the algorithm alternates between the following three steps. In steps 2 and 3, all MSs calculate, in parallel, their receive beamforming and MSE matrices, respectively, for the given transmit beamforming. Next, all BSs calculate, in parallel, their transmit beamforming, using any of the three approaches, for the given receive beamforming and MSE matrices. If this iterative process converges, it converges to a fixed point that is a stationary point of the WSR objective function [20]. It is worth noting that in a single-cell scenario all proposed algorithms coincide, since they differ from one another only in the ICI handling.
For alternating optimization, monotonic convergence of the objective to a stationary (locally optimal) point is guaranteed if each step has a unique optimum [33, Proposition 2.7.1]. The requirement for the transmit beamforming optimization to be unique is that the matrix to be inverted in (29), (33), or (40) is invertible, i.e., $(\mathbf{A}_{nk} + \mathbf{B}_n/\mathbf{C}_n/\mathbf{D}_n + \lambda_n \mathbf{I})^{-1}$ exists. One sufficient condition for invertibility is that all the power constraints are active, i.e., $\sum_{k \in \mathcal{K}_n} \mathrm{Tr}(\mathbf{T}_{nk}\mathbf{T}_{nk}^H) = p_n$, so that we always have $\lambda_n > 0$, which is the case in the formulation of our problems, since the WSR optimum is reached at maximum transmit power [20]. Another condition is that there are at least $N_t$ active vectors whose effective channels are linearly independent, i.e., rank In practice, the cases in which the matrix is non-invertible, and the optimal beamforming solution is not unique, are very rare [25]. Nevertheless, if the matrix is not invertible, the pseudo-inverse may be used to get a solution. However, since the original problems $\mathcal{P}_{\mathrm{WSRP}}$, $\mathcal{P}_{\mathrm{WSRM}}$, and $\mathcal{P}_{\mathrm{WSRH}}$ are non-convex, a globally optimal point cannot be found, in general, via alternating optimization. Moreover, different initializations and iteration orders may converge to different local WSR-optima [21]-[25].
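The alternating structure described above can be sketched end to end for the network-wide variant. The example below is a minimal illustration and not the paper's exact update: it assumes a WMMSE-style transmit update $\mathbf{T}_{nk} = \mu_{nk}(\boldsymbol{\Sigma}_n + \lambda_n\mathbf{I})^{-1}\mathbf{H}_{nnk}^H\mathbf{R}_{nk}^H\mathbf{E}_{nk}^{-1}$, where $\boldsymbol{\Sigma}_n$ collects the weighted terms of all users seen from $\mathrm{BS}_n$, uses an inequality power constraint, and finds the multiplier by bisection rather than in closed form. All dimensions, weights, and function names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
M, K, Nt, Nr, Ns = 2, 2, 4, 2, 1
p, sigma2 = 1.0, 0.1
mu = np.ones((M, K))                      # user weights

def crandn(*shape):
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

# H[m][n][k]: channel from BS m to MS k in cell n
H = [[[crandn(Nr, Nt) for _ in range(K)] for _ in range(M)] for _ in range(M)]

def receivers_and_mse(T):
    """MS side: MMSE receivers R and MSE-matrices E for fixed transmit beamformers."""
    R = [[None] * K for _ in range(M)]
    E = [[None] * K for _ in range(M)]
    for n in range(M):
        for k in range(K):
            cov = sigma2 * np.eye(Nr, dtype=complex)   # total received covariance
            for m in range(M):
                for j in range(K):
                    G = H[m][n][k] @ T[m][j]
                    cov += G @ G.conj().T
            G = H[n][n][k] @ T[n][k]
            R[n][k] = G.conj().T @ np.linalg.inv(cov)
            E[n][k] = np.eye(Ns) - R[n][k] @ G
    return R, E

def wsr(E):
    """Weighted sum-rate in bits, using rate = -log2|E|."""
    return sum(-mu[n, k] * np.log2(np.linalg.det(E[n][k]).real)
               for n in range(M) for k in range(K))

def update_T(n, R, E):
    """BS side: transmit update at BS n for fixed receivers and MSE-matrices."""
    Sigma = np.zeros((Nt, Nt), dtype=complex)
    for m in range(M):
        for j in range(K):
            A = R[m][j] @ H[n][m][j]                   # effective channel BS n -> MS mj
            Sigma += mu[m, j] * A.conj().T @ np.linalg.solve(E[m][j], A)
    rhs = [mu[n, k] * H[n][n][k].conj().T @ R[n][k].conj().T @ np.linalg.inv(E[n][k])
           for k in range(K)]

    def power(lam):
        return sum(np.linalg.norm(np.linalg.solve(Sigma + lam * np.eye(Nt), g)) ** 2
                   for g in rhs)

    lo, hi = 1e-9, 1.0
    lam = lo
    if power(lo) > p:                                  # constraint active: bisect on lam
        while power(hi) > p:
            hi *= 2.0
        for _ in range(60):
            mid = 0.5 * (lo + hi)
            lo, hi = (mid, hi) if power(mid) > p else (lo, mid)
        lam = hi                                       # hi always satisfies the budget
    return [np.linalg.solve(Sigma + lam * np.eye(Nt), g) for g in rhs]

# Initialization: random beamformers with the power budget split equally among users
T = [[crandn(Nt, Ns) for _ in range(K)] for _ in range(M)]
T = [[np.sqrt(p / K) * t / np.linalg.norm(t) for t in row] for row in T]

history = []
for _ in range(30):
    R, E = receivers_and_mse(T)                        # all MSs, in parallel
    history.append(wsr(E))
    T = [update_T(n, R, E) for n in range(M)]          # all BSs, in parallel
```

The list `history` records the weighted sum-rate before each transmit update. Under the assumed update it is non-decreasing, which mirrors the monotonic convergence argument, while the final value still depends on the random initialization.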

V. WSR MAXIMIZATION SIGNALING SCHEMES
The proposed algorithms above are decentralized, in the sense that each BS can calculate its transmit beamforming locally once it has the required information. Here, we assume that each BS$_n$, $\forall n$, has access to the local CSI, i.e., $H_{nmj}$, $\forall m$, $\forall j$, as in [23]-[25]. TDD operation is an effective technique for obtaining this CSI, where uplink training in conjunction with reciprocity provides the BSs with downlink and uplink CSI simultaneously [27], [28]. In the following, we propose a novel OTA signaling scheme based on TDD mode to facilitate the implementation of the algorithms.
We assume that 1) each BS and MS has orthogonal pilot symbols (training) in the downlink and uplink direction, respectively, for the OTA signaling, 2) each TDD frame is divided into two parts, a signaling part and a data part, as shown in Fig. 2, where the signaling part is further divided into downlink and uplink sub-parts to facilitate the exchange of variables between BSs and MSs, and 3) all exchanged variables are perfectly estimated at each iteration.
In the downlink, we assume that each BS$_n$ transmits pilot signals that are precoded with the transmit beamforming $T_{nk}$, $\forall k \in \mathcal{K}_n$. Thus, each MS$_{nk}$ can estimate the downlink equivalent channels $H_{mnk} T_{mj}$, $\forall m$, $\forall j$, and update its receive beamforming $R_{nk}$ and MSE-matrix $E_{nk}$. On the other hand, to update the transmit beamforming, each algorithm has different signaling needs. Therefore, we propose the following two signaling schemes.
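As an illustration of how a receiver could obtain an equivalent channel from precoded orthogonal pilots, the following NumPy sketch uses a least-squares estimate under the perfect-estimation (noise-free) assumption stated above. The DFT-based pilot design and the function names `dft_pilots` and `estimate_equivalent_channel` are assumptions for this sketch and are not specified in the paper.

```python
import numpy as np

def dft_pilots(num_streams, length):
    """Rows are mutually orthogonal pilot sequences with squared norm `length`.

    Assumed pilot design (DFT rows); requires num_streams <= length.
    """
    n = np.arange(length)
    k = np.arange(num_streams)[:, None]
    return np.exp(-2j * np.pi * k * n / length)

def estimate_equivalent_channel(Y, S):
    """Least-squares estimate of G from Y = G S, using S S^H = length * I."""
    return Y @ S.conj().T / S.shape[1]

# Downlink example: the MS observes pilots precoded with T through H.
rng = np.random.default_rng(0)
H = (rng.standard_normal((2, 4)) + 1j * rng.standard_normal((2, 4))) / np.sqrt(2)
T = (rng.standard_normal((4, 2)) + 1j * rng.standard_normal((4, 2))) / np.sqrt(2)
S = dft_pilots(num_streams=2, length=8)
Y = H @ T @ S                      # noise-free received pilot block
G_hat = estimate_equivalent_channel(Y, S)
```

The same estimator applies in the uplink, with the roles of transmitter and receiver exchanged and the pilots precoded with the receive beamforming.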

A. Signaling Scheme A
In this scheme, we assume that each MS$_{nk}$ transmits an uplink pilot signal that is precoded with the receive beamforming $R_{nk}$. Thus, each BS$_n$ can estimate the uplink equivalent channels $R_{mj} H_{nmj}$, $\forall m$, $\forall j$, and calculate $E_{nk}$, $\forall k \in \mathcal{K}_n$, using (20). This information is sufficient to calculate $A_{nk}$, $\forall k \in \mathcal{K}_n$, which is common to the three algorithms.
For WSRH, $A_{nk}$, $\forall k \in \mathcal{K}_n$, is all that is needed to update the transmit beamforming $T_{nk}$, $\forall k \in \mathcal{K}_n$, since the second term $D_n$ can be calculated locally. However, for WSRP, each BS$_n$ would require the vector $\pi_n$, which collects all interference prices from the users of other cells, to calculate $B_n$. The direct approach, as assumed in [18], is to let each MS$_{nk}$ calculate its interference price $\pi_{nk}$ and feed it back to its serving BS, i.e., BS$_n$. Then, all BSs perform a broadcast-and-gather operation of their interference prices using the backhaul. Different from [18], we assume that each BS$_n$ first recalculates the receive beamforming as $R_{nk} = (I - E_{nk})(H_{nnk} T_{nk})^{-1}$, $\forall k \in \mathcal{K}_n$, using local information (see Appendix B), and then calculates the interference prices and exchanges them with the other BSs using the backhaul. Thus, we do not need the feedback from the MSs. On the other hand, for WSRM, each BS$_n$ would require $\mu_{mj} E_{mj}$, $\forall j \in \mathcal{K} \setminus \mathcal{K}_n$, to calculate $C_n$. Using this signaling scheme, one possible way, as assumed in [25], is to let the BSs exchange them using the backhaul.
From the above, we can see that signaling Scheme A is best suited to WSRH, since no further feedback of variables is required. However, for WSRM (WSRP), the feedback of matrices (scalars) between BSs is required. To reduce the signaling overhead of WSRM and WSRP, we further propose the following signaling scheme.

B. Signaling Scheme B
In this scheme, we assume that each MS$_{nk}$ transmits a pilot signal that is precoded with $\sqrt{\mu_{nk}} E_{nk}^{-1/2} R_{nk}$ [23], [25]. Thus, each BS$_n$ can estimate the uplink equivalent channels $\sqrt{\mu_{mj}} E_{mj}^{-1/2} R_{mj} H_{nmj}$, $\forall m$, $\forall j$, which are sufficient to calculate $A_{nk}$, $\forall k \in \mathcal{K}_n$, and $B_n$ or $C_n$. However, with WSRM and WSRP, each BS$_n$ still requires $R_{nk}$, $\forall k \in \mathcal{K}_n$, to calculate the uplink equivalent channels $\mu_{nk} R_{nk} H_{nnk}$, $\forall k \in \mathcal{K}_n$. One possible way, as proposed in [23], is to let each MS$_{nk}$ transmit two consecutive uplink pilot signals, one precoded with $\sqrt{\mu_{nk}} E_{nk}^{-1/2} R_{nk}$ and another precoded with $R_{nk}$. However, this approach would unnecessarily increase the signaling overhead. In the following, we propose an alternative approach, where the main idea is to let each BS$_n$ recalculate $R_{nk}$, $\forall k \in \mathcal{K}_n$, using only local information and thus reduce the signaling overhead.

Let $X_{nk} = \sqrt{\mu_{nk}} E_{nk}^{-1/2} R_{nk} H_{nnk}$ denote the uplink equivalent channel with MS$_{nk}$ estimated at BS$_n$. Substituting $R_{nk} = (I - E_{nk})(H_{nnk} T_{nk})^{-1}$ into $X_{nk}$ and simplifying the resulting expression, we have
$$X_{nk} = \sqrt{\mu_{nk}} E_{nk}^{-1/2} (I - E_{nk}) Y_{nk},$$
where $Y_{nk} = (H_{nnk} T_{nk})^{-1} H_{nnk}$. Right-multiplying both sides of the latter equation by $Y_{nk}^{\dagger}$ and again simplifying the resulting expression, we obtain (43), which relates $E_{nk}$ to a matrix $M_{nk}$ that is formed using local information. Then, (43) can be solved for $E_{nk}$ iteratively, as given by Algorithm 3.

Algorithm 3 Recalculating $R_{nk}$ at the BS for Scheme B.
In step 1, Algorithm 3 constructs the local matrix $M_{nk}$ and randomly initializes $E^{(1)}_{nk}$. Given those initial matrices, the algorithm alternates between steps 2 and 3 at each iteration. At the $t$-th iteration, step 2 solves for $\bar{E}^{(t)}_{nk}$ given $E^{(t)}_{nk}$. Then, at step 3, the algorithm solves for $E^{(t+1)}_{nk}$ given $\bar{E}^{(t)}_{nk}$. Those two steps are repeated until convergence. If the matrix $M_{nk}$ is assumed perfect, the algorithm is able to recalculate the MSE-matrix $E_{nk}$ perfectly. Then, using $E_{nk}$, step 5 calculates the receive beamforming matrix $R_{nk}$. The convergence behavior of Algorithm 3 is shown numerically in the next section. A proof of convergence remains open and is left for future work.
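As a numerical sanity check of the relation between the estimated uplink channel $X_{nk}$ and the MSE-matrix $E_{nk}$, the sketch below recovers $E_{nk}$ from local quantities. It is not Algorithm 3. It relies on three assumptions that are not stated in the text: $E_{nk}^{-1/2}$ is the Hermitian square root, $N_s = N_r$ with $H_{nnk}$ of full row rank (so that $Y_{nk} Y_{nk}^{\dagger} = I$), and the resulting relation $X_{nk} Y_{nk}^{\dagger} / \sqrt{\mu_{nk}} = E_{nk}^{-1/2} - E_{nk}^{1/2}$ is solved through an eigendecomposition. The function names are hypothetical.

```python
import numpy as np

def herm_sqrt(A):
    """Hermitian square root of a Hermitian positive definite matrix."""
    w, U = np.linalg.eigh(A)
    return (U * np.sqrt(w)) @ U.conj().T

def recover_mse_matrix(X, H, T, mu):
    """Recover E from X = sqrt(mu) E^{-1/2} (I - E) (H T)^{-1} H.

    Assumed derivation: with Y = (H T)^{-1} H and Y Y^+ = I,
    M = X Y^+ / sqrt(mu) = E^{-1/2} - E^{1/2}, which is Hermitian.
    Each eigenvalue s of E^{1/2} then solves s^2 + m s - 1 = 0.
    """
    Y = np.linalg.solve(H @ T, H)             # Y = (H T)^{-1} H
    M = X @ np.linalg.pinv(Y) / np.sqrt(mu)
    M = 0.5 * (M + M.conj().T)                # remove numerical asymmetry
    m, U = np.linalg.eigh(M)
    s = 0.5 * (-m + np.sqrt(m ** 2 + 4.0))    # positive root
    return (U * s ** 2) @ U.conj().T          # E = U diag(s^2) U^H
```

With $E_{nk}$ in hand, the receive beamforming follows from $R_{nk} = (I - E_{nk})(H_{nnk} T_{nk})^{-1}$, as in the text.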

VI. NUMERICAL RESULTS
In this section, we evaluate the performance of the proposed algorithms by means of simulation. We consider a flat Rayleigh fading scenario with uncorrelated channels between antennas, i.e., each element of $H_{nnk}$, $\forall n \in \mathcal{M}$, $\forall k \in \mathcal{K}_n$, is an i.i.d. complex Gaussian random variable with zero mean and unit variance. For each simulated algorithm, we initialize the transmit beamforming matrices $T_{nk}$ using the MRT approach, which is built from $V_{nk}$, the matrix holding in its columns the right singular vectors of $H_{nnk}$ arranged in decreasing order of their singular values. Moreover, we assume that the noise variance is $\sigma^2_{nk} = \sigma^2 = 1$. For comparison, we show simulation results of the WSR-WMMSE algorithm from [21]-[25], MRT, and the Matrix Orthogonal Projection (MOP) approach [34]. For the MOP approach, the transmit beamforming for MS$_{nk}$ involves $P_{nk}$, an $[N_s \times N_s]$ diagonal matrix holding the $N_s$ power allocations found using the water-filling method [30], and a matrix that collects all channel matrices from BS$_n$ to all MSs except MS$_{nk}$.
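The simulation setup above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions: the MRT scaling (equal power over the $N_s$ dominant right singular vectors) and the function names `rayleigh_channel`, `mrt_init`, and `water_filling` are hypothetical, and the water-filling routine is the textbook form over a set of channel gains, not necessarily the exact variant used for MOP.

```python
import numpy as np

def rayleigh_channel(n_rx, n_tx, rng):
    """Channel with i.i.d. complex Gaussian entries, zero mean, unit variance."""
    real = rng.standard_normal((n_rx, n_tx))
    imag = rng.standard_normal((n_rx, n_tx))
    return (real + 1j * imag) / np.sqrt(2)

def mrt_init(H, n_streams, power):
    """MRT-style initialization from the dominant right singular vectors.

    Assumed scaling: equal power per stream, so that Tr(T T^H) = power.
    """
    _, _, Vh = np.linalg.svd(H)          # singular values in decreasing order
    V = Vh.conj().T[:, :n_streams]
    return np.sqrt(power / n_streams) * V

def water_filling(gains, total_power):
    """Textbook water-filling: p_i = max(0, level - 1/g_i), sum(p_i) = total_power."""
    gains = np.asarray(gains, dtype=float)
    order = np.argsort(gains)[::-1]      # strongest channel first
    inv = 1.0 / gains[order]
    for k in range(len(gains), 0, -1):   # try k active channels, largest k first
        level = (total_power + inv[:k].sum()) / k
        if level > inv[k - 1]:
            break
    p_sorted = np.maximum(level - inv, 0.0)
    p_sorted[k:] = 0.0                   # channels below the water level get no power
    p = np.empty_like(p_sorted)
    p[order] = p_sorted                  # restore the original ordering
    return p
```

For instance, `water_filling([2.0, 1.0], 2.0)` allocates more power to the stronger channel while using the full budget.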

A. Example 1: Algorithm 3 Convergence
In this example, we show simulation results to evaluate the convergence behavior of Algorithm 3. Fig. 3 shows the log-scale convergence results of Algorithm 3 for the first user, i.e., MS$_{11}$, in terms of the Absolute Error between $E_{11}$, the perfect MSE-matrix of MS$_{11}$, and $E^{(t)}_{11}$, the MSE-matrix obtained at the $t$-th iteration. Each simulated point is averaged over 1,000 channel realizations.
From Fig. 3, it can be seen that Algorithm 3 converges fast and is able to obtain the perfect MSE-matrix within a few iterations. Note that the algorithm converges faster as $N_t$ increases, as $N_s$ decreases, and as $MK$ decreases. We note that all simulated channel realizations have converged to the perfect MSE-matrix, although the convergence of some channel realizations is not necessarily monotonic.

B. Example 2: Convergence Behavior of Algorithms 1 and 2
In this example, we show simulation results to evaluate the convergence behavior of Algorithms 1 and 2. Fig. 4 shows the averaged sum rate convergence results assuming SNR = 10 dB.
From Fig. 4, we can see that all iterative algorithms converge fast, within 1 to 2 iterations for iBD and within 10 to 15 iterations for the other algorithms. It is worth noting that WSRM has a slightly faster convergence rate than WSR-WMMSE, although both algorithms seem to converge to almost the same point. However, the convergence speed of either algorithm varies across individual channel realizations. For some channel realizations, WSRM appears to converge slightly faster and to a higher sum rate than WSR-WMMSE, and vice versa for the other channel realizations.

C. Example 3: Sum Rate Performance for Single-Cell Case
In this example, we show simulation results to evaluate the sum rate performance of Algorithms 1 and 2 in the single-cell case, i.e., $M = 1$. Fig. 5 shows the average sum rate results for a range of SNR values, assuming $\mu_{nk} = \mu = 1$. Note that, when $M = 1$, algorithms WSRM, WSRP, and WSRH are all equivalent, since the $B_n$, $C_n$, and $D_n$ terms are all identity matrices and all algorithms share the $A_{nk}$ calculation. Therefore, for this example, we only show WSRM results.
From Fig. 5, it can be seen that when $[K, N_r, N_s] = [3, 1, 1]$ (solid lines), both iBD and MOP perform very close to WSR-WMMSE and WSRM, which seem to have the same sum rate performance. However, for the other simulated scenarios, when $K$ and/or $N_s$ increases while $N_t$ is kept fixed, both iBD and MOP suffer a large performance loss as compared to WSR-WMMSE and WSRM, although they maintain the same multiplexing gain. None of the iBD, MOP, and MRT algorithms can achieve a good balance between the altruistic and egoistic behaviors of users: iBD and MOP behave completely altruistically, whereas MRT behaves completely egoistically. Consequently, they incur a performance loss as compared to WSR-WMMSE and WSRM.
To examine the impact of user weights, Fig. 6 shows the simulation results for a system with equal and unequal user weights. From Fig. 6, it can be seen that when all users have equal weights, they achieve equal performance on average. However, when a user has a larger weight than the others, MS$_3$ in this case, the algorithm favors that user, which thus achieves better performance. In terms of sum rate performance, the system with equal user weights performs better than the system with unequal weights. The reason is that when the algorithm favors one user over the others, the performance of the user(s) with lower weight, MS$_1$ in this case, degrades. In general, the increase of one user's rate does not compensate for the loss in the other users' rates. Thus, the algorithm loses in terms of sum rate.
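To make the distinction between the weighted objective and the plain sum rate concrete, the following sketch evaluates both from a set of MSE-matrices. It assumes that the user rate is $r_{nk} = \log_2 \det(E_{nk}^{-1})$; the base of the logarithm and the function names `user_rate` and `weighted_sum_rate` are assumptions for illustration.

```python
import numpy as np

def user_rate(E):
    """Rate in bits per channel use, assuming r = log2 det(E^{-1})."""
    _, logdet = np.linalg.slogdet(E)     # natural log of |det(E)|
    return -logdet / np.log(2.0)

def weighted_sum_rate(mse_matrices, weights):
    """WSR = sum_k mu_k * r_k over the given users."""
    return sum(mu * user_rate(E) for mu, E in zip(weights, mse_matrices))

# Two users: the plain sum rate uses unit weights, the WSR uses mu = (1, 2).
E_list = [0.5 * np.eye(2), 0.25 * np.eye(1)]
sum_rate = weighted_sum_rate(E_list, [1.0, 1.0])
wsr = weighted_sum_rate(E_list, [1.0, 2.0])
```

The algorithms maximize the weighted quantity, whereas Fig. 6 reports the unweighted sum rate, which is why unequal weights can reduce the reported sum rate.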

D. Example 4: Sum Rate Performance for Multicell Case
In this example, we show simulation results to evaluate the sum rate performance of Algorithms 1 and 2 in the multicell case, i.e., $M > 1$, assuming $\mu_{nk} = \mu = 1$. Fig. 7 shows the average sum rate results for a range of SNR values.
From Fig. 7, we can see that WSRM and WSRP have exactly the same sum rate performance when $N_r = N_s = 1$ (solid lines), since both algorithms are equivalent, as shown in Theorem 2. However, when $N_r = N_s = 2$, we can see that WSRP has some performance loss as compared to WSRM, and the performance loss increases as the SNR increases. Furthermore, MOP (not feasible when $N_r = N_s = 2$) has the same multiplexing gain as WSRM, but with a very large performance loss. On the other hand, both iBD and MRT have a flat performance as the SNR increases, due to severe ICI. For WSRH, we can see that it performs close to WSRM over the entire SNR range, with a small performance loss when $N_r = N_s = 1$. However, as $N_s$ increases, the performance loss increases as well, since $N_t$ is fixed. WSRH is a self-pricing algorithm that is distributed between cells. Thus, the algorithm has less transmit coordination than WSRM and WSR-WMMSE.
In Fig. 8, we show the sum rate performance while varying the number of MS antennas $N_r$ and fixing the other parameters. From Fig. 8, we can see that all algorithms have better sum rate performance as $N_r$ increases. However, the sum rate of WSRP increases much more slowly than that of the others, which translates into a higher rate loss as compared to WSRM. On the other hand, as $N_r$ increases, iBD starts to have better sum rate performance than MRT, in contrast to the results in Fig. 7. The reason is that when $N_r$ increases, the interference whitening method is more effective at reducing the ICI effects and thus yields better performance.
In Fig. 9, we show the sum rate performance while varying the number of BS antennas and fixing the other parameters. As discussed above, we can see from Fig. 9 that all algorithms have better sum rate performance as $N_t$ increases. Different from Fig. 8, the sum rates of WSRP and WSRH increase as $N_t$ increases, thus reducing the rate loss as compared to WSRM. Meanwhile, iBD has a slower sum rate increase as compared to the results in Fig. 8.

VII. CONCLUSIONS
We have considered the Weighted Sum-Rate (WSR) maximization problem in the multicell MIMO Broadcast Channel (BC) and proposed three different algorithms, which are based on the alternating optimization technique and are guaranteed to converge to a local WSR-optimum. For all algorithms, the transmit beamforming matrices are obtained by investigating the Karush-Kuhn-Tucker (KKT) conditions of the problems with the help of Lemma 1. In contrast to the WSR-WMMSE algorithm from [21]-[25], which solves the WSR maximization problem indirectly by solving the Weighted Minimum-Mean-Square Error (WMMSE) minimization problem, the proposed algorithms in this paper provide a direct solution to the WSR maximization problem. Using computer simulations, it was shown that the proposed algorithms achieve comparable sum rate performance to the WSR-WMMSE algorithm, while using fewer iterations. Further, it was shown that the network-wide WSR maximization can be equivalently solved using an interference pricing approach if 1) each Mobile Station (MS) is equipped with a single antenna and 2) the users' weights are included in the interference prices. Furthermore, two different signaling schemes based on Time Division Duplex (TDD) mode were also proposed to facilitate the implementation of the algorithms. Different from existing schemes, the proposed signaling schemes reduce the signaling overhead and require no feedback of variables between Base Stations (BSs).

A. Proof of Theorem 1
First, note that the matrix $H^{-k}_{nk}$ given by (14) can be written as a function of the matrix $H^{-k}_{nk}$ given by (11), which shows that the two matrices are proportional to each other. Consequently, their null spaces, and hence the corresponding matrices $G_{nk}$, are also proportional to each other. Therefore, the singular values calculated using (7) assuming $H^{e}_{nk}$ given by (15) are exactly equal to the ones calculated assuming $H^{e}_{nk}$ given by (12), which completes the proof.

B. Proof of Lemma 1
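In the lemma, we claim that $R_{nk} = E_{nk} T_{nk}^H H_{nnk}^H \Phi_{nk}^{-1}$. To prove this, assume $N_s = N_r$ and solve for $R_{nk}$ from (20) as $R_{nk} = (I - E_{nk})(H_{nnk} T_{nk})^{-1}$. Then, our claim is that
$$(I - E_{nk})(H_{nnk} T_{nk})^{-1} = E_{nk} T_{nk}^H H_{nnk}^H \Phi_{nk}^{-1}$$
$$\overset{(a)}{\Longleftrightarrow} \; E_{nk}^{-1}(I - E_{nk})(H_{nnk} T_{nk})^{-1} = T_{nk}^H H_{nnk}^H \Phi_{nk}^{-1}$$
$$\overset{(b)}{\Longleftrightarrow} \; (E_{nk}^{-1} - I)(H_{nnk} T_{nk})^{-1} = T_{nk}^H H_{nnk}^H \Phi_{nk}^{-1},$$
where (a) is obtained by left-multiplying both sides by $E_{nk}^{-1}$ and (b) is obtained by simplifying (a). From (21), $E_{nk}^{-1} = I_{N_r} + T_{nk}^H H_{nnk}^H \Phi_{nk}^{-1} H_{nnk} T_{nk}$. Substituting $E_{nk}^{-1}$ into (b) and simplifying the resulting expression, we have
$$T_{nk}^H H_{nnk}^H \Phi_{nk}^{-1} H_{nnk} T_{nk} (H_{nnk} T_{nk})^{-1} = T_{nk}^H H_{nnk}^H \Phi_{nk}^{-1},$$
i.e., both sides are equal to $T_{nk}^H H_{nnk}^H \Phi_{nk}^{-1}$, which completes the proof.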
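The identity in Lemma 1 can also be checked numerically. The sketch below draws random matrices with $N_s = N_r$ and compares the two expressions for $R_{nk}$; the helper name `lemma1_receivers` and the way the positive definite matrix $\Phi_{nk}$ is generated are assumptions for illustration.

```python
import numpy as np

def lemma1_receivers(H, T, Phi):
    """Return E and the two expressions for R compared in Lemma 1.

    E        = (I + T^H H^H Phi^{-1} H T)^{-1}
    R_lemma  = E T^H H^H Phi^{-1}
    R_mse    = (I - E) (H T)^{-1}        (requires N_s = N_r)
    """
    G = H @ T                             # equivalent channel H T
    Phi_inv = np.linalg.inv(Phi)
    eye = np.eye(G.shape[1])
    E = np.linalg.inv(eye + G.conj().T @ Phi_inv @ G)
    R_lemma = E @ G.conj().T @ Phi_inv
    R_mse = (eye - E) @ np.linalg.inv(G)
    return E, R_lemma, R_mse

rng = np.random.default_rng(2)

def complex_gaussian(rows, cols):
    return (rng.standard_normal((rows, cols))
            + 1j * rng.standard_normal((rows, cols))) / np.sqrt(2)

H = complex_gaussian(2, 4)                # N_r = 2, N_t = 4
T = complex_gaussian(4, 2)                # N_s = 2
B = complex_gaussian(2, 2)
Phi = B @ B.conj().T + np.eye(2)          # Hermitian positive definite
E, R_lemma, R_mse = lemma1_receivers(H, T, Phi)
```

Both expressions agree up to numerical precision, and $E_{nk} = I - R_{nk} H_{nnk} T_{nk}$ holds for either of them.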

C. Proof of Proposition 1
From the KKT conditions, a local optimum must satisfy $\nabla_{T_{nk}} L = 0$, $\forall k \in \mathcal{K}_n$, where $\nabla_{T_{nk}} L$ denotes the complex gradient of $L$ with respect to $T_{nk}$ and $L$ denotes the Lagrangian function of problem $P_{WSRP}$. The gradient is a matrix whose $[p, q]$-th element is defined as $[\nabla_{T_{nk}} L]_{[p,q]} = \nabla_{[T_{nk}]_{[p,q]}} L$. In order to calculate $\nabla_{T_{nk}} L$, we first need to calculate $\nabla_{T_{nk}} r_{nk}$, $\nabla_{T_{nk}} r_{nj}$, $\forall j \in \mathcal{K}_n \setminus k$, and $\nabla_{T_{nk}} L_{nk}$ by utilizing the following results from [32]: $\nabla \log\det(X) = \mathrm{Tr}(X^{-1} \nabla X)$ and $\nabla(X^{-1}) = -X^{-1} (\nabla X) X^{-1}$, where $X$ is a matrix.
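The two matrix-calculus identities used in this proof can be checked with finite differences. The sketch below does so for a real symmetric positive definite $X$ and a perturbation direction $D$, which is a simplification of the complex-gradient setting of the proof; the function names are hypothetical.

```python
import numpy as np

def logdet_directional_derivative(X, D):
    """d/dt log det(X + t D) at t = 0, i.e., Tr(X^{-1} D)."""
    return np.trace(np.linalg.solve(X, D))

def inverse_directional_derivative(X, D):
    """d/dt (X + t D)^{-1} at t = 0, i.e., -X^{-1} D X^{-1}."""
    X_inv = np.linalg.inv(X)
    return -X_inv @ D @ X_inv

def central_difference(f, X, D, h=1e-6):
    """Numerical directional derivative of f at X along D."""
    return (f(X + h * D) - f(X - h * D)) / (2.0 * h)

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 3))
X = A @ A.T + 3.0 * np.eye(3)            # symmetric positive definite
S = rng.standard_normal((3, 3))
D = 0.5 * (S + S.T)                      # symmetric perturbation direction

fd_logdet = central_difference(lambda Z: np.linalg.slogdet(Z)[1], X, D)
fd_inverse = central_difference(np.linalg.inv, X, D)
```

Comparing `fd_logdet` with `logdet_directional_derivative(X, D)` and `fd_inverse` with `inverse_directional_derivative(X, D)` gives agreement up to the finite-difference error.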
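First, $\nabla_{[T_{nk}]_{[p,q]}} r_{nk} = \mathrm{Tr}(E_{nk} \nabla_{[T_{nk}]_{[p,q]}} E_{nk}^{-1})$. Here, $\nabla_{[T_{nk}]_{[p,q]}} E_{nk}^{-1} = e_q e_p^H H_{nnk}^H \Phi_{nk}^{-1} H_{nnk} T_{nk}$, where $e_p$ ($e_q$) is a vector of dimension $N_t$ ($N_r$) with one at the $p$-th ($q$-th) element and zeros elsewhere. Then, we have
$$\nabla_{[T_{nk}]_{[p,q]}} r_{nk} = \mathrm{Tr}(E_{nk} e_q e_p^H H_{nnk}^H \Phi_{nk}^{-1} H_{nnk} T_{nk}) = e_p^H H_{nnk}^H \Phi_{nk}^{-1} H_{nnk} T_{nk} E_{nk} e_q.$$
Since $\nabla_{[T_{nk}]_{[p,q]}} r_{nk} = [\nabla_{T_{nk}} r_{nk}]_{[p,q]}$, then
$$\nabla_{T_{nk}} r_{nk} = H_{nnk}^H \Phi_{nk}^{-1} H_{nnk} T_{nk} E_{nk}.$$
Furthermore, $\nabla_{[T_{nk}]_{[p,q]}} r_{nj}$, $\forall j \in \mathcal{K}_n \setminus k$, is given as $\mathrm{Tr}(E_{nj} \nabla_{[T_{nk}]_{[p,q]}} E_{nj}^{-1})$. First, $\nabla_{[T_{nk}]_{[p,q]}} E_{nj}^{-1}$ is given as
$$\nabla_{[T_{nk}]_{[p,q]}} E_{nj}^{-1} = -T_{nj}^H H_{nnj}^H [\Phi_{nj}^{-1} (\nabla_{[T_{nk}]_{[p,q]}} \Phi_{nj}) \Phi_{nj}^{-1}] H_{nnj} T_{nj},$$
whereas $\nabla_{[T_{nk}]_{[p,q]}} \Phi_{nj} = H_{nnj} T_{nk} e_q e_p^H H_{nnj}^H$. Combining all results together, we have
$$\nabla_{[T_{nk}]_{[p,q]}} r_{nj} = -e_p^H H_{nnj}^H \Phi_{nj}^{-1} H_{nnj} T_{nj} E_{nj} T_{nj}^H H_{nnj}^H \Phi_{nj}^{-1} H_{nnj} T_{nk} e_q.$$
Therefore, $\nabla_{T_{nk}} r_{nj}$, $\forall j \in \mathcal{K}_n \setminus k$, is given as
$$\nabla_{T_{nk}} r_{nj} = -H_{nnj}^H \Phi_{nj}^{-1} H_{nnj} T_{nj} E_{nj} T_{nj}^H H_{nnj}^H \Phi_{nj}^{-1} H_{nnj} T_{nk}.$$
Finally, $\nabla_{T_{nk}} L_{nk}$ is given as
$$\nabla_{T_{nk}} L_{nk} = \sum_{m \in \mathcal{M} \setminus n} \sum_{j \in \mathcal{K}_m} \mu_{mj} H_{nmj}^H H_{nmj} T_{nk}.$$
From above, $\nabla_{T_{nk}} L$ is given as
$$\nabla_{T_{nk}} L = \mu_{nk} H_{nnk}^H \Phi_{nk}^{-1} H_{nnk} T_{nk} E_{nk} - \tilde{A}_{nk} T_{nk} - B_n T_{nk} - \lambda_n T_{nk}. \tag{48}$$
From (48), it can be seen that it is not possible to solve for $T_{nk}$ directly. However, according to Lemma 1, we can write $R_{nk}^H = \Phi_{nk}^{-1} H_{nnk} T_{nk} E_{nk}$ (note that $E_{nk} = E_{nk}^H$). Then, the gradient function (48) can be rewritten in terms of $R_{nk}$ as (49), from which we can solve for $T_{nk}$ directly. Thus, we have the result given in Proposition 1.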

Francisco R. P. Cavalcanti received his B.Sc. and M.Sc. degrees in electrical engineering from UFC in 1994 and 1996, respectively, and his D.Sc. degree in electrical engineering from the State University of Campinas, São Paulo, Brazil, in 1999. Upon graduation, he joined UFC, where he is currently an associate professor and holds the Wireless Communications Chair with the Department of Teleinformatics Engineering. In 2000, he founded and since then has directed GTEL, which is a research laboratory based in Fortaleza that focuses on the advancement of wireless telecommunications technologies. At GTEL, he manages a program of research projects in wireless communications sponsored by the Ericsson Innovation Center in Brazil and Ericsson Research in Sweden. He has produced a varied body of work including two edited books, conference and journal papers, international patents, and computer software dealing with subjects such as radio resource allocation, cross-layer algorithms, quality of service provisioning, radio transceiver architectures, signal processing, and project management. He is a distinguished researcher of the Brazilian Scientific and Technological Development Council for his technology development and innovation record. He also holds a Leadership and Management professional certificate from the Massachusetts Institute of Technology, Cambridge.
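$\ldots, R_{nK}\}$, and $\Upsilon_n = \mathrm{Bdiag}\{\Upsilon_{n1}, \ldots, \Upsilon_{nK}\}$, where $\Upsilon_{nk}$ denotes the ICI-plus-noise covariance matrix of MS$_{nk}$, which is given as

1: Initialize $R_{nk}$, $\forall k \in \mathcal{K}$, and set $t = 1$.
2: BS$_n$, $\forall n$: Transmit data using $T_{nk}$, $\forall k \in \mathcal{K}_n$.
3: MS$_{nk}$, $\forall k$: Calculate $\Upsilon_{nk}$ and feed it back to BS$_n$.
4: BS$_n$, $\forall n$: Calculate $R$

and exchanges them with the other BSs using the backhaul. Thus, we do not need the feedback from the MSs. On the other hand, for WSRM, each BS$_n$ would require $\mu_{mj} E_{mj}$, $\forall j \in \mathcal{K} \setminus \mathcal{K}_n$, to calculate $C_n$. Using this signaling scheme, one possible way,

$\ldots \Phi_{nj}^{-1}] H_{nnj} T_{nj}$, whereas $\nabla_{[T_{nk}]_{[p,q]}} \Phi_{nj} = H_{nnj} T_{nk} e_q e_p^H H_{nnj}^H$. Combining all results together, we have $\nabla_{[T_{nk}]_{[p,q]}} r_{nj}$. Therefore, $\nabla_{T_{nk}} r_{nj}$, $\forall j \in \mathcal{K}_n \setminus k$, is given as
$$\nabla_{T_{nk}} r_{nj} = -H_{nnj}^H \Phi_{nj}^{-1} H_{nnj} T_{nj} E_{nj} T_{nj}^H H_{nnj}^H \Phi_{nj}^{-1} H_{nnj} T_{nk}.$$
Finally, $\nabla_{T_{nk}} L_{nk}$ is given as
$$\nabla_{T_{nk}} L_{nk} = \sum_{m \in \mathcal{M} \setminus n} \sum_{j \in \mathcal{K}_m} \mu_{mj} H_{nmj}^H H_{nmj} T_{nk}.$$
From above, $\nabla_{T_{nk}} L$ is given as
$$\nabla_{T_{nk}} L = \mu_{nk} H_{nnk}^H \Phi_{nk}^{-1} H_{nnk} T_{nk} E_{nk} - \tilde{A}_{nk} T_{nk} - B_n T_{nk} - \lambda_n T_{nk}.$$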