Least Squares Channel Estimation for Analog Network Coding Over Frequency Selective Fading Channels

Two-way relay networks can have their throughputs improved by adopting physical-layer network coding. To work properly, these systems need to know the channel impulse responses involved. Most previous works on channel estimation for physical-layer network coding systems consider flat fading or frequency-selective fading together with orthogonal frequency division multiplexing modulation. However, for single carrier systems under frequency-selective channels, the estimators proposed in the literature cannot be applied directly. In order to solve this problem, a least squares channel estimator for these scenarios is proposed here. Simulations are performed to evaluate the performance of this channel estimator and the results show the effectiveness of the proposed technique.


I. INTRODUCTION
The first work on communication systems that use two-way channels for exchanging information between two nodes was presented by Shannon in 1961 [1]. In his work, Shannon discussed several channels through which it was possible to exchange information between two nodes in both directions at the same time. One of these channels would allow the two nodes to send their information at the same time and frequency, and both nodes would receive the XOR mapping of the bits sent by each user. Shannon did not show how such a system could be implemented, but developed the mathematics necessary to understand it. The subject gained renewed interest when a technique called Physical-layer Network Coding (PNC) was proposed in 2006 [2], [3]. PNC adopts a Two-way Relay Channel (TWRC) network and allows two nodes to send data simultaneously to the relay, improving the throughput of the network [3].
PNC systems take advantage of the electromagnetic interference between the signals sent by two or more nodes at the same time and frequency. Specifically, in the TWRC, two nodes send information simultaneously to the relay node, which must perform the PNC mapping [2] in order to transform the received signals originating from both users into one that can be recognized by the end nodes. It is shown in [4] that the signal interference itself can be considered a PNC mapping and, therefore, the relay only needs to amplify the received signal and send it to the end nodes. The nodes must then be able to extract the information of interest from this signal. This protocol is known as Amplify-and-Forward (AF), and the system is said to use Analog Network Coding (ANC), since the relay does not need to perform any detection and the mapping occurs in the analog domain.
Most works on PNC systems assume that the channel impulse responses (CIRs) are perfectly known at the relay and at the end nodes. However, in practical situations, these CIRs are not known a priori. Hence, channel estimation techniques are essential for the practical deployment of PNC systems. Because ANC systems are easy to implement, most works on channel estimation focus on them. Furthermore, it can be shown that, for ANC systems, channel estimation is only necessary for detection at the end nodes, i.e., only the impulse responses of the cascaded channels need to be estimated, while knowledge of the individual channels is not required [5]-[10].
Therefore, the literature on channel estimation techniques focuses on the TWRC network and ANC systems. The work in [6] derives the maximum likelihood (ML) estimator and the Linear Maximum Signal-to-Noise Ratio (LMSNR) estimator, considering flat fading channels. It also proposes the optimal training sequence that reduces the mean squared error (MSE) associated with those estimators. In [7], the authors developed Least Squares (LS) estimators for Orthogonal Frequency Division Multiplexing (OFDM) systems, using either one OFDM symbol as the training sequence or pilot subcarriers. The Zadoff-Chu training sequence for TWRC systems was proposed in [8] for the LS estimator and OFDM modulated systems, in order to reduce the peak-to-average power ratio (PAPR). The Linear Minimum Mean Squared Error (LMMSE) estimator for OFDM systems was addressed in [9], showing that it outperforms the LS estimator in terms of MSE. The work in [10] proposes a different channel estimation technique: it uses the cyclic time shifting property of the Discrete Fourier Transform to make the channels separable at the relay. However, while previous works estimated the channels at the end nodes, in [10] they are estimated at the relay, and the end nodes are assumed to receive the estimated CIRs through a perfect feedback channel.
The majority of works on channel estimation for PNC systems, including the ones cited previously, consider only flat fading environments [5], [6], or frequency-selective fading environments combined with OFDM modulation [7]-[10], since OFDM is a robust technique against the frequency selectivity of wireless channels.
However, the PNC technique could also be applied in single carrier (SC) communication systems, such as GSM [11] and satellite networks with CDMA [12]-[14]. In the same way, several communication protocols and standards for IoT and Wireless Sensor Networks make use of SC modulation, and their importance has increased in recent years. For instance, binary phase-shift keying (BPSK) modulation is used in several open standards such as 6LoWPAN and ZigBee, and differential BPSK is used in SigFox [15]. These systems are also subject to the effects of the wireless channel. Therefore, this work proposes an LS channel estimation technique for ANC systems using SC modulation over frequency-selective fading channels. The works in [6], [7], [9] also develop specific training sequences to minimize the MSE of the estimator. However, it is shown here that there is no significant reduction in the MSE when compared to a random training sequence.
This work extends the work in [16] by presenting a detailed system model and derivation of the estimator, its MSE expression, and the design of an optimal training sequence. Furthermore, it presents results with some modifications to the scenarios considered, together with a more detailed discussion of, and insights into, the obtained results.
The remainder of this paper is organized as follows: Section II presents the system model that is used in this work; Section III develops the proposed least squares channel estimator, while simulation results are presented in Section IV; finally, conclusions are made in Section V.

II. SYSTEM MODEL
A simple two-way relay network (TWRN) typically has two source nodes (1 and 2), also referred to as user nodes, and one relay node (R), as shown in Figure 1, where node 1 sends information to node 2, and vice versa, using node R to assist their communication.
The traditional network using Time-Division Multiple Access (TDMA) would need four transmission stages to exchange data between nodes 1 and 2, as shown in Figure 1: the first stage is used to send data from node 1 to node R; the second, to send this data from node R to node 2; the third, to send data from node 2 to node R; and the fourth, to send data from node R to node 1. By applying Network Coding (NC) concepts [17], it is possible to reduce to three the number of time intervals needed to exchange data between nodes 1 and 2, as shown in Figure 2. The relay node detects the signals from both user nodes separately and performs the NC of their bits before transmission. To illustrate how NC could be implemented in a TWRN, consider, for instance, that node 1 transmits bit $s_1$ and node 2 transmits bit $s_2$. The simplest way to do NC is to perform a bitwise XOR operation $s_R = s_1 \oplus s_2$ at the relay. The relay then transmits the bit $s_R$ to both users, each of which performs this same operation on the received bit and the bit it transmitted. At node 1, for example, the information transmitted by node 2 can be estimated as:
$$\hat{s}_2 = s_R \oplus s_1 = (s_1 \oplus s_2) \oplus s_1 = s_2. \tag{1}$$
Therefore, node 1 is able to recover the bit sent by node 2 from the bit $s_R$ sent by the relay.
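The XOR-based NC exchange described above can be illustrated with a minimal sketch. The function names `relay_encode` and `recover_other` are hypothetical and serve only this illustration; error-free detection at the relay and at the end nodes is assumed.

```python
def relay_encode(s1: int, s2: int) -> int:
    """Relay combines the two detected bits with a bitwise XOR (s_R = s_1 XOR s_2)."""
    return s1 ^ s2


def recover_other(s_r: int, own_bit: int) -> int:
    """An end node XORs the relay bit with the bit it transmitted itself."""
    return s_r ^ own_bit


# Exhaustive check over the four possible bit pairs.
for s1 in (0, 1):
    for s2 in (0, 1):
        s_r = relay_encode(s1, s2)
        assert recover_other(s_r, s1) == s2  # node 1 recovers the bit of node 2
        assert recover_other(s_r, s2) == s1  # node 2 recovers the bit of node 1
```

Since XOR is its own inverse, a single broadcast bit is enough for both end nodes, which is what removes one of the four TDMA stages.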
Finally, when applying the PNC technique, it only takes two transmission stages to exchange data between the two user nodes, as shown in Figure 3. In this case, nodes 1 and 2 send their information simultaneously to node R. This stage is called Multiple Access (MAC). In the next stage, R amplifies the signal by a factor $\alpha$ and forwards the resulting signal to both nodes 1 and 2. This factor can be used to adjust the power of the signal transmitted by the relay. This stage is called Broadcast (BC). The relay protocol presented above is called AF, and the user nodes just need to remove their self-information to obtain the information sent by the other user. The AF protocol is also called ANC, as discussed before. This work focuses specifically on this protocol.

The signal model used in this work is shown in Figure 4. Let $x_i(n)$ be the baseband symbol sent by node $i$ at the discrete time instant $n$, $h_{iR}(n)$ be the CIR between nodes $i$ and R, and $w_R(n)$ be the additive white Gaussian noise (AWGN) with distribution $\mathcal{N}(0, \sigma_w^2)$ at node R, and let $*$ denote the convolution operation. The discrete-time baseband signal received at R can be written as:
$$y_R(n) = x_1(n) * h_{1R}(n) + x_2(n) * h_{2R}(n) + w_R(n). \tag{2}$$
Due to the symmetry of this network, the performance can be analyzed at node 1 only, since it is equivalent for node 2. Let $h_{Ri}(n)$ be the CIR between nodes R and $i$, and $w_1(n)$ be the AWGN at node 1, also with distribution $\mathcal{N}(0, \sigma_w^2)$. The signal received at node 1 in the BC stage can be written as:
$$y_1(n) = \alpha\, y_R(n) * h_{R1}(n) + w_1(n). \tag{3}$$
The cascaded channels can be defined as $a(n) = \alpha\, h_{1R}(n) * h_{R1}(n)$ and $b(n) = \alpha\, h_{2R}(n) * h_{R1}(n)$, and the equivalent noise as $w(n) = \alpha\, w_R(n) * h_{R1}(n) + w_1(n)$, which is also a zero-mean Gaussian random variable, with a variance that depends on the channel power profile of $h_{R1}(n)$ and on the variances of $w_R(n)$ and $w_1(n)$. Then, substituting (2) into (3), the signal received at node 1 in the BC stage can be rewritten as:
$$y_1(n) = a(n) * x_1(n) + b(n) * x_2(n) + w(n). \tag{4}$$
Detection at node 1 can be performed as [18]:
$$\hat{x}_2(n) = \arg\min_{x_2(n)} \left| y_1(n) - a(n) * x_1(n) - b(n) * x_2(n) \right|^2. \tag{5}$$
This means that, given $y_1(n)$, $x_1(n)$, $a(n)$ and $b(n)$, it is possible to obtain an estimate of $x_2(n)$ through maximum likelihood data detection, which, in this case, turns out to be the value of $x_2$ that minimizes the squared absolute difference between the received signal $y_1(n)$ and a noise-free version of the transmitted signal $\alpha y_R(n)$.
A practical way of performing this at node 1 is to first remove its self-information from $y_1(n)$, i.e.:
$$\tilde{x}_2(n) = y_1(n) - a(n) * x_1(n) = b(n) * x_2(n) + w(n). \tag{6}$$
Then, the signal $\tilde{x}_2(n)$ needs to be equalized. Linear equalizers such as Zero-Forcing (ZF), or algorithms such as Maximum Likelihood Sequence Estimation (MLSE), can be used for this purpose [11].
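The MAC and BC stages and the self-information removal can be checked numerically with a short sketch. It assumes a noise-free link, BPSK symbols, an arbitrary gain `alpha = 0.5`, and randomly drawn channel taps; all variable names are illustrative and not taken from the original model.

```python
import numpy as np

rng = np.random.default_rng(0)


def complex_taps(n: int) -> np.ndarray:
    """Draw n i.i.d. zero-mean, unit-variance complex Gaussian channel taps."""
    return (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)


n_ch, n_sym, alpha = 5, 200, 0.5           # illustrative values (assumptions)
x1 = rng.choice([-1.0, 1.0], size=n_sym)   # BPSK symbols of node 1
x2 = rng.choice([-1.0, 1.0], size=n_sym)   # BPSK symbols of node 2
h_1r, h_2r, h_r1 = complex_taps(n_ch), complex_taps(n_ch), complex_taps(n_ch)

# MAC stage: superposition of both signals at the relay (noise omitted).
y_r = np.convolve(x1, h_1r) + np.convolve(x2, h_2r)
# BC stage: the relay amplifies and forwards; node 1 sees it through h_r1.
y_1 = alpha * np.convolve(y_r, h_r1)

# Cascaded channels seen by node 1.
a = alpha * np.convolve(h_1r, h_r1)
b = alpha * np.convolve(h_2r, h_r1)

# The cascaded-channel form of the received signal matches the two-stage form.
assert np.allclose(y_1, np.convolve(a, x1) + np.convolve(b, x2))

# Self-information removal leaves only the signal of node 2 filtered by b(n).
residual = y_1 - np.convolve(a, x1)
assert np.allclose(residual, np.convolve(b, x2))
```

Note that each cascaded channel has `2 * n_ch - 1` taps, which is why node 1 only needs `a` and `b`, and not the individual channels.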
The model described above shows the importance of estimating the cascaded CIRs $a(n)$ and $b(n)$ at the end nodes for self-information extraction and equalization. It also shows one advantage of the ANC scheme: it only needs knowledge of the CIR at the end nodes, and not at the relay node. Thus, the next section is devoted to the development of a channel estimator for PNC systems over frequency-selective fading channels.

III. LEAST SQUARES CHANNEL ESTIMATION
As mentioned in Section I, most works on channel estimation for PNC systems consider frequency-selective channels with OFDM modulation. In this work, a time-domain LS estimator is proposed to estimate frequency-selective, time-invariant CIRs for PNC systems using SC transmission.
Assuming that nodes 1 and 2 send $N$ training symbols $x_1(n)$ and $x_2(n)$, respectively, (4) can be interpreted as a linear model that can be written in matrix form. To do so, a matrix $\mathbf{X}$ can be defined as:
$$\mathbf{X} = \begin{bmatrix} \mathbf{X}_1 & \mathbf{X}_2 \end{bmatrix}, \tag{7}$$
where $\mathbf{X}_1$ and $\mathbf{X}_2$ are convolution matrices, which have the structure shown in (8), containing the training symbols sent by nodes 1 and 2, respectively.
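$$\mathbf{X}_i = \begin{bmatrix} x_i(0) & 0 & \cdots & 0 \\ x_i(1) & x_i(0) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ x_i(N-1) & x_i(N-2) & \cdots & x_i(N-L) \end{bmatrix} \tag{8}$$
In this matrix, the element $x_i(n)$ is the symbol sent by user $i$ at the discrete time $n$. Considering that the channels have length $N_{CH}$, the cascaded channels $a(n)$ and $b(n)$ have length $L = 2N_{CH} - 1$. Therefore, $\mathbf{X}$ has dimension $N \times 2L$. Although it has been assumed that all channels have the same length, the estimator can be easily generalized to channels with different lengths.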
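Let $\mathbf{h}$ be the column vector that contains the coefficients of both cascaded channels:
$$\mathbf{h} = \begin{bmatrix} \mathbf{a}^T & \mathbf{b}^T \end{bmatrix}^T, \tag{9}$$
where $\mathbf{a} = [a(0)\ a(1)\ \cdots\ a(L-1)]^T$ and $\mathbf{b} = [b(0)\ b(1)\ \cdots\ b(L-1)]^T$. Let $\mathbf{w} = [w(0)\ w(1)\ \cdots\ w(N-1)]^T$ be a vector containing $N$ samples of $w(n)$, and $\mathbf{y} = [y_1(0)\ y_1(1)\ \cdots\ y_1(N-1)]^T$ be a vector of $N$ samples of the signal received at node 1, where $y_i(n)$ represents the sample received at instant $n$ at node $i$.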
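Then, it is possible to write (4) in matrix form as:
$$\mathbf{y} = \mathbf{X}\mathbf{h} + \mathbf{w}. \tag{10}$$
The LS estimate of $\mathbf{h}$ can be obtained from [18]:
$$\hat{\mathbf{h}} = \arg\min_{\mathbf{h}} \left\| \mathbf{y} - \mathbf{X}\mathbf{h} \right\|^2. \tag{11}$$
Thus, the LS channel estimator may be obtained by minimizing:
$$J(\mathbf{h}) = (\mathbf{y} - \mathbf{X}\mathbf{h})^H (\mathbf{y} - \mathbf{X}\mathbf{h}). \tag{12}$$
By applying the distributive property and the Hermitian product of matrices [18], the last equation can be rewritten as:
$$J(\mathbf{h}) = \mathbf{y}^H\mathbf{y} - \mathbf{y}^H\mathbf{X}\mathbf{h} - \mathbf{h}^H\mathbf{X}^H\mathbf{y} + \mathbf{h}^H\mathbf{X}^H\mathbf{X}\mathbf{h}. \tag{13}$$
Differentiating (13) with respect to $\mathbf{h}$ and setting the result equal to zero yields the normal equations:
$$\mathbf{X}^H\mathbf{X}\,\hat{\mathbf{h}} = \mathbf{X}^H\mathbf{y}. \tag{14}$$
Thus, the solution to (11) is given by [18]:
$$\hat{\mathbf{h}} = (\mathbf{X}^H\mathbf{X})^{-1}\mathbf{X}^H\mathbf{y} = \mathbf{X}^{\dagger}\mathbf{y}, \tag{15}$$
where $\hat{\mathbf{h}}$ contains the estimates of $\mathbf{a}$ and $\mathbf{b}$, and $\mathbf{X}^{\dagger}$ denotes the pseudoinverse of $\mathbf{X}$, which is given by [19]:
$$\mathbf{X}^{\dagger} = (\mathbf{X}^H\mathbf{X})^{-1}\mathbf{X}^H. \tag{16}$$
It is worth noting that the estimated cascaded channels in (15) are used for self-information extraction and equalization at the end nodes.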
It is also important to highlight that, although the estimator in (15) is based on classic equations, some adaptations of the model, which were not previously reported in the literature, were necessary to employ it in ANC systems.
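A minimal numerical sketch of the estimator in (15) is given below. It assumes that each convolution matrix is an $N \times L$ lower-triangular Toeplitz matrix (no symbols transmitted before the training block), a noise-free received signal, and arbitrary complex taps for the cascaded channels; the helper names `conv_matrix` and `ls_estimate` are hypothetical.

```python
import numpy as np


def conv_matrix(x: np.ndarray, L: int) -> np.ndarray:
    """N x L Toeplitz matrix with entries x[n - k], taken as zero for n < k (assumption)."""
    N = len(x)
    X = np.zeros((N, L), dtype=complex)
    for k in range(L):
        X[k:, k] = x[: N - k]
    return X


def ls_estimate(x1, x2, y, L):
    """LS estimate of the two cascaded channels from N received training samples."""
    X = np.hstack([conv_matrix(x1, L), conv_matrix(x2, L)])  # N x 2L training matrix
    h_hat, *_ = np.linalg.lstsq(X, y, rcond=None)            # solves min ||y - X h||^2
    return h_hat[:L], h_hat[L:]


rng = np.random.default_rng(0)
N, L = 100, 9
x1 = rng.choice([-1.0, 1.0], size=N)  # random BPSK training of node 1
x2 = rng.choice([-1.0, 1.0], size=N)  # random BPSK training of node 2
a = (rng.standard_normal(L) + 1j * rng.standard_normal(L)) / np.sqrt(2)  # test taps
b = (rng.standard_normal(L) + 1j * rng.standard_normal(L)) / np.sqrt(2)  # test taps

# First N samples of the received training signal, without noise.
y = (np.convolve(x1, a) + np.convolve(x2, b))[:N]

a_hat, b_hat = ls_estimate(x1, x2, y, L)
print(np.max(np.abs(a_hat - a)), np.max(np.abs(b_hat - b)))
```

Here `np.linalg.lstsq` is used instead of forming $(\mathbf{X}^H\mathbf{X})^{-1}$ explicitly; both give the same LS solution when $\mathbf{X}$ has full column rank, and the former is usually better conditioned numerically.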
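The relationship between the estimate $\hat{\mathbf{h}}$ and the true channel $\mathbf{h}$ can be found by substituting (10) into (15), resulting in:
$$\hat{\mathbf{h}} = (\mathbf{X}^H\mathbf{X})^{-1}\mathbf{X}^H\mathbf{X}\mathbf{h} + (\mathbf{X}^H\mathbf{X})^{-1}\mathbf{X}^H\mathbf{w}, \tag{17}$$
where $(\mathbf{X}^H\mathbf{X})^{-1}\mathbf{X}^H\mathbf{X} = \mathbf{I}$, with $\mathbf{I}$ being an identity matrix of appropriate dimensions. Then, it is possible to write:
$$\hat{\mathbf{h}} = \mathbf{h} + \mathbf{X}^{\dagger}\mathbf{w}. \tag{18}$$
Defining the error vector as $\mathbf{e} = \hat{\mathbf{h}} - \mathbf{h}$, the covariance matrix of the estimation error is given by $\operatorname{cov}(\mathbf{e}) = E\left[(\hat{\mathbf{h}} - \mathbf{h})(\hat{\mathbf{h}} - \mathbf{h})^H\right]$. Substituting (18) into this equation gives:
$$\operatorname{cov}(\mathbf{e}) = E\left[\mathbf{X}^{\dagger}\mathbf{w}\mathbf{w}^H(\mathbf{X}^{\dagger})^H\right]. \tag{19}$$
The noise component is given by $\mathbf{w} = \alpha\mathbf{H}_{R1}\mathbf{w}_R + \mathbf{w}_1$ for node 1, where $\mathbf{H}_{R1}$ is the convolution matrix associated with channel $h_{R1}(n)$, and $\mathbf{w}_R$ and $\mathbf{w}_1$ are vectors containing the noise samples received at nodes R and 1, respectively. As the noises $\mathbf{w}_R$ and $\mathbf{w}_1$ are independent and zero-mean, the error covariance matrix becomes:
$$\operatorname{cov}(\mathbf{e}) = E\left[\mathbf{X}^{\dagger}\left(\alpha^2\mathbf{H}_{R1}\mathbf{w}_R\mathbf{w}_R^H\mathbf{H}_{R1}^H + \mathbf{w}_1\mathbf{w}_1^H\right)(\mathbf{X}^{\dagger})^H\right]. \tag{20}$$
As $\mathbf{X}$, $\mathbf{H}_{R1}$, $\mathbf{w}_R$ and $\mathbf{w}_1$ are all independent, (20) can be simplified to:
$$\operatorname{cov}(\mathbf{e}) = \sigma_w^2\,\mathbf{X}^{\dagger}\left(\alpha^2 E\left[\mathbf{H}_{R1}\mathbf{H}_{R1}^H\right] + \mathbf{I}\right)(\mathbf{X}^{\dagger})^H, \tag{21}$$
where:
$$E\left[\mathbf{H}_{R1}\mathbf{H}_{R1}^H\right] = \left(\sum_{l=0}^{N_{CH}-1}\sigma_{R1,l}^2\right)\mathbf{I}, \tag{22}$$
with $\sigma_{R1,l}^2$ being the variance of the $l$-th coefficient of the channel between nodes R and 1.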
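Rearranging the terms in (21), and noting that $\mathbf{X}^{\dagger}(\mathbf{X}^{\dagger})^H = (\mathbf{X}^H\mathbf{X})^{-1}$, the covariance matrix of the estimation error is given by:
$$\operatorname{cov}(\mathbf{e}) = \sigma_w^2\left(\alpha^2\sum_{l=0}^{N_{CH}-1}\sigma_{R1,l}^2 + 1\right)(\mathbf{X}^H\mathbf{X})^{-1}. \tag{23}$$
The MSE is given by $\mathrm{MSE} = \operatorname{tr}\{\operatorname{cov}(\mathbf{e})\}$, where $\operatorname{tr}$ denotes the trace operator, thus resulting in:
$$\mathrm{MSE} = \sigma_w^2\left(\alpha^2\sum_{l=0}^{N_{CH}-1}\sigma_{R1,l}^2 + 1\right)\operatorname{tr}\left\{(\mathbf{X}^H\mathbf{X})^{-1}\right\}. \tag{24}$$
Therefore, the MSE depends on the noise variance, the channel power profile, and $\operatorname{tr}\{(\mathbf{X}^H\mathbf{X})^{-1}\}$. It is thus possible to design a training sequence that minimizes this last term, which is the only term the designer can control, thereby minimizing the MSE.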
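The dependence of the MSE on $\operatorname{tr}\{(\mathbf{X}^H\mathbf{X})^{-1}\}$ can be verified with a small Monte Carlo sketch. It assumes that the equivalent noise is white with a single variance `sigma2_eq` (a hypothetical name that lumps together the noise variance and the channel power profile term), a fixed random BPSK training matrix, and a lower-triangular Toeplitz convolution matrix.

```python
import numpy as np


def conv_matrix(x: np.ndarray, L: int) -> np.ndarray:
    """N x L Toeplitz matrix with entries x[n - k], taken as zero for n < k (assumption)."""
    N = len(x)
    X = np.zeros((N, L), dtype=complex)
    for k in range(L):
        X[k:, k] = x[: N - k]
    return X


rng = np.random.default_rng(1)
N, L, trials = 100, 9, 2000
sigma2_eq = 0.1  # assumed variance of the equivalent white noise w(n)

# Fixed training matrix built from two random BPSK sequences.
X = np.hstack([conv_matrix(rng.choice([-1.0, 1.0], size=N), L) for _ in range(2)])
X_pinv = np.linalg.pinv(X)

# One complex white noise vector per trial (stored as columns).
W = np.sqrt(sigma2_eq / 2) * (
    rng.standard_normal((N, trials)) + 1j * rng.standard_normal((N, trials))
)

# The estimation error of the LS estimator is the pseudoinverse applied to the noise.
E = X_pinv @ W
mse_empirical = np.mean(np.sum(np.abs(E) ** 2, axis=0))
mse_theory = sigma2_eq * np.trace(np.linalg.inv(X.conj().T @ X)).real

print(f"empirical MSE: {mse_empirical:.5f}, predicted MSE: {mse_theory:.5f}")
```

The two values should agree up to Monte Carlo fluctuation, which shrinks as `trials` grows.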
Assuming a power constraint $P$ on the transmitted signals, i.e.:
$$\|\mathbf{X}\|_F^2 = \operatorname{tr}\{\mathbf{X}^H\mathbf{X}\} \le P, \tag{25}$$
where $\|\cdot\|_F^2$ denotes the squared Frobenius norm, the optimal training sequence can be obtained by solving the following problem:
$$\min_{\mathbf{X}} \operatorname{tr}\left\{(\mathbf{X}^H\mathbf{X})^{-1}\right\} \quad \text{subject to} \quad \|\mathbf{X}\|_F^2 \le P. \tag{26}$$
It is possible to show that the minimum is achieved when [20]:
$$\mathbf{X}^H\mathbf{X} = \frac{P}{2L}\mathbf{I}. \tag{27}$$
In other words, $\mathbf{X}^H\mathbf{X}$ must be a scaled identity matrix. This is achieved by any matrix $\mathbf{X}$ with orthogonal columns, each with squared norm $P/2L$. There might be additional constraints on the form of the training matrix $\mathbf{X}_i$ or on $P$. For instance, in OFDM modulated systems [7], the training matrix must be circulant, and it is also desirable to choose a training matrix that reduces the peak-to-average power ratio (PAPR) [8]. Since this work considers an SC modulated system operating in frequency-selective fading channels, the training sequence is restricted to the form shown in (8).
To satisfy these requirements, the following training sequence is proposed: recalling that i denotes the number of the user node and δ(n) denotes the Kronecker delta function.
In order to properly compare the performance of this training sequence against that of a sequence containing randomly generated BPSK symbols, it is necessary to adjust $P$ so that the power of both sequences is equal. This can be done by setting: and, by doing so, the factor that defines the amplitude of the symbols in the optimal training sequence becomes: Although these sequences minimize the estimator MSE, they do not significantly improve the MSE compared to that of a random training sequence. Consequently, the BER will not show a perceptible reduction, as will be shown in Section IV. This result agrees with others already presented in the literature on optimal training sequence design for OFDM modulated PNC systems.
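The effect of the scaled-identity condition on $\operatorname{tr}\{(\mathbf{X}^H\mathbf{X})^{-1}\}$ can be illustrated with the sketch below. It is only an illustration under stated assumptions: the impulse-like sequences used here (node 1 transmitting a single pulse at $n = 0$ and node 2 at $n = L$, both with amplitude $\sqrt{N}$) are a hypothetical choice that yields orthogonal columns with the same transmitted energy as random BPSK, and the convolution matrix is assumed to be lower-triangular Toeplitz.

```python
import numpy as np


def conv_matrix(x: np.ndarray, L: int) -> np.ndarray:
    """N x L Toeplitz matrix with entries x[n - k], taken as zero for n < k (assumption)."""
    N = len(x)
    X = np.zeros((N, L), dtype=complex)
    for k in range(L):
        X[k:, k] = x[: N - k]
    return X


def trace_inv_gram(X: np.ndarray) -> float:
    """tr{(X^H X)^{-1}}, the training-dependent factor of the MSE."""
    return float(np.trace(np.linalg.inv(X.conj().T @ X)).real)


rng = np.random.default_rng(2)
N, L = 100, 9

# Random BPSK training: each node transmits energy N.
X_rand = np.hstack([conv_matrix(rng.choice([-1.0, 1.0], size=N), L) for _ in range(2)])

# Hypothetical impulse-like training with the same energy N per node.
x1_imp = np.zeros(N)
x1_imp[0] = np.sqrt(N)   # node 1 pulse at n = 0
x2_imp = np.zeros(N)
x2_imp[L] = np.sqrt(N)   # node 2 pulse delayed by L samples
X_imp = np.hstack([conv_matrix(x1_imp, L), conv_matrix(x2_imp, L)])

t_rand = trace_inv_gram(X_rand)
t_imp = trace_inv_gram(X_imp)
print(f"random BPSK: {t_rand:.4f}, impulse-like: {t_imp:.4f}")
```

With these assumptions the impulse-like matrix satisfies $\mathbf{X}^H\mathbf{X} = N\mathbf{I}$, so its trace term equals $2L/N$, while the truncated columns of the random BPSK matrix give a smaller $\operatorname{tr}\{\mathbf{X}^H\mathbf{X}\}$ and therefore a larger trace of the inverse.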

IV. SIMULATION RESULTS AND DISCUSSION
To evaluate the performance of the proposed LS channel estimator, an ANC system is simulated using BPSK modulation.
The gain $\alpha$ is set so that the power of the signal received at the relay is equal to the power received at nodes 1 and 2, so that the signal-to-noise ratio (SNR) is the same on all channels. To do that, $P_R$ is set to $P_R = P_1 + P_2$, where $P_R$ is the power of the signal transmitted by the relay, and $P_1$ and $P_2$ are the powers of the signals transmitted by nodes 1 and 2, respectively. Thus, the gain $\alpha$ can be set to: where with $\sigma_{iR,l}^2$ being the variance of the $l$-th coefficient of the channel between nodes $i$ and R. Then, the SNR can be defined as $\mathrm{SNR} = P_R/\sigma_w^2$, where $\sigma_w^2$ is the noise power. In the simulations, whose results are presented in the sequel, the channels have length $N_{CH} = 5$ and their coefficients are randomly generated in each realization. The coefficients are independent and identically distributed zero-mean, unit-variance complex Gaussian random variables. In other words, the channel magnitudes follow a Rayleigh distribution and the channel phases are uniformly distributed between 0 and $2\pi$. The cascaded channels $a$ and $b$ have length $L = 9$.
The MSE between the original channels and the estimated channels $\hat{a}$ and $\hat{b}$, and the bit error rate (BER) when the system uses the estimated CIR and when it has perfect knowledge of the CIR, are computed through Monte Carlo simulations for different scenarios. Each realization simulates the transmission of $10^3$ information bits. For each SNR, a total of $10^4$ realizations are performed for averaging. The training sequence, which consists of $N$ random BPSK symbols, is prepended to the block of information bits. The estimate $\hat{a}$ is used in the self-information extraction shown in (6). Then, the estimate $\hat{b}$ is used to perform the equalization through an MLSE equalizer. Both estimates are obtained from (15). The estimation MSE is computed in each realization by: and, at the end of the simulation, it is averaged over the number of trials. A similar expression can be used to compute the MSE for channel $b$.
In the first scenario, the MSE and BER performances of the proposed LS estimator are evaluated for different lengths of the training sequence. Figure 5 shows the MSE performance. As the MSEs of both cascaded channels are equal, only the MSE for channel $\hat{a}$ is shown. As expected, the estimator performance degrades as the length of the training sequence decreases. For an SNR of 10 dB, a training sequence of length $N = 50$ yields an MSE of 0.378, while a length of $N = 100$ training symbols yields 0.132. For $N = 500$, the MSE is 0.022, and for $N = 1000$ the system achieves an MSE of 0.011.
Figure 6 shows the BER for the same training sequence lengths simulated in Figure 5. As observed in Figures 5 and 6, although the MSE is better for a longer training sequence, a longer sequence does not always improve the BER significantly. For instance, for $N = 500$, $N = 1000$ and perfect CIR, the BERs are practically the same, and the differences can be attributed to numerical simulation errors. For $N = 50$, $N = 100$ and $N = 500$, it is possible to see an improvement in the BER. However, using $N = 500$ instead of $N = 100$ saves less than 1 dB in SNR to achieve almost the same BER. For an MSE of $10^{-2}$, there is a difference of approximately 12 dB between the PNC system that uses $N = 1000$ (the block length) and the PNC system that uses $N = 100$, i.e., 10% of the block length. However, this difference does not significantly impact the BER. Therefore, the proposed estimator can be deployed without the need for a high number of training symbols. Furthermore, it is possible to see in Figure 6 that an error floor appears near the 20 dB SNR region even with perfect knowledge of the CIR. This floor is due to the traceback depth of the Viterbi algorithm used to implement the MLSE equalizer, which was set to $5L$. As the length of the block of transmitted symbols is $10^3$, the algorithm could not remove all the intersymbol interference (ISI). In order to reduce this BER floor, the length of the block of transmitted symbols can be increased.

The second scenario evaluates the impact of an erroneous channel length $\hat{L}$ at the estimator, i.e., when it assumes a channel length smaller (or greater) than the actual channel length $L$. For this simulation, a training sequence of length $N = 100$ was used. Figure 7 shows that the BER does not change significantly when $\hat{L} > L$, since the extra coefficients given by the estimator are nearly zero, but degrades considerably when $\hat{L} < L$.
For $\hat{L} = 5$, there is almost no reduction in BER as the SNR increases. This happens because the ISI generated by the frequency-selective wireless channel is not totally mitigated. First, the self-information removal given by (6) is not performed correctly, because the estimate $\hat{a}$ does not include all the channel coefficients. Second, as the equalizer uses the estimate $\hat{b}$, which was obtained considering fewer coefficients than the actual cascaded channel has, its output will have residual ISI, which degrades the overall performance of the system.
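The asymmetry between overestimating and underestimating the channel length can be reproduced with a noise-free sketch. The setup is an assumption made for illustration: random BPSK training, arbitrary complex taps for cascaded channels of true length $L = 9$, a lower-triangular Toeplitz convolution matrix, and hypothetical helper names `conv_matrix` and `ls_fit`.

```python
import numpy as np


def conv_matrix(x: np.ndarray, L: int) -> np.ndarray:
    """N x L Toeplitz matrix with entries x[n - k], taken as zero for n < k (assumption)."""
    N = len(x)
    X = np.zeros((N, L), dtype=complex)
    for k in range(L):
        X[k:, k] = x[: N - k]
    return X


def ls_fit(x1, x2, y, L_hat):
    """LS fit with an assumed channel length L_hat; returns the estimate and residual norm."""
    X = np.hstack([conv_matrix(x1, L_hat), conv_matrix(x2, L_hat)])
    h_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    return h_hat, float(np.linalg.norm(y - X @ h_hat))


rng = np.random.default_rng(3)
N, L = 100, 9
x1 = rng.choice([-1.0, 1.0], size=N)
x2 = rng.choice([-1.0, 1.0], size=N)
a = (rng.standard_normal(L) + 1j * rng.standard_normal(L)) / np.sqrt(2)
b = (rng.standard_normal(L) + 1j * rng.standard_normal(L)) / np.sqrt(2)
y = (np.convolve(x1, a) + np.convolve(x2, b))[:N]  # noise-free training block

extra = 3
h_over, res_over = ls_fit(x1, x2, y, L + extra)  # assumed length larger than L
h_under, res_under = ls_fit(x1, x2, y, 5)        # assumed length smaller than L

# Taps beyond the true length, for both cascaded channels.
extra_taps = np.concatenate([h_over[L : L + extra], h_over[2 * L + extra :]])
print("largest extra tap magnitude:", np.max(np.abs(extra_taps)))
print("residual (overestimated length):", res_over)
print("residual (underestimated length):", res_under)
```

In the overestimated case the extra taps come out numerically zero and the fit is exact, whereas in the underestimated case part of the received signal cannot be explained by the model, which is the source of the residual ISI discussed above.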
In the third scenario, the proposed estimation technique is compared with the LS estimator proposed in [7] for OFDM-based PNC. Although the system in [7] is an ANC system based on OFDM modulation, its channel estimator estimates the CIR after removing the Cyclic Prefix (CP) and before computing the Discrete Fourier Transform, i.e., it works in the time domain, which makes the MSE comparison fair. The OFDM system uses 64 carriers and a CP length of 16. In this scenario, the proposed technique uses $N = 64$ training symbols to match the length of the sequence used for the OFDM system, since the latter uses one OFDM block as the training sequence, i.e., 64 training symbols. It can be seen from Figure 8 that the performances of both estimators are equivalent, and minor differences are due to numerical errors in the simulation. Although these estimators are based on the same criterion, they are distinct estimators, since the construction of the matrix given by (7) is different. This is because the estimator proposed here deals with SC modulation, rather than OFDM as in [7].

Fig. 8. MSE comparison between the LS estimator for OFDM system and the LS estimator for SC system.
Finally, the MSE obtained by using the optimal training sequence derived in this paper is compared in Figure 9 to the MSE obtained by using a random training sequence. This figure shows that the optimal training sequence provides a better MSE. However, it can be seen that for $N = 1000$ the impact is smaller than for $N = 100$. For an MSE of $10^{-2}$, a random sequence of length $N = 100$ needs an SNR of around 21 dB, while the optimal one needs around 19.7 dB, which represents a 1.3 dB difference. However, for a training sequence length of $N = 1000$, changing from the random sequence to the optimal sequence reduces the required SNR from around 10.3 dB to near 9.7 dB, a reduction of less than 1 dB.
The better performance of the optimal training sequence happens because the squared Frobenius norm $\|\mathbf{X}\|_F^2 = \operatorname{tr}\{\mathbf{X}^H\mathbf{X}\}$ is smaller for the random sequence than for the optimal sequence, although they have the same power. This is due to the structure of the convolution matrix in (8). As the Frobenius norm is smaller for the random sequence, $\operatorname{tr}\{(\mathbf{X}^H\mathbf{X})^{-1}\}$ will be larger and, therefore, the MSE given by (24) will be larger. For longer training sequences, this difference is smaller and, therefore, the impact of the optimal sequence in reducing the MSE is also smaller.

V. CONCLUSIONS
In this paper, a Least Squares channel estimation technique is proposed for PNC communication systems that use single carrier modulation and operate under frequency-selective fading channels.
It is shown that a longer training sequence yields better performance in terms of MSE. However, a highly accurate estimate does not necessarily imply a better BER and, therefore, shorter training sequences can be used. It can be seen from the results shown in Section IV that, with a training sequence of length $N = 100$, the proposed estimator performs, in terms of BER, nearly as well as a system with perfect knowledge of the CIR.
Furthermore, simulations considering channel lengths at the estimator that differ from the real one show that using a longer channel length at the estimator does not bring any penalty to the system performance when compared to the real length, keeping an equivalent BER. On the other hand, when a channel length smaller than the real one is assumed, the BER performance degrades considerably. Hence, it is important to develop an accurate estimator for the channel length at the end nodes.
The proposed estimator was also compared to the estimator presented in [7]. Although the latter considers an OFDM system and has a different mathematical construction, it works in the time domain, so the MSE comparison is fair. It is shown that both estimators have equivalent MSE performance.
Moreover, the design of an optimal training sequence, in terms of the channel estimator MSE, is presented. It is shown that it provides better performance than a random training sequence. However, for larger training sequence lengths, the improvement is smaller than for shorter sequences.

Fig. 4. Scheme for the signal model of a PNC system.

Fig. 9. MSE comparison between the random and the optimal training sequences for different lengths.

Pedro Ivo da Cruz is currently pursuing his PhD in Information Engineering at Federal University of ABC (UFABC). He received the MSc and the BSc degrees in Information Engineering in 2017 and 2014, respectively, and the BSc in Science and Technology in 2013, from the same university. His research interests include wireless communications, adaptive and statistical signal processing, Wireless Physical-layer Network Coding systems and Wireless Physical-layer Security.

Murilo Bellezoni Loiola received the titles of Electrical Engineer (2002), Master in Electrical Engineering (2005) and Doctor in Electrical Engineering (2009) from the University of Campinas (UNICAMP), Brazil. Currently, he is an Associate Professor at Federal University of ABC (UFABC). His main research interests lie in the areas of adaptive and statistical signal processing, wireless communications, wireless physical-layer security, and machine learning.