On Geostatistical Methods for Radio Environment Maps Generation under Location Uncertainty

Radio environment map (REM) can provide important information for designing and optimizing the performance of wireless communication networks. However, the location uncertainty related to the measurements used to build the REM can considerably deteriorate the accuracy of such map. This paper addresses this problem by proposing a modified approach of a classical geostatistical prediction tool, named Kriging method, which incorporates the location uncertainty and is able to improve the REM accuracy without adding significant complexity. Finally, we also show through simulation results that the average path loss and covariance parameter estimation play an important role and should be considered when the location errors occur in the wireless communication systems.


I. INTRODUCTION
R ECENTLY, there has been growing interest in the gener- ation and use of Radio environment map (REM) in wireless communications systems [1][2][3][4][5][6][7].Part of this is due to the attempt to make better use of geolocation information such as efficient use of spectral resources in wireless communications systems [8][9].Initially, the REM concept, proposed in [10][11], was defined as a database with geolocalized information applied to the dynamic spectrum access used in cognitive radio systems.Ever since, the focus on REM has expanded because of its potential applicability in telecommunications.In fact, REM can be used to extract relevant information about the behavior of the wireless channel as well as for making decisions related to different applications in wireless communications, such as: planning and optimization coverage tasks [3], [6][7][8][9]; minimization of drive tests (MDT) in mobile networks [37][38][39][40]; analysis and decision-making in cognitive radio systems involving: resource allocation [28], detection of coverage opportunities [5] and interference analysis in heterogeneous networks [29][30]; essential public safety services related to mobile networks such as 911 calls [8] and other specific applications, related to high speed trains [31][32], radar systems [33] and location-estimation-problems [34][35][36].
It is of fundamental importance to realize that the way each application relates to REM depends on how the location information is used in the system model and what kind of quantity constitutes the REM concept.In this work, we follow the approach considered in [3][4][5], in which REM consists Ricardo Augusto and Cristiano Panazio are with Laboratory of Signals and Communications of the Department of Telecommunications and Control Engineering of the Escola Politécnica of the University of São Paulo, SP, Brazil.E-mail contacts: {rasilva,cpanazio}@lcs.poli.usp.brDigital Object Identifier: 10.14209/jcis.2018.8 of a map that indicates a physical quantity of the radio environment, the most common being the signal intensity, i.e., radio signal power.Specifically, from a limited set of measurements collected in the radio environment, we will use spatial predictions for the REM generation.The applicability of spatial predictions to wireless communications systems is directly related to the spatial statistical characterization of the radio environment, which is complex because of the several propagation mechanisms of the wireless channel [21][22][23][24][25][26][27].In fact, these mechanisms will influence the received signals and, consequently, may affect the REM generation process.Thus, the processing of geolocalized measurements for reliable generation of coverage maps presents itself as a challenge.Moreover, modern wireless communication systems require accurate and robust spatial predictions against adversities.On the latter, a problem that has recently been investigated consists in the study of the impacts of the location uncertainty on the performance of the spatial predictions [41][42][43][44].
In a wireless communication system, the location information depends mainly on the measurements sent by the network devices as well as the network-based location estimation methods, which gives an error between 50 and 300 meters, and the global positioning system (GPS), with an error between 5 and 30 meters [8], [41].Despite the evolution of positioning techniques, due to channel adversities leading to imperfect location estimation as well as inaccuracy of measurements performed by network devices, it is reasonable to assume that the location information involved in the wireless communication system is not entirely accurate.Besides that, with the growth in the use of geolocation services, especially on the high accuracy required in public emergency services, we can mention that there is a fundamental need to incorporate the location uncertainty into the analysis of spatial predictions tools applied to the wireless communications systems.
In this context, we are looking for low-complexity spatial predictors (e.g., linear predictors), that are also robust to the channel adversities.On this aspect, a trade-off arises through two different approaches applied to the wireless spatial predictions: Bayesian and classical.In the Bayesian framework, a priori statistical distributions can be incorporated into the model so that the predictive posteriori distribution can be obtained, considering the uncertainties of the model [12][13], [16][17].This is a promising and interesting feature, since it allows to incorporate different types of uncertainties into spatial predictions such as location uncertainty.However, complex integration methods are probably involved in the search for optimal predictors in some statistical sense (e.g., the conditional mean of the posterior distribution) and, conse-quently, two important points emerge as obstacles to practical prediction Bayesian methods [17]: First, the requirement of a priori knowledge about the system model distributions as well as the analytical complexity of the parameters related to the conditional distributions necessary for the formulation of the Bayesian approach.Second, the high computational cost in the evaluation of the integrals to convert a specified model and prior to a posterior or Bayesian predictive distributions.A discussion about treatability is made in [42], in which a unified view of the Gaussian processes and the Bayesian approach is applied to the prediction problem with location uncertainty.
In a alternative way, the classical approach is characterized by the conventional estimation methods with semivariogrambased geostatistical framework used by Kriging spatial predictors, which allows us to find a flexible and low complexity solution for the REM generation.Indeed, when the model is characterized by Gaussian processes, the optimal conditional mean predictor of Bayes posterior distribution converges to the classical Kriging predictor.However, as we shall see later, the location uncertainty affects the Gaussianity of the model.In this sense, although the Bayesian approach can lead to nonlinear predictors that may be more robust, our attention is focused on two specific points: the use of low complexity linear predictors considering the treatment of the location uncertainty; and dealing with the lack of knowledge about the covariance parameters related to the radio environment.
The main contributions of our work can be summarized as follows: We extend our work in [76] by applying a geostatistical model for REM generation and discussing the main peculiarities and limitations of this approach; We incorporate the location uncertainty into the radio environment model and use the methodology in [69][70] to manipulate intractable quantities that arise when location errors are considered in the spatial prediction problem.This allowed us to make adjustments in the conventional Kriging technique to take into account the statistics of the location errors in the spatial predictions, i.e., Kriging adjusted for location error (KALE).Moreover, we show that the covariance parameter estimation subject to location errors plays an important role and significantly affects the performance of KALE predictor.In this sense, we propose a new method that modifies the estimation of the semivariogram, (i.e., semivariogram adjusted for location error (SALE) in order to improve the performance of the predictor KALE when there is lack of knowledge about the covariance parameters of the radio environment.
This paper is structured as follows.Sections II and III presents the system model of the radio environment and the geostatistical framework for REM generation.Section IV presents the Kriging spatial predictor.Section V incorporates the location uncertainty into the system model and propose a method for dealing with uncertainties.Numerical results and discussion are given in Section VI.Finally, Section VII points out the conclusions and proposals for future research.

II. RADIO ENVIRONMENT MODEL
The radio environment model is based on a wireless communication system composed of mobile devices and a radio base station (RBS).It is assumed that such devices are capable of performing the measurements of the received signal level and send them to the RBS, which coordinates the entire operation of the wireless communication system network.This means that the RBS is responsible for the processing of the measurements collected and for the REM generation.Specifically, the propagation mechanisms of the radio environment, such as diffraction, reflection, refraction as well as average path loss, will influence the intensity of the signals received in the devices and consequently can affect the process and accuracy related to the REM generation.
In this work, we considered that the radio environment is interpreted as a spatial random process P, which is composed by two main components and is defined in all spatial locations s = (x, y), according to where µ(s) consists of the spatial process drift, which is called trend in geostatistical terminology, ξ(s) represents the zero-mean spatial random fluctuations of the model and D represents a coverage area of interest that compose the spatial domain in two dimensions.In this model, the trend component is related to the wireless channel path loss while the spatial random fluctuations of the model depict shadowing effects of the wireless channel 1 .In fact, the spatial random process P(s) consists of a collection of random variables P(s i ) which are reception power measurements whose statistical properties depend on µ(s) and ξ(s).Therefore, the main challenge on spatial predictions is to capture the effects of the components that form the model in (1) from measurements in the radio environment.First, the following subsections address the component related to the spatial trend.Then, the concepts about spatial random fluctuations are presented.

A. Trend Modeling
Trend modeling is characterized by the mean component µ(s) and is based on average path loss of the wireless channel.The mean component consists of the average reception power measurements and can be described by a set of base functions with corresponding coefficients, according to where P tx is the RBS transmission power (dBm), f k (s) are the base functions of the model and α k are the unknown constant trend coefficients.Propagation tools for planning wireless systems and models for path loss have been extensively investigated in the literature and many papers discuss different models and methods on this subject (e.g., [45][46][47][48]).In this work, the model considered for average path loss is the wellknown log-distance model, denoted as where PL (d 0 ) is the attenuation (dB) at a reference distance d 0 (meters), also known as critical distance, PL (d) is the attenuation (dB) suffered by the signals transmitted due to average the path loss at a given distance d (meters) and α which consists of the propagation coefficient of the path loss model [49].This model was chosen due to its widespread use and since it allows to formulate the trending estimation as a linear estimation problem.It is important to note that the logdistance path loss model depends on the distance d between the RBS, with fixed location s tx = (x tx , y tx ), and the devices in network.By making some simplifications without loss of generality, (with PL (d 0 ) = 0dB and d 0 = 1m) and algebraic manipulations, it becomes possible to transporting this model to two dimensions as follows: where d(s tx , s) is the distance function with two dimensional coordinates.The model in (4) suggests how the average path loss of the radio environment fluctuates as a function of distance.In this sense, the value of the α parameter implies different decays of the logarithmic function of the path loss.
Table I shows the association of some known scenarios and the respective values for the propagation coefficient [50].By using the base functions of the model in (2) and considering one-parameter α path loss model (p = 1), it is possible to apply the average path loss in the trend model in (2) and express the average power reception, according to At this point, it is possible to highlight two information that will be useful: i) An inspection of the log-distance model allows to observe that this model is linear in the α parameter and ii) Although it is linear in α, the model is not linear on the spatial coordinates s.These characteristics will influence the parameter estimation and spatial predictions processes internal to the geostatistical methods.

B. Spatial Random Process Modeling
The model in (1) indicates that the spatial random fluctuations of ξ(s) occur around the average reception power.In a wireless communication system, this is due to the large buildings, mountains and large-scale obstacles that adorn the receivers in the network.In fact, a signal transmitted by a wireless propagation channel undergoes random variations due to obstructions along the path of the receiver.Since the location, size and dielectric properties of these obstruction objects are unknown, statistical models for describing such fluctuations become fundamentally necessary in this analysis.One of the most commonly used models in the literature is the log-normal shadowing [50][51][52].This model is characterized by a Gaussian random process in the dB scale and has been confirmed empirically to accurately model the variation in received power in both outdoor and indoor radio propagation environments [51][52].Thus, the log-normal shadowing is used to characterize the random fluctuations ξ(s) and is incorporated into the system model by adding a Gaussian spatial random process to the average path loss trend (4), according to where ξ(s) is a zero-mean Gaussian spatial random process with standard deviation σ, representing the shadowing in the spatial position s.Also, ξ(s) is characterized by structured parametric spatial covariance C(s i , s j ).In other words, the spatial dependence relationship between the collections of random variables that form the spatial random process P arises in the form of a structured covariance C(s i , s j ).On the geostatistical perspective, these are important presuppositions of our work and favor the choice of the Kriging predictor.
In fact, the central idea of geostatistics is to explore this relationship of spatial dependence to generate spatial predictions at unknown locations.Thus, the spatial predictors must be followed preceded by steps that have the objective of extracting the covariance parameters of ξ(s).On this aspect, two spatial functions are of fundamental importance, namely covariance and semivariogram.Formally, the covariance between two random variables in the coordinate pairs C(s i , s j ), ∀s i , s j ∈ D, is defined as where µ(s i ) is the mean of the random variable P(s i ) and E{•} is the expectation operator.The semivariogram function is defined as the variance of the differences of the random variables that compose P(s), Two properties are of fundamental importance for these functions, especially when it is assumed that the spatial random process are wide sense stationary: First, the expected value of the spatial random process exists and is constant, i.e., E{P(s)} = µ(s) = µ.Note that in the case of the radio environment, the average reception power depends on the spatial position related to the RBS.Consequently, the presence of the trend affects the stationarity and, therefore, the trend must be estimated.Second, the covariance and semivariance for any pair of points in the spatial random process is finite and its value depends only on the separation vector h, i.e., C(s i , s i + h) = C(h), where the vector h = (h x , h y ) consists of a spatial separation distance vector, h ≡ s i − s j , [12], [69].Thus, in these circumstances the structured covariance does not depend specifically on the spatial positions analyzed, but on the separation h of such coordinates.In this way, we can rewrite the theoretical expressions of covariance and semivariogram considering the invariance to the translations If the direction of the vector h does not influence the expected values in (9), the covariance and semivariogram will only depend on the distance, i.e, |h| = |s i − s j |.This isotropic property is useful in the application of the covariance and semivariogram functions in the characterization of correlation model of the log-normal shadowing of the wireless channels.In fact, one of the most used models for shadowing correlation characterization is the Gudmundson model [51][52][53][54][55]. Specifically, this model belongs to the class of correlation models that take into account the separation distance |h| between measurement collected in a radio environment.Exhaustive tests and measurement campaigns indicate an exponential behavior for the decay of the spatial correlation as a function of distance of the log-normal shadowing of the wireless communication channels [51][52][53][54][55][56].Therefore, in this paper, an option was made for isotropic-exponential structured covariance for the spatial random process generation, given by where m represents the variance of the spatial random process and is called sill variance while the parameter r is known as range and reflects the exponential decay of the covariance function.Under wide sense stationarity assumption there is a direct relationship between the covariance function and the semivariogram function [12], C(s i , s j ) = C(0) − γ(s i , s j ).Thus, the isotropic-exponential semivariogram model is given by Obtaining reliable estimates of reception power P(s) at unknown spatial positions will depend on the ability of the spatial predictors to capture the spatial variations of the radio environment.Thus, the semivariogram and covariance functions are fundamental for the spatial prediction problem of the REM generation process.

III. RADIO ENVIRONMENT SPATIAL PREDICTION MODEL
In this section, we present the spatial prediction method based on geostatistical approach applied to the radio environment modeling.Fig. 1 shows the diagram of the proposed model.First, the trend modeling is performed to generate the average reception power measurements as a function of the path loss model in (4).The spatially correlated shadowing generation characterizes the spatial random process modeling.Different techniques can be used to generate log-normal shadowing, and it is possible to highlight the techniques based on Cholesky decomposition, autoregressive filtering, sum of sinusoids and Fourier transformations of the shadowing correlation models (with higher computational cost) [22], [57][58][59][60][61][62].In this work, the Cholesky decomposition is chosen since it is directly based on the covariance matrix of the spatial random process and has a direct relation with the semivariogram.The combination of the trend with spatial random process forms the large-scale component of the wireless communication channel.
In a wireless communication system, only an observable part of the radio environment can be captured.Specifically, a spatial sampling is performed and N network devices send the reception power measurements P rx to the RBS composing a set of N power measurements.The spatial collection of measurements from the network devices characterizes the spatial sampling process applied to the radio environment.For simplicity, it is assumed that P tx = 0dBm.Thus, based on the system model ( 6), we can write where P = [P rx (s 1 ) ... P rx (s N )] T are the set of the power measurements, X = [−10 log d(s tx , s 1 ) ... − 10 log d(s tx , s N )] T are the vector of base functions, α is the path loss exponent and ξ = [ξ(s 1 ).... ξ(s N )] is the zero-mean spatial correlated Gaussian random vector.Initially, we emphasize an important assumption: In this section, the vector model ( 12) is locationerror-free as location uncertainty will be incorporated later on.This means that the measurements of the radio environment are taken exactly in the spatial coordinates (s) of the twodimensional space and will be processed by geostatistical methods to generate spatial predictions.
The geostatistical approach used in the work is characterized by three estimation stages: i) trend estimation; ii) experimental semivariogram (covariogram) estimation and iii) estimation of covariance parameters.These stages form the well-known learning phase of the channel parameters and precede the Fig. 1: Diagram with the model for generating spatial predictions based on the geostatistical approach.spatial prediction phase of the radio environment.In other words, the estimates of spatial predictions depend directly on the information about wireless channel parameters.Thus, we are particularly interested in proposing techniques that provide support for spatial predictors when there is location uncertainty in wireless system as well as the lack of knowledge about wireless channel parameters.
The first process to be performed is the trend estimation and it is important for two reasons: i) the presence of a trend in the radio environment affects the stationarity of the spatial random process influencing the results of the spatial predictors and ii) the experimental estimates of the semivariogram or the covariogram, if performed directly on the measurements, can lead to biased results [12].In this sense, the objective of the trend estimation process is to obtain the regionalized variable (RV) [12], [15], [18], which only characterizes the spatial random variations of the wireless channel shadowing.The logdistance model allows us to formulate the estimation problem (α estimation) linearly allowing the use of the minimum variance unbiased estimator through methods such as least squares (LS) techniques.Thus, after trend estimation of the radio environment (performed with least squares fitting), RV is obtained by removing the estimation results from the trend of the measurements collected, i.e., Z = P − μ characterizing the detrending process.
The next stage of the learning phase is the experimental semivariogram estimation process that is obtained from the RV.The experimental semivariogram (and covariogram) consists of the empirical estimate of the theoretical semivariogram (and covariance) function required for the spatial predictions.The method of moments of Matheron [12][13][14][15], [23] is used to obtain the experimental semivariogram from the regionalized measurements and can be described by where N h represents the number of pairs of points distant of h obtained from RV.Hence, the semivariogram estimates are obtained through the mean of the semivariance values of all pairs of points that are distanced themselves from h.In practice, for the calculated experimental semivariance to have a statistical meaning, it is necessary that N h should be sufficiently large, i.e., a reasonable number of measurements must be taken for each distance h.However, the RV is composed of a finite set consisting of N measurements.Facing this practical situation, the semivariogram estimation is based on pre-calculated separation distances, denoted as lag distances.
This work makes the semivariogram estimation through the methodology of [76] which is based on the quantization of the separation distances and Matheron method (details can be seen in [12], [14][15][16][17][18]).Fig. 2 shows a conceptual example of the parametric-analytical semivariogram model (solid curve) and the results of experimental semivariogram estimation (points).The analysis of the semivariogram estimation aims to provide a description of the spatial variability of the spatial random process.It is important to note that the analytical models and experimental parts have different definitions.The experimental semivariogram (covariogram) has no parameters while the semivariogram (covariance) analytical model is parametric.Also, there is an interest in the estimation of three parameters: the upper bound parameter m, called sill, the r parameter, called range and the nugget c 0 parameter, which is related to the nugget effect [66].These parameters are grouped in the form of a vector θ = [m, r, c 0 ].
In the wireless communications context, the parameter m consists of the variance of the spatial random process generated for wireless shadowing characterization and, therefore, can be interpreted as a measure of the shadowing severity.The parameter r is related to the slope of the semivariogram (decay of the covariance model) and allows to measure the separation distances from which there is a low spatial correlation between the captured measurements.Then, it can be interpreted as a measure of the de-correlation distance d corr of the wireless shadowing.Specifically, the distances between measurements that are smaller than r present a higher degree of spatial correlation, resulting in spatial dependence zones.Otherwise, measurements of the radio environment that distances beyond the range are weakly correlated characterizing the zones of spatial independence.The parameter c 0 is related to the nugget effect, which implies an abrupt change in the spatial variability of the random process at small separation distances (less than the smaller spatial sampling intervals considered) and may result from measurements inaccuracies, short-range variations, or superimposed noise [23][24].
The covariance parameter estimation can be performed using different techniques, e.g., Maximum Likelihood (ML) and Least Squares (LS) [12], [15][16][17][18][19][20].The Gaussian nature of the wireless channel shadowing makes the ML method better suited for the estimation problems.However, as we will see later, the location uncertainty affects the Gaussianity of the model [41], [69], [72].Also, although the ML method has superior performance, the estimation can be computationally challenging for large data sets, as each likelihood evaluation requires a Cholesky factorization of covariance matrix or equivalent operations, which is O(N 3 ) [72].On the other hand, the least squares estimators uses the separation distances h and the experimental semivariogram to find the values of θ that lead to the minimization a cost function by using search algo-rithms (e.g., Trust Region and Levenberg-Marquardt [67][68]).In fact, the LS estimation presents a practical geostatisticalbased solution in which no probabilistic assumptions are made about the measurements and, for this reason, (associated with ML convergence problems) it is used in this work.Particularly on this aspect, the spatial sampling pattern is conditioned to the type of application analyzed (e.g., drivetests campaign or measurement report systems).In a planning for the execution of drive-tests, there is more control over the locations where the measurements will be captured through routes that are predefined and usually accompany the roads and streets of the cities.The spatial sampling with routes is common in works where some practical drive-test activity was performed as in [41].Otherwise, an operation scenario where the measurements are sent from users devices, there are no predefined routes but a spatial pattern movement of the devices in the network, which will determine the type of the spatial distribution.In fact, the use of uniform distribution in REM occurs in applications related to distributed sensors networks [43], cognitive radio networks [29][30] and also in wireless communications systems [5][6].In this context, the uniform spatial sampling hypothesis is plausible in scenarios where there is no a priori control over measurement locations and, therefore, is used in this work.Even so, it is important to note that the minimum control in the execution of the sampling strategy is enough to find alternatives to choose the most appropriate strategies (e.g., [80][81][82]).Moreover, for analysis simplification, we also consider the use of ominidirectional antennas as in [41][42].The incorporation of directional antenna modeling is possible to make the model more practical, but it increases the simulation complexity.Actually, it is important to mention that the directional antenna transmission impacts the radio parameter estimation and can consequently affect the performance of spatial predictions [79][80].The trend estimation is presented in Fig. 3(d), which shows a comparison between the theoretical average path loss model and the result of trend estimation along with average path loss measurements contaminated with shadowing of the wireless channel.The experimental semivariogram and covariogram are presented in Figs.3(e) and 3(f).The amount of combinations of pairs of points associated with each lag distance is also shown (boxes).It is possible to note that the semivariogram and the covariogram reveal important information about the parameters used in the generation process of the radio environment.First, an upper bound is revealed by the estimates of semivariance in Fig. 3(e).It is verified that such an upper bound occurs from distances of the order of the range.This result can also be seen in the behavior of the covariogram, where a more pronounced covariance drop occurs around 20 meters of separation.The solid lines in Figs.3(e

IV. SPATIAL PREDICTIONS -KRIGING METHOD
The formulation of the spatial prediction problem in wireless communications is to obtain a reliable value of received power in a coverage area from a limited set of measurements sent by the network devices.A spatial predictor is intended to solve this problem and Kriging is a best linear unbiased predictor (BLUP) in the sense that it minimizes the variance of the prediction error [12], [14][15][16][17][18][19][20].
Depending on the characteristics of the problem, different Kriging methods are better suited to perform the spatial prediction.In circumstances where the specific value of the mean µ is known a priori, the simple Kriging (SK) method is used.In the case where µ is not necessarily known, but constant, the ordinary Kriging (OK) method is used, since it requires only the constant behavior of µ along the D domain.In practice, the mean of the spatial random process can vary along the space (as in the radio environment), compromising the stationarity assumption.In this situation, the universal Kriging (UK) method can be used if the trend can be decomposed into a set of base functions represented by low-degree monomials [12].In other words, if the trend is linear at coordinates or with powers that normally do not exceed the power of two, the UK can be used.Despite this, the average path loss trend in the radio environment is modeled with non-linear functions in the coordinates (such as square root and logarithms) with powers that are usually greater than two indicating that the use of the UK in this case requires complex transformations for the decomposition of base functions to be achieved.Then, in order to overcome this problem, the OK predictor combined with detrending process is the chosen tool to perform spatial predictions in this work, where is the scalar result of spatial prediction in a given unknown spatial coordinate s 0 , obtained through the linear combination with the Kriging weights column vector λ, with size N × 1, and column vector of measurements Z, with size N × 1.
It is important to mention that the unknown locality s 0 is taken into account indirectly in the OK predictor (14) through the calculation of the Kriging weights λ i .Specifically, the values λ i are calculated with the objective of minimizing the variance of the prediction error, i.e., Var{Z OK (s 0 ) − Z(s 0 )}, based on the condition that the predictor in ( 14) is unbiased.In this sense, the optimal Kriging weights consists in the search for the solution of a system of equations, given by where Γ is the semivariance matrix between the measurements that is based on covariance parameters and β is the semivariance vector between the RV measurements and the target coordinate of spatial prediction s 0 .The details of the Lagrange mathematical procedure for obtaining the Kriging system of equations as well as the optimal weights can be found in [12], [14][15][16][17][18][19][20], [76][77][78].It is worth noting that the matrix inversion may lead to some numerical problems, especially in application where the amount of data is significantly high [78], due to ill-conditioning.One way to overcome this if it may happen is to use a fixed-rank Kriging method [77], which decomposes the matrix into low order matrices.The spatial prediction Z OK (s 0 ) generated by OK predictor is a result of the linear combination of the Kriging weights with the trend-removed vector of measurements Z.Thus, the sum of Z OK (s 0 ) with the the average received power estimation in s 0 , denoted as μ(s 0 ), allows to obtain the prediction value of the received power at unknown location according to Fig. 4 shows the spatial prediction results, i.e., REM, based on the measurements captured from the spatial random process previously shown in Fig. 3.In Fig. 4(a) the true reception power map subject to the average path loss and shadowing effects is present while the REM generation, composed of trend estimation and spatial predictions, is shown in Fig. 4(b).The comparison between the wireless shadowing and OK spatial predictions can be observed in Figs.4(d) and 4(e).Through a single realization and difference maps, it is possible to verify that the Kriging spatial predictor was able to capture the spatial variability of the radio environment.However, it is noted in Figs.4(c) and 4(f) that the difference between the values of spatial predictions and the true values of the radio environment fluctuates spatially in the map.Investigations carried out in our simulations also indicated that as important as the number of measurements is the representativeness of such measurements, i.e., how efficient these measurements are in capturing the spatial variability of the random process that characterizes the radio environment.Therefore, the spatial sampling pattern and density (i.e., the relationship of N with the size of the coverage area) and the de-correlation distance of Furthermore, it is important to highlight that these results refer to a single realization of the spatial random process (Section VI brings the performance analysis with more realizations).At thit point, it is intuitive to think that the incorporation of uncertainties on the devices positions can influence spatial prediction results.In the next section, we introduce the location error in our system model and proposes a spatial predictor that can mitigate its harmful impacts.[69][70][71] is applied to the wireless communication system.It is assumed that the network devices are uniformly distributed within the coverage area and that each device is capable of making reception power measurements.Figure 5 illustrates the problem of location uncertainty in the wireless environment showing how Gaussian location errors can affect the positions reported by devices in the wireless network and by consequence how it will imply in some perturbations when creating the REM.Through network location methods, the devices send their location information to RBS, forming the set of intended sites {s 1 , ..., s N }, marked as small circles in Fig. 5.However, the incorporation of location uncertainty corrupts this information, since the devices are positioned in the set of realized sites {r 1 , ..., r N }, which are the true unknown positions, marked as small crosses.Thus, the reception power measurements are taken in r and mistakenly associated with s coordinates.This mismatch between the power measurements and the set s characterizes the location uncertainty model,

V. DEALING WITH LOCATION UNCERTAINTY In this work, the uncertainty location uncertainty model proposed by Cressie and Gabrosek in
where p(s) ≡ r − s = (u x , u y ) consists of the two dimensional location error vector related to the spatial coordinates and follows a given probability distribution g(•).In this model, it is assumed that the location error p(s) is independent in the spatial coordinates, that is, a correlation between location errors in the radio environment is not assumed a priori.Two types of statistical distributions for the location error are evidenced in [69][70][71]: , where σ p is a standard deviation of location error, measured in meters, I dN is identity matrix with size of dN × dN and d is the dimension of spatial domain (d = 2).(ii) i.i.d.uniform location errors that is the random vector Gabrosek in [70] investigated the influence of location errors with Gaussian and uniform distributions on several spatial covariance models and pointed out that both distributions caused similar effects on the covariance of the spatial random process.In this work, we chose the Gaussian distribution to characterize the random effects of location errors of uncertainty model, since it is plausible to assume this distribution to model location errors in the context of wireless communications systems [41][42].In this way, by incorporating the uncertainty model ( 17) into the system model of the radio environment ( 6), it becomes possible to write where we denote the set of received power measurements with location errors as {P g (s 1 ), ... , P g (s N )} and s, r ∈ D. Also, it is assumed that the set {Z g (s 1 ), ... , Z g (s N )} represents the RV obtained from the removed trend estimation process subject to location errors.In this context, some difficulties arise, especially in aspects related to the characterization of the spatial random process with location uncertainty and the influence of such errors in the trend and covariance parameters estimation.To verify that the spatial random process of the radio environment has its Gaussian characteristics affected when subjected to location uncertainty, the equation below shows the cumulative distribution function of the location corrupted process Z g (s), where g(u) is the probability density distribution of location error vector p(s) = u = (u x , u y ), which is responsible for affecting the Gaussian characteristic of the spatial random process.This means that the covariance of the spatial random process subject to location uncertainty is different when compared to the covariance of the location-error-free random process and, consequently this difference will have some influence on the estimation of the covariance parameters.Two important effects of the location uncertainty are shown in Fig. 6.First, the estimation and removal of the trend is influenced by location errors.This can be observed in Fig. 6(a) where the dispersion (variance) of the RV increases as the severity of the location errors also increases.As a consequence, some residue of the trend contaminates the RV that will be processed by the next estimation steps.The effects of location errors on the spatial covariance are shown through experimental semivariograms in Figs.6(b) and 6(c), obtained from the measurements Z(s) (without location errors) and Z g (s) (with location errors), respectively.Through the simulations, a change in the structured covariance was observed as well as a lifting effect in the vicinity of the semivariogram origin (flattening effect on the covariogram origin).These observations agree with the results presented in [69][70][71][72][73].
Specifically in the works [69][70][71], the influence of the location errors on the structured covariance is also investigated and an adjustment procedure to take into account the location uncertainty in the spatial predictions is proposed.In this work, we propose a method to deal with location uncertainties and that complements the methodology of [69], with the objective of applying it to the radio environment of wireless communications systems.The methodology of [69] is based on the adjustment of the covariances of the Kriging system through the use of the location error statistics g(•).Thus, the adjusted covariances between intended sites s, denoted as C g (s i , s j ), and intended sites s and target coordinate s 0 , denoted as C g (s i , s 0 ), are obtained through where u = (u x , u y ) is a location error vector in position s, and v = (v x , v y ) is a location error vector in s + h.These integrals are analytically intractable, even for simple combinations of covariance and statistical distributions of location errors.In an attempt to solve this problem, as pointed out in [69] [72], we adopted the Monte Carlo Integration approach to evaluate (20) and reach approximations for adjusted covariances.
The practical implementation to solve such integrals is accomplished through an approximation of the covariances between the measurements from the generation of M i.i.d.location error sample vectors with distribution g(•) and described by The use of data with location errors {Z g (s 1 ), ... , Z g (s N )} combined with Kriging weights obtained from covariance C(•) without adjustments and applied to the spatial predictor in (14) characterize the Kriging ignoring location error (KILE) predictor.In another way, the use of data with location errors combined with Kriging weights that were obtained from the adjusted covariance C g (•) in the system of Kriging equations is denoted as a Kriging adjusted for location error (KALE).The system of adjusted Kriging equations is constructed from adjusted covariances, given by Γ g λ g = β g .Thus, the KALE optimal weights are obtained from the matrix inversion  calculation λ g = Γ −1 g β g .The compact form of both predictors can be described according to: With spatial predictors in (22) and the average received power estimation, the prediction of the received power in unknown coordinates can be obtained according to On the spatial predictions in (22)(23), two important points should be highlighted: i) the covariance adjustments of the KALE predictor are performed in the the system of Kriging equations and ii) the covariance parameters used in this system of equations can be perfectly known or estimated, and in the latter case, they are affected by the location uncertainty.This means that the lack of knowledge about covariance parameters θ will have some impact on the Kriging systems and, consequently, on the spatial predictions.In this sense, this work proposes a method that complements the KALE methodology considering the location uncertainty knowledge when the covariance parameters are unknown.Specifically, we propose that the statistical distribution g(•) is also taken into account in the method of moments related to the experimental semivariogram, which we called the semivariogram adjusted for location error (SALE) method.In other words, we proposed additional modifications to the experimental semivariogram estimation procedure to include location error statistics.With the knowledge of the probability distribution of location errors, g(•), the location error sample vectors are used to perform a random perturbation in the distance matrix used in the estimation of the experimental semivariogram.The algorithm and quantization procedures for obtaining the Lag distances, based on [76], are applied on the disturbed distances matrix, resulting in adjusted Lag distances and semivariances.The results using the SALE method are shown in Fig. 6(d) and although the adjusted semivariogram is still different from the semivariogram obtained without location uncertainty, it is more smoothed when compared to the semivariogram that was corrupted by the location errors, shown in Fig. 6(c).
Our main motivation for using the SALE method combined with the KALE predictor, namely S-KALE predictor, is presented in Fig. 7.The histograms of Figs.7(a) and 7(b) show that the sill estimation was not significantly affected by the location uncertainty while the LS estimation related to the range parameter, shown in Fig. 7(c), was significantly affected.On the other hand, it is possible to observe in the histogram of Fig. 7(d) that the SALE method allowed to reach a result closer to the optimal value for the range parameter estimation.Similar results were obtained with several radio environment configurations (e.g., coverage sizes) and different values for the sill and range parameters.This motivates us to use the S-KALE predictor in the circumstances where covariance parameter estimation is required.Although we are considering only the large-scale effects (c 0 = 0), it was observed that when the spatial random process has higher nugget values, comparable to the values of the sill variance (i.e., pure nugget effect with extremely discontinuous semivariograms), the SALE method performs equivalent to the case where the estimated semivariogram is not adjusted.

VI. SIMULATION RESULTS
This section presents the Monte Carlo simulation results related to the performance of KILE, KALE and S-KALE spatial predictors.The simulations are based on L = 10,000 independent realizations through the following steps: (i) For each realization, an initial spatial sampling strategy with uniform distribution is generated with the intended sites {s 1 , ..., s N }.Location errors p(s) are generated according to a Gaussian distribution with location error standard deviation σ p and corrupt the intended locations resulting in the set of realized sites {r 1 , ..., r N }.(ii) For each realization, the radio environment is generated by combining the trend (with average path loss effects) and spatial random variations (shadowing effects) in the realized spatial coordinates r and a target coordinate s 0 is randomly chosen for the performance evaluation of the spatial predictors.(iii) For each realization, the Kriging method is used to predict the channel shadowing value in s 0 , using: (a) the intended sites and isotropic-exponential covariance without adjustment, i.e., the KILE predictor; (b) the intended sites and the adjusted isotropic-exponential covariance, i.e., the KALE predictor and (c) the intended sites with adjusted semivariogram and isotropic-exponential covariance, i.e., the S-KALE predictor.(iv) The average received power estimation is added to the spatial predictions (i.e., Z KILE , Z KALE and Z S-KALE ) and, therefore, the results can be compared with the true values of the spatial random process according to where MSPE is the mean square prediction error and it is empirically estimated through the average over L independent realizations The objective is to verify the performance of the spatial predictors under analysis in relation to the number of measurements required to achieve a given performance as well as the robustness to the location uncertainty.Although it is not necessarily an optimal metric, MSPE is widely used in the context of performance analysis of REM spatial predictions, especially in wireless communications systems as in [29][30], [41][42].In fact, MSPE is a mean indicator of the dispersion of spatial prediction errors in REM generation process.First, the a priori knowledge of the covariance parameters θ of the wireless shadowing is assumed and a performance analysis is conducted to verify whether taking into account the location uncertainty in spatial predictions can yield better results.Therefore, the first performance analysis is made on the KILE and KALE spatial predictors.Typical wireless communications settings are chosen with average path loss with α = 3, wireless shadowing with known parameters θ = [49, 100, 0] and coverage size of 500m × 500m.
Initially, the variable to be analyzed consists of the number of measurements N while the location errors are incorporated into the simulation with Gaussian distribution and fixed standard deviation, σ p = 40m.The perfect and imperfect average path loss estimation are considered, characterizing Scenarios A and B, respectively.The results are shown in Fig. 8 in which it can be seen that taking into account the location error statistics on spatial predictions produced better performance.Although the increase in N favored both predictors, the location errors significantly affected KILE performance, while KALE benefited from more measurements.For situations where N > 100, the MSPE squared reduction of KALE over KILE was more than 7dB with imperfect trend estimation.Therefore, we can mention that taking into account the location error statistics is more efficient than ignoring them as N increases.Moreover, the ability of predictors to achieve reasonable performance with the least number of measurements as possible (i.e., predictor efficiency) is the key factor in choosing the predictor for REM generation, especially in systems whose architecture does not allow to send many measurements due to bandwidth or energy resources constraints.Another important aspect is the robustness of the spatial predictor against the severity of the location errors.In fact, despite the notorious evolution of network-based location estimation methods, channel adversities make it difficult to obtain accurate location estimates, resulting in an uncertainty in reported positioning of network devices including the possibility of generating outliers and invalid location information in the wireless network.Thus, the robustness of the spatial predictor against the negative effects of the location errors is also an important feature for REM design.
In this sense, a comparative analysis is performed considering the gradual increase of the standard deviation of the location errors σ p , assuming a fixed set consisting of N = 100 measurements collected in the radio environment for both KILE and KALE predictors.The same configuration settings as the previous simulation were chosen.The results are shown in Fig. 9 and the cases in which the perfect and imperfect trend estimation occurs are represented by the Scenarios C and D, respectively.The first conclusion that can be drawn from the plots is that KALE has proved to be more robust against location errors when compared to KILE.It is possible to notice MSPE squared reductions of the order of 4dB when the location errors have σ p = 40m and more than 5dB when the path loss trend estimation is imperfect.This reaffirms that ignoring location errors while attempting to capture the spatial variability for the Kriging predictor is not an efficient decision from the point of view of the accuracy of the predictions.In fact, what is occurring is that the difference between the spatial covariance with and without location uncertainty increases as long as the location error severity is also increased.Therefore, ignoring location uncertainty fails significantly to capture the spatial variability of the radio environment.
At this point, it is important to note that the results obtained consider the complete knowledge about covariance parameters of the wireless communication channel.In practice, the lack of knowledge about θ consists of a more realistic aspect to be applied on the system model.A simulation is conducted incorporating the covariance parameter estimation through the semivariogram and least squares estimation to verify the impact on both KILE and KALE predictors as well as the performance of proposed S-KALE predictor.In addition, for comparative purposes with other REM method, we compare the Kriging methods to IDW technique [1], [4], which is essentially based on the deterministic calculation of the inverse of the distance between the measurements collected and the target coordinate s 0 .The results are shown in Fig. 10 and the two cases of perfect and imperfect average path loss estimation are represented by the Scenarios E and F, respectively.It is possible to note that when the covariance parameters are estimated first, the difference in performance between KALE and KILE is reduced.Also, the difference between the Scenarios E and F shows that the trend estimation affected the performance of all the predictors.In fact, both KILE and KALE were significantly affected and the IDW technique had the worst performance, except for KILE in a regime of high location uncertainty.For instance, in Scenario F with σ p = 40m, the KILE is 3dB worse than the IDW and KALE has about the same MSPE performance, while the S-KALE is 4dB better than it.In other words, it is still advantageous to proceed with the adjustment on KALE, but the performance improvement is not substantial and depends significantly on the results of the covariance parameter estimation.These results corroborate the investigations carried out in [72] and show that the covariance parameter estimation plays an important role on spatial predictions subject to location uncertainty.The performance results point out that the use of the location error statistics in the semivariogram estimation (SALE) related to the S-KALE overcomes the performance of KALE in the circumstances where the covariance estimation is required.Our simulation analyzes also revealed the superiority of S-KALE in different configurations of the radio environment (e.g., different coverage areas and number of measurements) as well as different values of the parameters range and sill of the wireless communication channel.Although the different approach in the estimation process of covariance parameters, these results corroborates the conclusions presented in references [41] and [42].Specifically, regarding our work, [42] presents some differences, including: i) the use of the ML technique to estimate the wireless channel parameters considering location uncertainty at this step, whereas our work incorporates the location uncertainty in the classical geostatistical framework based on the semivariogram and the least squares estimators; ii) the use of closed-form expressions for Gaussian covariance functions with location uncertainty, whereas our work uses the Monte Carlo integration methods to accommodate different spatial covariance combinations with statistical distributions of location errors in the Kriging predictor and iii) all performance analyzes are based on one-dimensional environments, while our work considers two-dimensional radio environments maps.
Still in relation to other works, the system model of [41] presents different characteristics in relation to our work due to the Bayesian approach and the shadowing decomposition of the system model with several spatial base functions to perform the fixed rank Kriging method.The ML estimation is also used and a new strategy is proposed taking into account the posterior distribution of the observations using modifications based on the stochastic approximated Expectation-Maximization (EM) algorithm for parameter estimation.This brought the possibility of the conditional expectation predictor being obtained, which is non-linear and more complex, and presents better results compared to the BLUP predictor used in our work (albeit with a different system model), especially in a regime with high severity for location errors.Indeed, dealing with the location uncertainty in the covariance parameter estimation is a plausible initiative, but as pointed out by Diggle and Fanshawe [73] and in [74][75], the intractability of the integrals involved in parameter estimation implies analytical and computational challenges when the positioning uncertainty is present in the system model.
In this sense, the S-KALE predictor proposed in this work points out an alternative and low complexity solution applied for large-scale radio environments, since the performance results were considerably better in several cases when compared with deterministic techniques, such as IDW, and conventional Kriging methods.In other words, the potential of our work lies in the use of techniques that reduce the complexity of the channel parameter estimation in the geostatistical model when there is no a priori knowledge of the covariance parameters of wireless channel subject to location uncertainty.

VII. CONCLUSIONS
In this manuscript, we have applied the Kriging spatial predictor, a well-known technique in geostatistics, to obtain a REM.After formulating the spatial prediction problem, we have shown the necessary steps to generate the REM through the geostatistical based framework.We introduce the location errors in the system model to make the prediction problem more realistic.Moreover, we consider an important assumption on the lack of knowledge about the covariance parameters of the wireless channel.In this sense, it was possible to verify that the average path loss trend and covariance parameter estimation processes plays an important role when subjected to location errors and significantly affect the KILE and KALE performance.To deal with these problems, we propose the the S-KALE predictor, which took into account the location error statistical distribution in parameter estimation, through the adjusted semivariogram procedure.The simulation results showed that the S-KALE predictor was able to provide superior performance when compared to KALE and KILE predictors when covariance parameter estimation is required, subject to location uncertainty.On future research directions, practical research on REM (e.g., [4], [26], [83][84]) and investigations on the influence of location uncertainty on structured spatial covariance are important points in spatial prediction problems applied to the wireless communications.

Figure 3
Figure 3 presents the initial simulation results of radio environment spatial prediction model related to the learning phase.Typical configuration settings were used and the radio environment simulation is composed of: the average received power due to the average path loss effects, depicted in Fig.3(a), and the wireless channel shadowing with a isotropicexponential covariance analytical model, shown in Fig.3(b).The combination of maps in Figs.3(a) and 3(b) results in the spatial random process presented in Fig.3(c).The RBS is characterized by a transmitter with Equivalent Isotropically Radiated Power (EIRP = 0dBm) positioned in the location s tx = (20, 20), within a coverage area with size given by 100m × 100m.The propagation coefficient of the path loss model was simulated with the value α = 3.The log-normal shadowing of the radio environment has standard deviation σ = 5dB (m = 25) and zones of spatial dependence equal to 20 meters (r = 20).Also, we assume that abrupt spatial variations do not occur at small separation distances (c 0 = 0), since we are particularly interested in modeling the large-scale effects of the channel only with average path loss and shadowing.The spatial sampling is performed by capturing N = 200 power measurements which were uniformly distributed.
) and 3(f) are the semivariogram and covariance analytical functions with the LS estimated parameters θ = [m, r, c 0 ] = [31.3,23.8, 0].The difference between the estimated covariance and the true parameters θ = [m, r, c 0 ] = [25, 20, 0] is due to the estimation performance of the sequential methods (i.e., trend, semivariogram and covariance parameter estimation) used in this geostatistical approach.

Fig. 4 :
Fig. 4: Simulation results of REM: (a) true power reception; (b) REM obtained from trend estimation and OK predictions; (c) difference map: power reception and REM; (d) wireless shadowing; (e) OK predictions; (f) difference map: shadowing and OK prediction.

Fig. 6 :
Fig. 6: Location error effects: (a) dispersion of RV due to the location errors; (b) semivariogram without location errors; (c) semivariogram with location errors; (d) semivariogram adjusted for location error (SALE).

Fig. 7 :
Fig. 7: Histogram of covariance parameter estimation: (a) sill estimation with location errors; (b) sill estimation based on SALE method; (c) range estimation with location errors; (d) range estimation based on the SALE method.

Fig. 8 :
Fig. 8: Influence of N on the performance of KILE and KALE spatial predictors with perfect (Scenario A) and imperfect (Scenario B) average path loss estimation and known covariance parameters.

Fig. 9 :
Fig. 9: Influence of location uncertainty on the performance of KILE and KALE spatial predictors with perfect (Scenario C) and imperfect (Scenario D) average path loss estimation and known covariance parameters.

Fig. 10 :
Fig. 10: Influence of location uncertainty on the performance of KILE, KALE and S-KALE spatial predictors with perfect (Scenario E) and imperfect (Scenario F) average path loss estimation and unknown covariance parameters.

Ricardo
Augusto received Engineer's and Master's degrees in Telecommunication Engineering from the National Institute of Telecommunication (Inatel), Brazil, in 2010 and 2013, respectively.Currently, he is a PhD student at Escola Politécnica of the University of São Paulo (USP), Brazil.Since 2010, he has been working as a specialist in communications systems in the training and consulting area of Inatel.His research interests include wireless communications systems, statistical signal processing, geolocation, and machine learning techniques.Cristiano Panazio received Engineer's and Master's degrees in Electrical Engineering in 1999 and 2001, respectively, from State University of Campinas and a PhD degree in Telecommunications in 2005 from the Conservatoire National des Arts et Métiers.Currently, he is assistant professor at the Laboratory of Communications and Signals of the Escola Politécnica of the University of São Paulo.He is a member of the Brazilian Telecommunications Society (SBrT) and has served as an Associate Editor of the IEEE Transactions on Circuits and Systems -I from 2015 to 2017.His research interests are adaptive equalization (data aided and blind), synchronization techniques, mobile communications, multicarrier modulation and space-time transceivers.