Heuristic for Hardware Dimensioning Considering Tidal Effect

The recent increase in the volume of services and applications and the accelerated growth in demand for wireless access represent significant challenges for the fifth generation (5G) of mobile networks. The daily large-scale migration of people to urban centres is another aspect of this trend, as it entails what is known as the “tidal effect”. This effect leads to natural fluctuations in traffic throughout the day and makes it difficult to conduct network dimensioning, control, and management, thus resulting in the inefficient use of network resources. A heuristic with two approaches for provisioning resources (one based on the aggregate throughput and the other on the number of connected users) is proposed in this paper. This is based on data extracted from the mobile subscriber movement in the current network architecture, where a database with geolocation information from the city of New York is used. The aim of this heuristic is to meet the imminent network demands of the future in light of the expected lack of available hardware resources in future mobile networks. Our results suggest that the network provisioning strategy meets the requirements of traffic variability by reducing the number of active antennas by 13% and the network-blocking probability by 3.7%, as well as by maximizing the efficiency of the baseband unit (BBU) and quantifying the small cells (SCs) needed to meet network demands.

I. INTRODUCTION
communication flow across different network topologies. This phenomenon is known as the "tidal effect" [6] and can result in the inefficient use of network resources, the overloading of the network in places of high demand or the deployment of underutilized devices, thus increasing the costs of deployment and maintenance.
As network traffic is no longer distributed uniformly, there can be a huge gap between the maximum and minimum volume of traffic. This scenario poses a problem for the operation and management of networks. In countries such as China, India, and Brazil, intense urbanization has resulted in megalopolises with over 10 million people. In developed cities, such as Paris or London, metropolitan areas also have high population densities. Thus, the growth in mobile traffic combined with metropolitan overcrowding intensifies tidal traffic congestion, making it one of the key factors that influences the dimensioning and control of telephone operators [7].
However, the tidal effect is a consequence of predictable patterns of human movement along with their mobile devices, and in view of this, it is essential for operators to be able to recognize these patterns so they can take the measures required to reduce congestion during periods of idleness [8]. Related to this, an architecture that can efficiently handle network load fluctuations is necessary, and this is one of the goals of the centralized radio access network (C-RAN).
By splitting base station (BS) hardware into a remote radio head (RRH) and a baseband unit, the C-RAN also increases flexibility and enables mobile network operations to be more dynamic than network operations without the C-RAN. In a C-RAN, resources can be dynamically allocated and redistributed across a given geographical area while being adjusted to a time-varying traffic load, and the C-RAN thus benefits from the statistical multiplexing gain achieved by adapting it to traffic fluctuations [9]. In traditional RANs, baseband capacity is statically assigned to each cell, meaning that the resources are allocated regardless of the users' movements.
Network operators are preparing the ground for the migration from the distributed radio access network (D-RAN) to the C-RAN as a means of exploiting the benefits of this new architecture, and this migration should be carried out in a way that is transparent to the users of the network. Since the D-RAN remains the dominant deployment architecture, research endeavours should seek to ensure a smooth transition towards the 5G ecosystem [10]. The migration between these architectures naturally imposes physical constraints on the new hardware, which must be dimensioned effectively so that it can meet the needs of new applications.
A policy for resource management is needed to maximize the gains that can be obtained by installing a C-RAN, and this policy should be able to deal with tidal effect scenarios, since these are the main causes of network load imbalance and wasted resources (energy and hardware), both of which affect the network at both the metro and core levels [11]. Therefore, this paper establishes a heuristic for the selection and quantitative determination of antennas to meet the capacity requirements of a predetermined scenario. The heuristic employs real user mobility data (collected from a database) by adopting two approaches; the first is based on the throughput rate of small cells (SCs), while the second depends on the number of users connected to the SCs.
The paper is structured as follows: Section II outlines the related work with regard to the aforementioned research area. Section III discusses the system model and key information regarding the subject, architecture, evaluation methodology, formulation of the research problem and designed heuristic. In Section IV, the results are shown and discussed. Finally, Section V summarizes the conclusions of this study and makes recommendations for further research in the field.

II. RELATED WORK
Predictions about the 5G network suggest that there will soon be traffic that is five times higher than that of the current generation, and this traffic will require more efficient management [12]. To meet these requirements, processing units must be centralized and implemented in the C-RAN, since this facilitates the dynamic sharing of resources and makes the network adaptable. In this architecture, resource provisioning can be implemented with demand recognition, which involves the resources (radio or hardware) being employed to meet the mobile load in the network, or by maximizing the efficiency of the BBU pool and minimizing the effects of variability in traffic monitoring data caused by the tidal effect [13] and [14].
Some authors have employed methods for network sizing by dynamically adapting it to the capacity required by varying network demands, as seen in [15]. In this study, a self-organizing C-RAN is proposed, where the BBU pool and RRH are scaled semi-statically based on the concept of cell differentiation and integration (CDI), while dynamic load balancing is formulated as an integer-based optimization problem with constraints.
The authors in [16] relied on a) handover indicators to dimension the number of active cells and lower the frequency of cell-state changes (active and inactive) and b) a genetic algorithm to assist in reducing the costs and minimizing the number of active cells. Minimizing the active cells lowers the operating costs of a heterogeneous centralized radio access network (H-CRAN), which is a combination of HetNets (heterogeneous networks) and a C-RAN. This allows for the installation of dense and heterogeneous networks that can perform cooperative signal processing when there is a high load imbalance.
In the 5G network, the distribution and mobility of users have an increasing influence on the distribution of traffic in metropolitan elastic optical networks (EONs). The tidal effect in this context represents the distribution of traffic when there is a significant imbalance in the dimensions of time and space. In view of this, the authors of [6] studied dynamic mapping algorithms for the RRH-BBU combination in the C-RAN architecture based on spectrum allocation, resource allocation in frequency slots in EONs and maximization of bandwidth. A simulation was carried out to assess the performance of the allocation scheme and the blocking probability.
In [18], it was assumed that technologies that proliferate with 5G, such as HetNets, are necessary to ensure load balancing without leading to a decline in the network's quality of service (QoS). However, the authors designed a solution to obtain an effective utilization model for hardware, which can alleviate the problem of having a shortage of spectrum in the wireless network. This involved scaling the number of access points capable of handling the heavy traffic of users in the LTE-A (long term evolution-advanced) network. The method relies on the available capacity of the channels to estimate the number of antennas needed to support the requirements of the LTE network for efficient traffic.
By leveraging the full potential of the C-RAN architecture, the authors in [17] carried out a novel elastic resource provisioning strategy to reduce power consumption both at the cell sites and in the cloud while addressing the problem of fluctuations in demanded per-user capacity. In their planned model, the RRHs and their corresponding BSs were divided into clusters, within which the active RRH densities, transmission powers, and sizes of virtual machines (VMs) were adjusted dynamically. The performance gain achieved by the elastic model obtained from their experiment over that of the traditional installation is notable in terms of energy efficiency and the use of radio resources.
In the opinion of [19], small cell deployments cause serious electromagnetic interference and energy efficiency problems for the traditional RAN model, especially when there is a temporal fluctuation in the demand for network capacity. As a result of C-RAN virtualization technologies, new reconfigurable solutions can be found, and the access network must dynamically adapt to them. In this study, the authors set out to form an adaptation mechanism in which a new elastic framework for resource utilization is able to take advantage of the software-defined wireless network (SDWN) paradigm and adjust the height of virtual BSs (VBSs) and the transmission power.
The potential for resource virtualization in a C-RAN was investigated by [20], and their study included the following: a) wireless interface virtualization through an algorithm, b) traffic-aware joint scheduling, which is responsible for the contracts between the virtual operators (VOs) and infrastructure providers (InPs), and c) the application of collective programming to maximize spectral efficiency. Interface resources are dynamically allocated between different VOs by means of hyper-vision, which takes the impact of the schemes for the transmission/data plan into account.
With regard to hardware resource provisioning at the level of the RRH-BBU combination in the 5G network, several papers have provided solutions that are also focused on the QoS, the number of blocked calls and the number of physical resource blocks (PRBs) [21]. In [22], a dynamic mapping algorithm for the RRH-BBU combination in the C-RAN architecture was studied, and a simulation was carried out to determine the performance of the allocation scheme. The authors of [23] sought to solve the problem of RRH mapping and optimization so that the network power consumption could be reduced in a user-centred C-RAN based on a multiple-input and multiple-output (MIMO) system.
As discussed in this section, several papers have been concerned with load balancing in both centralized and decentralized architectures, data flow fluctuation and the dimensioning models of the network. However, there are still gaps in the scientific literature, and several areas have still not been fully investigated. These include resource constraints in hybrid architectures, the management of radio resources in the baseband unit and the dynamic provisioning of resources (in the literature, this is characterized by the shutdown of antennas, especially with regard to network traffic flexibility in an area of varying flow with RRHs).
The main research contribution of this paper is to design a heuristic for dimensioning hardware resources in the current network architecture, where the efficiency of BBU-level resources is maximized and the blocking probability is reduced by provisioning resources in accordance with user demand. The heuristic is evaluated through simulations that are carried out with real data to ensure that the method is suited to scenarios with high data traffic variability. In addition, it is able to implement efficient resource provisioning through redistribution between adaptive antennas, thus mitigating the effects of data load imbalance and the wasted resources in the network caused by the tidal effect.

III. SYSTEM MODEL
It can be assumed that the fluctuation of data traffic throughout the day is a natural phenomenon brought about by user mobility over time; this means that it is a crucial factor for studies on network load dimensioning and resource provisioning. In this model, a tool that stores the locations of users through social networks, or location-based social networks [24], was used to store user check-in information in New York and create a database with the fluctuations so that traffic could be analysed at times of high/low data flow. The collection was compiled from April 2012 to February 2013 and resulted in a sample of 227,428 positions (longitude and latitude).

A. System Topology
The mobile network load gives rise to a fluctuation in the BS throughout the day (as seen in Fig. 1). The tidal effect can be noted in any network architecture, including the C-RAN. Since the BBU pool concentrates processing power, it must be located in a position where the number of RRHs that can be served by it is maximized, thus ensuring that the use of the hardware resources is optimized by reducing the number of idle processing units.
The dynamic behaviour of data traffic in the network depends on a set of distributed resources that meets all the network requirements. These include the operation of RRHs to cover extensive territorial areas, as well as small areas where there is a minimum amount of resources wasted, to maximize the dynamic allocation process and reduce offloading in the BS.
The purpose of the model is to provide resources to all the urban centres in New York City by responding to user demand; moreover, the model is based on the required data flow, and it dynamically allocates resources. The hardware adaptation layer is a mechanism that optimizes network functions in the sense that the infrastructure is adapted to meet the demands of high/low traffic density, avoid bottlenecks and optimize the use of resources. Each sector has features, such as areas that have a great volume of users (commercial centres, food courts, shopping centres, among others), that affect the number of users throughout the day (Fig. 1).

B. Evaluation Methodology
In radio access networks, a dense deployment of RRHs provides a possible means for the network to adapt to traffic demand. By centralizing processing units in the BBU pool, there can be reductions in power consumption and RRH complexity that significantly reduce infrastructure costs and increase network capacity [25]. This study includes data on the positioning of UE (user equipment) in 24 periods. Converting the data units into hours of the day yields a more meaningful sample than the data in units of days and provides accurate information about demographic density, as seen in Fig. 2, which represents the average number of connected users in New York over 24 hours during the evaluated period. The objective of the survey is to evaluate the variation in the number of users in the network and, based on this variation, to assess the provisioning of resources. The demand for resources in certain regions and at peak times is high, resulting in many active antennas, while in less dense regions, fewer BSs need to be activated, and thus, the use of idle resources can be avoided.
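The hourly aggregation described above can be illustrated with a minimal Python sketch. It assumes that each check-in is available as a `datetime` object; the function name `hourly_profile` and the input format are hypothetical and are not taken from the dataset or from the authors' tooling.

```python
from datetime import datetime


def hourly_profile(checkins):
    """Average number of check-ins for each hour of the day (0..23).

    `checkins` is an iterable of datetime objects (assumed input format).
    Counts are averaged over the number of distinct calendar days observed.
    """
    counts = [0] * 24
    days = set()
    for ts in checkins:
        counts[ts.hour] += 1
        days.add(ts.date())
    n_days = max(len(days), 1)  # avoid division by zero for empty input
    return [c / n_days for c in counts]


# Illustrative timestamps only, not real records from the database.
sample = [
    datetime(2012, 4, 3, 8, 15),
    datetime(2012, 4, 3, 8, 40),
    datetime(2012, 4, 4, 8, 5),
    datetime(2012, 4, 4, 21, 30),
]
profile = hourly_profile(sample)
print(profile[8], profile[21])
```

Averaging over distinct days is one possible normalization; averaging over all days in the collection window (including days with no check-ins) would give lower values.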

C. Formulation
The path loss can be defined as the difference between the transmitted signal strength and received signal strength at varying distances between the transmitting antenna and the receiving antenna. One of the main impairments in propagation is that the signal power falls as the distance between transmission nodes increases, at a rate that depends on the type of environment in which the network is deployed [26] (for example, cities with large buildings, shadowing and cell interference).
In mobile network planning, the path loss for the deployment should be estimated, while the cell coverage can be determined on the basis of macro BS, micro BS, effective isotropic radiated power, radio frequency modulation and coding techniques [27]. Large-scale path loss can be estimated with the aid of the Stanford University Interim (SUI) model [28] for carrier frequencies above 2 GHz [29]. The downlink signal-to-interference-plus-noise ratio (SINR) for a given subcarrier N assigned to user k in the SC to which it is connected is expressed as:
$$\mathrm{SINR}_k = \frac{P_{k,b(k)}}{\sigma^2 + I_k} \tag{1}$$
where $P_{k,b(k)}$ is the power received in subcarrier N of user k from the BS $b(k)$ that serves it, $\sigma^2$ is the thermal noise power and $I_k$ is the intercellular interference of the SCs. It is assumed that all the SCs transmit with maximum power $P$. The power received by user k from the BS $b(k)$ can be calculated by means of Equation 2, which expresses the transmitted power and the fading of the signal.
In Equation 2, the value of $P_{k,b(k)}$ is a function of three values, given by Equations 3 to 5. In Equation 5, $d$ is the distance from the antenna to the measured point in metres ($d_0$ is equal to 1 metre according to [23]); $\lambda$ is the wavelength in metres; $\gamma$ is the path loss exponent; $h_b$ is the height of the base station, which can be between 10 and 80 metres; A, B and C are constants that depend on the type of terrain used in the scenario (terrain type C was used, with A = 3.6, B = 0.005 and C = 20); and $S$ is the shadowing effect, which can be between 8.2 and 10.6 dB.
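For orientation, the basic SUI formulation that the symbol definitions above appear to follow can be sketched as below. This is the commonly published form of the model, not a verified copy of the paper's Equations 3 to 5: the intercept symbol $A_0$ and the lowercase names $a$, $b$, $c$ for the terrain constants (called A, B and C in the text) are naming assumptions, and the frequency and receiver-height correction terms of the full SUI model are omitted because the text does not define them.

```latex
\begin{align}
PL(d) &= A_0 + 10\,\gamma \log_{10}\!\left(\frac{d}{d_0}\right) + S, \qquad d > d_0 \\
A_0   &= 20 \log_{10}\!\left(\frac{4 \pi d_0}{\lambda}\right) \\
\gamma &= a - b\,h_b + \frac{c}{h_b}
\end{align}
```

As a worked example with the terrain constants quoted above ($a = 3.6$, $b = 0.005$, $c = 20$) and an assumed base station height of $h_b = 40$ m, the path loss exponent is $\gamma = 3.6 - 0.005 \cdot 40 + 20/40 = 3.9$.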
In Equation 6, B is the system bandwidth. It is assumed that each user reaches the Shannon capacity limit, that is, the data rate for user k as expressed in [23].
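The chain from path loss to per-user data rate can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the path loss uses the basic SUI form without correction terms, the received power is taken from a simple dB link budget (an assumed form of Equation 2), and every numeric value in the usage lines is illustrative rather than taken from the paper's Table I. All function names are hypothetical.

```python
import math


def sui_path_loss_db(d_m, wavelength_m, h_b_m, shadow_db=8.2,
                     a=3.6, b=0.005, c=20.0, d0_m=1.0):
    """Basic SUI path loss in dB (correction terms omitted, an assumption)."""
    gamma = a - b * h_b_m + c / h_b_m  # path loss exponent for the terrain
    intercept = 20.0 * math.log10(4.0 * math.pi * d0_m / wavelength_m)
    return intercept + 10.0 * gamma * math.log10(d_m / d0_m) + shadow_db


def rx_power_w(p_tx_dbm, path_loss_db):
    """Received power in watts from a dB link budget (assumed form)."""
    return 10.0 ** ((p_tx_dbm - path_loss_db - 30.0) / 10.0)


def sinr(p_rx_w, interference_w, noise_w):
    """Linear SINR: received power over noise plus interference."""
    return p_rx_w / (noise_w + interference_w)


def shannon_rate_bps(bandwidth_hz, sinr_linear):
    """Shannon capacity limit for the given bandwidth and linear SINR."""
    return bandwidth_hz * math.log2(1.0 + sinr_linear)


# Illustrative values only (carrier, power, bandwidth are assumptions).
wavelength = 3.0e8 / 2.6e9
pl = sui_path_loss_db(d_m=100.0, wavelength_m=wavelength, h_b_m=10.0)
p_rx = rx_power_w(p_tx_dbm=30.0, path_loss_db=pl)
rate = shannon_rate_bps(20e6, sinr(p_rx, interference_w=1e-13, noise_w=1e-13))
print(f"path loss {pl:.1f} dB, rate {rate / 1e6:.2f} Mbit/s")
```

If the bandwidth of an SC is shared equally among its connected UEs, as the heuristic later assumes for PRBs, `bandwidth_hz` would be the per-user share rather than the full system bandwidth.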

D. Heuristic
The selected heuristic is divided into 3 algorithms. The first (Algorithm 1) carries out a UE-RRH assignment with a channel capacity calculator. The second stage (Algorithm 2) shows the quantification of the ports (equivalent to the average number of RRHs needed for offloading), which is required to cover the aggregate throughput established by Algorithm 1. Based on this information, Algorithm 3 displays the allocated UEs, the aggregate throughput, and the number of UEs covered by the macro.

Algorithm 1: UE-RRH Assignment
Require: List of RRHs (S_t), User (u);
1: for all r ∈ S_t do
2:   Allocate UE in the nearest r;
3: end for
4: for all u ∈ UE do
5:   Update SINR_u and calculate Data Rate (Shannon) of u according to [23];
6: end for
7: for all r ∈ S_t do
8:   Calculate Aggregate Throughput A_r;
9: end for
10: return Aggregate Throughput A for each A_r | r ∈ S_t
The method for allocating the RRHs (S_t) and users (u), together with the maximum capacity of each UE when noise interference is taken into account, can be seen in lines 1 to 3 of Algorithm 1. The distribution of users is incorporated into the algorithm based on the behaviours of the users of New York, and this distribution characterizes the tidal effect on the network. At the end of this stage, the available resources in each RRH (the PRBs) are distributed equally between all the UEs connected to that RRH, which is adjusted according to the propagation model.
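A minimal Python sketch of the assignment and aggregation steps of Algorithm 1 is given below. It is an illustration, not the authors' implementation: the sketch iterates over UEs rather than RRHs when assigning, positions are assumed to be planar (x, y) coordinates, and the per-user rate is supplied by a caller-provided function `rate_fn` (a hypothetical hook standing in for the SINR and Shannon calculation).

```python
import math


def assign_and_aggregate(rrhs, ues, rate_fn):
    """Nearest-RRH assignment followed by per-RRH aggregate throughput.

    rrhs, ues: dicts mapping an id to an (x, y) position (assumed format).
    rate_fn(ue_pos, rrh_pos, n_sharing): data rate of one UE, where
    n_sharing is the number of UEs sharing the RRH's resources equally.
    """
    # Attach every UE to its nearest RRH.
    serving = {
        u: min(rrhs, key=lambda r: math.dist(pos, rrhs[r]))
        for u, pos in ues.items()
    }
    # Count the UEs per RRH so that resources can be split equally.
    load = {r: 0 for r in rrhs}
    for r in serving.values():
        load[r] += 1
    # Sum the per-user data rates into the aggregate throughput A_r.
    aggregate = {r: 0.0 for r in rrhs}
    for u, r in serving.items():
        aggregate[r] += rate_fn(ues[u], rrhs[r], load[r])
    return serving, aggregate


# Illustrative layout; the toy rate simply splits one unit of capacity.
rrhs = {"r1": (0.0, 0.0), "r2": (10.0, 0.0)}
ues = {"u1": (1.0, 0.0), "u2": (2.0, 0.0), "u3": (9.0, 0.0)}
serving, aggregate = assign_and_aggregate(rrhs, ues, lambda u, r, n: 1.0 / n)
print(serving, aggregate)
```

In a fuller model, `rate_fn` would compute the SINR from the propagation model and apply the Shannon limit to the UE's share of the bandwidth.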
Following this, the values of the SINR and the data rate (DR) are calculated for all RRHs and UEs in lines 5 to 6 of Algorithm 1. The aggregate throughput (A_r) calculated for this scenario can be seen in lines 7 to 8. At the end of this stage, a list of the available RRHs and their aggregate throughputs is compiled (line 10). The data output from Algorithm 1 serves as an input parameter for Algorithm 2, which is described below.
Sort S_t by A in descending order or
3: Sort S_t by the number
4: of UEs in descending order;
5: while RRH AggregateThroughputTEMP < DefinedAggregateThroughput(%) do
UE-RRH-Macro Detection (S_t, S_m);
10: until All UEs are allocated
11: return Output
Two criteria are adopted to determine which RRHs must be connected: (i) the approach in which the RRHs with the highest aggregated throughputs are prioritized; and (ii) the approach where the RRHs with the largest numbers of users are prioritized. Both approaches must respect the boundaries of the PRBs in each RRH available in the scenario.
In Algorithm 3, the number of ports and the scenario that is being studied are taken as inputs. Here, all the RRHs are deployed, and the UEs can be allocated to any of these RRHs; at this stage, network dimensioning is carried out on the basis of user demand (lines 1 to 2). The aggregate throughput is calculated again since a new distribution has been generated (lines 3 to 5).
The RRHs with the highest numbers of users or with the highest aggregated throughputs are selected on the basis of the number of ports taken as input. Then, the RRHs are eliminated, and a new allocation of UEs and RRHs is carried out (lines 6 to 9). The UEs not covered by RRHs must be covered by the macro BS (line 10). After the execution of all the algorithms that make up the heuristic, the cycle ends, as shown in the flowchart in Fig. 3.
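The selection and reallocation stages described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' code: RRHs are activated in ranked order until a target fraction of the total aggregate throughput is reached, the PRB limit is approximated by a maximum number of UEs per RRH, and coverage is approximated by a fixed radius. The names `select_rrhs`, `reallocate`, `radius_m` and `max_ues_per_rrh` are hypothetical.

```python
import math


def select_rrhs(aggregate, users, target_fraction, by="throughput"):
    """Rank RRHs by throughput or user count and activate them until the
    target fraction of the total aggregate throughput is covered."""
    key = aggregate if by == "throughput" else users
    order = sorted(aggregate, key=lambda r: key[r], reverse=True)
    target = target_fraction * sum(aggregate.values())
    active, covered = [], 0.0
    for r in order:
        if covered >= target:
            break
        active.append(r)
        covered += aggregate[r]
    return active


def reallocate(active_rrhs, ues, radius_m, max_ues_per_rrh):
    """Attach each UE to the nearest active RRH that is in range and not
    full; UEs left without an RRH fall back to the macro BS."""
    load = {r: 0 for r in active_rrhs}
    served, macro = {}, []
    for u, pos in ues.items():
        candidates = [
            r for r in active_rrhs
            if math.dist(pos, active_rrhs[r]) <= radius_m
            and load[r] < max_ues_per_rrh
        ]
        if candidates:
            best = min(candidates, key=lambda r: math.dist(pos, active_rrhs[r]))
            served[u] = best
            load[best] += 1
        else:
            macro.append(u)
    return served, macro


# Illustrative inputs only.
aggregate = {"a": 50.0, "b": 30.0, "c": 20.0}
users = {"a": 1, "b": 5, "c": 3}
print(select_rrhs(aggregate, users, 0.6, by="throughput"))
print(select_rrhs(aggregate, users, 0.6, by="users"))
```

With these toy numbers the throughput criterion reaches the 60% target with two RRHs while the user criterion needs three, which mirrors the kind of difference in active antennas that the two approaches are compared on.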
The flowchart illustrates how the various stages of the heuristic are interconnected (Fig. 3). The process is divided into 3 phases with specific steps starting with Algorithm 1, where the scenario is created with all its specifications and instantiated components.
Step 2 corresponds to Algorithm 2, which handles the processing and execution of the management policies in the scenario, including the connection of RRHs with the most active users or the higher aggregate throughputs. After it has met the required conditions and stayed within the limits of the PRBs, the process enters its final phase (Algorithm 3), in which the RRHs are deployed and the UEs are served.

IV. RESULTS
The dimensioning of hardware resources is assessed through mathematical modelling, which is carried out using MATLAB® software. A computer configured with an Intel(R) Core(TM) i5-3210M dual-core CPU @ 2.50 GHz (with 4 logical processors) and 10 GB of RAM is used for the simulations.
The baseline scenario is normalized to the 4 km extension and has the same percentage of users as that provided by the New York City database. One hundred SCs are distributed evenly across the scenario together with a macrocell (D-RAN) and 500 UEs that share the same QoS requirements. Table I shows the configuration of all the parameters used in the modelling process. This work is based on the movements of users and the coverage of these users over a period of 24 hours, with each hour representing an average period of the experiment. Since the users find themselves constantly in motion in New York, this means that the data flow is constantly changing as a result of user migration.

A. Tidal Effect Approach
The tidal effect naturally leads to variability and fluctuations in network traffic. A New York metadata assessment provided an estimate of the average value of the number of UEs connected over 24 periods by measuring the average value of the number of UEs covered every hour (Fig. 4). The thresholds representing the total capacity of the scenario were weighted at 20%, 40%, 60%, 80%, and 100% of the aggregated throughput. Fig. 4 shows the behaviours of the users in terms of the user approach, where the number of connected users fluctuated greatly throughout the day. A probabilistic analysis was conducted as a part of this approach to determine the relation between the increase in data traffic (the consequence of the tidal effect) and the variation of the data rate in the worst-case scenario (with 100% aggregate flow) (see Fig. 5). A 95% confidence interval was calculated, and on the basis of this, it was determined that in times of high data traffic, there was a greater distance from the average than in times of low data traffic, which means that the traffic variation had been intensified along with the variability of the data rate.
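A 95% confidence interval of the kind mentioned above can be computed, for example, as in the following Python sketch. The text does not state which method was used, so the normal approximation with the factor 1.96 is an assumption here, and the function name `mean_ci95` is hypothetical.

```python
import math
import statistics


def mean_ci95(samples):
    """Mean and normal-approximation 95% confidence interval.

    Requires at least two samples; returns (mean, lower, upper).
    """
    n = len(samples)
    mean = statistics.fmean(samples)
    half_width = 1.96 * statistics.stdev(samples) / math.sqrt(n)
    return mean, mean - half_width, mean + half_width


# Illustrative per-user data rates (Mbit/s) for one hour of the day.
rates = [2.0, 4.0, 6.0]
mean, lower, upper = mean_ci95(rates)
print(f"mean {mean:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

A wider interval at a given hour indicates greater variability of the data rate among the users connected during that hour; for small samples, a Student's t quantile would be a more conservative choice than 1.96.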
As shown in Fig. 6, at 8, 15, and 21 hours, there was a significant increase in the number of active UEs in the network. For the second approach, the same parameters were adopted, and a similar variation was obtained, although fewer users were included in each aggregate throughput. For the aggregate throughput approach (Fig. 6), with regard to the average behaviour of users over a period of 24 hours, it should be noted that the peak times remained the same as in the previous approach, although there was a fall in the number of active UEs, specifically for the cases using 60%, 80% and 100% of the aggregate throughput. For this approach, a probabilistic analysis was conducted to obtain a confidence interval, as seen in Fig. 7. The results show that the user approach serves the most subscribers and meets the minimum QoS requirements. The gap between the interval and the average in the figure can be explained by the variation in the data rate of the users connected during that period (Fig. 7). In the case of the throughput approach, the behaviour of the confidence interval was most accentuated at peak times, which means that the user rate had high variability, reflected either in the service used or in the tidal effect on the network.

B. Provisioning Approaches
The aforementioned architecture was evaluated to determine whether there is a need for increased investment in the deployment of RRHs and thus to allow for the optimization of the network planning process. This involved defining how many SCs are necessary to support the dynamic traffic of the network, which is of paramount importance since the network suffers from a high degree of interference caused by the tidal effect. Fig. 8 shows the average number of connected users per SC for each aggregate throughput level.
It should be noted that to cover the high capacity required in this scenario, a large number of antennas were activated, thereby reducing the average number of UEs per antenna. In the scenarios with low aggregate throughputs (20%, 40%, and 60%), it was most significant that the user approach maintained an average number of connected users per antenna.
When there was an increase in aggregate capacity (above 60%), the throughput approach achieved the best performance because it had a higher average rate of users per antenna than the user approach. With regard to the maximum load in the network, the approaches obtained the same average number of connected users, and all the antennas were activated. The efficiency of the allocation scheme can be evaluated based on a heuristic that has two provisioning approaches (user and throughput). If their respective performances are analysed, it can be seen that when seeking to obtain the same capacity as that of the previous scenarios, the throughput approach led to a reduction of, on average, 3% of the number of active antennas (Fig. 9). It should be noted that at the full load (100% of the aggregate throughput), it was necessary to activate all the SCs in the scenario in both approaches analysed (Fig. 9). In general, the traffic load fluctuates over time, especially when the network is operating under high traffic conditions. If there is a reduced load, the dynamic reallocation of resources is more efficient (through the adaptive allocation scheme for the other RRHs with the highest traffic loads at that time), and as a result, the problem of the load imbalance caused by the tidal effect can be solved.

C. Blocked User Probability (Disconnected Users)
The blocked user probability approach was adopted to determine the percentage of users that were not served by the C-RAN. For a more careful evaluation and comparison of these approaches, the blocked user probability was estimated (Table II) by analysing the performance of each approach with regard to its effectiveness. As seen in Table II, the user approach performed satisfactorily, since it showed an average reduction of 3.7% compared with that of the throughput approach. This shows that although both approaches aimed to offload a percentage of the macro load (20%, 40%, 60%, and so on), different users were chosen and, as a result, different average data rates were obtained (as shown previously). For scenarios with high densities or 100% capacity, both approaches obtained equal $P_B$ values due to various factors (such as having all antennas active and covering a high number of users).
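As a simple illustration of the metric, the blocked user probability can be computed as the share of users not served, averaged over the evaluated hours. The sketch below assumes this definition (taken from the description above) and uses hypothetical function names and illustrative numbers that are not results from Table II.

```python
def blocking_probability(n_total, n_served):
    """Fraction of users that were not served (assumed definition of P_B)."""
    if n_total <= 0 or not 0 <= n_served <= n_total:
        raise ValueError("expected 0 <= n_served <= n_total and n_total > 0")
    return (n_total - n_served) / n_total


def mean_blocking_probability(hourly_counts):
    """Average P_B over a sequence of (n_total, n_served) pairs, one per hour."""
    values = [blocking_probability(t, s) for t, s in hourly_counts]
    return sum(values) / len(values)


# Illustrative hourly (total, served) counts for two approaches.
user_approach = [(500, 450), (500, 500)]
throughput_approach = [(500, 400), (500, 500)]
print(mean_blocking_probability(user_approach))
print(mean_blocking_probability(throughput_approach))
```

At full load both toy series contain an hour in which every user is served, which corresponds to the case where the two approaches obtain equal values.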

V. CONCLUSION
The hardware dimensioning procedure in the migration process from the D-RAN architecture to the C-RAN architecture plays a key role in (and has a direct effect on) resource provisioning for future mobile networks, especially in the planning and operational phases. Therefore, this paper analysed the behaviour of network traffic in New York City and found that traffic variability creates challenges for hybrid architectures. A heuristic with two approaches (one based on the aggregate throughput and the other on the number of connected users) was recommended for the dynamic provisioning of hardware resources in response to demand.
In scenarios with high density, the throughput approach was most efficient and met the network demand with a reduction of 13% in the number of active antennas, while the user approach needed more active antennas to serve the same percentage of users. With regard to the blocked user probability, the user approach was more efficient and achieved a reduction of 3.7% in this analysis. This means that the probability of users being blocked is lower when the number of users logged into the network scenario is taken into account.
The results show that it is possible to dimension the number of SCs to meet network demand by applying a dynamic resource allocation scheme during times of low/high traffic density. In future work, other approaches for developing the resource provisioning scheme, such as a machine learning heuristic, could be adopted to carry out the sizing rules in accordance with user distributions. In addition, scenarios with even wider traffic variability than the ones used in this study should be investigated for future developments together with other parameters, such as energy efficiency, and an operational cost assessment.