TY - JOUR
AU - S. Janota, Sheila de Cássia
AU - Ahmed Ouameur, Messaoud
AU - de Figueiredo, Felipe Augusto Pereira
PY - 2023/09/15
Y2 - 2024/06/25
TI - Reinforcement Learning-based Wi-Fi Contention Window Optimization
JF - Journal of Communication and Information Systems
JA - Journal of Communication and Information Systems
VL - 38
IS - 1
SE - Regular Papers
DO - 10.14209/jcis.2023.15
UR - https://jcis.sbrt.org.br/jcis/article/view/860
SP -
AB - <p>The collision avoidance mechanism adopted by the IEEE 802.11 standard is not optimal. The mechanism employs a binary exponential backoff (BEB) algorithm in the medium access control (MAC) layer. This algorithm increases the backoff interval whenever a collision is detected in order to minimize the probability of subsequent collisions. However, increasing the backoff interval degrades radio spectrum utilization (i.e., it wastes bandwidth). The problem worsens when the network has to manage channel access for a large number of stations, leading to a dramatic decrease in network performance. Furthermore, a wrong backoff setting increases the probability of collisions, so that stations experience numerous collisions before reaching the optimal backoff value. Therefore, to mitigate bandwidth wastage and, consequently, maximize network performance, this work proposes using reinforcement learning (RL) algorithms, namely Deep Q-Learning (DQN) and Deep Deterministic Policy Gradient (DDPG), to tackle this optimization problem. In our proposed approach, we assess two different observation metrics: the average of the normalized level of the transmission queue of all associated stations, and the probability of collisions. The network's overall throughput is defined as the reward. The action is the contention window (CW) value that maximizes throughput while minimizing the number of collisions. For the simulations, the NS-3 network simulator is used along with a toolkit known as NS3-gym, which integrates an RL framework into NS-3. The results demonstrate that DQN and DDPG perform much better than BEB in both static and dynamic scenarios, regardless of the number of stations. Additionally, our results show that observations based on the average of the normalized level of the transmission queues perform slightly better than observations based on the collision probability.
Moreover, the performance difference with BEB is amplified as the number of stations increases, with DQN and DDPG showing a 45.52% increase in throughput with 50 stations. Furthermore, DQN and DDPG performed similarly, meaning that either one could be employed.</p>
ER -