Reinforcement Learning-based Wi-Fi Contention Window Optimization

Sheila de Cássia S. Janota
Messaoud Ahmed Ouameur
Felipe Augusto Pereira de Figueiredo


The collision avoidance mechanism adopted by the IEEE 802.11 standard is not optimal. The mechanism employs a binary exponential backoff (BEB) algorithm in the medium access control (MAC) layer. This algorithm increases the backoff interval whenever a collision is detected in order to reduce the probability of subsequent collisions. However, increasing the backoff interval degrades radio spectrum utilization (i.e., it wastes bandwidth). The problem worsens when the network has to manage channel access for a large number of stations, leading to a dramatic decrease in network performance. Furthermore, a wrong backoff setting increases the probability of collisions, so that stations experience numerous collisions before reaching the optimal backoff value. Therefore, to mitigate bandwidth wastage and, consequently, maximize network performance, this work proposes using reinforcement learning (RL) algorithms, namely Deep Q-Network (DQN) and Deep Deterministic Policy Gradient (DDPG), to tackle this optimization problem. In the proposed approach, we assess two different observation metrics: the average of the normalized transmission queue levels of all associated stations, and the collision probability. The overall network throughput is defined as the reward. The action is the contention window (CW) value that maximizes throughput while minimizing the number of collisions. For the simulations, the NS-3 network simulator is used along with a toolkit known as NS3-gym, which integrates an RL framework into NS-3. The results demonstrate that DQN and DDPG perform much better than BEB in both static and dynamic scenarios, regardless of the number of stations. Additionally, our results show that observations based on the average of the normalized transmission queue levels perform slightly better than observations based on the collision probability. Moreover, the performance gap relative to BEB widens as the number of stations increases, with DQN and DDPG showing a 45.52% increase in throughput with 50 stations. Furthermore, DQN and DDPG presented similar performance, meaning that either one could be employed.
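
As a rough illustration of the BEB baseline described in the abstract, the sketch below doubles the contention window after each collision and resets it after a successful transmission. It is a minimal sketch and not the authors' simulation code: the bounds `CW_MIN = 15` and `CW_MAX = 1023` are assumed typical values, and the helper names `next_cw` and `draw_backoff` are hypothetical.

```python
import random

CW_MIN = 15    # assumed minimum contention window, in slots
CW_MAX = 1023  # assumed maximum contention window, in slots


def next_cw(cw, collided):
    """Return the contention window after one transmission attempt."""
    if collided:
        # Double the window size (cw + 1 is a power of two), capped at CW_MAX.
        return min(2 * (cw + 1) - 1, CW_MAX)
    # A successful transmission resets the window to its minimum.
    return CW_MIN


def draw_backoff(cw, rng=random):
    """Draw a backoff counter uniformly from the slots 0..cw (inclusive)."""
    return rng.randint(0, cw)


# Window growth over seven consecutive collisions, starting from CW_MIN.
cw = CW_MIN
history = [cw]
for _ in range(7):
    cw = next_cw(cw, collided=True)
    history.append(cw)

print(history)
```

The window grows only after collisions have already occurred, which is the behavior the abstract identifies as wasteful in dense networks. In the proposed approach, the RL agent instead selects the CW value as its action, based on the observed queue level or collision probability.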

Article Details

How to Cite
S. Janota, S. de C., Ahmed Ouameur, M., & de Figueiredo, F. A. P. (2023). Reinforcement Learning-based Wi-Fi Contention Window Optimization. Journal of Communication and Information Systems, 38(1).
Regular Papers
Author Biographies

Sheila de Cássia S. Janota, National Institute of Telecommunications (INATEL)

Sheila C. da S. J. Cruz received a bachelor’s degree in computer engineering from the National Institute of Telecommunications (Inatel), Brazil, in 2016. She is currently working towards completing her master’s degree at Inatel. Her research interests include digital communications, Wi-Fi, link adaptation, and machine learning.

Messaoud Ahmed Ouameur, Université du Québec à Trois-Rivières (UQTR)

Messaoud Ahmed Ouameur received a bachelor’s degree in electrical engineering from the Institut national d’électronique et d’électricité (INELEC), Boumerdes, Algeria, in 1998, the M.B.A. degree from the Graduate School of International Studies, Ajou University, Suwon, South Korea, in 2000, and the master’s and Ph.D. degrees (Hons.) in electrical engineering from the Université du Québec à Trois-Rivières (UQTR), QC, Canada, in 2002 and 2006, respectively. He has been a Regular Professor at UQTR since 2018. His research interests include embedded real-time systems, parallel and distributed processing with applications to distributed Massive MIMO, deep learning and machine learning for communication system design, and the Internet of Things with an emphasis on end-to-end systems prototyping and edge computing.

Felipe Augusto Pereira de Figueiredo, National Institute of Telecommunications (INATEL)

Felipe A. P. de Figueiredo received the B.Sc. and M.Sc. degrees in telecommunication engineering from the National Institute of Telecommunications (Inatel), Brazil, in 2004 and 2011, respectively. He received his first Ph.D. degree from the State University of Campinas (UNICAMP), Brazil, in 2019 and the second one from the University of Ghent (UGhent), Belgium, in 2021. He has been working on the research and development of telecommunication systems for more than 15 years. His research interests include digital signal processing, digital communications, mobile communications, MIMO, multicarrier modulations, FPGA development, and machine learning.