The defense will take place on February 26, 2026
Title
Reinforcement Learning for Mobility Optimization in Wireless Sensor Networks: Application to Pollution Monitoring Using Drone Fleets
Abstract
Atmospheric pollution remains a major environmental and public health challenge, affecting both large urban areas and industrial zones. While chronic pollution results from persistent long-term emissions, accidental pollution—caused by sudden events such as chemical leaks or explosions—requires precise monitoring and rapid intervention to mitigate its impacts. Traditional methods, including fixed monitoring stations and satellite observations, provide high-quality measurements for continuous and large-scale pollution monitoring. However, their limited spatial and temporal resolution makes them insufficient for detecting and tracking accidental pollution events, which evolve rapidly in both space and time.
To overcome these limitations, research has increasingly focused on the use of sensors embedded in mobile robotic platforms, notably Unmanned Aerial Vehicles (UAVs). These systems offer unprecedented flexibility and active sampling capabilities, enabling the collection of high-resolution spatio-temporal data in areas that are inaccessible or pose high risks to human operators. Advances in sensors and robotics, combined with recent developments in artificial intelligence—particularly Deep Reinforcement Learning (DRL)—have significantly transformed autonomous UAV control. By integrating path planning and intelligent decision-making, these approaches enable adaptive, multi-agent monitoring, simultaneously optimizing coverage, responsiveness, and operational robustness.
This convergence of low-cost sensing, robotics, and DRL motivated the research in this thesis, which designs autonomous systems for real-time mapping of dynamic phenomena such as accidental pollution plumes. The contributions lie at the intersection of spatio-temporal modeling and multi-agent planning, aiming to optimize mapping quality as well as the deployment and redeployment of mobile agents. They are organized around two main axes: (1) a DRL approach coupled with a probabilistic Gaussian Process model for active mapping, where a fleet of UAVs learns to explore the most informative areas through uncertainty-reduction rewards, subject to connectivity constraints that ensure reliable inter-UAV communications; and (2) a DRL approach combined with data assimilation, designed to improve both the accuracy and speed of mapping while accounting for communication constraints and the dynamics of the observed phenomenon.
This thesis introduces a new modular framework that combines the spatio-temporal modeling of dynamic phenomena with real-time anticipatory path planning for cooperative UAVs. By jointly optimizing informativeness and communication through DRL, the proposed approach provides a monitoring strategy that is both robust and generalizable.
Jury
– Mr. Marcelo DIAS DE AMORIM, Research Director – CNRS – Reviewer
– Mr. André-Luc BEYLOT, Full Professor – Toulouse INP / ENSEEIHT – Reviewer
– Ms. Isabelle GUERIN-LASSOUS, Full Professor – Université Claude Bernard Lyon 1 – Examiner
– Ms. Christelle CAILLOUET, Associate Professor (HDR) – Université Côte d’Azur – Examiner
– Mr. Grégoire DANOY, Research Scientist (HDR) – Université de Luxembourg – Examiner
– Mr. Walid BECHKIT, Associate Professor (HDR) – INSA-LYON – Thesis supervisor
– Mr. Hervé RIVANO, Full Professor – INSA-LYON – Thesis co-supervisor
