Acta Photonica Sinica, Vol. 54, Issue 1, 0106003 (2025)
UAV UV Information Collection Method Based on Deep Reinforcement Learning
In recent years, Unmanned Aerial Vehicles (UAVs) have been widely adopted across many fields owing to their high mobility, flexibility, and cost-effectiveness. In civilian applications, UAVs support agriculture, environmental monitoring, and search-and-rescue operations; in military applications, they are employed for surveillance, reconnaissance, precision strikes, and target guidance. Ground-based battlefield reconnaissance sensor systems currently deployed by military forces include battlefield reconnaissance radars and magnetic, infrared, vibration, acoustic, and pressure sensors, and UAVs play an increasingly important role in collecting information from these ground sensors. Traditional information collection typically relies on radio communication, which can be severely disrupted, or rendered unusable altogether, in environments subject to electromagnetic shielding or interference. Solar-blind ultraviolet (UV) light, operating in the 200 nm-280 nm band, experiences virtually no background noise in low-altitude airspace and supports all-weather, non-line-of-sight communication. Its strong environmental adaptability, high confidentiality, and resistance to electromagnetic interference make it well suited to electromagnetically contested environments. Unlike line-of-sight UV communication, non-line-of-sight UV communication requires no precise alignment between transmitter and receiver and allows greater flexibility in receiver placement, making it better suited to collecting information from ground sensors. However, traditional algorithms are often limited in complex information collection tasks, particularly in computational resources, adaptability, and real-time performance. Deep reinforcement learning (DRL), an emerging intelligent decision-making method, enables UAVs to complete tasks autonomously by learning through interaction with the environment, making it an attractive approach for autonomous UAV navigation and data collection.

This paper addresses UAV information collection under electromagnetic interference by employing an adaptive elevation angle UV non-line-of-sight communication method and using DRL to solve the information collection task. First, a UAV mobility model is established, followed by a UV non-line-of-sight air-to-ground communication model with variable transmission and reception elevation angles. The UAV's energy consumption is then modeled in detail, covering flight energy, the energy consumed by the electro-optical pod, and communication energy. An integrated information collection model is then established that balances task execution time, energy consumption, and communication quality. Because the resulting optimization problem is NP-hard, traditional polynomial-time optimization algorithms are inadequate, so the problem is formulated as a Markov decision process. To help the UAV make better decisions about flight direction, speed, and UV transmission and reception angles, a reward function tailored to the information collection task is designed; it jointly accounts for time, energy, communication path loss, and the UAV's return to base.
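As a concrete illustration of the reward design described above, the following Python sketch combines per-step time, energy, and UV path-loss penalties with bonuses for completed collections and a return-to-base terminal reward. All weights, as well as the coefficients of the commonly used empirical NLOS UV path-loss model PL = xi * d^alpha, are illustrative assumptions, not values from the paper.

```python
import numpy as np

def nlos_path_loss_db(d_m, xi=1e9, alpha=1.6):
    """Empirical NLOS UV path loss PL = xi * d^alpha (linear scale),
    where xi and alpha are fitted per (Tx, Rx) elevation-angle pair.
    The constants here are placeholders, not the paper's fitted values."""
    return 10.0 * np.log10(xi * d_m ** alpha)

def step_reward(dt_s, energy_j, path_loss_db, collected_new, at_base, done,
                w_t=0.1, w_e=0.01, w_l=0.02, r_collect=10.0, r_home=50.0):
    """Per-step reward for the collection MDP: penalize elapsed time,
    energy use (flight + electro-optical pod + communication), and poor
    UV link quality; reward finished uploads and a safe return to base."""
    r = -w_t * dt_s - w_e * energy_j - w_l * path_loss_db
    if collected_new:
        r += r_collect
    if done and at_base:
        r += r_home
    return r

# Example: one 1-second step at 300 m link range with a fresh collection.
r = step_reward(dt_s=1.0, energy_j=150.0,
                path_loss_db=nlos_path_loss_db(300.0),
                collected_new=True, at_base=False, done=False)
```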
The Double Deep Q-Network (DDQN) algorithm, although it decouples action selection from action evaluation, can still overestimate values in high-dimensional state and action spaces. In the information collection scenario considered here, the UAV must choose among multiple movement directions and speeds while adaptively adjusting the UV transmission and reception angles during collection, so the action space is considerably larger than in previous discrete environments. To better suit this scenario, the classical DDQN algorithm is augmented with dual target networks, prioritized experience replay, and entropy regularization, improving its adaptability and stability.

To verify the effectiveness of the improved DDQN algorithm and to explore how different UV parameters, sensor quantities, and UAV flight altitudes affect collection time and energy consumption, comparative simulations against the classical DDQN algorithm are conducted. The proposed adaptive elevation angle DDQN algorithm completes the information collection task effectively, achieving at least a 13% improvement in time efficiency and a 14% reduction in energy consumption across multiple scenarios compared to the classical DDQN algorithm.
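To make the algorithmic changes concrete, here is a minimal PyTorch sketch of such an improved DDQN update: action selection is decoupled from evaluation, two target networks provide a conservative (minimum) bootstrap target, importance-sampling weights from a prioritized replay buffer scale the TD loss, and an entropy bonus over the softmax of the online Q-values encourages exploration. Network sizes, coefficients, and the batch layout are assumptions for illustration, not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class QNet(nn.Module):
    """Small MLP Q-network; the architecture is an assumption."""
    def __init__(self, state_dim, n_actions, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions))

    def forward(self, s):
        return self.net(s)

def improved_ddqn_loss(online, target1, target2, batch,
                       gamma=0.99, entropy_beta=1e-3):
    """One loss evaluation combining dual targets, PER weights, and entropy."""
    s, a, r, s2, done, is_w = batch   # is_w: PER importance-sampling weights
    q = online(s).gather(1, a).squeeze(1)
    with torch.no_grad():
        a2 = online(s2).argmax(dim=1, keepdim=True)       # decoupled selection
        t1 = target1(s2).gather(1, a2).squeeze(1)
        t2 = target2(s2).gather(1, a2).squeeze(1)
        y = r + gamma * (1.0 - done) * torch.min(t1, t2)  # dual-target bootstrap
    td_error = y - q
    # Entropy regularization over the softmax policy induced by Q-values.
    probs = F.softmax(online(s), dim=1)
    entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1).mean()
    loss = (is_w * td_error.pow(2)).mean() - entropy_beta * entropy
    # |TD error| is returned so the PER buffer can refresh priorities.
    return loss, td_error.detach().abs()
```

In practice the two target networks can be soft-updated at different rates, and sampled transitions' priorities set to |TD error| plus a small constant; both of these scheduling choices are assumptions here.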
Taifei ZHAO, Jiahao GUO, Yu XIN, Lu WANG. UAV UV Information Collection Method Based on Deep Reinforcement Learning[J]. Acta Photonica Sinica, 2025, 54(1): 0106003
Category: Fiber Optics and Optical Communications
Received: Jul. 17, 2024
Accepted: Aug. 26, 2024
Published Online: Mar. 5, 2025
Author Email: ZHAO Taifei (zhaotaifei@163.com)