Deep reinforcement learning is increasingly used for UAV trajectory planning, but many studies do not consider complex scenarios that change randomly. To address this gap, this study proposes PP-CMNTD3, an improved algorithm based on TD3 that introduces a simple and effective prior policy and draws on the idea of artificial potential fields to design dense rewards, so that the UAV is better guided to avoid obstacles and approach the target point quickly. Simulation results show that these improvements increase both the training efficiency of the network and the trajectory planning performance in complex scenarios. In addition, the learned policy adapts flexibly to different initial power levels, achieving an effective balance between energy consumption and rapid arrival at the destination.
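As an illustration of the dense-reward idea summarized above, the following is a minimal sketch of a reward shaped like an artificial potential field. It is not the reward used by PP-CMNTD3: the function name `apf_dense_reward`, the gains `k_att` and `k_rep`, the `influence_radius`, and the treatment of obstacles as points are all assumptions made for this example. The attractive term rewards per-step progress toward the target, and the repulsive term follows the classical potential-field form, applied only inside an obstacle's influence radius.

```python
import math


def apf_dense_reward(pos, prev_pos, goal, obstacles,
                     k_att=1.0, k_rep=0.5, influence_radius=2.0,
                     min_dist=1e-3):
    """Hypothetical APF-style dense reward (illustrative, not the paper's formula).

    pos, prev_pos, goal: coordinate tuples of equal dimension.
    obstacles: iterable of obstacle centers, modeled here as points (assumption).
    """
    # Attractive term: positive when this step reduced the distance to the goal.
    progress = math.dist(prev_pos, goal) - math.dist(pos, goal)
    reward = k_att * progress

    # Repulsive term: penalize positions inside an obstacle's influence radius.
    for obs in obstacles:
        d = max(math.dist(pos, obs), min_dist)  # clamp to avoid division by zero
        if d < influence_radius:
            reward -= 0.5 * k_rep * (1.0 / d - 1.0 / influence_radius) ** 2

    return reward


if __name__ == "__main__":
    # One step toward the goal while passing near a point obstacle.
    r = apf_dense_reward(pos=(1.0, 0.0, 0.0), prev_pos=(0.0, 0.0, 0.0),
                         goal=(5.0, 0.0, 0.0), obstacles=[(1.0, 1.0, 0.0)])
    print(r)
```

Because the reward is nonzero at every step, the agent receives a learning signal long before it first reaches the target, which is the usual motivation for dense rewards over a sparse arrival bonus. The gains would need tuning for any real environment.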