UAV Trajectory Planning Based on Improved TD3 Algorithm
Authors: 牟文心, 时宏伟

    Abstract:

    Deep reinforcement learning algorithms are increasingly used for UAV trajectory planning, but many studies do not consider complex scenarios that change randomly. To address this problem, this study proposes PP-CMNTD3, an improved algorithm based on TD3. It introduces a simple and effective prior policy and, drawing on the idea of artificial potential fields, designs a dense reward that better guides the UAV to avoid obstacles and approach the target point quickly. Simulation results show that these improvements effectively raise the training efficiency of the network and its trajectory planning performance in complex scenarios. The algorithm also adjusts its strategy flexibly under different initial battery levels, striking an effective balance between energy consumption and rapid arrival at the destination.
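
The abstract does not give the reward formula, so the following is only a minimal sketch of what a dense reward inspired by artificial potential fields might look like. Everything in it is an assumption for illustration, not the paper's design: the function name `apf_dense_reward`, the gains `k_att` and `k_rep`, the `influence_radius` parameter, and the spherical obstacle model. The attractive term rewards progress toward the goal at every step, and the repulsive term penalizes getting close to an obstacle, which is the general idea behind potential-field shaping.

```python
import math

def apf_dense_reward(pos, prev_pos, goal, obstacles,
                     k_att=1.0, k_rep=0.5, influence_radius=2.0):
    """Hypothetical APF-inspired dense reward (illustrative, not the paper's formula).

    pos, prev_pos, goal: 3D points as tuples.
    obstacles: list of (center, radius) pairs modelling spherical obstacles.
    """
    # Attractive term: positive when this step reduced the distance to the goal.
    progress = math.dist(prev_pos, goal) - math.dist(pos, goal)
    reward = k_att * progress

    # Repulsive term: penalize proximity to any obstacle surface that lies
    # within the (assumed) influence radius; farther obstacles are ignored.
    for center, radius in obstacles:
        gap = max(math.dist(pos, center) - radius, 1e-6)  # avoid division by zero
        if gap < influence_radius:
            reward -= k_rep * (1.0 / gap - 1.0 / influence_radius)
    return reward

# Example: one step toward the goal while passing near a single obstacle.
step_reward = apf_dense_reward(
    pos=(1.0, 0.0, 0.0),
    prev_pos=(0.0, 0.0, 0.0),
    goal=(5.0, 0.0, 0.0),
    obstacles=[((1.0, 1.5, 0.0), 0.5)],
)
```

Because a signal of this kind is available at every step rather than only at collision or arrival, it gives the agent feedback throughout an episode, which is the usual motivation for dense over sparse rewards.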

Cite this article

牟文心, 时宏伟. UAV trajectory planning based on improved TD3 algorithm. Computer Systems & Applications (计算机系统应用), 2024, 33(12): 197-209.

History
  • Received: 2024-04-28
  • Last revised: 2024-06-17
  • Published online: 2024-10-31