Review on Machine Learning Methods for Motion Planning and Control Policy of Intelligent Vehicles

doi:10.15918/j.tbit1001-0645.2022.095

GONG Jianwei^1,
GONG Cheng^1,
LIN Yunlong^1,
LI Zirui^{1,2},
LU Chao^1

1. School of Mechanical Engineering, Beijing Institute of Technology, Beijing 100081
2. Delft University of Technology, Delft 2628CN, the Netherlands

About the author:
GONG Jianwei (1969-), male, Ph.D., professor, E-mail: gongjianwei@bit.edu.cn

Corresponding author:
LU Chao (1980-), male, Ph.D., associate professor, E-mail: chaolu@bit.edu.cn

CLC number: TP18, U461
Publication history
- Received: 2022-04-10
- Published online: 2022-07-11

Abstract

Intelligent vehicle technologies have advanced considerably and can now deliver the basic functions of autonomous driving in limited, closed environments. However, real-world road tests show that current technologies still have many limitations, and large-scale deployment in complex urban and off-road environments still faces many challenges. As one of the key technologies of intelligent vehicles, motion planning and control has largely established a complete theoretical framework and has seen many engineering applications. In practice, however, traditional methods still suffer from weak understanding of dynamic, complex scenes, poor adaptability across scenarios, high model complexity, and difficult parameter tuning. Because machine learning methods offer strong knowledge representation and model fitting capabilities, they have been widely applied to perception and navigation for intelligent vehicles. To address the limited generalization and applicability of traditional motion planning and control techniques, many researchers have in recent years also begun to explore motion planning and control methods based on machine learning, such as deep learning and reinforcement learning. This paper reviews the current state of research on machine-learning-based planning and control for intelligent vehicles, analyzing existing policy learning methods from three aspects: the basic architecture of planning and control policies, the basic learning paradigms, and learning-based planning and control methods. Finally, the state of research is summarized and future development directions are discussed.

Keywords:
- intelligent vehicles (IV)
- machine learning
- motion planning and control
- model predictive control

Figure 1. Logic framework of this paper

