Uncertainty-aware Path Planning using Reinforcement Learning and Deep Learning Methods

Document Type : Machine learning-Sadoghi


1 Department of Electrical Engineering, Imam Khomeini International University (IKIU), Qazvin, Iran.

2 Department of Electrical Engineering, Imam Khomeini International University (IKIU), Qazvin, Iran.


This paper proposes new algorithms that improve Reinforcement Learning (RL) and Deep Q-Network (DQN) methods for path planning under uncertainty in the perception of the environment. The study formulates and solves path planning as an optimization problem with three objectives: optimizing the path, avoiding obstacles, and minimizing the associated uncertainty. To this end, a reward function is constructed from weighted features of the environment images. Deep Learning (DL) is used for two purposes: first, to perceive a real environment and obtain the state transition matrix of the mobile robot path planning problem; second, to extract state features directly from an image of the environment so that appropriate actions can be selected. The path planning problem is cast as an RL problem, and a Convolutional Neural Network (CNN) is used to approximate the Q-values as a linearly parameterized function. Applying this approach to the Q-learning, SARSA, and DQN algorithms yields improved versions called POQL, POSARSA, and PODQN. The learning results show that the improved algorithms increase path planning performance by more than 20%, 21%, and 5% compared to Q-learning, SARSA, and DQN, respectively.
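As a rough illustration of the ingredients named above, the following minimal sketch shows a reward built as a weighted sum of features, a linearly parameterized Q-value, and the standard Q-learning and SARSA temporal-difference updates that POQL and POSARSA build on. It is not the authors' implementation: the feature layout (goal progress, obstacle proximity, perception uncertainty), the weights, the function names, and the step-size and discount values are all assumptions chosen for illustration, and the feature vectors are taken as given rather than extracted by a CNN.

```python
import numpy as np

def reward(features, weights):
    """Reward as a weighted sum of state features (hypothetical feature layout)."""
    return float(np.dot(weights, features))

def q_value(theta, phi):
    """Linearly parameterized Q-value: Q(s, a) = theta . phi(s, a)."""
    return float(np.dot(theta, phi))

def q_learning_update(theta, phi_sa, r, next_phis, alpha=0.1, gamma=0.9):
    """Off-policy TD update: bootstrap from the best next action's Q-value."""
    target = r + gamma * max(q_value(theta, p) for p in next_phis)
    td_error = target - q_value(theta, phi_sa)
    return theta + alpha * td_error * phi_sa

def sarsa_update(theta, phi_sa, r, phi_next_sa, alpha=0.1, gamma=0.9):
    """On-policy TD update: bootstrap from the action actually taken next."""
    target = r + gamma * q_value(theta, phi_next_sa)
    td_error = target - q_value(theta, phi_sa)
    return theta + alpha * td_error * phi_sa

# Assumed features: [goal progress, obstacle proximity, perception uncertainty].
# Negative weights penalize closeness to obstacles and uncertain perception.
weights = np.array([1.0, -2.0, -0.5])
r = reward(np.array([1.0, 0.0, 0.4]), weights)

# One update step from zero-initialized parameters on toy feature vectors.
theta0 = np.zeros(2)
phi_sa = np.array([1.0, 0.0])
next_phis = [np.array([0.0, 1.0]), np.array([1.0, 0.0])]
theta_q = q_learning_update(theta0, phi_sa, r, next_phis)
theta_s = sarsa_update(theta0, phi_sa, r, next_phis[0])
```

The only difference between the two updates is the bootstrap target: Q-learning takes the maximum over candidate next actions, while SARSA uses the action the policy actually selects. With zero-initialized parameters the two coincide on the first step.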

