ng. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 2008, 38(2): 156-172.
[2] J. Enright and P. R. Wurman. Optimization and Coordinated Autonomy in Mobile Fulfillment Systems. Automated Action Planning for Autonomous Mobile Robots, 2011, 33-38.
[3] L. Zhou, Y. Shi, J. Wang and P. Yang. A Balanced Heuristic Mechanism for Multirobot Task Allocation of Intelligent Warehouses. Mathematical Problems in Engineering, 2014, Vol. 2014.
[4] Y. Hu, Y. Gao and B. An. Multiagent Reinforcement Learning With Unshared Value Functions. IEEE Transactions on Cybernetics, 2014, 45(4): 647-662.
[5] Gao Yang, Chen Shifu and Lu Xin. A Survey of Reinforcement Learning Research (in Chinese). Acta Automatica Sinica, 2004, 30(1): 86-100.
[6] A. Nowé [...] Andrea and M. Mountz. Coordinating hundreds of cooperative, autonomous vehicles in warehouses. AI Magazine, 2008, 29(1): 9.
[9] Ren Jiangong. Navigation Control of Autonomous Mobile Robots Based on Reinforcement Learning (in Chinese). Harbin: Harbin Institute of Technology, 2010, 17.
[10] Guo Na. Research on Mobile Robot Path Planning Based on Simulated Annealing Q-Learning (in Chinese). Nanjing: Nanjing University of Science and Technology, 2009, 15.
[11] Wang Yong. Research on Multi-Mobile-Robot Path Planning in Intelligent Warehouse Systems (in Chinese). Harbin: Harbin Institute of Technology, 2010, 9-18.
[12] M. L. Littman. Markov games as a framework for multiagent reinforcement learning. Proceedings of the International Conference on Machine Learning (ICML), 1994, 15