

Learning variable impedance control based on reinforcement learning
Abstract: To improve the performance of force control and enable robots to learn how to execute force control tasks autonomously and efficiently, this paper presents a learning variable impedance control method. The proposed method learns the optimal impedance regulation strategy using a model-based reinforcement learning algorithm. A Gaussian process model is used as the transition dynamics model of the system, which permits probabilistic inference and planning. In addition, an energy consumption term is added to the cost function to achieve a trade-off between error and energy. Simulation results demonstrate the efficiency of the proposed method: only a few interactions are required to successfully learn force control tasks, significantly reducing the required number of interactions and the interaction time. Furthermore, the learned impedance control strategy exhibits bionic characteristics and can be applied to learning force-sensitive tasks.
Authors: LI Chao, ZHANG Zhi, XIA Guihua, XIE Xinru, ZHU Qidan, LIU Qi (College of Automation, Harbin Engineering University, Harbin 150001, China; Institute of Chemical Materials, China Academy of Engineering Physics, Mianyang 621000, China)
Source: Journal of Harbin Engineering University (indexed in EI, CAS, CSCD, Peking University Core), 2019, No. 2, pp. 304-311 (8 pages)
Funding: National Natural Science Foundation of China (U1530119).
Keywords: robot; impedance control; force control; control strategy; reinforcement learning; efficiency; Gaussian process; cost function
About the authors: LI Chao, male, Ph.D. candidate; corresponding author: ZHANG Zhi, male, associate professor, master's supervisor, E-mail: zhangzhi1981@hrbeu.edu.cn.
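The cost structure described in the abstract, a force-tracking error term plus an energy-loss term penalizing high impedance, can be illustrated with a minimal one-dimensional sketch. All function names, gains, and weights below are illustrative assumptions, not the authors' implementation:

```python
def impedance_force(k, d, x_des, x, v_des=0.0, v=0.0):
    """Classic impedance law: F = K*(x_des - x) + D*(v_des - v)."""
    return k * (x_des - x) + d * (v_des - v)

def steady_state_force(k, k_env, x_des):
    """Quasi-static contact with a spring environment of stiffness k_env:
    the controller force balances the environment, k*(x_des - x) = k_env*x,
    so x = k*x_des / (k + k_env) and the contact force is k_env*x."""
    return k_env * k * x_des / (k + k_env)

def step_cost(f_actual, f_target, k, w_energy=1e-3):
    """Squared force-tracking error plus an energy-like penalty on stiffness,
    mirroring the error/energy trade-off in the abstract's cost function."""
    return (f_actual - f_target) ** 2 + w_energy * k ** 2

# Sweep candidate stiffness values: the energy term pulls the chosen
# stiffness below the value that would minimize tracking error alone,
# yielding a more compliant controller.
f_target, k_env, x_des = 20.0, 1000.0, 0.05
best_k = min(range(100, 2000, 50),
             key=lambda k: step_cost(steady_state_force(k, k_env, x_des),
                                     f_target, k))
```

In the paper itself the impedance schedule is learned by model-based reinforcement learning with a Gaussian process dynamics model, not by a grid sweep; the sweep above only illustrates why adding the energy term favors lower-stiffness, compliant solutions.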
