Application of TD-Based Reinforcement Learning to Satellite Attitude Control

Abstract: To meet increasingly stringent requirements on accuracy, robustness, and disturbance rejection in satellite attitude control systems, a fuzzy neural network control approach is applied to the attitude control of a three-axis stabilized satellite. Reinforcement learning based on the temporal difference (TD) method is used to tune the fuzzy neural network parameters online, so the controller learns online without requiring training samples. Simulation results show that the proposed control method with reinforcement learning not only meets the attitude control accuracy requirement but also rejects external disturbances effectively and is strongly robust to uncertainties in the satellite.