Passive homing method based on reinforcement learning using a single hydrophone
Abstract: To ease the strict volume and cost constraints that a homing system places on an autonomous underwater vehicle (AUV), this paper proposes a simple and feasible passive homing method that guides the AUV toward a narrowband sound source of known center frequency using only a single sound pressure hydrophone. The method exploits the variation in Doppler frequency shift produced by the relative motion between the AUV and the target to control the tracking behavior of the AUV. Based on the Actor-Critic reinforcement learning algorithm, the value function is constructed from the center frequency and signal-to-noise ratio of the received signal, both obtained in real time. The policy function is constructed from the target detection result, target azimuth, and center-frequency change rate output by the value function and, combined with historical actions, outputs the current optimal action. The training efficiency and target tracking performance of the method are studied through simulations and lake trials. The results show that the proposed passive homing method can be readily applied to unmanned underwater platforms and guides the AUV to track sound source targets of different speeds with fairly good accuracy.
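The Doppler cue that the abstract relies on can be illustrated with a short sketch. It is not the authors' implementation: the 2-D geometry, the nominal sound speed of 1500 m/s, and the helper names `received_frequency` and `heading_sweep` are all assumptions made for illustration. The sketch uses the classical Doppler relation for a moving receiver and a moving source, and shows that the received center frequency is highest when the vehicle heads straight at a stationary source. This is why the received frequency and its rate of change carry bearing information even with a single hydrophone.

```python
import math

SOUND_SPEED = 1500.0  # m/s, nominal sound speed in water (assumed value)


def received_frequency(f0, rx_pos, rx_vel, src_pos, src_vel, c=SOUND_SPEED):
    """Classical Doppler-shifted frequency seen by a moving receiver.

    Positions are (x, y) in metres and velocities are (vx, vy) in m/s.
    All names here are hypothetical and chosen for this illustration.
    """
    dx = src_pos[0] - rx_pos[0]
    dy = src_pos[1] - rx_pos[1]
    rng = math.hypot(dx, dy)
    ux, uy = dx / rng, dy / rng  # unit vector from receiver to source
    v_rx = rx_vel[0] * ux + rx_vel[1] * uy     # receiver speed toward the source
    v_src = src_vel[0] * ux + src_vel[1] * uy  # source speed away from the receiver
    return f0 * (c + v_rx) / (c + v_src)


def heading_sweep(f0, speed, rx_pos, src_pos, n=36):
    """Received frequency for n evenly spaced headings, stationary source."""
    results = []
    for k in range(n):
        heading = 2.0 * math.pi * k / n
        vel = (speed * math.cos(heading), speed * math.sin(heading))
        freq = received_frequency(f0, rx_pos, vel, src_pos, (0.0, 0.0))
        results.append((heading, freq))
    return results


if __name__ == "__main__":
    # Source due east of the vehicle; the best heading should be 0 rad.
    sweep = heading_sweep(1000.0, 2.0, (0.0, 0.0), (100.0, 0.0))
    best_heading, best_freq = max(sweep, key=lambda item: item[1])
    print(f"best heading: {best_heading:.3f} rad, frequency: {best_freq:.3f} Hz")
```

In this simplified setting, a controller that turns toward the heading with the larger received frequency closes on the source. The method summarized in the abstract instead learns the steering behavior with an Actor-Critic scheme that also takes the signal-to-noise ratio and historical actions into account.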