Dual-Branch Gaze Estimation Algorithm with Gaussian Mixture Distribution Heatmaps and Dynamic Adaptive Loss Function
-
Abstract
Gaze is a crucial non-verbal communication cue, and gaze estimation has achieved remarkable progress through convolutional neural networks. However, accurate gaze prediction in unconstrained environments, particularly under extreme head poses, partial occlusions, and abnormal lighting, remains challenging. Existing models often struggle to focus on discriminative ocular features, leading to suboptimal performance. To address these limitations, this paper proposes DMGDL, a dual-branch gaze estimation algorithm with Gaussian mixture distribution heatmaps and a dynamic adaptive loss function. Gaussian mixture distribution heatmaps centered on pupil positions serve as spatial attention guides, enabling the model to prioritize ocular regions. A dual-branch network architecture extracts features for the yaw and pitch angles separately, enhancing flexibility and mitigating cross-angle interference. A dynamic adaptive loss function is further formulated to address discontinuities in angle estimation, improving robustness and convergence stability. Experimental evaluations on three benchmark datasets demonstrate that DMGDL outperforms state-of-the-art methods, achieving mean angular errors of 3.98° on the Max Planck Institute for Informatics face gaze (MPIIFaceGaze) dataset, 10.21° on the physically unconstrained gaze estimation in the wild (Gaze360) dataset, and 6.14° on the real-time eye gaze estimation in natural environments (RT-Gene) dataset, exhibiting superior generalization and robustness.
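As an illustration of the heatmap idea only, the following minimal sketch builds a Gaussian mixture heatmap centered on given pupil positions. It is not the DMGDL implementation: the function name `gaussian_mixture_heatmap`, the two-component mixture, the standard deviations, the mixture weights, the per-pixel maximum used to combine the two eyes, and the final peak normalization are all assumptions chosen for the example.

```python
import math


def gaussian_mixture_heatmap(height, width, centers,
                             sigmas=(2.0, 6.0), weights=(0.6, 0.4)):
    """Return a height x width heatmap (list of rows) with values in [0, 1].

    centers: iterable of (x, y) pupil positions in pixel coordinates.
    sigmas, weights: hypothetical mixture parameters (a narrow and a wide
    isotropic Gaussian sharing the same center), not values from the paper.
    """
    if len(sigmas) != len(weights):
        raise ValueError("sigmas and weights must have the same length")

    heatmap = [[0.0] * width for _ in range(height)]
    for cx, cy in centers:
        for y in range(height):
            for x in range(width):
                d2 = (x - cx) ** 2 + (y - cy) ** 2
                # Weighted sum of Gaussians evaluated at this pixel.
                value = sum(w * math.exp(-d2 / (2.0 * s * s))
                            for w, s in zip(weights, sigmas))
                # Assumption: combine the eyes with a per-pixel maximum.
                heatmap[y][x] = max(heatmap[y][x], value)

    # Assumption: rescale so that the strongest response equals 1.
    peak = max(max(row) for row in heatmap)
    if peak > 0.0:
        heatmap = [[v / peak for v in row] for row in heatmap]
    return heatmap


# Example: a 32 x 64 map with hypothetical pupil positions for two eyes.
example_heatmap = gaussian_mixture_heatmap(32, 64, [(16, 12), (48, 12)])
```

Such a map could, for instance, be multiplied element-wise with a feature map or concatenated to the input as an extra channel so that responses near the pupils are emphasized. How DMGDL actually injects the heatmap into the network is not specified in this abstract.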
-
-