ISSN: 2582-9793

A Graph-Attention Frontend for ORB-SLAM2 under Large Viewpoint Changes

Original Research (Published On: 05-Dec-2025)
DOI: https://doi.org/10.54364/AAIML.2025.54258

Giuseppe Macario

Adv. Artif. Intell. Mach. Learn., 5 (4):4645-4674

1. Giuseppe Macario: unimercatorum.it

Article History: Received on: 30-Aug-25, Accepted on: 28-Nov-25, Published on: 05-Dec-25

Corresponding Author: Giuseppe Macario

Email: gm@mit.edu.it

Citation: Giuseppe Macario. A Graph-Attention Frontend for ORB-SLAM2 under Large Viewpoint Changes. Advances in Artificial Intelligence and Machine Learning. 2025;5(4):258. https://dx.doi.org/10.54364/AAIML.2025.54258


Abstract

This paper presents a visual SLAM system that augments the ORB-SLAM2 frontend with learned detection and matching while preserving the standard backend. A fully convolutional feature extractor with a position–prior head predicts one keypoint per grid cell with sub-cell refinement, producing uniformly distributed features that remain repeatable under illumination changes and low texture. A graph-attention matcher jointly reasons over intra-image neighborhoods and cross-image relations; a Sinkhorn-based assignment with a dustbin yields calibrated, approximately one-to-one correspondences. A lightweight optimization layer prunes and reweights candidates before geometric verification. A dynamic frontend policy activates the learned components only when viewpoint change or inlier support indicates a difficult frame, and otherwise falls back to fast ORB tracking. The backend (local mapping, loop closing with DBoW2, and bundle/pose-graph optimization) remains unchanged, with learned descriptors used for 3D–2D association. Experiments on HPatches, TUM RGB-D, KITTI odometry, and real-world robot runs show improved detector repeatability and matching mAP, as well as reduced trajectory error compared to ORB-SLAM2, DX-SLAM, and GCNv2-SLAM, while maintaining real-time throughput on modest hardware. These results indicate that attention-based matching and position-aware detection provide a practical path to robust SLAM under large viewpoint and appearance changes without sacrificing efficiency.
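
To make the Sinkhorn-based assignment with a dustbin concrete, the following is a minimal sketch in plain Python, not the paper's implementation. It assumes a precomputed score matrix between keypoints of two images, a fixed dustbin score, and a mutual-best-match rule with a confidence threshold. The function names (`sinkhorn_with_dustbin`, `extract_matches`), the marginals, and all parameter values are illustrative assumptions rather than details taken from the paper.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def sinkhorn_with_dustbin(scores, dustbin_score=0.0, iterations=50):
    """Soft assignment over an M x N score matrix augmented with dustbins.

    Returns an (M+1) x (N+1) matrix; the last row and column are the
    dustbins that absorb unmatched keypoints. Marginals are an assumption:
    each keypoint carries mass 1, each dustbin can absorb every keypoint
    of the other image.
    """
    m, n = len(scores), len(scores[0])
    # Append a dustbin column to every row, then a dustbin row.
    aug = [list(row) + [dustbin_score] for row in scores]
    aug.append([dustbin_score] * (n + 1))
    log_mu = [0.0] * m + [math.log(n)]  # target row masses (log domain)
    log_nu = [0.0] * n + [math.log(m)]  # target column masses (log domain)
    u = [0.0] * (m + 1)
    v = [0.0] * (n + 1)
    for _ in range(iterations):
        # Alternate row and column normalization in the log domain.
        u = [log_mu[i] - logsumexp([aug[i][j] + v[j] for j in range(n + 1)])
             for i in range(m + 1)]
        v = [log_nu[j] - logsumexp([aug[i][j] + u[i] for i in range(m + 1)])
             for j in range(n + 1)]
    return [[math.exp(aug[i][j] + u[i] + v[j]) for j in range(n + 1)]
            for i in range(m + 1)]


def extract_matches(assignment, threshold=0.2):
    """Keep mutual best matches above a threshold, skipping the dustbins."""
    m, n = len(assignment) - 1, len(assignment[0]) - 1
    matches = []
    for i in range(m):
        j = max(range(n + 1), key=lambda c: assignment[i][c])
        if j == n:
            continue  # keypoint i was assigned to the dustbin
        i_back = max(range(m + 1), key=lambda r: assignment[r][j])
        if i_back == i and assignment[i][j] > threshold:
            matches.append((i, j))
    return matches
```

In this sketch the dustbin is what lets a keypoint with uniformly low scores stay unmatched instead of being forced onto a poor partner, and the mutual-best check keeps the resulting correspondences approximately one-to-one.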
