Giuseppe Macario
Adv. Artif. Intell. Mach. Learn., 5(4):4645–4674
1. Giuseppe Macario: unimercatorum.it
DOI: 10.54364/AAIML.2025.54258
Article History: Received on: 30-Aug-25, Accepted on: 28-Nov-25, Published on: 05-Dec-25
Corresponding Author: Giuseppe Macario
Email: gm@mit.edu.it
Citation: Giuseppe Macario. A Graph-Attention Frontend for ORB-SLAM2 under Large Viewpoint Changes. Advances in Artificial Intelligence and Machine Learning. 2025;5(4):258. https://dx.doi.org/10.54364/AAIML.2025.54258
This paper presents a visual SLAM system that augments the ORB-SLAM2 frontend with learned detection and matching while preserving the standard backend. A fully convolutional feature extractor with a position-prior head predicts one keypoint per grid cell with sub-cell refinement, producing uniformly distributed and repeatable features under illumination changes and in low-texture scenes. A graph-attention matcher jointly reasons over intra-image neighborhoods and cross-image relations; a Sinkhorn-based assignment with a dustbin yields calibrated, approximately one-to-one correspondences. A lightweight optimization layer prunes and reweights candidates before geometric verification, and a dynamic frontend policy activates the learned components only when viewpoint change or inlier support indicates a difficult frame, otherwise falling back to fast ORB tracking. The backend—local mapping, loop closing (DBoW2), and bundle/pose-graph optimization—remains unchanged, with learned descriptors used for 3D–2D association. Experiments on HPatches, TUM RGB-D, KITTI odometry, and real-world robot runs show improved detector repeatability and matching mAP, as well as reduced trajectory error compared to ORB-SLAM2, DX-SLAM, and GCNv2-SLAM, while maintaining real-time throughput on modest hardware. These results indicate that attention-based matching and position-aware detection provide a practical path to robust SLAM under large viewpoint and appearance changes without sacrificing efficiency.
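The Sinkhorn-based assignment with a dustbin can be sketched as follows. This is a minimal illustrative implementation, not the paper's code: the score matrix would come from descriptor similarities produced by the graph-attention matcher, and `dustbin_score` stands in for a learned dustbin logit (a fixed constant here for clarity).

```python
import numpy as np

def _logsumexp(x, axis):
    # Numerically stable log-sum-exp reduction.
    m = x.max(axis=axis, keepdims=True)
    return (m + np.log(np.exp(x - m).sum(axis=axis, keepdims=True))).squeeze(axis)

def sinkhorn_with_dustbin(scores, iters=100, dustbin_score=-1.0):
    """Log-domain Sinkhorn over an (M, N) score matrix augmented with a
    dustbin row and column, so each keypoint either matches a partner
    or is declared unmatched."""
    M, N = scores.shape
    S = np.full((M + 1, N + 1), dustbin_score)
    S[:M, :N] = scores
    # Marginals: each real keypoint carries unit mass; the dustbin bins
    # absorb everything left over (N and M units, respectively).
    log_mu = np.log(np.concatenate([np.ones(M), [N]]))
    log_nu = np.log(np.concatenate([np.ones(N), [M]]))
    u, v = np.zeros(M + 1), np.zeros(N + 1)
    for _ in range(iters):
        u = log_mu - _logsumexp(S + v[None, :], axis=1)
        v = log_nu - _logsumexp(S + u[:, None], axis=0)
    return np.exp(S + u[:, None] + v[None, :])  # (M+1, N+1) transport plan
```

The top-left M×N block of the returned plan gives calibrated, approximately one-to-one match probabilities; keypoints whose mass falls mostly in the dustbin row or column are treated as unmatched and discarded before geometric verification.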