Title Real-time Robot Joint Angle Estimation using Convolutional Neural Network
Translation of Title Roboto valdymas realiuoju laiku naudojant regėjimu pagrįstą konvoliucinį neuroninį tinklą.
Authors Mazloum, Faisal Firas
Pages 81
Keywords [eng] Convolutional neural networks ; direct angle regression ; keypoint-based angle estimation ; robustness ; inertial sensor (IMU)
Abstract [eng] This study investigates approaches to real-time finger joint angle estimation for robotic control using convolutional neural networks (CNNs). Traditional vision-based methods rely on detecting hand keypoints and then deriving joint angles trigonometrically, but these methods are prone to large errors under visual occlusion. To address this limitation, the study compares two paradigms, direct angle regression trained on direct angle ground-truth labels versus keypoint-based angle estimation, under a controlled factorial experimental design, and additionally benchmarks them against an existing framework to assess the feasibility of building reliable low-resource models. The training dataset was acquired through a synchronized pipeline that gathered RGB images, direct angle labels from IMU integration, and keypoint coordinates from MediaPipe Hands. Three CNN architectures (ResNet50, MobileNetV2, and EfficientNetB0) served as foundations for each model, and the resulting models were evaluated under varying levels of visual complexity (low, moderate, high) across three dimensions: prediction accuracy (absolute error), inference speed (ms/frame), and robustness (percentage change in prediction under occlusion). Results revealed that direct angle regression models achieved a mean absolute error (MAE) of 10° consistently across all projections and occlusion levels, whereas keypoint-based models were unreliable at every level. The MediaPipe framework produced a 2° MAE for side views, which rose to 45° for front views at low to moderate occlusion and degraded further at high occlusion. Statistical analysis identified label type as a significant source of variability in robustness, and MobileNetV2 achieved the lowest inference time of 20.08 ms per frame (~50 FPS). The final models were deployed on a 3-DOF, tendon-actuated robotic hand, achieving real-time control at 50 FPS with a tip speed of 0.1 m/s. The results demonstrate that CNNs trained on direct angle labels can provide a robust, low-cost, real-time control alternative to keypoint-based vision systems.
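The abstract contrasts two paradigms that can be made concrete with short sketches. As a rough illustration only (the record itself contains no code), the first Python sketch shows how a keypoint-based baseline of the kind described above typically derives a joint angle trigonometrically from three hand landmarks; the landmark indices follow MediaPipe's 21-point hand model, while the function name and the choice of the index-finger PIP joint are illustrative assumptions, not the author's implementation.

```python
import numpy as np

def joint_angle(p_prev, p_joint, p_next):
    """Interior angle (degrees) at p_joint formed by the two adjacent bone segments."""
    v1 = np.asarray(p_prev, dtype=float) - np.asarray(p_joint, dtype=float)
    v2 = np.asarray(p_next, dtype=float) - np.asarray(p_joint, dtype=float)
    # Cosine of the angle via the dot product; small epsilon guards against zero-length vectors.
    cos_theta = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-8)
    return np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

# Hypothetical usage with MediaPipe Hands landmarks (21 x 3 array of x, y, z):
# indices 5, 6, 7 are the index finger's MCP, PIP, and DIP landmarks.
# pip_angle = joint_angle(landmarks[5], landmarks[6], landmarks[7])
```

A direct angle regression model, by contrast, maps the RGB image straight to angle values, which is why it degrades more gracefully under occlusion. Below is a minimal Keras sketch assuming a MobileNetV2 backbone (the fastest architecture reported above) and a 3-joint output matching the 3-DOF robotic hand; the head design, input size, and training settings are assumptions, not the thesis's exact configuration.

```python
import tensorflow as tf

def build_angle_regressor(n_joints=3, input_shape=(224, 224, 3)):
    """MobileNetV2 backbone with a small regression head predicting joint angles."""
    backbone = tf.keras.applications.MobileNetV2(
        input_shape=input_shape, include_top=False, weights="imagenet")
    x = tf.keras.layers.GlobalAveragePooling2D()(backbone.output)
    angles = tf.keras.layers.Dense(n_joints, activation="linear")(x)  # angles in degrees
    model = tf.keras.Model(backbone.input, angles)
    # MAE loss matches the accuracy metric reported in the abstract.
    model.compile(optimizer="adam", loss="mae")
    return model
```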
Dissertation Institution Vilniaus Gedimino technikos universitetas (Vilnius Gediminas Technical University).
Type Master's thesis
Language English
Publication date 2025