
End-to-End 6DoF Pose Estimation From Monocular RGB Images

Abstract : We present a conceptually simple framework for 6DoF object pose estimation, aimed in particular at autonomous driving scenarios. Our approach efficiently detects traffic participants in a monocular RGB image while simultaneously regressing their 3D translation and rotation vectors. The proposed method, 6D-VNet, extends Mask R-CNN with customised heads that predict each vehicle's finer class, rotation and translation. Unlike previous methods, it is trained end-to-end. Furthermore, we show that including translational regression in the joint losses is crucial for the 6DoF pose estimation task when object translation distance along the longitudinal axis varies significantly, as, for example, in autonomous driving scenarios. Additionally, we incorporate the mutual information between traffic participants via a modified non-local block that captures the spatial dependencies among the detected objects. In contrast to the original non-local block implementation, the proposed weighting modification takes spatial neighbouring information into account whilst counteracting the effect of extreme gradient values. We evaluate our method on the challenging real-world Pascal3D+ dataset, and 6D-VNet reaches 1st place in the ApolloScape challenge 3D Car Instance task (Apolloscape, 2018; Huang et al., 2018).
Document type : Journal articles
Contributor : Laurent Jonchère
Submitted on : Friday, April 16, 2021 - 3:55:51 PM
Last modification on : Friday, October 22, 2021 - 3:04:10 AM
Long-term archiving on : Saturday, July 17, 2021 - 6:56:58 PM


Zou et al-2021-End-to-end 6DoF...
Files produced by the author(s)




Wenbin Zou, Di Wu, Shishun Tian, Canqun Xiang, Xia Li, et al. End-to-End 6DoF Pose Estimation From Monocular RGB Images. IEEE Transactions on Consumer Electronics, Institute of Electrical and Electronics Engineers, 2021, 67 (1), pp.87-96. ⟨10.1109/TCE.2021.3057137⟩. ⟨hal-03189018⟩


