RGB-D SLAM method of dynamic scene based on instance segmentation and optical flow

Chenggen WANG; Jinlong SHI; Haowei ZHU; Suqin BAI; Yunhan SUN; Jiawen LU; Shucheng HUANG

doi:10.37188/OPE.20243206.0857

Optics and Precision Engineering, Vol. 32, Issue 6, 857 (2024)

Chenggen WANG¹, Jinlong SHI¹,*, Haowei ZHU¹, Suqin BAI¹, Yunhan SUN², Jiawen LU¹, and Shucheng HUANG¹

Author Affiliations

¹School of Computer Science and Engineering, Jiangsu University of Science and Technology, Zhenjiang 22000, China

²State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 10046, China

CLP Journals

[1] Shuguang LI, Qinmei CHEN, Jinlong SHI, Suqin BAI, Chenggen WANG, Xin ZHUO. Object-level dynamic simultaneous localization and mapping method fusing object detection and optical flow[J]. Optics and Precision Engineering, 2025, 33(8): 1314


Citation

Chenggen WANG, Jinlong SHI, Haowei ZHU, Suqin BAI, Yunhan SUN, Jiawen LU, Shucheng HUANG. RGB-D SLAM method of dynamic scene based on instance segmentation and optical flow[J]. Optics and Precision Engineering, 2024, 32(6): 857

Paper Information

Category:

Received: Sep. 7, 2023

Accepted: --

Published Online: Apr. 19, 2024

Author Email: Jinlong SHI (shi_jinlong@163.com)

DOI:10.37188/OPE.20243206.0857
