Laser & Optoelectronics Progress
Yiming Guo, Xiaoqing Wu, Changdong Su, Shitai Zhang, Cuicui Bi, and Zhiwei Tao

This study proposes a generative adversarial network (GAN) based on bidirectional multi-scale feature fusion to reconstruct target celestial images captured by various ground-based telescopes under atmospheric turbulence. The approach first constructs a dataset for network training by convolving a long-exposure atmospheric turbulence degradation model with clear images and then validates the network's performance on a simulated turbulence image dataset. Furthermore, images of the International Space Station collected by the Munin ground-based telescope (a Cassegrain-type telescope) under atmospheric turbulence are included in this study and fed to the proposed neural network model for testing. Various image restoration assessments show that the proposed network has good real-time performance and can produce restoration results within 0.5 s, more than 10 times faster than standard non-neural-network restoration approaches; the peak signal-to-noise ratio (PSNR) is improved by 2 dB to 3 dB, and the structural similarity (SSIM) is enhanced by 9.3%. The proposed network also restores degraded images affected by real turbulence well.
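As an illustration of the dataset-construction step, the following minimal Python sketch synthesizes a degraded/clear training pair by convolution. The Gaussian PSF and noise level are placeholder assumptions; the abstract does not give the parameters of the paper's long-exposure turbulence model.

```python
# Minimal sketch: build a turbulence-degraded training pair by convolving a
# clear image with a long-exposure PSF. The Gaussian PSF below is a stand-in
# assumption, not the paper's actual degradation model.
import numpy as np
from scipy.signal import fftconvolve

def gaussian_psf(size=21, sigma=3.0):
    """Isotropic Gaussian kernel used as a stand-in long-exposure PSF."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    psf = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return psf / psf.sum()

def degrade(clear, sigma=3.0, noise_std=0.01):
    """Blur a clear image (values in [0, 1]) with the PSF and add sensor noise."""
    blurred = fftconvolve(clear, gaussian_psf(sigma=sigma), mode="same")
    return np.clip(blurred + np.random.normal(0, noise_std, clear.shape), 0, 1)

clear = np.random.rand(128, 128)    # placeholder for a clear image
pair = (degrade(clear), clear)      # (network input, ground truth)
```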

Nov. 25, 2022
  • Vol. 59 Issue 22 2201001 (2022)
  • Jiayi Wang, and Zhongxing Duan

The external insulation layer of a building's exterior wall is widely used for energy conservation in buildings. An infrared imaging detection method based on a three-dimensional heat transfer model is proposed to effectively detect quality problems in the surface cracks of the external insulation layer. First, with the help of infrared thermal imaging technology, an experimental platform was built to detect surface cracks in the external insulation layer of a building exterior wall. Then, using ANSYS software, a three-dimensional infrared thermal imaging detection model for the surface cracks of the external insulation layer was established, the model's feasibility was verified, and the effects of crack size and ambient temperature on the detection result were simulated and calculated. The results show that the experimental ambient temperature and crack size have the greatest influence on the temperature difference between the crack and non-crack areas of the external insulation layer. When the ambient temperature remains constant, the temperature difference grows as the crack width and thickness increase. The temperature difference also increases gradually with ambient temperature: below 10 ℃ it changes gently, whereas above 10 ℃ its growth rate increases gradually with increasing ambient temperature.

    Nov. 25, 2022
  • Vol. 59 Issue 22 2204001 (2022)
  • Chao Wang, Yongshun Wang, and Fan Di

A fast and automatic fuzzy C-means (FCM) color image segmentation algorithm is proposed as an alternative to the traditional FCM algorithm, which has high computational complexity and fails to automatically determine the number of clusters. First, the image is presegmented by an improved simple linear iterative clustering (SLIC) algorithm, transforming traditional pixel-based clustering into superpixel region-based clustering and reducing computational complexity. Second, an improved density peak algorithm determines the number of clusters automatically, improving flexibility. Finally, the superpixel images are subjected to histogram-based FCM clustering to complete image segmentation. The BSDS500, AID, and MSRC public databases were used as experimental datasets, and the proposed method was compared with other FCM segmentation methods to verify its effectiveness. The experimental results show that the proposed segmentation algorithm outperforms several comparative algorithms in terms of segmentation accuracy, fuzzy partition coefficient, fuzzy partition entropy, and visual effect.
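A condensed sketch of the pipeline (SLIC presegmentation, then FCM on superpixel features) follows. The cluster count `c` is fixed by hand here, whereas the paper determines it automatically with an improved density-peak step, which is omitted.

```python
# Sketch: SLIC superpixels, then fuzzy C-means on superpixel mean colors
# instead of raw pixels. Cluster count c is assumed; the paper's automatic
# density-peak selection of c is omitted for brevity.
import numpy as np
from skimage.segmentation import slic

def fcm(X, c, m=2.0, iters=100, eps=1e-6):
    """Plain fuzzy C-means on the rows of X; returns memberships and centers."""
    n = X.shape[0]
    U = np.random.dirichlet(np.ones(c), size=n)        # n x c memberships
    centers = None
    for _ in range(iters):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]  # weighted means
        d = np.linalg.norm(X[:, None, :] - centers[None], axis=2) + 1e-12
        p = 2.0 / (m - 1)
        U_new = 1.0 / (d ** p * (d ** -p).sum(axis=1, keepdims=True))
        if np.abs(U_new - U).max() < eps:
            U = U_new
            break
        U = U_new
    return U, centers

def segment(image, n_segments=400, c=4):
    """image: HxWx3 float array. Returns a per-pixel cluster label map."""
    labels = slic(image, n_segments=n_segments, compactness=10)
    uniq, inv = np.unique(labels, return_inverse=True)
    feats = np.array([image[labels == s].mean(axis=0) for s in uniq])
    U, _ = fcm(feats, c)
    return U.argmax(axis=1)[inv].reshape(labels.shape)  # map back to pixels
```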

    Nov. 25, 2022
  • Vol. 59 Issue 22 2210001 (2022)
  • Cong Wu, Zhiqiang Guo, and Jie Yang

To address the difficulty of the SENet channel attention mechanism in extracting features from single and segmented seedling dataset images, a residual network based on a dual attention mechanism is designed, integrating channel attention and spatial attention. The combined module obtains channel and spatial feature weights simultaneously to enhance the feature learning ability of the network. To address the problem of missing targets in the segmented sample data, a random erasing method is proposed. Experiments on the self-made plug seedling Plant_seed dataset demonstrate that the improved network ResNet34+CBAM_basic_conv, which introduces the attention module between the ResNet34 residual module and the conv*_x module, reaches an optimal accuracy of 93.8%. The classification error rate drops after some images in the dataset are randomly erased, demonstrating the excellent performance of the proposed method.
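The random erasing step can be illustrated as follows; the rectangle-size and aspect-ratio ranges are common defaults assumed for illustration, not the paper's settings.

```python
# Sketch of random-erasing augmentation: a random rectangle is overwritten
# with noise so the network cannot rely on any single image region.
# Parameter ranges are illustrative; a uint8 image is assumed.
import random
import numpy as np

def random_erase(img, area_frac=(0.02, 0.2), aspect=(0.3, 3.3), p=0.5):
    """Erase one random rectangle of an HxW(xC) uint8 image with noise."""
    if random.random() > p:
        return img
    h, w = img.shape[:2]
    for _ in range(10):                           # retry until the box fits
        target = random.uniform(*area_frac) * h * w
        r = random.uniform(*aspect)               # height/width ratio
        eh = int(round((target * r) ** 0.5))
        ew = int(round((target / r) ** 0.5))
        if 0 < eh < h and 0 < ew < w:
            y, x = random.randint(0, h - eh), random.randint(0, w - ew)
            out = img.copy()
            out[y:y + eh, x:x + ew] = np.random.randint(
                0, 256, (eh, ew) + img.shape[2:], dtype=img.dtype)
            return out
    return img
```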

    Nov. 25, 2022
  • Vol. 59 Issue 22 2210002 (2022)
  • Wanjun Liu, Yitong Li, and Wentao Jiang

A high-confidence adaptive feature fusion target tracking algorithm is proposed to address the tracking drift of complementary-learning tracking algorithms in complex scenes, such as occlusion and cluttered backgrounds. First, we use the Bhattacharyya coefficient to calculate the similarity between the foreground and background color histograms of each frame in real time and adopt the log loss function to obtain the final fusion factor, achieving better feature fusion for each frame. The average peak-to-correlation energy and the ratio of the response peak value to its historical average are then used to determine the confidence parameter, and the target position is updated and corrected for tracking based on the determination result. Experiments on the OTB100 and LaSOT datasets show that this algorithm improves the precision rate by 17.5% and 15.4% and the success rate by 27.3% and 18.0%, respectively, compared with the Staple algorithm. The results demonstrate the effectiveness and robustness of the algorithm.
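The Bhattacharyya similarity test at the heart of the fusion step reduces to a few lines; the 32-bin histogram is an illustrative choice. A coefficient near 1 means foreground and background color distributions overlap heavily, i.e., color becomes an unreliable cue and the fusion factor should downweight it.

```python
# Sketch of the per-frame similarity test: Bhattacharyya coefficient
# between normalized foreground and background color histograms.
import numpy as np

def bhattacharyya(fg_pixels, bg_pixels, bins=32):
    """Bhattacharyya coefficient of two pixel samples' color histograms."""
    hp, _ = np.histogram(fg_pixels, bins=bins, range=(0, 256))
    hq, _ = np.histogram(bg_pixels, bins=bins, range=(0, 256))
    hp = hp / max(hp.sum(), 1)
    hq = hq / max(hq.sum(), 1)
    return float(np.sum(np.sqrt(hp * hq)))   # 1.0 = identical distributions
```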

    Nov. 25, 2022
  • Vol. 59 Issue 22 2210003 (2022)
  • Hao Wang, Zengshan Yin, Guohua Liu, Denghui Hu, and Shuang Gao

In this paper, a lightweight optical remote sensing image target detection algorithm, LW-YOLO, is proposed based on the YOLOv5 detection model to solve the difficulty of deploying deep learning target detection algorithms on satellites due to large model volume and too many parameters. First, a lightweight Ghost module is introduced to replace the ordinary convolutions in the network, reducing the number of parameters and the computational overhead caused by feature information redundancy in the YOLOv5 network. Then, a spatial and channel fusion attention (FA) module is designed, and the network's bottleneck layer is reconstructed as FABottleneck to further reduce parameters and improve the algorithm's ability to localize optical remote sensing image targets. Finally, a sparse-parameter adaptive network pruning method is proposed to prune the network and further compress the model size. Experiments on the DOTA dataset show that, compared with YOLOv5s, the LW-YOLO algorithm reduces the number of parameters by 64.7%, the model size by 62.7%, and the inference time by 3.7%, while the mean average precision decreases by only 6.4%. The algorithm makes the network model lightweight at the cost of a small accuracy loss and provides a theoretical basis for on-orbit target detection in spaceborne optical images.
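A GhostNet-style Ghost module, which the abstract's lightweight replacement follows in spirit, might look like this in PyTorch; the exact LW-YOLO hyperparameters are not given in the abstract.

```python
# Illustrative Ghost module in the spirit of GhostNet: a small primary
# convolution plus cheap depthwise operations generate the remaining
# feature maps, cutting parameters versus an ordinary convolution.
import torch
import torch.nn as nn

class GhostModule(nn.Module):
    def __init__(self, inp, oup, ratio=2, dw_size=3):
        super().__init__()
        init_ch = (oup + ratio - 1) // ratio          # primary channels
        cheap_ch = init_ch * (ratio - 1)              # "ghost" channels
        self.primary = nn.Sequential(
            nn.Conv2d(inp, init_ch, 1, bias=False),
            nn.BatchNorm2d(init_ch), nn.SiLU())
        self.cheap = nn.Sequential(                   # depthwise, per-channel
            nn.Conv2d(init_ch, cheap_ch, dw_size, padding=dw_size // 2,
                      groups=init_ch, bias=False),
            nn.BatchNorm2d(cheap_ch), nn.SiLU())
        self.oup = oup

    def forward(self, x):
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)[:, :self.oup]
```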

    Nov. 25, 2022
  • Vol. 59 Issue 22 2210004 (2022)
  • Lijie Zhao, Guishuo Zhang, Shida Zou, Guogang Wang, Wenyu Fan, Yuhong Zhang, and Mingzhong Huang

Addressing the problem that a single activated sludge microscopic image under high magnification has a small field of view and limited information for characterizing sludge samples, a multi-image stitching approach for activated sludge microscopic images based on the Floyd algorithm is proposed. First, the scale-invariant feature transform algorithm is used to extract feature points of the activated sludge microscopic images, and the distance matrix of multi-image feature matching points is computed using cosine distance. Next, the Floyd algorithm is used to determine the multi-image stitching reference map and the optimized stitching path. Finally, the images are stitched along the stitching path to the reference map by affine transformation. The experimental findings show the effectiveness of the proposed approach: in the case of a limited microscopic field of view and multiple unordered images, it solves the image stitching problem.
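The Floyd step can be sketched as follows: Floyd-Warshall over the pairwise matching-distance matrix yields both a natural reference image (the most central node) and an optimized stitching path for every other image. The cost-matrix construction from cosine distances is assumed to happen upstream.

```python
# Sketch: Floyd-Warshall all-pairs shortest paths over the matching-distance
# matrix D (0 on the diagonal, np.inf where two images share no matches),
# then choose the most central image as the stitching reference.
import numpy as np

def floyd(D):
    """All-pairs shortest paths with next-hop tracking."""
    n = D.shape[0]
    dist = D.astype(float).copy()
    nxt = np.tile(np.arange(n), (n, 1))      # nxt[i, j]: next hop on i -> j
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i, k] + dist[k, j] < dist[i, j]:
                    dist[i, j] = dist[i, k] + dist[k, j]
                    nxt[i, j] = nxt[i, k]
    return dist, nxt

def stitch_plan(D):
    dist, nxt = floyd(D)
    ref = int(dist.sum(axis=1).argmin())     # most central image = reference
    paths = {}
    for j in range(dist.shape[0]):
        path, i = [j], j
        while i != ref:                      # walk hop by hop toward ref
            i = nxt[i, ref]
            path.append(i)
        paths[j] = path                      # stitching route j -> ... -> ref
    return ref, paths
```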

    Nov. 25, 2022
  • Vol. 59 Issue 22 2210005 (2022)
  • Xiao Liang, Huiping Deng, Sen Xiang, and Jin Wu

Existing light field image saliency detection algorithms cannot effectively measure focus information, resulting in incomplete salient objects, information redundancy, and blurred edges. Considering that different slices of the focal stack and the all-focus image play different roles in saliency prediction, this study combines the efficient channel attention (ECA) network and the convolutional long short-term memory (ConvLSTM) network into a feature fusion network that adaptively fuses the features of the focal stack slices and the all-focus image without dimension reduction. A feedback network composed of cross-feature modules then refines the information and eliminates the redundancy generated by feature fusion. Finally, the ECA network is used to weight the high-level features to better highlight the saliency area and obtain a more accurate saliency map. The proposed network achieves an F-measure of 0.871 and a mean absolute error (MAE) of 0.049 on the latest dataset, significantly better than existing saliency detection algorithms for red, green, and blue (RGB) images, red, green, blue, and depth (RGB-D) images, and light field images. The experimental results show that the proposed network can effectively separate the foreground and background regions of the focal stack slices and produce a more accurate saliency map.
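The dimension-preserving channel weighting can be illustrated with a standard ECA block: global average pooling followed by a 1D convolution across the channel axis, so channel attention is learned without the bottleneck reduction of SE-style modules. The kernel size 3 is an illustrative choice (the ECA paper derives it adaptively from the channel count).

```python
# Sketch of an ECA block: channel attention via a 1D convolution over the
# pooled channel descriptor, with no dimensionality reduction.
import torch
import torch.nn as nn

class ECA(nn.Module):
    def __init__(self, k_size=3):
        super().__init__()
        self.conv = nn.Conv1d(1, 1, k_size, padding=k_size // 2, bias=False)

    def forward(self, x):                         # x: (B, C, H, W)
        w = x.mean(dim=(2, 3))                    # (B, C) channel descriptor
        w = self.conv(w.unsqueeze(1)).squeeze(1)  # 1D conv across channels
        return x * torch.sigmoid(w)[..., None, None]
```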

    Nov. 25, 2022
  • Vol. 59 Issue 22 2210006 (2022)
  • Shiyu Lin, Xuejiao Yan, Zhe Xie, Hongwen Fu, Song Jiang, Hongzhi Jiang, Xudong Li, and Huijie Zhao

Employing a robot to inspect the inner surface of a pipeline periodically is crucial to guarantee that the pipeline runs safely and reliably. Limited by robot size and power, small three-dimensional measurement sensors with lower accuracy are frequently used on the robot to obtain environmental and navigation information. However, the quality of the pipeline point cloud acquired by such a sensor is substandard, making it challenging to reliably detect obstacles. Therefore, a point cloud processing approach based on time series and neighborhood analysis is proposed, which employs the temporal and spatial distribution characteristics of obstacle and noise point clouds to remove noise and finally detects obstacles by fitting the pipeline inner-wall point cloud. The experiments reveal that the detection accuracy improves by 30 percentage points and the processing time is less than 1 s, meeting the requirements of the pipeline inspection robot.

    Nov. 25, 2022
  • Vol. 59 Issue 22 2210007 (2022)
  • Ming Chen, Xiangyun Xi, and Yang Wang

A hyperspectral image classification method based on a residual generative adversarial network (GAN) is proposed to address the high demand for labeled samples and the difficulty of achieving high classification accuracy in hyperspectral image classification. The method is based on a GAN: the deconvolution layers of the generator are replaced with an eight-layer residual network composed of upsampling and convolution layers to improve data generation ability, and the discriminator's convolutional layers are replaced with a thirty-four-layer residual convolutional network to improve feature extraction ability. Experiments are conducted on the Indian Pines, Pavia University, and Salinas datasets, and the proposed method is compared with GAN, CAE-SVM, 2DCNN, 3DCNN, and ResNet. The results demonstrate that the proposed method significantly improves the overall classification accuracy, average classification accuracy, and Kappa coefficient. The overall classification accuracy reaches 98.84% on the Indian Pines dataset, which is 2.99, 22.03, 12.91, 4.99, and 1.79 percentage points higher than the comparison methods, respectively. In summary, adding a residual structure to the network improves information exchange between the shallow and deep layers, extracts deep features of the hyperspectral image, and improves hyperspectral image classification accuracy.

    Nov. 25, 2022
  • Vol. 59 Issue 22 2210008 (2022)
  • Wei Lin, Haihua Cui, Wei Zheng, Xinfang Zhou, Zhenlong Xu, and Wei Tian

As a noncontact, high-precision optical full-field measurement method, shearography can be used for nondestructive detection of internal defects in composite materials. However, the obtained phase fringe pattern contains heavy speckle noise that seriously affects the detection results and accuracy. Therefore, we propose a phase fringe-filtering method using an unsupervised image style conversion model (CycleGAN). The original noisy phase fringe image obtained using shearography is converted into an ideal noiseless fringe image via network training to achieve noise filtering of the phase fringe pattern. The experimental results show that the proposed method achieves efficient filtering of noise in areas where the fringe distribution is relatively sparse, with clear boundaries and significant contrast in the filtered images. Additionally, the proposed method outperforms the other methods in running time (by approximately 30 ms), achieves high-quality filtering, meets the demands of dynamic nondestructive testing, and provides a new idea for the noise filtering of phase fringe patterns.

    Nov. 25, 2022
  • Vol. 59 Issue 22 2210009 (2022)
  • Wei Gao, Boyang He, Ting Zhang, Meiqing Guo, Jun Liu, Huimin Wang, and Xingzhong Zhang

The perception of the spatial distance between operators and dangerous equipment is a basic safety management and control issue in substation scenes. With the advancement of lidar and three-dimensional (3D) vision theory, 3D point cloud target detection can provide necessary technical assistance for downstream spatial distance measurement tasks. Aiming at the problem of inaccurate target detection caused by factors such as complex backgrounds and equipment occlusion in substation scenes, an improved attention module is introduced into the local feature extraction stage of the PointNet++ model, and a 3D object detection network, PowerNet, suitable for substation operation scenes is proposed. First, the network performs two-level local feature extraction to obtain fine-grained features in each local area, then encodes all local features into feature vectors using a mini-PointNet to obtain global features, and finally predicts the results through fully connected layers. Considering the large gap between the numbers of foreground and background points in substation point cloud data, this study computes the classification loss using focal loss to make the network pay more attention to the feature information of foreground points. Experiments on the self-built dataset show that PowerNet achieves a mean average precision (mAP) of 0.735, higher than previous models, and can be directly applied to downstream security management and control tasks.
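The focal-loss reweighting used for the foreground/background imbalance is the standard binary form sketched below; easy background points then contribute little gradient, so the network concentrates on the sparse foreground. The α and γ values are the common defaults, not necessarily the paper's.

```python
# Sketch of binary focal loss for foreground/background point classification.
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    """logits, targets: per-point tensors of the same shape (targets in {0,1})."""
    p = torch.sigmoid(logits)
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p_t = p * targets + (1 - p) * (1 - targets)            # prob of true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (alpha_t * (1 - p_t) ** gamma * ce).mean()      # easy points downweighted
```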

    Nov. 25, 2022
  • Vol. 59 Issue 22 2210010 (2022)
  • Xiaomiao Li, Yanchun Yang, Jianwu Dang, and Yangping Wang

This paper proposes a multi-focus image fusion method based on double-scale decomposition and random walk that smooths edge regions and avoids artifacts at edge junctions. The source images are first decomposed into large-scale and small-scale focus images using a Gaussian filter, and the edges of the decomposed images are smoothed using guided filters. Then, the large-scale and small-scale focus maps are used as the marker nodes of the random walk algorithm, the initial decision map is obtained using the fusion algorithm, and a guided filter is used to optimize the decision map again. Finally, the source images are reconstructed using the decision maps to produce the final fused image. The experimental results show that our method can effectively obtain the focus information in the source images while retaining the edge texture and detailed information of the focus area, and it outperforms comparison methods in both subjective and objective evaluation indicators.
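The two-scale split can be illustrated in two lines; the Gaussian σ is a placeholder assumption, and the paper's guided-filter smoothing and random-walk fusion follow downstream.

```python
# Sketch of the double-scale decomposition: a Gaussian-blurred base layer
# carries large-scale structure, the residual carries small-scale detail.
import numpy as np
from scipy.ndimage import gaussian_filter

def two_scale(img, sigma=2.0):
    """Return (base, detail) layers with img == base + detail."""
    base = gaussian_filter(img.astype(float), sigma=sigma)
    return base, img - base
```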

    Nov. 25, 2022
  • Vol. 59 Issue 22 2210011 (2022)
  • Jiawei Zhu, Zhaohui Jiang, Shilan Hong, Huimin Ma, Jianpeng Xu, and Maosheng Jin

The grain filling stage is a critical growth phase of rice. To segment the panicle accurately during the filling stage and explore the relationship between its characteristics and plant maturation, a method of segmentation and characteristics analysis is proposed based on neural architecture search (NAS). Based on the DeepLabV3Plus network model, the backbone network is automatically designed using NAS, and the semantic segmentation network Rice-DeepLab is built by modifying the atrous spatial pyramid pooling (ASPP) module. The area ratios, dispersion, average curvature, and color characteristics of the panicles of four rice varieties are calculated and analyzed after segmentation by Rice-DeepLab. The experimental results show that the improved Rice-DeepLab network achieves a mean intersection over union (mIoU) of 85.74% and an accuracy (Acc) of 92.61%, which are 6.5% and 2.97% higher than those of the original model, respectively. From the panicles' area ratios, dispersion, average curvature, and color characteristics recorded in the image, one can roughly distinguish whether the panicles are sparse or dense, whether grain filling is complete, and whether the color is green, golden, or gray. This study suggests that field cameras can be used to monitor rice in the filling stage and preliminarily estimate maturation and crop size through panicle segmentation and characteristics analysis, thus providing support for field management.

    Nov. 25, 2022
  • Vol. 59 Issue 22 2210012 (2022)
  • Fengyuan Shi, Chunming Zhang, Lihui Jiang, Qi Zhou, and Di Pan

A principal component analysis (PCA)-based point cloud registration strategy is proposed by analyzing the algorithm's registration process, and a PCA registration step is added to the iterative process of the iterative closest point (ICP) algorithm to solve two problems: the ICP algorithm easily falls into a local minimum, and the registration is time-consuming. First, the center-of-gravity method is used to make the center of gravity of the reference point cloud coincide with that of the point cloud to be registered before the first iteration, determining the initial pose. Second, PCA is applied to the point cloud to be registered and the reference point cloud in each iteration of the ICP algorithm; the first three principal component eigenvectors are selected and matched through pose transformation, so that after the initial registration of the two point clouds is complete, the Euclidean distance is used to find the closest points for the subsequent registration process. The classic ICP algorithm with three initial pose determination methods, a mainstream algorithm from the literature, and the proposed iterative PCA algorithm with three initial pose determination methods are compared in this study. The results show that while the first two algorithms fail to register, the proposed algorithm not only avoids falling into a local minimum but also improves speed and accuracy: it converges in 10 iterations, taking 19.427939 s, with a registration error of 2.1932, improving the overall registration performance.
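One PCA alignment step of the kind embedded in each iteration might look like the following sketch; the reflection handling is a crude illustration, and the correspondence search and convergence tests of the full ICP loop are omitted.

```python
# Sketch of one PCA alignment step: center both clouds, take the top-3
# principal axes of each, and rotate the moving cloud so its axes match
# the reference's.
import numpy as np

def pca_axes(P):
    """Rows of the returned 3x3 matrix are principal axes, largest first."""
    C = np.cov((P - P.mean(axis=0)).T)
    w, V = np.linalg.eigh(C)
    return V[:, np.argsort(w)[::-1]].T

def pca_align(src, ref):
    """Rotate/translate src so its principal axes coincide with ref's."""
    src_c = src - src.mean(axis=0)
    ref_c = ref - ref.mean(axis=0)
    A, B = pca_axes(src_c), pca_axes(ref_c)
    R = B.T @ A                        # maps src axes onto ref axes
    if np.linalg.det(R) < 0:           # avoid a reflection: flip one axis
        A[2] *= -1
        R = B.T @ A
    return src_c @ R.T + ref.mean(axis=0)
```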

    Nov. 25, 2022
  • Vol. 59 Issue 22 2211001 (2022)
  • Yingran Zhao, Keding Yan, and Shuwei Yang

A small Wi-Fi intelligent microscope system is proposed to overcome the traditional microscope's complicated operation, large size, and limited light source. A common network camera CMOS circuit is used as the microscope's imaging circuit, making reasonable use of the network camera's advantages. The communication module uses the AP function of the ESP8266 module for network communication between the computer and the microscope. Images can be transmitted to the host computer software, which sets the camera parameters and saves the images. The cost is low because the mechanical structure of the microscope is designed using NX12.0 software and three-dimensional printing technology. The whole system works in concert, and cell images are collected through the microscope.

    Nov. 25, 2022
  • Vol. 59 Issue 22 2211002 (2022)
  • Yuquan He, Yongjie Zhang, Guangqi Xie, Lingfei Duan, and Hongqiao Zhang

Recently, charge-coupled devices (CCDs) have been widely researched and applied. The imaging shape of a linear CCD is a straight line, which can efficiently scan a spatial plane. This paper proposes a method for target plane positioning using the accumulation characteristics of points on a spatial line at the same pixel position and the projection information of the origin of the spatial plane. For the obtained linear image, histogram equalization is used to enhance the contrast so that the pixel position of the measured target can be extracted stably. After obtaining the line-solving matrix and collecting the target's pixel position on each linear CCD, line equations passing through the target object are obtained, and finally its plane position is calculated using the least squares method. The results show that for the linear CCD module TSL1401, the average measurement error of the measurement system is about 0.19 cm and the standard deviation is about 0.09 cm in a 20 cm×20 cm measured area, proving the effectiveness of the proposed method.
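The final positioning step, intersecting the per-CCD lines in the least-squares sense, can be written compactly; the points and directions below are illustrative inputs, not measured data.

```python
# Sketch: each linear CCD yields one line through the target; the target
# position is the least-squares solution of the stacked equations n_i . x = n_i . p_i.
import numpy as np

def locate(points, directions):
    """points[i], directions[i]: a point on line i and its direction vector."""
    P = np.asarray(points, float)
    D = np.asarray(directions, float)
    N = np.stack([-D[:, 1], D[:, 0]], axis=1)       # normals to each line
    b = np.einsum("ij,ij->i", N, P)                 # n_i . p_i
    x, *_ = np.linalg.lstsq(N, b, rcond=None)
    return x                                        # (x, y) estimate

# Two lines crossing at (3, 2): y = 2 and x = 3.
print(locate([(0, 2), (3, 0)], [(1, 0), (0, 1)]))   # -> approx [3. 2.]
```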

    Nov. 25, 2022
  • Vol. 59 Issue 22 2212001 (2022)
  • Nana Fu, Daming Liu, Hengbo Zhang, and Xuandong Li

To achieve real-time human behavior recognition on an embedded platform, a human behavior recognition technique based on the lightweight OpenPose model is proposed. The approach starts from 18 human skeletal key points and infers the behavior type from their spatial positions. First, the lightweight OpenPose model is used to extract the coordinate information of the 18 skeletal key points of the human body. Then, key point coding is used to describe the human body behavior. Finally, a classifier is used to classify the acquired key point coordinates to detect the human behavior status; the system is deployed on Jetson Xavier NX equipment with a monocular camera for testing. Experimental results show that this method can quickly and accurately identify 11 types of human behaviors, such as walking, waving, and squatting, on the embedded development board Jetson Xavier NX, with an average recognition accuracy of 96.08% and a detection speed of more than 11 frame/s, a frame rate 177% higher than that of the original model.

    Nov. 25, 2022
  • Vol. 59 Issue 22 2215001 (2022)
  • Tao Long, Chang Su, and Jian Wang

Detecting salient key points in images and extracting feature descriptors are important components of computer vision tasks such as visual odometry and simultaneous localization and mapping systems. The main goal of a feature point extraction algorithm is to detect accurate key point positions and extract reliable feature descriptors. Reliable feature descriptors should remain stable against rotation, scale changes, illumination changes, viewing angle changes, noise, and so on. Because image information is lost during downsampling in recent deep learning-based feature point extraction algorithms, the reliability of the descriptors and the accuracy of feature matching are reduced. This study proposes a network structure that extracts detail-preserving oriented feature descriptors to solve this problem. The proposed network fuses shallow detail and deep semantic features to upsample the descriptors to a higher resolution. Combined with an attention mechanism, local (corners, lines, textures, etc.), semantic, and global features are used to improve the detection of feature points and the reliability of feature descriptors. Experiments on the Hpatches dataset show that the matching accuracy of the proposed method is 55.5%. Additionally, when the input image resolution is 480×640, the homography estimation accuracy of the proposed method is 5.9 percentage points higher than that of the existing method. These results demonstrate the effectiveness of the proposed method.

    Nov. 25, 2022
  • Vol. 59 Issue 22 2215002 (2022)
  • Yaoze Sun, and Junwei Gao

Aiming at the low accuracy and efficiency of wheel set tread defect detection for high-speed trains, an improved YOLOv5 algorithm is proposed to realize fast and accurate detection. A convolutional attention mechanism is introduced to optimize features in the channel and spatial dimensions so that essential target features occupy a greater proportion in network processing, enhancing feature learning in the target region. According to the size of the tread defect categories, the structure of the Neck is simplified, and the feature map branches suitable for detecting small- and medium-sized targets are retained to reduce the model's complexity. The bounding box regression loss function is changed to efficient intersection over union (EIoU) to integrate more bounding box information and improve prediction accuracy. Experimental results demonstrate that, compared with the original YOLOv5 algorithm, the mean average precision (mAP) of the enhanced algorithm on the test set increases by 5.5 percentage points and the detection speed improves by 2.8 frame/s, and the algorithm shows strong generalization ability in complex scenes.
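The EIoU regression loss has the standard form sketched below: the IoU term is augmented with normalized center-distance, width, and height penalties so that all four box degrees of freedom receive direct gradients. Corner-coordinate box encoding is assumed.

```python
# Sketch of the EIoU bounding-box loss; boxes are (x1, y1, x2, y2) rows.
import torch

def eiou_loss(pred, gt, eps=1e-7):
    ix1 = torch.max(pred[:, 0], gt[:, 0]); iy1 = torch.max(pred[:, 1], gt[:, 1])
    ix2 = torch.min(pred[:, 2], gt[:, 2]); iy2 = torch.min(pred[:, 3], gt[:, 3])
    inter = (ix2 - ix1).clamp(0) * (iy2 - iy1).clamp(0)
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_g = (gt[:, 2] - gt[:, 0]) * (gt[:, 3] - gt[:, 1])
    iou = inter / (area_p + area_g - inter + eps)

    # smallest enclosing box dimensions
    cw = torch.max(pred[:, 2], gt[:, 2]) - torch.min(pred[:, 0], gt[:, 0])
    ch = torch.max(pred[:, 3], gt[:, 3]) - torch.min(pred[:, 1], gt[:, 1])
    # squared center distance plus width/height differences
    dc = ((pred[:, 0] + pred[:, 2] - gt[:, 0] - gt[:, 2]) ** 2
          + (pred[:, 1] + pred[:, 3] - gt[:, 1] - gt[:, 3]) ** 2) / 4
    dw = (pred[:, 2] - pred[:, 0] - gt[:, 2] + gt[:, 0]) ** 2
    dh = (pred[:, 3] - pred[:, 1] - gt[:, 3] + gt[:, 1]) ** 2
    return (1 - iou + dc / (cw**2 + ch**2 + eps)
            + dw / (cw**2 + eps) + dh / (ch**2 + eps)).mean()
```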

    Nov. 25, 2022
  • Vol. 59 Issue 22 2215003 (2022)
  • Qingjiang Chen, and Yali Xie

To solve the low contrast problem of degraded underwater images, an underwater image enhancement algorithm based on a deep cascaded convolutional neural network is proposed. First, the degraded underwater image is converted from the traditional red, green, and blue (RGB) color space to the hue, saturation, and value (HSV) color space; the hue and value components are kept unchanged, and the cascaded convolutional neural network is employed to enhance the saturation component. New dense blocks are introduced in the encoding and decoding of the feature extraction network; each dense block combines residual connections, skip connections, and multiscale convolution to correct color distortion. The texture refinement network employs six texture refinement units to extract feature information from the refined image. Finally, the S-channel image produced by the cascaded convolutional neural network is combined with the H- and V-channel images to obtain the enhanced underwater image. The experimental findings reveal that the average underwater color image quality evaluation score of images enhanced using the proposed algorithm reaches 0.616875, and the average underwater image quality measure reaches 5.197000. Comparisons with other algorithms reveal that the proposed algorithm not only has a good enhancement effect but also ensures the enhanced images are in line with human vision.
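The color-space handling can be skeletonized as follows. The identity `enhance` function is a placeholder for the paper's cascaded network, which operates on the saturation channel only; a uint8 BGR input is assumed.

```python
# Skeleton: convert to HSV, enhance only the saturation channel, and
# recombine with the untouched hue and value channels.
import cv2
import numpy as np

def enhance(s):                 # placeholder for the cascaded CNN
    return s

def underwater_enhance(bgr):
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV).astype(np.float32)
    h, s, v = cv2.split(hsv)
    s = np.clip(enhance(s), 0, 255)               # network acts on S only
    out = cv2.merge([h, s, v]).astype(np.uint8)
    return cv2.cvtColor(out, cv2.COLOR_HSV2BGR)
```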

    Nov. 25, 2022
  • Vol. 59 Issue 22 2215004 (2022)
  • Haitao Yin, and Yongying Yue

To efficiently employ a small amount of labeled data, a medical image fusion network based on semisupervised learning and a generative adversarial network is developed. The fusion network comprises one generator and two discriminators. A semisupervised learning scheme is developed to train the network, including supervised training, unsupervised training, and parameter fine-tuning phases. Furthermore, the generator is constructed using a fusion-inspired U-Net with squeeze-and-excitation attention modules. Each discriminator contains three convolution layers, one fully connected layer, and a sigmoid activation function. Experimental findings on different multimodal medical images show that the proposed approach is competitive with six existing deep-learning-based approaches in terms of visual effects and objective indexes. Moreover, the ablation studies show the effectiveness of the semisupervised learning scheme, which enhances the quality of the fused images.

    Nov. 25, 2022
  • Vol. 59 Issue 22 2215005 (2022)
  • Gaojie Wang, Xiangyang Hao, and Shufeng Miao

Existing image matching algorithms in the field of visual navigation are primarily based on the similarity measure of descriptors. The large number of feature points required and the lack of consideration for the overall features of the image affect the real-time performance and reliability of image matching. To that end, this paper proposes an overall matching algorithm for image feature points based on clustering analysis. The algorithm performs distance-based cluster analysis on the set of feature points to filter out representative feature points with a high repetition rate, divides the target image and the image to be matched into four regions based on the distribution of feature points, randomly selects two feature points from each region to calculate the fundamental matrix, performs overall feature point matching based on the epipolar and position constraints, and checks the matching results based on the geometric similarity among the feature points. Images from the Technical University of Munich RGB-D dataset, unmanned aerial vehicles, and mobile robots are selected for the image matching test. The results show that the proposed algorithm has a high matching accuracy of 97.1% and an average matching time of less than 25 ms, which meets the requirements of real-time matching.

    Nov. 25, 2022
  • Vol. 59 Issue 22 2215006 (2022)
  • Zhijun Yu, Guodong Wang, and Xinyue Zhang

    Recently, multiscale and multistage image deblurring methods have encountered issues such as insufficient multiscale image feature extraction and loss of feature information due to stage deepening. To address the above problems, an image deblurring method based on an enhanced multiscale feature network is proposed in this paper. First, a multiscale residual feature extraction module is proposed, and convolution kernels with various sizes are used in the two branches to expand the receptive field and fully extract the feature information of images with various resolution sizes. Second, a cross-stage attention module is proposed to filter and transfer the key features of the image. Finally, a cross-stage feature fusion module, similar to a jump connection, is designed to compensate for feature loss and fuse feature information from input images with various sizes, to enrich spatial feature information, and to improve texture processing. Experimental results on the GoPro and HIDE datasets show that the proposed method can successfully reconstruct the image.

    Nov. 25, 2022
  • Vol. 59 Issue 22 2215007 (2022)
  • Xiaole Chen, Ruifeng Yang, and Chenxia Guo

Aiming at the low reliability and poor practicability of current optical fiber coil defect detection approaches, an optical fiber coil defect detection algorithm based on an enhanced low-rank representation model is proposed in this paper. The defect detection problem is modeled using low-rank representation theory: the defect-free optical fiber coil image is modeled as a low-rank structure, and defects are modeled as a sparse structure. Meanwhile, a Laplacian regularization constraint is incorporated into the low-rank representation model to widen the gap between defect and background. To enhance the efficiency of the algorithm, the idea of power iteration is employed to accelerate the singular value decomposition. The algorithm is confirmed via experiments, and the findings indicate that it has good detection performance for various defect types and attains the best performance compared with other algorithms.
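The power-iteration shortcut mentioned above recovers dominant singular triplets without a full SVD, which is enough to peel off the low-rank background one component at a time, for example:

```python
# Sketch: dominant singular value/vectors of A via power iteration on A^T A.
import numpy as np

def top_singular(A, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[1])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        v = A.T @ (A @ v)          # one power step on A^T A
        v /= np.linalg.norm(v)
    s = np.linalg.norm(A @ v)      # dominant singular value
    u = A @ v / s                  # left singular vector
    return u, s, v
```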

    Nov. 25, 2022
  • Vol. 59 Issue 22 2215008 (2022)
  • Lingzhi Yu, and Xifan Zhang

    Sketch-based image retrieval uses hand-drawn sketches as input to retrieve corresponding natural images, allowing users to draw and find desired natural images when no accurate query images are available. Edge maps are commonly used as an intermediate modality to bridge the domain gap between sketches and natural images. However, existing methods ignore the inherent relationship between edge maps and natural images. Based on the assumption that natural images and their corresponding edge maps have similar key regions, this paper proposes a deep learning model based on a cross-domain spatial co-attention network. The proposed model derives the shared spatial attention mask from the fused feature of the edge map and natural image, and it combines the loss function and auxiliary classifier for end-to-end training. When compared with existing representative sketch-based image retrieval methods, the proposed method can effectively extract the features of sketches and natural images, with mean average precision (mAP) values of 0.933 and 0.799 on the Sketchy and TU-Berlin datasets, respectively, outperforming most representative methods.

    Nov. 25, 2022
  • Vol. 59 Issue 22 2215009 (2022)
  • Zhiling Zhu, Zhifeng Zhou, Yong Zhao, Yongquan Wang, and Liduan Wang

This study proposes a multiobject tracking algorithm combining YOLO-V4 and an improved SiameseRPN to overcome the low accuracy of existing multiobject tracking algorithms. First, the tracking objects are automatically obtained using the YOLO-V4 network, and the created template is fed into the SiameseRPN tracking network. Then, an adaptive background strategy is adopted in the template branch to initialize the template, and the Siamese network is constructed with integrated residual connections. Finally, the results of YOLO-V4 and the improved SiameseRPN are associated through the Hungarian algorithm to achieve multiobject tracking. The experimental results show that the proposed algorithm has better tracking performance than other algorithms and can achieve stable tracking under object scale change, appearance change, and partial occlusion conditions.
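The Hungarian association step can be sketched with SciPy; the 1 − IoU cost below is a simple stand-in for whatever affinity the paper actually combines, and the gating threshold is illustrative.

```python
# Sketch: associate detections with track predictions by solving a cost
# matrix (1 - IoU) with the Hungarian algorithm.
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def associate(dets, tracks, max_cost=0.7):
    cost = np.array([[1 - iou(d, t) for t in tracks] for d in dets])
    rows, cols = linear_sum_assignment(cost)
    return [(r, c) for r, c in zip(rows, cols) if cost[r, c] < max_cost]
```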

    Nov. 25, 2022
  • Vol. 59 Issue 22 2215010 (2022)
  • Mengxue Pan, Ning Qu, Yeru Xia, Deyong Yang, Hongyu Wang, and Wenlong Liu

Given that breast cancer lesions occupy a relatively small proportion of the overall image, which affects the accuracy of early breast cancer detection, this study proposes a wide and deep residual neural network based on convolution residual blocks to restore the high-resolution features of breast cancer magnetic resonance images. The proposed method combines global and local residuals, allowing the top layer of the network to directly receive a substantial amount of low-frequency information. A convolution layer is added in front of each residual block for feature pre-extraction, and a sub-pixel convolution layer performs up-sampling to complete the reconstruction of the low-resolution image. Experiments on a dataset of 260 samples and comparisons with other methods reveal that the proposed network outperforms bicubic interpolation and other deep learning methods in the super-resolution of breast cancer magnetic resonance images.
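The sub-pixel upsampling tail can be illustrated with a PixelShuffle block in PyTorch; a convolution expands channels by r², then PixelShuffle rearranges them into an r-times larger image.

```python
# Sketch of sub-pixel convolution upsampling.
import torch.nn as nn

class SubPixelUp(nn.Module):
    def __init__(self, channels, r=2):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels * r * r, 3, padding=1),
            nn.PixelShuffle(r))   # (B, C*r*r, H, W) -> (B, C, r*H, r*W)

    def forward(self, x):
        return self.body(x)
```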

    Nov. 25, 2022
  • Vol. 59 Issue 22 2217001 (2022)
  • Wei Xiong, Ling Yue, Lei Zhou, Kai Zhang, and Lirong Li

Most current cross-modal pedestrian re-identification algorithms lack clustering ability, making it difficult to extract highly discriminative features; therefore, this paper proposes a multi-granularity cross-modal pedestrian re-identification algorithm. First, a nonlocal attention module is added to the Resnet50 backbone network to focus on the relationships between long-distance pixels and retain detailed information. Second, a multi-branch network is used to extract fine-grained feature information to improve the model's discriminative feature extraction ability. Finally, sample-based and center-based triplet losses are combined to supervise the training process and accelerate model convergence. The proposed method achieves Rank-1 accuracy and mean average precision (mAP) of 62.83% and 58.10%, respectively, in the full-search mode of the SYSU-MM01 dataset. In the visible-to-infrared mode of the RegDB dataset, Rank-1 and mAP reach 87.78% and 76.22%, respectively.

    Nov. 25, 2022
  • Vol. 59 Issue 22 2220001 (2022)
  • Shanen Yu, Wendong Yang, and Huajun Li

The structural design of optical tomography sensors is crucial for enhancing reconstructed image quality and reducing structural cost and system complexity. In this study, we investigate the reconstructed images under various distribution scenes and sensor structures. The findings reveal that both the distribution scene and the sensor structure have a significant effect on the reconstructed image quality. The parameter curves generated by changing the number of sensors in a scene follow a similar pattern, and the 30 × 30 sensor structure performs well in reconstructing images under various distribution scenes. Based on these findings and considering factors including sensor size, cost, and feasibility, a set of 30 × 30 laser photodiode optical tomography hardware devices is designed. Static optical tomography experiments with various distribution cross-section states are conducted using the device. The findings demonstrate that the hardware device can accurately reconstruct various distribution cross-section states.

    Nov. 25, 2022
  • Vol. 59 Issue 22 2228001 (2022)
  • Jiaqi Yao, Haoran Zhai, Ren Liu, Hong Zhu, Liuru Hu, and Xinming Tang

As a new generation of laser altimetry satellites, photon-counting laser altimetry satellites provide better data coverage density and altimetry accuracy. The accuracy of the final altimetry data is greatly influenced by forward scattering caused by clouds, aerosol layers, and blowing snow. The atmospheric detection algorithm of a photon-counting laser altimetry satellite differs considerably from that of traditional laser altimetry satellites. This study focuses on the related technologies of photon-counting laser altimetry satellite atmospheric detection, summarizes the two photon-counting systems, the Cloud-Aerosol Transport System (CATS) and ICESat-2, in terms of system parameters, atmospheric detection technology, and inversion of atmosphere-related parameters, and examines the difficulties in the atmospheric detection technology of existing photon-counting laser altimetry satellites. This research will promote the study of photon-counting lidars in China.

    Nov. 25, 2022
  • Vol. 59 Issue 22 2228002 (2022)
  • Jianqi Miao, Hongtao Wang, and Puguang Tian

This paper proposes an airborne light detection and ranging point cloud classification method that integrates a graph convolution model and PointNet to address the low classification accuracy caused by the lack of local feature description in the three-dimensional deep learning network PointNet. The method first determines the optimal neighborhood of each point using the minimum Shannon entropy criterion, and the shallow features of each point are then calculated and fed into the deep learning network. Second, the shallow features of the point clouds are used to derive local features via a graph convolution operation. Finally, the local features are combined with the point-based and global features extracted by PointNet to obtain the final feature vectors for classification. The proposed method is validated using the Vaihingen dataset provided by the International Society for Photogrammetry and Remote Sensing, and the experimental results show that the proposed method improves accuracy by 9.58 percentage points compared with the PointNet point cloud classification method.
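The minimum-Shannon-entropy neighborhood selection is commonly implemented over normalized covariance eigenvalues, for example as below; the candidate k values are illustrative assumptions.

```python
# Sketch: for candidate neighborhood sizes k, compute the Shannon entropy of
# the normalized covariance eigenvalues of the k nearest neighbors and keep
# the k that minimizes it (the most "decided" local geometry).
import numpy as np
from scipy.spatial import cKDTree

def optimal_k(points, idx, ks=(10, 20, 30, 50, 75, 100)):
    tree = cKDTree(points)
    best_k, best_e = ks[0], np.inf
    for k in ks:
        _, nn = tree.query(points[idx], k=k)
        nbrs = points[nn]                            # (k, 3) neighborhood
        w = np.linalg.eigvalsh(np.cov(nbrs.T))       # 3 covariance eigenvalues
        p = np.clip(w / w.sum(), 1e-12, None)        # normalize, guard log(0)
        e = -(p * np.log(p)).sum()                   # eigenentropy
        if e < best_e:
            best_k, best_e = k, e
    return best_k
```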

    Nov. 25, 2022
  • Vol. 59 Issue 22 2228003 (2022)
  • Xiaoyu Yang, and Xili Wang

Remote sensing image segmentation is a crucial application in the field of remote sensing image processing. A semantic segmentation network, MAE-Net, combining multiscale attention and edge supervision is proposed based on the deep learning network U-Net to address missed classification, missed segmentation, and inaccurate contour segmentation when convolutional neural networks segment buildings in remote sensing images. First, a multiscale attention module is introduced into each layer of the coding stage. The module splits the input feature map into groups with equal numbers of channels and employs convolution kernels of various sizes for feature extraction in each group; a channel attention mechanism is then used in each group to obtain more effective features through self-learning, solving the problem of inaccurate feature extraction for buildings of various sizes. Second, in the decoding stage, an edge extraction module is introduced to build the edge supervision network. The error between the learned edge and the edge label is supervised by the loss function to help the segmentation network better learn building edge features and make the segmented building boundaries more continuous and smoother. The experimental findings show that MAE-Net can completely segment buildings from remote sensing images with complicated, diverse backgrounds and large scale changes, with higher segmentation accuracy.

    Nov. 25, 2022
  • Vol. 59 Issue 22 2228004 (2022)
  • Xulun Liu, Shiping Ma, Linyuan He, Chen Wang, Xu He, and Zhe Chen

Addressing the challenge of low detection accuracy due to large differences in target scale and random direction distribution in remote sensing images, this study proposes a remote sensing object detection method based on a sparse mask Transformer. First, an angle parameter is added to the Transformer network to represent the rotational characteristics of remote sensing targets. Then, in the feature extraction section, a multi-level feature pyramid is employed as input to handle the large variations in target size in remote sensing images and enhance detection for targets of various scales, particularly small targets. Finally, the self-attention module is replaced with a sparse-interpolation attention module, which efficiently reduces the heavy computation of the Transformer network when detecting high-resolution images and accelerates network convergence during training. Detection results on the large-scale remote sensing dataset DOTA reveal that the proposed method achieves an average detection accuracy of 78.43% and a detection speed of 12.5 frame/s. Compared with traditional methods, the proposed method improves the mean average precision (mAP) by 3.07 percentage points, demonstrating its effectiveness.

    Nov. 25, 2022
  • Vol. 59 Issue 22 2228005 (2022)
  • Xirui Ma, Yueqian Shen, Jinhu Wang, and Teng Huang

Street lamp extraction is an important research direction for target extraction from point clouds obtained using vehicle-borne laser scanning. However, the upper and lower parts of street lamps often suffer adhesion and occlusion, making street lamp identification difficult. Considering that adhesion and occlusion are unlikely to occur in the middle of the lamp pole, the relative position relationship between the lamp burner and lamp pole is examined, and a hierarchical extraction approach for street lamp point clouds considering the relative distance is introduced. First, the original point cloud is divided into the lamp burner, lamp pole, and ground layers using the cloth simulation filtering (CSF) algorithm, and connected component analysis is used to cluster the lamp burner layer and lamp pole layer point clouds. Then, the lamp pole point cloud is extracted using the diagonal length of each clustering rectangle of the lamp pole layer and the area of the fitted circle. Finally, the lamp burner point cloud is estimated based on the relative distance between the lamp burner's center and the lamp pole's center to extract the entire street lamp. The proposed method was tested on three datasets: the extraction correctness, completeness, quality, and F1 value for data 1 are all 100%; the correctness, completeness, and F1 value for data 2 are 87.50% while the quality is 77.78%; and the completeness and quality for data 3 are 94.74%, the correctness is 100%, and the F1 value is 97.30%. The experimental findings illustrate that this method can efficiently recognize and extract street lamps.

    Nov. 25, 2022
  • Vol. 59 Issue 22 2228006 (2022)
  • Yunwei Pu, Taotao Liu, Haixiao Wu, and Jiang Guo

Radar emitter signal recognition is an important means of defeating the enemy on the battlefield. To solve the problems of incomplete characteristic parameters and low timeliness in manually extracted radar emitter signal features, and exploiting the unique role of the ambiguity function in characterizing the internal structure of a signal, this study proposes a recognition method that combines a convolutional bidirectional long short-term memory network with a transformation of the ambiguity function's main ridge coordinates. First, to amplify the differences between signals, the main ridge section is mathematically converted into a geometric image in the polar coordinate domain, which is used as the input of the neural network. Second, a convolutional neural network is designed to mine the feature information of the two-dimensional time-frequency map. Finally, a bidirectional long short-term memory network is built to classify and recognize the extracted features. Simulation results show that the proposed method maintains 100% accuracy when the signal-to-noise ratio is above 0 dB, the recognition rate reaches above 93.58% even at -6 dB, and the signal classification time is effectively shortened. The proposed method extracts hidden abstract features of the signal and offers good timeliness and noise resistance.

    Nov. 25, 2022
  • Vol. 59 Issue 22 2228007 (2022)
  • Xingchen Pan, Cheng Liu, Weigang Xiao, and Jianqiang Zhu

    The ptychographic iterative engine (PIE), a recently developed lens-less imaging technique, provides a reliable access and solution to the phase problem. It has superior convergence speed and accuracy compared with conventional coherent diffraction imaging technologies. PIE is used extensively in various imaging and measurement fields owing to its unlimited field-of-view, high resolution, and high robustness to noise. In this review, we discuss the background and principle of PIE and summarize the major technological advances. The primary milestones of PIE in X-ray, electron, and visible light over the past decade are also discussed. Furthermore, we also elaborate on the latest PIE-based techniques and potential future developments and challenges.
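For orientation, the abstract gives no equations, but the widely cited ePIE object-update rule, which typifies the PIE family, has the form

```latex
O_{j+1}(\mathbf{r}) = O_j(\mathbf{r})
  + \alpha \, \frac{P_j^{*}(\mathbf{r}-\mathbf{R}_s)}{\left|P_j\right|_{\max}^{2}}
  \left[\psi_j'(\mathbf{r}) - \psi_j(\mathbf{r})\right]
```

where P is the probe at scan position R_s, ψ and ψ′ are the exit waves before and after applying the measured Fourier-magnitude constraint, and α is a step size; a symmetric rule updates the probe. This is the textbook form rather than any specific variant discussed in the review.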

    Nov. 25, 2022
  • Vol. 59 Issue 22 2200001 (2022)
  • Zheng Sun, and Shuyan Wang

Intravascular optical coherence tomography (IVOCT) is a minimally invasive imaging modality that currently has the highest resolution. It provides information on the vascular lumen morphology and near-microscopic structures of the vessel wall. For each pullback of the target vessel, hundreds or thousands of B-scan images are obtained in routine clinical applications. Manual image analysis is time-consuming and laborious, and the findings depend to some extent on the operator's professional ability. Recently, as deep learning technology has continuously made significant breakthroughs in the medical imaging field, it has also been used for computer-aided automated analysis of IVOCT images. This study outlines the applications of deep learning in IVOCT, primarily involving image segmentation, tissue characterization, plaque classification, and object detection. The benefits and limitations of the existing approaches are discussed, and possible future developments are described.

    Nov. 25, 2022
  • Vol. 59 Issue 22 2200002 (2022)
  • Jinyong Zhang, Zongneng Xie, Ping Kong, and Yue Li

Speckle blood perfusion imaging is a medical imaging technique that monitors blood perfusion based on the scattering characteristics of red blood cells illuminated by a laser. Its benefits over traditional blood perfusion monitoring technologies include real-time imaging, high resolution, low cost, and the absence of contrast agents, particularly in surface blood flow monitoring. This review focuses on the basic principle of speckle blood perfusion imaging, technical advancements, and its application in monitoring fundus blood flow, cerebral blood flow, body surface microcirculation, and tumor blood perfusion, as well as its new application in monitoring angiogenesis of chicken embryo tumors.
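For reference, the quantity underlying the technique is the speckle contrast K; under a commonly used single-scattering model with a Lorentzian velocity distribution (an assumption, not a result of the review), it relates to the decorrelation time τ_c, and hence to flow speed, as

```latex
K = \frac{\sigma_I}{\langle I \rangle},
\qquad
K^{2}(T) = \frac{\tau_c}{2T}\left[1 - e^{-2T/\tau_c}\right]
```

where σ_I and ⟨I⟩ are the standard deviation and mean of the speckle intensity over a small window, T is the camera exposure time, and faster flow shortens τ_c and blurs the speckle, lowering K.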

    Nov. 25, 2022
  • Vol. 59 Issue 22 2200003 (2022)
  • Yunle Ding, Huiqin Wang, Ke Wang, Zhan Wang, and Gang Zhen

The classification and recognition of pigments is the basis of ancient mural protection and restoration. Multispectral imaging can quickly and nondestructively acquire spectral image data of mural pigments for analysis. Continuous convolution and pooling operations in traditional convolutional neural network feature extraction lose part of the feature information of the mural multispectral image, so image details cannot be reconstructed and the boundaries of the classified image are not smooth. To solve this problem, a three-dimensional dilated (hole) convolution residual neural network based on multiscale feature fusion is proposed to classify multispectral mural images. First, holes are introduced into the convolution kernel to enlarge the receptive field and extract information at different scales, avoiding the feature loss caused by pooling operations. Second, a feature fusion method is used to combine images of different scales. Finally, a multilevel gradient of the feature map is introduced to prevent edges from disappearing. Experimental results on a multispectral image dataset of simulated murals show that the proposed method's overall and average accuracies are 98.87% and 96.89%, respectively. The proposed method not only outperforms the control groups in classification accuracy but also produces classification images with clearer boundaries.

    Nov. 25, 2022
  • Vol. 59 Issue 22 2230001 (2022)
  • Changjie Liu, Haochuan Wang, Guoqing Wang, and Jinping Chen

Background subtraction is one of the most commonly used methods for moving target detection in video sequences. A background modeling method integrating image color and texture features is proposed to accurately and quickly build the background model of a video sequence and accurately detect the moving foreground. First, kernel density estimation and mode kernel density estimation methods are used to model the RGB color space and the Haar local binary pattern (HLBP) texture of the video images, yielding color and texture models. Then, the color and texture models are fused by normalization and two threshold judgments; with an appropriate threshold, the two models complement each other to form the background model. Finally, the background model is used to detect the moving foreground of the video sequence, and the background model is updated. The experimental results show that the proposed method works well with dynamic backgrounds and shadowed scenes. The proposed method's average F1-score on the test set is 0.8471, higher than that of common algorithms, and the average frame rate is 25.57 frame/s, meeting the real-time requirement.
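The color-side KDE test can be sketched per pixel as follows (an Elgammal-style formulation assumed for illustration); the bandwidth and threshold are placeholders, and the HLBP texture model, fusion, and model updating are omitted.

```python
# Sketch: probability of the current pixel color under a Gaussian KDE over
# N history frames; low probability marks foreground.
import numpy as np

def kde_foreground(history, frame, sigma=15.0, thresh=1e-6):
    """history: (N, H, W, 3) past frames; frame: (H, W, 3). uint8 assumed."""
    diff = history.astype(float) - frame.astype(float)    # (N, H, W, 3)
    k = np.exp(-(diff ** 2).sum(axis=3) / (2 * sigma ** 2))
    prob = k.mean(axis=0) / (2 * np.pi * sigma ** 2) ** 1.5
    return prob < thresh                                  # boolean foreground mask
```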

    Nov. 25, 2022
  • Vol. 59 Issue 22 2233002 (2022)