Acta Optica Sinica, Volume. 45, Issue 1, 0109002(2025)
Bio-Vision-Inspired Neural Network for Dynamic-Static Segmentation of Particle Holograms
Holographic imaging, widely used for detecting sol particles such as microalgae, pollen, and biological cells, allows us to reconstruct various images, such as amplitude, phase, and morphology, from recorded holograms. However, these reconstructions often suffer from interference caused by background fringes of static particles. In practical applications, static particles can adhere to the optical surfaces within the imaging pathway, leading to noisy images and reduced accuracy in detecting dynamic particles. Therefore, the accurate segmentation of dynamic and static particles is crucial to enable effective downstream tasks such as two-dimensional (2D) shape and phase imaging, or three-dimensional (3D) reconstructions of the particles. To address this challenge, we propose Hformer, a biologically inspired neural network based on the Transformer architecture, designed specifically for the dynamic-static particle segmentation problem in holographic imaging. The key innovation of Hformer is its ability to process both grayscale images and event data—mimicking the dual sensitivity of biological vision to light intensity and changes in light intensity over time. By integrating these two modalities and employing self-supervised learning, Hformer achieves high-quality segmentation of holograms containing overlapping dynamic and static targets, ensuring the preservation of high-frequency fringes necessary for subsequent reconstructions.
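The paper describes deriving event data from the grayscale frames; the exact generator is not specified at this level of detail, so the following is only a minimal sketch, assuming a simple log-intensity frame-difference scheme with a fixed threshold (the function name, the log transform, and the threshold value are illustrative assumptions, not Hformer's specific design).

```python
import numpy as np

def frames_to_events(frames, threshold=0.05):
    """Illustrative event generator: emit an ON (+1) or OFF (-1) event at
    each pixel whose log-intensity change between consecutive frames
    exceeds `threshold`; pixels below the threshold stay 0 (no event).
    This is a simplifying assumption, not the authors' exact scheme."""
    log_frames = np.log1p(frames.astype(np.float32))   # (T, H, W)
    diff = np.diff(log_frames, axis=0)                 # (T-1, H, W)
    events = np.zeros_like(diff, dtype=np.int8)
    events[diff > threshold] = 1                       # brightness increased
    events[diff < -threshold] = -1                     # brightness decreased
    return events

# Three consecutive 256x256 holograms yield two event maps
frames = np.random.rand(3, 256, 256)
print(frames_to_events(frames).shape)                  # (2, 256, 256)
```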
Hformer incorporates several key components: grayscale and event inputs, a spiking neural network (SNN), a Transformer-based architecture, dual decoders, and a self-supervised learning strategy. The input to the network consists of three consecutive grayscale images, which are combined into a three-channel image. Simultaneously, these grayscale images are processed by an event generator to produce event data, capturing the dynamic changes within the scene. Hformer uses an SNN to process the event data, mimicking the biological processing of visual information through discrete spikes. The SNN efficiently extracts features from the event data, which are then fused with the grayscale image features. The Transformer-based architecture captures long-range dependencies in the images, effectively integrating spatial and temporal information. Its key modules, such as the local-enhanced window (LeWin) module, multi-head self-attention (MSA), and layer normalization, enable efficient feature extraction and integration from both grayscale and event inputs. Hformer uses two independent decoders for dynamic and static particle segmentation. These decoders work in parallel, ensuring that the dynamic and static particle holograms are separated and reconstructed independently, preserving the distinct characteristics of each. Hformer adopts a self-supervised learning approach, generating pseudo-labels from the data itself, which makes it more practical for real-world applications where labeled training data is scarce. We evaluate the performance of Hformer through extensive experiments using both simulated and real holographic data. The real data includes holograms of pollen particles, while the simulated data is generated using a template-based method to mimic real-world holographic scenarios.
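To make the dataflow concrete, the following PyTorch skeleton illustrates the dual-input, dual-output design described above. It is a minimal sketch rather than the authors' implementation: the convolutional embeddings, the leaky integrate-and-fire (LIF) parameters, and the plain Transformer encoder standing in for the LeWin blocks are all simplifying assumptions.

```python
import torch
import torch.nn as nn

class LIFNeuron(nn.Module):
    """Minimal leaky integrate-and-fire unit: integrates its input over
    time steps, fires a binary spike when the membrane potential crosses
    a threshold, and hard-resets after each spike (surrogate-gradient
    training details are omitted for brevity)."""
    def __init__(self, tau=2.0, v_th=1.0):
        super().__init__()
        self.tau, self.v_th = tau, v_th

    def forward(self, x):                        # x: (T, B, C, H, W)
        v = torch.zeros_like(x[0])
        spikes = []
        for t in range(x.shape[0]):
            v = v + (x[t] - v) / self.tau        # leaky integration
            s = (v >= self.v_th).float()         # spike where threshold crossed
            v = v * (1.0 - s)                    # hard reset fired neurons
            spikes.append(s)
        return torch.stack(spikes).mean(0)       # rate-coded feature map

class HformerSketch(nn.Module):
    """Illustrative dual-input / dual-output skeleton: a grayscale branch,
    an SNN event branch, a Transformer encoder standing in for the LeWin
    blocks, and independent decoders for the dynamic and static holograms."""
    def __init__(self, dim=64):
        super().__init__()
        self.gray_embed = nn.Conv2d(3, dim, 3, padding=1)    # 3 stacked frames
        self.event_embed = nn.Conv2d(1, dim, 3, padding=1)   # per-step event map
        self.snn = LIFNeuron()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.dec_dynamic = nn.Conv2d(dim, 1, 3, padding=1)   # dynamic hologram
        self.dec_static = nn.Conv2d(dim, 1, 3, padding=1)    # static hologram

    def forward(self, gray, events):
        # gray: (B, 3, H, W); events: (T, B, 1, H, W)
        g = self.gray_embed(gray)
        e = self.event_embed(events.flatten(0, 1)).unflatten(0, events.shape[:2])
        f = g + self.snn(e)                                  # fuse modalities
        b, c, h, w = f.shape
        tokens = self.encoder(f.flatten(2).transpose(1, 2))  # (B, H*W, C)
        f = tokens.transpose(1, 2).reshape(b, c, h, w)
        return self.dec_dynamic(f), self.dec_static(f)

# Toy forward pass: three 64x64 frames yield two event maps (T=2)
gray = torch.rand(1, 3, 64, 64)
events = torch.rand(2, 1, 1, 64, 64)
dyn, sta = HformerSketch()(gray, events)
print(dyn.shape, sta.shape)                                  # (1, 1, 64, 64) each
```

Note that the actual LeWin blocks apply attention within local windows, which keeps the token count per attention call small; the global attention over all pixels used above is only for brevity.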
Through a series of ablation studies, we systematically remove various components of Hformer to analyze their influence on segmentation performance. The ablation experiments test three Hformer variants: one with the SNN module removed, one using only the grayscale input (Yformer), and a simpler version with single input and output branches (Uformer). The results show that removing the SNN causes a significant drop in segmentation accuracy, as measured by the structural similarity index measure (SSIM) of the reconstructed holograms, which confirms the necessity of event data for accurate dynamic-static segmentation. Likewise, using a single input (Yformer) or a single output (Uformer) leads to poorer performance, highlighting the importance of the dual-modal input and dual-output decoders. Furthermore, we demonstrate that directly reconstructing 2D shapes from the original holograms without segmentation often leads to significant distortions, particularly when dynamic and static particle holograms overlap. To assess self-supervised learning performance, the network is trained on one dataset and tested on three other datasets with different simulation parameters. The results show that the self-supervised model consistently outperforms traditional supervised learning approaches in terms of SSIM. The superior generalization of the self-supervised model can be attributed to its ability to learn the inherent structure of holographic data, preserving important details such as the high-frequency fringes in the holograms. This demonstrates the potential of self-supervised learning in applications where labeled data is limited or unavailable. In addition to simulated data, we also test Hformer on real holographic images of pollen particles. The results show that Hformer effectively segments dynamic particles from static backgrounds in a lensless holographic imaging system, confirming that it can handle real-world challenges such as noise and overlapping particles and making it a promising solution for particle holography.
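For reference, the SSIM comparison used in these evaluations can be computed with scikit-image; the helper below is a hypothetical scoring function (the name and the normalization of holograms to [0, 1] are assumptions for illustration).

```python
import numpy as np
from skimage.metrics import structural_similarity as ssim

def separation_scores(pred_dyn, pred_sta, gt_dyn, gt_sta):
    """Score a predicted dynamic/static hologram pair against ground truth
    with SSIM; all images are assumed to be floats normalized to [0, 1]."""
    return (ssim(gt_dyn, pred_dyn, data_range=1.0),
            ssim(gt_sta, pred_sta, data_range=1.0))

# Placeholder arrays standing in for reconstructed holograms
rng = np.random.default_rng(0)
gt = rng.random((256, 256))
print(separation_scores(gt, gt, gt, gt))   # (1.0, 1.0) on identical inputs
```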
For the dynamic-static segmentation of sol particle holograms, inspired by the dual sensitivity of biological vision to light intensity and its variations, we propose Hformer, a neural network that incorporates event data processing. The network is based on the Transformer architecture, features dual input branches (grayscale and event data) and dual output branches (dynamic and static holograms), and employs a self-supervised learning paradigm. Our results show that the biomimetic designs in Hformer, namely the event data input, the SNN module, and the independent dynamic-static decoders, all contribute to the accurate segmentation of overlapping dynamic and static particle holograms while preserving high-frequency fringes. Furthermore, the self-supervised learning approach adopted by Hformer not only simplifies data preparation compared with traditional supervised methods but also offers better transferability and generalization. Experimental results with aerosol pollen demonstrate that Hformer extracts dynamic sol particle holograms accurately and completely, making it a promising front-end algorithm for downstream tasks such as shape, phase, and 3D morphology reconstruction and the detection of flow-field particles.
Mingjie Tang, Jie Xu, Zhenxi Chen, Rui Xiong, Liyun Zhong, Xiaoxu Lü, Jindong Tian. Bio-Vision-Inspired Neural Network for Dynamic-Static Segmentation of Particle Holograms[J]. Acta Optica Sinica, 2025, 45(1): 0109002
Category: Holography
Received: Aug. 21, 2024
Accepted: Sep. 27, 2024
Published Online: Jan. 20, 2025
The Author Email: Xu Jie (xujie@gml.ac.cn), Tian Jindong (jindt@szu.edu.cn)