Photonics Research, Vol. 8, Issue 8, 1350 (2020)
Fast structured illumination microscopy via deep learning
This study shows that convolutional neural networks (CNNs) can be used to improve the performance of structured illumination microscopy, enabling it to reconstruct a super-resolution image from three raw frames instead of the standard nine. Owing to the isotropy of the fluorescent group, the correlation between the high-frequency information in each direction of the spectrum is learned by training the CNNs. A high-precision super-resolution image can thus be reconstructed from three image frames acquired in one direction. This allows for gentler super-resolution imaging at higher speeds and reduces phototoxicity during imaging.
1. INTRODUCTION
Fluorescence microscopy is an important tool in the life sciences for observing cells, tissues, and organisms. However, the Abbe diffraction limit [1] restricts its resolution to roughly half the wavelength of visible light, and a variety of super-resolution techniques have been developed to surpass this limit.
Owing to its low phototoxicity and high acquisition frame rate, structured illumination microscopy (SIM) stands out among these techniques for achieving optical super-resolution in bio-imaging.
To compute unknown frequencies from the raw data, SIM requires three images with shifted illumination patterns to separate the mixed spatial frequencies along a given orientation. To obtain an isotropic resolution enhancement, this process is repeated with illumination patterns at three different angles, so a total of nine raw images is needed per super-resolved (SR) SIM image, which means the sample must be exposed repeatedly. Reducing the number of raw images used in SIM reconstruction has therefore been an active research topic in recent years, and SR reconstruction from as few as three raw frames has been explored.
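For readers unfamiliar with why exactly three phase shifts per orientation are needed, the standard linear-SIM frequency-mixing relation is sketched here; the notation (modulation depth $m$, pattern vector $\mathbf{p}$, OTF $\tilde{H}$) is generic and not taken from the original text. With sinusoidal illumination $1 + m\cos(2\pi\,\mathbf{p}\cdot\mathbf{r} + \phi)$, the spectrum of each raw frame is

$$\tilde{D}_{\phi}(\mathbf{k}) = \left[\tilde{S}(\mathbf{k}) + \frac{m}{2}e^{i\phi}\,\tilde{S}(\mathbf{k}-\mathbf{p}) + \frac{m}{2}e^{-i\phi}\,\tilde{S}(\mathbf{k}+\mathbf{p})\right]\tilde{H}(\mathbf{k}),$$

where $\tilde{S}$ is the sample spectrum. Acquiring frames at three phases $\phi \in \{0, 2\pi/3, 4\pi/3\}$ yields three equations from which the three unknown components $\tilde{S}(\mathbf{k})$ and $\tilde{S}(\mathbf{k}\mp\mathbf{p})$ can be separated for one orientation; repeating this for three pattern orientations gives the nine raw frames of conventional SIM.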
Machine learning, and deep learning (DL) in particular, has recently emerged as a powerful data-driven route to solving inverse problems in imaging.
The DL framework does not explicitly use any model or prior knowledge, and instead relies on large datasets to “learn” the underlying inverse problem. Convolutional neural networks (CNNs) are among the most widely used DL architectures for such image-processing tasks.
In recent years, deep learning methods have been applied to super-resolution microscopic imaging across a range of modalities, from conventional optical microscopes to SIM.
This paper proposes a deep-learning-based framework to reconstruct SIM images using fewer frames than are currently required. A cycle-consistent generative adversarial network (CycleGAN) is used to reconstruct the super-resolution image (which we call 3_SIM) from three raw structured illumination (SI) images phase-shifted along a single direction (which we call 1d_SIM). Owing to the characteristics of the CycleGAN, the data in training sets A and B do not need to correspond one to one; the network can be trained without paired training data, which reduces the number of training steps needed and saves time. Our method does not require assumptions about the image-formation model and instead creates a super-resolved image directly from the raw data. It requires only three SI images in a given direction, from which a 1d_SIM image is reconstructed, and it then generates a 3_SIM image with a resolution comparable to that of traditional linear SIM methods. The method is parameter free, requires no expertise on the part of the user, is easy to apply to any SIM dataset, and does not rely on prior knowledge of the structure of the sample.
2. METHODS
A. Cycle-Consistent Generative Adversarial Networks
Generative adversarial networks (GANs) [28] consist of a generator and a discriminator trained in competition: the generator learns to produce images that the discriminator cannot distinguish from real samples, while the discriminator learns to tell generated images apart from real ones.
CycleGAN [29] extends this idea to unpaired image-to-image translation. It trains two generators and two discriminators together with a cycle-consistency constraint, so that an image translated from domain A to domain B and back to domain A should match the original image.
The generator consists of three parts: an encoder, a transformer, and a decoder. The encoder extracts features from an image using convolution layers. The transformer then combines nearby features of the image, using six ResNet blocks to transform the feature vectors of an image from domain A to domain B. The residual blocks in the transformer ensure that properties of the inputs to previous layers remain available to subsequent layers, so that the output does not deviate much from the original input; otherwise, the characteristics of the original image would not be retained and the results would be inaccurate. A primary aim of the transformer is to retain characteristics of the original input, such as the size and shape of objects, which makes residual networks a good fit for this kind of transformation. The decoding step is the exact opposite of encoding: it rebuilds low-level features from the feature vector by applying deconvolution layers.
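As a concrete illustration of this encoder–transformer–decoder layout, here is a minimal PyTorch sketch of a CycleGAN-style generator with six residual blocks. The channel widths, normalization layers, and activation choices are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class ResnetBlock(nn.Module):
    """Residual block used in the transformer stage; the skip connection keeps
    the characteristics of the input available to later layers."""
    def __init__(self, channels):
        super().__init__()
        self.block = nn.Sequential(
            nn.ReflectionPad2d(1), nn.Conv2d(channels, channels, 3),
            nn.InstanceNorm2d(channels), nn.ReLU(inplace=True),
            nn.ReflectionPad2d(1), nn.Conv2d(channels, channels, 3),
            nn.InstanceNorm2d(channels),
        )

    def forward(self, x):
        return x + self.block(x)

class Generator(nn.Module):
    """Encoder -> six ResNet blocks -> decoder (channel widths are assumptions)."""
    def __init__(self, in_ch=1, base=64, n_blocks=6):
        super().__init__()
        layers = [
            # encoder: convolutions extract features and downsample twice
            nn.ReflectionPad2d(3), nn.Conv2d(in_ch, base, 7),
            nn.InstanceNorm2d(base), nn.ReLU(inplace=True),
            nn.Conv2d(base, base * 2, 3, stride=2, padding=1),
            nn.InstanceNorm2d(base * 2), nn.ReLU(inplace=True),
            nn.Conv2d(base * 2, base * 4, 3, stride=2, padding=1),
            nn.InstanceNorm2d(base * 4), nn.ReLU(inplace=True),
        ]
        # transformer: residual blocks recombine features between domains
        layers += [ResnetBlock(base * 4) for _ in range(n_blocks)]
        layers += [
            # decoder: deconvolutions rebuild low-level features from the feature maps
            nn.ConvTranspose2d(base * 4, base * 2, 3, stride=2, padding=1, output_padding=1),
            nn.InstanceNorm2d(base * 2), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(base * 2, base, 3, stride=2, padding=1, output_padding=1),
            nn.InstanceNorm2d(base), nn.ReLU(inplace=True),
            nn.ReflectionPad2d(3), nn.Conv2d(base, in_ch, 7), nn.Tanh(),
        ]
        self.model = nn.Sequential(*layers)

    def forward(self, x):
        return self.model(x)
```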
For the discriminator, a patch-GAN architecture is used: rather than classifying the entire image as real or generated, it classifies overlapping patches of the image, which reduces the number of parameters and encourages sharp local detail.
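A minimal sketch of such a patch-based discriminator, again in PyTorch with assumed layer widths, might look as follows; it outputs a grid of per-patch real/fake scores rather than a single scalar.

```python
import torch.nn as nn

class PatchDiscriminator(nn.Module):
    """Patch-based discriminator: the output is a grid of scores, one per
    overlapping image patch, rather than a single real/fake value."""
    def __init__(self, in_ch=1, base=64):
        super().__init__()
        self.model = nn.Sequential(
            nn.Conv2d(in_ch, base, 4, stride=2, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(base, base * 2, 4, stride=2, padding=1),
            nn.InstanceNorm2d(base * 2), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(base * 2, base * 4, 4, stride=2, padding=1),
            nn.InstanceNorm2d(base * 4), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(base * 4, base * 8, 4, stride=1, padding=1),
            nn.InstanceNorm2d(base * 8), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(base * 8, 1, 4, stride=1, padding=1),   # per-patch score map
        )

    def forward(self, x):
        return self.model(x)
```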
B. Loss Function
The goal of CycleGAN is to use the given training samples to learn mapping functions between domains A and B by applying adversarial losses to them.
The generators A and B should eventually be able to fool the discriminators regarding the authenticity of the images they generate. This is achieved when the score the discriminator assigns to a generated image is as close to 1 as possible. Let generator $G_A$ map domain $A$ to domain $B$ (and $G_B$ map $B$ to $A$), let $D_A$ and $D_B$ be the corresponding discriminators, and let $a$ belong to domain $A$ and $b$ belong to domain $B$. The generators then seek to minimize
$$\mathcal{L}_{\mathrm{gen}} = \mathbb{E}_{a\sim A}\big[(D_B(G_A(a))-1)^2\big] + \mathbb{E}_{b\sim B}\big[(D_A(G_B(b))-1)^2\big].$$
The last and most important loss function is the cyclic loss, which captures whether an image can be recovered using the other generator. In the image-translation cycle of the networks, each image from domain $A$ should be brought back to the original image, so the difference between the original image and the cyclic image should be as small as possible:
$$\mathcal{L}_{\mathrm{cyc}} = \mathbb{E}_{a\sim A}\big[\lVert G_B(G_A(a))-a\rVert_1\big] + \mathbb{E}_{b\sim B}\big[\lVert G_A(G_B(b))-b\rVert_1\big].$$
The multiplicative factor $\lambda$ applied to cyc_loss assigns more importance to the cyclic loss than to the discrimination loss, and the CycleGAN total loss is
$$\mathcal{L} = \mathcal{L}_{\mathrm{gen}} + \lambda\,\mathcal{L}_{\mathrm{cyc}}.$$
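The losses above can be sketched in code as follows. This assumes the least-squares ("as close to 1 as possible") form of the adversarial term and an L1 cycle term; the weight `lam` is an assumed value, since the factor used in the paper is not recoverable from the text, and `G_A`, `G_B`, `D_A`, `D_B` refer to the hypothetical modules sketched earlier.

```python
import torch
import torch.nn.functional as F

def generator_losses(G_A, G_B, D_A, D_B, real_A, real_B, lam=10.0):
    """Least-squares adversarial loss plus L1 cycle-consistency loss.
    G_A maps domain A -> B, G_B maps B -> A; lam is an assumed weight."""
    fake_B = G_A(real_A)                       # 1d_SIM (A) -> 9_SIM (B)
    fake_A = G_B(real_B)                       # 9_SIM (B) -> 1d_SIM (A)

    # the generators try to push the discriminator scores for fakes toward 1
    pred_B, pred_A = D_B(fake_B), D_A(fake_A)
    adv = F.mse_loss(pred_B, torch.ones_like(pred_B)) + \
          F.mse_loss(pred_A, torch.ones_like(pred_A))

    # cyclic images should come back to the originals
    cyc = F.l1_loss(G_B(fake_B), real_A) + F.l1_loss(G_A(fake_A), real_B)

    return adv + lam * cyc
```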
C. Training
This paper uses 1d_SIM images (super-resolved in one direction) and 9_SIM images (super-resolved in three directions) as the datasets. The 1d_SIM images contain high-frequency information in only one direction. The CycleGAN is used to learn the missing high-frequency information from a large dataset. Using the trained model, the missing content of a 1d_SIM image is filled in and a super-resolution 3_SIM image is reconstructed.
To train the neural network [Fig. 1(a)], the 1d_SIM and 9_SIM images were used as two unpaired training datasets.
Figure 1.Schematics of the deep neural network trained for SIM imaging. (a) The inputs are 1d_SIM and 9_SIM images generated by nine lower-resolution raw images (using the SIM algorithm) as two training datasets with different training labels. The deep neural network features two generators and two discriminators. These generators and discriminators are trained by optimizing various parameters to minimize the adversarial loss between the network’s input and output as well as cycle consistency loss between the network’s input image and the corresponding cyclic image. The cyclic 9_SIM in the schematics is the final image (3_SIM) desired. (b) Detailed schematics of half of the CycleGAN training phase (generator 1d_SIM and discriminator 9_SIM). The generator consists of three parts: an encoder (which uses convolution layers to extract features from the input image), a converter (which uses residual blocks to combine different similar features of the image), and a decoder (which uses the deconvolution layer to restore the low-level features from the feature vector), realizing the functions of encoding, transformation, and decoding. The discriminator uses a 1D convolution layer to determine whether these features belong to that particular category. The other half of the CycleGAN training phase (generator 9_SIM and discriminator 1d_SIM) is the same as this.
Discriminators A and B take images from the 1d_SIM and 9_SIM datasets, respectively, and are trained by the loss function to identify generated images as outputs of the generators; if the discriminator recognizes an image as generated, that image is rejected. For generators A and B to have their generated images accepted by the discriminators, those images need to be very close to the original images, which is enforced by the generator (adversarial) loss defined above. The discriminators also need to be updated so that they can determine whether an output image is a raw image or one produced by a generator.
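A minimal training step implementing this alternating generator/discriminator update could look like the following sketch. The optimizer settings and data loaders (`loader_A`, `loader_B`) are assumptions for illustration, and `generator_losses` is the hypothetical helper from the previous sketch.

```python
import itertools
import torch
import torch.nn.functional as F

# Assumed setup: G_A, G_B, D_A, D_B are the modules sketched above, and
# loader_A / loader_B yield unpaired batches of 1d_SIM and 9_SIM images.
opt_G = torch.optim.Adam(itertools.chain(G_A.parameters(), G_B.parameters()),
                         lr=2e-4, betas=(0.5, 0.999))      # assumed hyperparameters
opt_D = torch.optim.Adam(itertools.chain(D_A.parameters(), D_B.parameters()),
                         lr=2e-4, betas=(0.5, 0.999))

def lsgan(pred, is_real):
    """Least-squares GAN objective: real patches -> 1, generated patches -> 0."""
    target = torch.ones_like(pred) if is_real else torch.zeros_like(pred)
    return F.mse_loss(pred, target)

for real_A, real_B in zip(loader_A, loader_B):
    # generator step: fool both discriminators while preserving cycle consistency
    opt_G.zero_grad()
    loss_G = generator_losses(G_A, G_B, D_A, D_B, real_A, real_B)
    loss_G.backward()
    opt_G.step()

    # discriminator step: accept raw images, reject generated ones
    opt_D.zero_grad()
    loss_D = (lsgan(D_B(real_B), True) + lsgan(D_B(G_A(real_A).detach()), False)
              + lsgan(D_A(real_A), True) + lsgan(D_A(G_B(real_B).detach()), False))
    loss_D.backward()
    opt_D.step()
```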
3. RESULTS
We validated the proposed method on both simulated and experimental data. To enable a quantitative comparison, the 1d_SIM and 9_SIM images were generated from the same raw datasets: the 1d_SIM images were reconstructed from three of the nine raw SI frames, and the 9_SIM images were reconstructed from all nine raw SI frames. To verify the effectiveness of the neural network on images with different features, we prepared three training datasets containing points, lines, and curves, respectively.
All datasets were generated in MATLAB. First, we generate random binary images, superpose the illumination patterns, and convolve with the point spread function (PSF) to obtain nine noise-free raw SI images. Each pixel represents 10 nm; the PSF is based on the first-order Bessel function, with the NA set to 1.5 and the wavelength set to 532 nm; the pattern vector is 18; and the modulation index is 0.8. We then reconstruct these raw SI images with the SIM algorithm to obtain the 1d_SIM and 9_SIM images used for training.
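A rough re-creation of this simulation pipeline in Python (not the authors' MATLAB code) is sketched below: a sparse random binary object is multiplied by sinusoidal illumination patterns at three orientations and three phases, then blurred with a Bessel-based PSF. The image size, object density, and pattern period are assumed values.

```python
import numpy as np
from scipy.special import j1                     # first-order Bessel function

N, px = 512, 10e-9                                # grid size (assumed) and 10 nm pixels
NA, lam, m = 1.5, 532e-9, 0.8                     # NA, wavelength, modulation index

x = (np.arange(N) - N // 2) * px
X, Y = np.meshgrid(x, x)
r = np.sqrt(X**2 + Y**2)
v = 2 * np.pi * NA / lam * r
v[v == 0] = 1e-12                                 # avoid division by zero at the origin
psf = (2 * j1(v) / v) ** 2                        # Airy-pattern PSF
psf /= psf.sum()
otf = np.fft.fft2(np.fft.ifftshift(psf))          # OTF for Fourier-domain blurring

sample = (np.random.rand(N, N) < 1e-3).astype(float)   # sparse random binary object

raw = []
period = 0.6 * lam / NA                           # assumed illumination period
for theta in (0.0, np.pi / 3, 2 * np.pi / 3):     # three pattern orientations
    kx, ky = np.cos(theta), np.sin(theta)
    for phase in (0.0, 2 * np.pi / 3, 4 * np.pi / 3):   # three phase shifts
        pattern = 1 + m * np.cos(2 * np.pi * (kx * X + ky * Y) / period + phase)
        img = np.real(np.fft.ifft2(np.fft.fft2(sample * pattern) * otf))
        raw.append(img)                           # nine noise-free raw SI frames
```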
The trained model was applied to a distinct set of SIM images generated by the same stochastic simulation. First, model performance was tested on the dataset of point images (Fig. 2).
Figure 2. Comparison of imaging modes on the database of point images. For all methods, the same nine raw SI images were used as the basis for processing. (a) The WF image was generated by summing all raw SI images. (b) The 1d_SIM image was generated from three raw SI images in one direction. (c) The 3_SIM image is the network output, compared with (d) the 9_SIM image reconstructed from all nine raw SI images.
To further quantify the improvement in resolution achieved by the CNN, more complex graphics were used to test the proposed method. Figure 3 shows the results for the dataset of line images.
Figure 3. Using deep learning to transform images in the dataset of lines from 1d_SIM to 9_SIM. (a) WF line image. (b) 1d_SIM line image used as the network input. (c) 3_SIM line image produced as the network output. (d) 9_SIM line image used for comparison. (e) Resolution achieved by the different approaches on the line images.
Figure 4 shows the corresponding results for the dataset of curves.
Figure 4.Deep learning-enabled transformation of images of curves from 1d_SIM to 9_SIM. (a) WF curve image. (b) 1d_SIM image of curves used as input to the neural network. (c) 3_SIM image that was the network output, compared to the (d) 9_SIM image.
The proposed method was also tested on a homemade total internal reflection structured illumination microscopy (TIRF-SIM) setup, shown in Fig. 5.
Figure 5. Experimental setup for TIRF-SIM. A laser beam with a wavelength of 532 nm was employed as the light source. After expansion, the light illuminated a digital micromirror device (DMD), which generated the structured illumination. A polarizer and a half-wave plate were used to rotate the polarization orientation, and a spatial mask was used to filter out excess frequency components. The generated structured illumination was tightly focused by a high-numerical-aperture (NA) oil-immersion objective lens (Olympus).
Fluorescent beads [labeled with Rhodamine 6G (R6G) molecules, Bangs Laboratories] with a nominal diameter of 100 nm were imaged using the SIM system. The microscope was equipped with an oil-immersion objective lens (Olympus), and the excitation wavelength was 532 nm. The peak emission wavelength of the fluorescence was 560 nm. An sCMOS (scientific complementary metal oxide semiconductor) camera (Hamamatsu, ORCA-flash 3.0) was used, with each pixel representing 65 nm in the sample plane.
Images of fixed size were used in the experiment, and the SIM algorithm (fairSIM ImageJ plugin) was used to reconstruct the 1d_SIM and 9_SIM images from the raw data.
The trained model was obtained after 10,000 iterations and was then applied to the 1d_SIM test data of the nanoparticles. As shown in Fig. 6, the network output (3_SIM) closely matches the 9_SIM reconstruction.
Figure 6. Comparison of the experimental results of deep learning [(c) 3_SIM] with (a) the WF image, (b) the 1-direction SIM (1d_SIM) image, and (d) the 9_SIM image. The wide-field image was generated by summing all raw images, the 1d_SIM image was reconstructed using three SI raw images in one direction, and the 9_SIM image was reconstructed from all nine raw images.
For a quantitative assessment of the quality of the images output by the network, the corresponding root mean square error (RMSE), peak signal-to-noise ratio (PSNR), and structural similarity (SSIM) index [40] were calculated.
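These metrics can be computed, for example, with NumPy and scikit-image; the sketch below assumes both images are normalized float arrays and is not tied to the authors' evaluation code.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(pred, target):
    """RMSE / PSNR / SSIM of a network output against a reference image.
    Assumes both are float arrays normalized to the range [0, 1]."""
    rmse = np.sqrt(np.mean((pred - target) ** 2))
    psnr = peak_signal_noise_ratio(target, pred, data_range=1.0)
    ssim = structural_similarity(target, pred, data_range=1.0)
    return rmse, psnr, ssim
```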
Deep learning can also be used to transform images directly from wide field (WF) to SIM. We therefore compared this WF-to-SIM approach (WF2SIM) with our 1d_SIM-to-SIM approach (1d_SIM2SIM).
As shown in Fig. 7, we analyzed the frequency spectra of the reconstructed images.
Figure 7. Fourier analysis of the reconstructed images. (a) Comparison of the frequency spectra of images with different numbers of Gaussian points. The frequency spectrum of the Gaussian points is highly symmetrical. (b) The different colors indicate different types of frequency information: the yellow area represents the frequency information of the original image, and the green area represents the information restored by the network. The grid in (b) represents the relationship between the available frequency information and the frequency information recovered by the network. (c) The Fourier transform of the reconstructed images.
We calculated the correlation of the spectral information over a length of $k_c$ (where $k_c$ is the OTF cutoff frequency) along the $k_x$ and $k_y$ directions.
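A rough stand-in for this directional spectral-correlation analysis is sketched below; sampling the spectrum only along the positive $k_x$ and $k_y$ axes is an assumption made for illustration.

```python
import numpy as np

def directional_spectrum_correlation(img, k_cut):
    """Correlation between spectral magnitude profiles along the kx and ky axes,
    sampled out to the OTF cutoff k_cut (given in pixels)."""
    spec = np.abs(np.fft.fftshift(np.fft.fft2(img)))
    cy, cx = spec.shape[0] // 2, spec.shape[1] // 2
    profile_x = spec[cy, cx:cx + k_cut]           # spectrum along the kx axis
    profile_y = spec[cy:cy + k_cut, cx]           # spectrum along the ky axis
    return np.corrcoef(profile_x, profile_y)[0, 1]
```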
In the WF2SIM process, the network has to recover all of the high-frequency information in the image [Fig. 7(b)], because the WF input contains none of it.
In the 1d_SIM2SIM process, some high-frequency information is already present in the images [Fig. 7(b)], so the network needs to restore only the components missing in the other directions, which makes the reconstruction more reliable.
Different models were trained using different numbers of images for the WF2SIM and 1d_SIM2SIM training datasets, and they were used to reconstruct the WF and 1d_SIM test images, respectively. Figures 8(b)–8(h) show the corresponding network outputs.
Figure 8. Comparing WF-to-9_SIM with 1d_SIM-to-9_SIM. (a) The 9_SIM image reconstructed from nine SI raw images. (b)–(d) Network outputs when 200, 500, and 900 image pairs (1d_SIM and 9_SIM) were used to train the network models. (e)–(h) Network outputs when 100, 200, 500, and 900 image pairs (WF and 9_SIM) were used as the training datasets. Each network underwent 10,000 iterations. Some details were not correctly restored by the WF-to-9_SIM training models; the arrows in (a)–(h) point to a missing detail.
4. CONCLUSION
Since the introduction of structured illumination microscopy, numerous algorithms have been developed to reconstruct super-resolved images from SI images. Considerable effort has been invested in reducing the number of raw SI frames, but the images produced by such approaches are often of poor quality and require parameter tuning.
This study proposed a fast, precise, and parameter-free method for super-resolution imaging using fewer SI frames. Unpaired simulated SIM images were used for unsupervised training of the CycleGAN network. The experiments showed that the CycleGAN used in this work performed well in generating a reconstructed SIM image from three raw SIM frames (3_SIM); the quality of the generated image was very similar to that of the original nine-frame SIM image (9_SIM). Images reconstructed from 1d_SIM inputs through the CNNs were of better quality than those reconstructed from WF images, and the 1d_SIM-to-9_SIM model also delivered better performance during training. In addition, recent studies have applied deep learning to super-resolution microscopy, as noted in the Introduction.
The central idea of the proposed technique is based on the observation that the SI image datasets contain a large amount of structural information. By the principle of ergodicity, the statistical information learned from such large dataset ensembles is sufficient to predict, from a 1d_SIM image, a 9_SIM image with high fidelity.
All images were blindly generated here by the deep network: that is, the input images were not previously seen by the network. Thus, the network can recover images by learning missing high-frequency information from large datasets, instead of merely replicating the images.
As a purely computational technique, the proposed method does not require any changes in current systems of microscopy and requires only standard 1d_SIM and 9_SIM images for training. Although different types of images need to be trained separately, the neural network used in our method enables us to complete the training efficiently. Once the model has been trained, it can be applied to new 1d_SIM images to rapidly generate a 9_SIM image. This approach can also be extended to nonlinear SIM to reduce the number of frames needed to render it suitable for bio-imaging.
Acknowledgment
L. Du acknowledges the support given by the Guangdong Special Support Program.
[1] E. Abbe. Contributions to the theory of the microscope and of microscopic perception. Arch. Mikrosk. Anat., 9, 413-468 (1873).
[12] E. Narimanov. Resolution limit of label-free far-field microscopy. Adv. Photon., 1, 056003 (2019).
[13] E. F. Fornasiero, K. Wicker, S. O. Rizzoli. Super-resolution fluorescence microscopy using structured illumination. Super-Resolution Microscopy Techniques in the Neurosciences, 133-165 (2014).
[28] I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, Y. Bengio. Generative adversarial nets. Proceedings of the 27th International Conference on Neural Information Processing Systems, 2672-2680 (2014).
[29] J.-Y. Zhu, T. Park, P. Isola, A. A. Efros. Unpaired image-to-image translation using cycle-consistent adversarial networks. IEEE International Conference on Computer Vision (ICCV), 2242-2251 (2017).
[30] M. Mirza, S. Osindero. Conditional generative adversarial nets (2014).
[31] L. A. Gatys, A. S. Ecker, M. Bethge. Image style transfer using convolutional neural networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2414-2423 (2016).
[32] P. Isola, J.-Y. Zhu, T. Zhou, A. A. Efros. Image-to-image translation with conditional adversarial networks. 30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 5967-5976 (2017).
[33] C. Li, M. Wand. Precomputed real-time texture synthesis with Markovian generative adversarial networks. Computer Vision—European Conference on Computer Vision (ECCV), 702-716 (2016).
[34] N. Sundaram, T. Brox, K. Keutzer. Dense point trajectories by GPU-accelerated large displacement optical flow. Computer Vision—European Conference on Computer Vision (ECCV), 438-451 (2010).
[35] C. Godard, O. Mac Aodha, G. J. Brostow. Unsupervised monocular depth estimation with left-right consistency. 30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 6602-6611 (2017).
[36] K. He, X. Zhang, S. Ren, J. Sun. Deep residual learning for image recognition. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 770-778 (2016).
[40] Z. Wang, E. P. Simoncelli, A. C. Bovik. Multi-scale structural similarity for image quality assessment. Conference Record of the 37th Asilomar Conference on Signals, Systems & Computers, 1398-1402 (2003).
Chang Ling, Chonglei Zhang, Mingqun Wang, Fanfei Meng, Luping Du, Xiaocong Yuan, "Fast structured illumination microscopy via deep learning," Photonics Res. 8, 1350 (2020)
Category: Imaging Systems, Microscopy, and Displays
Received: May 11, 2020
Accepted: Jun. 15, 2020
Published Online: Jul. 23, 2020
The Author Email: Chonglei Zhang (clzhang@szu.edu.cn), Luping Du (lpdu@szu.edu.cn), Xiaocong Yuan (xcyuan@szu.edu.cn)