NUCLEAR TECHNIQUES, Volume 46, Issue 3, 030101 (2023)

X-ray crystallography experimental data screening based on convolutional neural network algorithms

Zi HUI¹, Li YU²,³, Huan ZHOU⁴, Lin TANG¹, and Jianhua HE¹,*
Author Affiliations
  • ¹ The Institute for Advanced Studies, Wuhan University, Wuhan 430072, China
  • ² Shanghai Institute of Applied Physics, Chinese Academy of Sciences, Shanghai 201800, China
  • ³ University of Chinese Academy of Sciences, Beijing 100049, China
  • ⁴ Shanghai Advanced Research Institute, Chinese Academy of Sciences, Shanghai 201204, China
    Figures & Tables(15)
    Comparison of LN83 diffraction pattern before (a) and after (b) gray value equalization
    Diffraction pattern of protein crystal after gray value equalization
    LN83 diffraction pattern image augmentation results: (a) original image, (b) horizontal flip, (c) rotated 90° counterclockwise, (d) rotated 25° counterclockwise and shifted 10 pixels to the right, (e) rotated 110° clockwise and shifted 5 pixels to the right and 5 pixels down, (f) rotated 60° clockwise
    Flow chart of convolutional neural network for training and prediction
    Accuracy and running rate of the verification set and test set for different networks: (a) verification set accuracy, (b) test set accuracy, (c) verification set running rate, (d) test set running rate
    t-SNE dimensionality reduction results of six convolutional neural networks (circles are "maybe" samples, crosses are "miss" samples, and pentagrams are "hit" samples): (a) MobileNets, (b) ResNet, (c) Inception-v1, (d) Inception-v3, (e) Vgg16, (f) AlexNet
    Running rate of LN83 on GPU and CPU
    Reliability distribution of MobileNets predictions: (a) hit/maybe samples, (b) miss samples
    Samples selected by MobileNets: (a) hit, (b) maybe, (c) miss
    • Table 1. Experimental data

      Dataset | Protein        | Incident energy / keV | Instrument | Detector
      LN83    | Hydrogenase    | 9.498                 | MFX        | Rayonix
      LN84    | Photosystem II | 9.516                 | MFX        | Rayonix
      LO19    | Cyclophilin A  | 9.442                 | MFX        | Rayonix
      L498    | Thermolysin    | 9.773                 | CXI        | CSPAD
    • Table 2. Data classification

      Data type | Number of Bragg points (X) | Effective information content
      Hit       | X ≥ 10                     | More valid information
      Maybe     | 4 ≤ X < 10                 | Less valid information
      Miss      | X ≤ 3                      | Lacks valid information
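
The thresholds in Table 2 can be read as a simple labeling rule on the Bragg-point count X. The following is a minimal sketch under that reading only; the function name `label_from_bragg_count` is hypothetical, and how the Bragg points themselves are counted is outside its scope.

```python
def label_from_bragg_count(n_bragg: int) -> str:
    """Map a Bragg-point count X to a Table 2 class label.

    Thresholds follow Table 2: hit (X >= 10), maybe (4 <= X < 10), miss (X <= 3).
    The function name is illustrative and not taken from the article.
    """
    if n_bragg < 0:
        raise ValueError("Bragg-point count cannot be negative")
    if n_bragg >= 10:
        return "hit"
    if n_bragg >= 4:
        return "maybe"
    return "miss"


if __name__ == "__main__":
    # Print the label at each class boundary.
    for count in (0, 3, 4, 9, 10, 25):
        print(count, label_from_bragg_count(count))
```

Because the counts are integers, "X ≤ 3" and "X < 4" describe the same boundary, so the three classes cover every non-negative count without overlap.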
    • Table 3. Six convolutional neural networks

      Net           | Depth / layers | Characteristic
      AlexNet       | 8              | Few layers; uses the ReLU activation function
      Vgg16         | 16             | Small convolution kernels; faster convergence
      Inception-V1  | 22             | Parallel computation; fully connected layers removed
      Inception-V3  | 46             | Parallel computation; convolutions split to reduce data size
      ResNet101     | 101            | Residual network used to optimize the learning objective
      MobileNets-V1 | 28             | Separable convolutions; global hyperparameters introduced
    • Table 4. Verification set and test set accuracy of each sample based on MobileNets

      Sample                | Verification set accuracy / % | Test set accuracy
      L498 (Thermolysin)    | 62.2                          | 7/10
      LN84 (Photosystem II) | 82.3                          | 8/10
      LN83 (Hydrogenase)    | 81.8                          | 8/10
      LO19 (Cyclophilin A)  | 78.0                          | 9/10
    • Table 5. Accuracy of verification set and test set using different networks based on LN83

      Columns Hit, Maybe and Miss refer to the LN83 (Hydrogenase) dataset.

      Network      | Label | Hit   | Maybe | Miss
      MobileNets   | Hit   | 0.919 | 0.070 | 0.011
                   | Maybe | 0.168 | 0.701 | 0.131
                   | Miss  | 0.014 | 0.043 | 0.943
      Inception-v1 | Hit   | 0.935 | 0.043 | 0.022
                   | Maybe | 0.350 | 0.416 | 0.234
                   | Miss  | 0.008 | 0.028 | 0.964
      Inception-v3 | Hit   | 0.958 | 0.029 | 0.013
                   | Maybe | 0.547 | 0.343 | 0.109
                   | Miss  | 0.058 | 0.202 | 0.740
      Vgg16        | Hit   | 0.893 | 0.086 | 0.021
                   | Maybe | 0.073 | 0.876 | 0.051
                   | Miss  | 0.020 | 0.141 | 0.840
      ResNet       | Hit   | 0.854 | 0.084 | 0.063
                   | Maybe | 0.015 | 0.518 | 0.467
                   | Miss  | 0.001 | 0.004 | 0.995
      AlexNet      | Hit   | 0.907 | 0.014 | 0.079
                   | Maybe | 0.927 | 0.022 | 0.051
                   | Miss  | 0.509 | 0.016 | 0.475
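
One way to read the Table 5 entries is as row-normalized confusion matrices. This assumes that each row is the labelled class and each column the predicted class, an orientation this page does not state explicitly. Under that assumption the diagonal gives the per-class accuracy. The sketch below uses only the MobileNets values from Table 5; the helper names are hypothetical.

```python
LABELS = ["hit", "maybe", "miss"]

# MobileNets block of Table 5.
# Assumption: row = labelled class, column = predicted class.
MOBILENETS = [
    [0.919, 0.070, 0.011],  # labelled hit
    [0.168, 0.701, 0.131],  # labelled maybe
    [0.014, 0.043, 0.943],  # labelled miss
]


def rows_are_normalized(matrix, tol=0.005):
    """Check that every row sums to 1 within a rounding tolerance."""
    return all(abs(sum(row) - 1.0) <= tol for row in matrix)


def per_class_accuracy(matrix):
    """Return the diagonal entry for each class label."""
    return {label: matrix[i][i] for i, label in enumerate(LABELS)}


acc = per_class_accuracy(MOBILENETS)
# Unweighted mean over the three classes (ignores class sizes, which are not given here).
macro = sum(acc.values()) / len(acc)

if __name__ == "__main__":
    print("rows normalized:", rows_are_normalized(MOBILENETS))
    for label, value in acc.items():
        print(f"{label}: {value:.3f}")
    print(f"unweighted mean: {macro:.3f}")
```

The unweighted mean is only a rough summary: without the number of samples per class, an overall accuracy cannot be recovered from the table.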
    • Table 6. Two-class accuracy based on the LN83 sample

      Network      | Hit/maybe | Miss
      MobileNets   | 0.970     | 0.943
      Inception-V1 | 0.944     | 0.964
      Inception-V3 | 0.972     | 0.740
      Vgg16        | 0.974     | 0.840
      ResNet       | 0.873     | 0.955
      AlexNet      | 0.925     | 0.475
    Get Citation

    Zi HUI, Li YU, Huan ZHOU, Lin TANG, Jianhua HE. X-ray crystallography experimental data screening based on convolutional neural network algorithms[J]. NUCLEAR TECHNIQUES, 2023, 46(3): 030101

    Paper Information

    Category: Research Articles

    Received: Oct. 28, 2022

    Accepted: --

    Published Online: Apr. 17, 2023

    The Author Email:

    DOI:10.11889/j.0253-3219.2023.hjs.46.030101
