Photonics Research, Volume 9, Issue 3, B45 (2021)

Monte Carlo simulation fused with target distribution modeling via deep reinforcement learning for automatic high-efficiency photon distribution estimation

Jianhui Ma¹, Zun Piao¹, Shuang Huang¹, Xiaoman Duan¹, Genggeng Qin¹, Linghong Zhou¹,²,*, and Yuan Xu¹,³,*
Author Affiliations
  • 1School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, China
  • 2e-mail: smart@smu.edu.cn
  • 3e-mail: yuanxu@smu.edu.cn
    Figures & Tables (16)
    Figure 1. Automatic scatter estimation framework. The MC algorithm generates raw scatter signals according to the X-ray source energy spectrum and the system geometry configuration. The DRL scheme (denoted by the dashed black arrow) employs a deep Q-network that interacts with the statistical distribution model to yield a satisfactory scatter image.
    Figure 2. Network architecture in the DDQN. The network takes a scatter image as input and predicts three possible actions for parameter adjustment. The numbers at the top denote the feature map size and channel number, and the operations for each layer are listed at the bottom. For instance, the first hidden layer convolves the input with 16 filters of size 3×3 and stride 4, followed by a rectified linear unit (ReLU) activation, and the output layer is a fully connected linear layer with three outputs. (A minimal code sketch of this architecture is given below, after the figure captions.)
    Figure 3. (a) Primary projection of the head and neck (H&N) patient; (b)–(i) raw scatter projections separately calculated by the MC particle sampling algorithm with 5×10⁵, 1×10⁶, 5×10⁶, 1×10⁷, 1×10⁸, 1×10⁹, 1×10¹⁰, and 1×10¹² source photons for the same projection angle.
    Figure 4. (a)–(g) Scatter images of Figs. 3(b)–3(h) smoothed by the over-relaxation smoothing algorithm; (h) corresponds to Fig. 3(i), which is treated as a noise-free scatter image and used as the ground truth.
    Figure 5. Intensity profiles of Fig. 4 along the (a) horizontal and (b) vertical directions, as denoted by the orange lines in Fig. 4(h).
    Figure 6. From top to bottom: six testing results with 5×10⁵, 1×10⁶, 5×10⁶, 1×10⁷, 1×10⁸, and 1×10⁹ source photons. From left to right: primary signals, smoothed scatter signals restored by the over-relaxation algorithm with empirical parameters, smoothed scatter signals restored by the proposed framework, and the ground truth.
    Figure 7. (a)–(d) Intensity profiles of the first, second, third, and last rows in Fig. 6. The profile locations are denoted by orange lines in the last column of Fig. 6.
    Figure 8. (a)–(c) Boxplots of the metric differences in SSIM, PSNR, and RAE between Empirical and ASEF for all testing cases, where metric_diff = metric_Empirical − metric_ASEF and metric denotes SSIM, PSNR, or RAE. (d) Boxplot of the SSIM comparison between Empirical and ASEF.
    Figure 9. Automatic scatter estimation process for a testing case. (a)–(c) Smoothed scatter images at Steps 1, 7, and 13, respectively. (d) and (e) separately plot the SSIM and RAE over steps.
    Figure 10. Different scatter images. From left to right: scatter projection input, the ground truth of the scatter image in the first column, and Grad-CAM heatmaps of the three subnetworks {W_k, W_ω, W_β}.
    Figure 11. From top to bottom: four prostate cases with 5×10⁵, 1×10⁶, 5×10⁶, and 1×10⁷ source photons. From left to right: primary signals, smoothed scatter signals restored by the over-relaxation algorithm with empirical parameters, smoothed scatter signals restored by the proposed framework, and the ground truth.
    Figure 12. (a)–(d) Intensity profiles of the four prostate cases presented in Fig. 11. Profile locations are outlined by orange lines in the last column of Fig. 11.
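    The caption of Fig. 2 fixes only a few architectural details: a scatter image as input, a first hidden layer of 16 filters of size 3×3 with stride 4 followed by ReLU, and a fully connected output layer with three outputs (one Q-value per action). The sketch below is one minimal way to realize such a subnetwork in PyTorch; the framework choice, intermediate layers, pooling, and channel widths are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch of one Q-subnetwork from the Fig. 2 caption. Only the first
# convolution (16 filters, 3x3, stride 4), the ReLU activations, and the
# 3-output fully connected head come from the caption; everything else is an
# assumption made to keep the example self-contained.
import torch
import torch.nn as nn

class QSubnetwork(nn.Module):
    def __init__(self, n_actions: int = 3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, stride=4), nn.ReLU(),   # stated first hidden layer
            nn.Conv2d(16, 32, kernel_size=3, stride=2), nn.ReLU(),  # assumed middle layers
            nn.Conv2d(32, 32, kernel_size=3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),                                # input-size-agnostic pooling
        )
        self.head = nn.Linear(32, n_actions)  # stated 3-output linear layer

    def forward(self, scatter_image: torch.Tensor) -> torch.Tensor:
        x = self.features(scatter_image)        # (N, 32, 1, 1)
        return self.head(torch.flatten(x, 1))   # Q-values for the three actions

# One such subnetwork per adjustable parameter, matching {W_k, W_omega, W_beta} in Table 1.
q_nets = {name: QSubnetwork() for name in ("k", "omega", "beta")}
```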
    • Table 1. DDQN Training Process

      1.  Initialize main network weights W and target network weights Ŵ
      2.  For episode = 1, 2, …, N_episode do
      3.    For projection = 1, 2, …, N_prj do
      4.      Initialize {k_0, ω_0, β_0}
      5.      Generate s_1 using Eq. (10) with {k_0, ω_0, β_0}
      6.      For t = 1, 2, …, N_step do
      7.        Randomly select one subnetwork from {W_k, W_ω, W_β}
      8.        With probability ε select action a_t randomly
      9.        Otherwise choose a_t = argmax_a [Q_π(s_t, a; W)]
      10.       Adjust parameters {k_t, ω_t, β_t} according to a_t
      11.       Generate s_{t+1} using Eq. (10) with {k_t, ω_t, β_t}
      12.       Compute reward r_t using Eq. (19)
      13.       Store the transition {s_t, a_t, r_t, s_{t+1}} in experience replay D
      14.       Randomly sample a mini-batch from D
      15.       Compute the gradient of the loss function in Eq. (17)
      16.       Update main network weights W = {W_k, W_ω, W_β}
      17.       Every N_update steps, set Ŵ = W
      18.      End For
      19.    End For
      20.  End For
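
      Read as code, Table 1 is a standard double-DQN loop: ε-greedy action selection, an experience replay buffer D, a main network updated by gradient descent, and a target network synchronized every N_update steps. The sketch below mirrors that structure in PyTorch under stated assumptions: the pieces tied to the paper's Eqs. (10), (17), and (19) (generating a smoothed scatter image from {k, ω, β} and computing the reward) are hidden behind a hypothetical `env` object, the per-step choice among the three subnetworks (step 7) is omitted, and a Huber loss stands in for the loss of Eq. (17). Default values follow Table 2 below.

```python
# Sketch of the Table 1 training loop (double DQN with experience replay).
# `q_net`/`target_net` are networks like QSubnetwork above; `env` is a
# hypothetical wrapper around Eqs. (10) and (19): env.reset() returns s_1 and
# env.step(a) returns (s_{t+1}, r_t).
import random
from collections import deque
import torch
import torch.nn.functional as F

def train_ddqn(q_net, target_net, env, optimizer,
               n_episode=100, n_prj=45, n_step=30, n_update=20,
               capacity=2000, batch_size=64, gamma=0.6, epsilon=0.1):
    replay = deque(maxlen=capacity)                    # experience replay D
    target_net.load_state_dict(q_net.state_dict())     # step 1: W_hat = W
    total_steps = 0
    for episode in range(n_episode):
        for prj in range(n_prj):
            state = env.reset()                        # s_1 from Eq. (10) with {k_0, w_0, b_0}
            for t in range(n_step):
                # steps 8-9: epsilon-greedy action selection
                if random.random() < epsilon:
                    action = random.randrange(3)
                else:
                    with torch.no_grad():
                        action = q_net(state).argmax(dim=1).item()
                next_state, reward = env.step(action)  # steps 10-12: Eqs. (10) and (19)
                replay.append((state, action, reward, next_state))
                state = next_state
                # steps 14-16: sample a mini-batch and update the main network
                if len(replay) >= batch_size:
                    s, a, r, s2 = zip(*random.sample(replay, batch_size))
                    s, s2 = torch.cat(s), torch.cat(s2)
                    a = torch.tensor(a)
                    r = torch.tensor(r, dtype=torch.float32)
                    with torch.no_grad():
                        # double-DQN target: main net selects the action, target net evaluates it
                        a_star = q_net(s2).argmax(dim=1, keepdim=True)
                        y = r + gamma * target_net(s2).gather(1, a_star).squeeze(1)
                    q = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
                    loss = F.smooth_l1_loss(q, y)       # stand-in for the Eq. (17) loss
                    optimizer.zero_grad()
                    loss.backward()
                    optimizer.step()
                total_steps += 1
                if total_steps % n_update == 0:         # step 17: W_hat = W
                    target_net.load_state_dict(q_net.state_dict())
```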
    • Table 2. Parameters in the DDQN Training Phase

      Parameter | Value | Description
      N_episode | 100 | Number of training episodes
      N_prj | 45 | Number of training projections
      N_step | 30 | Number of steps for each episode
      N_update | 20 | Number of steps for target network weights update
      D | 2000 | Capacity of experience replay memory
      ε | [0.01, 1] | Probability of random action in ε-greedy algorithm
      γ | 0.6 | Discount factor
      lr | 0.001 | Learning rate of gradient descent for main network
      N_batch | 64 | Mini-batch samples for network training
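
      Note that ε is listed as a range, [0.01, 1], which implies the exploration probability is annealed during training rather than kept constant. The exact decay rule is not given in this listing; a simple linear schedule over the N_episode × N_prj × N_step training steps of Table 2 could look like the sketch below (the linear form is an assumption), and its output would replace the fixed `epsilon` argument in the training sketch above.

```python
# Linear annealing of the exploration probability from 1.0 down to 0.01 over
# the training budget implied by Table 2 (N_episode * N_prj * N_step steps).
# The linear form is an assumption; only the [0.01, 1] range comes from the table.
def epsilon_at(step: int,
               eps_start: float = 1.0,
               eps_end: float = 0.01,
               decay_steps: int = 100 * 45 * 30) -> float:
    frac = min(step / decay_steps, 1.0)
    return eps_start + frac * (eps_end - eps_start)
```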
    • Table 3. SSIM, PSNR, and RAE Statistics among All Testing Cases (each entry is avg. ± std.; SSIM: 1 = best)

      Photon Number | SSIM, Empirical | SSIM, ASEF | PSNR (dB), Empirical | PSNR (dB), ASEF | RAE (%), Empirical | RAE (%), ASEF
      5×10⁵ | 0.79 ± 4.70×10⁻² | 0.94 ± 2.36×10⁻² | 21.54 ± 0.85 | 26.55 ± 1.34 | 12.03 ± 1.27×10⁻² | 5.62 ± 1.27×10⁻²
      1×10⁶ | 0.88 ± 3.73×10⁻² | 0.96 ± 1.67×10⁻² | 23.99 ± 0.72 | 29.05 ± 1.22 | 8.52 ± 9.65×10⁻³ | 4.22 ± 6.53×10⁻³
      5×10⁶ | 0.97 ± 8.83×10⁻³ | 0.99 ± 3.85×10⁻³ | 30.26 ± 0.91 | 33.76 ± 1.03 | 3.81 ± 4.69×10⁻³ | 2.42 ± 3.25×10⁻³
      1×10⁷ | 0.98 ± 4.31×10⁻³ | 0.99 ± 2.02×10⁻³ | 33.19 ± 0.83 | 36.05 ± 0.89 | 2.68 ± 3.14×10⁻³ | 1.87 ± 2.35×10⁻³
      1×10⁸ | 0.99 ± 4.96×10⁻⁴ | 0.99 ± 3.97×10⁻⁴ | 43.03 ± 0.82 | 43.96 ± 0.73 | 0.84 ± 9.31×10⁻⁴ | 0.74 ± 7.36×10⁻⁴
      1×10⁹ | 0.99 ± 4.84×10⁻⁵ | 0.99 ± 4.64×10⁻⁵ | 52.97 ± 0.91 | 53.12 ± 0.89 | 0.27 ± 3.26×10⁻⁴ | 0.26 ± 3.06×10⁻⁴
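
      For reference, the Table 3 metrics can be computed for a restored scatter image against the high-photon-count ground truth roughly as follows. SSIM and PSNR are taken from scikit-image; the RAE line uses one common definition (the sum of absolute errors divided by the sum of absolute reference values, in percent), since the paper's exact formula is not reproduced in this listing.

```python
# Rough evaluation of one restored scatter image against the ground truth.
# The RAE definition here is an assumption; SSIM and PSNR follow scikit-image.
import numpy as np
from skimage.metrics import structural_similarity, peak_signal_noise_ratio

def evaluate(restored: np.ndarray, reference: np.ndarray) -> dict:
    data_range = float(reference.max() - reference.min())
    return {
        "SSIM": structural_similarity(reference, restored, data_range=data_range),
        "PSNR_dB": peak_signal_noise_ratio(reference, restored, data_range=data_range),
        "RAE_%": 100.0 * np.abs(restored - reference).sum() / np.abs(reference).sum(),
    }
```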
    • Table 4. Computation Time for One Scatter Image of a Prostate Patient across Different Photon Numbers

      Photon Number | MC (s) | DRL (s) | Total (s)
      5×10⁵ | 0.43 | 8.98 | 9.41
      1×10⁶ | 0.45 | 4.80 | 5.25
      5×10⁶ | 0.57 | 1.94 | 2.51
      1×10⁷ | 0.83 | 0.98 | 1.81
      1×10⁸ | 5.94 | 0.32 | 6.26
      1×10⁹ | 60.00 | 0.29 | 60.29
      1×10¹⁰ | 633.95 | 0.29 | 634.24
      1×10¹¹ | 6402.60 | 0.29 | 6402.89
    Paper Information

    Special Issue: DEEP LEARNING IN PHOTONICS

    Received: Oct. 26, 2020

    Accepted: Dec. 21, 2020

    Published Online: Feb. 24, 2021

    Author e-mails: Linghong Zhou (smart@smu.edu.cn), Yuan Xu (yuanxu@smu.edu.cn)

    DOI: 10.1364/PRJ.413486
