Journal of Semiconductors, Volume 45, Issue 9, 092401 (2024)
Multiframe-integrated, in-sensor computing using persistent photoconductivity
The utilization of processing capabilities within the detector holds significant promise for addressing energy consumption and latency challenges, especially in dynamic motion recognition tasks, where the generation of extensive information and the need for frame-by-frame analysis necessitate substantial data transfers. Herein, we present a novel approach for dynamic motion recognition, leveraging a spatial-temporal in-sensor computing system rooted in multiframe integration within the photodetector. Our approach introduces a retinomorphic MoS2 photodetector for motion detection and analysis. The device generates informative final states that nonlinearly embed both past and present frames. Subsequent multiply-accumulate (MAC) calculations, serving as the classifier, are then performed efficiently. When evaluating our devices for target detection and direction classification, we achieved a recognition accuracy of 93.5%. By eliminating the need for frame-by-frame analysis, our system not only achieves high precision but also facilitates energy-efficient in-sensor computing.
Introduction
The rapidly expanding volume of human daily data and the evolving machine vision systems powered by artificial intelligence have elevated the significance of dynamic motion recognition in diverse AI applications.
Figure 1. (Color online) (a) Schematic of a traditional frame-by-frame detecting system. The detector generates an output for subsequent computing in every single frame. (b) Schematic of multiframe-integrated, in-sensor computing using the persistent photoconductivity effect. The detector continuously senses multiple frames and generates only one final output state for analysis, which already memorizes the information of the past and current frames, both spatially and temporally. The final state is fed to the subsequent linear classifier, which serves as the readout layer.
To shift away from the traditional frame-by-frame analysis model, detectors must be capable of capturing images that incorporate spatial-temporal data covering multiple frames.
In this work, we employed a retinomorphic photodetector to develop a spatial-temporal in-sensor computing system, leveraging the principle of integrating multiple frames into one with a reservoir computing (RC) network (Fig. 1(b)).
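To make this picture concrete, a generic fading-memory node model is sketched below; it is an illustrative form assumed here, not an equation taken from this work. Each detector's state is updated frame by frame and is read out only once, by a trained linear layer.

```latex
% Illustrative reservoir-node model (assumed form, not from this work):
% x_k  -- photocurrent state after frame k,
% u_k  -- optical input of frame k,
% \lambda \in (0, 1)  -- retention factor set by the persistent photoconductivity,
% \eta  -- input coefficient (device responsivity),
% f(\cdot)  -- device nonlinearity,
% W_{\mathrm{out}}  -- the only trained weights (the linear readout).
x_k = f\!\left(\lambda\, x_{k-1} + \eta\, u_k\right), \qquad y = W_{\mathrm{out}}\, x_N
```

Only W_out is learned; the device physics supplies the fixed, nonlinear memory.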
Experiment
The Au/MoS2/Au retinomorphic photodetectors were fabricated from a CVD-grown MoS2 film on a sapphire substrate as follows. First, S1818 photoresist was spin-coated uniformly onto the MoS2/sapphire substrate, and the interdigitated electrode pattern was defined by lithography and development. Then, Au/Cr (45 nm/15 nm) was deposited by electron beam evaporation (EBE) to form the interdigitated electrodes, and the remaining photoresist was removed to expose the MoS2/sapphire substrate. Finally, the sample was annealed on a hot plate at 200 °C for 10 min.
To characterize the microstructural morphology and elemental composition of the retinomorphic MoS2 photodetector, scanning electron microscopy (SEM) imaging and wavelength-dispersive X-ray spectroscopy (WDS) measurements were performed using a field-emission electron probe micro-analyzer (EPMA, JXA-8530F Plus).
To examine the atomic structure of the MoS2 crystal, cross-sectional MoS2 layers were analyzed using a Talos F200X transmission electron microscope (Thermo Fisher Scientific).
To assess the chemical structure, phase, and morphology of the MoS2, Raman measurements were conducted using a LabRAM HR800 Raman spectrometer (Horiba Jobin Yvon) at 295 K with a 532 nm excitation wavelength.
X-ray photoelectron spectroscopy (XPS) analysis was performed using an ESCALAB 250Xi instrument to assess chemical states and elemental content qualitatively.
The persistent photoconductivity outputs of the retinomorphic MoS2 photodetector were assessed at room temperature using a Thorlabs-LP520-SF15 pigtailed laser diode and a Keysight B1500A semiconductor device analyzer.
To demonstrate that the photodetector can encode the optical information of several frames into a single frame, one detector was exposed to different sequences of light pulses, each representing different information, to generate distinguishable photocurrents; the final photocurrent states were recorded by the semiconductor device analyzer.
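A minimal sketch of this encoding idea is given below, assuming a toy fading-memory model of the device; the retention factor, response amplitude, and tanh saturation are illustrative assumptions, not fitted device parameters. It shows how the eight 3-bit pulse sequences end in distinguishable final states.

```python
import numpy as np

RETENTION = 0.6   # fraction of the photocurrent persisting into the next frame (assumed)
RESPONSE = 1.0    # photocurrent increment per light pulse (assumed, arbitrary units)

def final_state(bits: str) -> float:
    """Integrate a sequence of light pulses into one final photocurrent state."""
    state = 0.0
    for bit in bits:
        # fading memory of past frames plus the current pulse, with saturation
        state = float(np.tanh(RETENTION * state + RESPONSE * int(bit)))
    return state

for code in [f"{i:03b}" for i in range(8)]:
    print(code, round(final_state(code), 3))
```

Because later pulses decay less before readout, every pulse sequence maps to a different final state, which is the property exploited for temporal encoding.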
To demonstrate the feasibility of our in-sensor computing system for dynamic motion recognition, we used our MoS2 photodetector to simulate an 8 × 8 array with one device per pixel. We created an 8 × 8 map on which a car (the target) can move. Light pulses were used to mark the target location, and four frames depicting target motion in two possible directions (clockwise and anticlockwise) at varying speeds were presented continuously; the network was trained to discern the motion direction, as sketched below. For comparison, we performed the same task using a traditional fully connected (FC) network of comparable scale. The classifier and the output of the results were implemented on a computer.
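The sketch below illustrates one way such a simulation could be set up; the perimeter trajectory, speed range, and device parameters are assumptions for illustration, not the exact protocol of this work. A target steps along the border of an 8 × 8 grid, clockwise or anticlockwise, for four frames, and every pixel integrates its own light pulses into a single final state.

```python
import numpy as np

GRID = 8
RETENTION, RESPONSE = 0.6, 1.0                      # assumed device parameters

# Perimeter path of the 8 x 8 grid, ordered clockwise (an assumed trajectory).
_top = [(0, c) for c in range(GRID)]
_right = [(r, GRID - 1) for r in range(1, GRID)]
_bottom = [(GRID - 1, c) for c in range(GRID - 2, -1, -1)]
_left = [(r, 0) for r in range(GRID - 2, 0, -1)]
PERIMETER = _top + _right + _bottom + _left

def make_sample(clockwise, speed, start):
    """Integrate four frames of target motion into a 64-dim final-state vector."""
    state = np.zeros((GRID, GRID))
    pos = start
    step = speed if clockwise else -speed
    for _ in range(4):                                # four frames of motion
        frame = np.zeros((GRID, GRID))
        frame[PERIMETER[pos % len(PERIMETER)]] = 1.0  # light pulse at the target pixel
        state = np.tanh(RETENTION * state + RESPONSE * frame)
        pos += step
    return state.ravel()

def make_dataset(n, rng):
    samples, labels = [], []
    for _ in range(n):
        cw = bool(rng.integers(2))
        samples.append(make_sample(cw,
                                   speed=int(rng.integers(1, 4)),           # assumed speeds
                                   start=int(rng.integers(len(PERIMETER)))))
        labels.append(int(cw))
    return np.array(samples), np.array(labels)

rng = np.random.default_rng(0)
X_train, y_train = make_dataset(8000, rng)           # training-set size from the text
X_test, y_test = make_dataset(4000, rng)             # test-set size from the text
```

Each 64-dimensional final-state vector plays the role of one integrated "frame" that already contains the trajectory information of all four input frames.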
Results and discussion
Characteristics of the retinomorphic MoS2 photodetector
Figure 2. (Color online) (a) Schematic image of the retinomorphic MoS2 photodetector. (b) Scanning electron microscopy (SEM) image (left) and wavelength-dispersive X-ray spectroscopy (WDS) maps (sulfur and molybdenum are shown in red and green, respectively) of the retinomorphic MoS2 photodetector. Scale bar, 50 μm. (c) Raman spectra of the retinomorphic MoS2 photodetector. The inset shows a transmission electron microscopy (TEM) image of a cross-sectional MoS2 flake. Scale bar, 10 nm. (d) X-ray photoelectron spectroscopy (XPS) of the retinomorphic MoS2 photodetector. The inset shows an optical image of a 1 cm × 1 cm retinomorphic MoS2 photodetector array.
Figure 3. (Color online) (a) I−V characteristics of the retinomorphic MoS2 photodetector on a logarithmic scale. The inset shows the I−V curve on a linear scale. (b) Persistent photoconductivity effects observed in the retinomorphic MoS2 photodetector illuminated under laser pulses (520 nm, 10 mW). Pink rectangles: light on; blue rectangles: light off. (Inset: photocurrent of the Au/MoS2/Au device measured under illumination by light pulses of different powers (λ = 520 nm; 3, 5, 10, 12, and 14 mW laser power).) (c) 3-bit light pulse inputs ranging from "000" to "111", each with a pulse width of 100 ms and an interval of 900 ms, were used. (d) The resultant normalized photocurrent characteristics, including input−output feature extraction, were analyzed using the retinomorphic MoS2 photodetector.
Persistent photocurrent effect for encoding temporal information
Three continuous frames of light pulses are applied to the detector at a fixed frequency (10 Hz in our experiment), as shown in Fig. 3(c).
Dynamic motion recognition task
Based on the optoelectronic characteristics above, we further simulated an 8 × 8 detector array to demonstrate the feasibility of our in-sensor computing system for dynamic motion recognition, as shown in Fig. 4(a).
Figure 4. (Color online) (a) Schematic of the proposed task: the target can move in two directions (clockwise/anticlockwise), and light pulses irradiated onto the detector array (8 × 8 pixels in our simulation) represent the location of the target. (b) Schematic of the four heatmaps of the photocurrent of all detectors after each frame; the green ones refer to clockwise motion and the blue ones to anticlockwise motion. The darker a pixel is, the later a light pulse arrived there. (c) Evolution of the accuracy rates of the multiframe-integrated RC system and the traditional FC network within 100 epochs.
The readout layer contains 64 × 2 weights for classifying the two directions. We utilized 8000 training samples to train our in-sensor RC system and 4000 test samples for validation. Through simulation, we achieved a recognition accuracy of 93.5% after 100 epochs, surpassing the traditional FC network by 40% (Fig. 4(c)).
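A generic version of such a readout is sketched below: a plain 64 × 2 softmax classifier trained by full-batch gradient descent on the final-state vectors from the dataset sketch above. The learning rate and training details are assumptions, not the exact setup of this work.

```python
import numpy as np

def train_readout(X, y, epochs=100, lr=0.5):
    """Train a 64 x 2 softmax readout with full-batch gradient descent."""
    n, d = X.shape
    W = np.zeros((d, 2))                               # the 64 x 2 readout weights
    b = np.zeros(2)
    onehot = np.eye(2)[y]
    for _ in range(epochs):
        logits = X @ W + b
        logits -= logits.max(axis=1, keepdims=True)    # numerical stability
        p = np.exp(logits)
        p /= p.sum(axis=1, keepdims=True)
        grad = (p - onehot) / n                        # cross-entropy gradient
        W -= lr * X.T @ grad
        b -= lr * grad.sum(axis=0)
    return W, b

def accuracy(W, b, X, y):
    return float(np.mean((X @ W + b).argmax(axis=1) == y))

# X_train, y_train, X_test, y_test come from the dataset sketch above.
W, b = train_readout(X_train, y_train, epochs=100)
print("test accuracy:", accuracy(W, b, X_test, y_test))
```

Because the MAC operations of this readout are the only trained computation, the classification step remains lightweight compared with a frame-by-frame pipeline.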
Conclusion
In summary, we successfully applied a MoS2 photodetector to implement in-sensor computing, integrating multiple frames into one via persistent photoconductivity for dynamic motion recognition. The inherent persistent photoconductivity of the MoS2 photodetector allows spatial-temporal information to be embedded in a single frame, effectively reducing redundant data flow and simplifying dynamic visual tasks. Furthermore, unlike in traditional recurrent neural networks, the coupling weights in the reservoir are not trained; they are usually chosen randomly and globally scaled so that the network operates in a suitable dynamical regime. When detector devices are applied as the nodes of a reservoir, the responsivity of each device acts as the input coefficient of that node, and the weight of each node does not need to be the same, as long as it remains unchanged. Thus, the devices in a reservoir do not need to be homogeneous. The scalability of the two-terminal structure and the tolerance for heterogeneity in 2D material devices facilitate the addition of more devices per pixel, potentially enriching the reservoir states and enhancing network training, leading to improved accuracy at minimal cost. In conclusion, this in-sensor computing system serves as a prototype for highly energy-efficient dynamic machine vision applications.
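As a small illustration of this heterogeneity argument (with an assumed ±50% spread of responsivities), each pixel can carry its own fixed, randomly drawn responsivity that merely rescales that node's input; only the linear readout is ever trained.

```python
import numpy as np

rng = np.random.default_rng(1)
# Fixed device-to-device responsivity spread (an assumed variation, never trained).
responsivity = rng.uniform(0.5, 1.5, size=(8, 8))

def integrate_frames(frames, retention=0.6):
    """Integrate a sequence of 8 x 8 light-pulse frames with heterogeneous pixels."""
    state = np.zeros((8, 8))
    for frame in frames:
        # Each pixel's responsivity only rescales its own input; it stays constant,
        # so no device matching or per-device calibration is required.
        state = np.tanh(retention * state + responsivity * frame)
    return state.ravel()                # 64 final states for the trained linear readout
```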
Xiaoyong Jiang, Minrui Ye, Yunhai Li, Xiao Fu, Tangxin Li, Qixiao Zhao, Jinjin Wang, Tao Zhang, Jinshui Miao, Zengguang Cheng. Multiframe-integrated, in-sensor computing using persistent photoconductivity[J]. Journal of Semiconductors, 2024, 45(9): 092401
Category: Articles
Received: Apr. 1, 2024
Accepted: --
Published Online: Oct. 11, 2024
The Author Email: Fu Xiao (XFu), Cheng Zengguang (ZGCheng)