Sparse coding, a method that decomposes a signal into a few elements of a dictionary, can uncover semantic information about images[
Journal of Semiconductors, Volume. 44, Issue 10, 104101(2023)
Forward stagewise regression with multilevel memristor for sparse coding
Sparse coding is a prevalent method for image inpainting and feature extraction, which can repair corrupted images or improve data processing efficiency, and has numerous applications in computer vision and signal processing. Recently, several memristor-based in-memory computing systems have been proposed to enhance the efficiency of sparse coding remarkably. However, the variations and low precision of the devices will deteriorate the dictionary, causing inevitable degradation in the accuracy and reliability of the application. In this work, a digital-analog hybrid memristive sparse coding system is proposed utilizing a multilevel Pt/Al2O3/AlOx/W memristor, which employs the forward stagewise regression algorithm: The approximate cosine distance calculation is conducted in the analog part to speed up the computation, followed by high-precision coefficient updates performed in the digital portion. We determine that four states of the aforementioned memristor are sufficient for the processing of natural images. Furthermore, through dynamic adjustment of the mapping ratio, the precision requirement for the digit-to-analog converters can be reduced to 4 bits. Compared to the previous system, our system achieves higher image reconstruction quality of the 38 dB peak-signal-to-noise ratio. Moreover, in the context of image inpainting, images containing 50% missing pixels can be restored with a reconstruction error of 0.0424 root-mean-squared error.
Introduction
Sparse coding, a method that decomposes a signal into a few elements of a dictionary, can uncover semantic information about images[
However, these memristive sparse coding systems are chosen to be as efficient as possible, regardless of the degradation of the dictionary and the corresponding detrimental effects on image reconstruction and feature extraction. In their system, a locally competitive algorithm is chosen for implementation. Then, to reduce the time complexity of the algorithm, both the forward and backward iterative operations of the algorithm are accelerated by analog operations. Next, the encoding results will be solely based on the dictionary held on the memristor array. However, the accuracy and reliability of the dictionary are not guaranteed. When the dictionary is stored on the memristor array, it deteriorates into various dictionaries at different terminals due to the accuracy and variability of the memristor[
In this work, to ensure the quality of sparse coding while accelerating it, we propose a digital-analog hybrid memristive sparse coding system (
Figure 1.(Color online) Schematic diagram of the digital-analog hybrid memristive sparse coding system. Input images can be divided into small patches and then represented by a few dictionary elements. The memristor array is used to store the approximate dictionary to calculate the cosine distance between the dictionary elements and the residual vector (image reconstruction error). The digital system then determines the most relevant dictionary element based on the result of the analog calculation, and that element at full precision becomes part of the reconstructed image.
Memristor-based FSR
The procedure of sparse coding involves dictionary construction and sparse representation. The constructed dictionary can be either the predefined dictionary or the learned dictionary. The FSR algorithm is utilized for achieving sparse representation. In this section, we illustrate the principle of accelerating the FSR algorithm by memristive IMC. Initially, we introduce the multilevel Pt/Al2O3/AlOx/W memristor, followed by the algorithm of the FSR. The evaluation of the memristor-based FSR is predicated on the performance of the aforementioned multilevel memristors. We then propose data mapping methods and finally demonstrate the digital-analog hybrid system.
Multilevel Pt/Al2O3/AlOx/W memristor
We fabricated a Pt/Al2O3/AlOx/W stacked memristor. After depositing a Ti adhesion layer on the Si/SiO2 substrate, we deposited a 100-nm Pt bottom electrode by direct current (dc) magnetron sputtering. Then, through the process of atomic layer deposition, a 3-nm Al2O3 layer was formed, followed by a 5-nm AlOx layer which was deposited via radio frequency magnetron sputtering. Finally, the 100-nm W top electrode was grown by dc magnetron sputtering and patterned by ultraviolet lithography with a size of 50 × 50 μm2. A sketch of the device structure is given in
Figure 2.(Color online) (a) A schematic of the device structure. (b) SEM image of the Pt/Al2O3/AlOx/W memristor. (c) XPS image of Al 2p and O 1s in the AlOx and Al2O3 layers. (d) 100 consecutive dc I−V curves with forming voltage about 4.8 V. (e) HRS and LRS distributions for 10 devices. (f) An instance of tuning the device conductance to reach a target conductance state of 60 μS with an error rate < 4% is demonstrated. The inset shows the write-verify method where a step voltage of ± 20 mV is employed. (g) Eight target conductance states are fine-tuned through the write-verify method, with < 4% variations. (h) Stable read distribution of each eight target conductance states at a dc reading voltage of 0.1 V. (i) Retention test over 3000 s of the same eight conductance states mentioned in
Algorithm of forward stagewise regression
FSR serves as a kind of L1 norm regression is widely used in processing sparse models[
where β stands for the coefficient estimates of variables x1, x2,···, xm. To model y as a linear combination of x1, x2,···, xm, the FSR updates the coefficient estimates β(k) by calculating the cosine distance to find the most relevant variable with the current residual y−ŷ, and then transfer the newly estimated ŷ to the next iteration. For the kth iteration, the processes can be expressed as follows:
where j stands for the index of the most related variable, and ε is the step size. Such iteration continues until the repeat time k reaches the set maximum or the fitting error ║y−ŷ║2 converges under a certain level. After the convergence, the stagewise estimate will follow the desired sparsity properties[
The flow chart of the FSR is illustrated in
Figure 3.(Color online) (a) Flow chart of the FSR. The sign of ε is dependent on the cosine similarity between the corresponding variable and residual vector. (b) Calculating the cosine distance between residual vector y−ŷ and variables x1, x2,···, xm by the memristor array. Each line of the array stores the values of a variable in the dataset and each element of the variable is represented by the conductance difference of two memristors. (c) Residual vectors mapped to the 4-bit scaling range. During the iteration, the numerical-voltage scaling ratio will be continuously decreased with the shrinking of the residual vector.
Data-mapping method
For the implementation of FSR on the memristor arrays, the variables x1, x2,···, xm and the residual vector y−ŷ are mapped as the device conductance and the input voltages, respectively. However, the dilemma of data mapping is that the precision of the data is usually above 8 bits, while the number of available memristor conductance states and the precision of the DAC are usually limited. To reduce the precision requirements of the in-memory sparse coding for conductance and voltage, two optimized data-mapping methods are proposed for the variables and residual vectors, respectively. For the variables, (1) standardize the data separately for each variable vector (z-score standardization)[
We note that the above mapping method may lead to severe truncation errors at several data points, because it does not use the maximum value as the upper limit of the mapping range. However, large truncation errors occur mainly for outliers, which are usually detrimental to regression analysis[
Hybrid digital-analog system
Here, we illustrate the operation process of the proposed hybrid digital-analog system. After the variables x1, x2,···, xm are stored on the memristor array in an approximate map-ping manner, the system initializes the regression coefficients β(0) and predicted values ŷ to zero and then starts iteration. During the iteration process, first, the residual vector y−ŷ is converted into a voltage vector according to the aforementioned method and applied to the array. Then, the outputs of the circuit are sampled by analog-digital converters (ADCs), and the results are the approximate cosine distances of the residual vector y−ŷ and variables x1, x2,···, xm, which can be used to find the most relevant variable to the current residual vector (
Memristor-based FSR for sparse coding
In this section, we demonstrate the use of memristor-based FSR to solve sparse coding tasks. During the process, the variable matrix X is the dictionary, the real value y is the image patch, and the fitting goal is to fit the image patch with the dictionary elements. We will first introduce the algorithm for sparse coding and then perform simulations to verify that the memristor-based FSR can accurately handle sparse coding tasks. The key step, calculating the cosine distance between the residual vector and dictionary elements, is implemented in a modeled memristor array, where the analog properties are simulated on the python platform. Additionally, considering that both the predefined and learned dictionaries are widely used, we employ both dictionaries to perform memristor-based sparse coding and compared the results.
Sparse coding algorithm
Sparse coding algorithms aim to find a linear combination of a small number of dictionary elements to represent the input signal. It can reduce the dimensionality of high-dimensional data, find essential features of signals, and extract semantic information of graphs. As such, it has a broad range of applications in computer vision and signal processing. Sparse coding usually contains two steps, namely, the construction of a dictionary and the sparse representation. Predefined dictionaries (DCT, Gabor) and learned dictionaries are widely used in this field[
where y is the image patch, X is the dictionary, β is the coefficient estimation of the dictionary elements, and q is the number of selected elements. However, since the L0-norm minimization is an NP-hard problem, the L1-norm minimization is a popular choice to replace it in sparse coding, as these two methods have the same effect when the solution is sparse enough[
Memristor-based sparse coding
Here we will evaluate the sparse representation performance of the memristor-based FSR with the DCT dictionary and learned dictionary. The over-determined DCT dictionary can be obtained by sampling the cosine wave at different frequencies[
Figure 4.(Color online) (a) The overdetermined DCT dictionary is mapped to the 128 × 256 memristor array. (b) Examples of the elements of the DCT dictionary. (c) Scheme of the original image (128 × 128). The image is divided into 8 × 8 patches for processing. (d) One patch in (c) to perform sparse coding with consideration of nonideal factors in a real circuit. (e) The dictionary element coefficient update path of (d). (f) Simulated reconstructed picture of (e), with consideration of nonideal factors in a real circuit. (g, h) In the case of adopting the DCT dictionary, the image reconstruction quality and sparsity of FSR under different thresholds (L0 is the average number of selected elements) with respect to (g) memristor-based FSR and (h) full-precision FSR.
In
The learned dictionary is obtained by a dictionary learning algorithm trained on natural images. Here, the 64 × 256 DCT dictionary is selected as the initial dictionary and the K-SVD algorithm[
Figure 5.(Color online) (a) Schemes of the natural pictures used to train the dictionary. (b) The offline-learned dictionary is mapped to 128 × 256 memristor array. (c) Examples of the elements of the learned dictionary. (d, e) In the case of adopting the offline-learned dictionary, the image reconstruction quality and sparsity of FSR under different thresholds with respect to (d) memristor-based FSR and (e) full-precision FSR.
Analysis of hardware nonidealities
To analyze the influence of the various hardware nonidealities on the system, we further conduct simulation analysis. The offline-learned dictionary is selected for the analysis according to the image reconstruction quality, and the threshold is set as in the condition when the reconstruction error (mean square error) is less than 0.0006 to balance the reconstruction quality and sparsity. As can be seen in
Figure 6.(Color online) (a) The influence of conductance precision on peak-signal-to-noise ratio (PSNR) and sparsity (L0). (b) The influence of DAC precision on PSNR and sparsity. (c) The influence of ADC precision on PSNR and sparsity. (d) The robustness analysis of PSNR with device variations. (e) The robustness analysis of the sparsity with device variations.
Image inpainting application
The purpose of image inpainting is to restore corrupted images, and it is commonly used to repair aged photos or image files that have been corrupted during data transfer. Sparse coding can capture semantic information from the remaining pixels of a corrupted image to fill in the missing parts. With the above hardware parameter design considerations, to demonstrate that the memristor-based FSR can be applied to a realistic image processing task, we apply it to complete image inpainting (a basic application of sparse coding[
Figure 7.(Color online) (a) The image inpainting task is performed using memristor-based sparse coding, where the array input voltage is the residual vector of remaining pixels. (b) Image restoration effect based on the DCT dictionary and learned dictionary, the middle one is based on the DCT dictionary and the right one is based on the learned dictionary.
Discussion
Finally, we now compare our study with similar works reported in the literature. Can et al. have developed a 128 × 64 memristor array for signal and image processing tasks[
|
Conclusion
In this work, we introduced a digital-analog hybrid memristive sparse coding system (memristor-based FSR), which can reduce the time complexity of the FSR from O(m × n) to O(m + n) by utilizing memristive IMC. Besides, the image reconstruction quality (PSNR) of memristive sparse coding can be improved to 38 dB. The hardware requirements of the system are low, which only requires moderate memristor conductance states (≥ 4), write variation (≤ 15%), read variation (≤ 15%) and DACs (≥ 4 bits). In the image inpainting task, this method has restored the image with 50% lost pixels, and the reconstruction error after a filling is 0.0424 RMSE. These results show that memristor-based FSR can accomplish sparse coding with high efficiency and high quality, and is promising for applications such as image recognition, anomaly detection, etc.
[4] H Lee, A Battle, R Raina et al. Efficient sparse coding algorithms, 801(2006).
[7] R J Tibshirani, B Yu. A general framework for fast stagewise algorithms. J Mach Learn Res, 16, 2543(2015).
[24] P J Rousseeuw, A M Leroy. Robust regression and outlier detection(2005).
Get Citation
Copy Citation Text
Chenxu Wu, Yibai Xue, Han Bao, Ling Yang, Jiancong Li, Jing Tian, Shengguang Ren, Yi Li, Xiangshui Miao. Forward stagewise regression with multilevel memristor for sparse coding[J]. Journal of Semiconductors, 2023, 44(10): 104101
Category: Articles
Received: Mar. 13, 2023
Accepted: --
Published Online: Dec. 26, 2023
The Author Email: