Spectroscopy and Spectral Analysis, Vol. 32, Issue 9, 2399 (2012)
Parallel PLS Algorithm Using MapReduce and Its Application in Spectral Modeling
Partial least squares (PLS) is widely used in spectral analysis and modeling, but it is computationally intensive and time-consuming on massive data sets. To address this problem, a novel parallel PLS algorithm based on MapReduce is proposed. It consists of two procedures: parallelized data standardization and parallelized principal component computation. Using NIR spectral modeling as an example, experiments were conducted on a Hadoop cluster built from ordinary computers. The results show that the proposed parallel PLS algorithm can handle massive spectral data sets, significantly reduces modeling time, achieves approximately linear speedup, and scales up easily.
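The abstract does not describe how the standardization step is split into map and reduce tasks. As a minimal sketch of one common way to do it, the plain-Python example below (not Hadoop code, and not the authors' implementation) has each map task emit partial column sums for its block of spectra, a reduce step merge them into global means and standard deviations, and a second map pass scale each block. All function names and the toy data are assumptions made for illustration.

```python
from functools import reduce
from math import sqrt

def map_partial_stats(chunk):
    """Map step (hypothetical): row count, column sums, column sums of squares."""
    n_cols = len(chunk[0])
    sums = [sum(row[j] for row in chunk) for j in range(n_cols)]
    sq_sums = [sum(row[j] ** 2 for row in chunk) for j in range(n_cols)]
    return len(chunk), sums, sq_sums

def reduce_partial_stats(a, b):
    """Reduce step: merge two partial results by element-wise addition."""
    return (
        a[0] + b[0],
        [x + y for x, y in zip(a[1], b[1])],
        [x + y for x, y in zip(a[2], b[2])],
    )

def column_mean_std(chunks):
    """Combine all partial results into global column means and sample std devs."""
    n, sums, sq_sums = reduce(reduce_partial_stats, map(map_partial_stats, chunks))
    means = [s / n for s in sums]
    # Sample standard deviation (n - 1 denominator), from the sum of squares.
    stds = [sqrt((sq - n * m * m) / (n - 1)) for sq, m in zip(sq_sums, means)]
    return means, stds

def standardize_chunk(chunk, means, stds):
    """Second map step: center and scale every row of one block."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in chunk]

# Toy data: two blocks of two "spectra" with two wavelengths each (assumed values).
chunks = [
    [[1.0, 10.0], [2.0, 20.0]],
    [[3.0, 30.0], [4.0, 40.0]],
]
means, stds = column_mean_std(chunks)
scaled = [standardize_chunk(c, means, stds) for c in chunks]
```

Because the partial results are merged by simple addition, the blocks can be processed in any order and on any number of workers. The sum-of-squares formula used here is easy to merge but can lose precision when the mean is large relative to the spread, so a production job might prefer a numerically stabler merge.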
YANG Hui-hua, DU Ling-ling, LI Ling-qiao, TANG Tian-biao, GUO Tuo, LIANG Qiong-lin, WANG Yi-ming, LUO Guo-an. Parallel PLS Algorithm Using MapReduce and Its Application in Spectral Modeling[J]. Spectroscopy and Spectral Analysis, 2012, 32(9): 2399
Received: Mar. 8, 2012
Accepted: --
Published Online: Sep. 26, 2012
The Author Email: Hui-hua YANG (yanghuihua@tsinghua.edu.cn)