Spectroscopy and Spectral Analysis, Volume 32, Issue 9, 2399 (2012)

Parallel PLS Algorithm Using MapReduce and Its Application in Spectral Modeling

YANG Hui-hua1,*, DU Ling-ling2, LI Ling-qiao2, TANG Tian-biao2, GUO Tuo2, LIANG Qiong-lin3, WANG Yi-ming3, and LUO Guo-an3
Author Affiliations
  • 1[in Chinese]
  • 2[in Chinese]
  • 3[in Chinese]

    Partial least squares (PLS) is widely used in spectral analysis and modeling, but it is computationally intensive and time-consuming when applied to massive data sets. To address this problem, a novel parallel PLS algorithm based on MapReduce is proposed. It consists of two procedures: the parallelization of data standardization and the parallelization of principal component computation. Taking NIR spectral modeling as an example, experiments were conducted on a Hadoop cluster built from ordinary computers. The experimental results demonstrate that the proposed parallel PLS algorithm can handle massive spectral data sets, significantly reduces modeling time, achieves a nearly linear speedup, and scales up easily.
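    The abstract gives no implementation details, so the following is only a minimal sketch of how the two procedures might map onto a map/reduce pattern. It is written in plain Python rather than Hadoop, and every function name in it (map_stats, reduce_stats, column_mean_std, standardize, first_weight_vector) is hypothetical, not taken from the paper. It assumes the spectra are split into row blocks, that standardization means column-wise centering and scaling by the sample standard deviation, and that the component step starts from the NIPALS PLS1 weight vector w = X^T y / ||X^T y||, which can be accumulated block by block.

```python
from functools import reduce
from math import sqrt


def map_stats(block):
    """Map step: row count, column sums, and column sums of squares of one block."""
    n_cols = len(block[0])
    sums = [0.0] * n_cols
    sq_sums = [0.0] * n_cols
    for row in block:
        for j, v in enumerate(row):
            sums[j] += v
            sq_sums[j] += v * v
    return len(block), sums, sq_sums


def reduce_stats(a, b):
    """Reduce step: merge two partial results by element-wise addition."""
    return (a[0] + b[0],
            [x + y for x, y in zip(a[1], b[1])],
            [x + y for x, y in zip(a[2], b[2])])


def column_mean_std(blocks):
    """Combine per-block statistics into global column means and sample stds."""
    n, sums, sq_sums = reduce(reduce_stats, map(map_stats, blocks))
    means = [s / n for s in sums]
    stds = [sqrt(max(q - n * m * m, 0.0) / (n - 1))
            for q, m in zip(sq_sums, means)]
    return means, stds


def standardize(block, means, stds):
    """Second pass over a block: center and scale each column.

    Assumption of this sketch: constant columns (std == 0) are only centered.
    """
    return [[(v - m) / (s if s > 0 else 1.0)
             for v, m, s in zip(row, means, stds)]
            for row in block]


def first_weight_vector(x_blocks, y_blocks):
    """PLS1 weight vector w = X^T y / ||X^T y||, summed over row blocks."""
    def map_xty(pair):
        xb, yb = pair
        return [sum(row[j] * y for row, y in zip(xb, yb))
                for j in range(len(xb[0]))]

    xty = reduce(lambda a, b: [p + q for p, q in zip(a, b)],
                 map(map_xty, zip(x_blocks, y_blocks)))
    norm = sqrt(sum(v * v for v in xty))
    return [v / norm for v in xty]
```

    Because both steps reduce to sums over row blocks, each block can be processed independently and only small per-column vectors need to be merged, which is the property that would make a nearly linear speedup plausible. How the paper handles subsequent components and deflation is not stated in the abstract and is not covered here.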


    YANG Hui-hua, DU Ling-ling, LI Ling-qiao, TANG Tian-biao, GUO Tuo, LIANG Qiong-lin, WANG Yi-ming, LUO Guo-an. Parallel PLS Algorithm Using MapReduce and Its Application in Spectral Modeling[J]. Spectroscopy and Spectral Analysis, 2012, 32(9): 2399

    Paper Information

    Received: Mar. 8, 2012

    Accepted: --

    Published Online: Sep. 26, 2012

    The Author Email: Hui-hua YANG (yanghuihua@tsinghua.edu.cn)

    DOI:10.3964/j.issn.1000-0593(2012)09-2399-06
