PLS path modeling

https://doi.org/10.1016/j.csda.2004.03.005

Abstract

A presentation of the Partial Least Squares approach to Structural Equation Modeling (or PLS Path Modeling) is given together with a discussion of its extensions. This approach is compared with the estimation of Structural Equation Modeling by means of maximum likelihood (SEM-ML). Nevertheless, the approach still shows some weaknesses, and some new improvements are proposed in this respect. Furthermore, PLS path modeling can be used for analyzing multiple tables, which relates it to more classical data analysis methods used in this field. Finally, a complete treatment of a real example is shown through the available software.

Introduction

There is some confusion in the terminology used in the PLS field. Trying to clarify it is a way to follow the evolution of the ideas in the PLS approach.

Herman Wold first formalized the idea of partial least squares in his paper about principal component analysis (Wold, 1966), where the NILES (= nonlinear iterative least squares) algorithm was introduced. This algorithm (and its extension to canonical correlation analysis and to specific situations with three or more blocks) was later named NIPALS (= nonlinear iterative partial least squares) in Wold's 1973 and 1975 papers (Wold 1973, Wold 1975).

The first presentation of the finalized PLS approach to path models with latent variables (LVs) was published by Wold (1979); the main references on the PLS algorithm are Wold (1982) and Wold (1985).

Herman Wold contrasted SEM-ML (Jöreskog, 1970) “hard modeling” (heavy distribution assumptions, several hundred cases necessary) with PLS “soft modeling” (very few distribution assumptions, few cases can suffice). These two approaches to structural equation modeling have been compared in Jöreskog and Wold (1982). In the following, the two approaches are compared on an example, and it appears that the LV estimates from both methods are in fact highly correlated, provided the SEM-ML LV estimates are modified so that only the manifest variables (MVs) related to an LV are used to estimate the LV itself.

In the chemometrics field, PLS regression (Wold et al., 1983) has had tremendous success, and many publications confuse the father's (H. Wold) and the son's (S. Wold) work. The term “PLS approach” is somewhat too general and now merges PLS for path models and PLS regression. Following a suggestion by H. Martens, we have decided to use the name “PLS Path Modeling” for the use of PLS for structural equation modeling.

For many years the only available software was LVPLS 1.8, developed by Lohmöller (1987, last available version). Lohmöller extended the basic PLS algorithm in various directions and published all his research results in 1989. More recently, new software has been developed by Chin (2001 for the latest version, still in beta test, however): PLS-Graph 3.0. It provides a user-friendly Windows graphical interface to PLSX, the program for PLS path modeling on units-by-variables data tables available in LVPLS 1.8. Moreover, it offers cross-validation of the path model parameters by jackknife and bootstrap. Both programs will be studied in detail on a practical example.

A very important review paper on the PLS approach to structural equation modeling is Chin (1998). A basic review of PLS path modeling for marketing is Fornell and Cha (1994). Our paper is more theory oriented and should be seen as a complement to these papers for readers more oriented toward statistics. In particular, we give a detailed study of the relationship between PLS path modeling and multiple table analysis methods.

Section snippets

The PLS path modeling algorithm

To clarify the presentation of the PLS path modeling algorithm, it is very useful to refer to a practical example. PLS has been applied very extensively in customer satisfaction studies. So we will first present the construction of a customer satisfaction index (CSI).

The NIPALS algorithm

The roots of the PLS algorithm are in the nonlinear iterative least-squares estimation (NILES) algorithm for principal component analysis (Wold, 1966), which later became the nonlinear iterative partial least squares (NIPALS) algorithm. We now recall the original algorithm of H. Wold and show how it can be included in the PLS framework described in this paper. The NIPALS algorithm is of interest for two reasons, as it shows how PLS handles missing data and how to extend the PLS approach to more than one
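As an illustration of the missing-data idea mentioned above, here is a minimal Python sketch of the usual NIPALS iteration for the first principal component. It is not taken from the paper: the function name `nipals_pc1`, the starting vector, and the stopping rule are assumptions made for this example. Each loading and each score is computed as a regression slope over the observed cells only, so missing values (coded as NaN) are simply skipped.

```python
import numpy as np

def nipals_pc1(X, n_iter=500, tol=1e-10):
    """First principal component by NIPALS; NaN cells are treated as missing.

    X is assumed to be column-centered on its observed values.
    Returns the score vector t and the unit-norm loading vector p.
    """
    X = np.asarray(X, dtype=float)
    obs = (~np.isnan(X)).astype(float)      # 1.0 where a value is observed
    X0 = np.where(np.isnan(X), 0.0, X)      # zeros so that sums skip missing cells
    t = X0[:, 0].copy()                     # hypothetical start: first column
    p = np.zeros(X.shape[1])
    for _ in range(n_iter):
        # Loading p_j: slope of regressing column j on t over the observed rows.
        p = (X0.T @ t) / (obs.T @ (t ** 2))
        p /= np.linalg.norm(p)
        # Score t_i: slope of regressing row i on p over the observed columns.
        t_new = (X0 @ p) / (obs @ (p ** 2))
        converged = np.linalg.norm(t_new - t) < tol * np.linalg.norm(t_new)
        t = t_new
        if converged:
            break
    return t, p
```

On a complete data table the two regression steps reduce to a power iteration on X'X, so the loading vector matches the first right singular vector of X up to sign. With missing cells the result is an approximation whose quality depends on how much data is missing.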

The PLS approach for two sets of variables

We show in this section how PLS path modeling recovers the main data analysis methods for relating two sets of variables. Table 15 gives the methods corresponding to the various choices of mode A or B for the LVs y1 or y2.

To show these results, it is enough to write the stationary conditions of PLS path modeling. For simplicity, but without loss of generality, we suppose that the LVs ξ1 and ξ2 are positively correlated. Consequently z1=y2 and z2=y1.
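The two-block iteration with z1=y2 and z2=y1 can be sketched in Python as follows. This is an illustrative sketch rather than the paper's code: the function names, the starting vector, and the stopping rule are assumptions, and both blocks are assumed to be column-centered. Mode A takes the outer weights proportional to the covariances between the manifest variables and the inner estimate; Mode B takes them as the multiple regression coefficients of the inner estimate on the manifest variables.

```python
import numpy as np

def standardize(y):
    """Center y and scale it to unit variance."""
    y = y - y.mean()
    return y / y.std()

def outer_weights(X, z, mode):
    """Outer weights of block X given the inner estimate z."""
    if mode == "A":
        return X.T @ z                            # covariances with z, up to a factor 1/n
    return np.linalg.lstsq(X, z, rcond=None)[0]   # Mode B: regression of z on X

def two_block_pls(X1, X2, mode1="A", mode2="A", n_iter=1000, tol=1e-12):
    """Alternate the outer estimations of two centered blocks, with z1 = y2 and z2 = y1."""
    y2 = standardize(X2[:, 0])                    # hypothetical start: first column of X2
    for _ in range(n_iter):
        w1 = outer_weights(X1, y2, mode1)         # inner estimate of block 1 is y2
        y1 = standardize(X1 @ w1)
        w2 = outer_weights(X2, y1, mode2)         # inner estimate of block 2 is y1
        y2_new = standardize(X2 @ w2)
        converged = np.linalg.norm(y2_new - y2) < tol
        y2 = y2_new
        if converged:
            break
    return w1, w2, y1, y2
```

With Mode A for both blocks the weights w1 converge to the direction of the first left singular vector of X1'X2, which is the solution of inter-battery factor analysis. With Mode B for both blocks the correlation between y1 and y2 converges to the first canonical correlation. Mixing the two modes is usually associated with redundancy analysis; that case is not checked in this sketch.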

Denoting

The PLS approach for J sets of variables

In this section, we show that the various options of PLS path modeling (modes A or B for outer estimation; centroid, factorial or path weighting schemes for inner estimation) recover many methods for multiple table analysis: generalized canonical analysis (in the versions of Horst (1961) and Carroll (1968)), Multiple Factor Analysis (Escofier and Pagès, 1994), Lohmöller's (1989) split principal component analysis, and Horst's (1965) maximum variance algorithm.

The links between PLS and

Acknowledgements

This paper has been supported by the ESIS (European Satisfaction Index System) IST Project within the Vth Framework Programme (IST-2000-31071) and the Fondation HEC (Paris, France).

References (36)

  • L. Eriksson et al. Multi- and Megavariate Data Analysis: Principles and Applications (2001)
  • C. Fornell. A national customer satisfaction barometer: the Swedish experience. J. Marketing (1992)
  • C. Fornell et al. Partial least squares
  • P. Horst. Relations among M sets of variables. Psychometrika (1961)
  • P. Horst. Factor Analysis of Data Matrices (1965)
  • A.Z. Israels. Redundancy analysis for qualitative variables. Psychometrika (1984)
  • K.G. Jöreskog. A general method for analysis of covariance structure. Biometrika (1970)
  • K.G. Jöreskog et al. The ML and PLS techniques for modeling with latent variables: historical and comparative aspects
