Good behavioral research often requires critical interplay between theory and data. Relevant theory helps determine appropriate criteria and the variables that predict them. In turn, empirical results that are based on good measurement serve to inform and refine theory. Although prediction is the primary goal of many applied research efforts, theory is not subordinate to or divorced from this goal. Frequently, researchers will insert multiple predictors into a regression model to predict the criterion (or criteria) of interest. In a purely practical sense, the substantive nature of the predictors may actually be less important than predicting what is valued (e.g., job performance, smoking risk). But without theory, one can easily be overwhelmed by infinite possibilities; therefore, theory is necessary to guide the search for the most appropriate and important constructs to be measured and modeled.

Why assess predictor importance?

Determining the most important variables in a model—which to include (and exclude) in the model, and which of the included variables contribute the most to prediction—is critical from both the practical and theoretical perspectives. On the practical side, it is often essential to select a subset of tests that is both cost- and time-efficient, and that has adequate criterion-related validity. Often, one can select a small subset of predictors from a large set without any practical loss in predictive efficiency (see Madden & Bottenberg, 1963). On the theory side, good theories are parsimonious, containing only those constructs essential to understanding behavioral phenomena. In sum, determining the relative importance of predictor variables is important for building regression models, both for the practical purpose of prediction and for building theoretical models to further our understanding of behavioral phenomena.

This article features a program that calculates several quantitative indices of the relative importance of the predictors in a regression model. It is important to note that researchers and practitioners often determine the relative importance of predictor variables out of necessity (e.g., because of limited testing time), even without the assistance of a quantitative method for doing so. But rather than relying on human intuition for determining variable importance, we argue that a systematic method for quantifying relative importance is necessary. Even expert intuition can be flawed, inconsistent, or otherwise unreliable (see Grove & Meehl, 1996, and Meehl, 1954, for the multiple benefits of statistical over judgment-based methods for combining information).

Relative importance is defined as the proportionate contribution each predictor in a linear regression model makes to the model R2, considering both its unique contribution and its contribution when combined with other predictor variables (Hoffman, 1960). Although there are no unambiguous measures of relative importance when predictor variables are correlated, some approaches are well motivated and have been shown to provide meaningful results. Note that the most typical approach to determining relative importance is also the least informative: After selecting a set of predictors and conducting a linear regression analysis, many researchers evaluate the predictors’ importance in the regression model by examining the size of the standardized regression weight associated with each variable (usually by eyeball). Variables with larger weights are viewed as more important than those with smaller weights.

There is a problem inherent in this popular or intuitive approach to determining predictor importance: Whether or not the data are standardized, the vector b contains least-squares regression weights that together maximize prediction by minimizing the sum of the squared errors of prediction. In mathematical terms:

$$ \hat{y} = Xb = b_0 + \sum\limits_{i=1}^{p} b_i x_i, \quad \text{minimizing} \quad \sum\limits_{j=1}^{N} \left( y_j - \hat{y}_j \right)^2 $$
(1)

These weights do not directly indicate predictor importance—in fact, they are not intended to. Actually, if they are taken at face value as indicators of predictor importance, they are often counterintuitive, such as when finding small or negative weights for variables that have positive criterion-related validities.

Regression weights are uninterpretable as indicators of predictor importance to the extent that predictors show nonzero intercorrelations with one another, also known as multicollinearity. The degree of multicollinearity can be assessed by computing either the tolerance or the variance inflation factor (VIF) for each predictor (see Cohen, Cohen, West, & Aiken, 2003). Multicollinearity is independent of sample-size requirements in multiple regression (Green, 1977); high levels of multicollinearity (e.g., VIF ≥ 10, tolerance ≤ .10) make determining the contributions of predictors with least-squares regression weights difficult or impossible. To overcome the problem of determining relative importance from regression weights, importance indices are computed instead. Importance indices are operationalized and expressly intended to reflect the relative importance of predictors in a linear regression model, even in the presence of extreme multicollinearity.
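To make the tolerance and VIF computations concrete, here is a minimal Python sketch (our own illustration, not part of the Excel program described below; the function name is hypothetical). It exploits the fact that the diagonal of the inverted predictor correlation matrix contains the VIFs:

```python
# Illustrative sketch only (not the Excel/VBA program described here):
# tolerance and VIF from a predictor intercorrelation matrix.
import numpy as np

def vif_and_tolerance(R_xx):
    """Return (VIF, tolerance) for each predictor.

    The diagonal of the inverse of the p x p predictor correlation
    matrix equals the VIFs; tolerance is the reciprocal, i.e.,
    1 - R^2 of each predictor regressed on all of the others.
    """
    vif = np.diag(np.linalg.inv(R_xx))
    return vif, 1.0 / vif

# Example: two predictors correlated .90 inflate each other's VIF.
R_xx = np.array([[1.0, 0.9],
                 [0.9, 1.0]])
vif, tol = vif_and_tolerance(R_xx)
print(vif, tol)  # VIF ~ 5.26, tolerance ~ .19
```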

We focus on three types of importance indices: incremental R², general dominance weights, and relative importance weights.

Incremental R²

Incremental R² reflects the unique criterion variance accounted for by a predictor after all other variance accounted for by the remaining predictors has been partialed out of the criterion. More specifically, in a hierarchical regression in which predictors are entered into the model in a stepwise fashion, the incremental R² for a predictor is the increase in R² when that predictor is entered last, indicating the unique contribution of that predictor to the model (Cohen et al., 2003). Incremental R² is also equivalent to the unique commonality coefficient computed when performing a commonality analysis (Nunnally & Bernstein, 1994; Pedhazur, 1997). From these values, the commonality among predictor variables can be computed using a series of mathematical equations (see Nimon, Lewis, Kane, & Haynes, 2008). By using incremental R² to compute a commonality analysis, researchers can gain additional insight into the unique and common contributions of predictors in explaining variance in the criterion of interest.
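As an illustration of the last-entry logic, the following Python sketch (our own hypothetical helpers, working from correlations rather than raw data) computes each predictor's incremental R² by comparing the full-model R² with the R² of the model omitting that predictor:

```python
# Illustrative sketch: incremental R^2 when each predictor is entered last,
# computed from the predictor intercorrelations R_xx and the vector of
# predictor-criterion correlations r_xy.
import numpy as np

def r_squared(R_xx, r_xy, subset):
    """Model R^2 for the predictors indexed by `subset`."""
    idx = np.asarray(list(subset), dtype=int)
    if idx.size == 0:
        return 0.0  # an empty model explains no variance
    rs = r_xy[idx]
    return float(rs @ np.linalg.solve(R_xx[np.ix_(idx, idx)], rs))

def incremental_r2(R_xx, r_xy):
    p = len(r_xy)
    full = r_squared(R_xx, r_xy, range(p))
    # Drop each predictor in turn; the loss in R^2 is its unique share.
    return np.array([full - r_squared(R_xx, r_xy,
                                      [j for j in range(p) if j != i])
                     for i in range(p)])
```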

General dominance weights

A more sophisticated approach to computing importance indices was proposed by Budescu (1993), who details the procedure for conducting a dominance analysis. Dominance analysis produces general dominance weights that are computed by averaging a given predictor’s incremental validity across all possible submodels that involve that predictor (i.e., given p predictors, there are 2^p – 1 possible submodels). The incremental validity of predictor i in a submodel is defined by

$$ \Delta R_{ih}^2 = r_{y \cdot x_i x_h}^2 - r_{y \cdot x_h}^2, $$
(2)

where \( x_h \) represents one unique subset of k predictors in the submodel and \( x_i \) represents the (k + 1)th variable added to the submodel. Mathematically, the average incremental validity for predictor \( x_i \) contained in all submodels of size k is

$$ C_{x_i}^{(k)} = \sum\limits_{h=1}^{\binom{p-1}{k}} \Delta R_{ih}^2 \Bigg/ \binom{p-1}{k}, $$
(3)

where \( \Delta R_{ih}^2 \) is as defined in Eq. 2, h indexes one unique subset of k predictors, and \( \binom{p-1}{k} \) is the combination function, equal to (p – 1)!/[k!(p – 1 – k)!], which is the number of subsets of size k that can be formed from the remaining p – 1 predictors.

Because the general dominance weight for a given predictor x i is equal to its average incremental validity across all submodels that include that predictor, the values in Eq. 3 are averaged across all values of k:

$$ C_{x_i} = \sum\nolimits_{k=1}^{p-1} C_{x_i}^{(k)} \Big/ \left( p - 1 \right). $$
(4)

General dominance weights have two appealing properties: First, each general dominance weight is the average contribution of a predictor to a criterion, both on its own and when taking all other predictors in the model into account. Second, general dominance weights across predictors always sum to the overall model R² (or to 1 if one divides each weight by the sum of the weights).
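A compact sketch of the dominance computation, reusing the r_squared() helper from the earlier sketch, appears below. One assumption on our part: following the usual implementation of Budescu's (1993) procedure, the average includes the k = 0 case (the predictor entered alone), which is what makes the weights sum to the full-model R²:

```python
# Illustrative sketch: general dominance weights, obtained by averaging a
# predictor's incremental validity within each subset size and then across
# sizes (assumes the r_squared() helper defined earlier).
from itertools import combinations
import numpy as np

def general_dominance(R_xx, r_xy):
    p = len(r_xy)
    weights = np.zeros(p)
    for i in range(p):
        others = [j for j in range(p) if j != i]
        means_by_size = []
        for k in range(p):  # size of the subset that x_i joins (0..p-1)
            gains = [r_squared(R_xx, r_xy, list(h) + [i]) -
                     r_squared(R_xx, r_xy, h)
                     for h in combinations(others, k)]
            means_by_size.append(np.mean(gains))
        weights[i] = np.mean(means_by_size)
    return weights  # sums to the full-model R^2
```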

Relative importance weights

Relative importance weights (Fabbris, 1980; Johnson, 2000) are a third type of importance index. They are computed by first transforming the set of p predictors into a new set of p predictors that are uncorrelated with one another, yet correlate as highly as possible with their original counterparts. Mathematically speaking, given that X is a data matrix with N rows of data and p columns of predictors, it is well known that X can be subjected to the singular value decomposition

$$ X = P \Delta Q', $$
(5)

where P and Q contain the eigenvectors of XX′ and X′X, respectively, and Δ contains the singular values of X (i.e., the square roots of the eigenvalues of XX′ or X′X, which are the same). Johnson (1966) showed that the orthogonal counterpart of X having the least squared error of prediction is the matrix Z, where

$$ Z = PQ\prime . $$
(6)

Both Fabbris (1980) and, later, Johnson (2000) pointed out that an earlier method for establishing relative importance weights, by Green, Carroll, and DeSarbo (1978), was still affected by the correlations among the X variables, because it regressed the orthogonal variables of Z onto the p correlated variables of X. Fabbris’s and J. W. Johnson’s shared insight was that treating X as the set of dependent variables and regressing X onto Z instead creates weights (called Λ*) without this problem, because the p components of the Z independent variables are orthogonal:

$$ \Lambda^{*} = \left( Z'Z \right)^{-1} Z'X = Z'X. $$
(7)

These \( {\Lambda^{ * }} \) weights, when squared, sum to 1 and provide the proportional contribution of Z to X. The proportional contribution of Y to Z, in turn, can also be determined by regressing Y on Z:

$$ \beta^{*} = \left( Z'Z \right)^{-1} Z'y = Z'y. $$
(8)

Once again, because the components of Z are uncorrelated, the squared weights β* are additive; since Z spans the same space as X, they sum to the model R², with each squared weight providing the proportional contribution of one component of Z.

Thus, a relative importance weight is defined as the contribution of a given predictor to criterion variance, considering the predictor’s contribution both alone and jointly with the other predictors in the model. It is the product of the two components just discussed: (1) the squared weight relating the given predictor to its orthogonal counterpart, multiplied by (2) the squared weight regressing the criterion on that predictor’s orthogonal counterpart. Looking back at Eqs. 7 and 8, the relative weight \( \varepsilon_i^2 \) for predictor i is equal to

$$ \varepsilon_i^2 = \beta_i^{*2} \Lambda_i^{*2}, $$
(9)

and the relative weights summed across the p predictors equal the model R² (see Fabbris, 1980; Johnson, 2000):

$$ R^2 = \sum\nolimits_{i=1}^{p} \varepsilon_i^2 = \sum\nolimits_{i=1}^{p} \beta_i^{*2} \Lambda_i^{*2}. $$
(10)

In this way, relative weights can be explained in the same way as general dominance weights, because they sum to the model R², or the weights can be divided by R² so that they sum to 1.
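Because Λ* and β* can be obtained directly from the correlation matrix (Λ* is the symmetric square root of the predictor intercorrelation matrix, and β* is its inverse square root times the validity vector), the whole computation fits in a few lines. The sketch below is our own illustration, not the program's VBA code; it sums the squared-weight products across the p orthogonal components, which is how the relative weights reproduce the model R² (cf. Eqs. 7–10):

```python
# Illustrative sketch: relative importance weights (Fabbris, 1980;
# Johnson, 2000) computed from correlations alone; assumes R_xx is
# nonsingular.
import numpy as np

def relative_weights(R_xx, r_xy):
    evals, Q = np.linalg.eigh(R_xx)              # R_xx = Q diag(evals) Q'
    lam = Q @ np.diag(np.sqrt(evals)) @ Q.T      # Lambda* = R_xx^(1/2)
    beta = Q @ np.diag(1 / np.sqrt(evals)) @ Q.T @ r_xy  # criterion on Z
    return (lam ** 2) @ (beta ** 2)              # sums to the model R^2
```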

As mentioned, there is no unambiguous measure or gold standard of relative importance, and potential issues can arise when relying on any of the aforementioned importance indices. For example, the study of some behavioral phenomena may require multiple criteria as well as multiple predictors, leading to a complex canonical prediction problem (e.g., Azen & Budescu, 2006; LeBreton & Tonidandel, 2008). Also, some researchers study dichotomous outcomes that require the use of logistic regression, where an extension of importance indices is needed to properly identify predictor importance (e.g., Azen & Traxel, 2009; Tonidandel & LeBreton, 2010). Another frequent concern is the reliability or stability of importance-weight estimates across the independent samples to which a regression model is supposed to generalize (e.g., Azen & Budescu, 2003; Johnson, 2004). Finally, in the case of predictor dominance (a predictor having the largest importance weight), theory might expect a predictor to dominate in some submodels but not others, such that averaging across submodels is less useful than examining the submodels themselves (Azen & Budescu, 2003). Although all of these issues are worthy of consideration, it is beyond the scope of this report to address them individually. Our main focus is to provide a program useful for generating different types of predictor importance weights in multiple regression.

Given the variety of indices available, it can be informative to consider an array of weights and to report the most appropriate importance weights, or to examine how they converge and diverge, rather than to merely focus on the weights that are the most popular or typically available. The present program will compute all of the aforementioned importance indices that have been used in present-day behavioral research; however, it obviously requires the expertise of the researcher or practitioner to determine which set or sets of importance weights are the most appropriate to report. Fortunately, a number of articles have addressed this issue in detail (e.g., Budescu & Azen, 2004; LeBreton, Ployhart, & Ladd, 2004).

The goal of this article is focused: to provide an easily accessible program to conduct what we call an exploratory regression analysis. The program provides a variety of relative importance weights, dominance weights, and other results that, both independently and taken as a whole, indicate the contribution of each predictor in a linear regression model.

Method

Program overview

We wrote this program in Microsoft Excel so that it would be easy to access and familiar to many users, and so that the Visual Basic code can be readily modified or extended by those with experience in almost any programming language. Such programming improvements can be shared with the research community. After opening the Excel file, it is important that macros be enabled; the user then alternates between two worksheet tabs by clicking the Analysis and Results tabs at the bottom of the screen. The Analysis tab is leftmost, corresponding to the sheet where users are prompted to enter how many predictors are in their regression model. The user then enters a correlation matrix between the predictors and the criterion (see Fig. 1); this can be accomplished by copying and pasting a square symmetric matrix, or one may enter or paste the lower triangle. Then, once the two buttons “Prepare for Analyses” and “Run Analyses” are pressed, the program computes output for regression models based on all 2^p – 1 possible subsets of the p predictors with the criterion (e.g., if there are three predictors, there are seven models: X1; X2; X3; X1, X2; X1, X3; X2, X3; and X1, X2, X3). For each subset, an overall R² is computed, and then several indices of importance are provided (a sketch of this all-subsets loop appears after the list below). As we noted, there is no gold standard for establishing variable importance, so for each submodel we report the weights already described:

  1. Overall model R²;

  2. Standardized least-squares regression weights: these are the least recommended weights for understanding relative importance, but they show how much least-squares weights can differ from relative importance weights, and they can also be used to double-check regression output from another program;

  3. Incremental R² for each predictor when entered last in the given regression model;

  4. General dominance weights (Budescu, 1993); and

  5. Relative importance weights (Fabbris, 1980; Johnson, 2000).
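The sketch below illustrates the all-subsets loop referenced above. It is our own Python rendering of the logic (the program itself implements this in Visual Basic), reusing the helper functions from the earlier sketches:

```python
# Illustrative sketch: enumerate all 2^p - 1 predictor subsets and report
# each submodel's R^2 along with the importance indices sketched earlier.
from itertools import combinations
import numpy as np

def exploratory_regression(R_xx, r_xy):
    p = len(r_xy)
    results = {}
    for k in range(1, p + 1):
        for subset in combinations(range(p), k):
            idx = list(subset)
            Rs = R_xx[np.ix_(idx, idx)]
            rs = r_xy[idx]
            betas = np.linalg.solve(Rs, rs)   # standardized LS weights
            results[subset] = {
                "R2": float(rs @ betas),
                "beta": betas,
                "incremental_R2": incremental_r2(Rs, rs),
                "general_dominance": general_dominance(Rs, rs),
                "relative_weights": relative_weights(Rs, rs),
            }
    return results
```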

Fig. 1 Analysis page

All of these statistics are displayed in the Results worksheet. To start over with a new regression model, the user should click the Analysis tab at the bottom of the screen and then press the “New Analysis” button. It is important to note that starting a new analysis will delete any prior information in the Results worksheet. To ensure that no information is lost, researchers should copy the Results page to another worksheet (or workbook) before running a new regression model.

A number of similar tools exist in the statistical programs SPSS and SAS, as well as in R code, that can compute importance weights. For example, the relaimpo package in R allows researchers to compute importance weights for a regression model, to graph the results, and to use bootstrapping to obtain confidence intervals (Grömping, 2006). Similarly, Budescu has written a number of SAS macros to compute various types of dominance weights, and LeBreton, Tonidandel, and J. W. Johnson have written SPSS syntax that computes univariate and multivariate relative importance weights. However, the tools currently available in these programs either compute only one type of importance index, provide overall results across predictors but not results for all possible regression submodels, or have both limitations. Modifying the existing macros in SPSS, SAS, or R to eliminate these limitations would be cumbersome, especially for researchers and practitioners who are less familiar with these software packages. Building the program in Excel allows even novice analysts to compute a wide array of importance indices across all possible submodels in an easy, familiar, and efficient way. When researchers and practitioners have a quick method for obtaining multiple indices of predictor importance, they can take multiple perspectives on the underlying relationships between the predictor variables and the criterion of interest. We hope that, with this tool, the theoretical and practical advantages of importance indices will reach a broader audience than has been possible in the past.

Limitations

The convenience and accessibility of using Excel to compute importance indices come with a number of limitations that we want to point out. Because of a rounding issue within Microsoft Excel (the underlying Visual Basic module cannot execute double-precision arithmetic), the precision of the general dominance weights is limited past the thousandths decimal place (0.001); for all practical purposes, however, the weights are reliable. In addition, only one criterion variable can be inserted into the model at a time, although this does not prevent a researcher from running multiple models that keep all predictor intercorrelations but change the zero-order criterion-related validities. Also, at most nine predictors can be added to the model at once. We would argue that the addition of more predictors is unnecessary, assuming that theory and expertise have guided the initial selection of predictors that are important to the regression model (Madden & Bottenberg, 1963); more predictors than this may also challenge the stability or replicability of the importance-weight results across independent samples. Finally, bootstrapping was not added to the program. Bootstrapping is commonly used to compute confidence intervals around parameter estimates. If confidence intervals are needed, we recommend using a statistical program that focuses on that particular set of importance weights (see Johnson, 2004).

Example

To demonstrate the ability of the Excel program to generate multiple indices of relative importance across all predictor submodels, we replicate an example provided by Budescu (1993) with four predictors and one criterion. The criterion (Y) is the salary of academics in psychology. The four predictors are years since doctoral degree (X1), number of publications (X2), sex (X3), and number of article citations (X4). There was multicollinearity in the data, as seen in Fig. 1: Predictors X1 and X2 correlate .68 with one another, and predictors X1 and X4 correlate .46. The program generates all 15 possible submodels, and for each model it provides the standardized regression weights, followed by the aforementioned indices of predictor importance: incremental R², general dominance weights, and relative importance weights. Figures 2, 3, and 4 show the screenshots associated with this program output.

Fig. 2 Results page: R² and regression weights for all submodels

Fig. 3 Results page: Incremental R² and general dominance weights for all submodels

Fig. 4 Results page: Relative importance weights for all submodels

As can be seen from this example, all four predictors show nontrivial levels of criterion-related validity, ranging from .26 to .61, and they show some multicollinearity with one another. Thus, one cannot systematically or objectively determine predictor importance by eyeball, which would require inspecting the zero-order intercorrelations, weighing the criterion-related validities against the patterns of predictor intercorrelation, and then judging the relative importance of the predictors. Clearly, one needs to rely on a set of well-motivated indices of relative importance. The program generates the overall R² and weights for all possible regression submodels (there are 2^4 – 1 = 15 submodels). That way, one can see (1) how well each subset of predictors explains the criterion and (2) whether or how predictor importance changes across submodels; a variable whose importance remains strong across submodels is likely a predictively powerful one (similar to Budescu’s notion of complete dominance).

As we noted, regression weights serve to minimize the squared error of prediction, not to indicate predictor importance. In the present example, the magnitude of the weight appears uniformly strong for X1, but that is not the case when examining the different types of relative importance weights. Incremental R² tends to favor X1, but when all four predictors are in the model, X4 has the largest incremental R². This is not the case for general dominance weights, where X1 remains dominant across all models in which it is included; when X1 is not included, X2 competes with X4 for dominance. Relative importance weights also show X1 to be the most important variable across all models in which it is included, and similarly, when X1 is not in the model, X2 competes with X4 for relative importance. In this example, the relative importance weights and general dominance weights show the same pattern of predictor importance across submodels. It is also worth noting that in submodels containing only two predictors, relative importance weights and general dominance weights always provide identical values. These results are not surprising, given that Johnson (2000) and Lorenzo-Seva, Ferrando, and Chico (2010) found consistent, substantial agreement between general dominance weights and relative importance weights.

It is also important to consider the R² for each submodel, because it makes little sense to evaluate the relative importance of predictors in submodels that account for a very small proportion of the variance in the dependent variable. Examining the submodel R²s can help researchers determine which set of predictors allows for the most parsimony without a substantial loss in variance explained. In the example presented, the full model accounts for approximately 49% of the variance in Y, and the submodel containing only X1, X3, and X4 accounts for approximately 48%. By eliminating X2, a researcher could gain a more cost-effective and parsimonious model with very little loss of variance explained (a loss of 1%). It is also important to look at the overall R² in addition to the importance indices, because they may provide slightly different information. The importance indices show that X1, X2, and X4 are frequently the strongest predictors and have the highest criterion-related validities. However, this combination of predictors actually performs worse in terms of overall prediction than the submodel presented above, due to predictor multicollinearity (X1 and X2 correlate .68).

Clearly, there is no unambiguous measure of predictor importance. Predictor importance is unclear when comparing across different relative importance indices, because each type of index has a justifiable but different underlying rationale. It is also unclear when comparing within a single type of relative importance index across all submodels, because a predictor’s importance can shift dramatically depending on the other predictors in the model. Therefore, it is important to consider all information before deciding on the predictors to keep in the model.

Discussion

Researchers strive for parsimony when building models or developing theory. Likewise, corporations face time and monetary demands that require practitioners to create fast and efficient selection systems. Importance indices are a good tool for dealing with both types of demands, theoretical and practical. Although simply eyeballing the standardized regression weights is the most common way of assessing importance, other methods for establishing predictor importance are well motivated, more accurate, and replicable. However, because no single best solution exists for assessing importance, it is wise to consider multiple approaches before conclusively determining which predictors truly contribute the most to explaining variance in the criterion. The program presented here is an easy and convenient method for looking at all possible submodels of a set of predictors and, within a particular submodel, assessing the contribution of each predictor to the model’s criterion-related validity. It also computes all possible submodels much more quickly and easily than other available programs, which require the user to enter one submodel at a time. Finally, because the program is written in Microsoft Excel, it runs in software that most people can access.

Of course, the data should be scanned for outliers, miscoding, or other anomalies before computing the correlation matrix that goes into the program, since we know that correlations (or any statistics) are sensitive to the appropriateness of the data on which they depend. Similarly, it is important to ensure that the program has downloaded correctly and is working properly before using it to interpret results. It is our recommendation that researchers first replicate the example provided in this article (from Budescu, 1993) and verify that all results are identical before analyzing other relevant data sets. The example provided has been verified against Budescu’s (1993) original study as well as with a number of macros from SPSS, SAS, and R, and can provide researchers with a standard against which to ensure that the program is working correctly. Lastly, it is important that researchers review all results carefully before making interpretations, to avoid making inferential mistakes.

To aid in properly interpreting the importance indices, note that sampling error in the estimates of the weights must be considered: All of the information provided consists of estimates and is therefore sample-specific. We did not provide standard errors for these estimates, although that could be done in two ways with a bootstrapping module added to the program: Either sample the raw data with replacement and generate correlation matrices from those samples, or treat the observed correlation matrix as a population matrix and generate sample realizations from it, assuming multivariate normality of the data. Sampling error variance could distort the information such that the observed importance indices are attenuated, and in some cases the rank order of the variables’ relative importance could be altered. The ideal multiple regression model, along with its indices of relative importance, would be based on large samples and could also be replicated in independent samples.
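As a sketch of the second option described above (our own illustration; the Excel program has no bootstrapping module), one could treat the observed correlation matrix as the population, draw multivariate normal samples, and recompute an importance index on each draw:

```python
# Illustrative sketch: parametric bootstrap of relative importance weights,
# treating the observed (p+1) x (p+1) correlation matrix (criterion in the
# last row/column) as the population under multivariate normality; assumes
# the relative_weights() sketch defined earlier.
import numpy as np

def bootstrap_relative_weights(R_full, n, reps=1000, seed=0):
    rng = np.random.default_rng(seed)
    p = R_full.shape[0] - 1
    draws = np.empty((reps, p))
    for b in range(reps):
        data = rng.multivariate_normal(np.zeros(p + 1), R_full, size=n)
        R = np.corrcoef(data, rowvar=False)
        draws[b] = relative_weights(R[:p, :p], R[:p, p])
    # Percentile 95% confidence interval for each predictor's weight.
    return np.percentile(draws, [2.5, 97.5], axis=0)
```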

We also offer general recommendations for the appropriate use of the program. First, the model R² should meet or exceed a practically significant value determined by the researcher. Second, assuming that this criterion is reached, it would be sensible to compare the values from a relative importance analysis to determine which predictors are more important, given the definition of relative importance for that particular index (incremental R², general dominance weights, or relative importance weights). Third, relative importance indices tell you that a specific predictor is important, but a further examination of validity coefficients can help tell you why that predictor is important. The literature has made this point indirectly, in discussions of how regression or canonical weights do not provide direct information about suppression effects or multicollinearity, whereas structure coefficients do (see Courville & Thompson, 2001; Nimon, Henson, & Gates, 2010). Structure coefficients are a linear function of correlations that reflect the relationship between a predictor and either a single criterion (in linear regression analysis) or a canonical variate (in canonical correlation analysis). They can also be important and informative for interpreting results from confirmatory factor analyses and structural equation models (see Graham, Guthrie, & Thompson, 2003).

Finally, we would argue that theory is the foundation for conducting sound empirical research; for one, it narrows down the number of constructs to operationalize and model. That said, when a theory is in an early phase of development, there is a place for empirical results to inform the theory through tools such as the exploratory regression analysis program we have provided. The argument for exploratory factor analysis has been made for similar reasons (Haig, 2005). We would also assert that exploratory regression analysis can be of practical use when practitioners analyzing a theory-driven data set want to construct a more parsimonious model that predicts the outcomes of critical interest, because they have the time and money to purchase tests and collect data on only a limited number of variables.

The program outlined in this article is currently available by contacting either author or by direct download from http://dl.dropbox.com/u/2480715/ERA.xlsm?dl=1.