1. Introduction

Mokken scale analysis (Mokken, 1971; Sijtsma & Molenaar, 2002) is used for scaling items and measuring respondents on an ordinal scale. Mokken scale analysis consists of two parts. The first part is the evaluation of a set of items with ordered scores as a scale according to particular scaling criteria that are related to the monotone homogeneity model (Mokken, 1971; Sijtsma & Molenaar, 2002). This can be done in a confirmatory way for a set of items that are hypothesized to form a scale or in an exploratory way when an experimental set of items is analyzed to find out whether they constitute one or more scales. When none of the items satisfy the criteria of Mokken scale analysis, the result is that no scales can be constructed, but it happens more frequently that one or a few items in the set are unscalable whereas the majority of the items is scalable. The unscalable items are left out of the analysis. The scales that are produced by Mokken scale analysis are referred to as Mokken scales. The second part of Mokken scale analysis takes the scales found in the first part, and investigates several other interesting properties of the monotone homogeneity model that were not assessed explicitly in the first part of the analysis. This second part does not play a role in this study. Mokken scale analysis can be conducted using the stand-alone software package MSP5.0 for Windows (Molenaar & Sijtsma, 2000) and the R package mokken (Van der Ark, 2007).

Mokken scales are defined by means of scalability coefficients (Mokken, 1971, pp. 148–153). The first part of Mokken scale analysis involves the testing of hypotheses about these scalability coefficients and the evaluation of their numerical values. The hypotheses involve testing whether scalability coefficients satisfy the criteria for a Mokken scale (Mokken, 1971, p. 184), and testing whether scalability coefficients are equal across items or across groups. We demonstrate that currently available methods do not allow us to test several interesting hypotheses about the scalability coefficients that are relevant in Mokken scale analysis, and we propose to use the marginal modelling framework for this purpose and also for testing hypotheses for which other solutions already exist (Mokken, 1971).

The paper is organized as follows. First, the principles of marginal modelling are explained. Second, Mokken scale analysis is discussed, including the monotone homogeneity model, the scalability coefficients, and the definition of a scale. Third, the scalability coefficients are discussed and it is shown how these coefficients can be reformulated so that they can be incorporated in marginal models. For the sake of readability, several important but rather cumbersome derivations have been deferred to appendices. Fourth, we give an overview of relevant hypotheses in Mokken scale analysis and we show how these hypotheses can be tested using marginal models. As an example, the marginal models were applied to data from a cognitive balance-task test (Van Maanen, Been, & Sijtsma, 1989). Fifth, the strengths and weaknesses of the marginal modelling approach are discussed, and recommendations are given for its practical use and for future improvements.

2. Marginal Models

Assume that a test consists of J dichotomously scored items, indexed by i and j. The random variable representing the score on item j is denoted by \(X_j\), and its realization by \(x_j\) (\(x_j \in \{0, 1\}\)). A vector containing the J item-score variables is denoted \((X_1, X_2, \ldots, X_J)\). The total score on the test is denoted by \({X_ + } = \sum\nolimits_{j = 1}^J {{X_j}} \). The popularity or the easiness of an item is defined as the probability that a randomly drawn respondent from the population of interest endorses a positively worded statement or answers an item correctly, respectively, and is denoted by \(\pi_j^1\). The probability that a randomly drawn respondent does not endorse a positively worded statement or answers an item incorrectly is denoted by \(\pi_j^0\). The joint probability of scores on \(X_i\) and \(X_j\) is denoted by \(\pi_{ij}^{uv}\) [\(u, v = 0, 1\); \(\pi_{ij}^{uv}\) can assume values for four different score pairs: (0, 0), (0, 1), (1, 0), and (1, 1)]. Without loss of generality, the items are ordered by decreasing popularity or easiness and numbered accordingly, such that

$$\pi _1^1 \ge \pi _2^1 \ge \cdots \ge \pi _J^1$$

Equation (1) arbitrarily defines the most popular item to be item 1, the next most popular item to be item 2, and so on. Equation (1) does not in any way restrict the data. Finally, the test data can be collected in a J-dimensional contingency table with \(L = 2^J\) cells.

Consider the example in Table 1 (upper left-hand panel), which shows the cross classification of J = 2 items in a two-way contingency table. The observed frequencies in the contingency table are denoted by \(n_{ij}^{uv}\) (u, v = 0, 1) and the marginal frequencies are denoted by \(n_i^u\), \(n_j^v\), and n. Assuming a fixed sample size n, let \(m_{ij}^{uv}\) be the theoretically expected frequency satisfying \(m_{ij}^{uv} = n \times \pi _{ij}^{uv}\) (u, v = 0, 1), with marginal frequencies \(m_i^u\), \(m_j^v\), and m = n. Sample estimates of \(m_{ij}^{uv}\) and \(\pi_{ij}^{uv}\) are denoted by \(\hat m_{ij}^{uv}\) and \(\hat \pi_{ij}^{uv}\), respectively. Without any constraints imposed upon the data, \(\hat m_{ij}^{uv} = n_{ij}^{uv}\) and \(\hat \pi_{ij}^{uv} = n_{ij}^{uv}/n\). In Table 1 (upper left-hand panel), \(\hat \pi_i^1 = 58/178 = 0.33\) and \(\hat \pi_j^1 = 44/178 = 0.25\). Because \(\hat \pi_i^1 > \hat \pi_j^1\), item i is assumed to be more popular than item j in the population. The order of the indices i and j in the subscripts of, for example, \(n_{ij}^{uv}\), in general indicates that in the sample item i is more popular than item j.

Table 1 Example of a contingency table with observed frequencies for a dichotomous item pair (upper left-hand panel), the estimated expected frequencies under a marginal model of equal diagonal probabilities (upper right-hand panel), the estimated expected frequencies under a marginal model of homogeneous item popularity (lower left-hand panel), and the estimated expected frequencies under a marginal model with γ = .8 (lower right-hand panel).

Marginal models for categorical data (Bartolucci & Forcina, 2002; Bartolucci, Forcina, & Dardanoni, 2001; Bergsma, 1997a; Bergsma & Rudas, 2002; Lang & Agresti, 1994; Rudas & Bergsma, 2004) constitute a family of models that impose restrictions on certain marginals (i.e., subsets) of contingency tables. These restrictions can have several forms. To illustrate this, we take the contingency table in the upper left-hand panel of Table 1 as a starting point.

The first example of a marginal model imposes equality constraints on two cell frequencies by hypothesizing that \(\pi_{ij}^{00} = \pi_{ij}^{11}\). Estimation of this marginal model of equal diagonal probabilities yields estimated expected frequencies \(\hat m_{ij}^{uv}\) that are as close as possible to the observed frequencies \(n_{ij}^{uv}\) (e.g., using a maximum likelihood or least-squares criterion) but with \(\hat m_{ij}^{00} = \hat m_{ij}^{11}\). Table 1 (upper right-hand panel) shows the maximum likelihood estimates of the expected frequencies.

Throughout the paper we assume a multinomial sampling distribution that has the effect of reproducing the sample size n (here m = n = 178) in the marginal model. The fit of the marginal model is evaluated by comparing the observed and expected frequencies using commonly known fit statistics for contingency tables such as the likelihood ratio statistic, \(G^2\) (see Appendix A). Let C denote the number of nonredundant constraints on the frequencies in the contingency table. For large n, \(G^2\) approaches a chi-square distribution with C degrees of freedom (df = C). In the first example, it may be verified that \(G^2 = 64.352\); because there is one nonredundant constraint (i.e., \(m_{ij}^{00} - m_{ij}^{11} = 0\)), it follows that df = 1 and, as a result, p < .0001.
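To make the computation of \(G^2\) concrete, the following sketch (Python with NumPy; not part of the original analysis) fits the equal-diagonal-probabilities model to a hypothetical 2 × 2 table. For this particular constraint the maximum likelihood solution has a closed form: the off-diagonal cells are reproduced and the diagonal cells are replaced by their average. The cell counts are illustrative only, because Table 1 itself is not reproduced in the text, so the resulting \(G^2\) will not match the value reported above.

```python
import numpy as np

# Hypothetical observed 2x2 table (rows: X_i = 0, 1; columns: X_j = 0, 1); illustrative only.
n = np.array([[102.0, 18.0],
              [32.0, 26.0]])

# ML estimates under the single constraint m00 = m11: the off-diagonal cells are reproduced
# and the two diagonal cells are replaced by their average (closed-form solution for this model).
m = n.copy()
m[0, 0] = m[1, 1] = (n[0, 0] + n[1, 1]) / 2.0

# Likelihood ratio statistic; df = 1 because there is one nonredundant constraint.
G2 = 2.0 * np.sum(n * np.log(n / m))
print(round(G2, 3))
```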

The second example of a marginal model imposes equality constraints on the marginal frequencies in Table 1 by hypothesizing that \(\pi_i^1 = \pi_j^1\), which implies \(\pi_i^0 = \pi_j^0\). Estimation of this marginal model of homogeneous item popularity yields estimated expected frequencies \(\hat m_{ij}^{uv}\) such that \(\hat m_i^0 = \hat m_j^0\) and \(\hat m_i^1 = \hat m_j^1\). Table 1 (lower left-hand panel) shows the maximum likelihood estimates of the expected frequencies. It may be verified that \(G^2 = 3.973\); because there is one nonredundant constraint (i.e., \(m_i^0 - m_j^0 = 0\)), it follows that df = 1 and, as a result, p = .0462.

The third example of a marginal model imposes equality constraints on functions of the cell frequencies in Table 1 by restricting Goodman and Kruskal’s (1954) γ coefficient to a value that is hypothesized for the association between two variables in a particular study. This application is interesting because it allows us to illustrate marginal modelling in greater detail than the previous, simpler examples. Coefficient γ can be written as a function of the expected cell frequencies,

$$\gamma = {{m_{ij}^{00}m_{ij}^{11} - m_{ij}^{01}m_{ij}^{10}} \over {m_{ij}^{00}m_{ij}^{11} + m_{ij}^{01}m_{ij}^{10}}}$$

Bergsma and Croon (2005) described several interesting restrictions on γ that can be estimated using marginal models. A simple restriction is the arbitrary equality constraint γ = .8. For this marginal model the expected frequencies \(m_{ij}^{uv}\) (u, v = 0, 1) are estimated under the constraint that γ = .8. Table 1 (lower right-hand panel) shows the maximum likelihood estimates of the expected frequencies. It may be verified that \(G^2 = 3.207\); because there is one nonredundant constraint (i.e., γ − .8 = 0), it follows that df = 1 and, as a result, p = .0733.
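For constraints without a closed-form solution, such as γ = .8, the expected frequencies can be estimated numerically. The sketch below uses a generic constrained optimizer (SciPy's SLSQP) rather than the algorithm of Appendix A, and hypothetical cell counts, so it only illustrates the idea of maximizing the multinomial likelihood subject to g(m) = 0 and computing \(G^2\).

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical observed counts (n00, n01, n10, n11); illustrative only.
n = np.array([102.0, 18.0, 32.0, 26.0])
N = n.sum()

def gamma(p):
    # Goodman-Kruskal gamma written in the cell probabilities (or frequencies).
    return (p[0] * p[3] - p[1] * p[2]) / (p[0] * p[3] + p[1] * p[2])

def neg_loglik(p):
    # Multinomial log-likelihood (up to a constant), to be minimized.
    return -np.sum(n * np.log(p))

constraints = [{"type": "eq", "fun": lambda p: p.sum() - 1.0},   # probabilities sum to 1
               {"type": "eq", "fun": lambda p: gamma(p) - 0.8}]  # the marginal-model constraint

res = minimize(neg_loglik, x0=n / N, bounds=[(1e-6, 1.0)] * 4,
               method="SLSQP", constraints=constraints)
m_hat = N * res.x                          # estimated expected frequencies
G2 = 2.0 * np.sum(n * np.log(n / m_hat))   # df = 1 (one nonredundant constraint)
print(np.round(m_hat, 2), round(G2, 3))
```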

In general, marginal models can be applied to multiway contingency tables with L cells. Let n be the (L × 1) vector of observed frequencies in the contingency table, and let m be the (L × 1) vector of expected frequencies given the marginal model. It is assumed that the order of the elements in both n and m corresponds to the following ordering of the item-score patterns collected in the L × J matrix R, defined as

$${\rm{R}} = \left( {\matrix{ 0 & 0 & 0 & \cdots & 0 & 0 & 0 \cr 0 & 0 & 0 & \cdots & 0 & 0 & 1 \cr 0 & 0 & 0 & \cdots & 0 & 1 & 0 \cr 0 & 0 & 0 & \cdots & 0 & 1 & 1 \cr 0 & 0 & 0 & \cdots & 1 & 0 & 0 \cr 0 & 0 & 0 & \cdots & 1 & 0 & 1 \cr 0 & 0 & 0 & \cdots & 1 & 1 & 0 \cr \vdots & \vdots & \vdots & {} & \vdots & \vdots & \vdots \cr 1 & 1 & 1 & \cdots & 1 & 1 & 0 \cr 1 & 1 & 1 & \cdots & 1 & 1 & 1 \cr } } \right)$$
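As an aside (not part of the original text), the pattern matrix R in equation (2) is easily generated for arbitrary J; the sketch below lists all \(2^J\) item-score patterns in the required order, with the last column varying fastest.

```python
import itertools
import numpy as np

def pattern_matrix(J):
    """All 2**J dichotomous item-score patterns, ordered as in equation (2):
    the last column (least popular item) varies fastest."""
    return np.array(list(itertools.product([0, 1], repeat=J)))

print(pattern_matrix(3))
# [[0 0 0]
#  [0 0 1]
#  [0 1 0]
#  [0 1 1]
#  [1 0 0]
#  [1 0 1]
#  [1 1 0]
#  [1 1 1]]
```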

Given the ordering with respect to popularity or easiness in equation (1), the scores in the first column of R correspond to the most popular item, the scores in the second column to the next most popular item, and so on, and the scores in the last column correspond to the least popular item. Suppose that the marginal model consists of C nonredundant equality constraints, which are functions of m. The first equality constraint is denoted by g 1(m), the second by g 2(m), and the last by g C (m). Setting each function equal to zero yields g 1(m) = 0, g 2(m) = 0, …, g C (m) = 0. In vector notation these equality constraints can be written as

$${\rm{g}}({\rm{m}}) = \left( {\matrix{ {{g_1}({\rm{m}})} \cr \vdots \cr {{g_C}({\rm{m}})} \cr } } \right) = 0$$

For the first example with respect to equal diagonal probabilities (Table 1, upper right-hand panel), equation (3) equals \({\rm{g(m) = }}{{\rm{g}}_{\rm{1}}}{\rm{(m) = }}m_{ij}^{00} - m_{ij}^{11} = 0\); for the second example with respect to homogeneous item popularity (Table 1, lower left-hand panel), equation (3) equals \({\rm{g(m) = }}{{\rm{g}}_{\rm{1}}}{\rm{(m) = }}m_i^1 - m_j^1 = \left( {m_{ij}^{10} + m_{ij}^{11}} \right) - \left( {m_{ij}^{01} + m_{ij}^{11}} \right) = m_{ij}^{10} - m_{ij}^{01} = 0\); and for the third example that imposes restriction γ = .8 (Table 1, lower right-hand panel), equation (3) equals \({\rm{g(m) = }}{{\rm{g}}_{\rm{1}}}{\rm{(m)}} = {\rm{ }}\left( {m_{ij}^{00}m_{ij}^{11} - m_{ij}^{01}m_{ij}^{10}} \right)/\left( {m_{ij}^{00}m_{ij}^{11} + m_{ij}^{01}m_{ij}^{10}} \right) - .8 = 0\).

Bergsma (1997b) developed syntax for Mathematica (Wolfram, 1999) that produces maximum likelihood estimates and asymptotic standard errors for m. In the process of maximum likelihood estimation, the Jacobian of g(m) with respect to log(m) must be computed (see Appendix A). For different marginal models this Jacobian can have very different forms. Bergsma (1997a, p. 66) proposed to write the constraints in equation (3) in a single general matrix formula using a recursive exp-log notation (see also Kritzer, 1977). Once written in recursive exp-log notation, the derivation of the Jacobian is straightforward (Bergsma, 1997a, p. 68; see also Appendix A), and a simple recursive algorithm, which can be easily implemented in software, suffices to compute the Jacobian irrespective of the marginal model.

Given that A 1,…,A q are q design matrices, the general form of the recursive exp-log notation of a marginal model is

$${\rm{g}}({\rm{m}}) = {{\rm{A}}_q}\exp \left( {{{\rm{A}}_{q - 1}}\log \left( {{{\rm{A}}_{q - 2}} \ldots \exp \left( {{{\rm{A}}_2}\log \left( {{{\rm{A}}_1}{\rm{m}}} \right)} \right)} \right)} \right)$$

For a particular marginal model, the appropriate design matrices must be derived in order to write g(m) in a recursive exp-log notation. There are no explicit rules for deriving design matrices, and the same marginal model can often be written in different recursive exp-log notations. Finding the most parsimonious recursive exp-log notation may require some effort.

For the three examples of marginal models in Table 1, the expected frequencies are collected in the vector \({\rm{m}} = {\left( {m_{ij}^{00},m_{ij}^{01},m_{ij}^{10},m_{ij}^{11}} \right)^{\rm{T}}}\) (the superscript T denotes the transpose). The first example concerning equal diagonal probabilities has one design matrix, which is \({{\rm{A}}_1} = \left( {1\;\;0\;\;0\;\; - 1} \right)\), and the recursive exp-log notation of the model constraints in equation (3) is equal to

$${\rm{g}}\left( {\rm{m}} \right) = {{\rm{g}}_1}\left( {\rm{m}} \right) = {{\rm{A}}_1}{\rm{m}} = \left( {\matrix{ 1 & 0 & 0 & { - 1} \cr } } \right)\left( {\matrix{ {m_{ij}^{00}} \cr {m_{ij}^{01}} \cr {m_{ij}^{10}} \cr {m_{ij}^{11}} \cr } } \right) = m_{ij}^{00} - m_{ij}^{11} = 0$$

The second example with respect to homogeneous item popularity also has one design matrix, which is \({{\rm{A}}_1} = \left( {0\;\;1\;\; - 1\;\;0} \right)\). The recursive exp-log notation of equation (3) is \({{\rm{A}}_1}{\rm{m}} = 0\), which results in \(m_{ij}^{01} - m_{ij}^{10} = 0\), or equivalently \(m_i^1 - m_j^1 = 0\).

For the third example that imposes γ = .8 upon the table, the design matrices were derived by Bergsma and Croon (2005), who showed that γ = A 5.exp(A 4.log(A 3.exp(A 2.log(A 1.m)))), with

$${{\rm{A}}_1} = {{\rm{I}}_{4 \times 4}},{{\rm{A}}_2} = \left( {\matrix{ 1 & 0 & 0 & 1 \cr 0 & 1 & 1 & 0 \cr } } \right),{{\rm{A}}_3} = \left( {\matrix{ 1 & 0 \cr 0 & 1 \cr 1 & 1 \cr } } \right),{{\rm{A}}_4} = \left( {\matrix{ 1 & 0 & { - 1} \cr 0 & 1 & { - 1} \cr } } \right),{{\rm{A}}_{\rm{5}}} = \left( {\matrix{ 1 & { - 1} \cr } } \right)$$

Hence the recursive exp-log notation of equation (3) is

$${\rm{g}}({\rm{m}}) = {{\rm{g}}_{\rm{1}}}({\rm{m}}) = {{\rm{A}}_{\rm{5}}}.\exp \left( {{{\rm{A}}_{\rm{4}}}.\log \left( {{{\rm{A}}_{\rm{3}}}.\exp \left( {{{\rm{A}}_{\rm{2}}}.\log \left( {{{\rm{A}}_{\rm{1}}}.{\rm{m}}} \right)} \right)} \right)} \right) - {\rm{0}}.{\rm{8}} = {\rm{0}}$$

In Appendix A it is shown how maximum likelihood estimates of m are obtained subject to the constraints in equation (3), when these constraints are written in the recursive exp-log notation of equation (4).
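The design matrices in equation (4) are easy to check numerically. The sketch below (illustrative m only) evaluates γ through the recursive exp-log expression with the matrices given above for the third example and compares the result with the direct formula for γ; the two values coincide.

```python
import numpy as np

# Design matrices for gamma (Bergsma & Croon, 2005), as given in the text.
A1 = np.eye(4)
A2 = np.array([[1, 0, 0, 1],
               [0, 1, 1, 0]])
A3 = np.array([[1, 0],
               [0, 1],
               [1, 1]])
A4 = np.array([[1, 0, -1],
               [0, 1, -1]])
A5 = np.array([[1, -1]])

def gamma_explog(m):
    """gamma = A5 exp(A4 log(A3 exp(A2 log(A1 m))))."""
    return (A5 @ np.exp(A4 @ np.log(A3 @ np.exp(A2 @ np.log(A1 @ m)))))[0]

def gamma_direct(m):
    m00, m01, m10, m11 = m
    return (m00 * m11 - m01 * m10) / (m00 * m11 + m01 * m10)

m = np.array([102.0, 18.0, 32.0, 26.0])  # hypothetical (m00, m01, m10, m11)^T
print(gamma_explog(m), gamma_direct(m))  # identical values
```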

3. Mokken Scale Analysis

The main purpose of this study is to use marginal models and the recursive exp-log notation to test hypotheses about scalability coefficients in the context of Mokken scale analysis (Mokken, 1971; Sijtsma & Molenaar, 2002). Before we explain this application, we introduce, in turn, the monotone homogeneity model, the scalability coefficients, the relationships between the monotone homogeneity model and the scalability coefficients, the definition of a scale, two types of Mokken scale analysis, and some existing results for the distribution of the scalability coefficients.

3.1. The Monotone Homogeneity Model

The monotone homogeneity model (Mokken, 1971, Chap. 4; Sijtsma & Molenaar, 2002, pp. 22–23; Sijtsma & Meijer, 2007) is a nonparametric item response theory (IRT) model for ordinal person measurement (related theory was developed, e.g., by Molenaar, 1997; Ramsay, 1991; Scheiblechner, 2007; and Stout, 1990). Before we discuss the assumptions of this model, first we introduce some notation. Let θ denote the latent variable underlying performance on each of the items in the test. Let the probability of obtaining score x j on item j be denoted by P(X j = x j |θ). This conditional response probability is known as the item response function (IRF). Further, let the joint probability of a particular score pattern on the J items in the test be denoted by P(X 1 = x 1, …, X J = x J |θ). The monotone homogeneity model is based on the following three assumptions.

Unidimensionality. The responses to the items are driven by a unidimensional latent variable denoted θ.

Local Independence. The joint distribution of the item scores conditional on θ can be written as the product of the J conditional marginal distributions: \(P({X_1} = {x_1}, \ldots ,{X_J} = {x_J}\mid\theta ) = \prod\nolimits_{j = 1}^J {P({X_j} = {x_j}\mid\theta )} \).

Monotonicity. As latent variable θ increases, the probability of a positive response to an item increases or stays the same across intervals of θ; that is, for two values of θ, say, θ a and θ b , and arbitrarily assuming that θ a < θ b , monotonicity means that \(P({X_j} = 1\mid\theta = {\theta _a}) \le P({X_j} = 1\mid\theta = {\theta _b})\) for j = 1, …, J.

For dichotomous items, the monotone homogeneity model implies the stochastic ordering of latent variable θ by total score X +; that is, for an arbitrary value t of θ, the probability P(θ > t|X + = x +) is nondecreasing in x + (Hemker, Sijtsma, Molenaar, & Junker, 1997; also, see Grayson, 1988). This property guarantees an ordinal person scale: Persons with higher X + scores on average have higher θ values.

Mokken (1971, pp. 119–120) showed that for a J-item test the monotone homogeneity model implies that all interitem covariances or, equivalently, all interitem product-moment correlations, are nonnegative. Let σ ij denote the covariance between items i and j; then, the monotone homogeneity model implies

$${\sigma _{ij}} \ge 0{\rm{\ \ for\ all\ }}i < j$$

Equation (5) is used throughout. Nonnegative interitem covariance is a special case of a more general interitem covariance result, known as conditional association, and proven to be true by Holland and Rosenbaum (1986) under more general conditions—multidimensional latent variables and continuous item scores, and local independence and monotonicity adapted to these conditions. In Holland and Rosenbaum’s (1986) conditional association framework, nonnegative interitem covariance in equation (5) is referred to as pairwise nonnegative association (Ellis & Van den Wollenberg, 1993). Other observable consequences, such as manifest monotonicity (Junker & Sijtsma, 2000), can be used to test the monotonicity assumption, but like conditional association (except pairwise nonnegative association) they do not play a role in this study.
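As a simple data check (not part of the original text), the observable consequence in equation (5) can be inspected directly: all interitem covariances in the sample should be nonnegative (small negative values may still occur by sampling error). The sketch below simulates data under logistic IRFs, which satisfy the monotone homogeneity model, and checks the sample covariances.

```python
import numpy as np

def all_covariances_nonnegative(X, tol=1e-8):
    """Check equation (5) in the sample: every interitem covariance of the
    n x J binary data matrix X should be nonnegative (up to tolerance)."""
    S = np.cov(X, rowvar=False)
    off_diagonal = S[~np.eye(S.shape[0], dtype=bool)]
    return bool(np.all(off_diagonal >= -tol))

rng = np.random.default_rng(1)
theta = rng.normal(size=500)                          # latent variable
deltas = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])        # hypothetical item locations
P = 1.0 / (1.0 + np.exp(-(theta[:, None] - deltas)))  # monotone IRFs
X = (rng.uniform(size=P.shape) < P).astype(int)
print(all_covariances_nonnegative(X))
```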

3.2. Scalability Coefficients

The Guttman (1950) model is the basis of the scalability coefficients H ij , H j , and H (Mokken, 1971; cf. Loevinger, 1948). Given an ordering of the J items according to decreasing popularity (equation (1)), the Guttman model assumes that a respondent who endorses the less popular item in a pair of items also endorses the more popular item. Thus, if \(\pi _i^1 > \pi _j^1\), for any respondent the Guttman model excludes the item-score pattern (X i , X j ) = (0, 1). This item-score pattern is called a Guttman error, and the other three item-score patterns [(0, 0), (1, 0), and (1, 1)] are called conformal patterns. Data that do not contain Guttman errors are in agreement with the Guttman model.

In a 2 × 2 contingency table for the scores on items i and j (with \(\pi _i^1 > \pi _j^1\) and sample size n), the expected number of Guttman errors, denoted \(F_{ij}\), equals \({F_{ij}} = n \times \pi _{ij}^{01}\), and the expected number of Guttman errors under marginal independence, denoted by \(E_{ij}\), equals \({E_{ij}} = n \times \pi _i^0 \times \pi _j^1\). The scalability coefficient for items i and j, denoted by \(H_{ij}\), is computed from

$${H_{ij}} = 1 - {{{F_{ij}}} \over {{E_{ij}}}} = 1 - {{\pi _{ij}^{01}} \over {\pi _i^0 \times \pi _j^1}} = 1 - {{n \times m_{ij}^{01}} \over {m_i^0 \times m_j^1}}$$

For the example in Table 1 (upper left-hand panel), \({\hat F_{ij}} = 18\) and \({\hat E_{ij}} = 29.663\), yielding \({\hat H_{ij}} = .3932\). To facilitate its interpretation, coefficient H ij can be written as a normed covariance (e.g., Sijtsma & Molenaar, 2002, p. 55). Let \(\sigma _{ij}^{\max }\) be the maximum covariance between items i and j, given the marginal distributions of X i and X j . Given that items i and j have positive variance, equation (6) is equal to

$${H_{ij}} = {{{\sigma _{ij}}} \over {\sigma _{ij}^{\max }}}$$

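To illustrate equations (6) and (7), \(H_{ij}\) can be computed directly from a 2 × 2 table of counts; the sketch below is illustrative only, and the diagonal split of the table is just one table consistent with the marginals used in the text.

```python
import numpy as np

def H_ij_from_counts(n00, n01, n10, n11):
    """Item-pair scalability coefficient H_ij (equation (6)) for a 2x2 table in which
    item i (rows) is the more popular item; n01 counts Guttman errors (X_i, X_j) = (0, 1)."""
    N = n00 + n01 + n10 + n11
    F = n01                            # observed number of Guttman errors
    E = (n00 + n01) * (n01 + n11) / N  # expected number of errors under marginal independence
    return 1.0 - F / E

# With the marginals used in the text (n_i^1 = 58, n_j^1 = 44, n = 178) and 18 Guttman errors,
# this reproduces H_ij = 1 - 18/29.663 = .3932; the diagonal split (102, 26) is hypothetical.
print(round(H_ij_from_counts(102, 18, 32, 26), 4))
```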

The scalability coefficient for an individual item j, denoted H j , j = 1, …, J (Mokken, 1971, p. 151), is defined as

$${H_j} = 1 - {{\sum\nolimits_{i \ne j} {{F_{ij}}} } \over {\sum\nolimits_{i \ne j} {{E_{ij}}} }} = 1 - {{n\left( {\sum\nolimits_{i = 1}^{j - 1} {m_{ij}^{01} + \sum\nolimits_{i = j + 1}^J m _{ji}^{01}} } \right)} \over {\sum\nolimits_{i = 1}^{j - 1} {m_i^0m_j^1 + \sum\nolimits_{i = j + 1}^J {m_j^0m_i^1} } }}$$

Coefficient H j can also be written in terms of interitem covariances and corresponding maximum covariances, given the marginal distributions of the item scores, as

$${H_j} = {{\sum\nolimits_{i \ne j} {{\sigma _{ij}}} } \over {\sum\nolimits_{i \ne j} {\sigma _{ij}^{\max }} }}$$

Let rest score R (j) be defined as the total score on the J − 1 items excluding item j; then one can also write (Sijtsma & Molenaar, 2002, p. 57)

$${H_j} = {{{\sigma _{{X_j}{R_{(j)}}}}} \over {\sigma _{{X_j}{R_{(j)}}}^{\max }}}$$

Equation (9) shows that coefficient H j expresses the strength of the relationship between item j and the other items in the test, comparable with a regression coefficient in a regression model.

For a set of J items, Mokken (1971, p. 149) proposed the total-scale coefficient H, which is defined as

$$H = 1 - {{\sum\nolimits_{i = 1}^{J - 1} {\sum\nolimits_{j = i + 1}^J {{F_{ij}}} } } \over {\sum\nolimits_{i = 1}^{J - 1} {\sum\nolimits_{j = i + 1}^J {{E_{ij}}} } }} = 1 - {{n\left( {\sum\nolimits_{i = 1}^{J - 1} {\sum\nolimits_{j = i + 1}^J {m_{ij}^{01}} } } \right)} \over {\sum\nolimits_{i = 1}^{J - 1} {\sum\nolimits_{j = i + 1}^J {m_i^0m_j^1} } }}$$

Coefficient H can also be written in terms of interitem covariances and item rest-score covariances, which results in

$$H = {{\sum\nolimits_{i = 1}^{J - 1} {\sum\nolimits_{j = i + 1}^J {{\sigma _{ij}}} } } \over {\sum\nolimits_{i = 1}^{J - 1} {\sum\nolimits_{j = i + 1}^J {\sigma _{ij}^{\max }} } }} = {{\sum\nolimits_{j = 1}^J {{\sigma _{{X_j}{R_{(j)}}}}} } \over {\sum\nolimits_{j = 1}^J {\sigma _{{X_j}{R_{(j)}}}^{\max }} }}$$

If the data obey a perfect Guttman scalogram, H = 1, but this value is never found in practice.

Sijtsma and Molenaar (2002, Theorem 4.2; see also Hemker, Sijtsma, & Molenaar, 1995) showed that H ij , H j , and H are related such that

$$\mathop {\min }\limits_{i,j} \left( {{H_{ij}}} \right) \le \mathop {\min }\limits_j \left( {{H_j}} \right) \le H \le \mathop {\max }\limits_j \left( {{H_j}} \right) \le \mathop {\max }\limits_{i,j} \left( {{H_{ij}}} \right)$$

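For a complete data matrix, the coefficients and inequality (11) can be checked directly from the covariance formulation. The sketch below uses simulated data and the standard expression for the maximum covariance of two dichotomous items given their marginals, \(\min (\pi _i^1,\pi _j^1) - \pi _i^1\pi _j^1\), which is not spelled out above; it returns all \(H_{ij}\), \(H_j\), and H.

```python
import numpy as np

def scalability_coefficients(X):
    """H_ij, H_j, and H for an n x J binary data matrix X, using the covariance
    formulation with population (biased) covariances."""
    n, J = X.shape
    p = X.mean(axis=0)
    S = (X.T @ X) / n - np.outer(p, p)               # interitem covariances sigma_ij
    Smax = np.minimum.outer(p, p) - np.outer(p, p)   # maximum covariances given the marginals
    np.fill_diagonal(S, 0.0)
    np.fill_diagonal(Smax, 0.0)
    iu = np.triu_indices(J, 1)
    Hij = S[iu] / Smax[iu]                           # K = J(J-1)/2 item-pair coefficients
    Hj = S.sum(axis=1) / Smax.sum(axis=1)            # item coefficients
    H = S[iu].sum() / Smax[iu].sum()                 # total-scale coefficient
    return Hij, Hj, H

rng = np.random.default_rng(0)
theta = rng.normal(size=1000)
deltas = np.array([-1.0, 0.0, 1.0])                  # hypothetical item locations
X = (rng.uniform(size=(1000, 3)) < 1 / (1 + np.exp(-(theta[:, None] - deltas)))).astype(int)
Hij, Hj, H = scalability_coefficients(X)
print(Hij, Hj, H)   # equation (11): min(Hij) <= min(Hj) <= H <= max(Hj) <= max(Hij)
```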

3.3. Relationships Between the Monotone Homogeneity Model and the Scalability Coefficients

The monotone homogeneity model implies observable consequences with respect to the scalability coefficients H ij , H j , and H. These observable consequences are used in data analysis to investigate whether the data support the fit of the monotone homogeneity model (Mokken, 1971; Sijtsma & Molenaar, 2002; Sijtsma & Meijer, 2007).

In particular, Mokken (1971, pp. 148–153; see also Sijtsma & Molenaar, 2002, Theorem 4.3) showed that the monotone homogeneity model implies that

$$\matrix{ {0 \le {H_{ij}} \le 1} & {{\rm{for all }}i < j,} \cr {0 \le {H_j} \le 1} & {{\rm{for all }}j{\rm{, and }}} \cr {0 \le H \le 1} & {} \cr } $$

Thus, negative scalability coefficients are in conflict with the monotone homogeneity model. These observable consequences are the basis of Mokken scale analysis.

3.4. Definition of a Scale and Two Types of Mokken Scale Analysis

3.4.1. Definition of a Scale

A set of items is a scale (Mokken, 1971, p. 184; Molenaar & Sijtsma, 2000; Sijtsma & Molenaar, 2002, p. 68), in this study called a Mokken scale, if, for product-moment correlation ρ and for any constant value 0 < c ≤ 1,

$${\rho _{ij}} > 0{\rm{ }}({\rm{or, equivalently}},{H_{ij}} > 0){\rm{ for all }}i < j,{\rm{ and}}$$
$${H_j} \ge c > 0{\rm{ for all }}j$$

Equation (13) is the first criterion of a Mokken scale, and equation (14) is the second criterion of a Mokken scale. Compared to equations (5) and (12), strict inequality is not crucial here due to continuity of the scales of ρ and H j . Except for the strict inequalities, the monotone homogeneity model implies both equation (13) and H j > 0 (which is part of equation (14)).

However, the monotone homogeneity model does not imply a specific positive value of c. Thus, the inclusion of positive c in the definition of a Mokken scale can be a source of confusion and needs to be explained. To understand the role of positive c, one may note that the monotone homogeneity model, and special cases of this model such as the one-, two-, and three-parameter logistic models, allow items in a scale which have (nearly) flat IRFs. Such items contribute little, if anything, to a reliable person ordering and may even attenuate the reliability of this ordering; thus, these items are unwanted in a scale. The inclusion of a positive c in the definition of a Mokken scale prevents the selection of such items in a scale by rejecting items with H j s which are smaller than c. Thus, Mokken scale analysis aims to produce “high-quality” scales, the definition of which depends on the researcher’s choice of lower bound c.

Mokken (1971, p. 184) proposed to always set c at least to .3. One may note that equation (11) implies that H ≥ min j (H j ); thus, for lower bound c = .3, the total-scale H ≥ .3. The choice of c controls the quality of the individual items in the scale and of the total scale and, therefore, of the total-scale score X + for ordering persons on latent variable θ. Mokken (1971, p. 185) proposed the following rules of thumb for the interpretation of H. A set of items is unscalable for all practical purposes if H < .3; and a scale is considered weak if .3 ≤ H < .4, moderate if .4 ≤ H < .5, and strong if H ≥ .5.

3.4.2. Two Types of Mokken Scale Analysis

Mokken scale analysis can have two forms (Mokken, 1971, pp. 187–199). The first possibility is that the researcher evaluates a given set of J items with respect to the definition of a scale for a chosen value of c. This is confirmatory Mokken scale analysis. The second possibility is to use an automated item selection algorithm (Mokken, 1971, pp. 190–199; Sijtsma & Molenaar, 2002, Chap. 5). This algorithm selects items one by one to obtain one or more scales (depending on the data structure) that agree with the definition of a Mokken scale. In each selection step, the item chosen from the items not already selected is the one that not only agrees with equations (13) and (14) but also produces the greatest total-scale H coefficient with the items already selected in previous steps. This is exploratory Mokken scale analysis.

In the remainder of this paper we discuss the use of marginal modelling for testing hypotheses about the scalability coefficients. The term Mokken scale analysis refers to the use of scalability coefficients for scale construction both in a confirmatory and in an exploratory context.

3.5. Results for the Distribution of the Scalability Coefficients

Results for the distribution of the scalability coefficients are available for the null case (which refers to the null hypothesis that H = 0) and the nonnull case (which refers to the null hypothesis that H = w, where w is some positive constant) (Mokken, 1971, pp. 160–169). Results for the null case (Mokken, 1971, pp. 160–164) are the following. Let S ij be the sample covariance of items i and j, and let S i and S j be the sample standard deviations of items i and j, respectively; then for large n, in the null case, the statistics

$${Z_{ij}} = {{{S_{ij}}} \over {{S_i}{S_j}}}\sqrt {n - 1} $$

,

$${Z_j} = {{\sum\nolimits_{i \ne j} {{S_{ij}}} } \over {{S_j}\sum\nolimits_{i \ne j} {{S_i}} }}\sqrt {n - 1} $$

, and

$$Z = {{\sum\nolimits_{i = 1}^{J - 1} {\sum\nolimits_{j = i + 1}^J {{S_{ij}}} } } \over {\sum\nolimits_{i = 1}^{J - 1} {\sum\nolimits_{j = i + 1}^J {{S_i}{S_j}} } }}\sqrt {n - 1} $$

, converge to a standard normal distribution. In the available software for Mokken scale analysis, H ij = 0 is tested against the alternative that H ij > 0 to decide whether items satisfy the first criterion of a Mokken scale that ρ ij > 0 (equation (13)). Results for the nonnull case yield asymptotic standard errors for ^H (Mokken, 1971, pp. 164–169). These results are not available in current software for Mokken scale analysis.

4. A Marginal Modelling Approach to the Scalability Coefficients

Coefficient H ij can be written in the recursive exp-log notation, which is useful for testing hypotheses involving H ij . Let \({\rm{m}} = {\left( {m_{ij}^{00},m_{ij}^{01},m_{ij}^{10},m_{ij}^{11}} \right)^{\rm{T}}}\), and let A 1 and A 2 be the following design matrices:

$${{\rm{A}}_1} = \left( {\matrix{ 1 & 1 & 1 & 1 \cr 1 & 1 & 0 & 0 \cr 0 & 1 & 0 & 1 \cr 0 & 1 & 0 & 0 \cr } } \right){\rm{ and }}{{\rm{A}}_{\rm{2}}}{\rm{ = }}\left( {\matrix{ 1 & { - 1} & { - 1} & 1 \cr } } \right)$$

Then, H ij in equation (6) equals

$${H_{ij}} = 1 - \exp \left( {{{\rm{A}}_2}\log \left( {{{\rm{A}}_1}{\rm{m}}} \right)} \right)$$

This can be verified by writing the term log(A 1 m) in equation (15) as

$$\log ({{\rm{A}}_1}{\rm{m}}) = \log \left[ {\left( {\matrix{ 1 & 1 & 1 & 1 \cr 1 & 1 & 0 & 0 \cr 0 & 1 & 0 & 1 \cr 0 & 1 & 0 & 0 \cr } } \right) \cdot \left( {\matrix{ {m_{ij}^{00}} \cr {m_{ij}^{01}} \cr {m_{ij}^{10}} \cr {m_{ij}^{11}} \cr } } \right)} \right] = \log \left( {\matrix{ n \cr {m_i^0} \cr {m_j^1} \cr {m_{ij}^{01}} \cr } } \right)$$

, and noting that

$$\exp \left( {{{\rm{A}}_2}\log \left( {{{\rm{A}}_1}{\rm{m}}} \right)} \right) = \exp \left[ {\left( {\matrix{ 1 & { - 1} & { - 1} & 1 \cr } } \right)\log \left( {\matrix{ n \cr {m_i^0} \cr {m_j^1} \cr {m_{ij}^{01}} \cr } } \right)} \right] = {{n \times m_{ij}^{01}} \over {m_i^0 \times m_j^1}}$$

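The equivalence of equation (15) with equation (6) can also be checked numerically; the sketch below (hypothetical expected frequencies) evaluates both expressions with the design matrices A 1 and A 2 given above.

```python
import numpy as np

A1 = np.array([[1, 1, 1, 1],
               [1, 1, 0, 0],
               [0, 1, 0, 1],
               [0, 1, 0, 0]])
A2 = np.array([[1, -1, -1, 1]])

def H_ij_explog(m):
    """Equation (15): H_ij = 1 - exp(A2 log(A1 m)), with m = (m00, m01, m10, m11)^T."""
    return 1.0 - np.exp(A2 @ np.log(A1 @ m))[0]

def H_ij_direct(m):
    """Equation (6), written directly in the expected frequencies."""
    m00, m01, m10, m11 = m
    return 1.0 - m.sum() * m01 / ((m00 + m01) * (m01 + m11))

m = np.array([102.0, 18.0, 32.0, 26.0])  # hypothetical expected frequencies
print(H_ij_explog(m), H_ij_direct(m))    # identical values
```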

In the case of J items, there are K = J(J − 1)/2 item pairs; hence, there are K coefficients H ij . The recursive exp-log notation for the vector H ij = (H 12 , H 13 , …, H J−1,J )T containing all K item-pair coefficients H ij (i < j) is derived in Appendix C.

Based on previous results, a researcher may have reason to believe that for two particular key items in a test H ij = w, with 0 ≤ w < 1, and (s)he may wish to test this hypothesis on a sample from another population. Using the recursive exp-log notation for H ij (equation (15)), it may be verified that the marginal model imposing H ij = w on the contingency table has one nonredundant constraint, which can be written in terms of equation (3) as

$${g_1}(m) = 1 - w - \exp \left( {{{\rm{A}}_2}\log \left( {{{\rm{A}}_1}{\rm{m}}} \right)} \right) = 0$$

For the observed frequencies in Table 1 (upper left-hand panel), choosing w = .5 as an example, the marginal model with constraint H ij = .5 yields the estimated expected frequencies shown in Table 2. This results in \(G^2 = 1.2207\), df = 1, and p = .2692.

Table 2 Estimated expected frequencies for the data in Table 1 under the marginal model imposing H ij = .5 on the table.

Item coefficient H j can be written in a recursive exp-log notation, which is derived in Appendix D for the vector H j = (H 1,H 2,…, H J )T containing all H j s. Total-scale coefficient H can be written in a recursive exp-log notation, which is derived in Appendix E.

5. Hypotheses in Mokken Scale Analysis

The use of marginal modelling for testing hypotheses in Mokken scale analysis is illustrated by means of the binary data from 484 children who were administered a 25-item balance-task test (Van Maanen et al., 1989). It was hypothesized that the tasks could be divided into five dimensionally different subscales based on the type of task. The subscales are named Distance, Weight, Conflict Weight, Conflict Balance, and Conflict Distance. For a convenient presentation in the tables, in each of the five scales the items are numbered 1, …, 5. Table 3 shows the proportions-correct (i.e., the \(\hat \pi _j^1\) values) of the 25 items.

Table 3 \(\hat \pi _j^1\) values for each of the five balance-task scales.

5.1. Testing the First Criterion of a Mokken Scale

The first criterion of a Mokken scale is ρ ij > 0 for all i < j, which is identical to H ij > 0 for all i < j (equation (13)). In this section it is explained how marginal modelling can be used to test the global hypothesis that all K item-pair coefficients H ij are 0. This global test is a novel statistical tool in Mokken scale analysis. To appreciate its usefulness, first we discuss the exploratory analysis and then the confirmatory analysis. In doing this, we only discuss details of exploratory Mokken scale analysis that are relevant here, and skip many other details.

For exploratory Mokken scale analysis, assuming that already r − 1 items have been selected into a scale (and without worrying how this has been accomplished; for the details, see Mokken, 1971, pp. 190–199; Sijtsma & Molenaar, 2002, Chap. 5), the rth candidate item for selection must have positive correlations (or, equivalently, positive pairwise scalability coefficients) with each of the r − 1 items already selected (Mokken, 1971, p. 192, third step). This requirement assures us that the first criterion (equation (13)) of a scale is satisfied for the r items selected thus far. If, for the rth item, each of the r − 1 item-pair coefficients is significantly greater than 0, the first criterion is satisfied, and if this result is also found for other candidate items, each of these items remains in competition to be included in the scale (which of these candidates eventually is the rth item to be selected depends on the second criterion (equation (14)) and other decision rules not discussed here).

The tests of H ij = 0 against H ij > 0 are conducted by testing the marginal independence of X i and X j . This is a simple procedure which can be done with little computational effort. The type I error rate is controlled by a Bonferroni correction, which is very conservative here because the test statistics are dependent, and because tests are accumulated across different steps in the automated item selection algorithm (Mokken, 1971, pp. 196–198).

In confirmatory Mokken scale analysis, the researcher has to test the first criterion for each item pair separately, but here we propose to use a marginal model to test for all K H ij coefficients simultaneously whether they are equal to zero, thus circumventing the Bonferroni correction. Formally, H ij = (H 12, H 13, …, H J−1,J )T contains all K coefficients H ij (i < j). If the global null hypothesis that H ij = 0 is rejected, the researcher has to check next whether the sample values of the item-pair scalability coefficients are positive; that is, whether ^H ij > 0. Only the combination of a rejected global null hypothesis and positive sample H ij s leads to the conclusion that the first criterion (equation (13)) of a Mokken scale is satisfied. If not all sample H ij s are positive, the next step is to identify items that may be rejected from the scale. This is done in the same way as when the global null hypothesis that H ij = 0 is not rejected. We suggest identifying candidate items for rejection by testing for separate item pairs H ij = 0 against the alternative that H ij > 0, just as with the exploratory procedure. Item pairs for which the null hypothesis is not rejected are identified, and for each item involved in such a pair it is counted how often it is involved in negative ^H ij s with other items. Items that are frequently involved in negative sample item-pair scalability coefficients are candidates for removal from the test. We now concentrate on the new global test, based on marginal modelling, that H ij = 0.

Let u K denote a vector of length K that contains 1s, and let A 1 and A 2 be design matrices (derived in Appendix C). In Appendix C it is shown that

$${{\rm{H}}_{ij}} = {{\rm{u}}_K} - \exp \left( {{{\rm{A}}_2}\log \left( {{{\rm{A}}_1}{\rm{m}}} \right)} \right)$$

Hence, the recursive exp-log notation of the K restrictions (see equation (3)) for marginal model H ij = 0 is

$${\rm{g}}({\rm{m}}) = {{\rm{u}}_K} - \exp \left( {{{\rm{A}}_2}\log \left( {{{\rm{A}}_1}{\rm{m}}} \right)} \right) = {0_K}$$

If the marginal model in equation (17) is rejected and if in the sample ^H ij > 0 for all i < j, then the first criterion (equation (13)) for a Mokken scale is met for all J items.

One advantage of this global test is that it does not require a Bonferroni correction. Another advantage is that it allows the first criterion for a Mokken scale to be strengthened, for example, by requiring that all H ij s are greater than a positive value d so as to avoid values of H ij close to 0. Values close to 0 may allow undesirable multidimensionality in a scale, and are not excluded by the second criterion for a Mokken scale, H j ≥ c > 0 for all j (equation (14)). What is a reasonable choice for d? Because, by equation (11), we have that min i,j (H ij ) ≤ H j ≤ max i,j (H ij ), it seems reasonable to choose an a priori lower bound d for H ij smaller than c. In this example, we arbitrarily set d = .1.

Let d K be a vector of length K with all elements equal to d. Then the marginal model equals H ij = d K . Using the recursive exp-log notation for H ij in equation (16), it may be verified that the recursive exp-log notation of the K restrictions (see equation (3)) for this marginal model is

$${\rm{g}}({\rm{m}}) = {{\rm{u}}_K} - \exp \left( {{{\rm{A}}_2}\log \left( {{{\rm{A}}_1}{\rm{m}}} \right)} \right) - {{\rm{d}}_K} = {0_K}$$


The marginal model with d = 0 (equation (17)) and the stronger marginal model with d = .1 (equation (18)) were tested on the balance-scale data. For each balance scale, the ^H ij s and their standard errors, and the likelihood ratio statistic G 2 and corresponding p-value, are shown in Table 4. For d = 0, using α = .05 the null model was rejected for all scales and, in addition, all sample ^H ij s were found to be greater than zero. Thus, the first criterion of the Mokken scale (ρ ij > 0; equation (13)) was assumed to be satisfied. For d = .1, implying the statistical test that simultaneously all H ij > .1, four scales were found to satisfy this more demanding criterion but for the Conflict Balance scale the marginal model in equation (18) was not rejected.

Table 4 Estimated scalability coefficients ^H ij with standard errors between parentheses for each of the five balance-task scales (upper panel); fit statistics (G 2, p-value) for the marginal model defining H ij = 0 for i = 1, …, 4; j = i + 1, …, 5 (middle panel); and fit statistics for the marginal model defining H ij = .1 for i = 1, …, 4; j = i + 1, …, 5 (lower panel).

5.2. Testing the Second Criterion of a Mokken Scale

The second criterion of a Mokken scale is that H j ≥ c > 0 for all j = 1, …, J (equation (14)). The current practice is that for each item the null hypothesis is tested that H j = 0. When this null hypothesis is rejected, it is checked in the data whether ^H j exceeds lower bound c. If for each item the null hypothesis is rejected and ^H j > c for all j, the second criterion for a Mokken scale is assumed to be satisfied. Currently, there is no test available for the null hypothesis that H j = c against the alternative that H j > c and, sometimes, when the automated item selection procedure is used, an item scalability coefficient is greater than c when the item enters the scale, but then drops below c as subsequent items enter the scale (e.g., Sijtsma & Molenaar, 2002, pp. 79–80).

The marginal modelling approach offers a solution. A marginal model may be tested in which, simultaneously, all H j = c. Let H j = (H 1 , …, H J )T contain all H j s, and let c J be a vector of length J with all elements equal to lower bound c. The marginal model is then H j = c J . If the marginal model is rejected and all sample ^H j s exceed c, the second criterion is assumed to be satisfied.

Let A 1, A 2, A 3, and A 4 be design matrices (derived in Appendix D). Appendix D shows that

$${{\rm{H}}_j} = {{\rm{u}}_J} - \exp \left( {{{\rm{A}}_4}\log \left( {{{\rm{A}}_3}\exp \left( {{{\rm{A}}_2}\log \left( {{{\rm{A}}_1}{\rm{m}}} \right)} \right)} \right)} \right)$$

Using the recursive exp-log notation for H j in equation (19), it may be verified that the recursive exp-log notation of the J restrictions (see equation (3)) for the marginal model is

$${\rm{g}}({\rm{m}}) = {{\rm{u}}_J} - \exp \left( {{{\rm{A}}_4}\log \left( {{{\rm{A}}_3}\exp \left( {{{\rm{A}}_2}\log \left( {{{\rm{A}}_1}{\rm{m}}} \right)} \right)} \right)} \right) - {{\rm{c}}_J} = {0_J}$$


The marginal model in equation (20) with c = .3 (which is the default value in software for Mokken scale analysis) and the marginal model with the more demanding criterion c = .4 were tested on the balance-task data. For each balance-task scale, Table 5 shows the estimates of the H j s and their standard errors, and the likelihood ratio statistic G 2 and corresponding p-value. For lower bound c = .3, for four scales the marginal model was rejected. In addition, all the ^H j s exceeded .3. Thus, the four scales meet the second criterion of a Mokken scale. The exception was the Conflict Balance scale, for which the marginal null model was not rejected. Thus, Conflict Balance does not meet the second criterion of a Mokken scale.

Table 5 Estimated scalability coefficients ^H j with standard errors between parentheses for each of the five balance-task scales (upper panel); fit statistics (G 2, p-value) for the marginal model defining H j = .3 for j = 1, …, 5 (middle panel); and fit statistics for the marginal model defining H j = .4 for j = 1, …, 5 (lower panel).

For c = .4, for the Distance scale the marginal model was not rejected, and for the Conflict Balance scale this marginal model was rejected but all ^H j s were smaller than .4. Thus, for these two scales the more demanding second criterion of a Mokken scale was not satisfied. For the other three scales, the null model was rejected and all ^H j s exceeded .4; hence, the more demanding second criterion of a Mokken scale was satisfied.

5.3. Testing the Strength of the Scale

Testing the strength of the scale can be considered equivalent with testing for the total-scale coefficient that H ≤ c against the alternative that H > c. If the null model is rejected for c = .3 and if in the sample ^H > .3, then the scale can be considered to be at least a weak scale; if the null model is rejected for c = .4 and if ^H > .4, then the scale can be considered to be at least a moderate scale; and if the null model is rejected for c = .5 and if ^H > .5, then the scale can be considered to be a strong scale. The statistical test can be performed using the asymptotic standard errors derived by Mokken (1971, pp. 164–169). From the asymptotic standard errors a (1 − α)% confidence interval is constructed, and if c exceeds the upper bound of the confidence interval, the null hypothesis is rejected. This test is not available in the current software.

Alternatively, the test may be conducted using a marginal model. Let A 1, A 2, A 3, and A 4 be design matrices. These matrices are derived in Appendix E. Appendix E shows that H can be written as

$$H = 1 - \exp \left( {{{\rm{A}}_4}\log \left( {{{\rm{A}}_3}\exp \left( {{{\rm{A}}_2}\log \left( {{{\rm{A}}_1}{\rm{m}}} \right)} \right)} \right)} \right)$$

Using equation (21) it can be verified that the recursive exp-log notation of the restriction (see equation (3)) in the null model is

$${{\rm{g}}_1}({\rm{m)}} = 1 - \exp \left( {{{\rm{A}}_4}\log \left( {{{\rm{A}}_3}\exp \left( {{{\rm{A}}_2}\log \left( {{{\rm{A}}_1}{\rm{m}}} \right)} \right)} \right)} \right) - c = 0$$

It may be noted that, in principle, in equation (22) lower bound c may be replaced by any constant w > 0.

The marginal models with c = .3, c = .4, and c = .5 (equation (22)) were tested on the balance-scale data. For each scale, Table 6 shows the estimate of coefficient H and its standard error, and the likelihood ratio statistic G 2 and corresponding p-value. Using the rules of thumb for the interpretation of values of H, Weight and Conflict Distance were strong scales, Conflict Weight a moderate scale, Distance a weak scale, and Conflict Balance was found to be unscalable.

Table 6 For each of the five scales of the balance-task test: the estimated scalability coefficient ^H with standard error between parentheses (first row); and the fit statistics (G 2, p-value) for the marginal models defining H = .3, H = .4, and H = .5.

5.4. Testing Equality of Item Coefficients

Coefficient H j expresses the contribution of item j to the ordering of respondents by means of total score X +. Thus, it can be argued that coefficient H j is a nonparametric IRT analogue to the discrimination power of an item (Van Abswoude, Van der Ark, & Sijtsma, 2004). The marginal modelling framework can be used to test whether the H j s of different items are equal. This may be interesting when one wants to know whether the items are different with respect to their contribution to the accuracy of the person ordering. Large differences may also provide the researcher with indications that different latent variables may drive the responses to different items (Sijtsma & Meijer, 2007). Currently, such a test is not available.

A statistical test for the null hypothesis “H 1 = … = H J ” requires a slight modification of the marginal model in equation (20). Let A 5 be a (J − 1) × J matrix with element (j, j) equal to 1 for j = 1, …, J − 1; and element (j, j +1) equal to −1 for j = 1, …, J −1; the remaining elements are equal to 0. Using equation (19), it may be verified that

$${{\rm{A}}_5}{{\rm{H}}_j} = \left( {\matrix{ {{H_1} - {H_2}} \cr {{H_2} - {H_3}} \cr \vdots \cr {{H_{J - 1}} - {H_J}} \cr } } \right)$$

, which should be equal to 0 J−1 if all H j s are equal. Then using the design matrices A 1, …, A 4 from equation (20) (see Appendix D), the marginal model for equal item coefficients is

$${\rm{g}}({\rm{m}}) = {{\rm{A}}_5}\left( {\exp \left( {{{\rm{A}}_4}\log \left( {{{\rm{A}}_3}\exp \left( {{{\rm{A}}_2}\log \left( {{{\rm{A}}_1}{\rm{m}}} \right)} \right)} \right)} \right)} \right) = {0_{J - 1}}$$

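The only new ingredient in equation (23) relative to equation (20) is the difference matrix A 5. A small sketch (illustrative; it simply implements the definition given above) constructs it for arbitrary J:

```python
import numpy as np

def difference_matrix(J):
    """(J-1) x J matrix A5 with element (j, j) = 1 and (j, j+1) = -1, so that
    A5 @ (H_1, ..., H_J)^T gives the successive differences H_j - H_{j+1}."""
    A5 = np.zeros((J - 1, J))
    idx = np.arange(J - 1)
    A5[idx, idx] = 1.0
    A5[idx, idx + 1] = -1.0
    return A5

print(difference_matrix(5))
# Under the null hypothesis H_1 = ... = H_5, difference_matrix(5) @ Hj is the zero vector.
```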

Using the marginal model in equation (23), the null hypothesis that H 1 = H 2 = H 3 = H 4 = H 5 was tested for each of the five balance-task scales. It may be noted that if all H j s are equal, then equation (11) implies that H = H j . For each scale, Table 7 shows the estimated total-scale H and its standard error, under the marginal model of equal H j s, and the likelihood ratio statistic G 2 and corresponding p-value. For Conflict Weight the null model of equal H j s was rejected. For the other four scales the null model was not rejected, thus providing support for equal item contributions to the person ordering.

Table 7 For the marginal model defining H 1 = … = H 5: For each of five balance-task scales, estimated coefficient H with standard error between parentheses (upper panel); and fit statistic G2, p-value (lower panel).

5.5. Multiple-Group Hypotheses

Mokken (1971, pp. 164–169) provided the asymptotic sampling theory for testing the null hypothesis that the H values for the same test in different groups are equal. Under this null hypothesis, the same test orders respondents from different groups with equal accuracy. For example, the balance-task test was administered to both boys and girls, and it may be interesting to test if the test orders boys and girls equally well. MSP (Molenaar & Sijtsma, 2000) allows the possibility to compare the H j and H values of different groups, but not to test hypotheses about (in-)equality of H in different groups.

Assume that there are G groups, and let superscript g index these groups. Then the null hypothesis of interest is “H 1 = … = H G”. The recursive exp-log notation requires the following definitions. Let \(A_1^1, \ldots ,A_1^G\), A 2 , A 3 , A 4 , and A 5 be design matrices (derived in Appendix F). Let m* be a vector of length LG in which the vectors of expected frequencies from groups 1, …, G are stacked, such that m* = (m 1, m 2, …, m G). The symbol ⊕ indicates the direct product (see Appendix F). In Appendix F it is shown that

$$ \left( {\begin{array}{*{20}c} {H^1 } \\ {H^2 } \\ \vdots \\ {H^G } \\ \end{array} } \right) = u_G - \exp \left( {\mathop \oplus \limits_{g = 1}^G A_4 \log \left( {\mathop \oplus \limits_{g = 1}^G A_3 \exp \left( {\mathop \oplus \limits_{g = 1}^G A_2 \log \left( {\mathop \oplus \limits_{g = 1}^G A_1^g m*} \right)} \right)} \right)} \right) $$

In Appendix F it is also shown that the recursive exp-log notation for the marginal model with “H 1 = H 2 = … = H G” is

$$ \begin{gathered} g\left( m \right) = \left( {\begin{array}{*{20}c} {H^1 - H^2 } \\ {H^2 - H^3 } \\ \vdots \\ {H^{G - 1} - H^G } \\ \end{array} } \right) \hfill \\ = A_5 \exp \left( {\mathop \oplus \limits_{g = 1}^G A_4 \log \left( {\mathop \oplus \limits_{g = 1}^G A_3 \exp \left( {\mathop \oplus \limits_{g = 1}^G A_2 \log \left( {\mathop \oplus \limits_{g = 1}^G A_1^g m*} \right)} \right)} \right)} \right) = 0_{G - 1} \hfill \\ \end{gathered} $$

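Appendix F is not reproduced here, but assuming that the symbol ⊕ in equations (24) and (25) stacks the group-specific design matrices into one block-diagonal matrix acting on the stacked vector m*, the construction can be sketched as follows (hypothetical single-group matrix, two groups):

```python
import numpy as np
from scipy.linalg import block_diag

def direct_sum(matrices):
    """Block-diagonal matrix formed from a list of design matrices, one block per group."""
    return block_diag(*matrices)

A4 = np.array([[1.0, -1.0]])           # hypothetical single-group design matrix
G = 2
A4_all_groups = direct_sum([A4] * G)   # acts on the stacked group-specific vectors
print(A4_all_groups)
# [[ 1. -1.  0.  0.]
#  [ 0.  0.  1. -1.]]
```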

The marginal model in equation (25) was used to test equal H for boys (indexed g = 1) and girls (g = 2); that is, H 1 = H 2. For each balance-task scale, Table 8 shows coefficient H g and its standard error, and the likelihood ratio statistic G 2 and corresponding p-value. For each scale, the sample ^H value was higher for girls than for boys but only for Conflict Distance was the difference significant. Notice that for G = 2, if estimated standard errors are available, this result can be approximated using a t-test.

Table 8 For the marginal model defining H 1 = H 2: For each of the five balance-task scales, scalability coefficients H for boys and girls with standard error between parentheses (upper panel); and fit statistics G2, p-value (lower panel).

A generalization of the multigroup hypothesis to coefficients H j and H ij is straightforward if the item ordering is the same in all subgroups. If the item ordering is different for some subgroups, the design matrices must be adapted.

6. Discussion

Marginal modelling offers a framework for testing many interesting hypotheses relevant to Mokken scale analysis that could not be tested before. In particular, new and exciting possibilities of the marginal modelling approach are:

  1. The availability of global tests that evaluate all interitem scalability coefficients H ij simultaneously and all item-scalability coefficients H j simultaneously. This offers new opportunities for assessing item and test quality.

  2. The possibility to test whether scalability coefficients are equal to a particular value. This is important for ascertaining item and test quality at a level deemed necessary by the researcher. This result also offers the possibility to test hypotheses about expected values of scalability coefficients (such as those derived from previous research).

  3. The comparison of scalability coefficients between different groups. This provides the opportunity to assess whether the measurement quality of a test is the same in different groups.

This paper has presented several useful examples but the array of possibilities has not yet been fully explored. Exploring these possibilities and implementing the most useful ones in user-friendly software is the first topic for future research.

One possible limitation of the marginal modelling approach is that for the global tests assessing all scalability coefficients simultaneously, and to a lesser degree for tests of coefficient H alone, the size of the matrices can grow rapidly as the number of items increases. The experience accumulated thus far did not reveal computational problems for tests up to J = 15. Matrix R (equation (2)), which is required to solve the marginal modelling problem, has \(L = 2^{15} = 32768\) rows. For larger J, the maximum likelihood estimation of the models becomes impractical. One solution may be to use an estimation procedure that only evaluates the observed item-score patterns so that the size of vector m does not exceed n. An example is the minimum discrimination information approach (e.g., Kullback, 1971; Read & Cressie, 1988, pp. 34–40). Applying alternative estimation procedures to marginal modelling of the scalability coefficients for Mokken scale analysis is the second topic for future research.

The methods presented here are only applicable to dichotomous items. Thus, a useful generalization is to Mokken scale analysis for polytomous items. Whereas, for dichotomous items, some of the interesting hypotheses tested in Mokken scale analysis could also be tested without the use of marginal models, this is often not possible for polytomous items. Examples are the computation of standard errors and testing the strength of the scale. The generalization of results for dichotomous items to polytomous items has proven to be problematic in many ways (e.g., Hemker et al., 1997; Sijtsma & Meijer, 2007), and this may also be true in the marginal modelling framework. The derivation of the design matrices for marginal models is more complicated and the magnitude of the computational problems is more troublesome. The generalization of the methods to polytomous items is the third topic for future research.

The syntax files for the marginal models used here are available upon request from the first author. Currently, researchers wishing to apply the marginal models presented in this paper need to have Mathematica installed on their computer.