Psychometric Data Analysis: A Size/fit Trade-off Evaluation Procedure for Knowledge Structures

Abstract: A crucial problem in knowledge space theory, a modern psychological test theory, is the derivation of a realistic knowledge structure representing the organization of knowledge in an information domain and examinee population under reference. Often, one is left with the problem of selecting among candidate competing knowledge structures. This article proposes a measure for the selection among competing knowledge structures. It is derived within an operational framework (prediction paradigm), and is partly based on the unitary method of proportional reduction in predictive error as advocated by the authors Guttman, Goodman, and Kruskal. In particular, this measure is designed to trade off the (descriptive) fit and size of a knowledge structure, which is of high interest in knowledge space theory. The proposed approach is compared with the Correlational Agreement Coefficient, which has been recently discussed for the selection among competing surmise relations. Their performances as selection measures are compared in a simulation study using the fundamental basic local independence model in knowledge space theory.


Introduction
Knowledge structures and surmise relations are mathematical models that belong to the theory of knowledge spaces (reviewed in Section 2). Knowledge space theory (KST) was introduced by Falmagne (1985, 1999), and it has been successfully applied for the computerized, adaptive assessment and training of knowledge; for instance, see the ALEKS (Assessment and LEarning in Knowledge Spaces) system, a fully automated math tutor on the Internet. KST models have also been applied in such areas as the structuring of hypertexts, the analysis of organizational workflows, and the modeling of cross-cultural knowledge and inter/intra-cultural value systems. However, a crucial problem in KST is the derivation of a 'realistic' knowledge structure from empirical data, representing the organization of 'knowledge' in an information domain and examinee population under reference. In this regard, often one has to make a choice among candidate competing knowledge structures. (For instance, Section 5 describes how the candidate competing knowledge structure models may be obtained data-analytically, based on a modified Item Tree Analysis procedure. Or, the competing models under consideration may be derived theoretically, based on different psychological theories/postulates.) A measure κ is proposed for the evaluation of knowledge structures. It is designed to trade off the (descriptive) fit of a knowledge structure to a given data set and its size. Such a trade-off is of high interest in KST. (For instance, Section 4.5 mentions that this type of trade-off is beneficial for the efficient application of adaptive knowledge assessment procedures. In general, for a knowledge structure of a smaller size, fewer items have to be answered by an examinee to assess her/his state of knowledge. Of course, the assessment procedure must also be based on a, more or less, 'valid' (data fitting) knowledge structure underlying an examinee's response behavior.
In addition, the more states are contained in a knowledge structure (larger size) the smaller are the distances of the observed response patterns to the closest states in the knowledge structure (better fit). Therefore, any such measure must realize a trade-off between the size of a knowledge structure and its fit to the data.) The measure κ is derived within an operational framework (prediction paradigm), partly based on the unitary method of proportional reduction in predictive error (Section 4).
The approach to 'model selection' among knowledge structures based on the measure κ is compared with the Correlational Agreement Coefficient (CA, reviewed in Section 3), which has been recently discussed for the selection among surmise relations. The performances of κ and CA as selection measures are compared in a simulation study using the basic local independence model, which is a fundamental finite mixture, latent variable model in KST.
On the structure of this article. Section 2 reviews basic deterministic and probabilistic concepts of KST that are relevant for this work. Section 3 recapitulates the Correlational Agreement Coefficient CA. Section 4 proposes the size/fit trade-off evaluation procedure κ. Section 5 introduces modified Item Tree Analysis. Section 6 discusses an application of κ and CA to simulated data. This article concludes with a discussion in Section 7, containing a summary and some suggestions for further research.

Knowledge Space Theory
This section starts with a motivating small example which is taken from Falmagne et al. (2003), and then briefly reviews some of the basic deterministic and probabilistic concepts of KST. For details, the reader is referred to Doignon and Falmagne (1999).

Example: elementary algebra
A natural starting point for a theory of knowledge assessment and training stems from the observation that some pieces of knowledge may imply other pieces of knowledge. In the context of this section, the mastery of some algebra problem may imply the mastery of other problems. Such implications between pieces of knowledge may be used to design efficient computer-based, adaptive knowledge assessment and training procedures (cf. Section 4.5).
Consider the six dichotomous problems in Elementary Algebra:
a. A car travels on the freeway at an average speed of 52 miles per hour. How many miles does it travel in 5 hours and 30 minutes?
b. Using the pencil, mark the point at the coordinates (1, 3).
c. Perform the following multiplication: 4x^4 y^4 · 2x · 5y^2 and simplify your answer as much as possible.
d. Find the greatest common factor of the expressions 14t^6 y and 4tu^5 y^8. Simplify your answer as much as possible.
f. Write an equation for the line that passes through the point (−5, 3) and is perpendicular to the line 8x + 5y = 11.
A plausible prerequisite diagram of mastery dependencies for the six Elementary Algebra problems may look like the one in Figure 1. (Reflexivity and transitivity are assumed to hold and are not explicitly depicted.) The mastery of Problem b is, for instance, a prerequisite for the mastery of Problem e. In other words, the mastery of Problem e implies that of Problem b. The prerequisite diagram in Figure 1 completely specifies the feasible knowledge states. A respondent can certainly master just Problem a. This does not imply mastery of any other problem. In that case, the knowledge state is {a}. However, if the respondent masters e, for instance, then a, b, and c must also be mastered. This gives the knowledge state {a, b, c, e}. In this way, one obtains exactly 10 knowledge states consistent with the prerequisite diagram: ∅, {a}, {b}, {a, b}, {a, c}, {a, b, c}, {a, b, c, d}, {a, b, c, e}, {a, b, c, d, e}, {a, b, c, d, e, f}. This set K of all possible knowledge states is called knowledge structure. These notions are formalized mathematically in the following section.
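The enumeration of feasible states can be made concrete in a short sketch. The prerequisite map below is an assumption read off from the states listed above (the exact Figure 1 diagram is not part of the text); a state is feasible iff it contains all prerequisites of each of its items.

```python
from itertools import combinations

# Hypothetical prerequisite map, inferred from the 10 states listed above.
prereq = {
    "a": set(), "b": set(), "c": {"a"},
    "d": {"a", "b", "c"}, "e": {"a", "b", "c"},
    "f": {"a", "b", "c", "d", "e"},
}

def feasible_states(prereq):
    """All subsets K of the domain closed under the prerequisites:
    q in K implies prereq[q] is a subset of K."""
    items = sorted(prereq)
    states = []
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            K = set(combo)
            if all(prereq[q] <= K for q in K):
                states.append(frozenset(K))
    return states

states = feasible_states(prereq)
print(len(states))                  # 10 feasible knowledge states
print(frozenset("abce") in states)  # True: the state {a, b, c, e}
```

Under this assumed map, the sketch recovers exactly the 10 states of the example.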

Basic deterministic concepts
A general concept is that of a knowledge structure.

Definition 1.
A knowledge structure is a pair (Q, K), with Q a non-empty, finite set, and K a family of subsets of Q containing at least the empty set ∅ and Q. The set Q is called the domain of the knowledge structure. The elements q ∈ Q and K ∈ K are referred to as (test) items and (knowledge) states, respectively. We also say that K is a knowledge structure on Q.
The set Q is assumed to be a set of dichotomous items. In this article, Q is interpreted as a set of questions/problems that can either be solved (coded 1) or not be solved (coded 0). This stands for the observed responses of a subject (manifest level), and has to be distinguished from a subject's true, unobservable knowledge of the solution to an item (latent level). In the latter case, we say that the subject is capable of mastering or not capable of mastering the item. Let 2^Q denote the power-set of Q, that is, the set of all subsets of Q. Let |Q| stand for the size (number of elements) of Q. The observed responses of a subject are represented by the subset R ⊂ Q containing exactly the items solved by the subject. This subset R is called the response pattern of the subject. Similarly, the true latent state of knowledge of a subject is represented by the subset K ⊂ Q containing exactly the items the subject is capable of mastering. This subset K is called the knowledge state of the subject. Given a knowledge structure K, the only states of knowledge possible are assumed to be the ones in K. In this spirit, K captures the organization of knowledge in the domain and population under reference. Ideally, if no response errors were committed, the only possible response patterns would be the knowledge states in K.
As an example knowledge structure consider the one described in Section 2.1, on the domain Q = {a, b, c, d, e, f } of the six Elementary Algebra problems.
Note that this example knowledge structure is closed under union and intersection.

Definition 2.
A knowledge structure (Q, K) is called a knowledge space if and only if (iff) the union of any two knowledge states is a knowledge state. A knowledge space (Q, K) is called quasi-ordinal iff the intersection of any two knowledge states is a knowledge state.
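Definition 2 can be checked mechanically on the example structure of Section 2.1; the following sketch tests closure under union (knowledge space) and under both union and intersection (quasi-ordinal knowledge space).

```python
from itertools import combinations

# The 10 states of the elementary-algebra example (Section 2.1).
K = [frozenset(s) for s in
     ["", "a", "b", "ab", "ac", "abc", "abcd", "abce", "abcde", "abcdef"]]

def is_space(states):
    """Closed under union of any two states?"""
    S = set(states)
    return all(a | b in S for a, b in combinations(states, 2))

def is_quasi_ordinal(states):
    """Closed under union and intersection of any two states?"""
    S = set(states)
    return is_space(states) and all(a & b in S
                                    for a, b in combinations(states, 2))

print(is_quasi_ordinal(K))   # True
```

Removing, say, the state {a, b, c, d, e} breaks union closure, since {a, b, c, d} ∪ {a, b, c, e} would no longer be a state.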
The notions of a knowledge structure and (quasi-ordinal) knowledge space are at the level of persons (representing collections of knowledge states of individuals). There is another important notion, that of a surmise relation, which is at the level of items (representing collections of mastery dependencies between items).

Definition 3. Any quasi-order, that is, reflexive and transitive binary relation, on Q is called a surmise relation.
A surmise relation ≺ on Q may model a latent hierarchy among the items based on mastery dependencies of the following type: a subject capable of mastering item j ∈ Q is also capable of mastering item i ∈ Q (i.e., i ≺ j).
As an example surmise relation, consider the relation ≺ corresponding to the prerequisite diagram of mastery dependencies in Figure 1.

Birkhoff's (1937) theorem (applied in KST, Theorem 1) provides a linkage between quasi-ordinal knowledge spaces and surmise relations on an item set. (This theorem is crucial in this article. It allows for formulating the modified Item Tree Analysis procedure (Section 5.2) and comparing the coefficients κ and CA (Section 6.3).)

Theorem 1. There is a one-to-one correspondence between the family of all quasi-ordinal knowledge spaces K on a domain Q, and the family of all surmise relations ≺ on Q. Such a correspondence is defined through the two equivalences:

i ≺ j :⇔ (∀K ∈ K)(j ∈ K ⇒ i ∈ K),
K ∈ K :⇔ (∀i, j ∈ Q with i ≺ j)(j ∈ K ⇒ i ∈ K).

Proof. See Doignon and Falmagne (1999, pp. 39-40, Theorem 1.49).
This theorem is important from a practical point of view. Though the quasiordinal knowledge space and surmise relation models are empirically interpreted at two different levels, at the levels of persons and items respectively, they are connected with each other, through Birkhoff's theorem, on a solid mathematical basis. Roughly speaking, it mathematically links two different levels of empirical interpretations.
In the example in Section 2.1, the 10 knowledge states consistent with the prerequisite diagram in Figure 1 are obtained by applying the second equivalence of Birkhoff's theorem.
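Birkhoff's correspondence can be illustrated on the example structure: derive the surmise relation from the states (first equivalence), rebuild the states from the relation (second equivalence), and verify the round trip. The state list is the one from Section 2.1.

```python
from itertools import combinations

domain = "abcdef"
K = [frozenset(s) for s in
     ["", "a", "b", "ab", "ac", "abc", "abcd", "abce", "abcde", "abcdef"]]

def derive_relation(states, domain):
    """i ≺ j iff every state containing j also contains i."""
    return {(i, j) for i in domain for j in domain
            if all(i in S for S in states if j in S)}

def derive_states(relation, domain):
    """All subsets closed with respect to the quasi-order."""
    out = []
    for r in range(len(domain) + 1):
        for combo in combinations(domain, r):
            S = set(combo)
            if all(i in S for (i, j) in relation if j in S):
                out.append(frozenset(S))
    return out

rel = derive_relation(K, domain)
print(("b", "e") in rel)   # True: b is surmised from e
print(sorted(map(sorted, derive_states(rel, domain))) ==
      sorted(map(sorted, K)))   # True: the round trip recovers K
```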

Basic probabilistic concepts
Examinees are drawn randomly from a population under reference. Let N be the sample size. The data are represented by the absolute counts N (R) of response patterns (subsets of Q containing exactly the items solved by the examinees) R ⊂ Q. We assume that the examinees give their response patterns independent of each other. The true probability of occurrence ρ(R) of any response pattern R ⊂ Q is assumed to stay constant across the examinees. Hence the data are assumed to be multinomially distributed over all subsets of Q.
Let the maximum probability of occurrence be denoted by ρ(R_max), that is, ρ(R_max) = max_{R⊂Q} ρ(R). In the example in Section 6, we simulate multinomial response data using a basic local independence model.

Definition 4. A quadruple (Q, K, p, r) is called a basic local independence model (BLIM) iff
1. (Q, K) is a knowledge structure;
2. p is a probability distribution on K, that is, p(K) ≥ 0 for any K ∈ K, and Σ_{K∈K} p(K) = 1;
3. r is a response function, that is, r(R, K) ≥ 0 for any R ⊂ Q and K ∈ K, and Σ_{R⊂Q} r(R, K) = 1 for any K ∈ K;
4. r satisfies local independence, that is,
r(R, K) = (Π_{q∈K\R} β_q) (Π_{q∈K∩R} (1 − β_q)) (Π_{q∈R\K} η_q) (Π_{q∈Q\(K∪R)} (1 − η_q)),
with two constants β_q, η_q ∈ [0, 1) for each q ∈ Q, respectively called careless error and lucky guess probabilities at q.
To each knowledge state K ∈ K is attached a probability p(K) measuring the likelihood that a randomly sampled examinee is in state K (Part 2). For R ⊂ Q and K ∈ K, r(R, K) specifies the conditional probability of response pattern R for an examinee in state K (Part 3). (The BLIM takes into account the two ways in which probabilities must supplement deterministic knowledge structures. First, knowledge states may occur with different proportions in the population under reference. Second, response errors (careless errors and lucky guesses) may render impossible a priori specification of the observable responses of an examinee given her/his knowledge state.) The item responses of an examinee are assumed to be independent given the knowledge state of the examinee, and the response error probabilities β q , η q (q ∈ Q) are attached to the items (item-specific) and do not vary from state to state (state-independent) (Part 4).
Under the BLIM, the manifest multinomial probability distribution on the response patterns is governed by the latent state proportions and response error rates:

ρ(R) = Σ_{K∈K} r(R, K) p(K) for any R ⊂ Q.
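A minimal sketch of this manifest distribution follows; the toy structure, state probabilities, and error rates are illustrative assumptions, not the paper's values. The key point is that the pattern probabilities sum to one over all subsets of Q.

```python
from itertools import combinations

domain = "ab"
states = [frozenset(), frozenset("a"), frozenset("ab")]
p = {states[0]: 0.3, states[1]: 0.3, states[2]: 0.4}   # assumed p(K)
beta = {"a": 0.1, "b": 0.15}   # assumed careless error rates
eta = {"a": 0.05, "b": 0.05}   # assumed lucky guess rates

def r(R, K):
    """Local independence: P(response pattern R | knowledge state K)."""
    prob = 1.0
    for q in domain:
        if q in K:
            prob *= (1 - beta[q]) if q in R else beta[q]
        else:
            prob *= eta[q] if q in R else (1 - eta[q])
    return prob

def rho(R):
    """Manifest probability of R: mixture over latent states."""
    return sum(r(R, K) * p[K] for K in states)

patterns = [frozenset(c) for n in range(3) for c in combinations(domain, n)]
print(abs(sum(rho(R) for R in patterns) - 1.0) < 1e-12)   # True
```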

Correlational Agreement Coefficient CA
This section briefly reviews the Correlational Agreement Coefficient CA. For details, the reader is referred to Ünlü and Albert (2004); see also Schrepp (2006).
The Correlational Agreement Coefficient CA was introduced by Leeuwe (1974) within Item Tree Analysis, a data-analytic procedure for deriving surmise relations on sets of dichotomous items (Section 5.1). In Item Tree Analysis, CA is used as a descriptive goodness-of-fit measure for selecting out of competing surmise relations one with maximum CA value. It is a measure formulated at the level of items, for surmise relations (cf. Theorem 1).

Required notation and terminology
Let the non-empty, finite item set be denoted by Q = {I l : 1 ≤ l ≤ m}. (The definition of CA requires an indexing of the items. That is why we use this notation.) For the random sample of N examinees, let the corresponding binary (of type 0/1) data matrix (of item responses) be D. Let ≺ be a surmise relation on Q. We say that ≺ is consistent with the data matrix D iff for any item pair I i ≺ I j , every examinee solving item I j also solves item I i .
Empirical correlation r_ij between items I_i and I_j is defined as the sample Pearson correlation between the corresponding columns s_i and s_j of D. That is,

r_ij = Cov(s_i, s_j) / √(Var(s_i) Var(s_j)),

where Cov() and Var() stand for the sample covariance and variance, respectively.
Theoretical correlation r*_ij between items I_i and I_j is defined case-wise with respect to ≺: r*_ij = 1 if I_i ≺ I_j and I_j ≺ I_i; r*_ij = √(p_Ij (1 − p_Ii) / (p_Ii (1 − p_Ij))) if I_i ≺ I_j and not I_j ≺ I_i; r*_ij = √(p_Ii (1 − p_Ij) / (p_Ij (1 − p_Ii))) if I_j ≺ I_i and not I_i ≺ I_j; and r*_ij = 0 if I_i and I_j are incomparable. Here p_Ii and p_Ij are the sample proportions-correct of items I_i and I_j, respectively.

Definition of CA
A comparison of empirical and theoretical correlation gives the following result.
Proposition 1. Let ≺ be a surmise relation on Q that is consistent with the data matrix D. Let I_i and I_j be items for which the empirical correlation exists. For the difference δ_ij = r_ij − r*_ij between empirical and theoretical correlation, it holds δ_ij = 0 whenever I_i and I_j are comparable with respect to ≺. Proof. See Ünlü and Albert (2004, p. 287, Proposition 12).
Proposition 1 gives motivation for the definition of the Correlational Agreement Coefficient.
Definition 5. The Correlational Agreement Coefficient CA is defined by

CA = 1 − (2 / (m(m − 1))) Σ_{i<j} δ_ij².

The decision rule for applications of CA is as follows. The greater the value of CA is, the 'better' a surmise relation is judged to fit the data.
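A small sketch of the computation follows. It assumes the case-wise theoretical correlation sketched above (1 for equivalent items, √(p_j(1 − p_i)/(p_i(1 − p_j))) for I_i ≺ I_j, 0 for incomparable items); the tiny data matrix and relation are illustrative. For data perfectly consistent with I_1 ≺ I_2, the difference δ vanishes and CA equals 1.

```python
import math
from itertools import combinations

# Assumed 0/1 data matrix: rows = examinees, columns = items I1, I2.
D = [[1, 0], [1, 1], [1, 1], [0, 0], [1, 0]]
m = 2
rel = {(0, 0), (1, 1), (0, 1)}   # I1 ≺ I2 (plus reflexive pairs)

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    return cov / math.sqrt(vx * vy)

def r_star(i, j):
    pi = sum(row[i] for row in D) / len(D)
    pj = sum(row[j] for row in D) / len(D)
    if (i, j) in rel and (j, i) in rel:
        return 1.0
    if (i, j) in rel:   # i ≺ j, so solvers of j are a subset of solvers of i
        return math.sqrt(pj * (1 - pi) / (pi * (1 - pj)))
    if (j, i) in rel:
        return math.sqrt(pi * (1 - pj) / (pj * (1 - pi)))
    return 0.0

def CA():
    deltas = [pearson([row[i] for row in D], [row[j] for row in D])
              - r_star(i, j)
              for i, j in combinations(range(m), 2)]
    return 1 - 2 / (m * (m - 1)) * sum(d * d for d in deltas)

print(round(CA(), 4))   # 1.0 for perfectly consistent data
```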

Measure κ
This section proposes the measure κ for evaluating knowledge structures. It is specially designed to trade off the fit and size of a knowledge structure, and is derived within an operational framework (prediction paradigm).

Prediction paradigm
The prediction problem considered is as follows. An individual is randomly chosen from the population under reference, and we are asked to guess her/his response pattern, given either
(no info) no further information (other than the multinomial distribution), or
(info) the knowledge structure K assumed to underlie the responses of the individual.
The prediction strategies in the two cases are as follows. In the 'no info' case, we optimally guess some response pattern R_max ⊂ Q with the largest probability of occurrence ρ(R_max). In the 'info' case, we proportionally guess the knowledge states K ∈ K with their probabilities of occurrence ρ(K). Since the latter probabilities may not add up to one, in general, there may be a non-vanishing residual probability 1 − Σ_{K∈K} ρ(K) > 0. To complete the prediction strategy, we hence abstain from guessing with probability 1 − Σ_{K∈K} ρ(K), and in the sequel, view that as a prediction error.
The probabilities of a prediction error in the two cases are as follows. In the 'no info' case, the probability of a prediction error is 1 − ρ(R_max); in the 'info' case, it is 1 − Σ_{K∈K} ρ²(K). (The (complementary) probabilities of a prediction success are ρ(R_max) and Σ_{K∈K} ρ²(K), respectively.)

First constituent of κ: Measure of fit
The measure κ consists of two constituents. The first constituent of κ measures the degree to which a knowledge structure descriptively reflects the response data; the fit. It expresses the extent to which the multinomial probability distribution on the response patterns is concentrated to the knowledge structure.
The first constituent of κ is derived on the basis of the method of proportional reduction in predictive error (PRPE); the method of PRPE was introduced by Guttman (1941), and it was systematically applied in the series of articles by Goodman and Kruskal (1954, 1959, 1963, 1972). The general probability formula of the method of PRPE quantifies the predictive utility, PU_info, of given information:

PU_info = [Prob. of error (no info) − Prob. of error (info)] / Prob. of error (no info).
Inserting the aforementioned prediction error probabilities into this formula, we obtain the population analog of the first constituent m_1,

m_1 = [(1 − ρ(R_max)) − (1 − Σ_{K∈K} ρ²(K))] / (1 − ρ(R_max)) = (Σ_{K∈K} ρ²(K) − ρ(R_max)) / (1 − ρ(R_max)).

Some remarks are in order with respect to m_1.

1. It holds −∞ < m_1 ≤ 0.
2. Obviously, m_1 = 0 iff ρ(R_max) = Σ_{K∈K} ρ²(K). In other words, m_1 assumes its extreme value in the case of, and only of, guessing with the largest probability of a prediction success. In that case, we have zero residual probability (Σ_{K∈K} ρ(K) = 1), and the distribution on the response patterns is completely concentrated to the knowledge structure K (ρ(R) = 0 for any R ⊂ Q, R ∉ K).
Inserting MLEs, we obtain the MLE m̂_1 for m_1,

m̂_1 = (Σ_{K∈K} (N(K)/N)² − N(R_max)/N) / (1 − N(R_max)/N),

where N(R_max) = max_{R⊂Q} N(R). (We assume that 1 − N(R_max)/N ≠ 0. Since, by assumption, ρ(R_max) ≠ 1, and N(R_max)/N is the MLE for ρ(R_max), this is likely the case for large samples.)
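The estimate can be computed directly from pattern counts; the counts and the toy structure below are illustrative assumptions, and the formula is the PRPE-based estimate of the fit constituent.

```python
from collections import Counter

# Assumed observed counts N(R) per response pattern.
N_R = Counter({frozenset(): 40, frozenset("a"): 30,
               frozenset("ab"): 25, frozenset("b"): 5})
N = sum(N_R.values())
K = [frozenset(), frozenset("a"), frozenset("ab")]   # assumed structure

rho_max_hat = max(N_R.values()) / N                   # MLE of rho(R_max)
success_info = sum((N_R[S] / N) ** 2 for S in K)      # est. success prob.
m1_hat = (success_info - rho_max_hat) / (1 - rho_max_hat)
print(m1_hat <= 0)   # True: the fit constituent never exceeds 0
```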

Second constituent of κ: Measure of size
The second constituent of κ captures the size of a knowledge structure. It expresses the extent to which the restricted multinomial probability distribution on the knowledge states is concentrated to a fraction of the knowledge structure. (In Section 4.5, a special choice of a fraction is determined in the context of 'model selection' among competing knowledge structures, based on the median match of the competing models.) The definition of the second constituent of κ is based on the following notion of a truncation of a knowledge structure. Let n ≥ 1 (a natural number) be a truncation constant. An n-truncation of K is any subset K nt of K which is derived as follows: 1. Order the knowledge states according to their occurrence probabilities, say, from left to right, ascending from smaller probabilities to larger ones. Knowledge states with equal probabilities are ordered arbitrarily.
2. Starting with the foremost right knowledge state, a knowledge state with the largest probability of occurrence, take the first min(|K|, n) knowledge states, descending from right to left. The set of all these knowledge states is called an n-truncation of K, denoted by K nt .
(Depending on the orderings of equiprobable knowledge states, the set K_nt may vary. In general, there are multiple n-truncations of a knowledge structure. The definition of the second constituent, however, is invariant with respect to the choice of a particular n-truncation. In Section 4.5, we describe how a reasonable truncation constant can be chosen in the context of 'model selection' among competing knowledge structure models.) For a truncation constant n, and any n-truncation K_nt of K, we obtain the population analog of the second constituent m_2,

m_2 = Σ_{K∈K} ρ²(K) / Σ_{K∈K_nt} ρ²(K).

Some remarks are in order with respect to m_2.

1. It holds 1 ≤ m_2 < ∞.
2. Obviously, m_2 = 1 iff Σ_{K∈K_nt} ρ²(K) = Σ_{K∈K} ρ²(K). In other words, m_2 assumes its extreme value in the case of, and only of, no loss of the probability of a prediction success when guessing based on a fraction, an n-truncation, of K rather than on the whole knowledge structure K. In that case, Σ_{K∈K_nt} ρ(K) = Σ_{K∈K} ρ(K), and the restricted distribution on the knowledge states is completely concentrated to an n-truncation K_nt (ρ(K) = 0 for any K ∈ K, K ∉ K_nt).
Inserting MLEs, we obtain the MLE m̂_2 for m_2,

m̂_2 = Σ_{K∈K} (N(K)/N)² / Σ_{K∈K̂_nt} (N(K)/N)²,

where K̂_nt is analogously defined as K_nt, replacing occurrence probabilities ρ(K) with their MLEs N(K)/N (for knowledge states K ∈ K). (We assume that Σ_{K∈K} N(K) ≠ 0. Since, by assumption, Σ_{K∈K} ρ(K) ≠ 0, and (Σ_{K∈K} N(K))/N is the MLE for Σ_{K∈K} ρ(K), this is likely the case for large samples.)
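A sketch of the size constituent follows; it assumes (as one reading of the construction) that m̂_2 is the ratio of the squared-proportion mass over the whole structure to that over the truncation, so that m̂_2 ≥ 1, and the state counts are illustrative.

```python
# Assumed counts N(K) for the states of a structure, sample size N.
counts = {"s1": 50, "s2": 30, "s3": 15, "s4": 5}
N = 100
n = 2   # truncation constant

# n-truncation: the min(|K|, n) most frequent states.
ordered = sorted(counts, key=counts.get, reverse=True)
trunc = ordered[:min(len(ordered), n)]

num = sum((counts[s] / N) ** 2 for s in counts)   # whole structure
den = sum((counts[s] / N) ** 2 for s in trunc)    # truncation only
m2_hat = num / den
print(m2_hat >= 1)   # True: the size constituent is at least 1
```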

Measure κ: Size/fit trade-off
The measure κ is defined as the product of the size measure m_2 and the shifted fit measure m_1. (The shift by '−c' of the fit measure compensates for a zero value of that measure. This is necessary to guarantee a trade-off between the size and fit criteria. For more details, see below.)

Definition 6. Let n ≥ 1 be a truncation constant, and let c ∈ [0, 0.01] be a non-negative shift constant. The measure κ is defined by

κ = m_2 (m_1 − c).

Some remarks are in order with respect to κ.

1. It holds −∞ < κ ≤ −c.
2. For c = 0, κ = −c (= 0) iff m_1 = 0, with in general arbitrary values of m_2. In other words, κ assumes its extreme value in the case of, and only of, 'complete association' as described for the fit measure m_1, and there is no indication about the size component as measured by the size measure m_2. In that case, Σ_{K∈K} ρ(K) = 1, and there is a total fit of K to the data.
(If κ = m_2 m_1, that is, c = 0, a zero value of the fit measure m_1 would eliminate the impact of the size measure m_2 on the values of κ. Regardless of any value m_2 may take, κ would always be equal to zero. In that case, κ would not trade off the size and fit components.)
3. For c ≠ 0, κ = −c (< 0) iff m_1 = 0 and m_2 = 1. In other words, κ assumes its extreme value in the case of, and only of, 'complete associations' as described for the measures m_1 and m_2. In that case, (a) Σ_{K∈K} ρ(K) = 1, and there is a total fit of K to the data, and (b) ρ(K) = 0 for any K ∈ K, K ∉ K_nt, and the size of K can actually be reduced to |K_nt|.
4. Consider two competing knowledge structures K_1 and K_2 on Q. Rephrased in operational parlance of the prediction paradigm, larger values of κ imply larger probabilities of a prediction success, or larger relative probabilities of a prediction success for n-truncations, or both. Hence, if κ(K_1) < κ(K_2), K_2 'performs better' than K_1 with respect to at least one of the criteria size and fit. If K_2 'performs better' than K_1 with respect to both of the criteria, it necessarily holds κ(K_1) < κ(K_2). If K_2 'performs better' than K_1 with respect to one of the two criteria only, κ(K_1) < κ(K_2) may not be true in general.
Inserting the MLEs m̂_1 and m̂_2 for m_1 and m_2, respectively, we obtain the MLE κ̂ for κ,

κ̂ = m̂_2 (m̂_1 − c).

The decision rule for applications of κ is as follows. The greater the (population) value of κ is, the 'better' a knowledge structure 'performs' with respect to a trade-off between the criteria size and fit. The unknown ordering of the (population) κ values is estimated by the ordering of the corresponding MLEs.
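The estimation can be sketched end to end. The pattern counts, the structure, the truncation constant, and the reading κ̂ = m̂_2 (m̂_1 − c) with m̂_2 ≥ 1 are illustrative assumptions; c = 0.01 mirrors the simulation example.

```python
from collections import Counter

def kappa_hat(N_R, K, n, c=0.01):
    """Sketch of the MLE of kappa from pattern counts N_R and structure K."""
    N = sum(N_R.values())
    rho_max = max(N_R.values()) / N
    sq = {S: (N_R.get(S, 0) / N) ** 2 for S in K}
    m1 = (sum(sq.values()) - rho_max) / (1 - rho_max)          # fit
    top = sorted(K, key=lambda S: sq[S], reverse=True)[:min(len(K), n)]
    m2 = sum(sq.values()) / sum(sq[S] for S in top)            # size
    return m2 * (m1 - c)

N_R = Counter({frozenset(): 40, frozenset("a"): 30,
               frozenset("ab"): 25, frozenset("b"): 5})
K = [frozenset(), frozenset("a"), frozenset("ab")]
print(kappa_hat(N_R, K, n=2) <= -0.01)   # True: kappa never exceeds -c
```

The decision rule then simply ranks competing structures by their κ̂ values.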

Model selection and truncation constant
Next we describe a special choice for the truncation constant. That special truncation constant is derived in the context of 'model selection' among competing knowledge structures, say, K 1 , . . . , K p (on a domain Q).
The definition of the special truncation constant is based on the notion of the median match of the competing models under consideration.

Definition 7. For 1 ≤ i ≤ p, let v_i = Σ_{K∈K_i} ρ(K) be the match of the model K_i, and let v = (v_1, . . . , v_p) be the match vector of the 'model selection' problem. The (empirical) median of the matches v_i is denoted by median(v), and is called the median match of the competing models of the 'model selection' problem. That is,

median(v) = v_((p+1)/2) if p is odd, and median(v) = (v_(p/2) + v_(p/2+1)) / 2 if p is even,

where v_(1) ≤ · · · ≤ v_(p) is the ordered list of the matches v_i.
For efficient applications of adaptive knowledge assessment procedures (Doignon and Falmagne, 1999), knowledge structures of a 'trade-off type' are beneficial. On the one hand, a knowledge structure should fit the data as well as possible, but on the other hand, it should also be of a smaller size. A trade-off between these criteria allows for an economic diagnosis of the knowledge state of an examinee; in general, for a knowledge structure of a smaller size, fewer items have to be answered by an examinee to assess her/his knowledge state. The half-split rule in deterministic knowledge assessment, for instance, generally requires about log_2(|K|) items for the diagnosis. (At this point it should be clear why a knowledge structure consisting of all subsets of an item set is not preferred. In such a case, no dependencies between the items would be postulated (except for reflexive ones), and hence all items of the domain would have to be worked through by an examinee. Compare also the simulation example in Section 6.) In other words, the measure κ (as a size/fit trade-off procedure) allows for a 'poorer' (descriptive) fit of a knowledge structure in favor of a smaller size.
The definition of the special truncation constant is furthermore based on the term 2^{|Q|/2}. This term is introduced to express the extent to which a knowledge structure may be 'tailored' to n-truncations of sizes bounded from above by 2^{|Q|/2}. (The half-split rule using a knowledge structure of a size bounded from above by 2^{|Q|/2} generally requires about at most half of the items for the diagnosis of the knowledge state of an examinee (log_2(|K|) ≤ log_2(2^{|Q|/2}) = |Q|/2).) The special truncation constant n_s is defined by

n_s = [median(v) · 2^{|Q|/2}],

where for a real number x ≥ 0, [x] denotes the entier of x, that is, the non-negative integer k with k ≤ x < k + 1. The MLE for the (population) special truncation constant is

n̂_s = [median(v̂) · 2^{|Q|/2}],

where v̂_i = Σ_{K∈K_i} N(K)/N and v̂ = (v̂_1, . . . , v̂_p) are the MLEs for the (population) matches and match vector, respectively.
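To make the computation concrete, here is a small sketch; it assumes the reading n_s = [median(v) · 2^{|Q|/2}], and the three match values are hypothetical.

```python
# Sketch of the special truncation constant under the assumed reading
# n_s = [median(v) * 2**(|Q|/2)].
def median(xs):
    ys = sorted(xs)
    p = len(ys)
    mid = p // 2
    return ys[mid] if p % 2 == 1 else (ys[mid - 1] + ys[mid]) / 2

def special_truncation(matches, q_size):
    # int() on a non-negative float is the entier [x].
    return max(1, int(median(matches) * 2 ** (q_size / 2)))

# Hypothetical matches v_i of three competing structures on |Q| = 5 items.
print(special_truncation([0.70, 0.85, 0.90], q_size=5))   # 4
```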

Summary
In the example in Section 6, the measure κ is applied using the special truncation constant. Next we summarize the (population) κ procedure based on the special truncation constant, and the corresponding MLE in terms of the data. Consider the 'model selection' problem K_1, . . . , K_p with knowledge structures K_i on Q. Assume that for the multinomial probability distribution on the response patterns, it holds
1. ρ(R_max) ≠ 1 (R_max ⊂ Q with ρ(R_max) = max_{R⊂Q} ρ(R));

2. Σ_{K∈K_i} ρ(K) ≠ 0 for any K_i.
Let n_s be the (population) special truncation constant, and let c ∈ [0, 0.01] be a shift constant.
Under these conditions the (population) κ(K_i) values for the knowledge structures K_i are well-defined,

κ(K_i) = m_2(K_i) (m_1(K_i) − c).

Let N(R) for R ⊂ Q be the data, and let N = Σ_{R⊂Q} N(R) be large. Assume that the sample analogs of the above conditions hold, that is, 1 − N(R_max)/N ≠ 0 and Σ_{K∈K_i} N(K) ≠ 0 for any K_i. Then the MLEs

κ̂(K_i) = m̂_2(K_i) (m̂_1(K_i) − c)

are well-defined, where n̂_s is the (sample) MLE for n_s, and the n̂_s-truncations of the K_i are analogously defined as the n_s-truncations, replacing occurrence probabilities ρ(K) with their MLEs N(K)/N (for knowledge states K ∈ K_i).
Note that all K_i (1 ≤ i ≤ p) are used for determining the (population) n_s and (sample) n̂_s values, and based on these n_s and n̂_s values, the (population) κ(K_i) and (sample) κ̂(K_i) values are obtained for each model separately, respectively. Eventually, a model with maximum κ value (respectively, maximum κ̂ value) is selected.

Modified Item Tree Analysis
This section describes how the candidate competing models are obtained data-analytically, based on a modified version of Leeuwe's (1974) Item Tree Analysis. (Though we pursue a data-analytic approach, any other method for determining the competing models is conceivable. The measure κ can be applied to any 'model selection' problem among competing knowledge structures, independent of how the models may be obtained. For instance, κ may be used for selecting among knowledge structures theoretically derived from different psychological theories/postulates.)

Item tree analysis

STEP1. Determine the binary relations ≺_L for L = 0, 1, . . . , N according to the ITA-rule

I_i ≺_L I_j :⇔ c_ij ≤ L.

Here we use the notation Q = {I_l : 1 ≤ l ≤ m}, and for any two items I_i and I_j in Q, c_ij denotes the absolute count of examinees solving I_j but not I_i. The tolerance level L quantifies the allowed maximum number of contradictions to an item pair in the relation ≺_L.

STEP2.
From the generated binary relations ≺ L (0 ≤ L ≤ N ), remove those that are not transitive.
STEP3. Set a critical value 0 < c ≤ 1 for the proportions p L of examinees not contradicting the respective surmise relations ≺ L in STEP2.

STEP4.
From the surmise relations in STEP2, remove those with p L < c.

STEP5.
From the remaining surmise relations (after STEP4; ≺_0 is always contained), select one with maximum CA value.
(The Correlational Agreement Coefficient is used as a goodness-of-fit measure to handle the selection problem in STEP5. From the remaining surmise relations, select an 'optimal' one, here, one with maximum CA value.)
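The generating STEP1 of this procedure can be sketched directly: count the contradictions c_ij and collect the item pairs surviving the tolerance level. The small data matrix is an illustrative assumption.

```python
# Sketch of the ITA-rule: I_i ≺_L I_j iff c_ij <= L, with c_ij the
# number of examinees solving I_j but not I_i. Data matrix is assumed.
D = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
]
m = 3

def c(i, j):
    """Contradictions to the pair (I_i, I_j): solved j but not i."""
    return sum(1 for row in D if row[j] == 1 and row[i] == 0)

def relation(L):
    """The binary relation at tolerance level L."""
    return {(i, j) for i in range(m) for j in range(m) if c(i, j) <= L}

print((0, 2) in relation(0))   # True: item 0 is surmised from item 2 at L = 0
print((2, 0) in relation(0))   # False: two examinees contradict this pair
```

Note that the diagonal pairs (i, i) are always contained, since c(i, i) = 0.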

Modified version of item tree analysis
A modified version of ITA (MITA) is as follows. We keep the ITA-rule for generating the binary relations ≺_L for 0 ≤ L ≤ N. However, we do not remove those relations ≺_L that are not transitive, as is done in the original ITA procedure. Instead, we take the transitive closure of such a binary relation, turning it into a surmise relation. Moreover, the other steps of ITA are not considered anymore. This yields a collection of candidate models which contains the models derived from ITA. Eventually, from the collection of candidate surmise relation models, we select an 'optimal' one, here, one with maximum κ value. (Note that Theorem 1 is crucial at this point. We select among the quasi-ordinal knowledge spaces corresponding to the surmise relations according to Theorem 1. The measure κ is formulated at the level of persons, for knowledge structures.) This is the fully-automated version of MITA. In a user-controlled version, the user may narrow down this collection of competing models to a smaller one, based on important factors (e.g., psychological theory) not captured by the data analysis solely. (The fully-automated version of MITA is illustrated in Section 6.) MITA consists of three steps, STEP1-STEP3:

STEP1. Determine the binary relations ≺_L for L = 0, 1, . . . , N according to the ITA-rule.
STEP2. Take the transitive closure for all ≺ L (0 ≤ L ≤ N ). Consider the collection of (quasi-ordinal) knowledge spaces corresponding to those surmise relations according to Theorem 1. In the fully-automated version of MITA, these knowledge spaces constitute the final collection of competing models. In a user-controlled version, that collection is further narrowed down to a smaller sub-collection, based on important factors not captured by the data analysis solely.

STEP3.
From the collection of knowledge structure models in STEP2, select one with maximum κ value.
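The repair step of MITA, taking the transitive closure of an intransitive relation, can be sketched with Warshall's algorithm; the three-item relation below is an illustrative assumption.

```python
# MITA keeps intransitive relations and repairs them via transitive
# closure (Warshall's algorithm) instead of discarding them.
def transitive_closure(rel, m):
    closed = set(rel)
    for k in range(m):
        for i in range(m):
            for j in range(m):
                if (i, k) in closed and (k, j) in closed:
                    closed.add((i, j))
    return closed

# Reflexive but not transitive: (0,1) and (1,2) without (0,2).
rel = {(i, i) for i in range(3)} | {(0, 1), (1, 2)}
closure = transitive_closure(rel, 3)
print((0, 2) in closure)   # True: the missing pair is added
```

The closure is then a surmise relation (reflexive and transitive), to which Theorem 1 applies.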

Simulation Example
This section describes an application of the coefficients κ and CA. Their performances as selection measures are compared in a simulation study using the BLIM (Definition 4).

Data generating model
We consider the knowledge structure K = {∅, {a}, {b}, {a, b}, {a, b, c}, {a, b, d}, {a, b, c, d}, {a, b, c, e}, Q} on the domain Q = {a, b, c, d, e}. The Hasse diagram of the surmise relation ≺_K derived from K according to Theorem 1 is shown in Figure 2.
We assume that the knowledge states of K occur in the population under reference with the probabilities p(∅) = 0.04, p({a, b}) = 0.12, p({a, b, c}) = 0.11, p({a, b, d}) = 0.07, p({a, b, c, d}) = 0.13, and so on for the remaining states {a}, {b}, {a, b, c, e}, and Q. Let the careless error and lucky guess probabilities β_q and η_q at the items q ∈ Q, respectively, be specified for each item separately. This BLIM was used for the simulation of the data (Section 6.2). Note that the specification of the model parameters p(K) (K ∈ K) and β_q, η_q (q ∈ Q) is a realistic one. We do not assume a single response error rate over all items; the careless error and lucky guess rates vary from item to item. We do not assume a uniform probability distribution on K; the knowledge states occur with different proportions in the population under reference. Furthermore, from an empirical point of view, the lower values for the lucky guess rates do not cause concern, because guessing effects can nearly be eliminated by appropriate item formulation.
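The simulation mechanism can be sketched as follows: draw a latent state with the state probabilities, then flip each item's response with the careless error or lucky guess rate. The states, weights, and error rates below are illustrative assumptions, not the paper's exact specification.

```python
import random

random.seed(1)
domain = "abcde"
# Assumed toy structure, state weights, and error rates.
states = [set(), {"a", "b"}, {"a", "b", "c"}, set(domain)]
p = [0.2, 0.3, 0.3, 0.2]
beta = {q: 0.1 for q in domain}    # careless error rates
eta = {q: 0.05 for q in domain}    # lucky guess rates

def simulate_pattern():
    """Draw a state, then generate a response pattern under the BLIM."""
    K = random.choices(states, weights=p)[0]
    R = set()
    for q in domain:
        solved = (random.random() >= beta[q]) if q in K \
            else (random.random() < eta[q])
        if solved:
            R.add(q)
    return frozenset(R)

data = [simulate_pattern() for _ in range(1200)]
print(len(data))   # 1200 simulated response patterns
```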

Simulated data
We simulated a binary (0/1) 1 200 × 5 data matrix of response patterns for 1 200 fictitious subjects. The data matrix contains all of the 32 possible response patterns; hence 1 168 response patterns are duplicates. This matrix of item scores is displayed in Table 1. (The 32 response patterns are shown with their absolute frequencies in the data. There are 73 response patterns '00000' (no item solved) and 88 response patterns '11111' (all items solved).)
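Data of this kind can be generated by first drawing a knowledge state from the state distribution and then perturbing it item by item with careless errors and lucky guesses. A minimal sketch follows; the toy states and parameter values in the usage example are illustrative placeholders, not those of the study.

```python
import random

def simulate_blim(n, states, state_probs, beta, eta, domain, seed=0):
    """Draw n response patterns (sets of solved items) from a BLIM."""
    rng = random.Random(seed)
    patterns = []
    for _ in range(n):
        K = rng.choices(states, weights=state_probs)[0]
        R = {q for q in domain
             if (q in K and rng.random() >= beta[q])        # solved, no slip
             or (q not in K and rng.random() < eta[q])}     # lucky guess
        patterns.append(R)
    return patterns

# Illustrative two-item usage (placeholder parameters):
data = simulate_blim(100, [set(), {'a'}, {'a', 'b'}], [0.2, 0.3, 0.5],
                     {'a': 0.1, 'b': 0.1}, {'a': 0.05, 'b': 0.05},
                     ['a', 'b'])
```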

Results: MITA of simulated data
We describe the results of an application of MITA to the simulated BLIM data.
We determined the binary relations ≺_L (0 ≤ L ≤ N = 1 200) according to the ITA-rule, and their transitive closures (0 ≤ L ≤ 1 200). This resulted in a collection of 15 surmise relations and, correspondingly, a collection of 15 (quasi-ordinal) knowledge spaces. These collections contained the true models underlying the data, ≺_K and K, respectively. (A complete list of the competing surmise relation and knowledge structure models in this example can be obtained from the first author.) From these collections, we selected 'optimal' models with maximum CA and κ values. Table 2 reports the values of CA and κ (for n_s and c = 0.01). Models are labeled with their tolerance levels. The true model is indicated by '(true)'. L_CA and L_κ denote the maximum CA and κ solutions, respectively. A notation '0-58' means that the same surmise relation and knowledge space were obtained for the tolerance levels 0 ≤ L ≤ 58. The coefficient CA decreases steadily (except for L = 96-100), with its maximum value attained at the lowest tolerance range L_CA = 0-58. The 'optimal' surmise relation selected based on CA is ≺_0-58, which is the diagonal i ≺_0-58 i (i ∈ Q). It consists of 5 item pairs out of 5² = 25 possible pairs (20%). The knowledge space K_0-58 is the set of all subsets of Q, consisting of 32 knowledge states (log₂(|K_0-58|) = 5). These 'best' solutions based on CA do not reflect the true models at all. The true surmise relation ≺_K (= ≺_101-150) and knowledge space K (= K_101-150) consist of 12 item pairs and 9 knowledge states, respectively.
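The construction of the relations ≺_L from the data matrix can be sketched as follows, assuming the classical ITA counterexample rule: i precedes j at tolerance L when at most L subjects solve j while failing i. This is our reading of the rule for illustration, not a verbatim excerpt of MITA.

```python
def ita_relation(data, domain, L):
    """Binary relation at tolerance L: (i, j) is included when the number
    of response patterns containing j but not i is at most L."""
    return {(i, j) for i in domain for j in domain
            if sum(1 for R in data if j in R and i not in R) <= L}

# Toy data: one subject solves only a, two solve both a and b.
# At L = 0, a precedes b (no counterexamples) but not vice versa.
data = [{'a'}, {'a', 'b'}, {'a', 'b'}]
rel0 = ita_relation(data, ['a', 'b'], 0)
```

Raising L relaxes the rule: at L = 1 the single counterexample to b ≺ a is tolerated and the pair (b, a) enters the relation as well.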

Summary
In this article, we have proposed a 'Goodman-Kruskal type' measure κ for selecting among competing knowledge structure models in KST, as underlying latent explanations for discrete multivariate response data. We have utilized this measure in a new 'ITA type' data-analytic method for detecting knowledge structures from data.
This measure κ is suited for nominal data and is (operationally) interpretable in terms of prediction error (success) probabilities of a prediction paradigm. It is designed to combine and trade off the (descriptive) fit and size of a knowledge structure, which is of high interest in KST, especially in the context of adaptive knowledge assessment procedures.
We have compared κ with the Correlational Agreement Coefficient CA, which has recently been discussed as a selection measure for competing surmise relation models. The performances of the two coefficients have been investigated in a simulation study using the fundamental BLIM in KST, with the candidate competing surmise relation and knowledge structure models obtained via the proposed MITA method. The 'optimal' solutions based on CA did not reflect the true models at all, whereas the solutions based on κ were quite acceptable.

Further research
The current simulation study is a starting point for more in-depth analyses of the measure κ. Further research may use systematic, extensive simulation studies to address the effects of varying the sample size (especially small sample sizes), the underlying knowledge structure model, and the BLIM parameters. In particular, inferential (asymptotic) statistics (e.g., confidence intervals) and applications to real psychological test data are important and indispensable directions for future research.
Measures of a 'Goodman-Kruskal type' could also be derived within the order-theoretic (at the level of items) formulation of KST, for surmise relations or even surmise systems. The relationship between the set-theoretic and order-theoretic measures could then be investigated. In particular, alternative data analysis methods of an 'ITA type' could be derived, and these procedures could be compared with each other, and especially with other available related or unrelated data-analytic methods.