cd.cluster {ACTCD} | R Documentation |

`cd.cluster` is used to classify examinees into unlabeled clusters based on cluster analysis. Available options include *K*-means and Hierarchical Agglomerative Cluster Analysis (HACA) with various links.

    cd.cluster(Y, Q, method = c("HACA", "Kmeans"),
               Kmeans.centers = NULL, Kmeans.itermax = 10, Kmeans.nstart = 1,
               HACA.link = c("complete", "ward", "single", "average",
                             "mcquitty", "median", "centroid"),
               HACA.cut = NULL)

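| Argument | Description |
|---|---|
| `Y` | A required *N* × *J* response matrix, where *N* is the number of examinees and *J* is the number of items. |
| `Q` | A required *J* × *K* Q-matrix, where *K* is the number of attributes. |
| `method` | The clustering algorithm used to classify data. Two options are available, including `"HACA"` (the default) and `"Kmeans"`. |
| `Kmeans.centers` | The number of clusters when `method = "Kmeans"`. The default, `NULL`, uses *2^K* clusters. |
| `Kmeans.itermax` | The maximum number of iterations allowed when `method = "Kmeans"`. The default is 10. |
| `Kmeans.nstart` | The number of random sets to be chosen when `method = "Kmeans"`. The default is 1. |
| `HACA.link` | The link to be used with HACA. It must be one of `"complete"`, `"ward"`, `"single"`, `"average"`, `"mcquitty"`, `"median"`, or `"centroid"`. |
| `HACA.cut` | The number of clusters when `method = "HACA"`. The default, `NULL`, uses *2^K* clusters. |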

Based on the Asymptotic Classification Theory (Chiu, Douglas & Li, 2009), a sample statistic ***W*** (see `ACTCD`) is calculated from the response matrix and Q-matrix provided by the user and then taken as the input for cluster analysis (i.e., *K*-means or HACA).
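As a rough illustration of this pipeline, the sketch below builds a *W*-like statistic in base R and passes it to `stats::kmeans` and `stats::hclust`. It assumes that *W* is the matrix of attribute-wise sum scores, `Y %*% Q`, as described in Chiu, Douglas & Li (2009). The toy data and all object names (`Y_toy`, `Q_toy`, `W_toy`, and so on) are made up for illustration, and this is not the package's internal code.

```r
set.seed(1)

# Hypothetical toy data: N examinees, J items, K attributes
N <- 100
Q_toy <- matrix(c(1, 0,
                  0, 1,
                  1, 1,
                  1, 0,
                  0, 1,
                  1, 1), ncol = 2, byrow = TRUE)
J <- nrow(Q_toy)
K <- ncol(Q_toy)
Y_toy <- matrix(rbinom(N * J, size = 1, prob = 0.5), nrow = N, ncol = J)

# Assumed form of the statistic: attribute-wise sum scores (N x K)
W_toy <- Y_toy %*% Q_toy

M <- 2^K  # number of latent clusters

# K-means on W
km <- kmeans(W_toy, centers = M, iter.max = 10, nstart = 1)

# HACA on W with complete linkage, cut into M clusters
hc <- hclust(dist(W_toy), method = "complete")
haca_class <- cutree(hc, k = M)

# Cluster sizes under each method
table(km$cluster)
table(haca_class)
```

Both calls return unlabeled cluster memberships only; the cluster numbers carry no attribute-profile meaning until a labeling step is applied.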

The number of latent clusters can be specified by the user in `Kmeans.centers` or `HACA.cut`. It must be at least 2 and at most *2^K*, where *K* is the number of attributes. Note that if the number of latent clusters is less than the default value (*2^K*), the clusters cannot be labeled by `labeling` with the `method="1"` or `method="3"` algorithms. See `labeling` for more information.
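The constraint on the number of clusters can be expressed as a small check. The helper below, `check_n_clusters`, is hypothetical and not part of the package; it only sketches the rule stated above, assuming that a `NULL` value falls back to the default of *2^K*.

```r
# Hypothetical helper: validate a user-specified number of clusters
check_n_clusters <- function(M, K) {
  max_clusters <- 2^K
  if (is.null(M)) {
    return(max_clusters)  # fall back to the default, 2^K
  }
  if (length(M) != 1 || !is.numeric(M) || M != round(M)) {
    stop("the number of clusters must be a single whole number")
  }
  if (M < 2 || M > max_clusters) {
    stop("the number of clusters must be between 2 and 2^K = ", max_clusters)
  }
  M
}

check_n_clusters(NULL, K = 3)  # default for K = 3
check_n_clusters(5, K = 3)     # valid, but fewer than 2^K clusters
```

In the second call the value is accepted, but because it is below *2^K* the resulting clusters fall under the labeling restriction described above.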

| Component | Description |
|---|---|
| `W` | The ***W*** statistic used as input for the cluster analysis. |
| `size` | A set of integers, indicating the sizes of latent clusters. |
| `mean.w` | A matrix of cluster centers, representing the average ***W*** of each latent cluster. |
| `wss.w` | The vector of within-cluster sums of squares of ***W***. |
| `sqmwss.w` | The vector of square roots of the mean within-cluster sum of squares of ***W***. |
| `mean.y` | The vector of mean sum scores of the clusters. |
| `class` | The vector of estimated cluster memberships for examinees. |
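To make these components concrete, the sketch below recomputes comparable summaries in base R from a toy *W*-like matrix and a vector of cluster labels. All object names and values are hypothetical. The formula used for the `sqmwss.w` analogue (the within-cluster sum of squares divided by the cluster size, then square-rooted) is only one reading of the description and is not a confirmed account of the package's computation.

```r
# Hypothetical toy inputs: a 6 x 2 W-like matrix, sum scores, and cluster labels
W_toy <- matrix(c(0, 0,
                  1, 0,
                  3, 3,
                  4, 3,
                  0, 4,
                  1, 4), ncol = 2, byrow = TRUE)
sum_score <- c(0, 1, 5, 6, 3, 4)
class_toy <- c(1, 1, 2, 2, 3, 3)

# Cluster sizes
size_toy <- as.vector(table(class_toy))

# Cluster centers: mean of W within each cluster (clusters in rows)
mean_w <- rowsum(W_toy, class_toy) / size_toy

# Within-cluster sum of squares around each cluster center
wss_w <- sapply(sort(unique(class_toy)), function(g) {
  Wg <- W_toy[class_toy == g, , drop = FALSE]
  sum(sweep(Wg, 2, colMeans(Wg))^2)
})

# Assumed reading: square root of the mean within-cluster sum of squares
sqmwss_w <- sqrt(wss_w / size_toy)

# Mean sum score per cluster
mean_y <- as.vector(tapply(sum_score, class_toy, mean))
```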

Chiu, C. Y., Douglas, J. A., & Li, X. (2009). Cluster analysis for cognitive diagnosis: theory and applications. *Psychometrika, 74*(4), 633-665.

`print.cd.cluster`, `labeling`, `npar.CDM`, `ACTCD`

    # Classification based on the simulated data and Q matrix
    data(sim.dat)
    data(sim.Q)

    # Information about the dataset
    N <- nrow(sim.dat)  # number of examinees
    J <- nrow(sim.Q)    # number of items
    K <- ncol(sim.Q)    # number of attributes

    # The default number of latent clusters is 2^K
    cluster.obj <- cd.cluster(sim.dat, sim.Q)
    # cluster size
    sizeofc <- cluster.obj$size
    # W statistics
    W <- cluster.obj$W

    # User-specified number of latent clusters
    M <- 5  # the number of clusters is fixed to 5
    cluster.obj <- cd.cluster(sim.dat, sim.Q, method = "HACA", HACA.cut = M)
    # cluster size
    sizeofc <- cluster.obj$size
    # W statistics
    W <- cluster.obj$W

    M <- 5  # the number of clusters is fixed to 5
    cluster.obj <- cd.cluster(sim.dat, sim.Q, method = "Kmeans", Kmeans.centers = M)
    # cluster size
    sizeofc <- cluster.obj$size
    # W statistics
    W <- cluster.obj$W

[Package *ACTCD* version 1.2-0 Index]