This function takes an indicator matrix with rows representing objects and columns representing sets, and computes a minimal redundancy-free set using the greedy set cover algorithm. The aim is to find a minimal set of clusters that covers all objects (or at least a minimum proportion rat of them).

Alternatively, the number of clusters k can be specified, in which case the problem becomes a maximum coverage problem. Both versions also accept weights such as frequencies (weighted set cover / maximum coverage).
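To make the greedy strategy concrete, the following base R sketch repeatedly picks the set that covers the most still-uncovered objects. It is only an illustration under simple assumptions (unweighted objects, ties broken by the lowest column index); the helper name greedy_cover_sketch is hypothetical and this is not the package's implementation of setcover.

```r
# Hypothetical helper for illustration; NOT the implementation behind setcover().
# x:   0/1 indicator matrix (rows = objects, columns = sets)
# k:   optional maximum number of sets to select (maximum coverage variant)
# rat: minimum proportion of objects that must be covered
greedy_cover_sketch <- function(x, k = NULL, rat = 1) {
  x <- as.matrix(x) > 0
  covered <- rep(FALSE, nrow(x))
  chosen <- integer(0)
  max_sets <- if (is.null(k)) ncol(x) else min(k, ncol(x))
  while (length(chosen) < max_sets && mean(covered) < rat) {
    # number of still-uncovered objects that each set would add
    gain <- colSums(x[!covered, , drop = FALSE])
    if (max(gain) == 0) break          # no set covers anything new
    best <- unname(which.max(gain))    # ties go to the lowest column index
    chosen <- c(chosen, best)
    covered <- covered | x[, best]
  }
  chosen
}

x <- cbind(
  A = c(1, 1, 1, 0, 0, 0),
  B = c(0, 0, 0, 1, 1, 0),
  C = c(0, 0, 1, 1, 0, 0),
  D = c(0, 0, 0, 0, 0, 1)
)
greedy_cover_sketch(x)           # full cover: columns 1, 2, 4
greedy_cover_sketch(x, k = 2)    # best two sets: columns 1, 2
greedy_cover_sketch(x, rat = 0.8)  # stops once at least 80% are covered
```

With k set, the loop stops after k picks even if some objects remain uncovered, which is the maximum coverage variant described above.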

setcover(x, k = NULL, rat = 1, s = NULL, w = NULL, check = TRUE)

Arguments

x

The indicator matrix.

k

An optional number of clusters.

rat

The minimum proportion of objects that is to be covered by the cluster set. If weights are specified in w, the proportion is computed from the weights.

s

If weights are specified but not all objects are covered by one of the sets, it can be necessary to specify the total weight in order to compute a sensible ratio.

w

Optional weights per object.

check

Whether or not to check for redundancies.
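The interplay of rat, w and s can be illustrated with a small base R sketch that computes the covered proportion for a given selection of sets. The helper coverage_ratio_sketch is hypothetical, and the way s enters the ratio (covered weight divided by s, defaulting to the sum of w) is an assumption made for illustration, not a statement about the internals of setcover.

```r
# Hypothetical helper for illustration; how setcover() uses s internally
# is an assumption here.
coverage_ratio_sketch <- function(x, sets, w = NULL, s = NULL) {
  x <- as.matrix(x) > 0
  if (is.null(w)) w <- rep(1, nrow(x))   # unweighted: every object counts once
  if (is.null(s)) s <- sum(w)            # assumed default: total of supplied weights
  covered <- rowSums(x[, sets, drop = FALSE]) > 0
  sum(w[covered]) / s
}

x <- cbind(A = c(1, 1, 0, 0), B = c(0, 0, 1, 0))
w <- c(10, 5, 3, 2)                      # e.g. frequencies per object

coverage_ratio_sketch(x, sets = 1, w = w)            # 15 / 20
coverage_ratio_sketch(x, sets = 1:2, w = w)          # 18 / 20
# If objects with total weight 5 are missing from x, a larger s corrects the ratio:
coverage_ratio_sketch(x, sets = 1:2, w = w, s = 25)  # 18 / 25
```

In this sketch a selection satisfies the requirement once the returned value reaches rat.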

Value

The indices of the clusters in the minimal redundancy-free set. Because the algorithm is greedy, the result is not always the globally optimal solution.
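A small base R sketch may help show what a redundancy in a greedy result looks like. The helper drop_redundant_sketch is hypothetical and only one plausible way to perform such a check; the package's own check may work differently. It drops a selected set when every object in it is also covered by the other selected sets.

```r
# Hypothetical helper for illustration; NOT the check used by setcover().
drop_redundant_sketch <- function(x, sets) {
  x <- as.matrix(x) > 0
  keep <- sets
  for (j in sets) {
    others <- setdiff(keep, j)
    if (length(others) == 0) next
    covered_by_others <- rowSums(x[, others, drop = FALSE]) > 0
    # j is redundant if all of its objects are covered by the other kept sets
    if (all(covered_by_others[x[, j]])) keep <- others
  }
  keep
}

x <- cbind(
  X = c(1, 1, 1, 0, 0, 0),
  Y = c(0, 0, 0, 1, 1, 1),
  Z = c(0, 1, 1, 0, 1, 1)
)
# A greedy pass picks Z first (4 objects), then needs X and Y for objects 1 and 4,
# giving three sets although X and Y alone already cover everything.
greedy_order <- c(3, 1, 2)
drop_redundant_sketch(x, greedy_order)   # 1 2
```

In this example removing the redundant set Z happens to recover the optimal cover; in general a redundancy-free greedy result can still be larger than the optimum.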

Note

This function was written to support the GSAC algorithm.

See also

Examples

# compute 100 clusterings with 24 clusters each:
sc <- scale(olives[, 3:10])
km100 <- as.data.frame(replicate(100, kmeans(sc, centers = 24)$cluster))

# convert to indicator matrix
I100 <- idat(km100)

# select from all clusters a minimum set:
scover <- setcover(as.matrix(I100))
cdata <- subtable(
  as.data.frame(cbind(olives[, 1:2], I100[, scover])),
  1:(length(scover) + 2)
)
scpcp(cdata, sel = "Area")