Cohen’s Kappa Statistic

ckappa.factor R Documentation

Description

A generic function for Cohen’s \(\kappa\)-statistic. Use weighted.ckappa() for the weighted \(\kappa\)-statistic.

Usage

## S3 method for class 'factor'
ckappa(actual, predicted, beta = 0, ...)

## S3 method for class 'factor'
weighted.ckappa(actual, predicted, w, beta = 0, ...)

## S3 method for class 'cmatrix'
ckappa(x, beta = 0, ...)

ckappa(
 ...,
 beta = 0
)

weighted.ckappa(
 ...,
 w,
 beta = 0
)

Arguments

actual

A <factor>-vector of length \(n\), with \(k\) levels.

predicted

A <factor>-vector of length \(n\), with \(k\) levels.

beta

A <numeric> value of length 1 (default: 0). If \(\beta \neq 0\), the off-diagonals of the confusion matrix are penalized by a factor of \((y_{+} - y_{i,-})^\beta\).

...

micro = NULL, na.rm = TRUE. Arguments passed into other methods.

w

A <numeric>-vector of length \(n\). NULL by default.

x

A confusion matrix created by cmatrix().

Value

If micro is NULL (the default), a named <numeric>-vector of length \(k\).

If micro is TRUE or FALSE, a <numeric>-vector of length 1.

Definition

Let \(\kappa \in [-1, 1]\) be the inter-rater (intra-rater) reliability, where 1 is perfect agreement, 0 is the agreement expected by chance, and negative values indicate less agreement than chance. It is calculated as,

\[ \kappa = \frac{\rho_p - \rho_e}{1-\rho_e} \]

Where:

  • \(\rho_p\) is the empirical probability of agreement between predicted and actual values

  • \(\rho_e\) is the expected probability of agreement under random chance
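The formula above can be checked by hand in base R. The sketch below is illustrative only: the confusion matrix is made up, and the variable names (X, rho_p, rho_e, kappa_value) are hypothetical, not part of the package.

```r
# Hypothetical 2 x 2 confusion matrix (rows: actual, columns: predicted);
# the counts are invented for illustration
X <- matrix(
  c(40, 10,
     5, 45),
  nrow     = 2,
  byrow    = TRUE,
  dimnames = list(
    actual    = c("Virginica", "Others"),
    predicted = c("Virginica", "Others")
  )
)

n <- sum(X)

# empirical probability of agreement:
# share of observations on the diagonal
rho_p <- sum(diag(X)) / n

# expected probability of agreement under chance:
# product of the row and column marginal proportions,
# summed over classes
rho_e <- sum(rowSums(X) * colSums(X)) / n^2

kappa_value <- (rho_p - rho_e) / (1 - rho_e)
print(kappa_value)
```

Here \(\rho_p = 0.85\) and \(\rho_e = 0.5\), so \(\kappa = 0.35 / 0.5 = 0.7\).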

If \(\beta \neq 0\), the off-diagonals of the confusion matrix are penalized before \(\rho_p\) and \(\rho_e\) are calculated. More formally,

\[ \chi = X \circ Y^{\beta} \]

Where:

  • \(X\) is the confusion matrix

  • \(Y\) is the penalizing matrix and

  • \(\beta\) is the penalizing factor
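The elementwise product \(X \circ Y^{\beta}\) can be sketched in base R. The construction of \(Y\) is not spelled out here, so the sketch assumes \(Y_{ij} = |i - j|\), the usual distance-based disagreement weight; this is an assumption for illustration and may differ from the package's internals. The counts and variable names are hypothetical.

```r
# Hypothetical 3-class confusion matrix (rows: actual, columns: predicted)
X <- matrix(
  c(20,  5,  0,
     4, 30,  6,
     1,  5, 29),
  nrow  = 3,
  byrow = TRUE
)

# ASSUMPTION: penalizing matrix Y[i, j] = |i - j|, so the diagonal is 0
# and cells further from the diagonal receive a larger penalty
Y <- abs(row(X) - col(X))

beta <- 2

# elementwise (Hadamard) product; `*` and `^` act elementwise in R
chi <- X * Y^beta

# expected counts under chance, from the marginals of X
E <- outer(rowSums(X), colSums(X)) / sum(X)

# penalized kappa in its standard disagreement form:
# one minus observed penalized disagreement over expected
kappa_penalized <- 1 - sum(chi) / sum(E * Y^beta)
print(kappa_penalized)
```

With \(\beta = 2\), a misclassification two classes away from the diagonal counts four times as much as one in the adjacent class, while correct classifications carry no penalty.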

Examples

# 1) recode Iris
# to binary classification
# problem
iris$species_num <- as.numeric(
  iris$Species == "virginica"
)

# 2) fit the logistic
# regression
model <- glm(
  formula = species_num ~ Sepal.Length + Sepal.Width,
  data    = iris,
  family  = binomial(
    link = "logit"
  )
)

# 3) generate predicted
# classes
predicted <- factor(
  as.numeric(
    predict(model, type = "response") > 0.5
  ),
  levels = c(1,0),
  labels = c("Virginica", "Others")
)

# 3.1) generate actual
# classes
actual <- factor(
  x = iris$species_num,
  levels = c(1,0),
  labels = c("Virginica", "Others")
)

# 4) evaluate model performance with
# Cohen's Kappa statistic
cat(
  "Kappa", ckappa(
    actual    = actual,
    predicted = predicted
  ),
  "Kappa (penalized)", ckappa(
    actual    = actual,
    predicted = predicted,
    beta      = 2
  ),
  "Kappa (weighted)", weighted.ckappa(
    actual    = actual,
    predicted = predicted,
    w         = iris$Petal.Length/mean(iris$Petal.Length)
  ),
  sep = "\n"
)