Jaccard Score

jaccard.factor

R Documentation

Description

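The jaccard() function computes the Jaccard Index, also known as Intersection over Union (IoU), between two vectors of predicted and observed factor() values. The weighted.jaccard() function computes the weighted Jaccard Index. The csi() and tscore() functions listed under Usage take the same arguments; Critical Success Index and Threat Score are other names for the same quantity.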
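To make the "Intersection over Union" reading concrete, the base R sketch below treats each class as two sets of observation indices (those observed in the class and those predicted as the class) and divides the size of their intersection by the size of their union. The toy vectors and the helper `iou_for_class()` are hypothetical and only for illustration; this is not the package's implementation.

```r
# Toy labels (hypothetical, for illustration only)
actual_toy    <- factor(c("a", "a", "a", "b", "b", "c"), levels = c("a", "b", "c"))
predicted_toy <- factor(c("a", "a", "b", "b", "c", "c"), levels = c("a", "b", "c"))

# Hypothetical helper: Intersection over Union for a single class
iou_for_class <- function(actual, predicted, class) {
  observed_idx  <- which(actual == class)     # observations truly in the class
  predicted_idx <- which(predicted == class)  # observations predicted as the class
  length(intersect(observed_idx, predicted_idx)) /
    length(union(observed_idx, predicted_idx))
}

# One value per class, named by class level
iou <- vapply(
  levels(actual_toy),
  function(k) iou_for_class(actual_toy, predicted_toy, k),
  numeric(1)
)
print(iou)
```

For class "a", for instance, three observations are observed and two of them are also predicted, so the intersection has two elements and the union has three.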

Usage

## S3 method for class 'factor'
jaccard(actual, predicted, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'factor'
weighted.jaccard(actual, predicted, w, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'cmatrix'
jaccard(x, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'factor'
csi(actual, predicted, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'factor'
weighted.csi(actual, predicted, w, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'cmatrix'
csi(x, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'factor'
tscore(actual, predicted, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'factor'
weighted.tscore(actual, predicted, w, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'cmatrix'
tscore(x, micro = NULL, na.rm = TRUE, ...)

## Generic S3 method
jaccard(
 ...,
 micro = NULL,
 na.rm = TRUE
)

## Generic S3 method
csi(
 ...,
 micro = NULL,
 na.rm = TRUE
)

## Generic S3 method
tscore(
 ...,
 micro = NULL,
 na.rm = TRUE
)

## Generic S3 method
weighted.jaccard(
 ...,
 w,
 micro = NULL,
 na.rm = TRUE
)

## Generic S3 method
weighted.csi(
 ...,
 w,
 micro = NULL,
 na.rm = TRUE
)

## Generic S3 method
weighted.tscore(
 ...,
 w,
 micro = NULL,
 na.rm = TRUE
)

Arguments

`actual`	A vector of `<factor>` values of length \(n\) with \(k\) levels.
`predicted`	A vector of `<factor>` values of length \(n\) with \(k\) levels.
`micro`	A `<logical>`-value of length \(1\) (default: NULL). If TRUE, the micro average across all \(k\) classes is returned; if FALSE, the macro average is returned.
`na.rm`	A `<logical>`-value of length \(1\) (default: TRUE). If TRUE, NA values are removed from the computation. This argument is only relevant when `micro` is not NULL. When `na.rm = TRUE`, the computation corresponds to `sum(c(1, 2, NA), na.rm = TRUE) / length(na.omit(c(1, 2, NA)))`. When `na.rm = FALSE`, the computation corresponds to `sum(c(1, 2, NA), na.rm = TRUE) / length(c(1, 2, NA))`.
`…`	Arguments passed into other methods.
`w`	A `<numeric>`-vector of length \(n\). NULL by default.
`x`	A confusion matrix created by `cmatrix()`.
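The two expressions quoted for `na.rm` can be evaluated directly in base R to see how they differ. The vector `scores` below is a hypothetical stand-in for the values being averaged; the snippet only evaluates the quoted expressions and makes no claim about how the package computes them internally.

```r
# Hypothetical values to average, one of which is missing
scores <- c(1, 2, NA)

# na.rm = TRUE: the NA entry is dropped from the denominator as well
avg_na_rm_true <- sum(scores, na.rm = TRUE) / length(na.omit(scores))

# na.rm = FALSE: the NA entry still counts in the denominator
avg_na_rm_false <- sum(scores, na.rm = TRUE) / length(scores)

print(c(na.rm_TRUE = avg_na_rm_true, na.rm_FALSE = avg_na_rm_false))
```

The first expression divides 3 by 2 and the second divides 3 by 3, so a missing value pulls the average down only in the second case.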

Value

If `micro` is NULL (the default), a named `<numeric>`-vector of length \(k\).

If `micro` is TRUE or FALSE, a `<numeric>`-vector of length \(1\).

Definition

The metric is calculated for each class \(k\) as follows:

\[ \frac{\#TP_k}{\#TP_k + \#FP_k + \#FN_k} \]

where \(\#TP_k\), \(\#FP_k\), and \(\#FN_k\) represent the number of true positives, false positives, and false negatives for each class \(k\), respectively.
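As a worked instance of the formula, the base R sketch below derives the three counts from a confusion matrix built with `table()` and then forms class-wise, macro-averaged, and micro-averaged values. The toy labels are hypothetical, and the averaging follows the usual conventions (macro: mean of class-wise values; micro: pooled counts), which is an assumption here since the page does not spell them out.

```r
# Toy labels (hypothetical, for illustration only)
actual_toy    <- factor(c("a", "a", "a", "a", "b", "b", "c"), levels = c("a", "b", "c"))
predicted_toy <- factor(c("a", "a", "a", "b", "b", "c", "c"), levels = c("a", "b", "c"))

# Confusion matrix: rows are actual classes, columns are predicted classes
cm <- table(actual = actual_toy, predicted = predicted_toy)

tp <- diag(cm)            # true positives per class
fp <- colSums(cm) - tp    # predicted as class k, but actually another class
fn <- rowSums(cm) - tp    # actually class k, but predicted as another class

# Class-wise values: TP / (TP + FP + FN)
per_class <- tp / (tp + fp + fn)

# Assumed averaging conventions (not taken from the package source)
macro_avg <- mean(per_class)                 # average of class-wise values
micro_avg <- sum(tp) / sum(tp + fp + fn)     # ratio of pooled counts

print(per_class)
print(c(macro = macro_avg, micro = micro_avg))
```

With these labels the counts are TP = (3, 1, 1), FP = (0, 1, 1), and FN = (1, 1, 0), so the class-wise values are 3/4, 1/3, and 1/2, and the pooled (micro) value is 5/9.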

Examples

# 1) recode Iris
# to binary classification
# problem
iris$species_num <- as.numeric(
  iris$Species == "virginica"
)

# 2) fit the logistic
# regression
model <- glm(
  formula = species_num ~ Sepal.Length + Sepal.Width,
  data    = iris,
  family  = binomial(
    link = "logit"
  )
)

# 3) generate predicted
# classes
predicted <- factor(
  as.numeric(
    predict(model, type = "response") > 0.5
  ),
  levels = c(1,0),
  labels = c("Virginica", "Others")
)

# 3.1) generate actual
# classes
actual <- factor(
  x = iris$species_num,
  levels = c(1,0),
  labels = c("Virginica", "Others")
)

# 4) evaluate class-wise performance
# using Jaccard Index

# 4.1) unweighted Jaccard Index
jaccard(
  actual    = actual,
  predicted = predicted
)

# 4.2) weighted Jaccard Index
weighted.jaccard(
  actual    = actual,
  predicted = predicted,
  w         = iris$Petal.Length/mean(iris$Petal.Length)
)

# 5) evaluate overall performance
# using micro-averaged Jaccard Index
cat(
  "Micro-averaged Jaccard Index", jaccard(
    actual    = actual,
    predicted = predicted,
    micro     = TRUE
  ),
  "Micro-averaged Jaccard Index (weighted)", weighted.jaccard(
    actual    = actual,
    predicted = predicted,
    w         = iris$Petal.Length/mean(iris$Petal.Length),
    micro     = TRUE
  ),
  sep = "\n"
)