This section describes all available classification metrics and the related documentation. Common to all classification functions is that they use the method foo.factor or foo.cmatrix.
Consider a classification problem with three classes: A, B, and C. The actual vector of factor values is defined as follows:
## set seed
set.seed(1903)
## actual
actual <- factor(
  x = sample(x = 1:3, size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
## print values
print(actual)
#> [1] B A B B A C B C C A
#> Levels: A B C
Here, the values 1, 2, and 3 are mapped to A, B, and C, respectively. Now, suppose your model does not predict any B’s. The predicted vector of factor values would be defined as follows:
## set seed
set.seed(1903)
## predicted
predicted <- factor(
  x = sample(x = c(1, 3), size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
## print values
print(predicted)
#> [1] C A C C C C C C A C
#> Levels: A B C
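In both cases the number of classes is \(k = 3\). It is determined indirectly by the levels argument, so B still counts as a class even though predicted contains no B values.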
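The role of the levels argument can be checked with base R alone. The sketch below rebuilds predicted from the printed values instead of sampling (an assumption made here so the result does not depend on the random number generator) and shows that the unused level B is still counted.

```r
## rebuild predicted from the printed values
## (hard-coded for illustration, not sampled)
predicted <- factor(
  x = c("C", "A", "C", "C", "C", "C", "C", "C", "A", "C"),
  levels = c("A", "B", "C")
)

## number of classes, k
k <- nlevels(predicted)
print(k)
#> [1] 3

## counts per class; B is kept with a count of zero
counts <- table(predicted)
print(counts)
```

Because the level set is fixed up front, any table built from these factors has one row or column per class, including classes that never occur.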
This section gives a brief introduction to the two methods.
## factor method
SLmetrics::accuracy(
  actual,
  predicted
)
#> [1] 0.3
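As a cross-check, the value 0.3 agrees with the share of positions where the two vectors match, computed here with base R only. The vectors are hard-coded from the printed values above, and mean(actual == predicted) is the textbook definition of accuracy; it is not claimed to be how SLmetrics computes it internally.

```r
## hard-coded copies of the printed vectors (for illustration)
lv <- c("A", "B", "C")
actual <- factor(
  x = c("B", "A", "B", "B", "A", "C", "B", "C", "C", "A"),
  levels = lv
)
predicted <- factor(
  x = c("C", "A", "C", "C", "C", "C", "C", "C", "A", "C"),
  levels = lv
)

## share of positions where actual and predicted agree (3 of 10)
manual_accuracy <- mean(actual == predicted)
print(manual_accuracy)
#> [1] 0.3
```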
## 1) generate confusion
## matrix (cmatrix class)
confusion_matrix <- SLmetrics::cmatrix(
  actual,
  predicted
)
## 2) check class
class(confusion_matrix)
#> [1] "cmatrix"
## 3) summarise
summary(confusion_matrix)
#> Confusion Matrix (3 x 3)
#> ================================================================================
#>   A B C
#> A 1 0 2
#> B 0 0 4
#> C 1 0 2
#> ================================================================================
#> Overall Statistics (micro average)
#> - Accuracy: 0.30
#> - Balanced Accuracy: 0.33
#> - Sensitivity: 0.30
#> - Specificity: 0.65
#> - Precision: 0.30
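The micro-averaged figures in the summary can be reproduced from the confusion matrix counts. The sketch below uses base R's table() on hard-coded copies of the printed vectors and assumes the usual pooled definitions (sum the per-class TP, FP, FN, and TN first, then take the ratio). These definitions match the printed values, but they are an assumption about the summary, not a statement of the package's internals.

```r
## hard-coded copies of the printed vectors (for illustration)
lv <- c("A", "B", "C")
actual <- factor(
  x = c("B", "A", "B", "B", "A", "C", "B", "C", "C", "A"),
  levels = lv
)
predicted <- factor(
  x = c("C", "A", "C", "C", "C", "C", "C", "C", "A", "C"),
  levels = lv
)

## rows: actual, columns: predicted
tab <- table(actual, predicted)

## per-class counts
tp <- diag(tab)               # correctly predicted
fp <- colSums(tab) - tp       # predicted as the class, but actually another
fn <- rowSums(tab) - tp       # actually the class, but predicted as another
tn <- sum(tab) - tp - fp - fn # everything else

## pooled (micro-averaged) ratios
micro <- c(
  sensitivity = sum(tp) / sum(tp + fn), # 3 / 10
  specificity = sum(tn) / sum(tn + fp), # 13 / 20
  precision   = sum(tp) / sum(tp + fp)  # 3 / 10
)
print(round(micro, 2))
```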
The confusion_matrix can be passed into accuracy() as follows:
SLmetrics::accuracy(
  confusion_matrix
)
#> [1] 0.3
Using the cmatrix method is more efficient if more than one classification metric is going to be calculated, as the metrics are calculated directly from the cmatrix object instead of looping through all the values in actual and predicted for each metric. See below:
cat(
  sep = "\n",
  paste("Accuracy:", SLmetrics::accuracy(confusion_matrix)),
  paste("Balanced Accuracy:", SLmetrics::baccuracy(confusion_matrix))
)
#> Accuracy: 0.3
#> Balanced Accuracy: 0.333333333333333
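The same idea, count once and then derive several metrics from the counts, can be sketched in base R. The example uses hard-coded copies of the printed vectors and assumes that balanced accuracy is the unweighted mean of the per-class recall. That definition is consistent with the values printed above, but it is an assumption, not a description of the package's implementation.

```r
## hard-coded copies of the printed vectors (for illustration)
lv <- c("A", "B", "C")
actual <- factor(
  x = c("B", "A", "B", "B", "A", "C", "B", "C", "C", "A"),
  levels = lv
)
predicted <- factor(
  x = c("C", "A", "C", "C", "C", "C", "C", "C", "A", "C"),
  levels = lv
)

## one pass over the data: rows are actual, columns are predicted
tab <- table(actual, predicted)

## both metrics are derived from the counts, not from the raw vectors
acc    <- sum(diag(tab)) / sum(tab) # 3 / 10
recall <- diag(tab) / rowSums(tab)  # A: 1/3, B: 0/4, C: 2/3
bacc   <- mean(recall)              # (1/3 + 0 + 2/3) / 3

cat(
  sep = "\n",
  paste("Accuracy:", acc),
  paste("Balanced Accuracy:", bacc)
)
```

Each additional metric only touches the 3 x 3 table, so its cost does not grow with the length of actual and predicted.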