Log Loss
logloss.factor — R Documentation

Description

The logloss() function computes the Log Loss between observed classes (as a <factor>) and predicted class probabilities (as a <numeric> matrix). The weighted.logloss() function is the weighted version, applying observation-specific weights.

Usage

## S3 method for class 'factor'
logloss(actual, response, normalize = TRUE, ...)

## S3 method for class 'factor'
weighted.logloss(actual, response, w, normalize = TRUE, ...)

## S3 method for class 'integer'
logloss(actual, response, normalize = TRUE, ...)

## S3 method for class 'integer'
weighted.logloss(actual, response, w, normalize = TRUE, ...)

## Generic S3 method
logloss(
  actual,
  response,
  normalize = TRUE,
  ...
)

## Generic S3 method
weighted.logloss(
  actual,
  response,
  w,
  normalize = TRUE,
  ...
)
Arguments

actual
    A vector of <factor> values of length n, containing the observed classes.

response
    A \(n \times k\) <numeric> matrix of predicted class probabilities.

normalize
    A <logical> value (default: TRUE). If TRUE, the mean of the loss across observations is returned.

…
    Arguments passed into other methods, for example micro = NULL and na.rm = TRUE.

w
    A <numeric> vector of sample weights.
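To make the expected shapes concrete, here is a minimal sketch of the inputs (the names `actual`, `response`, and `w` mirror the argument names above; it is assumed that the columns of `response` follow the order of the factor levels, as the example further below suggests):

```r
# Toy 4-sample, 2-class setup matching the argument shapes above.
actual <- factor(c("yes", "no", "yes", "no"), levels = c("yes", "no"))  # length n = 4

response <- matrix(
  c(0.8, 0.2,
    0.3, 0.7,
    0.6, 0.4,
    0.1, 0.9),
  nrow = 4, byrow = TRUE  # n x k matrix: column 1 = P("yes"), column 2 = P("no")
)

w <- rep(1, 4)  # sample weights of length n

# logloss(actual, response)                 # unweighted
# weighted.logloss(actual, response, w = w) # weighted
```

Each row of `response` is a probability distribution over the k classes, so the rows sum to 1.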
Value

A <numeric> vector of length 1.
Definition
\[H(y, response) = -\sum_{i} \sum_{j} y_{ij} \log(response_{ij})\]

where:

- \(y_{ij}\) is the indicator derived from the actual values: \(y_{ij} = 1\) if the i-th sample belongs to class j, and 0 otherwise.
- \(response_{ij}\) is the estimated probability of the i-th sample belonging to class j.
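The definition can be traced by hand on a tiny example. This sketch builds the one-hot indicator \(y_{ij}\) from a factor and applies the formula directly, using the natural logarithm (the usual Log Loss convention) and dividing by n as `normalize = TRUE` would; it does not call `logloss()` itself:

```r
# Hand-computed Log Loss for a tiny 3-sample, 2-class problem,
# following the definition above (double sum over samples and classes).
actual <- factor(c("A", "B", "A"), levels = c("A", "B"))
response <- matrix(
  c(0.9, 0.1,   # sample 1: P(A) = 0.9, P(B) = 0.1
    0.2, 0.8,   # sample 2
    0.6, 0.4),  # sample 3
  nrow = 3, byrow = TRUE
)

# One-hot indicator y_ij: row i selects the row of the identity
# matrix corresponding to the observed class of sample i.
y <- diag(nlevels(actual))[as.integer(actual), ]

# Normalized (mean) log loss: only the probability assigned to the
# true class of each sample contributes, since y_ij is 0 elsewhere.
manual_logloss <- -sum(y * log(response)) / nrow(response)
manual_logloss
```

Here the contributions are \(-\log(0.9)\), \(-\log(0.8)\), and \(-\log(0.6)\), averaged over the three samples.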
Examples
# 1) Recode the iris data set to a binary classification problem
# Here, the positive class ("Virginica") is coded as 1,
# and the rest ("Others") is coded as 0.
iris$species_num <- as.numeric(iris$Species == "virginica")

# 2) Fit a logistic regression model predicting species_num from Sepal.Length & Sepal.Width
model <- glm(
  formula = species_num ~ Sepal.Length + Sepal.Width,
  data = iris,
  family = binomial(link = "logit")
)

# 3) Generate predicted classes: "Virginica" vs. "Others"
predicted <- factor(
  as.numeric(predict(model, type = "response") > 0.5),
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 3.1) Generate actual classes
actual <- factor(
  x = iris$species_num,
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# For Log Loss, we need predicted probabilities for each class.
# Since it's a binary model, we create a 2-column matrix:
#   1st column = P("Virginica")
#   2nd column = P("Others") = 1 - P("Virginica")
predicted_probs <- predict(model, type = "response")
response_matrix <- cbind(predicted_probs, 1 - predicted_probs)

# 4) Evaluate unweighted Log Loss
# 'logloss' takes (actual, response, normalize = TRUE/FALSE).
# The factor 'actual' must have the positive class (Virginica) as its first level.
unweighted_LogLoss <- logloss(
  actual = actual,            # factor of observed classes
  response = response_matrix, # numeric matrix of probabilities
  normalize = TRUE
)

# 5) Evaluate weighted Log Loss
# We introduce a weight vector, for example:
weights <- iris$Petal.Length / mean(iris$Petal.Length)
weighted_LogLoss <- weighted.logloss(
  actual = actual,
  response = response_matrix,
  w = weights,
  normalize = TRUE
)

# 6) Print Results
cat(
  "Unweighted Log Loss:", unweighted_LogLoss,
  "Weighted Log Loss:", weighted_LogLoss,
  sep = "\n"
)