What is a Root Mean Squared Logarithmic Error (RMSLE)? RMSLE Explained
Root Mean Squared Logarithmic Error (RMSLE) is an evaluation metric commonly used in regression problems, particularly when the target variable spans a wide range of values. It measures the typical difference between the logarithms of the predicted and actual values, so it reflects relative error rather than absolute error. Because the differences are squared, the sign of each individual error does not appear in the final score.
The RMSLE is calculated by taking the square root of the mean of the squared logarithmic differences between the predicted values (ŷ) and the actual values (y) of the target variable:
RMSLE = sqrt(mean((log(ŷ + 1) - log(y + 1))^2))
Here’s a step-by-step explanation of how to calculate RMSLE:
1. Add 1 to both the predicted values (ŷ) and the actual values (y) of the target variable so that the logarithm is defined when a value is zero. This does not help with values of -1 or below, so RMSLE is intended for non-negative targets and predictions.
2. Compute the natural logarithm of the shifted predicted and actual values: log_predicted = log(ŷ + 1), log_actual = log(y + 1)
3. Calculate the difference between the logarithms of the predicted and actual values: log_diff = log_predicted - log_actual
4. Square each logarithmic difference to eliminate the negative signs and emphasize larger errors: squared_log_diff = log_diff^2
5. Calculate the mean of the squared logarithmic differences: mean_squared_log_diff = mean(squared_log_diff)
6. Take the square root of the mean squared logarithmic difference to obtain the RMSLE: RMSLE = sqrt(mean_squared_log_diff)
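The steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a reference implementation: the function name rmsle, the argument names, and the sample values are assumptions made for the example, and the input checks are one reasonable choice among several.

```python
import math


def rmsle(y_true, y_pred):
    """Root Mean Squared Logarithmic Error for two equal-length sequences.

    Hypothetical helper written for illustration; names are not from the article.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("inputs must not be empty")
    # log(x + 1) is only defined for x > -1.
    if any(v <= -1 for v in y_true) or any(v <= -1 for v in y_pred):
        raise ValueError("all values must be greater than -1")

    # math.log1p(x) computes log(x + 1) accurately, covering the add-1 and log steps.
    squared_log_diffs = [
        (math.log1p(p) - math.log1p(t)) ** 2
        for t, p in zip(y_true, y_pred)
    ]
    mean_squared_log_diff = sum(squared_log_diffs) / len(squared_log_diffs)
    return math.sqrt(mean_squared_log_diff)


# Made-up sample data, for demonstration only.
actual = [3.0, 5.0, 2.5, 7.0]
predicted = [2.5, 5.0, 4.0, 8.0]
print(round(rmsle(actual, predicted), 4))
```

Using math.log1p instead of math.log(x + 1) is a small design choice: the two are mathematically identical, but log1p is more accurate when x is close to zero.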
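Because RMSLE works on log-transformed values, it measures relative rather than absolute differences: predicting 110 for an actual value of 100 contributes roughly the same error as predicting 110,000 for an actual value of 100,000. This reduces the impact of large differences in the high-value range, which is why the metric is useful when the target variable has a wide range of values. RMSLE is not symmetric, however: for the same absolute error, an underestimate is penalized more heavily than an overestimate.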
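A small numeric check can illustrate how the log transform behaves at different scales and in different directions. The helper name log_error and all of the numbers below are assumptions chosen for the example.

```python
import math


def log_error(actual, predicted):
    """Per-sample logarithmic difference used inside RMSLE (illustrative helper)."""
    return math.log1p(predicted) - math.log1p(actual)


# Same 10% relative error at two very different scales.
small_scale = abs(log_error(100, 110))
large_scale = abs(log_error(100_000, 110_000))

# Same absolute error of 50, in opposite directions.
under = abs(log_error(100, 50))   # prediction too low
over = abs(log_error(100, 150))   # prediction too high

print(f"10% error at scale 100:     {small_scale:.4f}")
print(f"10% error at scale 100000:  {large_scale:.4f}")
print(f"underestimate by 50:        {under:.4f}")
print(f"overestimate by 50:         {over:.4f}")
```

The first two values come out nearly equal, which reflects the focus on relative error. The underestimate produces a larger logarithmic difference than the overestimate of the same absolute size, so under this metric, predicting too low costs more than predicting too high by the same amount.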
RMSLE is often used in competitions and challenges, especially when the target variable has a skewed distribution or contains outliers. Because the log transform compresses large values, a few extreme targets do not dominate the score the way they can with error metrics computed on the raw scale, which makes model comparisons less sensitive to those cases.