Sunday, March 31, 2024

STAIR TWO: Model Evaluation Metrics

Before moving on to the next model, we have to understand how a model is evaluated. Whenever we build a model, we start with some inputs, and after training we get an output that depends on the various factors present in those inputs. Once we get that output, we have to evaluate the model we built. Measuring the performance of the trained model is just as crucial as preparing the data and training the machine learning model, which are both essential steps in the process. Machine learning models are classified as adaptive or non-adaptive based on how well they generalise on new data.
Before we deploy our model to production on unseen data, we should evaluate its performance using several measures so that we can improve its overall predictive power. If a machine learning model is judged solely on accuracy rather than evaluated properly with several metrics, it may cause issues when applied to unseen data and produce inaccurate predictions. This happens because such models memorise the training data instead of learning from it, and therefore cannot generalise adequately to unseen data. With that motivation, let's look at the evaluation metrics themselves, a crucial part of any data science project: their goal is to estimate a model's generalisation accuracy on future (unseen/out-of-sample) data.


A] Confusion Matrix:
A confusion matrix describes the performance of a classification model (also known as a "classifier") on a set of test data for which the true values are known. It is frequently used as a matrix representation of the prediction outcomes of any binary test.
Although the confusion matrix itself is not too difficult to understand, the associated terminology can be confusing.
[Figure: Confusion Matrix, showing actual vs. predicted classes] [1]

Each prediction can be one of four outcomes, depending on how it matches up to the actual value (a short code sketch follows this list):
  • True Positive (TP): Predicted True and True in reality.
A true positive is simply the case in which both the predicted and actual values are true: the model predicted that the patient had cancer, and the patient has indeed received a cancer diagnosis.
  • True Negative (TN): Predicted False and False in reality.
In this case, both the predicted value and the actual value are false. Stated differently, our model predicted that the patient did not have cancer, and the patient has not, in fact, received a cancer diagnosis.
  • False Positive (FP): Predicted True and False in reality. 
A false positive occurs when the predicted value is true but the actual value is false: the model predicted that the patient had cancer, but the patient does not actually have it.
  • False Negative (FN): Predicted False and True in reality.
A false negative occurs when the predicted value is false but the actual value is true: the model predicted that the patient did not have cancer, but the patient actually does.
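
As a minimal sketch of how these four counts are obtained (assuming scikit-learn is installed, and using made-up labels for the cancer example):

from sklearn.metrics import confusion_matrix

# Hypothetical labels: 1 = has cancer, 0 = no cancer
y_true = [1, 0, 1, 1, 0, 0, 1, 0]   # actual diagnoses
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]   # model predictions

# For binary labels, ravel() unpacks the 2x2 matrix as TN, FP, FN, TP
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp}  TN={tn}  FP={fp}  FN={fn}")   # TP=3  TN=3  FP=1  FN=1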

B] Performance Metrics:

1] Classification:
1: Accuracy:
Accuracy is one of the simplest classification metrics: it is calculated as the ratio of correct predictions to total predictions.
Accuracy = (TP + TN) / (TP + TN + FP + FN) [2]
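
Using the TP/TN/FP/FN counts from the toy sketch above, accuracy can be computed directly:

accuracy = (tp + tn) / (tp + tn + fp + fn)   # (3 + 3) / 8 = 0.75

# Equivalently, assuming scikit-learn:
# from sklearn.metrics import accuracy_score
# accuracy = accuracy_score(y_true, y_pred)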
2: Precision:
Precision is the percentage of the positive predictions made by our model that actually turn out to be correct.
Precision = TP / (TP + FP) [3]
3: Recall and Sensitivity:
Recall is a metric that indicates what percentage of the patients who actually have cancer were also predicted to have cancer. It answers the question, "How sensitive is the classifier in detecting positive instances?"
Recall = TP / (TP + FN) [4]
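
Continuing the same toy example, precision and recall fall out of the same four counts:

precision = tp / (tp + fp)   # 3 / 4 = 0.75: how many predicted positives were right
recall    = tp / (tp + fn)   # 3 / 4 = 0.75: how many actual positives were found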
4: Specificity:
It answers the question, "How specific or selective is the classifier?" with respect to negative cases, i.e. what percentage of the patients who do not have cancer were correctly identified as negative.
Specificity = TN / (TN + FP) [5]
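
scikit-learn has no dedicated specificity function, but it follows directly from the counts in the sketch above:

specificity = tn / (tn + fp)   # 3 / 4 = 0.75: how many actual negatives were correctly rejected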

5: F1 Score:
This is nothing but the harmonic mean of precision and recall.
F1 Score = 2 × (Precision × Recall) / (Precision + Recall) [6]
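
With precision and recall from the toy example above, the F1 score is one line:

f1 = 2 * precision * recall / (precision + recall)   # 0.75 here, since precision == recall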

2] Regression:
1: Mean Absolute Error:
One of the most basic metrics, mean absolute error (MAE) measures the average absolute difference between the actual and predicted values. "Absolute" here means that each error is taken as a positive number, regardless of its sign.
MAE = (1/N) Σ |y_i − ŷ_i|, where y_i is the actual value, ŷ_i the predicted value, and N the number of samples. [7]
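
A minimal NumPy sketch with made-up regression targets (the same toy arrays are reused for the metrics below):

import numpy as np

y_actual    = np.array([3.0, -0.5, 2.0, 7.0])   # hypothetical true values
y_predicted = np.array([2.5,  0.0, 2.0, 8.0])   # hypothetical predictions

mae = np.mean(np.abs(y_actual - y_predicted))   # (0.5 + 0.5 + 0.0 + 1.0) / 4 = 0.5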
2: Mean Squared Error:
Mean squared error (MSE) is one of the most widely used regression metrics. It calculates the average of the squared differences between the actual values and the values predicted by the model.
MSE = (1/N) Σ (y_i − ŷ_i)² [8]
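
With the same toy arrays:

mse = np.mean((y_actual - y_predicted) ** 2)   # (0.25 + 0.25 + 0.0 + 1.0) / 4 = 0.375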

3: R2 Score:
R squared, also known as the Coefficient of Determination, is another widely used metric for evaluating regression models. The R-squared metric assesses the model's performance by comparing it to a constant baseline: to obtain that baseline, we take the mean of the data and draw a horizontal line at that mean.
R² = 1 − (SS_res / SS_tot), where SS_res = Σ (y_i − ŷ_i)² is the residual sum of squares and SS_tot = Σ (y_i − ȳ)² is the total sum of squares about the mean. [9]
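
Again with the toy arrays above, R² compares the model's squared error to that of the mean baseline:

ss_res = np.sum((y_actual - y_predicted) ** 2)       # residual sum of squares
ss_tot = np.sum((y_actual - y_actual.mean()) ** 2)   # baseline: a flat line at the mean
r2 = 1 - ss_res / ss_tot                             # ≈ 0.949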

4: Adjusted R2:
As the name implies, adjusted R squared is an improved form of R squared. Plain R² can deceive data scientists because it never decreases as more terms are added to the model, even when the model is not actually improving. Adjusted R squared works around this problem, although it is always lower than R²: it penalises each added predictor, so it only shows an improvement when there is a genuine one.
Adjusted R² = 1 − (1 − R²) × (N − 1) / (N − p − 1), where N is the number of samples and p the number of predictors. [10]
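
Adjusted R² just rescales R² by the sample size and the predictor count (p = 1 is an assumption for this sketch):

n = len(y_actual)                               # N = 4 samples
p = 1                                           # hypothetical: one predictor
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)   # penalises extra predictors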

References:
[1] https://www.kdnuggets.com/2020/05/model-evaluation-metrics-machine-learning.html
[2,3,4,5,6] https://intellipaat.com/blog/confusion-matrix-python/
[7,8,9,10] https://www.javatpoint.com/performance-metrics-in-machine-learning

