The mcm function is a simple tool for analyzing different metrics derived from a confusion matrix. It can be very useful for analyzing the performance of a binary classification model.
This function depends on pandas and the confusion_matrix function from sklearn.metrics. With confusion_matrix, the True Negative (TN), True Positive (TP), False Negative (FN) and False Positive (FP) cases are calculated.
The following table categorizes predicted and actual conditions into four fundamental outcomes: True Positives (TP), True Negatives (TN), False Positives (FP), and False Negatives (FN).
Here are some key sections and their purpose:
- Basic Definitions
- Rates and Values
- Predictive Values
- Composite Metrics
- Advanced Ratios
The table is designed to help in the interpretation of test results, particularly in the context of diagnostic tests or in the performance evaluation of classification algorithms. Each metric provides a different insight into the accuracy and reliability of the test or model, allowing practitioners to assess its effectiveness comprehensively.
The table below illustrates the confusion matrix used in our analysis. This matrix visualizes the performance of the classification algorithm, highlighting the true positives, true negatives, false positives, and false negatives. Understanding these values is crucial for evaluating the accuracy and reliability of our model.
The following Python code demonstrates how to generate a confusion matrix using the pandas
and sklearn.metrics
libraries. This matrix is pivotal for calculating various performance metrics, such as precision and recall, which help further assess the effectiveness of the model.
import pandas as pd
from sklearn.metrics import confusion_matrix
?confusion_matrix
Compute confusion matrix to evaluate the accuracy of a classification.
By definition a confusion matrix :math:`C` is such that :math:`C_{i, j}`
is equal to the number of observations known to be in group :math:`i` and
predicted to be in group :math:`j`.
Thus in binary classification, the count of true negatives is
:math:`C_{0,0}`, false negatives is :math:`C_{1,0}`, true positives is
:math:`C_{1,1}` and false positives is :math:`C_{0,1}`.
Read more in the :ref:`User Guide <confusion_matrix>`.
Parameters
----------
y_true : array-like of shape (n_samples,)
    Ground truth (correct) target values.
y_pred : array-like of shape (n_samples,)
    Estimated targets as returned by a classifier.
labels : array-like of shape (n_classes), default=None
    List of labels to index the matrix. This may be used to reorder
    or select a subset of labels.
    If ``None`` is given, those that appear at least once
    in ``y_true`` or ``y_pred`` are used in sorted order.
sample_weight : array-like of shape (n_samples,), default=None
    Sample weights.
normalize : {'true', 'pred', 'all'}, default=None
    Normalizes confusion matrix over the true (rows), predicted (columns)
    conditions or all the population. If None, confusion matrix will not be
    normalized.

Returns
-------
C : ndarray of shape (n_classes, n_classes)
    Confusion matrix.

References
----------
.. [1] `Wikipedia entry for the Confusion matrix
       <https://en.wikipedia.org/wiki/Confusion_matrix>`_
       (Wikipedia and other references may use a different
       convention for axes)
Examples
--------
>>> from sklearn.metrics import confusion_matrix
>>> y_true = [2, 0, 2, 2, 0, 1]
>>> y_pred = [0, 0, 2, 2, 0, 2]
>>> confusion_matrix(y_true, y_pred)
array([[2, 0, 0],
       [0, 0, 1],
       [1, 0, 2]])
>>> y_true = ["cat", "ant", "cat", "cat", "ant", "bird"]
>>> y_pred = ["ant", "ant", "cat", "cat", "ant", "cat"]
>>> confusion_matrix(y_true, y_pred, labels=["ant", "bird", "cat"])
array([[2, 0, 0],
       [0, 0, 1],
       [1, 0, 2]])
In the binary case, we can extract true positives, etc as follows:
>>> tn, fp, fn, tp = confusion_matrix([0, 1, 0, 1], [1, 1, 1, 0]).ravel()
>>> (tn, fp, fn, tp)
(0, 2, 1, 1)
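The normalize argument described above can also express the matrix as proportions rather than raw counts. A brief sketch, reusing the binary toy example from the docstring:

# Row-normalized matrix: each row (true class) sums to 1.
confusion_matrix([0, 1, 0, 1], [1, 1, 1, 0], normalize = 'true')

array([[0. , 1. ],
       [0.5, 0.5]])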
data = pd.DataFrame({
    'y_true': ['Positive']*47 + ['Negative']*18,
    'y_pred': ['Positive']*37 + ['Negative']*10 + ['Positive']*5 + ['Negative']*13})

confusion_matrix(y_true = data.y_true,
                 y_pred = data.y_pred,
                 labels = ['Negative', 'Positive'])

array([[13,  5],
       [10, 37]], dtype=int64)
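To keep track of which axis is which, the raw array can be wrapped in a labelled DataFrame. This is just a small sketch; the row and column labels are my own, following the sklearn convention of true classes in rows and predicted classes in columns:

cm = confusion_matrix(y_true = data.y_true,
                      y_pred = data.y_pred,
                      labels = ['Negative', 'Positive'])

# True classes as rows, predicted classes as columns.
pd.DataFrame(cm,
             index = ['true Negative', 'true Positive'],
             columns = ['predicted Negative', 'predicted Positive'])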
In this case, from the confusion matrix we have the following results:
tn, fp, fn, tp = confusion_matrix(y_true = data.y_true,
                                  y_pred = data.y_pred,
                                  labels = ['Negative', 'Positive']).ravel()
(tn, fp, fn, tp)

(13, 5, 10, 37)
Metrics are statistical measures of the performance of a binary classification model.
Sensitivity refers to the test’s ability to correctly detect the positive class, that is, the cases that do have the condition.

\[\Large \mbox{Sensitivity} = \dfrac{TP}{TP + FN}\]

Specificity relates to the test’s ability to correctly identify the negative class, that is, the cases that do not have the condition.

\[\Large \mbox{Specificity} = \dfrac{TN}{TN + FP}\]
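As a quick sanity check, both formulas can be evaluated directly on the counts obtained above (a small sketch, not part of the mcm function itself):

tn, fp, fn, tp = 13, 5, 10, 37   # counts from the example above

sensitivity = tp / (tp + fn)     # 37 / 47 ≈ 0.787234
specificity = tn / (tn + fp)     # 13 / 18 ≈ 0.722222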
The mcm function has been developed as:

def mcm(tn, fp, fn, tp):
    """Consider a confusion matrix laid out like this:

          N    P
        +----+----+
        |    |    |
        | TN | FP |
        |    |    |
        +----+----+
        |    |    |
        | FN | TP |
        |    |    |
        +----+----+

    The predicted classes are given by columns and the true (observed)
    classes by rows, with the positive class in the right column and the
    bottom row. With these definitions, the TN, FP, FN and TP values appear
    in that order when the matrix is flattened row by row.

    Parameters
    ----------
    tn : int
        True Negatives (TN): the total number of outcomes where the model
        correctly predicts the negative class.
    fp : int
        False Positives (FP): the total number of outcomes where the model
        incorrectly predicts the positive class.
    fn : int
        False Negatives (FN): the total number of outcomes where the model
        incorrectly predicts the negative class.
    tp : int
        True Positives (TP): the total number of outcomes where the model
        correctly predicts the positive class.

    Returns
    -------
    metrics : pandas.DataFrame
        DataFrame with one metric per row ('Metric' and 'Value' columns).

    Notes
    -----
    https://en.wikipedia.org/wiki/Confusion_matrix
    https://developer.lsst.io/python/numpydoc.html
    https://www.mathworks.com/help/risk/explore-fairness-metrics-for-credit-scoring-model.html

    Examples
    --------
    data = pd.DataFrame({
        'y_true': ['Positive']*47 + ['Negative']*18,
        'y_pred': ['Positive']*37 + ['Negative']*10 + ['Positive']*5 + ['Negative']*13})

    tn, fp, fn, tp = confusion_matrix(y_true = data.y_true,
                                      y_pred = data.y_pred,
                                      labels = ['Negative', 'Positive']).ravel()
    """
    mcm = []
    mcm.append(['Sensitivity', tp / (tp + fn)])
    mcm.append(['Recall', tp / (tp + fn)])
    mcm.append(['True Positive rate (TPR)', tp / (tp + fn)])
    mcm.append(['Specificity', tn / (tn + fp)])
    mcm.append(['True Negative Rate (TNR)', tn / (tn + fp)])
    mcm.append(['Precision', tp / (tp + fp)])
    mcm.append(['Positive Predictive Value (PPV)', tp / (tp + fp)])
    mcm.append(['Negative Predictive Value (NPV)', tn / (tn + fn)])
    mcm.append(['False Negative Rate (FNR)', fn / (fn + tp)])
    mcm.append(['False Positive Rate (FPR)', fp / (fp + tn)])
    mcm.append(['False Discovery Rate (FDR)', fp / (fp + tp)])
    mcm.append(['Rate of Positive Predictions (PRR)', (fp + tp) / (tn + tp + fn + fp)])
    mcm.append(['Rate of Negative Predictions (RNP)', (fn + tn) / (tn + tp + fn + fp)])
    mcm.append(['Accuracy', (tp + tn) / (tp + tn + fp + fn)])
    mcm.append(['F1 Score', 2*tp / (2*tp + fp + fn)])
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    fnr = fn / (fn + tp)
    tnr = tn / (tn + fp)
    mcm.append(['Positive Likelihood Ratio (LR+)', tpr / fpr])
    mcm.append(['Negative Likelihood Ratio (LR-)', fnr / tnr])
    return pd.DataFrame(mcm, columns = ['Metric', 'Value'])
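Note that every metric divides by a sum of counts (tp + fn, tn + fp, tp + fp, ...). If the data ever lacks one of the classes, those denominators become zero and Python raises a ZeroDivisionError. A guarded variant could wrap the divisions in a small helper; this is only a sketch, and the safe_div name is mine, not part of mcm:

def safe_div(num, den):
    """Divide num by den, returning NaN when the denominator is zero."""
    return num / den if den != 0 else float('nan')

safe_div(37, 37 + 10)   # 0.7872..., same as tp / (tp + fn)
safe_div(0, 0)          # nan instead of ZeroDivisionError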
?mcm
Consider a confusion matrix laid out like this:

      N    P
    +----+----+
    |    |    |
    | TN | FP |
    |    |    |
    +----+----+
    |    |    |
    | FN | TP |
    |    |    |
    +----+----+

The predicted classes are given by columns and the true (observed)
classes by rows, with the positive class in the right column and the
bottom row. With these definitions, the TN, FP, FN and TP values appear
in that order when the matrix is flattened row by row.

Parameters
----------
tn : int
    True Negatives (TN): the total number of outcomes where the model
    correctly predicts the negative class.
fp : int
    False Positives (FP): the total number of outcomes where the model
    incorrectly predicts the positive class.
fn : int
    False Negatives (FN): the total number of outcomes where the model
    incorrectly predicts the negative class.
tp : int
    True Positives (TP): the total number of outcomes where the model
    correctly predicts the positive class.

Returns
-------
metrics : pandas.DataFrame
    DataFrame with one metric per row ('Metric' and 'Value' columns).

Notes
-----
https://en.wikipedia.org/wiki/Confusion_matrix
https://developer.lsst.io/python/numpydoc.html
https://www.mathworks.com/help/risk/explore-fairness-metrics-for-credit-scoring-model.html

Examples
--------
data = pd.DataFrame({
    'y_true': ['Positive']*47 + ['Negative']*18,
    'y_pred': ['Positive']*37 + ['Negative']*10 + ['Positive']*5 + ['Negative']*13})

tn, fp, fn, tp = confusion_matrix(y_true = data.y_true,
                                  y_pred = data.y_pred,
                                  labels = ['Negative', 'Positive']).ravel()
The arguments of the mcm function are: true negative (tn), false positive (fp), false negative (fn) and true positive (tp), in this order.
mcm(tn, fp, fn, tp)
|    | Metric | Value |
|---|---|---|
| 0 | Sensitivity | 0.787234 |
| 1 | Recall | 0.787234 |
| 2 | True Positive rate (TPR) | 0.787234 |
| 3 | Specificity | 0.722222 |
| 4 | True Negative Rate (TNR) | 0.722222 |
| 5 | Precision | 0.880952 |
| 6 | Positive Predictive Value (PPV) | 0.880952 |
| 7 | Negative Predictive Value (NPV) | 0.565217 |
| 8 | False Negative Rate (FNR) | 0.212766 |
| 9 | False Positive Rate (FPR) | 0.277778 |
| 10 | False Discovery Rate (FDR) | 0.119048 |
| 11 | Rate of Positive Predictions (PRR) | 0.646154 |
| 12 | Rate of Negative Predictions (RNP) | 0.353846 |
| 13 | Accuracy | 0.769231 |
| 14 | F1 Score | 0.831461 |
| 15 | Positive Likelihood Ratio (LR+) | 2.834043 |
| 16 | Negative Likelihood Ratio (LR-) | 0.294599 |
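Since the result is a regular pandas DataFrame with 'Metric' and 'Value' columns, a single metric can be looked up directly. A short usage sketch:

results = mcm(tn, fp, fn, tp)

# Use the metric names as an index to pick out one value.
results.set_index('Metric').loc['Accuracy', 'Value']   # 0.769231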
We can call the mcm function several times with a percentage of the samples and estimate the distribution of each metric. In the following example, 80% of the samples is used in each iteration.
data = pd.DataFrame({
    'y_true': ['Positive']*47 + ['Negative']*18,
    'y_pred': ['Positive']*37 + ['Negative']*10 + ['Positive']*5 + ['Negative']*13})

mcm_bootstrap = []
for i in range(100):
    aux = data.sample(frac = 0.8) # 80% of the samples
    tn, fp, fn, tp =\
        confusion_matrix(y_true = aux.y_true,
                         y_pred = aux.y_pred,
                         labels = ['Negative', 'Positive']).ravel()
    mcm_bootstrap.append(mcm(tn, fp, fn, tp))
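If reproducible draws are wanted, DataFrame.sample accepts a random_state; a hedged one-line tweak of the loop body (not in the original code) would be:

aux = data.sample(frac = 0.8, random_state = i)   # deterministic resample in iteration i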
After 100 iterations we can evaluate the mean, median, minimum, maximum and standard deviation of each metric.
pd\
.concat(mcm_bootstrap)\
.groupby('Metric')\
.agg({'Value' : ['mean', 'median', 'min', 'max', 'std']})
| Metric | mean | median | min | max | std |
|---|---|---|---|---|---|
| Accuracy | 0.769423 | 0.769231 | 0.711538 | 0.826923 | 0.026687 |
| F1 Score | 0.830781 | 0.828571 | 0.769231 | 0.883117 | 0.022230 |
| False Discovery Rate (FDR) | 0.120614 | 0.121212 | 0.055556 | 0.166667 | 0.024679 |
| False Negative Rate (FNR) | 0.212029 | 0.210526 | 0.131579 | 0.285714 | 0.031126 |
| False Positive Rate (FPR) | 0.278755 | 0.285714 | 0.142857 | 0.454545 | 0.053325 |
| Negative Likelihood Ratio (LR-) | 0.295672 | 0.294840 | 0.184211 | 0.416667 | 0.049432 |
| Negative Predictive Value (NPV) | 0.568801 | 0.571429 | 0.375000 | 0.722222 | 0.054842 |
| Positive Likelihood Ratio (LR+) | 2.942235 | 2.855263 | 1.824390 | 5.710526 | 0.641786 |
| Positive Predictive Value (PPV) | 0.879386 | 0.878788 | 0.833333 | 0.944444 | 0.024679 |
| Precision | 0.879386 | 0.878788 | 0.833333 | 0.944444 | 0.024679 |
| Rate of Negative Predictions (RNP) | 0.354346 | 0.365385 | 0.250000 | 0.423077 | 0.030707 |
| Rate of Positive Predictions (PRR) | 0.645654 | 0.634615 | 0.576923 | 0.750000 | 0.030707 |
| Recall | 0.787971 | 0.789474 | 0.714286 | 0.868421 | 0.031126 |
| Sensitivity | 0.787971 | 0.789474 | 0.714286 | 0.868421 | 0.031126 |
| Specificity | 0.721245 | 0.714286 | 0.545455 | 0.857143 | 0.053325 |
| True Negative Rate (TNR) | 0.721245 | 0.714286 | 0.545455 | 0.857143 | 0.053325 |
| True Positive rate (TPR) | 0.787971 | 0.789474 | 0.714286 | 0.868421 | 0.031126 |
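The same grouped values can also be summarised as bootstrap percentile intervals; a short sketch (the 2.5% and 97.5% quantiles are my own choice, not something computed in the original analysis):

pd\
    .concat(mcm_bootstrap)\
    .groupby('Metric')['Value']\
    .quantile([0.025, 0.975])\
    .unstack()   # one row per metric, one column per quantile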
We load matplotlib and seaborn to display the results.
import matplotlib.pyplot as plt
import seaborn as sns
For example, if we want to display the distribution of accuracy we can execute the following code.
aux = pd.concat(mcm_bootstrap)
aux\
.query('Metric == "Accuracy"')\
.plot(kind = 'density', label = 'Accuracy')
plt.show()
With seaborn it is easy to display the distribution of all the metrics.
g = sns.FacetGrid(pd.concat(mcm_bootstrap),
                  col = 'Metric',
                  col_wrap = 5,
                  sharey = False,
                  sharex = False)
g = g.map(sns.histplot, 'Value', bins = 12, kde = True, ec = 'white')
g = g.set_titles('{col_name}', size = 12)
The mcm function can help us analyse a confusion matrix. If the confusion matrix comes from a fitted model, we can evaluate the model with this function, with Sensitivity and Specificity as the principal metrics.
$ pylint mcm.py --good-names=tn,fp,fn,tp
--------------------------------------------------------------------
Your code has been rated at 10.00/10 (previous run: 10.00/10, +0.00)