Evaluate classification by compiling a report

Specific metrics have been developed to evaluate classifiers trained on imbalanced data. imblearn provides a classification report similar to the one in sklearn, with additional metrics specific to imbalanced learning problems.

                   pre       rec       spe        f1       geo       iba       sup

          0       0.42      0.84      0.88      0.56      0.86      0.73       123
          1       0.98      0.88      0.84      0.93      0.86      0.74      1127

avg / total       0.93      0.87      0.84      0.89      0.86      0.74      1250
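The column headers in the report are abbreviations: precision (pre), recall (rec), specificity (spe), F1 score (f1), geometric mean (geo), index balanced accuracy (iba), and support (sup). The sketch below shows how one row could be derived from confusion counts. It assumes that geo is the geometric mean of recall and specificity and that iba is the index balanced accuracy of the squared geometric mean with a weighting factor of 0.1. The counts are hypothetical: they were chosen to be in line with the rounded class 0 row above and are not taken from the script itself.

```python
import math

# Hypothetical confusion counts with class 0 treated as the positive class.
# These are illustrative values, not the actual confusion matrix of the example.
tp, fn = 103, 20   # class-0 samples predicted as 0 / predicted as 1
fp, tn = 140, 987  # class-1 samples predicted as 0 / predicted as 1

ALPHA = 0.1  # assumed weighting factor of the index balanced accuracy

precision = tp / (tp + fp)
recall = tp / (tp + fn)          # also called sensitivity
specificity = tn / (tn + fp)     # recall of the other class
f1 = 2 * precision * recall / (precision + recall)
geo = math.sqrt(recall * specificity)
# IBA weights the squared geometric mean by the dominance (recall - specificity).
iba = (1 + ALPHA * (recall - specificity)) * geo**2
support = tp + fn

print(
    f"pre={precision:.2f} rec={recall:.2f} spe={specificity:.2f} "
    f"f1={f1:.2f} geo={geo:.2f} iba={iba:.2f} sup={support}"
)
```

Swapping the roles of the two classes (class 1 as the positive class) exchanges recall and specificity, which is why the rec and spe columns of the two rows mirror each other in a binary problem.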

# Authors: Guillaume Lemaitre <g.lemaitre58@gmail.com>
# License: MIT


from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

from imblearn import over_sampling as os
from imblearn import pipeline as pl
from imblearn.metrics import classification_report_imbalanced

print(__doc__)

RANDOM_STATE = 42

# Generate a dataset
X, y = datasets.make_classification(
    n_classes=2,
    class_sep=2,
    weights=[0.1, 0.9],
    n_informative=10,
    n_redundant=1,
    flip_y=0,
    n_features=20,
    n_clusters_per_class=4,
    n_samples=5000,
    random_state=RANDOM_STATE,
)

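# Build a pipeline that scales the features, over-samples the minority class
# with SMOTE (applied during fit only), and trains a logistic regression
pipeline = pl.make_pipeline(
    StandardScaler(),
    os.SMOTE(random_state=RANDOM_STATE),
    LogisticRegression(max_iter=10_000),
)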

# Split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=RANDOM_STATE)

# Train the classifier with balancing
pipeline.fit(X_train, y_train)

# Test the classifier and get the prediction
y_pred_bal = pipeline.predict(X_test)

# Show the classification report
print(classification_report_imbalanced(y_test, y_pred_bal))


Related examples

Multiclass classification with under-sampling

Metrics specific to imbalanced learning

Usage of pipeline embedding samplers

Example of topic classification in text documents

Compare over-sampling samplers