Ranking 4 Classification

Class imbalance is pervasive in many classification domains: the class distribution is not uniform, sometimes extremely so. It is common in medicine and health care, where the number of patients varies widely across disease severities; it is inherent in fraud and fault detection, where the anomaly is rare; and it arises in many other fields.

These models were produced during the development of the following paper. Please cite the paper if you use them.

Cruz, R., Fernandes, K., Cardoso, J. S., & Costa, J. F. P. (2016, July). Tackling class imbalance with ranking. In Neural Networks (IJCNN), 2016 International Joint Conference on (pp. 2182-2187). IEEE. [paper]

These models can be used for traditional ranking problems, or for classification by wrapping them in the Threshold wrapper. They have only been tested in a few contexts and need more thorough testing; let me know if you run into problems.

Usage example:

from sklearn.datasets import load_breast_cancer
X, y = load_breast_cancer(return_X_y=True)  # binary dataset

from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from adaboost import RankBoost
from threshold import Threshold  # assumption: the Threshold wrapper is defined in threshold.py

Xtr, Xts, ytr, yts = train_test_split(X, y)

model = Threshold(RankBoost(100))
model.fit(Xtr, ytr)
yp = model.predict(Xts)
print(f1_score(yts, yp))
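
To make the idea behind thresholding concrete: a ranking model only outputs scores, so some cut-off on those scores is needed before they can be read as class labels. The following is a minimal sketch of one way to pick such a cut-off, assuming it is chosen to maximize F1 on the training scores. The function names (`f1`, `choose_threshold`, `apply_threshold`) and the toy data are hypothetical and are not taken from this repository; the actual Threshold wrapper may use a different criterion.

```python
def f1(labels, preds):
    """F1 score for the positive class (label 1)."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def apply_threshold(scores, threshold):
    """Label everything scored above the cut-off as the positive class."""
    return [1 if s > threshold else 0 for s in scores]


def choose_threshold(scores, labels):
    """Return the cut-off on the ranking scores that maximizes training F1."""
    ordered = sorted(set(scores))
    # Candidate cut-offs: midpoints between consecutive distinct scores.
    candidates = [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]
    if not candidates:  # all scores identical: nothing to separate
        return ordered[0]
    return max(candidates, key=lambda t: f1(labels, apply_threshold(scores, t)))


# Toy imbalanced data (hypothetical): 2 positives among 8 samples,
# with scores as some ranking model might produce them.
scores = [0.1, 0.2, 0.25, 0.3, 0.4, 0.45, 0.7, 0.9]
labels = [0, 0, 0, 0, 0, 1, 0, 1]

threshold = choose_threshold(scores, labels)
preds = apply_threshold(scores, threshold)
print(threshold, preds, f1(labels, preds))
```

Choosing the cut-off by F1 rather than accuracy matters under imbalance: on this toy data, predicting the negative class everywhere is 75% accurate yet never finds a positive.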

