How to set class weights for OneVsRestClassifier in scikit-learn?
I need an SVM that works as a multilabel classifier, so I decided to use the OneVsRestClassifier wrapper. The problem is that the training set becomes highly unbalanced: for any given class there are many more negative examples than positive ones. This could be solved with the class_weight parameter, but if I use it in a classifier wrapped in OneVsRestClassifier, I get an error:
    from sklearn.svm import LinearSVC
    from sklearn.multiclass import OneVsRestClassifier

    weights = {'ham': 1, 'eggs': 2}
    svm = OneVsRestClassifier(LinearSVC(class_weight=weights))
    x = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 0]]
    y = [['ham'], [], ['eggs', 'spam'], ['spam'], ['eggs']]
    svm.fit(x, y)
    Traceback (most recent call last):
      File "", line 1, in
      File "/usr/local/lib/python2.7/site-packages/sklearn/multiclass.py", line 197, in fit
        n_jobs=self.n_jobs)
      File "/usr/local/lib/python2.7/site-packages/sklearn/multiclass.py", line 87, in fit_ovr
        for i in range(Y.shape[1]))
      File "/usr/local/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.py", line 514, in __call__
        self.dispatch(function, args, kwargs)
      File "/usr/local/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.py", line 311, in dispatch
        job = ImmediateApply(func, args, kwargs)
      File "/usr/local/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.py", line 135, in __init__
        self.results = func(*args, **kwargs)
      File "/usr/local/lib/python2.7/site-packages/sklearn/multiclass.py", line 56, in _fit_binary
        estimator.fit(X, y)
      File "/usr/local/lib/python2.7/site-packages/sklearn/svm/base.py", line 681, in fit
        self.classes_, y)
      File "/usr/local/lib/python2.7/site-packages/sklearn/utils/class_weight.py", line 49, in compute_class_weight
        if classes[i] != c:
    IndexError: index 2 is out of bounds for axis 0 with size 2
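The problem is that the wrapped LinearSVC expects the binary classes [0, 1]. OneVsRestClassifier trains one binary classifier per label, so each LinearSVC only ever sees "label absent" (0) and "label present" (1). Giving weights keyed by the original, non-binary classes ('ham', 'eggs' or [0, 1, 2]) therefore fails. You can use 'auto' weights instead, which automatically "balance" the classes by choosing appropriate weights. This works with the multiclass OneVsRest classifier: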
    svm = OneVsRestClassifier(LinearSVC(class_weight='auto'))
    x = [[1, 2], [3, 4], [5, 4]]
    y = [0, 1, 2]
    svm.fit(x, y)
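For the multilabel data from the question, the same idea might look like the sketch below. It rests on a few assumptions that are not in the original posts: a reasonably recent scikit-learn, where label lists are converted with MultiLabelBinarizer before fitting and where the 'auto' option is spelled 'balanced'. The weight of 3 for the positive class is an arbitrary value chosen only for illustration.

```python
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.svm import LinearSVC

X = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 0]]
labels = [['ham'], [], ['eggs', 'spam'], ['spam'], ['eggs']]

# Convert the label lists into a binary indicator matrix, one column per label.
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(labels)

# Each per-label LinearSVC only sees 0 (label absent) and 1 (label present),
# so an explicit weight dict is keyed by 0 and 1, not by 'ham' or 'eggs'.
# The weight 3 is an illustrative assumption, not a recommended value.
clf = OneVsRestClassifier(LinearSVC(class_weight={0: 1, 1: 3}))
clf.fit(X, Y)

# Alternatively, let scikit-learn derive the weights from class frequencies.
# 'balanced' is assumed here as the newer spelling of 'auto'.
balanced_clf = OneVsRestClassifier(LinearSVC(class_weight='balanced'))
balanced_clf.fit(X, Y)

# Map the predicted indicator rows back to tuples of label names.
predicted = mlb.inverse_transform(clf.predict(X))
```

With an explicit dict, the same 0/1 weights are applied to every label. If each label needs its own ratio, 'balanced' is the simpler route, since it is computed separately for each binary problem.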