Note: This article is reproduced from stackoverflow.com.
python scikit-learn svm libsvm

Confused about "dual_coef_" of OneClassSVM

Published on 2020-03-29 12:47:31

I have read the paper "Estimating the Support of a High-Dimensional Distribution", which describes the one-class SVM that is provided in sklearn.

I noticed that the dual variables are subject to the constraint ∑_i α_i = 1.

But when I inspected clf._dual_coef_, I found that its entries do not sum to 1.

Did I miss some detail?

Thanks

Questioner: Gu.lol
Viewed: 44
hychou 2020-01-31 17:28

Result

For one-class SVM, LIBSVM solves a scaled problem in which every α_i is multiplied by νℓ, where ν is the hyper-parameter and ℓ is the number of training instances. The constraints therefore become 0 ≤ α_i ≤ 1 and ∑_i α_i = νℓ.


Reason

Section 2.3 of the LIBSVM paper says:

Similar to the case of ν-SVC, in LIBSVM, we solve a scaled version of (7).

and Section 2.2 (ν-Support Vector Classification) gives the reason for the scaling:

In LIBSVM, we solve a scaled version of problem (5) because numerically α_i may be too small due to the constraint α_i≤1/ℓ.

By the same reasoning, LIBSVM solves a scaled problem for one-class SVM: under the original constraint α_i ≤ 1/(νℓ), the values of α_i could be numerically too small.
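
To make the rescaling explicit, here is a short sketch of the substitution. The notation Q_ij = k(x_i, x_j) and the bar over the rescaled variables are chosen here for illustration and are not taken from the question.

```latex
% Dual problem as stated in the one-class SVM paper:
\min_{\alpha}\ \frac{1}{2}\sum_{i,j}\alpha_i\alpha_j Q_{ij}
\quad\text{s.t.}\quad 0\le\alpha_i\le\frac{1}{\nu\ell},\qquad \sum_i\alpha_i = 1 .

% Substituting \bar\alpha_i = \nu\ell\,\alpha_i multiplies the objective by the
% positive constant (\nu\ell)^2, which does not change the minimizer:
\min_{\bar\alpha}\ \frac{1}{2}\sum_{i,j}\bar\alpha_i\bar\alpha_j Q_{ij}
\quad\text{s.t.}\quad 0\le\bar\alpha_i\le 1,\qquad \sum_i\bar\alpha_i = \nu\ell .
```

Under this reading, the values reported by the solver are the rescaled variables, and dividing them by νℓ should recover the α_i of the paper.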


Verification

Because the question is specifically about sklearn, I modified the code from here to confirm this, although as far as I understand sklearn.svm.OneClassSVM uses LIBSVM as its backend.

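# Note: load_boston was removed in scikit-learn 1.2, so this snippet needs an older version.
from sklearn.svm import OneClassSVM
from sklearn.datasets import load_boston

# Use two columns of the Boston housing data as training features.
X = load_boston()['data'][:, [8, 10]]
clf = OneClassSVM(nu=0.261, gamma=0.05)
clf.fit(X)

# Compare nu * (number of samples) with the sum of the dual coefficients.
print(clf.nu * X.shape[0])
print(clf._dual_coef_.sum())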

gives

132.066
132.06599999999918
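
The same check can be repeated without the Boston dataset. The following is a minimal sketch on synthetic Gaussian data; the data, the seed, and the values nu=0.25 and gamma=0.05 are assumptions made for illustration. It uses the public attribute dual_coef_, and it also divides by νℓ to see whether the paper's constraint ∑_i α_i = 1 is recovered.

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Synthetic training data (an assumption for illustration, not the original dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))

nu = 0.25
clf = OneClassSVM(nu=nu, gamma=0.05).fit(X)

n_samples = X.shape[0]
# Coefficients of the support vectors, as returned by the solver (scaled form).
alpha_scaled = clf.dual_coef_.ravel()
# Undo the scaling to get back to the paper's variables.
alpha_paper = alpha_scaled / (nu * n_samples)

print("nu * n_samples      :", nu * n_samples)
print("sum of dual_coef_   :", alpha_scaled.sum())
print("max of dual_coef_   :", alpha_scaled.max())
print("sum after rescaling :", alpha_paper.sum())
```

If the scaling argument holds, the first two printed values should agree up to floating-point error, no coefficient should exceed 1, and the rescaled coefficients should sum to approximately 1.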