I have read the paper "Estimating the Support of a High-Dimensional Distribution", which describes the one-class SVM provided in sklearn.
I noticed that the dual variables are subject to the constraint ∑_i α_i = 1.
But when I inspected clf._dual_coef_, I found that its entries do not sum to 1.
Did I miss any detail about it?
Thanks
For one-class SVM, LIBSVM solves a scaled problem in which every α_i is multiplied by νℓ, where ν is the hyperparameter and ℓ is the number of training instances. The constraints therefore become 0≤α_i≤1 and ∑_i α_i = νℓ.
Section 2.3 of the LIBSVM paper says:
Similar to the case of ν-SVC, in LIBSVM, we solve a scaled version of (7).
and Section 2.2 (ν-Support Vector Classification) gives the reason:
In LIBSVM, we solve a scaled version of problem (5) because numerically α_i may be too small due to the constraint α_i≤1/ℓ.
So for one-class SVM, LIBSVM solves a scaled problem because numerically α_i may be too small due to the constraint α_i≤1/(νℓ).
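To make the scaling explicit, here is a short sketch of the two problems side by side. The notation (Q for the kernel matrix, e for the all-ones vector, a bar for the scaled variables) is my own choice for illustration; the constraints are the ones stated above.

```latex
% Dual from the paper (unscaled):
\min_{\alpha}\ \tfrac{1}{2}\,\alpha^{\top} Q\,\alpha
\quad \text{s.t.}\quad 0 \le \alpha_i \le \frac{1}{\nu\ell},\qquad e^{\top}\alpha = 1

% Substitute \bar{\alpha}_i = \nu\ell\,\alpha_i (the problem LIBSVM solves):
\min_{\bar{\alpha}}\ \tfrac{1}{2}\,\bar{\alpha}^{\top} Q\,\bar{\alpha}
\quad \text{s.t.}\quad 0 \le \bar{\alpha}_i \le 1,\qquad e^{\top}\bar{\alpha} = \nu\ell
```

The substitution only multiplies the objective by the positive constant (νℓ)², so the minimizer is the same up to the factor νℓ.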
Since the question is about sklearn, I modified the code from here to check this numerically; as far as I understand, sklearn.svm.OneClassSVM uses LIBSVM as its backend.
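from sklearn.svm import OneClassSVM
from sklearn.datasets import load_boston  # note: removed in scikit-learn 1.2

# Two columns of the Boston housing data (506 samples)
X = load_boston()['data'][:, [8, 10]]
clf = OneClassSVM(nu=0.261, gamma=0.05)
clf.fit(X)

# Expected sum of the scaled dual variables: nu * l
print(clf.nu*X.shape[0])
# Actual sum of the dual coefficients returned by LIBSVM
print(clf._dual_coef_.sum())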
gives
132.066
132.06599999999918
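To get back the α_i of the paper, divide the coefficients by νℓ. Below is a minimal sketch of that check. It makes two assumptions that are not in the code above: the data is synthetic (random Gaussian points), and it reads the public dual_coef_ attribute, which for OneClassSVM should hold the same values as _dual_coef_.

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Synthetic stand-in data (assumption: any dense 2-D array works here)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))

nu = 0.25
clf = OneClassSVM(nu=nu, gamma=0.05).fit(X)

n = X.shape[0]
alpha_scaled = clf.dual_coef_.ravel()  # scaled variables as stored by LIBSVM
alpha_paper = alpha_scaled / (nu * n)  # undo the scaling by nu * l

print(alpha_scaled.sum(), nu * n)  # these two should agree
print(alpha_paper.sum())           # should be close to 1
print(alpha_scaled.max())          # should not exceed 1
```

Only the support vectors appear in dual_coef_; all other points have α_i = 0 and do not affect the sum.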