I use the Python interface for libsvm. After choosing the best C and gamma parameters (RBF kernel) with a grid search, I train the model with 5-fold cross-validation, and the accuracy I get is exactly the same as the label ratio in my training dataset.
I have 3947 samples; 2898 of them have the label -1, and the rest have the label 1. So 73.4229% of the samples are labeled -1.
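For reference, this is how I get that ratio. It is just a quick arithmetic check on the counts above; the variable names are mine and nothing here touches libsvm.

```python
# Label counts taken from my dataset
n_total = 3947
n_negative = 2898                   # samples labeled -1
n_positive = n_total - n_negative   # samples labeled 1

# Fraction of the dataset that belongs to the majority class (-1)
majority_ratio = n_negative / n_total
print(f"{n_positive} positive samples, majority ratio = {majority_ratio:.4%}")
```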
When I train the model with 5-fold cross-validation, this is what I get:
optimization finished,
nu = 0.531517 obj = -209.738688,
rho = 0.997250 nSV = 1847, nBSV = 1534
Total nSV = 1847
Cross Validation Accuracy = 73.4229%
Does this mean that the SVM is not taking the features into account? Or is the data at fault here? Are the two even related? I just cannot get past the number 73.4229. Also, the number of support vectors should be much smaller than the size of the dataset, but in this case it is not.
In general, what does it mean when the cross-validation accuracy matches the label ratio in the dataset?
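To make my suspicion concrete, here is a minimal sketch in plain Python (not libsvm). It assumes a hypothetical "classifier", `majority_label`, that ignores the features and always predicts the most common training label, and it uses a simple interleaved 5-fold split of my own choosing. With my label counts, such a model would reproduce exactly the accuracy I am seeing, which is why I wonder whether the SVM is doing the same thing.

```python
import random

# Label counts from my dataset; the features are deliberately ignored
labels = [-1] * 2898 + [1] * 1049
random.seed(0)            # arbitrary seed, only for reproducibility
random.shuffle(labels)

def majority_label(train_labels):
    """Hypothetical classifier: returns the most frequent training label."""
    return max(set(train_labels), key=train_labels.count)

k = 5
correct = 0
for fold in range(k):
    # Hold out every k-th sample, train on the rest
    test = labels[fold::k]
    train = [y for i, y in enumerate(labels) if i % k != fold]
    prediction = majority_label(train)
    correct += sum(1 for y in test if y == prediction)

cv_accuracy = correct / len(labels)
print(f"Cross Validation Accuracy = {cv_accuracy:.4%}")
```

If the real model behaves like this, every predicted label would be -1, so comparing the predicted labels against the true ones should show whether the label 1 is ever predicted.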