Implementing a support vector machine - computing the Gram matrix K EFFICIENTLY

I am implementing an SVM for the MNIST data in Python, using cvxopt to solve the QP and get alpha.

My problem is computing the Gram matrix K **EFFICIENTLY**. I started with two classes (digits 6 and 0) and a small training set: first 1k examples, then 10k.

To compute the whole 1k x 1k matrix faster, I use Process and give each process a different part of the input data. It still takes about 2 minutes with the RBF (Gaussian) kernel. (The 10k run is still going!)

If anyone has worked on this, or is simply a Python lover who can help me here, that would be great!

PS: If anyone does not know how the Gram matrix is computed, here are the details. It's simple:

for i in range(1000):
    for j in range(1000):
        K[i, j] = some_fun(x[i], x[j])

where some_fun is a dot product or a fancy Gaussian.
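To make the loop above concrete, here is a minimal sketch of the naive computation with both kernel choices. The names `linear_kernel`, `gaussian_kernel`, `naive_gram`, the bandwidth `sigma`, and the tiny random data set are illustrative assumptions, not code from the question.

```python
import numpy as np

def linear_kernel(a, b):
    # plain dot product between two sample vectors
    return np.dot(a, b)

def gaussian_kernel(a, b, sigma=1.0):
    # RBF kernel exp(-|a - b|^2 / (2 sigma^2)); sigma is an assumed bandwidth
    diff = a - b
    return np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2))

def naive_gram(x, kernel):
    # x has one row per sample; fills K[i, j] = kernel(x[i], x[j]) pair by pair
    n = x.shape[0]
    K = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            K[i, j] = kernel(x[i], x[j])
    return K

# tiny made-up data set, only to exercise the functions
x_demo = np.random.RandomState(0).normal(size=(5, 3))
K_lin = naive_gram(x_demo, linear_kernel)
K_rbf = naive_gram(x_demo, gaussian_kernel)
```

This version makes n * n Python-level kernel calls, which is where the time goes for 1k or 10k samples.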

I am using Python 2.7, numpy, and a MacBook Air with 4 GB of RAM and a 128 GB SSD.

[EDIT] For anyone who ends up here: yes, SVM takes a long time, and if you do multi-class classification you need to compute the Gram matrix again, so it takes even longer. I suggest implementing the algorithm, double-checking it, and letting it run overnight. The next day you will see a good result, that's for sure! :)

1 answer

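You are using numpy, right? Then you will get a big speedup by using numpy's matrix operations instead of looping in Python. For example, assuming x is a data matrix with one row per data point and one column per dimension: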

import numpy as np

# x: (n_samples, n_features) array; sigma: RBF bandwidth (both defined elsewhere)

# get a matrix where the (i, j)th element is |x[i] - x[j]|^2
# using the identity (x - y)^T (x - y) = x^T x + y^T y - 2 x^T y
pt_sq_norms = (x ** 2).sum(axis=1)
dists_sq = np.dot(x, x.T)
dists_sq *= -2
dists_sq += pt_sq_norms.reshape(-1, 1)
dists_sq += pt_sq_norms

# turn into an RBF gram matrix: exp(-|x[i] - x[j]|^2 / (2 sigma^2))
km = dists_sq; del dists_sq
km /= -2.0 * sigma ** 2
np.exp(km, km)  # exponentiates in-place
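
As a sanity check on the vectorized approach above, the following sketch wraps it in a function and compares the result with a direct pairwise loop on a small random input. The function names, the test data, and sigma = 2.0 are assumptions for illustration, and the kernel is taken to be exp(-|x[i] - x[j]|^2 / (2 sigma^2)).

```python
import numpy as np

def rbf_gram(x, sigma):
    # vectorized RBF Gram matrix for x with one row per sample
    x = np.asarray(x, dtype=float)
    sq_norms = (x ** 2).sum(axis=1)
    d2 = np.dot(x, x.T)
    d2 *= -2
    d2 += sq_norms.reshape(-1, 1)   # add |x[i]|^2 down each row
    d2 += sq_norms                  # add |x[j]|^2 across each column
    np.maximum(d2, 0, out=d2)       # clip tiny negatives caused by round-off
    d2 /= -2.0 * sigma ** 2
    np.exp(d2, out=d2)              # exponentiate in place
    return d2

def rbf_gram_loop(x, sigma):
    # slow reference implementation, one pair at a time
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    K = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            diff = x[i] - x[j]
            K[i, j] = np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2))
    return K

x_check = np.random.RandomState(0).normal(size=(20, 8))
K_fast = rbf_gram(x_check, sigma=2.0)
K_slow = rbf_gram_loop(x_check, sigma=2.0)
agree = np.allclose(K_fast, K_slow)
```

If `agree` is True on a small input, the fast version can be used on the full data set with more confidence.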

With data generated by np.random.normal(size=(1000, 784)), this takes about 70 ms on my i5 iMac. With 10k data points it takes about 7 seconds.
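
To measure this on your own machine, a rough timing sketch along these lines may help. It uses random data of shape (1000, 784) as mentioned above; the function name `rbf_gram`, the default sigma = 1.0, and the use of `timeit.default_timer` are choices made for this illustration, and the measured time will differ from machine to machine.

```python
import timeit
import numpy as np

def rbf_gram(x, sigma=1.0):
    # vectorized RBF Gram matrix, exp(-|x[i] - x[j]|^2 / (2 sigma^2))
    sq = (x ** 2).sum(axis=1)
    k = np.dot(x, x.T)
    k *= -2
    k += sq.reshape(-1, 1)
    k += sq
    k /= -2.0 * sigma ** 2
    np.exp(k, out=k)
    return k

x_bench = np.random.normal(size=(1000, 784))  # 1000 samples, 784 features

start = timeit.default_timer()
K_bench = rbf_gram(x_bench)
elapsed = timeit.default_timer() - start

print("1000 x 1000 Gram matrix: %.3f s" % elapsed)
```

Changing the first dimension of `x_bench` to 10000 gives the corresponding figure for 10k points; note that a 10000 x 10000 float64 matrix occupies 800 MB by itself.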

sklearn.metrics.pairwise.rbf_kernel works similarly; it is parameterized by gamma, which corresponds to 1 / (2 sigma^2) here.

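Also, in Python 2 you should loop over xrange(1000), not range(1000). range builds an actual list to loop over, which costs time and, more importantly, memory. You can probably get away with that for 10,000 elements, but it can cause real problems as your loops get larger.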
