As I point out in http://www.jianping-lai.com/2017/03/10/guassian-process/, the kernel (covariance) matrix can be decomposed to generate the noise term $$\vec{\epsilon}$$, where $$\vec{x} = \vec{m} + \vec{\epsilon}$$ and $$\vec{\epsilon} = \bf{L}\vec{u}$$, with $$\vec{u}$$ a vector of independent standard normal variables.
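As a quick sanity check of this idea (a minimal sketch using a hypothetical 2×2 kernel matrix, not one from the post above): if we draw $$\vec{u}$$ from a standard normal and form $$\vec{\epsilon} = \bf{L}\vec{u}$$, the empirical covariance of $$\vec{\epsilon}$$ should recover $$\bf{c} = \bf{L}\bf{L}^T$$.

```python
import numpy as np

# Hypothetical 2x2 kernel (covariance) matrix, for illustration only
c = np.array([[1.0, 0.5],
              [0.5, 1.0]])
L = np.linalg.cholesky(c)          # c = L @ L.T

rng = np.random.default_rng(0)
u = rng.standard_normal((2, 100000))   # i.i.d. standard normal draws
eps = L @ u                            # eps = L u

# Empirical covariance of eps should be close to c
print(np.cov(eps))
```

With 100,000 draws the printed matrix matches `c` to roughly two decimal places.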

For the linear kernel $$k(x_i,x_j) = x_i x_j$$:

import numpy as np

# Linear kernel
def k(a, b):
    return a * b

x = np.arange(0, 1.0, 0.2)
n = x.shape[0]

# Build the covariance (kernel) matrix
c = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        c[i, j] = k(x[i], x[j])

# Decompose c two ways: SVD, and Cholesky with a small jitter
# (c is rank-deficient, so the jitter keeps the Cholesky factorization stable)
A, S, B = np.linalg.svd(c, full_matrices=False)
L = np.linalg.cholesky(c + 1e-15 * np.eye(n))
z = np.matmul(A, np.sqrt(np.diag(S)))

def Print(m):
    for row in m:
        print(' '.join('{:+.2f}'.format(e) for e in row))

Print(z)
print("########")
Print(L)

[simterm]

%%%%%%%%%%%% SVD %%%%%%%%%%%%%

+0.00 +0.00 +0.00 +0.00 +0.00

-0.20 +0.00 -0.00 -0.00 +0.00

-0.40 +0.00 +0.00 -0.00 +0.00

-0.60 -0.00 -0.00 -0.00 +0.00

-0.80 +0.00 +0.00 +0.00 +0.00

%%%%%%%%% Cholesky %%%%%%%%%%%

+0.00 +0.00 +0.00 +0.00 +0.00

+0.00 +0.20 +0.00 +0.00 +0.00

+0.00 +0.40 +0.00 +0.00 +0.00

+0.00 +0.60 +0.00 +0.00 +0.00

+0.00 +0.80 +0.00 +0.00 +0.00

[/simterm]

Both the SVD and the Cholesky decomposition lead to

$$x_0 = 0.0 u$$

$$x_1 = 0.2 u$$

$$x_2 = 0.4 u$$

$$x_3 = 0.6 u$$

$$x_4 = 0.8 u$$

since $$u_1 \sim u_2 \sim u \sim -u$$ (all standard normal, so neither the choice of component nor the sign matters). Each sample drawn this way is therefore a straight line through the origin.
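To see the straight line directly, we can draw one sample $$\vec{\epsilon} = \bf{L}\vec{u}$$ from the linear kernel and check that every point lies on a line through the origin (a sketch reusing the same construction as above):

```python
import numpy as np

def k(a, b):
    return a * b

x = np.arange(0, 1.0, 0.2)
n = x.shape[0]
c = k(x[:, None], x[None, :])          # linear-kernel covariance matrix
L = np.linalg.cholesky(c + 1e-15 * np.eye(n))

rng = np.random.default_rng(1)
sample = L @ rng.standard_normal(n)    # one draw: eps = L u

# Up to the tiny jitter, the sample is a scalar multiple of x,
# i.e. a straight line through the origin
slope = sample[1] / x[1]
print(np.allclose(sample, slope * x, atol=1e-6))   # True
```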

import numpy as np

# Brownian-motion kernel
def k(a, b):
    return min(a, b)

x = np.arange(0, 1.0, 0.1)
n = x.shape[0]

c = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        c[i, j] = k(x[i], x[j])

A, S, B = np.linalg.svd(c, full_matrices=False)
L = np.linalg.cholesky(c + 1e-15 * np.eye(n))
z = np.matmul(A, np.sqrt(np.diag(S)))

# Squared distance between consecutive rows of the decomposition matrix,
# i.e. the variance of eps_{i+1} - eps_i
def diff(m):
    for idx in range(m.shape[0] - 1):
        print(np.sum(np.square(m[idx + 1] - m[idx])))

diff(z)
print("########")
diff(L)

[simterm]

%%%%%%%%%%%% SVD %%%%%%%%%%%%%

0.1

0.1

0.1

0.1

0.1

0.1

0.1

0.1

0.1

%%%%%%%%% Cholesky %%%%%%%%%%%

0.1

0.1

0.1

0.1

0.1

0.1

0.1

0.1

0.1

[/simterm]

This result essentially says that the difference between the y values of two neighboring data points, $$\epsilon_{i+1} - \epsilon_{i}$$ (subscripts index the data points sequentially), has a constant standard deviation of $$\sqrt{0.1}$$. This gives you a random but continuous curve.
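This increment property can be checked by simulation (a sketch, again reusing the construction above): drawing many samples from the $$\min$$ kernel and measuring the empirical standard deviation of each consecutive increment should give values close to $$\sqrt{0.1} \approx 0.316$$.

```python
import numpy as np

# k(a, b) = min(a, b) is the Brownian-motion covariance, so consecutive
# increments eps_{i+1} - eps_i should have variance x_{i+1} - x_i = 0.1
x = np.arange(0, 1.0, 0.1)
n = x.shape[0]
c = np.minimum(x[:, None], x[None, :])
L = np.linalg.cholesky(c + 1e-15 * np.eye(n))

rng = np.random.default_rng(2)
samples = L @ rng.standard_normal((n, 200000))   # many draws at once
increments = samples[1:] - samples[:-1]          # eps_{i+1} - eps_i

# Empirical std of each increment: close to sqrt(0.1) ~ 0.316
print(increments.std(axis=1))
```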