## Gaussian Process Kernels

As I point out in http://www.jianping-lai.com/2017/03/10/guassian-process/, the covariance matrix built from a kernel can be factored as $$\bf{K}=\bf{L}\bf{L}^T$$, so that a sample decomposes as $$\vec{x} = \vec{m} +\vec{\epsilon}$$ with $$\vec{\epsilon}=\bf{L}\vec{u}$$.

For the linear kernel $$k(x_i,x_j) = x_i x_j$$, the two factorizations are:

[simterm]
%%%%%%%%%%%% SVD %%%%%%%%%%%%%
+0.00 +0.00 +0.00 +0.00 +0.00
-0.20 +0.00 -0.00 -0.00 +0.00
-0.40 +0.00 +0.00 -0.00 +0.00
-0.60 -0.00 -0.00 -0.00 +0.00
-0.80 +0.00 +0.00 +0.00 +0.00
%%%%%%%%% Cholesky %%%%%%%%%%%
+0.00 +0.00 +0.00 +0.00 +0.00
+0.00 +0.20 +0.00 +0.00 +0.00
+0.00 +0.40 +0.00 +0.00 +0.00
+0.00 +0.60 +0.00 +0.00 +0.00
+0.00 +0.80 +0.00 +0.00 +0.00
[/simterm]
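The factorization above can be reproduced with a short sketch. This is a minimal example, assuming the grid `x = [0, 0.2, 0.4, 0.6, 0.8]` that the printed matrices suggest; since the linear kernel gives a rank-1 (singular) covariance, only the SVD route is shown here.

```python
import numpy as np

# Grid and linear kernel k(x_i, x_j) = x_i * x_j: a rank-1 covariance.
x = np.array([0.0, 0.2, 0.4, 0.6, 0.8])
K = np.outer(x, x)

# Factor K = L L^T via SVD: with K = U S U^T, take L = U sqrt(S).
U, s, _ = np.linalg.svd(K)
L = U * np.sqrt(s)  # broadcasting scales each column of U by sqrt(s_i)
assert np.allclose(L @ L.T, K)

# Only one singular value is nonzero, so a sample eps = L u collapses to
# (a sign times) u_0 * x: every draw is a straight line through the origin.
rng = np.random.default_rng(0)
u = rng.standard_normal(5)
eps = L @ u
```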

Both the SVD and the Cholesky decomposition lead to
$$x_0 = 0.0 u$$
$$x_1 = 0.2 u$$
$$x_2 = 0.4 u$$
$$x_3 = 0.6 u$$
$$x_4 = 0.8 u$$

since $$u_1$$, $$u_2$$, $$u$$, and $$-u$$ all follow the same standard normal distribution. Every sample is therefore a straight line.

[simterm]
%%%%%%%%%%%% SVD %%%%%%%%%%%%%
0.1
0.1
0.1
0.1
0.1
0.1
0.1
0.1
0.1
%%%%%%%%% Cholesky %%%%%%%%%%%
0.1
0.1
0.1
0.1
0.1
0.1
0.1
0.1
0.1
[/simterm]
This result is essentially saying that the difference between the y-values of two neighboring data points, $$\epsilon_{i+1} - \epsilon_{i}$$ (the subscript indexes data points sequentially), has a constant standard deviation of $$\sqrt{0.1}$$. This gives a randomized but continuous data structure.
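One covariance with exactly this property is the Brownian-motion kernel. The sketch below assumes (hypothetically, since the post does not name the kernel) $$K_{ij} = 0.1\,\min(i,j)$$: its Cholesky factor is lower-triangular with every nonzero entry equal to $$\sqrt{0.1}$$, so each step adds an independent increment of standard deviation $$\sqrt{0.1}$$.

```python
import numpy as np

# Assumed Brownian-motion covariance: K_ij = 0.1 * min(i, j), i, j = 1..n.
n = 5
idx = np.arange(1, n + 1)
K = 0.1 * np.minimum.outer(idx, idx)

L = np.linalg.cholesky(K)
# Every nonzero entry of L is sqrt(0.1): eps_{i+1} - eps_i = sqrt(0.1) * u_{i+1}.
assert np.allclose(L[np.tril_indices(n)], np.sqrt(0.1))

rng = np.random.default_rng(1)
eps = L @ rng.standard_normal(n)
steps = np.diff(eps)  # increments with constant standard deviation sqrt(0.1)
```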

## Gaussian Process

For a Gaussian process, sampling via SVD should be equivalent to sampling via the Cholesky decomposition; when the covariance matrix is positive definite, both factorizations exist and produce samples with the same distribution.

A sample is the random variable

$$\vec{x} =\vec{m}+\vec{\epsilon}$$
$$\vec{\epsilon} = \bf{L}\vec{u}$$

where $$\bf{L}$$ is a factor of the covariance matrix $$\bf{K}$$, i.e. $$\bf{K}=\bf{L}\bf{L}^T$$, and $$\vec{u}\sim N(\vec{0},\bf{I})$$.
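The equivalence of the two routes can be checked numerically. A minimal sketch, assuming an RBF kernel on a small grid (any positive-definite covariance would do): the SVD and Cholesky factors differ as matrices, but both satisfy $$\bf{L}\bf{L}^T = \bf{K}$$, so $$\vec{\epsilon} = \bf{L}\vec{u}$$ has covariance $$\bf{K}$$ either way.

```python
import numpy as np

# Assumed RBF kernel on a small grid; tiny jitter keeps K safely positive definite.
x = np.linspace(0, 1, 6)
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / 0.2**2) + 1e-10 * np.eye(6)

# Cholesky factor: K = L_chol L_chol^T
L_chol = np.linalg.cholesky(K)

# SVD factor: K = U S U^T, so L_svd = U sqrt(S) also satisfies K = L_svd L_svd^T.
U, s, _ = np.linalg.svd(K)
L_svd = U * np.sqrt(s)

# Different factors, identical covariance: eps = L u ~ N(0, K) for either L.
assert np.allclose(L_chol @ L_chol.T, K)
assert np.allclose(L_svd @ L_svd.T, K)
```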

## Chapter 1. Probability and inference

The core of this chapter is the conditional probability:

$$P(\theta|y)=\frac{P(y|\theta)P(\theta)}{\sum_\theta P(y|\theta)P(\theta)}$$
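For a discrete parameter the formula is a few lines of code. A minimal sketch with a hypothetical example (not from the chapter): a coin with bias $$\theta \in \{0.2, 0.5, 0.8\}$$, a uniform prior, and an observation of 7 heads in 10 flips.

```python
import numpy as np
from math import comb

# Hypothetical discrete inference problem: coin bias theta, uniform prior.
theta = np.array([0.2, 0.5, 0.8])
prior = np.full(3, 1 / 3)

# Binomial likelihood P(y | theta) for y = 7 heads out of 10 flips.
likelihood = comb(10, 7) * theta**7 * (1 - theta) ** 3

# Bayes' rule: posterior = likelihood * prior / sum_theta(likelihood * prior).
posterior = likelihood * prior
posterior /= posterior.sum()
```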