GPflow: get the full covariance matrix and find its entropy

I would like to compute the determinant of the covariance matrix of a GP regression in GPflow. I am guessing I can get the covariance matrix with this function:



GPModel.predict_f_full_cov


This function was suggested here:



https://gpflow.readthedocs.io/en/develop/notebooks/regression.html



However, I have no idea how to use this function or what it returns. I need a function that returns the covariance matrix for my entire model, and then I need to know how to compute its determinant.



After some effort, I figured out how to give predict_f_full_cov the points I am interested in, as shown here:



c = m.predict_f_full_cov(np.array([[.2], [.4], [.6], [.8]]))


This returned two arrays. The first is the mean of the predicted function at the points I asked for along the x-axis. The second array is a bit of a mystery; I am guessing it is the covariance matrix. I pulled it out using this:



covMatrix = m.predict_f_full_cov(np.array([[.2],[.4],[.6],[.8]]))[1][0]
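
To check that guess, printing the shapes helps. In the GPflow version I am using (1.x), predict_f_full_cov appears to return the mean with shape (num_points, output_dim) and the covariance with shape (output_dim, num_points, num_points), so [1][0] picks out the N x N covariance of the single output; treat the exact shapes as my assumption and verify against your installed version:

Xnew = np.array([[.2], [.4], [.6], [.8]])
mean, cov = m.predict_f_full_cov(Xnew)   # m is the fitted GPR model from the code below

# Expected for a single-output GPR (assumption, check against your version):
#   mean.shape == (4, 1)     one predicted mean per input point
#   cov.shape  == (1, 4, 4)  one full 4x4 covariance per output dimension
print(mean.shape, cov.shape)

covMatrix = cov[0]                       # the 4x4 covariance over the requested points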


Then I looked up how to compute the determinant, like so:



x = np.linalg.det(covMatrix)


Then I computed the log of this to get an entropy for the covariance matrix:



print(-10*math.log(np.linalg.det(covMatrix)))


I ran this twice using two different sets of data. The first had high noise, the second had low noise. Strangely, the entropy went up for the lower noise data set. I am at a loss.



I found that if I just compute the covariance matrix over a small region, where the function should be nearly linear, turning the noise up and down does not do what I expect. Also, if I regress the GP to a large number of points, the determinant goes to 0.0.
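
In case it matters, the quantity I actually want is the differential entropy of an N-dimensional Gaussian, H = N/2 · (1 + ln 2π) + 1/2 · ln|Σ|; the -10*log(det) above is just my own ad-hoc stand-in. Here is a sketch of that, using np.linalg.slogdet so the log-determinant stays finite even when np.linalg.det underflows to exactly 0.0 for larger matrices:

# Differential entropy of an N-dimensional Gaussian:
#   H = N/2 * (1 + ln(2*pi)) + 1/2 * ln(det(Sigma))
# slogdet returns the log-determinant directly, so it keeps working
# even when the determinant itself underflows to 0.0.
n = covMatrix.shape[0]
sign, logdet = np.linalg.slogdet(covMatrix)
entropy = 0.5 * n * (1 + np.log(2 * np.pi)) + 0.5 * logdet
print(entropy)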



Here is the code I am using:



import gpflow
import numpy as np
import math

N = 300
noiseSize = 0.01
X = np.random.rand(N, 1)
Y = np.sin(12*X) + 0.66*np.cos(25*X) + np.random.randn(N, 1)*noiseSize + 3

k = gpflow.kernels.Matern52(1, lengthscales=0.3)
m = gpflow.models.GPR(X, Y, kern=k)
m.likelihood.variance = 0.01

# 200 prediction points in [0.1, 0.9], as a column vector
aRange = np.linspace(0.1, 0.9, 200)
newRange = aRange.reshape(-1, 1)

covMatrix = m.predict_f_full_cov(newRange)[1][0]
print("Determinant: " + str(np.linalg.det(covMatrix)))
print(-10*math.log(np.linalg.det(covMatrix)))

Tags: python, tensorflow, statistics, gaussian, gpflow






1 Answer

So, first things first: the entropy of a multivariate normal (and of a GP, given a fixed set of points on which it is evaluated) depends only on its covariance matrix.

Answers to your questions:

1. Yes. When you make the set $X$ denser and denser, you make the covariance matrix larger and larger, and for many simple covariance kernels this makes the determinant smaller and smaller. My guess is that this is because determinants of large matrices involve a lot of product terms (see the Leibniz formula), and products of terms less than one tend to zero much faster than their sums grow. You can verify this easily:

[Figure: log determinant of an RBF covariance matrix vs. its dimension]

Code for this:



import numpy as np
import matplotlib.pyplot as plt
import sklearn.gaussian_process.kernels as k

plt.style.use("ggplot"); plt.ion()

n = np.linspace(2, 25, 23, dtype=int)
d = np.zeros(len(n))

for i in range(len(n)):
    X = np.linspace(-1, 1, n[i]).reshape(-1, 1)
    S = k.RBF()(X)                    # RBF covariance matrix over n[i] points
    d[i] = np.log(np.linalg.det(S))

plt.scatter(n, d)
plt.ylabel("Log Determinant of Covariance Matrix")
plt.xlabel("Dimension of Covariance Matrix")
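
Side note, beyond the plot above: for a few hundred points np.linalg.det itself underflows to exactly 0.0, which is presumably what you saw in GPflow, whereas np.linalg.slogdet returns the log-determinant directly and stays finite. A minimal sketch along the same lines, with a small jitter term of my own choosing to keep the matrix numerically positive definite:

X = np.linspace(-1, 1, 300).reshape(-1, 1)
S = k.RBF()(X) + 1e-6 * np.eye(len(X))   # jitter keeps S positive definite

print(np.linalg.det(S))                  # underflows: prints 0.0
sign, logdet = np.linalg.slogdet(S)
print(sign, logdet)                      # sign == 1.0, logdet is a large (finite) negative number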


Before moving on to the next point, note that the entropy of a multivariate normal also has a contribution from the size of the matrix: for an N-dimensional normal, H = N/2 · (1 + ln 2π) + 1/2 · ln|Σ|. So even though the determinant shoots off to zero, there is still a small contribution from the dimension.




2. With decreasing noise, as one would expect, the entropy and determinant do decrease, but they do not tend to zero exactly; they decrease towards the determinant contributed by the other kernel(s) in the covariance. For the demonstration below, the dimension of the covariance is kept constant (10×10) and the noise level is swept from 10 down to 1e-10:

[Figure: log determinant of the covariance matrix vs. log noise level]

Code:



e = np.logspace(1, -10, 30)               # noise variances from 10 down to 1e-10
d = np.zeros(len(e))
X = np.linspace(-1, 1, 10).reshape(-1, 1)

for i in range(len(e)):
    S = (k.RBF() + k.WhiteKernel(e[i]))(X)
    d[i] = np.log(np.linalg.det(S))

e = np.log(e)

plt.scatter(e, d)
plt.ylabel("Log Determinant")
plt.xlabel("Log Error")




