GPflow: get the full covariance matrix and find its entropy

I would like to compute the determinant of the covariance matrix of a GP regression in GPflow. I am guessing I can get the covariance matrix with this function:



GPModel.predict_f_full_cov


This function was suggested here:



https://gpflow.readthedocs.io/en/develop/notebooks/regression.html



However, I have no idea how to use this function or what it returns. I need a function that returns the covariance matrix for my entire model, and then I need to know how to compute its determinant.



After some effort, I figured out how to give predict_f_full_cov the points I am interested in, as shown here:



c = m.predict_f_full_cov(np.array([[.2], [.4], [.6], [.8]]))


This returned two arrays. The first is the mean of the predicted function at the points I asked for along the x-axis. The second array is a bit of a mystery; I am guessing it is the covariance matrix. I pulled it out using this:



covMatrix = m.predict_f_full_cov(np.array([[.2],[.4],[.6],[.8]]))[1][0]
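
To check that guess, printing the shapes helps. In the GPflow version I am using (1.x), predict_f_full_cov appears to return the mean with shape (num_points, output_dim) and the covariance with shape (output_dim, num_points, num_points), so [1][0] picks out the N x N covariance of the single output; treat the exact shapes as my assumption and verify against your installed version:

Xnew = np.array([[.2], [.4], [.6], [.8]])
mean, cov = m.predict_f_full_cov(Xnew)   # m is the fitted GPR model from the code below

# Expected for a single-output GPR (assumption, check against your version):
#   mean.shape == (4, 1)     one predicted mean per input point
#   cov.shape  == (1, 4, 4)  one full 4x4 covariance per output dimension
print(mean.shape, cov.shape)

covMatrix = cov[0]                       # the 4x4 covariance over the requested points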


Then I looked up how to compute the determinant, like so:



x = np.linalg.det(covMatrix)


Then I computed the log of this to get an entropy for the covariance matrix:



print(-10*math.log(np.linalg.det(covMatrix)))


I ran this twice using two different sets of data. The first had high noise, the second had low noise. Strangely, the entropy went up for the lower noise data set. I am at a loss.



I found that if I just compute the covariance matrix over a small region, where the function should be nearly linear, turning the noise up and down does not do what I expect. Also, if I regress the GP to a large number of points, the determinant goes to 0.0.
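
In case it matters, the quantity I actually want is the differential entropy of an N-dimensional Gaussian, H = N/2 · (1 + ln 2π) + 1/2 · ln|Σ|; the -10*log(det) above is just my own ad-hoc stand-in. Here is a sketch of that, using np.linalg.slogdet so the log-determinant stays finite even when np.linalg.det underflows to exactly 0.0 for larger matrices:

# Differential entropy of an N-dimensional Gaussian:
#   H = N/2 * (1 + ln(2*pi)) + 1/2 * ln(det(Sigma))
# slogdet returns the log-determinant directly, so it keeps working
# even when the determinant itself underflows to 0.0.
n = covMatrix.shape[0]
sign, logdet = np.linalg.slogdet(covMatrix)
entropy = 0.5 * n * (1 + np.log(2 * np.pi)) + 0.5 * logdet
print(entropy)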



Here is the code I am using:



import gpflow
import numpy as np
import math

N = 300
noiseSize = 0.01
X = np.random.rand(N, 1)
Y = np.sin(12*X) + 0.66*np.cos(25*X) + np.random.randn(N, 1)*noiseSize + 3

k = gpflow.kernels.Matern52(1, lengthscales=0.3)
m = gpflow.models.GPR(X, Y, kern=k)
m.likelihood.variance = 0.01

# 200 prediction points in [0.1, 0.9], as a column vector
aRange = np.linspace(0.1, 0.9, 200)
newRange = aRange.reshape(-1, 1)

covMatrix = m.predict_f_full_cov(newRange)[1][0]
print("Determinant: " + str(np.linalg.det(covMatrix)))
print(-10*math.log(np.linalg.det(covMatrix)))

Tags: python, tensorflow, statistics, gaussian, gpflow






1 Answer

So, first things first: the entropy of a multivariate normal (and of a GP, given a fixed set of points on which it is evaluated) depends only on its covariance matrix.

Answers to your questions:

1. Yes. When you make the set $X$ denser and denser, you make the covariance matrix larger and larger, and for many simple covariance kernels this makes the determinant smaller and smaller. My guess is that this is because determinants of large matrices involve a lot of product terms (see the Leibniz formula), and products of terms less than one tend to zero much faster than their sums grow. You can verify this easily:

[Figure: log determinant of an RBF covariance matrix vs. its dimension]

Code for this:



import numpy as np
import matplotlib.pyplot as plt
import sklearn.gaussian_process.kernels as k

plt.style.use("ggplot"); plt.ion()

n = np.linspace(2, 25, 23, dtype=int)
d = np.zeros(len(n))

for i in range(len(n)):
    X = np.linspace(-1, 1, n[i]).reshape(-1, 1)
    S = k.RBF()(X)                    # RBF covariance matrix over n[i] points
    d[i] = np.log(np.linalg.det(S))

plt.scatter(n, d)
plt.ylabel("Log Determinant of Covariance Matrix")
plt.xlabel("Dimension of Covariance Matrix")
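
Side note, beyond the plot above: for a few hundred points np.linalg.det itself underflows to exactly 0.0, which is presumably what you saw in GPflow, whereas np.linalg.slogdet returns the log-determinant directly and stays finite. A minimal sketch along the same lines, with a small jitter term of my own choosing to keep the matrix numerically positive definite:

X = np.linspace(-1, 1, 300).reshape(-1, 1)
S = k.RBF()(X) + 1e-6 * np.eye(len(X))   # jitter keeps S positive definite

print(np.linalg.det(S))                  # underflows: prints 0.0
sign, logdet = np.linalg.slogdet(S)
print(sign, logdet)                      # sign == 1.0, logdet is a large (finite) negative number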


Before moving on to the next point, note that the entropy of a multivariate normal also has a contribution from the size of the matrix: for an N-dimensional normal, H = N/2 · (1 + ln 2π) + 1/2 · ln|Σ|. So even though the determinant shoots off to zero, there is still a small contribution from the dimension.




2. With decreasing noise, as one would expect, the entropy and determinant do decrease, but they do not tend to zero exactly; they decrease towards the determinant contributed by the other kernel(s) in the covariance. For the demonstration below, the dimension of the covariance is kept constant (10×10) and the noise level is swept from 10 down to 1e-10:

[Figure: log determinant of the covariance matrix vs. log noise level]

Code:



e = np.logspace(1, -10, 30)               # noise variances from 10 down to 1e-10
d = np.zeros(len(e))
X = np.linspace(-1, 1, 10).reshape(-1, 1)

for i in range(len(e)):
    S = (k.RBF() + k.WhiteKernel(e[i]))(X)
    d[i] = np.log(np.linalg.det(S))

e = np.log(e)

plt.scatter(e, d)
plt.ylabel("Log Determinant")
plt.xlabel("Log Error")




