Multidimensional gradient descent in Tensorflow

What does Tensorflow really do when the gradient descent optimizer is applied to a "loss" tensor that is not a scalar (a rank-0 tensor) but rather a vector (a 1-dimensional tensor of size 2, 3, 4, or more)?

Is it like doing the descent on the sum of the components?
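For concreteness, a minimal sketch of the kind of setup being asked about (TF 1.x API; the variable and the squared components below are arbitrary illustrations, not from the original post):

    import tensorflow as tf  # TF 1.x API

    w = tf.Variable([1.0, 2.0])
    # A "loss" that is not a scalar: a 1-D tensor of size 2.
    vector_loss = tf.stack([w[0] ** 2, 3.0 * w[1] ** 2])

    # minimize() accepts this without complaint -- the question is what it
    # actually optimizes.
    train_op = tf.train.GradientDescentOptimizer(0.1).minimize(vector_loss)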
python tensorflow gradient-descent

asked Nov 16 '18 at 8:10 by Aristodog

2 Answers

The answer to your second question is "no".

As for the first: just as in the one-dimensional case (e.g. y = f(x), x in R), where the direction the algorithm takes is determined by the derivative of the function with respect to its single variable, in the multidimensional case the overall direction is determined by the gradient: the vector of partial derivatives of the function with respect to each variable.

This means the size of the step taken in each direction is determined by the value of the partial derivative with respect to the variable corresponding to that direction.

Since there's no way to properly typeset math on Stack Overflow, I'll suggest you take a look at this article.
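To make this concrete, here is a minimal numeric sketch (plain Python/NumPy; f(x, y) = x**2 + 3*y**2 is an arbitrary example, not something from the question):

    import numpy as np

    def grad_f(v):
        """Gradient of f(x, y) = x**2 + 3*y**2: one partial derivative per direction."""
        x, y = v
        return np.array([2.0 * x, 6.0 * y])

    v = np.array([1.0, 2.0])
    lr = 0.1
    for _ in range(100):
        v = v - lr * grad_f(v)  # each component steps by its own partial derivative

    print(v)  # approaches [0, 0], the minimizer

Note that the two components shrink at different rates (x by a factor of 0.8 per step, y by 0.4), which is exactly the "step size per direction" behaviour described above.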
answered Nov 16 '18 at 10:31 by Lucas Farias

• Maybe my question was not clear. In y = f(x), I'm talking about the case where y is multidimensional.
  – Aristodog, Nov 17 '18 at 11:25

Tensorflow first reduces your loss to a scalar and then optimizes that.
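A quick check of this behaviour (a sketch using the TF 1.x API with an arbitrary toy loss, not code from the answer itself): minimizing a vector loss produces exactly the update you would get from minimizing its sum.

    import tensorflow as tf  # TF 1.x API

    w = tf.Variable([1.0, 2.0])
    vector_loss = tf.stack([w[0] ** 2, 3.0 * w[1] ** 2])  # non-scalar loss

    opt = tf.train.GradientDescentOptimizer(learning_rate=0.1)
    step = opt.minimize(vector_loss)  # no error despite the vector loss

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        sess.run(step)
        # Gradient of the sum is [2*w0, 6*w1] = [2, 12], so the update is
        # [1 - 0.1*2, 2 - 0.1*12] = [0.8, 0.8] -- the same as minimizing
        # tf.reduce_sum(vector_loss) directly.
        print(sess.run(w))  # [0.8 0.8]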
answered Nov 21 '18 at 16:23 by Alexandre Passos

• What does "reduce" a vector to a scalar mean?
  – Aristodog, Nov 25 '18 at 14:16

• Adding all its entries, as in tf.reduce_sum.
  – Alexandre Passos, Nov 26 '18 at 16:13
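The summing happens inside tf.gradients itself, which backpropagates a vector of ones for a non-scalar ys; a quick check (a sketch, same toy tensors as above):

    import tensorflow as tf  # TF 1.x API

    w = tf.Variable([1.0, 2.0])
    vector_loss = tf.stack([w[0] ** 2, 3.0 * w[1] ** 2])

    # Differentiating the vector directly and differentiating its sum give
    # the same gradient.
    g_vector = tf.gradients(vector_loss, w)[0]
    g_sum = tf.gradients(tf.reduce_sum(vector_loss), w)[0]

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        print(sess.run(g_vector))  # [ 2. 12.]
        print(sess.run(g_sum))     # [ 2. 12.]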