How and when does Python determine the data type of a variable?




















I was trying to figure out exactly how Python 3 (using CPython as an interpreter) executes its program. I found out that the steps are:




  1. The CPython compiler compiles the Python source code (.py file) to Python bytecode (.pyc file). When a module is imported, its .pyc file is saved; when a single script such as main.py is run directly, it is not.


  2. The Python Virtual Machine interprets the bytecode into hardware-specific machine code.



A great answer found here https://stackoverflow.com/a/1732383/8640077 says that the Python Virtual Machine takes longer to run its bytecode than the JVM because Java bytecode contains information about data types, while the Python Virtual Machine interprets lines one by one and has to determine the data types.



My question is: how does the Python Virtual Machine determine the data type, and does this happen during the interpretation to machine code or during a separate process (which would, e.g., produce another intermediate code)?






























  • 6





    Why do you think Python ever needs to "determine the data type"? Python is a dynamically typed language; a type will only be checked when you ask specifically, and can vary continually over the life of a variable. And I highly doubt the difference in execution time between Python and Java is due to runtime type checking.

    – Daniel Roseman
    Nov 23 '18 at 15:15













  • So even during the translation from bytecode to machine code, Python does not know the type of a variable? As for your second point: what then makes the biggest difference in execution time between Python and Java?

    – PyFox
    Nov 23 '18 at 15:26













  • No, how can it? The program itself can change the type of a variable at any time; that is what dynamic typing means. This code is perfectly legal: a = 'mystring'; a = MyClassThatIsNotAString(). What is the type of a?

    – Daniel Roseman
    Nov 23 '18 at 15:28








  • 1





    I found this resource while searching for an answer to your question. Maybe you will find it useful.

    – JETM
    Nov 23 '18 at 15:29






  • 1





    I have yet to thoroughly understand this article, but it is much shorter, and I think it answers your question.

    – JETM
    Nov 23 '18 at 15:46


















python python-3.x virtual-machine cpython














edited Nov 24 '18 at 17:12







PyFox

















asked Nov 23 '18 at 15:13









PyFox








3 Answers


















3














The dynamic, run-time dispatch of CPython (compared to the static, compile-time dispatch of Java) is only one of the reasons why Java is faster than pure CPython: Java also has JIT compilation, a different garbage-collection strategy, and native types such as int and double, whereas CPython works with immutable objects, and so on.



My earlier, superficial experiments have shown that dynamic dispatch is responsible for only about 30% of the running time - you cannot explain speed differences of several orders of magnitude with that.



To make this answer less abstract, let's take a look at an example:



def add(x, y):
    return x + y


Looking at the bytecode:



import dis
dis.dis(add)


which gives:



2         0 LOAD_FAST                0 (x)
          2 LOAD_FAST                1 (y)
          4 BINARY_ADD
          6 RETURN_VALUE


We can see that at the bytecode level it makes no difference whether x and y are integers, floats or something else - the interpreter doesn't care.
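
As a quick check of this point, the sketch below calls the same function with three different argument types and lists the opcode names the compiler produced. One assumption about newer interpreters: CPython 3.11 and later name the addition opcode BINARY_OP rather than BINARY_ADD, so the check accepts either name.

```python
import dis

def add(x, y):
    return x + y

# One compiled function, three unrelated argument types.
results = [add(1, 2), add(1.5, 2.0), add("a", "b")]
print(results)

# The opcode names are fixed at compile time and mention no types.
opnames = [ins.opname for ins in dis.get_instructions(add)]
print(opnames)

# Assumption: BINARY_ADD on older CPython, BINARY_OP on 3.11+.
has_generic_add = any(name in ("BINARY_ADD", "BINARY_OP") for name in opnames)
print(has_generic_add)
```

Whatever the arguments are, the listed opcodes stay the same; the choice of the actual addition routine is left to run time.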



The situation is completely different in Java:



int add(int x, int y) {return x+y;}


and



float add(float x, float y) {return x+y;}


would result in completely different opcodes, and the call dispatch would happen at compile time: the right version is picked based on the static types, which are known at compile time.



Quite often the CPython interpreter doesn't have to know the exact type of its arguments. Internally there is a base "class/interface" from which all other "classes" are derived (there are no classes in C, so it is called a "protocol", but for somebody who knows C++/Java, "interface" is probably the right mental model). This base "class" is called PyObject, and its protocol is described in the CPython C-API documentation. So as long as a function is part of this protocol/interface, the CPython interpreter can call it without knowing the exact type, and the call will be dispatched to the right implementation (a lot like "virtual" functions in C++).
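
The same idea is visible from pure Python: every object carries a reference to its type, and an operation is looked up on that type at run time. The sketch below is only an analogy for the C-level dispatch; the helper `dispatch_add` is hypothetical and ignores the reflected-method fallback that the real `+` operator performs.

```python
def dispatch_add(left, right):
    """Look up __add__ on the left operand's type and call it."""
    implementation = type(left).__add__   # resolved at run time, per object
    return implementation(left, right)

print(dispatch_add(1, 2))        # goes through int.__add__
print(dispatch_add(1.5, 2.0))    # goes through float.__add__
print(dispatch_add([1], [2]))    # goes through list.__add__
```

The caller never names int, float or list; the object itself leads to the right implementation.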



On the pure Python side, it seems as if variables don't have types:



a=1
a="1"


however, internally a does have a type: it is PyObject*, and this reference can be bound to an integer (1) or to a Unicode string ("1"), because both "inherit" from PyObject.



From time to time the CPython interpreter does try to find out the actual type behind the reference, including in the example above: when it sees the BINARY_ADD opcode, the following C code is executed:



    case TARGET(BINARY_ADD): {
        PyObject *right = POP();
        PyObject *left = TOP();
        PyObject *sum;
        ...
        if (PyUnicode_CheckExact(left) &&
                 PyUnicode_CheckExact(right)) {
            sum = unicode_concatenate(left, right, f, next_instr);
            /* unicode_concatenate consumed the ref to left */
        }
        else {
            sum = PyNumber_Add(left, right);
            Py_DECREF(left);
        }
        Py_DECREF(right);
        SET_TOP(sum);
        if (sum == NULL)
            goto error;
        DISPATCH();
    }


Here the interpreter checks whether both objects are Unicode strings. If so, a special function is used (possibly more efficient: as a matter of fact, it tries to change the immutable Unicode object in place, see this SO-answer); otherwise the work is dispatched to the PyNumber protocol.
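
The branching in the C snippet can be mirrored in Python as a rough sketch. The function name `binary_add_sketch` is hypothetical; `type(obj) is str` stands in for PyUnicode_CheckExact, plain string joining for unicode_concatenate (without its in-place optimization or reference counting), and `operator.add` for PyNumber_Add.

```python
import operator

def binary_add_sketch(left, right):
    # Exact-type check: subclasses of str do not take the fast path.
    if type(left) is str and type(right) is str:
        return "".join((left, right))   # stand-in for unicode_concatenate
    # Everything else goes through the generic protocol.
    return operator.add(left, right)    # stand-in for PyNumber_Add

print(binary_add_sketch("a", "b"))
print(binary_add_sketch(1, 2))
print(binary_add_sketch([1], [2]))
```

Only the first branch inspects a concrete type; the second one works for any pair of objects that support addition.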



Obviously, the interpreter also has to know the exact type when an object is created: for example, a="1" and a=1 use different "classes". But as we have seen, that is not the only place.



So the interpreter infers the types at run time, but most of the time it doesn't have to - the goal can be reached via dynamic dispatch.







































    0














    Python is built around the philosophy of duck typing. No explicit type checking is required of the program, not even at run time. For example,



    >>> x = 5
    >>> y = "5"
    >>> '__mul__' in dir(x)
    True
    >>> '__mul__' in dir(y)
    True
    >>> type(x)
    <class 'int'>
    >>> type(y)
    <class 'str'>
    >>> type(x*y)
    <class 'str'>


    The CPython interpreter checks whether x and y define the __mul__ method, tries to "make it work", and returns a result. Also, Python bytecode never gets translated to machine code; it is executed inside the CPython interpreter. One major difference between the JVM and the CPython virtual machine is that the JVM can compile Java bytecode to machine code for performance gains whenever it wants to (JIT compilation), whereas the CPython VM only runs the bytecode as it is.
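
To see what "make it work" can mean for the session above, the sketch below calls the special methods by hand. It assumes the usual binary-operator fallback described in the Python data model (try the left operand's method, then the right operand's reflected method); the variable names are only illustrative.

```python
x, y = 5, "5"

# int does not know how to multiply by a str, so it declines.
left_attempt = int.__mul__(x, y)
print(left_attempt)

# The reflected method on the right operand can handle the pair.
right_attempt = str.__rmul__(y, x)
print(right_attempt)

# The operator gives the same result as the successful fallback.
print(x * y == right_attempt)
```

The first call returns the NotImplemented singleton instead of raising, which is what allows the second operand to be tried.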






























    • What do you mean by "Python bytecode never gets translated to machine code. It gets executed inside the CPython interpreter."? Could you please elaborate on that?

      – PyFox
      Nov 23 '18 at 17:22











    • Machine code usually refers to code that can be executed directly by your computer. For instance, when you compile a C++ program on your computer, it gets compiled to machine code specific to YOUR computer's CPU architecture. Your CPU understands these instructions and can run them. So in a way, your CPU is the interpreter here. Just think of Python bytecode as machine code for the CPython virtual machine. They are just instructions for the CPython virtual machine, which can run them without having to translate them to something else.

      – prithajnath
      Nov 23 '18 at 18:18



















    0














    It could be useful for your understanding to avoid thinking of "variables" in Python. Compared to statically typed languages that have to associate a type with a variable, a class member, or a function argument, Python only deals with "labels" or names for objects.



    So in the snippet,



    a = "a string"
    a = 5 # a number
    a = MyClass() # an object of type MyClass


    the label a never has a type. It is just a name that points to different objects at different times (very similar, in fact, to "pointers" in other languages). The objects, on the other hand (the string, the number), always have a type. The nature of this type could change, as you can dynamically change the definition of a class, but it will always be determined, i.e. known to the language interpreter.



    So to answer the question: Python never determines the type of a variable (label/name), it only uses it to refer to an object and that object has a type.
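
A minimal sketch of this distinction, reusing the snippet above; `MyClass` is defined here as an empty placeholder class and the `*_type` names are only illustrative.

```python
class MyClass:
    pass

a = "a string"
first_type = type(a)      # the str object has a type; the name a does not

a = 5
second_type = type(a)     # same name, now bound to an int object

a = MyClass()
third_type = type(a)      # and now to a MyClass instance

b = a                     # a second label for the same object
print(first_type, second_type, third_type)
print(b is a)
```

Asking for type(a) always inspects the object currently bound to the name, never the name itself.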






























      Your Answer






      StackExchange.ifUsing("editor", function () {
      StackExchange.using("externalEditor", function () {
      StackExchange.using("snippets", function () {
      StackExchange.snippets.init();
      });
      });
      }, "code-snippets");

      StackExchange.ready(function() {
      var channelOptions = {
      tags: "".split(" "),
      id: "1"
      };
      initTagRenderer("".split(" "), "".split(" "), channelOptions);

      StackExchange.using("externalEditor", function() {
      // Have to fire editor after snippets, if snippets enabled
      if (StackExchange.settings.snippets.snippetsEnabled) {
      StackExchange.using("snippets", function() {
      createEditor();
      });
      }
      else {
      createEditor();
      }
      });

      function createEditor() {
      StackExchange.prepareEditor({
      heartbeatType: 'answer',
      autoActivateHeartbeat: false,
      convertImagesToLinks: true,
      noModals: true,
      showLowRepImageUploadWarning: true,
      reputationToPostImages: 10,
      bindNavPrevention: true,
      postfix: "",
      imageUploader: {
      brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
      contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
      allowUrls: true
      },
      onDemand: true,
      discardSelector: ".discard-answer"
      ,immediatelyShowMarkdownHelp:true
      });


      }
      });














      draft saved

      draft discarded


















      StackExchange.ready(
      function () {
      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53449112%2fhow-and-when-does-python-determine-the-data-type-of-a-variable%23new-answer', 'question_page');
      }
      );

      Post as a guest















      Required, but never shown

























      3 Answers
      3






      active

      oldest

      votes








      3 Answers
      3






      active

      oldest

      votes









      active

      oldest

      votes






      active

      oldest

      votes









      3














      The dynamic, run-time dispatch of CPython (compared to static, compile-time dispatch of Java) is only one of the reasons, why Java is faster than pure CPython: there are jit-compilation in Java, different garbage collection strategies, presence of native types like int, double vs. immutable data structures in CPython and so on.



      My earlier superficial experiments have shown, that the dynamical dispatch is only responsible for about 30% of running - you cannot explain speed differences of some factors of magnitude with that.



      To make this answer less abstract, let's take a look at an example:



      def add(x,y):
      return x+y


      Looking at the bytecode:



      import dis
      dis.dis(add)


      which gives:



      2         0 LOAD_FAST                0 (x)
      2 LOAD_FAST 1 (y)
      4 BINARY_ADD
      6 RETURN_VALUE


      We can see on the level of bytecode there is no difference whether x and y are integers or floats or something else - the interpreter doesn't care.



      The situation is completely different in Java:



      int add(int x, int y) {return x+y;}


      and



      float add(float x, float y) {return x+y;}


      would result in completely different opcodes and the call-dispatch would happen at compile time - the right version is picked depending on the static types which are known at the compile time.



      Pretty often CPython-interpreter doesn't have to know the exact type of arguments: Internally there is a base "class/interface" (obviously there are no classes in C, so it is called "protocol", but for somebody who knows C++/Java "interface" is probably the right mental model), from which all other "classes" are derived. This base "class" is called PyObject and here is the description of its protocol.. So as long as the function is a part of this protocol/interface CPython interpreter can call it, without knowing the exact type and the call will be dispatched to the right implementation (a lot like "virtual" functions in C++).



      On the pure Python side, it seems as if variables don't have types:



      a=1
      a="1"


      however, internally a has a type - it is PyObject* and this reference can be bound to an integer (1) and to an unicode-string ("1") - because they both "inherit" from PyObject.



      From time to time the CPython interpreter tries to find out the right type of the reference, also for the above example - when it sees BINARY_ADD-opcode, the following C-code is executed:



          case TARGET(BINARY_ADD): {
      PyObject *right = POP();
      PyObject *left = TOP();
      PyObject *sum;
      ...
      if (PyUnicode_CheckExact(left) &&
      PyUnicode_CheckExact(right)) {
      sum = unicode_concatenate(left, right, f, next_instr);
      /* unicode_concatenate consumed the ref to left */
      }
      else {
      sum = PyNumber_Add(left, right);
      Py_DECREF(left);
      }
      Py_DECREF(right);
      SET_TOP(sum);
      if (sum == NULL)
      goto error;
      DISPATCH();
      }


      Here the interpreter queries, whether both objects are unicode strings and if this is the case a special method (maybe more efficient, as matter of fact it tries to change the immutable unicode-object in-place, see this SO-answer) is used, otherwise the work is dispatched to PyNumber-protocol.



      Obviously, the interpreter also has to know the exact type when an object is created, for example for a="1" or a=1 different "classes" are used - but as we have seen it is not the only one place.



      So the interpreter interfers the types during the run-time, but most of the time it doesn't have to do it - the goal can be reached via dynamic dispatch.






      share|improve this answer






























        3














        The dynamic, run-time dispatch of CPython (compared to static, compile-time dispatch of Java) is only one of the reasons, why Java is faster than pure CPython: there are jit-compilation in Java, different garbage collection strategies, presence of native types like int, double vs. immutable data structures in CPython and so on.



        My earlier superficial experiments have shown, that the dynamical dispatch is only responsible for about 30% of running - you cannot explain speed differences of some factors of magnitude with that.



        To make this answer less abstract, let's take a look at an example:



        def add(x,y):
        return x+y


        Looking at the bytecode:



        import dis
        dis.dis(add)


        which gives:



        2         0 LOAD_FAST                0 (x)
        2 LOAD_FAST 1 (y)
        4 BINARY_ADD
        6 RETURN_VALUE


        We can see on the level of bytecode there is no difference whether x and y are integers or floats or something else - the interpreter doesn't care.



        The situation is completely different in Java:



        int add(int x, int y) {return x+y;}


        and



        float add(float x, float y) {return x+y;}


        would result in completely different opcodes and the call-dispatch would happen at compile time - the right version is picked depending on the static types which are known at the compile time.



        Pretty often CPython-interpreter doesn't have to know the exact type of arguments: Internally there is a base "class/interface" (obviously there are no classes in C, so it is called "protocol", but for somebody who knows C++/Java "interface" is probably the right mental model), from which all other "classes" are derived. This base "class" is called PyObject and here is the description of its protocol.. So as long as the function is a part of this protocol/interface CPython interpreter can call it, without knowing the exact type and the call will be dispatched to the right implementation (a lot like "virtual" functions in C++).



        On the pure Python side, it seems as if variables don't have types:



        a=1
        a="1"


        however, internally a has a type - it is PyObject* and this reference can be bound to an integer (1) and to an unicode-string ("1") - because they both "inherit" from PyObject.



        From time to time the CPython interpreter tries to find out the right type of the reference, also for the above example - when it sees BINARY_ADD-opcode, the following C-code is executed:



            case TARGET(BINARY_ADD): {
        PyObject *right = POP();
        PyObject *left = TOP();
        PyObject *sum;
        ...
        if (PyUnicode_CheckExact(left) &&
        PyUnicode_CheckExact(right)) {
        sum = unicode_concatenate(left, right, f, next_instr);
        /* unicode_concatenate consumed the ref to left */
        }
        else {
        sum = PyNumber_Add(left, right);
        Py_DECREF(left);
        }
        Py_DECREF(right);
        SET_TOP(sum);
        if (sum == NULL)
        goto error;
        DISPATCH();
        }


        Here the interpreter queries, whether both objects are unicode strings and if this is the case a special method (maybe more efficient, as matter of fact it tries to change the immutable unicode-object in-place, see this SO-answer) is used, otherwise the work is dispatched to PyNumber-protocol.



        Obviously, the interpreter also has to know the exact type when an object is created, for example for a="1" or a=1 different "classes" are used - but as we have seen it is not the only one place.



        So the interpreter interfers the types during the run-time, but most of the time it doesn't have to do it - the goal can be reached via dynamic dispatch.






        share|improve this answer




























          3












          3








          3







          The dynamic, run-time dispatch of CPython (compared to static, compile-time dispatch of Java) is only one of the reasons, why Java is faster than pure CPython: there are jit-compilation in Java, different garbage collection strategies, presence of native types like int, double vs. immutable data structures in CPython and so on.



          My earlier superficial experiments have shown, that the dynamical dispatch is only responsible for about 30% of running - you cannot explain speed differences of some factors of magnitude with that.



          To make this answer less abstract, let's take a look at an example:



          def add(x,y):
          return x+y


          Looking at the bytecode:



          import dis
          dis.dis(add)


          which gives:



          2         0 LOAD_FAST                0 (x)
          2 LOAD_FAST 1 (y)
          4 BINARY_ADD
          6 RETURN_VALUE


          We can see on the level of bytecode there is no difference whether x and y are integers or floats or something else - the interpreter doesn't care.



          The situation is completely different in Java:



          int add(int x, int y) {return x+y;}


          and


















          The dynamic, run-time dispatch of CPython (as opposed to the static, compile-time dispatch of Java) is only one of the reasons why Java is faster than pure CPython: Java also has JIT compilation, different garbage-collection strategies, and native types such as int and double where CPython uses immutable objects, and so on.



          My earlier, superficial experiments have shown that dynamic dispatch is responsible for only about 30% of the running time; that cannot explain speed differences of several orders of magnitude.



          To make this answer less abstract, let's take a look at an example:



          def add(x, y):
              return x + y


          Looking at the bytecode:



          import dis
          dis.dis(add)


          which gives:



            2           0 LOAD_FAST                0 (x)
                        2 LOAD_FAST                1 (y)
                        4 BINARY_ADD
                        6 RETURN_VALUE


          We can see that at the bytecode level it makes no difference whether x and y are integers, floats, or something else; the interpreter doesn't care.
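As a quick check of this, here is a minimal sketch (the variable names are mine, chosen only for illustration) that calls the same function object with arguments of different types and compares the compiled bytecode before and after the calls:

```python
def add(x, y):
    return x + y

# Snapshot of the compiled bytecode before any call is made.
bytecode_before = add.__code__.co_code

# The same function object handles ints, floats and strings.
int_result = add(1, 2)
float_result = add(1.5, 2.25)
str_result = add("1", "2")

# Snapshot after the calls: no per-type version of the bytecode was generated.
bytecode_after = add.__code__.co_code

print(int_result, float_result, str_result)
print(bytecode_before == bytecode_after)
```

The type-specific work happens only when the addition opcode is executed, not when the function is compiled.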



          The situation is completely different in Java:



          int add(int x, int y) {return x+y;}


          and



          float add(float x, float y) {return x+y;}


          would result in completely different opcodes, and the call dispatch would happen at compile time: the right version is picked based on the static types, which are known at compile time.



          Pretty often the CPython interpreter doesn't have to know the exact type of the arguments. Internally there is a base "class/interface" (there are no classes in C, so it is called a "protocol", but for somebody who knows C++/Java, "interface" is probably the right mental model) from which all other "classes" are derived. This base "class" is called PyObject, and the C API documentation describes its protocol. As long as a function is part of this protocol/interface, the CPython interpreter can call it without knowing the exact type, and the call will be dispatched to the right implementation (much like "virtual" functions in C++).
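The same idea can be observed from pure Python. The sketch below uses a hypothetical class Money (my own example, not part of CPython) to show that the + operator is dispatched through the type of the object, roughly as a virtual call would be, and not through whatever happens to be stored on the instance:

```python
class Money:
    """Hypothetical value type used only to illustrate dispatch through the type."""

    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):
        return Money(self.cents + other.cents)


a = Money(150)
b = Money(250)

# An attribute set on the instance does not affect the + operator:
# the interpreter looks the operation up on type(a), not on a itself.
a.__add__ = lambda other: "ignored"

total = a + b
print(type(total).__name__, total.cents)
```

The interpreter never needs to know that it is dealing with Money; it only needs the type to provide the addition operation.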



          On the pure Python side, it seems as if variables don't have types:



          a=1
          a="1"


          however, internally a has a type: it is PyObject*, and this reference can be bound to an integer (1) or to a unicode string ("1"), because both "inherit" from PyObject.
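Seen from Python, this looks as follows. It is only a small sketch; first_type and second_type are names I made up for the illustration:

```python
a = 1
first_type = type(a)     # the type belongs to the object the name refers to

a = "1"                  # the same name is rebound to a different object
second_type = type(a)

print(first_type.__name__, second_type.__name__)

# Both objects are instances of the common base type.
print(isinstance(1, object), isinstance("1", object))
```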



          From time to time the CPython interpreter does try to find out the exact type behind a reference, also in the example above. When it sees the BINARY_ADD opcode, the following C code is executed:



          case TARGET(BINARY_ADD): {
              PyObject *right = POP();
              PyObject *left = TOP();
              PyObject *sum;
              ...
              if (PyUnicode_CheckExact(left) &&
                       PyUnicode_CheckExact(right)) {
                  sum = unicode_concatenate(left, right, f, next_instr);
                  /* unicode_concatenate consumed the ref to left */
              }
              else {
                  sum = PyNumber_Add(left, right);
                  Py_DECREF(left);
              }
              Py_DECREF(right);
              SET_TOP(sum);
              if (sum == NULL)
                  goto error;
              DISPATCH();
          }


          Here the interpreter checks whether both objects are unicode strings. If so, a special function is used (possibly more efficient: as a matter of fact, it tries to change the immutable unicode object in place, see this SO-answer); otherwise the work is dispatched to the PyNumber protocol.
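To give a feeling for what "dispatched to the PyNumber protocol" means, here is a rough Python model of how left + right is resolved. It is a simplified sketch under my own assumptions, not CPython's actual code: simplified_add is a hypothetical helper, and it ignores details such as the priority given to subclasses of the left operand's type and the C-level fallback to sequence concatenation.

```python
def simplified_add(left, right):
    """Rough model of the lookup behind `left + right` (illustration only)."""
    # 1. Ask the type of the left operand first.
    forward = getattr(type(left), "__add__", None)
    if forward is not None:
        result = forward(left, right)
        if result is not NotImplemented:
            return result

    # 2. If the types differ, give the right operand's reflected method a chance.
    if type(left) is not type(right):
        reflected = getattr(type(right), "__radd__", None)
        if reflected is not None:
            result = reflected(right, left)
            if result is not NotImplemented:
                return result

    # 3. Nobody knows how to add these two objects.
    raise TypeError(
        f"unsupported operand type(s) for +: "
        f"{type(left).__name__!r} and {type(right).__name__!r}"
    )


print(simplified_add(1, 2))       # handled by int
print(simplified_add(1, 2.5))     # int declines, float's reflected method answers
print(simplified_add("a", "b"))   # handled by str
```

Note that the types are consulted only at the moment the operation is executed; nothing about them is recorded in the bytecode.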



          Obviously, the interpreter also has to know the exact type when an object is created; for example, different "classes" are used for a="1" and a=1. But as we have seen, this is not the only place.



          So the interpreter does infer types at run time, but most of the time it doesn't have to: the goal can be reached via dynamic dispatch.















          edited Jan 11 at 16:08

























          answered Nov 24 '18 at 16:07









          ead






































              Python is built around the philosophy of duck typing. Names are never declared with a type and nothing is checked ahead of time; an operation is simply attempted on whatever objects it receives at run time. For example,



              >>> x = 5
              >>> y = "5"
              >>> '__mul__' in dir(x)
              True
              >>> '__mul__' in dir(y)
              True
              >>> type(x)
              <class 'int'>
              >>> type(y)
              <class 'str'>
              >>> type(x*y)
              <class 'str'>


              The CPython interpreter looks up the __mul__ method on x and y, tries to "make it work", and returns a result. Also, Python bytecode never gets translated to machine code; it gets executed inside the CPython interpreter. One major difference between the JVM and the CPython virtual machine is that the JVM can compile Java bytecode to machine code for performance gains whenever it wants to (JIT compilation), whereas the CPython VM only runs the bytecode as it is.
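A small sketch of what "make it work" looks like in practice, assuming only the standard behaviour of int and str (the variable names are mine). The multiplication succeeds because one of the two types knows how to handle the pair, while the addition fails, and it fails only when the operation is actually executed:

```python
x = 5
y = "5"

# int declines to multiply with a str, so str's reflected multiply
# is used and repeats the string.
product = x * y
print(product)

# Neither int nor str knows how to add the other type, so the attempt
# ends in a TypeError at run time.
try:
    x + y
    add_error = None
except TypeError as exc:
    add_error = exc

print(type(add_error).__name__)
```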

















              answered Nov 23 '18 at 16:02









              prithajnath













              • What do you mean by "Python bytecode never gets translated to machine code. It gets executed inside the CPython interpreter."? Could you please elaborate on that?

                – PyFox
                Nov 23 '18 at 17:22











              • Machine code usually refers to code that can be executed directly by your computer. For instance, when you compile a C++ program on your computer, it gets compiled to machine code specific to YOUR computer's CPU architecture. Your CPU understands these instructions and can run them. So in a way, your CPU is the interpreter here. Just think of Python bytecode as machine code for the CPython virtual machine. They are just instructions for the CPython virtual machine, which can run them without having to translate them to something else.

                – prithajnath
                Nov 23 '18 at 18:18
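To make the "machine code for a virtual machine" idea concrete, here is a toy stack machine. It is only an illustration under my own assumptions, not how CPython is implemented: the function run and the list ADD_PROGRAM are hypothetical, and the opcode names are borrowed from CPython's LOAD_FAST, BINARY_ADD and RETURN_VALUE.

```python
def run(program, local_vars):
    """Execute a list of (opcode, argument) pairs on a simple value stack."""
    stack = []
    for opcode, arg in program:
        if opcode == "LOAD_FAST":
            # Push the object a local name currently refers to.
            stack.append(local_vars[arg])
        elif opcode == "BINARY_ADD":
            # Pop two objects and let them decide what + means.
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == "RETURN_VALUE":
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    raise RuntimeError("program ended without RETURN_VALUE")


# The "bytecode" for: return x + y
ADD_PROGRAM = [
    ("LOAD_FAST", "x"),
    ("LOAD_FAST", "y"),
    ("BINARY_ADD", None),
    ("RETURN_VALUE", None),
]

print(run(ADD_PROGRAM, {"x": 1, "y": 2}))
print(run(ADD_PROGRAM, {"x": "a", "y": "b"}))
```

The loop reads one instruction at a time and acts on it directly; at no point is the program turned into instructions for the CPU.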














































                  It could be useful for your understanding to avoid thinking of "variables" in Python. Compared to statically typed languages that have to associate a type with a variable, a class member, or a function argument, Python only deals with "labels" or names for objects.



                  So in the snippet,



                  a = "a string"
                  a = 5 # a number
                  a = MyClass() # an object of type MyClass


                  the label a never has a type. It is just a name that points to different objects at different times (very similar, in fact, to "pointers" in other languages). The objects, on the other hand (the string, the number), always have a type. The nature of this type could change, as you can dynamically change the definition of a class, but it will always be determined, i.e. known by the language interpreter.



                  So to answer the question: Python never determines the type of a variable (label/name); it only uses the name to refer to an object, and that object has a type.
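A short sketch of the label idea, with names of my own choosing (b, obj and the describe method are made up for the illustration). Two labels can refer to the same object, rebinding one label does not touch the object, and a class definition can change while its instances exist:

```python
class MyClass:
    pass


a = "a string"
b = a                  # a second label for the same object
same_object = a is b

a = 5                  # rebinding a leaves the string (and b) untouched
a_type = type(a)
b_type = type(b)

# The definition of a class can change after an instance was created;
# the instance sees the new method because lookup goes through its type.
obj = MyClass()
MyClass.describe = lambda self: "added later"
described = obj.describe()

print(same_object, a_type.__name__, b_type.__name__, described)
```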

















                  answered Nov 26 '18 at 0:12









                  strank





























