String to indicator variable, type mismatch error





I am trying to convert a string variable (type str2, format %9s) into an indicator variable in Stata.



However, I keep receiving the following error:




type mismatch r(109)




I am using the 2016 ANES set and I am essentially trying to group states into open primary and closed primary/caucus states.



I have attempted the following code:



gen oprim= (state=="AL" & "AK" & "CO" & "GA" &...)

gen oprim=1 if state=="AL" & "AK" & "CO" & "GA" &...


I have had trouble converting this variable before. For example, I tried generating the new indicator variable without putting quotation marks around the state codes.



I have also tried to destring the variable, but I am receiving the following output:



destring state, generate(statenum) float
state: contains nonnumeric characters; no generate


Any help anyone could offer would be much appreciated.







string stata type-mismatch














edited Nov 24 '18 at 10:26 by Pearly Spencer

asked Nov 23 '18 at 19:04 by Alec M













Please provide example data using dataex. Read here for more information.

– Pearly Spencer
Nov 23 '18 at 19:28

















2 Answers













Let's spell out why the code in the question is wrong. The OP doesn't give example data, but the errors are all identifiable without such data, assuming naturally that state is a string variable in the dataset.

First, we can leave out the ... (which no one presumes are legal) and the parentheses (which make no difference).

    gen oprim = state=="AL" & "AK" & "CO" & "GA"

    gen oprim=1 if state=="AL" & "AK" & "CO" & "GA"

Either of these will fail because Stata parses the condition (the expression after = in the first command, the if qualifier in the second) as the separate elements

    state == "AL"
    & "AK"
    & "CO"
    & "GA"

state == "AL" is a true-or-false condition evaluated as 0 or 1, but none of "AK", "CO", "GA" is a true-or-false condition; they are all string values, and so the commands fail, because Stata needs to see something numeric as each of the elements in such a condition. Although clearly silly,

    gen oprim = state == "AL" & 42

would be legal, as 42 is numeric (and in true-or-false evaluations counts as true). Stata won't fill in state ==, which is what you hope to see implied.
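The point can be checked with a minimal sketch. It assumes the census toy dataset that ships with Stata in place of the ANES data, with its two-letter variable state2 standing in for state; the names demo_bad and demo_ok are hypothetical. The capture prefix traps the error so that the return code can be inspected (the question reports r(109), type mismatch).

```stata
* Sketch only: state2 from the census toy data stands in for the OP's state variable
sysuse census, clear

* A bare string after & is not a numeric element, so this command should fail;
* capture suppresses the error and stores the return code in _rc
capture generate byte demo_bad = state2 == "AL" & "AK"
display "return code: " _rc

* A numeric element after & is accepted; any nonzero value counts as true
generate byte demo_ok = state2 == "AL" & 42
count if demo_ok
```

Because the first generate fails, demo_bad is never created, whereas demo_ok flags the single Alabama observation.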



If you rewrite

    gen oprim = state == "AL" & state == "AK" & state == "CO" & state == "GA"

then you have a legal command. It's just not at all what you evidently want. It's impossible for state to be equal to different string values in the same observation, which is what this command is testing for. You're confusing & (and) with | (or). The command you want is

    gen oprim = state == "AL" | state == "AK" | state == "CO" | state == "GA"
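The difference between the two operators can be seen side by side in a small sketch, again assuming the census toy dataset and its state2 variable as stand-ins for the OP's data; demo_and and demo_or are hypothetical names.

```stata
* Sketch only: census toy data, with state2 standing in for the OP's state variable
sysuse census, clear

* & requires every comparison to hold in the same observation,
* which can never happen for two different codes
generate byte demo_and = state2 == "AL" & state2 == "AK"

* | requires at least one comparison to hold
generate byte demo_or = state2 == "AL" | state2 == "AK"

tabulate demo_and
tabulate demo_or
```

The & version is 0 in every observation, while the | version is 1 for exactly the Alabama and Alaska observations.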


Such statements get long and are tedious and error-prone to write out, but Stata has an alternative syntax:

    gen oprim = inlist(state, "AL", "AK", "CO", "GA")

There are limits to that, and there are other strategies too, but I will leave this answer there without addressing further issues.
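One such limit, to the best of my reading of the inlist() documentation, is that a call with string arguments accepts at most 10 arguments in total, which is fewer than a full list of states would need. A common workaround, sketched here under that assumption, is to join several inlist() calls with |. The census toy dataset and state2 again stand in for the OP's data, and the codes listed are placeholders rather than a real classification of open-primary states.

```stata
* Sketch only: census toy data; the codes are placeholders,
* not an actual list of open-primary states
sysuse census, clear

* Each inlist() call stays short; | combines the calls into one indicator
generate byte oprim = inlist(state2, "AL", "AK", "CO", "GA") | inlist(state2, "AZ", "DE")

tabulate oprim
```

Splitting the list this way keeps each call short and still produces a single 0/1 variable.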

















answered Nov 24 '18 at 8:30 by Nick Cox








































Using the first ten observations of the census toy dataset:

    sysuse census, clear
    keep if _n <= 10

The following works for me:

    generate oprim = 0
    replace oprim = 1 if state2 == "AZ" | state2 == "DE"

    list state2 oprim, separator(0)

         +----------------+
         | state2   oprim |
         |----------------|
      1. | AL           0 |
      2. | AK           0 |
      3. | AZ           1 |
      4. | AR           0 |
      5. | CA           0 |
      6. | CO           0 |
      7. | CT           0 |
      8. | DE           1 |
      9. | FL           0 |
     10. | GA           0 |
         +----------------+
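Starting from 0 matters: the question's second attempt, gen oprim=1 if ..., would leave every non-matching observation missing rather than 0 even once its condition is fixed. A minimal sketch of the difference, using the same ten census observations (flag_if and flag_01 are hypothetical names):

```stata
* Sketch only: same ten observations of the census toy data as above
sysuse census, clear
keep if _n <= 10

* With an if qualifier and no starting value, non-matching observations stay missing
generate byte flag_if = 1 if state2 == "AZ" | state2 == "DE"

* Assigning the true-or-false expression directly yields 0 or 1 in every observation
generate byte flag_01 = state2 == "AZ" | state2 == "DE"

list state2 flag_if flag_01, separator(0)
```

Here flag_if is missing in the eight non-matching observations, whereas flag_01 is 0 in those observations and 1 for AZ and DE.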














edited Nov 26 '18 at 18:50

answered Nov 23 '18 at 19:22 by Pearly Spencer





























