Beautiful Soup - turning attributes into a DataFrame - BEA API























I'm using the BEA's API to query income data. API instructions: https://apps.bea.gov/api/_pdf/bea_web_service_api_user_guide.pdf



My goal is to parse the XML that the API returns and turn it into a DataFrame with one column per year.



The problem is that the way I parse the data leaves it in a "melted" (long) format, with one row per MSA and year. What I want instead is a separate column for each year, holding the Income values for that year.
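To make the target shape concrete, here is a minimal sketch of one possible reshape, assuming pandas is installed. The rows are copied from the sample output shown further down in this question; the name `income_wide` is only illustrative, and `set_index` followed by `unstack` is one of several ways to go from long to wide.

```python
import pandas as pd

# Rows copied from the sample output in this question (long / "melted" form).
income_data = pd.DataFrame({
    'MSA': ['Abilene, TX (Metropolitan Statistical Area)'] * 3
           + ['Akron, OH (Metropolitan Statistical Area)'] * 2,
    'FIPS': ['10180', '10180', '10180', '10420', '10420'],
    'Year': ['2014', '2015', '2016', '2016', '2015'],
    'Income': ['41818', '41651', '40409', '45448', '45298'],
})

# XML attribute values arrive as strings, so convert Income before reshaping.
income_data['Income'] = pd.to_numeric(income_data['Income'])

# Move the Year level of the index into the columns:
# one row per MSA/FIPS pair, one column per year.
income_wide = (
    income_data
    .set_index(['MSA', 'FIPS', 'Year'])['Income']
    .unstack('Year')
    .reset_index()
)
income_wide.columns.name = None  # drop the leftover 'Year' label on the columns

print(income_wide)
```

Any MSA/year combination that is missing from the long data (Akron for 2014 in these five rows) shows up as NaN in the wide frame.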



How can I accomplish this? Below is the code that I am using. To run it, you need to sign up for an API key via email and enter it after "UserID=" in the URL below.



import requests
import pandas as pd
from bs4 import BeautifulSoup

# The parentheses join the two string literals into a single URL.
bea_income = ('https://apps.bea.gov/api/data/?UserID=ENTERYOURAPIKEY&method=GetData&'
              'datasetname=RegionalIncome&TableName=RPI2&LineCode=2&Year=2014,2015,2016&GeoFips=MSA&ResultFormat=xml')

bea_inc_request = requests.get(bea_income, headers={'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36',
                                                    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8'})
bea_inc_html = bea_inc_request.content
bea_inc_soup = BeautifulSoup(bea_inc_html, 'xml')  # the 'xml' parser requires lxml

MSA = []
TimePeriod = []
Income = []
GeoFips = []

# Each <Data> element carries its values as attributes; collect them one element at a time.
for data_node in bea_inc_soup.Results.find_all('Data'):
    MSA.append(data_node['GeoName'])
    GeoFips.append(data_node['GeoFips'])
    Income.append(data_node['DataValue'])
    TimePeriod.append(data_node['TimePeriod'])

income_data = pd.DataFrame({'MSA': MSA, 'FIPS': GeoFips, 'Year': TimePeriod, 'Income': Income})

                                           MSA   FIPS  Year  Income
0  Abilene, TX (Metropolitan Statistical Area)  10180  2014   41818
1  Abilene, TX (Metropolitan Statistical Area)  10180  2015   41651
2  Abilene, TX (Metropolitan Statistical Area)  10180  2016   40409
3    Akron, OH (Metropolitan Statistical Area)  10420  2016   45448
4    Akron, OH (Metropolitan Statistical Area)  10420  2015   45298
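For anyone who wants to try the attribute extraction without an API key, here is a rough, offline sketch. It swaps Beautiful Soup for the standard library's `xml.etree.ElementTree` so that it runs with no extra packages. The `sample_xml` string is hand-written for illustration: the `Results` and `Data` element names and the four attribute names come from the code above, while the `BEAAPI` root element and the overall layout are assumptions, not a captured API response.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for the API response. The root element name is an
# assumption; Results/Data and the attribute names follow the question's code.
sample_xml = """<BEAAPI>
  <Results>
    <Data GeoFips="10180" GeoName="Abilene, TX (Metropolitan Statistical Area)" TimePeriod="2014" DataValue="41818"/>
    <Data GeoFips="10180" GeoName="Abilene, TX (Metropolitan Statistical Area)" TimePeriod="2015" DataValue="41651"/>
    <Data GeoFips="10420" GeoName="Akron, OH (Metropolitan Statistical Area)" TimePeriod="2015" DataValue="45298"/>
  </Results>
</BEAAPI>"""

root = ET.fromstring(sample_xml)

# Build one record (dict) per <Data> element in a single pass over the tree.
records = [
    {
        'MSA': node.get('GeoName'),
        'FIPS': node.get('GeoFips'),
        'Year': node.get('TimePeriod'),
        'Income': node.get('DataValue'),
    }
    for node in root.iter('Data')
]

for record in records:
    print(record)
```

A list of dicts like `records` can be passed straight to `pd.DataFrame(records)`, which should give the same long-format frame as the four parallel lists.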



































      python pandas beautifulsoup
















      asked Nov 5 at 3:52









      steich
