Splitting a dataframe into chunks and naming each new chunk as its own dataframe












Is there a good way to split a dataframe into chunks and automatically assign each chunk to its own named dataframe?

For example, dfmaster has 1000 records. Split it into chunks of 200 and create df1, df2, ..., df5.
Any guidance would be much appreciated.

I've looked on other boards and found no guidance on a function that can automatically create new dataframes.










  • If you're reading your data with pd.read_csv or anything similar, you can use the chunksize parameter: pandas.pydata.org/pandas-docs/version/0.23/generated/…. You'd then write a simple loop such as for chunk in pd.read_csv(path, chunksize=200), where path is your file, and so on.

    – RoyM
    Nov 15 '18 at 7:08
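
The chunksize idea from the comment above could look roughly like the sketch below. The CSV content, the column names `id` and `value`, and the `chunks` dictionary are all made up for illustration; an in-memory string stands in for a real file so the example runs on its own.

```python
import io

import pandas as pd

# Stand-in for a real CSV file: 1000 rows with two made-up columns.
csv_text = "id,value\n" + "\n".join(f"{i},{i * 2}" for i in range(1000))

# With chunksize set, read_csv returns an iterator of DataFrames
# instead of loading all rows at once.
chunks = {}
reader = pd.read_csv(io.StringIO(csv_text), chunksize=200)
for number, chunk in enumerate(reader, start=1):
    chunks[f"df{number}"] = chunk

print(list(chunks))        # ['df1', 'df2', 'df3', 'df4', 'df5']
print(len(chunks["df1"]))  # 200
```

Storing the chunks in a dictionary keyed by name keeps them addressable as `chunks["df1"]`, `chunks["df2"]`, and so on, without creating variables dynamically.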
















python loops dataframe split chunks






edited Nov 15 '18 at 7:05 by Torxed
asked Nov 15 '18 at 7:03 by pynewbee

2 Answers

Use numpy for splitting. See the example below:



In [2095]: df
Out[2095]:
      0     1     2    3     4    5     6     7     8     9    10
0  0.25  0.00  0.00  0.0  0.00  0.0  0.94  0.00  0.00  0.63  0.00
1  0.51  0.51   NaN  NaN   NaN  NaN   NaN   NaN   NaN   NaN   NaN
2  0.54  0.54  0.00  0.0  0.63  0.0  0.51  0.54  0.51  1.00  0.51
3  0.81  0.05  0.13  0.7  0.02  NaN   NaN   NaN   NaN   NaN   NaN

In [2096]: np.split(df, 2)
Out[2096]:
[      0     1    2    3    4    5     6    7    8     9   10
 0  0.25  0.00  0.0  0.0  0.0  0.0  0.94  0.0  0.0  0.63  0.0
 1  0.51  0.51  NaN  NaN  NaN  NaN   NaN  NaN  NaN   NaN  NaN,
       0     1     2    3     4    5     6     7     8    9    10
 2  0.54  0.54  0.00  0.0  0.63  0.0  0.51  0.54  0.51  1.0  0.51
 3  0.81  0.05  0.13  0.7  0.02  NaN   NaN   NaN   NaN  NaN   NaN]



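df gets split into 2 dataframes with 2 rows each.

The second argument is the number of equal sections, not the number of rows per chunk, and np.split raises an error if the rows cannot be divided equally. For 1000 records in chunks of 200, you can do np.split(dfmaster, 5).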
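
To get the named chunks the question asks for (df1 through df5), one option is to store the pieces in a dictionary. This is a minimal sketch and not part of the original answer: it slices with `iloc` rather than `np.split`, and the `dfmaster` contents, the column names, and the `chunks` dictionary are assumptions made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for dfmaster: 1000 rows of made-up data.
dfmaster = pd.DataFrame({"id": np.arange(1000), "value": np.arange(1000) * 0.5})

chunk_size = 200

# Slice by row position; the last chunk is simply shorter if the
# row count is not a multiple of chunk_size.
chunks = {
    f"df{i // chunk_size + 1}": dfmaster.iloc[i:i + chunk_size]
    for i in range(0, len(dfmaster), chunk_size)
}

df3 = chunks["df3"]
print(list(chunks))                      # ['df1', 'df2', 'df3', 'df4', 'df5']
print(df3.index.min(), df3.index.max())  # 400 599
```

A dictionary is generally easier to work with than separate variables created on the fly, because the chunk names can be looped over or looked up by key.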






answered Nov 15 '18 at 7:15 by Mayank Porwal

I find these ideas helpful:

Solution via a list:
https://stackoverflow.com/a/49563326/10396469

Solution using numpy.split:
https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.split.html

Just use df = df.values first to convert the dataframe to a numpy.array.
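
Converting to an array first drops the column labels and the index, so they have to be re-attached if dataframes are wanted back. The sketch below illustrates that round trip under a few assumptions: the data and column names are made up, `to_numpy()` is used in place of `.values`, and `np.array_split` is swapped in for `np.split` so that an uneven row count does not raise an error.

```python
import numpy as np
import pandas as pd

# Made-up example data: 10 rows, which does not divide evenly by 3.
df = pd.DataFrame({"a": range(10), "b": [x * 1.5 for x in range(10)]})

# to_numpy() (like .values) returns a plain array without labels.
arr = df.to_numpy()

# array_split, unlike split, tolerates an uneven division.
parts = np.array_split(arr, 3)

# Re-attach the column names to get DataFrames back.
frames = [pd.DataFrame(part, columns=df.columns) for part in parts]

print([len(frame) for frame in frames])  # [4, 3, 3]
```

Note that mixed column types are coerced to a single dtype in the array, so this route suits all-numeric data best.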







answered Nov 15 '18 at 7:12 by Ruslan S.