Batch copying TSVs to postgres with copy command





.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty{ height:90px;width:728px;box-sizing:border-box;
}







1















I wrote this script that uploads the contents of a folder of TSVs to my Postgres database.



It works, but it reads the files line by line, which takes a long time.



Is there a way to modify this so it runs the COPY command instead of an INSERT command?



I left my previous attempt at a COPY in the code below (but commented out). The problem with that code is that it copied the file headers into the rows of my Postgres table.



def main():

# MAKE SURE THIS IS THE RIGHT FILE TYPE
for file in pathlib.Path().rglob('*.tsv'):
print(os.path.abspath(file))

# MAKE SURE THIS IS THE RIGHT TABLE
cur.execute(create_table_agent)


with open(file,'r') as file_in:
reader = csv.reader(file_in, delimiter='t')
next(reader)
for row in reader:
print(row)
cur.execute("INSERT INTO mls_agent_1_line VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)", row)
# cur.copy_from(file_in, 'mls_appraisal_world', sep='t', null='\N')
conn.commit()

conn.close()

if __name__ == '__main__':
main()









share|improve this question





























    1















    I wrote this script that uploads the contents of a folder of TSVs to my Postgres database.



    It works, but it reads the files line by line, which takes a long time.



    Is there a way to modify this so it runs the COPY command instead of an INSERT command?



    I left my previous attempt at a COPY in the code below (but commented out). The problem with that code is that it copied the file headers into the rows of my Postgres table.



    def main():

    # MAKE SURE THIS IS THE RIGHT FILE TYPE
    for file in pathlib.Path().rglob('*.tsv'):
    print(os.path.abspath(file))

    # MAKE SURE THIS IS THE RIGHT TABLE
    cur.execute(create_table_agent)


    with open(file,'r') as file_in:
    reader = csv.reader(file_in, delimiter='t')
    next(reader)
    for row in reader:
    print(row)
    cur.execute("INSERT INTO mls_agent_1_line VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)", row)
    # cur.copy_from(file_in, 'mls_appraisal_world', sep='t', null='\N')
    conn.commit()

    conn.close()

    if __name__ == '__main__':
    main()









    share|improve this question

























      1












      1








      1








      I wrote this script that uploads the contents of a folder of TSVs to my Postgres database.



      It works, but it reads the files line by line, which takes a long time.



      Is there a way to modify this so it runs the COPY command instead of an INSERT command?



      I left my previous attempt at a COPY in the code below (but commented out). The problem with that code is that it copied the file headers into the rows of my Postgres table.



      def main():

      # MAKE SURE THIS IS THE RIGHT FILE TYPE
      for file in pathlib.Path().rglob('*.tsv'):
      print(os.path.abspath(file))

      # MAKE SURE THIS IS THE RIGHT TABLE
      cur.execute(create_table_agent)


      with open(file,'r') as file_in:
      reader = csv.reader(file_in, delimiter='t')
      next(reader)
      for row in reader:
      print(row)
      cur.execute("INSERT INTO mls_agent_1_line VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)", row)
      # cur.copy_from(file_in, 'mls_appraisal_world', sep='t', null='\N')
      conn.commit()

      conn.close()

      if __name__ == '__main__':
      main()









      share|improve this question














      I wrote this script that uploads the contents of a folder of TSVs to my Postgres database.



      It works, but it reads the files line by line, which takes a long time.



      Is there a way to modify this so it runs the COPY command instead of an INSERT command?



      I left my previous attempt at a COPY in the code below (but commented out). The problem with that code is that it copied the file headers into the rows of my Postgres table.



      def main():

      # MAKE SURE THIS IS THE RIGHT FILE TYPE
      for file in pathlib.Path().rglob('*.tsv'):
      print(os.path.abspath(file))

      # MAKE SURE THIS IS THE RIGHT TABLE
      cur.execute(create_table_agent)


      with open(file,'r') as file_in:
      reader = csv.reader(file_in, delimiter='t')
      next(reader)
      for row in reader:
      print(row)
      cur.execute("INSERT INTO mls_agent_1_line VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)", row)
      # cur.copy_from(file_in, 'mls_appraisal_world', sep='t', null='\N')
      conn.commit()

      conn.close()

      if __name__ == '__main__':
      main()






      python postgresql psycopg2






      share|improve this question













      share|improve this question











      share|improve this question




      share|improve this question










      asked Nov 23 '18 at 23:25









      reallymemorablereallymemorable

      174119




      174119
























          1 Answer
          1






          active

          oldest

          votes


















          1














          The Postgres COPY command can properly skip headers only in CSV format. Per the documentation:




          HEADER



          Specifies that the file contains a header line with the names of each column in the file. On output, the first line contains the column names from the table, and on input, the first line is ignored. This option is allowed only when using CSV format.




          If your files can be properly imported by the COPY command with format csv option, use the function copy_expert(sql, file, size=8192):



          with open(file, 'r') as file_in:
          cur.copy_expert("copy table_name from stdin with csv header delimiter E't'", file_in)





          share|improve this answer


























          • I have used the csv header command in single line copy commands before, like this: copy table_name from '/path/to/file.txt' with csv header delimiter E't';

            – reallymemorable
            Nov 24 '18 at 3:22






          • 1





            Ok, see the updated answer.

            – klin
            Nov 24 '18 at 3:52











          • That worked -- but out of curiosity -- why didn't that command start with copy?

            – reallymemorable
            Nov 24 '18 at 15:10






          • 1





            COPY is a Postgres SQL command, COPY is a command specific for psql that implements the first and works only in psql.

            – klin
            Nov 24 '18 at 15:24












          Your Answer






          StackExchange.ifUsing("editor", function () {
          StackExchange.using("externalEditor", function () {
          StackExchange.using("snippets", function () {
          StackExchange.snippets.init();
          });
          });
          }, "code-snippets");

          StackExchange.ready(function() {
          var channelOptions = {
          tags: "".split(" "),
          id: "1"
          };
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function() {
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled) {
          StackExchange.using("snippets", function() {
          createEditor();
          });
          }
          else {
          createEditor();
          }
          });

          function createEditor() {
          StackExchange.prepareEditor({
          heartbeatType: 'answer',
          autoActivateHeartbeat: false,
          convertImagesToLinks: true,
          noModals: true,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: 10,
          bindNavPrevention: true,
          postfix: "",
          imageUploader: {
          brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
          contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
          allowUrls: true
          },
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          });


          }
          });














          draft saved

          draft discarded


















          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53453835%2fbatch-copying-tsvs-to-postgres-with-copy-command%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown

























          1 Answer
          1






          active

          oldest

          votes








          1 Answer
          1






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes









          1














          The Postgres COPY command can properly skip headers only in CSV format. Per the documentation:




          HEADER



          Specifies that the file contains a header line with the names of each column in the file. On output, the first line contains the column names from the table, and on input, the first line is ignored. This option is allowed only when using CSV format.




          If your files can be properly imported by the COPY command with format csv option, use the function copy_expert(sql, file, size=8192):



          with open(file, 'r') as file_in:
          cur.copy_expert("copy table_name from stdin with csv header delimiter E't'", file_in)





          share|improve this answer


























          • I have used the csv header command in single line copy commands before, like this: copy table_name from '/path/to/file.txt' with csv header delimiter E't';

            – reallymemorable
            Nov 24 '18 at 3:22






          • 1





            Ok, see the updated answer.

            – klin
            Nov 24 '18 at 3:52











          • That worked -- but out of curiosity -- why didn't that command start with copy?

            – reallymemorable
            Nov 24 '18 at 15:10






          • 1





            COPY is a Postgres SQL command, COPY is a command specific for psql that implements the first and works only in psql.

            – klin
            Nov 24 '18 at 15:24
















          1














          The Postgres COPY command can properly skip headers only in CSV format. Per the documentation:




          HEADER



          Specifies that the file contains a header line with the names of each column in the file. On output, the first line contains the column names from the table, and on input, the first line is ignored. This option is allowed only when using CSV format.




          If your files can be properly imported by the COPY command with format csv option, use the function copy_expert(sql, file, size=8192):



          with open(file, 'r') as file_in:
          cur.copy_expert("copy table_name from stdin with csv header delimiter E't'", file_in)





          share|improve this answer


























          • I have used the csv header command in single line copy commands before, like this: copy table_name from '/path/to/file.txt' with csv header delimiter E't';

            – reallymemorable
            Nov 24 '18 at 3:22






          • 1





            Ok, see the updated answer.

            – klin
            Nov 24 '18 at 3:52











          • That worked -- but out of curiosity -- why didn't that command start with copy?

            – reallymemorable
            Nov 24 '18 at 15:10






          • 1





            COPY is a Postgres SQL command, COPY is a command specific for psql that implements the first and works only in psql.

            – klin
            Nov 24 '18 at 15:24














          1












          1








          1







          The Postgres COPY command can properly skip headers only in CSV format. Per the documentation:




          HEADER



          Specifies that the file contains a header line with the names of each column in the file. On output, the first line contains the column names from the table, and on input, the first line is ignored. This option is allowed only when using CSV format.




          If your files can be properly imported by the COPY command with format csv option, use the function copy_expert(sql, file, size=8192):



          with open(file, 'r') as file_in:
          cur.copy_expert("copy table_name from stdin with csv header delimiter E't'", file_in)





          share|improve this answer















          The Postgres COPY command can properly skip headers only in CSV format. Per the documentation:




          HEADER



          Specifies that the file contains a header line with the names of each column in the file. On output, the first line contains the column names from the table, and on input, the first line is ignored. This option is allowed only when using CSV format.




          If your files can be properly imported by the COPY command with format csv option, use the function copy_expert(sql, file, size=8192):



          with open(file, 'r') as file_in:
          cur.copy_expert("copy table_name from stdin with csv header delimiter E't'", file_in)






          share|improve this answer














          share|improve this answer



          share|improve this answer








          edited Nov 24 '18 at 3:51

























          answered Nov 24 '18 at 0:55









          klinklin

          61k65991




          61k65991













          • I have used the csv header command in single line copy commands before, like this: copy table_name from '/path/to/file.txt' with csv header delimiter E't';

            – reallymemorable
            Nov 24 '18 at 3:22






          • 1





            Ok, see the updated answer.

            – klin
            Nov 24 '18 at 3:52











          • That worked -- but out of curiosity -- why didn't that command start with copy?

            – reallymemorable
            Nov 24 '18 at 15:10






          • 1





            COPY is a Postgres SQL command, COPY is a command specific for psql that implements the first and works only in psql.

            – klin
            Nov 24 '18 at 15:24



















          • I have used the csv header command in single line copy commands before, like this: copy table_name from '/path/to/file.txt' with csv header delimiter E't';

            – reallymemorable
            Nov 24 '18 at 3:22






          • 1





            Ok, see the updated answer.

            – klin
            Nov 24 '18 at 3:52











          • That worked -- but out of curiosity -- why didn't that command start with copy?

            – reallymemorable
            Nov 24 '18 at 15:10






          • 1





            COPY is a Postgres SQL command, COPY is a command specific for psql that implements the first and works only in psql.

            – klin
            Nov 24 '18 at 15:24

















          I have used the csv header command in single line copy commands before, like this: copy table_name from '/path/to/file.txt' with csv header delimiter E't';

          – reallymemorable
          Nov 24 '18 at 3:22





          I have used the csv header command in single line copy commands before, like this: copy table_name from '/path/to/file.txt' with csv header delimiter E't';

          – reallymemorable
          Nov 24 '18 at 3:22




          1




          1





          Ok, see the updated answer.

          – klin
          Nov 24 '18 at 3:52





          Ok, see the updated answer.

          – klin
          Nov 24 '18 at 3:52













          That worked -- but out of curiosity -- why didn't that command start with copy?

          – reallymemorable
          Nov 24 '18 at 15:10





          That worked -- but out of curiosity -- why didn't that command start with copy?

          – reallymemorable
          Nov 24 '18 at 15:10




          1




          1





          COPY is a Postgres SQL command, COPY is a command specific for psql that implements the first and works only in psql.

          – klin
          Nov 24 '18 at 15:24





          COPY is a Postgres SQL command, COPY is a command specific for psql that implements the first and works only in psql.

          – klin
          Nov 24 '18 at 15:24




















          draft saved

          draft discarded




















































          Thanks for contributing an answer to Stack Overflow!


          • Please be sure to answer the question. Provide details and share your research!

          But avoid



          • Asking for help, clarification, or responding to other answers.

          • Making statements based on opinion; back them up with references or personal experience.


          To learn more, see our tips on writing great answers.




          draft saved


          draft discarded














          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53453835%2fbatch-copying-tsvs-to-postgres-with-copy-command%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown





















































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown

































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown







          這個網誌中的熱門文章

          Xamarin.form Move up view when keyboard appear

          Post-Redirect-Get with Spring WebFlux and Thymeleaf

          Anylogic : not able to use stopDelay()