Limit the number of running jobs in SLURM

I am queuing multiple jobs in SLURM. Can I limit the number of jobs that run in parallel?



Thanks in advance!

slurm

asked Mar 15 '17 at 14:19 by user1447257


3 Answers


7 votes

If you are not the administrator, you can hold some jobs if you do not want them all to start at the same time, with scontrol hold <JOBID>, and you can delay the start of some jobs with sbatch --begin=YYYY-MM-DD. Also, if it is a job array, you can limit the number of array tasks that run concurrently with, for instance, --array=1-100%25 to get 100 tasks in the array with only 25 of them running at any one time.
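For instance (the job ID and script name below are placeholders):

    # throttle a 100-task array to at most 25 concurrent tasks
    sbatch --array=1-100%25 job.sh

    # hold a queued job so it does not start, release it later
    scontrol hold 12345
    scontrol release 12345

    # do not start this job before the given date and time
    sbatch --begin=2017-03-20T08:00 job.sh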






answered Mar 16 '17 at 8:53 by damienfrancois

• When you hold a job, I guess you have to manually release it before it can run, right? Is it possible to release it automatically whenever a running job ends?
  – zwlayer, Apr 18 '18 at 10:13

• @zwlayer Yes, you have to release them yourself. You can do that at the end of your submission script, or write a script that monitors what is happening, but at that point I would suggest considering a workflow management tool such as FireWorks.
  – damienfrancois, Apr 18 '18 at 10:15
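A minimal sketch of that release-at-the-end pattern, assuming the held job IDs were recorded, one per line, in a file such as held_jobs.txt at submission time (the file name and bookkeeping are illustrative, and races on the shared file are ignored):

    #!/bin/bash
    #SBATCH -J worker

    # ... the actual work goes here ...

    # On completion, release the next held job, if any remain.
    next=$(head -n 1 held_jobs.txt 2>/dev/null)
    if [ -n "$next" ]; then
        scontrol release "$next"
        sed -i '1d' held_jobs.txt   # drop the released ID from the list
    fi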
4 votes

According to the SLURM Resource Limits documentation, you can limit the total number of jobs allowed to run at the same time for an association or QOS with the MaxJobs parameter. As a reminder, an association is a combination of cluster, account, user name and, optionally, partition name.



You should be able to do something similar to:

    sacctmgr modify user <userid> account=<account_name> set MaxJobs=10

I found this presentation to be very helpful in case you have more questions.
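To verify the limit afterwards, something along these lines should work (the format fields are just a convenient subset):

    sacctmgr show assoc user=<userid> format=user,account,maxjobs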






answered Mar 16 '17 at 1:00 by AndresM

• I wanted to limit, as a user, only the number of parallel jobs of a certain step. This setting is global.
  – user1447257, Mar 16 '17 at 9:21
0 votes

According to the SLURM documentation, --array=0-15%4 (note the - for the range, not :) will limit the number of simultaneously running tasks from this job array to 4.

I wrote test.sbatch:

    #!/bin/bash
    # test.sbatch
    #
    #SBATCH -J a
    #SBATCH -p campus
    #SBATCH -c 1
    #SBATCH -o %A_%a.output

    mkdir "test${SLURM_ARRAY_TASK_ID}"

    # Sleep for up to 10 minutes so the jobs stay visible in squeue, each
    # for a different time, to check that the number of parallel jobs
    # remains constant.
    number=$(( RANDOM % 600 )); echo "$number"

    sleep $number


and ran it with sbatch --array=1-15%4 test.sbatch.

The jobs ran as expected (always 4 in parallel): each created its directory and kept running for $number seconds.
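While the array is in the queue, a plain squeue call should show at most 4 of its tasks in the R (running) state at any time:

    squeue -u $USER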



Appreciate comments and suggestions.

answered 2 days ago by aerijman