Spark Web UI Completed Workers No Longer Accessible












We run Spark and use the Spark web UI, and everything works fine at first.
After the master has been running for a day or two, the entries for completed workers start turning into plain text: they are no longer URLs we can click to view the app and driver logs.



The scratch drive for each worker is 250 GB, and the log directory is also on a 250 GB volume; both have plenty of free space, so we are not running out of disk.



Is there a retention value that needs to be set in the default config file that would keep these links available for a longer period?



For example, we have 4 workers, and for 3 of them the worker URLs are no longer available; only the worker that shares a host with the master still has them. If we restart the master or kick off a new job, the worker URLs reappear in the completed-jobs section. Most of these jobs run at 4 AM, and the links seem to start disappearing around 2 PM: the worker URL turns into plain text and the link is no longer available. Checking the directories, the app directory's stdout still exists; the driver directory also still exists, but its stdout does not. It's as if a janitor process removed the stdout after a period of time.



We were checking through the docs and noticed a few parameters that look like they may be related to this:



spark.history.retainedApplications=50 
spark.executor.logs.rolling.*
spark.deploy.retainedApplications=200
spark.deploy.retainedDrivers=200
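For reference, this is roughly where those values would be set. The `spark.worker.cleanup.*` properties are ones we found in the standalone-mode docs (they control a worker-side sweep that deletes old application work directories, including stdout/stderr); we have not tried them yet, and the values below are examples, not our current configuration:

```shell
# conf/spark-defaults.conf -- master-side retention of completed entries
# (how many completed applications/drivers keep their UI entries before
# the oldest are dropped)
# spark.deploy.retainedApplications  200
# spark.deploy.retainedDrivers       200

# conf/spark-env.sh -- per the standalone docs, worker cleanup properties
# are passed to the worker daemon via SPARK_WORKER_OPTS
export SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=false \
  -Dspark.worker.cleanup.interval=1800 \
  -Dspark.worker.cleanup.appDataTtl=604800"
```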


Would anyone know what is causing this, or how we can increase the retention time or counts?
Thanks.










      scala apache-spark hadoop mapreduce






      asked Nov 21 '18 at 22:43









Rich

1835























