Java Multithread File Read + Operation + Write











































































I am trying to write a small program that solves the following problem using multithreading in Java. I am struggling to understand where to start and am looking for some advice. The desired steps in the process are as follows:

  1. Read all the text files contained in a given directory.
  2. Create a word count for each of the files read.
  3. Write each word count as output to a new file in a different directory.

I have written the word-count function, and it works fine, but I would like to know how to multithread this operation so that the files are read, the words counted, and the output written in parallel.







java multithreading
















asked Nov 16 '18 at 22:01









user10665129




















Perhaps use the Stream functionality in Java (docs.oracle.com/javase/tutorial/collections/streams/…). Create a List of the files contained in the directory and get a parallelStream of those File objects. You can then process each file, returning the word count as part of the parallel stream processing, and decide what to do with the count. If each count is written to its own file, that too could be part of the stream processing. I'd start there.

– EdH, Nov 16 '18 at 22:22














2 Answers

























































































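Can you share the single-threaded version? Conceptually it can be as simple as the sketch below. countWords and writeOutput are your methods, and files is a List of the files you have already collected from the directory.

    // Needs java.util.Map; Map.entry is available from Java 9.
    files.parallelStream()
         // Pair each file with its word count.
         .map(file -> Map.entry(file, countWords(file)))
         // forEach receives one argument, so unpack the pair here.
         .forEach(entry -> writeOutput(entry.getKey(), entry.getValue()));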
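The snippet above can be fleshed out into a small self-contained program. What follows is only a sketch under assumptions: the class name ParallelWordCount, the run method, the ".txt" filter, and the choice to name each output file after its input file are all hypothetical, and countWords here is a stand-in for the word-count function the asker already has.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ParallelWordCount {

    // Stand-in for the existing word-count function (an assumption):
    // counts whitespace-separated tokens in one UTF-8 text file.
    static long countWords(Path file) {
        try (Stream<String> lines = Files.lines(file)) {
            return lines.flatMap(line -> Stream.of(line.trim().split("\\s+")))
                        .filter(word -> !word.isEmpty())
                        .count();
        } catch (IOException e) {
            // Lambdas in a stream cannot throw checked exceptions.
            throw new UncheckedIOException(e);
        }
    }

    // Writes the count to a file with the same name inside outputDir.
    static void writeOutput(Path file, long count, Path outputDir) {
        try {
            Files.write(outputDir.resolve(file.getFileName()),
                        Long.toString(count).getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static void run(Path inputDir, Path outputDir) throws IOException {
        Files.createDirectories(outputDir);

        // Step 1: collect the text files into a List first.
        List<Path> files;
        try (Stream<Path> listing = Files.list(inputDir)) {
            files = listing.filter(Files::isRegularFile)
                           .filter(p -> p.toString().endsWith(".txt"))
                           .collect(Collectors.toList());
        }

        // Steps 2 and 3: count and write each file on the common pool.
        files.parallelStream()
             .forEach(file -> writeOutput(file, countWords(file), outputDir));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: ParallelWordCount <inputDir> <outputDir>");
            return;
        }
        run(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

Each file is handled independently and written to its own output file, so the worker threads share no mutable state and no locking is needed. Collecting the directory listing into a List before calling parallelStream is a design choice: a sized list tends to split across threads more evenly than the stream returned by Files.list.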
















answered Nov 16 '18 at 22:24

K. M. Fazle Azim Babu






































I like your enthusiasm for learning multithreaded programming!

Besides parallelizing the work across files, you can also parallelize and optimize the countWords task itself.

This is a classic case for Fork/Join: a rather advanced level of multithreaded programming, but a very satisfying one :)

You can search the web for examples similar to your own; here is one to start with: https://www.baeldung.com/java-fork-join
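To make the Fork/Join suggestion concrete, here is a minimal sketch of splitting the word count of a single file across threads. It assumes the file's lines have already been read into a List<String>; the class name ForkJoinWordCount, the helper countWordsInLine, and the THRESHOLD value are hypothetical choices for illustration, not something taken from the linked article.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class ForkJoinWordCount extends RecursiveTask<Long> {

    // Lines handled sequentially by one leaf task; a guess that needs tuning.
    private static final int THRESHOLD = 1_000;

    private final List<String> lines;
    private final int from; // inclusive
    private final int to;   // exclusive

    ForkJoinWordCount(List<String> lines, int from, int to) {
        this.lines = lines;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        // Small enough: count this range directly.
        if (to - from <= THRESHOLD) {
            long count = 0;
            for (int i = from; i < to; i++) {
                count += countWordsInLine(lines.get(i));
            }
            return count;
        }
        // Otherwise split the range in half and combine the two results.
        int mid = from + (to - from) / 2;
        ForkJoinWordCount left = new ForkJoinWordCount(lines, from, mid);
        ForkJoinWordCount right = new ForkJoinWordCount(lines, mid, to);
        left.fork();                          // run the left half asynchronously
        return right.compute() + left.join(); // do the right half in this thread
    }

    // Counts whitespace-separated tokens in one line.
    static long countWordsInLine(String line) {
        String trimmed = line.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    static long countWords(List<String> lines) {
        return ForkJoinPool.commonPool()
                           .invoke(new ForkJoinWordCount(lines, 0, lines.size()));
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "the quick brown fox", "", "  jumps over  ", "the lazy dog");
        System.out.println(countWords(lines)); // prints 9
    }
}
```

Splitting inside a file is likely to pay off only when individual files are large. With many small files, parallelizing across files is usually enough, and the extra task overhead here may outweigh the gain.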

















answered Nov 16 '18 at 23:13

dorony





























