PHP preg_match_all(): match all words except prepositions, adjectives, and other less important words listed in an array












I want preg_match_all() to match every word except the words listed in an array.

$input = 'Lorem Ipsum is simply dummy text of the printing industry.';
$except = array('and', 'the', 'text', 'simply');
preg_match_all('/(?<match>\w{3,}+)/', $input, $matches, PREG_PATTERN_ORDER);
print_r($matches['match']);

This returns every word of three or more characters, including the unwanted ones:



Array
(
[0] => Lorem
[1] => Ipsum
[2] => simply
[3] => dummy
[4] => text
[5] => the
[6] => printing
[7] => industry
)


I need to return only the important words, leaving out prepositions, adjectives, and the other less important words listed in the array:

$except = array('and', 'the', 'text', 'simply');

It would be better if this could be done with a single function.







php regex artificial-intelligence






asked Nov 23 '18 at 11:54
Shapon Pal








  • I'd use array_diff to eliminate all the words you have in $except.
    – Jeff, Nov 23 '18 at 11:57














4 Answers
Build a regex with a negative lookahead anchored at the word boundary:

'~\b(?!(?:and|the|text|simply)\b)\w{3,}~'

Details

  • \b - a word boundary
  • (?!(?:and|the|text|simply)\b) - a negative lookahead that fails the match if and, the, text, or simply appears as a whole word immediately to the right of the current location
  • \w{3,} - three or more word characters.

PHP demo:

$input = 'Lorem Ipsum is simply dummy text of the printing industry.';
$except = array('and', 'the', 'text', 'simply');
if (preg_match_all('/\b(?!(?:' . implode('|', $except) . ')\b)\w{3,}/', $input, $matches)) {
    print_r($matches[0]);
}


              Output:



              Array
              (
              [0] => Lorem
              [1] => Ipsum
              [2] => dummy
              [3] => printing
              [4] => industry
              )
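
The pattern above is built by concatenating the excluded words directly, so a word containing a regex metacharacter would alter the pattern. Below is a minimal sketch of one way to wrap the same lookahead in a single function. The names `extract_words` and `$minLength` are hypothetical and do not appear in the original post; `preg_quote()` is used to escape each word.

```php
<?php
// Hypothetical helper: extract_words() and $minLength are illustrative names,
// not taken from the post above.
function extract_words(string $input, array $except, int $minLength = 3): array
{
    // Escape each excluded word so regex metacharacters are matched literally.
    $escaped = array_map(function ($word) {
        return preg_quote($word, '/');
    }, $except);

    // With an empty exclusion list the lookahead is left out entirely.
    $lookahead = $escaped ? '(?!(?:' . implode('|', $escaped) . ')\b)' : '';
    $pattern = '/\b' . $lookahead . '\w{' . $minLength . ',}/';

    preg_match_all($pattern, $input, $matches);
    return $matches[0];
}

$input = 'Lorem Ipsum is simply dummy text of the printing industry.';
print_r(extract_words($input, array('and', 'the', 'text', 'simply')));
```

For the sample sentence this should yield the same five words as the output above. Whether matching should also be case-insensitive (for example by adding the `i` modifier) depends on the data and is left open here.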






answered Nov 23 '18 at 11:58
Wiktor Stribiżew

























You can just apply array_diff to your result and the $except array:

$input = 'Lorem Ipsum is simply dummy text of the printing industry.';
$except = array('and', 'the', 'text', 'simply');
preg_match_all('/(?<match>\w{3,}+)/', $input, $matches, PREG_PATTERN_ORDER);
print_r(array_diff($matches['match'], $except));


                      Output:



                      Array
                      (
                      [0] => Lorem
                      [1] => Ipsum
                      [3] => dummy
                      [6] => printing
                      [7] => industry
                      )


Demo on 3v4l.org

array_diff preserves the original keys. To index the result array from 0, wrap it in array_values:

print_r(array_values(array_diff($matches['match'], $except)));


                      Output:



                      Array
                      (
                      [0] => Lorem
                      [1] => Ipsum
                      [2] => dummy
                      [3] => printing
                      [4] => industry
                      )
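
One caveat worth illustrating: `array_diff()` compares values case-sensitively, so a capitalized stop word such as "The" at the start of a sentence would not be removed. The sketch below uses an assumed sample sentence (not from the original post) and shows one possible case-insensitive variant based on `array_udiff()` with `strcasecmp()`.

```php
<?php
// Assumed input: a sentence containing capitalized stop words.
$input = 'The printing industry and THE text.';
$except = array('and', 'the', 'text', 'simply');

preg_match_all('/\w{3,}/', $input, $matches);

// array_diff() compares strings case-sensitively, so "The" and "THE" survive.
$caseSensitive = array_values(array_diff($matches[0], $except));

// array_udiff() with strcasecmp() compares the values case-insensitively.
$caseInsensitive = array_values(array_udiff($matches[0], $except, 'strcasecmp'));

print_r($caseSensitive);
print_r($caseInsensitive);
```

Here `$caseSensitive` keeps "The" and "THE", while `$caseInsensitive` keeps only "printing" and "industry".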






answered Nov 23 '18 at 11:58
Nick























You could use array_diff() to eliminate the words you have in $except:

$input = 'Lorem Ipsum is simply dummy text of the printing industry.';
$except = array('and', 'the', 'text', 'simply');
preg_match_all('/(?<match>\w{3,}+)/', $input, $matches, PREG_PATTERN_ORDER);
$filtered = array_diff($matches['match'], $except);

var_dump($filtered);

                              // Output:
                              array(5) {
                              [0]=>
                              string(5) "Lorem"
                              [1]=>
                              string(5) "Ipsum"
                              [3]=>
                              string(5) "dummy"
                              [6]=>
                              string(8) "printing"
                              [7]=>
                              string(8) "industry"
                              }






answered Nov 23 '18 at 11:58
Jeff























                                      Here is an example using array_diff() with explode().



                                      $input = 'Lorem Ipsum is simply dummy text of the printing industry.';
                                      $inputArray = explode(' ', $input);
                                      $except = array('and', 'the', 'text', 'simply');
                                      $results = array_values(array_diff($inputArray, $except));

                                      echo '<pre>';
                                      print_r($results);
                                      echo '</pre>';


                                      This will output:



                                       Array
                                      (
                                      [0] => Lorem
                                      [1] => Ipsum
                                      [2] => is
                                      [3] => dummy
                                      [4] => of
                                      [5] => printing
                                      [6] => industry.
                                      )
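
The output above still contains the short words "is" and "of", and "industry." keeps its trailing period, because `explode(' ', ...)` splits on spaces only. A possible refinement is sketched below. It assumes the three-character minimum from the question's `\w{3,}` pattern is wanted, and it swaps `explode()` for `preg_split()` on non-word characters.

```php
<?php
$input = 'Lorem Ipsum is simply dummy text of the printing industry.';
$except = array('and', 'the', 'text', 'simply');

// Split on runs of non-word characters so "industry." becomes "industry".
$words = preg_split('/\W+/', $input, -1, PREG_SPLIT_NO_EMPTY);

// Assumption: words shorter than three characters are unwanted,
// mirroring the \w{3,} quantifier used in the question.
$words = array_filter($words, function ($word) {
    return strlen($word) >= 3;
});

// Remove the excluded words and reindex from 0.
$results = array_values(array_diff($words, $except));
print_r($results);
```

With the sample sentence, `$results` should contain Lorem, Ipsum, dummy, printing, and industry.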






answered Nov 23 '18 at 12:04
Joseph_J





























