Producing random variates distributed hypergeometrically











up vote
2
down vote

favorite












How can we generate random variates for a hypergeometric distribution in C++? Unfortunately, the standard library does not supply a std::hypergeometric_distribution. There is a boost::math::hypergeometric_distribution; however, it appears that it cannot be used for generating random variates.










share|improve this question


























    up vote
    2
    down vote

    favorite












    How can we generate random variates for a hypergeometric distribution in C++? Unfortunately, the standard library does not supply a std::hypergeometric_distribution. There is a boost::math::hypergeometric_distribution; however, it appears that it cannot be used for generating random variates.










    share|improve this question
























      up vote
      2
      down vote

      favorite









      up vote
      2
      down vote

      favorite











      How can we generate random variates for a hypergeometric distribution in C++? Unfortunately, the standard library does not supply a std::hypergeometric_distribution. There is a boost::math::hypergeometric_distribution; however, it appears that it cannot be used for generating random variates.










      share|improve this question













      How can we generate random variates for a hypergeometric distribution in C++? Unfortunately, the standard library does not supply a std::hypergeometric_distribution. There is a boost::math::hypergeometric_distribution; however, it appears that it cannot be used for generating random variates.







      c++ random boost distribution






      share|improve this question













      share|improve this question











      share|improve this question




      share|improve this question










      asked Nov 7 at 19:10









      user1494080

      6932622




      6932622
























          1 Answer
          1






          active

          oldest

          votes

















          up vote
          0
          down vote













          Well, here is simple code (perhaps with bugs, though it compiles, runs and produced something resembling the truth) - sampling Hypergeometric distribution. Notations are the same as in Wiki article. Internally, it fill array with values [0...N-1], shuffle first n of them and count if shuffled values are less than K (successes).



          Shuffle is classic Fisher-Yates-Knuth one.



          #include <cstdint>
          #include <iostream>
          #include <random>
          #include <vector>
          #include <numeric>
          #include <cmath>

          #define func auto


          func nCk(uint64_t n, uint64_t k) -> double { // computes binomial coefficient
          if (k > n)
          return 0.0;

          if (k * 2ULL > n)
          k = n - k;

          if (k == 0.0)
          return 1.0;

          return exp(lgamma(n+1) - (lgamma(k+1) + lgamma(n-k+1)));
          }


          func PMF(uint64_t N, uint64_t K, uint64_t n, uint64_t k) -> double { // Hypergeometric distribution PMF
          return nCk(K, k) * nCk(N - K, n - k) / nCk(N, n);
          }


          func sample_hyper(int N, int K, int n, // sampling from Hypergeometric distribution
          std::mt19937_64& rng,
          std::vector<int>& nums) -> int {

          int rc{ 0 };
          for (int k = 0; k != n; ++k) {
          std::uniform_int_distribution<int> uni{ k, N-1 };
          auto s = uni(rng);
          std::swap(nums[k], nums[s]);

          rc += (nums[k] < K);
          }
          return rc;
          }


          func main() -> int {
          auto rng = std::mt19937_64{1234567891ULL};

          int N = 500; // compare with Wiki
          int K = 50;
          int n = 100;

          auto nums = std::vector<int>( N );
          std::iota(nums.begin(), nums.end(), 0); // array to shuffle, filled with 0, 1, 2, 3 ... sequence

          auto h = std::vector<int>( K, 0 ); // histogram

          int NT = 100000; // number of trials
          for(int k = 0; k != NT; ++k) {
          int q = sample_hyper(N, K, n, rng, nums);
          h[q] += 1; // count it
          }

          for (int k = 0; k != 20; ++k) { // and print it
          double e = double(h[k]) / double(NT);
          std::cout << k << " " << e << " " << PMF(N, K, n, k) << 'n';
          }

          return 0;
          }


          UPDATE



          I've implemented PMF for hypergeometric distribution, and my sampling seems to be consistent with it. Please check updated code.






          share|improve this answer























            Your Answer






            StackExchange.ifUsing("editor", function () {
            StackExchange.using("externalEditor", function () {
            StackExchange.using("snippets", function () {
            StackExchange.snippets.init();
            });
            });
            }, "code-snippets");

            StackExchange.ready(function() {
            var channelOptions = {
            tags: "".split(" "),
            id: "1"
            };
            initTagRenderer("".split(" "), "".split(" "), channelOptions);

            StackExchange.using("externalEditor", function() {
            // Have to fire editor after snippets, if snippets enabled
            if (StackExchange.settings.snippets.snippetsEnabled) {
            StackExchange.using("snippets", function() {
            createEditor();
            });
            }
            else {
            createEditor();
            }
            });

            function createEditor() {
            StackExchange.prepareEditor({
            heartbeatType: 'answer',
            convertImagesToLinks: true,
            noModals: true,
            showLowRepImageUploadWarning: true,
            reputationToPostImages: 10,
            bindNavPrevention: true,
            postfix: "",
            imageUploader: {
            brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
            contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
            allowUrls: true
            },
            onDemand: true,
            discardSelector: ".discard-answer"
            ,immediatelyShowMarkdownHelp:true
            });


            }
            });














             

            draft saved


            draft discarded


















            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53196221%2fproducing-random-variates-distributed-hypergeometrically%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown

























            1 Answer
            1






            active

            oldest

            votes








            1 Answer
            1






            active

            oldest

            votes









            active

            oldest

            votes






            active

            oldest

            votes








            up vote
            0
            down vote













            Well, here is simple code (perhaps with bugs, though it compiles, runs and produced something resembling the truth) - sampling Hypergeometric distribution. Notations are the same as in Wiki article. Internally, it fill array with values [0...N-1], shuffle first n of them and count if shuffled values are less than K (successes).



            Shuffle is classic Fisher-Yates-Knuth one.



            #include <cstdint>
            #include <iostream>
            #include <random>
            #include <vector>
            #include <numeric>
            #include <cmath>

            #define func auto


            func nCk(uint64_t n, uint64_t k) -> double { // computes binomial coefficient
            if (k > n)
            return 0.0;

            if (k * 2ULL > n)
            k = n - k;

            if (k == 0.0)
            return 1.0;

            return exp(lgamma(n+1) - (lgamma(k+1) + lgamma(n-k+1)));
            }


            func PMF(uint64_t N, uint64_t K, uint64_t n, uint64_t k) -> double { // Hypergeometric distribution PMF
            return nCk(K, k) * nCk(N - K, n - k) / nCk(N, n);
            }


            func sample_hyper(int N, int K, int n, // sampling from Hypergeometric distribution
            std::mt19937_64& rng,
            std::vector<int>& nums) -> int {

            int rc{ 0 };
            for (int k = 0; k != n; ++k) {
            std::uniform_int_distribution<int> uni{ k, N-1 };
            auto s = uni(rng);
            std::swap(nums[k], nums[s]);

            rc += (nums[k] < K);
            }
            return rc;
            }


            func main() -> int {
            auto rng = std::mt19937_64{1234567891ULL};

            int N = 500; // compare with Wiki
            int K = 50;
            int n = 100;

            auto nums = std::vector<int>( N );
            std::iota(nums.begin(), nums.end(), 0); // array to shuffle, filled with 0, 1, 2, 3 ... sequence

            auto h = std::vector<int>( K, 0 ); // histogram

            int NT = 100000; // number of trials
            for(int k = 0; k != NT; ++k) {
            int q = sample_hyper(N, K, n, rng, nums);
            h[q] += 1; // count it
            }

            for (int k = 0; k != 20; ++k) { // and print it
            double e = double(h[k]) / double(NT);
            std::cout << k << " " << e << " " << PMF(N, K, n, k) << 'n';
            }

            return 0;
            }


            UPDATE



            I've implemented PMF for hypergeometric distribution, and my sampling seems to be consistent with it. Please check updated code.






            share|improve this answer



























              up vote
              0
              down vote













              Well, here is simple code (perhaps with bugs, though it compiles, runs and produced something resembling the truth) - sampling Hypergeometric distribution. Notations are the same as in Wiki article. Internally, it fill array with values [0...N-1], shuffle first n of them and count if shuffled values are less than K (successes).



              Shuffle is classic Fisher-Yates-Knuth one.



              #include <cstdint>
              #include <iostream>
              #include <random>
              #include <vector>
              #include <numeric>
              #include <cmath>

              #define func auto


              func nCk(uint64_t n, uint64_t k) -> double { // computes binomial coefficient
              if (k > n)
              return 0.0;

              if (k * 2ULL > n)
              k = n - k;

              if (k == 0.0)
              return 1.0;

              return exp(lgamma(n+1) - (lgamma(k+1) + lgamma(n-k+1)));
              }


              func PMF(uint64_t N, uint64_t K, uint64_t n, uint64_t k) -> double { // Hypergeometric distribution PMF
              return nCk(K, k) * nCk(N - K, n - k) / nCk(N, n);
              }


              func sample_hyper(int N, int K, int n, // sampling from Hypergeometric distribution
              std::mt19937_64& rng,
              std::vector<int>& nums) -> int {

              int rc{ 0 };
              for (int k = 0; k != n; ++k) {
              std::uniform_int_distribution<int> uni{ k, N-1 };
              auto s = uni(rng);
              std::swap(nums[k], nums[s]);

              rc += (nums[k] < K);
              }
              return rc;
              }


              func main() -> int {
              auto rng = std::mt19937_64{1234567891ULL};

              int N = 500; // compare with Wiki
              int K = 50;
              int n = 100;

              auto nums = std::vector<int>( N );
              std::iota(nums.begin(), nums.end(), 0); // array to shuffle, filled with 0, 1, 2, 3 ... sequence

              auto h = std::vector<int>( K, 0 ); // histogram

              int NT = 100000; // number of trials
              for(int k = 0; k != NT; ++k) {
              int q = sample_hyper(N, K, n, rng, nums);
              h[q] += 1; // count it
              }

              for (int k = 0; k != 20; ++k) { // and print it
              double e = double(h[k]) / double(NT);
              std::cout << k << " " << e << " " << PMF(N, K, n, k) << 'n';
              }

              return 0;
              }


              UPDATE



              I've implemented PMF for hypergeometric distribution, and my sampling seems to be consistent with it. Please check updated code.






              share|improve this answer

























                up vote
                0
                down vote










                up vote
                0
                down vote









                Well, here is simple code (perhaps with bugs, though it compiles, runs and produced something resembling the truth) - sampling Hypergeometric distribution. Notations are the same as in Wiki article. Internally, it fill array with values [0...N-1], shuffle first n of them and count if shuffled values are less than K (successes).



                Shuffle is classic Fisher-Yates-Knuth one.



                #include <cstdint>
                #include <iostream>
                #include <random>
                #include <vector>
                #include <numeric>
                #include <cmath>

                #define func auto


                func nCk(uint64_t n, uint64_t k) -> double { // computes binomial coefficient
                if (k > n)
                return 0.0;

                if (k * 2ULL > n)
                k = n - k;

                if (k == 0.0)
                return 1.0;

                return exp(lgamma(n+1) - (lgamma(k+1) + lgamma(n-k+1)));
                }


                func PMF(uint64_t N, uint64_t K, uint64_t n, uint64_t k) -> double { // Hypergeometric distribution PMF
                return nCk(K, k) * nCk(N - K, n - k) / nCk(N, n);
                }


                func sample_hyper(int N, int K, int n, // sampling from Hypergeometric distribution
                std::mt19937_64& rng,
                std::vector<int>& nums) -> int {

                int rc{ 0 };
                for (int k = 0; k != n; ++k) {
                std::uniform_int_distribution<int> uni{ k, N-1 };
                auto s = uni(rng);
                std::swap(nums[k], nums[s]);

                rc += (nums[k] < K);
                }
                return rc;
                }


                func main() -> int {
                auto rng = std::mt19937_64{1234567891ULL};

                int N = 500; // compare with Wiki
                int K = 50;
                int n = 100;

                auto nums = std::vector<int>( N );
                std::iota(nums.begin(), nums.end(), 0); // array to shuffle, filled with 0, 1, 2, 3 ... sequence

                auto h = std::vector<int>( K, 0 ); // histogram

                int NT = 100000; // number of trials
                for(int k = 0; k != NT; ++k) {
                int q = sample_hyper(N, K, n, rng, nums);
                h[q] += 1; // count it
                }

                for (int k = 0; k != 20; ++k) { // and print it
                double e = double(h[k]) / double(NT);
                std::cout << k << " " << e << " " << PMF(N, K, n, k) << 'n';
                }

                return 0;
                }


                UPDATE



                I've implemented PMF for hypergeometric distribution, and my sampling seems to be consistent with it. Please check updated code.






                share|improve this answer














                Well, here is simple code (perhaps with bugs, though it compiles, runs and produced something resembling the truth) - sampling Hypergeometric distribution. Notations are the same as in Wiki article. Internally, it fill array with values [0...N-1], shuffle first n of them and count if shuffled values are less than K (successes).



                Shuffle is classic Fisher-Yates-Knuth one.



                #include <cstdint>
                #include <iostream>
                #include <random>
                #include <vector>
                #include <numeric>
                #include <cmath>

                #define func auto


                func nCk(uint64_t n, uint64_t k) -> double { // computes binomial coefficient
                if (k > n)
                return 0.0;

                if (k * 2ULL > n)
                k = n - k;

                if (k == 0.0)
                return 1.0;

                return exp(lgamma(n+1) - (lgamma(k+1) + lgamma(n-k+1)));
                }


                func PMF(uint64_t N, uint64_t K, uint64_t n, uint64_t k) -> double { // Hypergeometric distribution PMF
                return nCk(K, k) * nCk(N - K, n - k) / nCk(N, n);
                }


                func sample_hyper(int N, int K, int n, // sampling from Hypergeometric distribution
                std::mt19937_64& rng,
                std::vector<int>& nums) -> int {

                int rc{ 0 };
                for (int k = 0; k != n; ++k) {
                std::uniform_int_distribution<int> uni{ k, N-1 };
                auto s = uni(rng);
                std::swap(nums[k], nums[s]);

                rc += (nums[k] < K);
                }
                return rc;
                }


                func main() -> int {
                auto rng = std::mt19937_64{1234567891ULL};

                int N = 500; // compare with Wiki
                int K = 50;
                int n = 100;

                auto nums = std::vector<int>( N );
                std::iota(nums.begin(), nums.end(), 0); // array to shuffle, filled with 0, 1, 2, 3 ... sequence

                auto h = std::vector<int>( K, 0 ); // histogram

                int NT = 100000; // number of trials
                for(int k = 0; k != NT; ++k) {
                int q = sample_hyper(N, K, n, rng, nums);
                h[q] += 1; // count it
                }

                for (int k = 0; k != 20; ++k) { // and print it
                double e = double(h[k]) / double(NT);
                std::cout << k << " " << e << " " << PMF(N, K, n, k) << 'n';
                }

                return 0;
                }


                UPDATE



                I've implemented PMF for hypergeometric distribution, and my sampling seems to be consistent with it. Please check updated code.







                share|improve this answer














                share|improve this answer



                share|improve this answer








                edited Nov 8 at 23:42

























                answered Nov 7 at 20:43









                Severin Pappadeux

                9,16121331




                9,16121331






























                     

                    draft saved


                    draft discarded



















































                     


                    draft saved


                    draft discarded














                    StackExchange.ready(
                    function () {
                    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53196221%2fproducing-random-variates-distributed-hypergeometrically%23new-answer', 'question_page');
                    }
                    );

                    Post as a guest















                    Required, but never shown





















































                    Required, but never shown














                    Required, but never shown












                    Required, but never shown







                    Required, but never shown

































                    Required, but never shown














                    Required, but never shown












                    Required, but never shown







                    Required, but never shown







                    這個網誌中的熱門文章

                    Xamarin.form Move up view when keyboard appear

                    Post-Redirect-Get with Spring WebFlux and Thymeleaf

                    Anylogic : not able to use stopDelay()