• Welcome to the new Internet Infidels Discussion Board, formerly Talk Freethought.

The Programming Thread

There are at least two "cryptographically secure" PRNGs with source code in the public domain. The most famous of these might be the Mersenne Twister (MT19937) with a cycle of (219937-1). We might assume that "security" is irrelevant but these random generators generate numbers which are VERY . . . random! Why not use them?

One possible reason to avoid them would be the computational cost of computing the next pseudo-random number in sequence. HOWEVER it is possible to minimize the consumption of the random bits.

Obviously, a binary decision with probabilities (0.5, 0.5) can be achieved with a SINGLE bit from the random-number generator. In other words we need only call MT19937 once to get 32 such random decisions.

Less obvious is that a binary decision like (0.8173, 0.1827) can be achieved with just 0.687 random bits on average.

Similarly, a uniform integer on N16 can be achieved with 4 random bits, and a uniform integer on N17 can be achieved with just 4.09 bits on average.

Is this hint enough to see how it's done?
 
The Mersenne Twister 32 bit is the default C++ library random number generator. there is also a 64 bit Mersenne Twister and a linear congruential generator.

I downloaded the Mersenne Twister code.

I have been using 10^10 iterations for testing and comparing URAND to C++ Mersenne Twister and C rand(). The C++ Mersenne Twister executes much slower than the URAND I posted. To be expected, the URAND code has one multiplication and one addition for each iteration.


In terms of frequency testing URAND look like the C++ Mersenne Twister in terms of randomness. They are equally close to a true unform distribution in terms of the number of coornces of a number. Both are an order of magnitude better than C rand() as a unform approximation.

Plotting URANFD and Mersenne Twister both look like random eectricall noise. I took the Fourier trasform of both and they both have the flat trasform of random noise as I would see with elecrcal noise.

That leaves the period and namdomness of intervals between occurences of a number, the gap test.

I have been codng a gap test.

Which to use? It alweys coes down to the application. The linear comgruential algorithm is simple, compact, and fast with reasonbale charteristcs. Suitable for the kind of simualtions I did on systems.
 
Execution times C++ Mersenne Twister vs LG.

for 10^10 iterations C++ took 68 seconds, URAND LG took .016 seconds.

Made a dumb mistake at first. Make sure the compiler is in release mode not debug, it runs slower in debug.

MS documentation says other than hardware devices Mersenne Twister has the highest performance, LG the lowest.


Code:
#include <random>
#include <chrono>

int main(){
    long long i,n = pow(10,10);
    int x;
//   Mersenne Twister 32 bit
//   unsigned seed = chrono::system_clock::now().time_since_epoch().count();
//   default_random_engine generator (seed);
//    uniform_int_distribution<int> dist (0,100); // random ints 0-100
//    for(i=0;i<n;i++)x = dist(generator);;
  
    //URAND LG 32 bit
    int sf = 100,os = 0;
    unsigned m = 0x7fffffff;
    unsigned k1 = 907633385,k2 = 1686629725;
    unsigned rand_num = 1;
    for(i=0;i<n;i++){
        rand_num = rand_num*k2 + k1;
        x = (rand_num%m)%sf + os; // random ints 0-100
       }
  return 0;
  }
 
Last edited:
My little library has several PRNGs to choose from, though I've found no reason to use something faster than Mersenne Twister. When I time these various PRNGs, the times to call the PRNG 5 billion times range from 55 seconds (Bob Jenkin's Burtle PRNG) to 119 seconds (Marsaglia's KISS PRNG). GNU's compound Tausworthe (random(), 69 seconds) and the Mersenne Twister (83 seconds) are about midway between Burtle (which emphasizes speed) and Marsaglia (which emphasizes ridiculously long period).

These numbers "compare apples with apples." They were measured with the same compiler and overhead on my VERY old laptop running cygwin. Much of the time was spent on loop-handling etc. OUTSIDE the actual PRNG. Looking at the source code for the PRNGs we see that, Yes, the time closely parallels the number of operations in the critical code.

As you can see, ALL these generators are fast enough. In a real application the choice of PRNG will have almost no effect on net speed.

As I've mentioned, applications which consume 32 random bits for a simple switch can often make do with a single bit or even less. . . .
Obviously, a binary decision with probabilities (0.5, 0.5) can be achieved with a SINGLE bit from the random-number generator. In other words we need only call MT19937 once to get 32 such random decisions.

Less obvious is that a binary decision like (0.8173, 0.1827) can be achieved with just 0.687 random bits on average.

Similarly, a uniform integer on N16 can be achieved with 4 random bits, and a uniform integer on N17 can be achieved with just 4.09 bits on average.

Is this hint enough to see how it's done?

However this is just a fun exercise! The PRNGs are all so fast that saving those random bits is pointless!

Execution times C++ Mersenne Twister vs LG.

for 10^10 iterations C++ took 68 seconds, URAND LG took .016 seconds.
Obviously something went badly wrong in your experiment! I'm sure you'll agree that you didn't do 10^10 iterations of anything in 16 milliseconds!
 
Don;t have a clue what you are 'hinting' at. I scanned through the Mersenne algorithm and there is plenty of code on the net. No mystery or puzzle.

Feel free to post complete working debugged code that others can run.
 
steve said:
Don;t have a clue what you are 'hinting' at. I scanned through the Mersenne algorithm and there is plenty of code on the net. No mystery or puzzle.

Most of my comments had nothing whatsoever to do with Mersenne Twister.
My question was about your "URAND LG". Perhaps you should show us that code.

Execution times C++ Mersenne Twister vs LG.

for 10^10 iterations C++ took 68 seconds, URAND LG took .016 seconds.
Obviously something went badly wrong in your experiment! I'm sure you'll agree that you didn't do 10^10 iterations of anything in 16 milliseconds!

No comment about this? :confused2:

- - - - - - - - - - - - - - - - - -

I will post two miscellaneous comments about my random-number routines.

(1) when choosing a uniform real variate over (0, 1) one starts with a specified number of bits. By default my library routines use 31 bits. The variate is then chosen to be distributed uniformly over
{1,3,5,7,9,... , 4294967291, 4294967293, 232-1 } ÷ 232
Note that this chooses a number from (0, 1) -- not [0,1) as many or most implementations do. Note that the mean will be 0.5 exactly. Perfectionistic? Sure, but why not? Steve, modify your code to use, say 8 bits instead of 32 bits and se what mean value you get.

(2) I'll present pseudo-code to illustrate how to minimize the number of random bits consumed by smallish selections and binary decision.

Let Nk denote {0, 1, 2, 3, ..., k-1 }
Internally, the code maintains a pair of integers (r, m). r is a uniformly-distributed random variable over Nm
I will describe two routines Augment and Decide.

Augment (k) // used to increase ("augment") the precision of (r, m)​
Retrieve k bits from the PRNG; i.e. retrieve a random x uniform over N2k
Replace (r, m) as follows. Obviously shifts will be used instead of multiplies.​
r ← r * 2k + x​
m ← m * 2k
Decide (p) // return 1 with probability p, 0 with probability 1-p​
If m is "too small" call Augment (k) for some appropriate k (e.g. make m a 31-bit number).​
Approximate p as s/m for some integer s.​
If r < s Then​
m ← s​
Return 1​
Else (r ≥ s) Then​
m ← m - s​
r ← r - s​
Return 0​

The basic idea is that there is UNUSED randomness available for free after a Decide. Keep that randomness, rather than discarding it and "wasting" a call to the 32-bit PRNG every time. (As mentioned, even very good PRNGs are so good that the technique, however "nifty", has no importance unless you're working with a very expensive physical RNG.)

Anyone who takes the time to study this "Decide (p)" routine -- perhaps working simple examples -- and then understands how it works, should give themself a very big pat on the back! I've chatted with enough expert programmers to realize that this method -- however simple the above pseudo-code may appear -- is likely to baffle. Kudos to anyone who can pursue it all the way to an "Aha!" moment.
 
And to close out the topic, comparing rand(), URAND , and the C++ imputation of mt19937.

Te first test is cumulative distribution. For all tree generators the plot is a straight line as expected. For
random ints 0-99 and 2^20 iterations

URAND median = 500
rand() median = 496
mt19937 median = 500

The second test is a frequency test, IOW a histogram.

For 10^10 iterations and random ints 0-99.

1. Each nonoccurence of each number in each generator is counted.
2. The percipient deviation from a theoretical perfect uniform distribution is calculated which can be positive or negative.
3, The average of the absolute values of the deviations is calculated

Typical average deviations from theoretical, they move around a little on each run.

URAND 0.00752%
mt19937 0.0251%
rand() 1.09125%

The counts and deviations for each number generated by rand() show a clear bias. Lower numbers are all on the high side, higher numbers are all on the low side. There is no randomness.

There are comments on the net about avoiding using rand(), and I see why.

URAND looks ok, it has a smaller period than mt19937. That could be addressed by tinging m to a 64 bit lint.

I;d say URAND is ok, it has a smaller period than mt19937
It is an older out of print book but was a re fence text, Knuth;s Semi Numeral Algorithms. There are PDFs, it has a lengthy chapter on random generators and generator testing.

ps://seriouscomputerist.atariverse.com/media/pdf/book/Art%20of%20Computer%20Programming%20-%20Volume%202%20(Seminumerical%20Algorithms).pdf


Code:
#include <chrono>
#include <random>


typedef mt19937 random_engine;
const unsigned seed = chrono::system_clock::now().time_since_epoch().count();
random_engine generator (seed);



double mers_doub(int lo,int hi){
    uniform_real_distribution<double> dist (lo,hi);
   return dist(generator);
}//mers_doub()

int mers_int(int lo,int hi){
    uniform_int_distribution<int> dist (lo,hi);
   return dist(generator);
}

int urand_int(int lo,int hi){
        static int init = 1;
        unsigned m = 0x7fffffff;
        unsigned k1 = 907633385,k2 = 1686629725;
        static unsigned rand_num = 1;
        if(init){rand_num = time(NULL),init = 0;}
        rand_num = rand_num*k2 + k1;
        int sf = 1 + hi - lo;
        return (rand_num%m)%sf + lo;
}//urand_int()

int comp_int_ascend(const void * a, const void * b) {
  if (*(int*)a > *(int*)b) return 1;
  else if (*(int*)a < *(int*)b) return -1;
  else return 0;
}

void cum_dist_test_int(string t){
   long long i,n = pow(2,20);
    int lo,hi,min,max,median;
    double *cd = new double[n];
    long long s;
    int   *y = new int[n];
    lo = 0;hi = 1000;
    min = hi + 1;max = lo - 11;
    srand(time(NULL));
    for(i=0;i<n;i++)cd[i] = 100*double(i)/double(n);
    if(t == "u")for(i=0;i<n;i++)y[i] = urand_int(lo,hi);
    if(t == "m")for(i=0;i<n;i++)y[i] = mers_int(lo,hi);
    if(t == "r")for(i=0;i<n;i++)y[i] = (rand()%RAND_MAX)%hi;
    qsort(y,n,sizeof(int),comp_int_ascend);
    for(i=0;i<n;i++){
        if(cd[i] >= 50.){
            median = y[i];
            break;
            }
        }
    for(i=0;i<n;i++){
         s += y[i];
         if(y[i]<min)min = y[i];
         if(y[i]>max)max = y[i];
     }
     printf("Min  %d  Max  %d Median  %d\n",min,max,median);
    FILE *p = fopen("urand.dat","w");
    for(i=0;i<n;i++)fprintf(p,"%d \t %.15f\n",y[i],cd[i]);
    fclose(p);
    delete [] y;
    delete [] cd;
 }//cum_dist_tes_int()

void rand_freq_test(void){
    cout<<"rand_freq_test"<<endl;
   //-------------------------------------------
   //C++
    typedef mt19937 random_engine;
    unsigned seed = chrono::system_clock::now().time_since_epoch().count();
    random_engine generator (seed);
    //-----------------------------------------
    //URAND
    unsigned m = 0x7fffffff;
    unsigned k1 = 907633385,k2 = 1686629725;
    unsigned rand_num = time(NULL);
    //------------------------------------------
    long long i,n = pow(10,10);
    int xu,xm,xr;
    int hi = 1000;
    long long countu[hi],countm[hi],countr[hi];
    double su = 0,sm = 0,sr=0,pcu[hi],pcm[hi],pcr[hi];
    for(i=0;i<hi;i++){countu[i] = 0;countm[i] = 0,countr[i] = 0;}
    uniform_int_distribution<int> dist (0,hi-1);
    srand(time(NULL));
    for(i=0;i<n;i++){
        rand_num = rand_num*k2 + k1;
        xu =(rand_num%m)%hi;
        xm = dist(generator);
        xr = (rand()%RAND_MAX)%hi;
        countu[xu]++;
        countm[xm]++;
        countr[xr]++;
        }
    int k = n/hi;  // k,n powers of 10
    for(i=0;i<hi;i++){
        pcu[i] = 100*(double(countu[i]) - k)/k;
        su += abs(pcu[i]);
        pcm[i] = 100*(double(countm[i]) - k)/k;
        sm += abs(pcm[i]);
        pcr[i] = 100*(double(countr[i]) - k)/k;
        sr += abs(pcr[i]);
        }
    su = su/hi;
    sm = sm/hi;
    sr = sr/hi;
    printf("Average Deviation  URAND %.5f mt19937  %.5f  rand()  %.5f\n",su,sm,sr);
    FILE *p = fopen("freqs.dat","w");
    for(i=0;i< hi;i++)fprintf(p,"%lld  %lldl   %+.5f   %lld  %+.5f   %lld  %+.5f\n",
        i,countr[i],pcr[i],countu[i],pcu[i],countm[i],pcm[i]);
    fclose(p);
}//rand_freq_test()

int main(void){
    //cum_dist_test_int("m");
   rand_freq_test();
   return 0
}
 
Last edited:
A section of the frequncy output file.

First column is the numer, secnd is the counts fr rand()m abd te trhird column s the deviation form theretical.

Code:
     rand() count,percnt  URAND              mt19937       
761  100902l   +0.90200   100011  +0.01100   100582  +0.58200
762  100472l   +0.47200   99783  -0.21700   99770  -0.23000
763  101136l   +1.13600   100047  +0.04700   99717  -0.28300
764  100661l   +0.66100   100220  +0.22000   100507  +0.50700
765  100704l   +0.70400   99802  -0.19800   100327  +0.32700
766  100687l   +0.68700   100131  +0.13100   100041  +0.04100
767  97911l   -2.08900   100712  +0.71200   100540  +0.54000
768  97429l   -2.57100   99495  -0.50500   99999  -0.00100
769  97485l   -2.51500   99550  -0.45000   100360  +0.36000
770  97276l   -2.72400   99697  -0.30300   99926  -0.07400
771  97414l   -2.58600   99761  -0.23900   99748  -0.25200
772  97407l   -2.59300   100659  +0.65900   100400  +0.40000
773  97982l   -2.01800   99857  -0.14300   100028  +0.02800
774  97181l   -2.81900   99664  -0.33600   100125  +0.12500
 
Back
Top Bottom