PRNGs randomness versus uniformity.

General FreeBASIC programming questions.
deltarho[1859]
Posts: 2846
Joined: Jan 02, 2017 0:34
Location: UK

PRNGs randomness versus uniformity.

Postby deltarho[1859] » Sep 23, 2018 10:18

As one of my 'real life' tests, I have a 640*400 graphics screen with a yellow foreground and blue background. The following code was executed for the FB generators #1 to #4 and 7 of my own generators.

I was interested in two things: How long to complete the test and how many 'yellow pixels' were left standing, so to speak.

Code: Select all

t = Timer
For j = 1 to 10
  For i = 1 to 640*400
    PSet ( Int(Rnd*640), Int(Rnd*400) ) ' Replacing the FB Rnds with the other generators
  Next
Next
t = Timer - t

What this proved was that PCG32II did very well and was only slightly beaten by CMWC4096 and the generator from Numerical Recipes; the latter being a very fast 32-bit LCG.

What caught my eye was all of the generators left pretty much the same number of yellow pixels standing averaging about 10 to 15 pixels. However, one noticeable exception was FB #4.

From the manual, FB #4 is described as:
4 - Uses a function that is designed to give the same random number sequences as QBASIC. This should be stable across all platforms, and provides 24-bit precision, with a low degree of randomness.

FB #4 fails PractRand at 4KB. PCG32II goes to at least 16TB with Practrand.

FB #4 had between 1 and 3 yellow pixels standing. What this told me was FB #4 whilst not passing muster in the randomness stakes seemed to be doing exceptionally well in the uniformity stakes.

If we take our 640*400 screen and wrote an algorithm to Pset from the left and work its way to the write such that no yellow pixels were left then we would have created a uniform distribution. However, that behaviour would in no way qualify as being random. A random distribution would see yellow pixels changing to blue in an unpredictable manner.

Now Monte Carlo methods are not sensitive to randomness but they are sensitive to uniformity. Most people use good quality random number generators because they tend to uniformity the more random numbers we request. A cryptographic PRNG will eventually produce perfect uniformity but they are very slow in comparison with PRNGs.

In view of this, it occurred to me that FB #4 may be a good candidate for Monte Carlo simulation. So, I ran a Montecarlo test to estimate PI and got this from five runs of 20 million pairs of random numbers and got this.

Code: Select all

FB #4      3.141453 0.9999556105437050 *
PCG32II    3.141148 0.9998583350424872
 
FB #4      3.141481 0.9999645868824955 *
PCG32II    3.140841 0.9997608048933606
 
FB #4      3.141667 0.9999762080447450 *
PCG32II    3.141793 0.9999363591313422
 
FB #4      3.141526 0.9999789108273737 *
PCG32II    3.141880 0.9999084795116608
 
FB #4      3.141593 0.9999997624103084 *
PCG32II    3.141638 0.9999856296578152
 
PI = 3.14159265358979323846

The second column of numbers indicates the number of significant figures achieved and * is the better of the two generators in this respect.

FB #4 is the more accurate than PCG32II with every run. Look at the fifth run!

FB #4 is not the fastest generator on the block but it is faster than Mersenne Twister and, clearly, has a better level of uniformity than PCG32II for the same number of random number requests. PCG32II will reach the same level of uniformity but only after more requests than FB #4.

In another thread dodicat wrote:
randomize 4 seems the best for accuracy.
It reaches a steady 5 decimal places after about 20 seconds.
The others never seem to get 5 places even after a coffee run.

That is true.

However, he also wrote:
But it seems to produce the best randomness (5 places in pi/4 within a few seconds).

which is false from both the manual and PractRand.

It has, from the above, the best uniformity before anything else.

I think in the same thread jj2007 and I reckoned that the FB generators should be 'dumped' and replaced by newer generators such as PCG32II. FB #4 should be kept for Monte Carlo simulation if nothing else.

If you are 'into' Monte Carlo simulation and are not using FB #4 it is worthwhile looking at.
jj2007
Posts: 1937
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: PRNGs randomness versus uniformity.

Postby jj2007 » Sep 23, 2018 12:35

deltarho[1859] wrote:What this proved was that PCG32II did very well and was only slightly beaten by CMWC4096 and the generator from Numerical Recipes; the latter being a very fast 32-bit LCG.
With "beaten" you mean speed, or also randomness as measured with PractRand? I am afraid I am a bit lost in the PRNG #2 mega-thread... can you post a link to the PCG32II and the Numerical Recipes algos that you consider the most valid ones? Googling for Numerical Recipes random produces many results but not one single algo (with constants, multipliers etc) that could serve as a basis for benchmarks.

Some concrete algos are listed in Random Numbers for Simulations by Roberto Innocente, but without any test results.

On a side note, some speak of multiply-with-carry implementations. I wonder if PractRand results change when switching to carry-less multiplication using the fast PCLMULQDQ instruction. Again, for testing one would need an authoritative reference (=YOU) for deciding on which algo to pick.
deltarho[1859]
Posts: 2846
Joined: Jan 02, 2017 0:34
Location: UK

Re: PRNGs randomness versus uniformity.

Postby deltarho[1859] » Sep 23, 2018 13:52

jj2007 wrote:With "beaten" you mean speed

I should have been more specific but it did follow from my second paragraph: "I was interested in two things"

pcg32_random
Numerical Recipes at Parameters in common use.

The Numerical Recipes book is mentioned here.

With regard the multiply-with-carry link that is a variant CMWC which I use in my CMWC4096. Without checking I am not sure if that website is exactly the same as mine because Marsaglia added a tweak just before he died and I don't think many people caught onto it.
I wonder if PractRand results change when switching to carry-less multiplication using the fast PCLMULQDQ instruction.

I have no idea. A lot of the asm in CMWC4096 was written by John Gleason at the PowerBASIC forums in 2012 and since PCLMULQDQ was introduced in 2010 John may not have been aware of it.

I should add that I am now firmly against this sort of test:

Code: Select all

t = Timer
For i = 1 To 10^8
  x = <random [0,1)>
  asm nop or asm "nop"
Next
t = Timer - t
Print 100/t

That is a laboratory test. I prefer to give generators 'real life' tests because that is where they will actually get used. The rankings tend to move around dependent on the type of task. Having said that PCG32II is always 'up there' in the clouds which is why I favour it. If ever it does get beaten on speed it is only marginally so.
jj2007
Posts: 1937
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: PRNGs randomness versus uniformity.

Postby jj2007 » Sep 23, 2018 14:28

deltarho[1859] wrote:Numerical Recipes at Parameters in common use.
Thanks, but the Wiki quotes very different parameters: 1664525 and 10139042. I have seen several times 4101842887655102017ll as one of the parameters...
deltarho[1859]
Posts: 2846
Joined: Jan 02, 2017 0:34
Location: UK

Re: PRNGs randomness versus uniformity.

Postby deltarho[1859] » Sep 23, 2018 14:51

jj2007 wrote:but the Wiki quotes very different parameters

Very different to what? I didn't mention any parameter values.

The WiKi entry is 1664525 and 1013904223

4101842887655102017ll is for a 64-bit generator - Numerical Recipes is a 32-bit generator.

'll'? I've never seen anyone use a Longint - Ulongint is normally used. Perhaps that was a typo and I am being pedantic but we have to be in this game.
jj2007
Posts: 1937
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: PRNGs randomness versus uniformity.

Postby jj2007 » Sep 23, 2018 16:49

deltarho[1859] wrote:4101842887655102017ll is for a 64-bit generator - Numerical Recipes is a 32-bit generator.
Google numerical recipes "4101842887655102017ll" - this is what I mean, the Internet is full of "hits" but I miss one authoritative source that describes what exactly it is, and how it can be calculated. There are also hits for C or C++ sources, I tried some and got a bunch of errors. Probably because C is so "portable". I hate it.

Just found this jewel:
(22 Apr 1992)
"The NR-recommended random number generators RAN1 and RAN2 should not be used for any serious application. If you use the top bit of RAN1 to create a discrete random walk (plus or minus 1 with equal probability) of length 10,000, the variance will be around 1500, far below the desired value of 10,000.
"Both are low-modulus generators with a shuffling buffer, in one case with the bottom bits twiddled with another low-modulus generator. The moduli are just too low for serious work, and the resulting generators even out too well."
deltarho[1859]
Posts: 2846
Joined: Jan 02, 2017 0:34
Location: UK

Re: PRNGs randomness versus uniformity.

Postby deltarho[1859] » Sep 23, 2018 17:16

I see. It is not used in a classic LCG which would use Ulongint.

I found A Java implementation of the Numerical Recipies random number generator
propose a random number generator which they advocate as giving a good compromise between quality and speed.

It is a combined generator: two XORShift generators are combined with an LCG and a multiply with carry generator.

The next edition of the book will include a kitchen sink.

So, be prepared to nod off waiting for this to knock out any random numbers.

Added: Just saw your addition. A wide berth seems to be needed to avoid damaging our hull. <smile>

So, back to this thread's theme: If you are 'into' Monte Carlo simulation and are not using FB #4 it is worthwhile looking at.
deltarho[1859]
Posts: 2846
Joined: Jan 02, 2017 0:34
Location: UK

Re: PRNGs randomness versus uniformity.

Postby deltarho[1859] » Sep 26, 2018 22:10

I was looking at ENT today.

Chi-square is good at telling us whether data is uniform or not.

Arithmetic mean is good at telling us whether data is uniform or not.

Monte Carlo is telling us whether data is uniform or not.

Only the Serial correlation coefficient is telling us that data may be random.

So, three out of four metrics are about uniformity and not randomness.

ENT then is good at telling us if data is not random but virtually useless at telling us whether something is random.

PractRand looks at both uniformity and randomness.
sean_vn
Posts: 283
Joined: Aug 06, 2012 8:26

Re: PRNGs randomness versus uniformity.

Postby sean_vn » Sep 26, 2018 23:13

The upper bits of a LCG where you use only every second value seems good for uniformity. I also have experimented with using the bswap instruction with LCG.
If you generate permutations from a RNG the distribution of those permutations seems like a good test but of course only works on very small sets.
If there was an efficient way of doing large integer addition then repeatedly adding PHI-1 (in fixed point integer arithmetic)
https://en.wikipedia.org/wiki/Low-discrepancy_sequence#Additive_recurrence
and hashing that large integer would give a good RNG. However all the add with carry instructions needed on current intel/amd cpus create horrible dependency chains that likely would stall up the cpu badly.

Golden ratio in hex: 1.9E3779B97F4A7C15F39 from
https://en.wikipedia.org/wiki/Golden_ratio

Return to “General”

Who is online

Users browsing this forum: No registered users and 12 guests