FreeBASIC's PRNG #2

General FreeBASIC programming questions.
deltarho[1859]
Posts: 2181
Joined: Jan 02, 2017 0:34
Location: UK

Re: FreeBASIC's PRNG #2

@dodicat

Re chiSquare2g

Run the following:

Code: Select all

#Include "file.bi"

Dim As Ulong i, n = 4*1024*1024

Randomize 666, 3

If FileExists( "16MB.txt" ) Then Kill "16MB.txt"
Open "16MB.txt" For Binary As #1
For i = 1 To n
Put #1,, Cast( Ulong, Rnd*2^32 )
Next
Close #1
Print "Finished"
Sleep

This will create a 16MB file filled with Ulongs generated by FB #3. I have used a fixed seed so that your results will match mine.

Now run 'chisquare2g 16MB.txt' at the command line.

You will get this:

With Fourmilab's ENT I get this:

Code: Select all

Entropy = 7.999989 bits per byte.

Optimum compression would reduce the size
of this 16777216 byte file by 0 percent.

Chi square distribution for 16777216 samples is 246.23, and randomly
would exceed this value 64.18 percent of the times.

Arithmetic mean value of data bytes is 127.4657 (127.5 = random).
Monte Carlo value for Pi is 3.142236505 (error 0.02 percent).
Serial correlation coefficient is -0.000197 (totally uncorrelated = 0.0).

So,

Code: Select all

chiSquare2g => 64.231%
ENT         => 64.18%

<1% : >99% => Very strong evidence to suspect that data is not random
<5%, >=1% : >95%, <=99% => Strong evidence to suspect that data is not random
<10%, >=5% : >90%, <=95% => Some evidence to suspect that data is not random

otherwise no grounds to question data.

With FB #3 at 64%, we have a Chi-square pass.

*** The question of the day: Have I restored your faith? ***
dafhi
Posts: 1323
Joined: Jun 04, 2005 9:51

Re: FreeBASIC's PRNG #2

Think I've got something of cryptographic quality.

krr .. code doesn't run (looking at it from the future) .. will leave code up

Code: Select all

/' -- low granularity sequencer

When you run this, keep in mind i'm using a 2-bit internal state.
Bit size is adjustable; loop is at the bottom.

The base class lets me use <= 8 bits of granularity to represent
the output (usually lower grain than internal state)

Threw in 2 extra state vars in the sequencer class
output:  early repeats are fine, and in fact preferred.

This demo may contain errors.  I just kind of threw everything together

'/

type MiniBits

/' -- 2018 Sep 10 - by dafhi

simulate <= 8 bit integers

'/

as ubyte          u
as ushort         c ''2018 Sep 9 (from ubyte)

declare property  bits as ubyte
declare property  bits( as ubyte)

declare operator  let( as ubyte)
declare operator  let( as MiniBits)
declare operator  cast as ubyte
declare operator  cast as string

declare sub       ror(as byte) '' bit rotate

declare constructor( as ubyte = 0, as ubyte = 0)
declare constructor( as MiniBits)

private:
as ubyte          b, _bits
End Type

constructor MiniBits(i as ubyte, b as ubyte)
if b = 0 then b = 4
bits = b
this = i
End Constructor
constructor MiniBits(i as MiniBits)
constructor i.b
End Constructor

operator MiniBits.let(i as MiniBits)
this = i.b
End Operator
operator MiniBits.let(i as ubyte)
b = i and u
End Operator

operator MiniBits.cast as ubyte
return b
End Operator
operator MiniBits.cast as string
return str(cubyte(this))
End Operator

property MiniBits.bits(i as ubyte)
i = (i-1)and 7:  _bits = i + 1:  c = 2 ^ _bits:  u = c - 1
End Property
property MiniBits.bits as ubyte
return _bits
End Property

sub MiniBits.ror(i as byte)
i -= _bits * int(i / _bits)
b = (b shl (_bits-i) or  b shr i)and u
End Sub

type Sequencer
as MiniBits             state = type(,2) 'bits, val
as MiniBits             ret = type(,2)

declare constructor(as ubyte=3, as ubyte=1, as ubyte=1, as ubyte=2, as ubyte=2)
declare constructor(as sequencer)

declare operator        cast as ubyte
declare operator        cast as string
declare operator        let( as sequencer)

declare sub             iterate

as ubyte                mul
as ubyte                ror

'' extra state vars
as minibits             a, b

private:
declare sub             _let(as ubyte, as ubyte, as ubyte, as ubyte, as ubyte)
End Type

sub Sequencer._let(m as ubyte, _add as ubyte, r as ubyte, retbits as ubyte, statebits as ubyte)
state.bits = statebits
a.bits = state.bits
b.bits = state.bits
End Sub

constructor Sequencer(m as ubyte, _add as ubyte, r as ubyte, retbits as ubyte, statebits as ubyte)
_let m, _add, r, retbits, statebits
End Constructor

constructor Sequencer(i as sequencer)
constructor i.mul, i.add, i.ror, i.ret.bits, i.state.bits
End Constructor

operator Sequencer.Let(i as sequencer)
_let i.mul, i.add, i.ror, i.ret.bits, i.state.bits
End Operator

operator Sequencer.cast as ubyte
iterate:  return ret
End Operator

operator Sequencer.cast as string
return str(cubyte(this))
End Operator

/' -- ghost copy.  moved to vicinity of loop (bottom of page) for easy experimentation
sub Sequencer.iterate

/'  internal state variables (usually higher than output)  '/

a += 1        '- state=1
state.ror ror '+ state shr (state.bits \ 2)
b = mul * (state + add)
state = a xor b

/' regular strength '/
ret = b
End Sub
'/

type myint as integer

#ifndef max
#define max(i,j) iif(i>j, i, j)
#define min(i,j) iif(i<j, i, j)
#EndIf

function get_period(A as sequencer, show as boolean = false)as myint

dim as myint      c = min( 2^( A.state.bits*2 + 1), 1e7 )
dim as myint      i, j, offs, bins(A.ret.u), seq(1 to c), missing

for i = 1 to ubound(seq)
seq(i) = A:  bins( seq(i) ) += 1
'? seq(i);
Next
'?: ?

for i = 0 to A.ret.u
if bins(i) = 0 then j += 1: missing = i + 1
if j = 1 then return 0
Next
'if missing then ? "value not in sequence: "; missing - 1

'' frequency analyzer
for i = 1 to c\2
for j = i+a.state.c-1 to c-1
if seq(i)=seq(j) then
for offs = 1 to j-2:  if seq(i+offs)<>seq(j+offs) then exit for
Next
EndIf:  if offs = j - 1 then goto exit_i
Next
next
exit_i:
if j >= c-1 then return 0

j -= i '+ 1

if j <= A.state.c then return 0

if show then
? A.mul; " "; A.add; " "; A.ror; "  ret "; A.ret.bits; " state "; A.state.bits
? "period: "; j

for i = 1 to c
? seq(i);
Next
?: ?
endif

return j

End Function

function tally_if_good_period(mul as ubyte, _
ror as ubyte, _
retbits as ubyte = 2, _
statebits as ubyte = 2, _
show as boolean = false) as myint

dim as Sequencer  A = type(mul, add, ror, retbits, statebits)
return get_period(A, show)
End function

'
' -- Main
'

sub Sequencer.iterate
/'  internal state variables (usually higher than output)  '/

a += 1        - (state=1)
state.ror ror '+ state shr (state.bits \ 2)
b = mul * (state + add)
state = a xor b

/' regular strength '/
ret = b
End Sub
' ------------

'
' anything up to 8
var state_bits = 2

var output_bits = state_bits - 1

'#define debug
#ifdef debug
tally_if_good_period 5,1,1, output_bits, state_bits, true
sleep: end
#endif

''
var max_p = 0
var show_me = state_bits < 5

for mul as long = 1 to 5 step 2
for add as long = 0 to 4 step 1
for ror as long = 0 to 2
var period = tally_if_good_period( mul, add, ror, output_bits, state_bits, show_me )
if period > max_p then max_p = period
if period >= 2^(state_bits*1+3) then show_me = state_bits < 5
Next
Next
Next

? "max period "; max_p
sleep
Last edited by dafhi on Sep 14, 2018 17:24, edited 10 times in total.
deltarho[1859]
Posts: 2181
Joined: Jan 02, 2017 0:34
Location: UK

Re: FreeBASIC's PRNG #2

Yours truly wrote:*** The question of the day: Have I restored your faith? ***

@dodicat

Well, did I?
dodicat
Posts: 6155
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: FreeBASIC's PRNG #2

Hi deltarho[]
Yes, I tried it again.
success.
But now I have a question.
You use put in a loop?
So it looks like actual numbers (0 to 9) appear only in amongst the other ascii characters.

Here I have extracted them

mean 7.334707276922197
smallest 0
biggest 27201
Standard deviation 66.72545724844514
File size 16777216
Chi distance 607.0153044014867
time taken for everything 11.81932243425399
number of distinct numerical entries 629332

Here, your creator of 16mb.txt and using print # instead of put in a loop.

Code: Select all

#Include "file.bi"

Dim As Ulong i, n = 4*1024*1024

Randomize 666, 3
'===========
Sub savefile(filename As String,p As String) 'unused
Dim As Integer n
n=Freefile
If Open (filename For Binary Access Write As #n)=0 Then
Put #n,,p
Close
Else
Print "Unable to save " + filename
End If
End Sub
Function loadfile(file as string) as String
var  f=freefile
Open file For Binary Access Read As #f
Dim As String text
If Lof(1) > 0 Then
text = String(Lof(f), 0)
Get #f, , text
End If
Close #f
return text
end Function
'============
If FileExists( "16MB.txt" ) Then Kill "16MB.txt"
Open "16MB.txt" For Binary As #1
For i = 1 To n
Put #1,, Cast( Ulong, Rnd*2^32 )
Next
Close #1

print "small sample of 16MB.txt"

print left(sample,60)

var f=freefile
Open "16MB3.txt" For binary As #f
For i = 1 To n
print #f, Cast( Ulong, Rnd*2^32 )
Next

Close #f

print "small sample of 16MB3.txt"
print left(sample,60)

Print "Finished"
Sleep
jj2007
Posts: 1318
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: FreeBASIC's PRNG #2

jj2007 wrote:In the meantime, I studied your code (excellent btw) and found that the real work is all done here:

Code: Select all

Private Function pcg32.randse() As Double
Under the hood, in the disassembly, it looks messy, and I am tempted to rewrite it in assembly. It could be a lot shorter and faster.

Just to let you know, I am working on it, lots of testing still needed, but speedwise it looks promising - a factor 2 (with 100 Mio numbers generated):

Code: Select all

2477 ms for FB, last result: 9E777B4C 9E78660B
1220 ms for MB, last result: 9E777B4C 9E78660B
202     bytes for FB
113     bytes for MB
deltarho[1859]
Posts: 2181
Joined: Jan 02, 2017 0:34
Location: UK

Re: FreeBASIC's PRNG #2

@dodicat

The file to be analyzed needs to be binary and not ascii.

@jj2007

A factor of 2? WOW!
jj2007
Posts: 1318
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: FreeBASIC's PRNG #2

Yes, but I'm not happy that it's only a factor 2 <smile>

Btw have you ever tested the randomness of one of the PCG32 state DWORDs? This part generates the true random number, in 2 DWORDs:

Code: Select all

this.state = oldstate * 6364136223846793005ULL + (this.seq Or 1)

This part just shuffles that number around to make it look more random, but I have a little doubt whether it really matters:

Code: Select all

Dim As Ulong xorshifted = ((oldstate Shr 18u) xor oldstate) Shr 27u
Dim As Ulong rot = oldstate Shr 59u
Return (xorshifted Shr rot) Or (xorshifted Shl ((-rot) And 31))
deltarho[1859]
Posts: 2181
Joined: Jan 02, 2017 0:34
Location: UK

Re: FreeBASIC's PRNG #2

@jj2007

To my mind this.state is a LCG which we know has issues. The next part is a conditioning element to mitigate the LCG's issues. The conditioning obviously works because LCGs fail PractRand at either 4KB or 8KB whereas PCG32 has been shown to get to 16TB with PractRand.

I have no idea how the 18, 27, 59 and 31 are derived.

David Blackman and Sebastiano Vigna's 128-bit generators use three numbers and I have seen a paper by them which considers a whole bunch of differing triples. Here too I have no idea how they derived the various triples.
jj2007
Posts: 1318
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: FreeBASIC's PRNG #2

Interesting. To me it sounds a bit counter-intuitive that a conditioned version of a random number should be more random than the original number, but you are much more expert on these issues. And I admit that my math skills are not good enough to understand the theory behind it...
dafhi
Posts: 1323
Joined: Jun 04, 2005 9:51

Re: FreeBASIC's PRNG #2

The shift by 59 is pretty straight-forward. It's the only thing I remember about pcg from
jj2007
Posts: 1318
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: FreeBASIC's PRNG #2

dafhi wrote:The shift by 59 is pretty straight-forward. It's the only thing I remember about pcg from
1. Gather approximately a hundred of the best tests for statistical randomness into a test suite.
2. Put most of the used pseudo-random number generators through the test suite. (Most fail).
3. Select a couple of the easiest pseudo-random number generators to reason about.
4. Try an easy extension of their state to see if and when (or if ever), they can pass.
5. Knowing that some state bits are more random than others, and that less random bits are usually dropped, use the most random state bits to (invertably) permute and select the bits returned.
6. Achieve PRNG nirvanna!
My highlighting re conditioned version.
dafhi
Posts: 1323
Joined: Jun 04, 2005 9:51

Re: FreeBASIC's PRNG #2

this is my [incomplete] analysis of pcg

maybe someone can figure out the significance of the "extra" xor step

Code: Select all

''  'x' is a copy of 'state'

dim as ulongint x = state

dim as ubyte count = (x shr 59)     'Top bits are most changed from LCG mul.
'Keeping 5 bits for bit rotation.  2^5 = 32.

'The numbers below correlate with 5 and 32

state = x * multiplier + increment

'The 18, below, you have to view in terms of
'what happens in the last step, the rotate

x xor= x shr 18                     ' [still trying to understand]
'
'original source comment: 18 = (64 - 27)/2

return rotr32(x shr 27, count)      'first arg:  top 5 bits of low dword.
'even without the shift, the low dword is
'"pretty scrambled" from previous operation
Last edited by dafhi on Sep 10, 2018 19:01, edited 1 time in total.
jj2007
Posts: 1318
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: FreeBASIC's PRNG #2

I've implemented Pcg32 in assembler, some results here. Is there any PractRand binary for Windows?
deltarho[1859]
Posts: 2181
Joined: Jan 02, 2017 0:34
Location: UK

Re: FreeBASIC's PRNG #2

jj2007 wrote:Is there any PractRand binary for Windows?

Anyway, here it is: PractRand_0.94.zip

All you need is RNG_test.

Go HERE to see how I use it.
deltarho[1859]
Posts: 2181
Joined: Jan 02, 2017 0:34
Location: UK

Re: FreeBASIC's PRNG #2

Earlier I mentioned that the multiplier in PCG32 is the same as Knuth64. The constant is not used. The 6364136223846793005 is not any old number. It will be a prime of sorts - perhaps a Sophie Germain prime. George Marsaglia used Sophie Germain primes for the multiplier in his Multiply With Carry (MWC) generators.

I found another 64-bit LCG:

Code: Select all

Rand = (2862933555777941757ull * Rand + 3037000493ull)

It was cited as being the fastest 64-bit LCG but, in fact, it's speed is the same as Knuth64. It also failed PractRand at 8KB. So, it has nothing on Knuth64 other than a completely different sequence will be output.

PCG32 is now begging to have it's multiplier changed so I did just that.

I got a bit worried as I thought I was heading for 1TB PractRand clean sweep and that would have meant my letting it run on; I am not getting any younger. <smile> Thankfully, I got a small anomaly at 1TB.

This result has not surprised me. John Gleason, of chiSquare2g fame, and I collaborated on a 64-bit MWC way back in 2008. The original code, which I called RND2, was written in BASIC. John converted much of the code to asm. He then wrote some code to generate 384 Sophie Germain primes. When RND2 was executed it chose, at random, one of those primes. This gave us 384 possible sequences to choose from. I further developed RND2 to Complimentary MWC status, base 2^32-1 with a lag array. My CMWC4096 is based upon that work.

PCG32II using 286293355577794175 as a multiplier.

Code: Select all

F:\PR64>My_RNG | RNG_test stdin32
RNG_test using PractRand version 0.94
RNG = RNG_stdin32, seed = unknown
test set = core, folding = standard (32 bit)

rng=RNG_stdin32, seed=unknown
length= 256 megabytes (2^28 bytes), time= 2.9 seconds
no anomalies in 165 test result(s)

rng=RNG_stdin32, seed=unknown
length= 512 megabytes (2^29 bytes), time= 6.1 seconds
no anomalies in 178 test result(s)

rng=RNG_stdin32, seed=unknown
length= 1 gigabyte (2^30 bytes), time= 12.2 seconds
no anomalies in 192 test result(s)

rng=RNG_stdin32, seed=unknown
length= 2 gigabytes (2^31 bytes), time= 24.0 seconds
no anomalies in 204 test result(s)

rng=RNG_stdin32, seed=unknown
length= 4 gigabytes (2^32 bytes), time= 46.7 seconds
no anomalies in 216 test result(s)

rng=RNG_stdin32, seed=unknown
length= 8 gigabytes (2^33 bytes), time= 93.8 seconds
no anomalies in 229 test result(s)

rng=RNG_stdin32, seed=unknown
length= 16 gigabytes (2^34 bytes), time= 186 seconds
no anomalies in 240 test result(s)

rng=RNG_stdin32, seed=unknown
length= 32 gigabytes (2^35 bytes), time= 366 seconds
no anomalies in 251 test result(s)

rng=RNG_stdin32, seed=unknown
length= 64 gigabytes (2^36 bytes), time= 740 seconds
no anomalies in 263 test result(s)

rng=RNG_stdin32, seed=unknown
length= 128 gigabytes (2^37 bytes), time= 1487 seconds
no anomalies in 273 test result(s)

rng=RNG_stdin32, seed=unknown
length= 256 gigabytes (2^38 bytes), time= 2997 seconds
no anomalies in 284 test result(s)

rng=RNG_stdin32, seed=unknown
length= 512 gigabytes (2^39 bytes), time= 6097 seconds
no anomalies in 295 test result(s)

rng=RNG_stdin32, seed=unknown
length= 1 terabyte (2^40 bytes), time= 12202 seconds
Test Name                         Raw       Processed     Evaluation
[Low1/32]DC6-9x1Bytes-1           R=  +4.6  p =  4.4e-3   unusual
...and 303 test result(s) without anomalies