FreeBASIC's PRNG #2

General FreeBASIC programming questions.
Post Reply
deltarho[1859]
Posts: 4292
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: FreeBASIC's PRNG #2

Post by deltarho[1859] »

@dodicat

Re chiSquare2g

Run the following:

Code: Select all

#Include "file.bi"
 
Dim As Ulong i, n = 4*1024*1024
 
Randomize 666, 3
 
If FileExists( "16MB.txt" ) Then Kill "16MB.txt"
Open "16MB.txt" For Binary As #1
For i = 1 To n
  Put #1,, Cast( Ulong, Rnd*2^32 )
Next
Close #1
Print "Finished"
Sleep
This will create a 16MB file filled with Ulongs generated by FB #3. I have used a fixed seed so that your results will match mine.

Now run 'chisquare2g 16MB.txt' at the command line.

You will get this:
Image
With Fourmilab's ENT I get this:

Code: Select all

Entropy = 7.999989 bits per byte.
 
Optimum compression would reduce the size
of this 16777216 byte file by 0 percent.
 
Chi square distribution for 16777216 samples is 246.23, and randomly
would exceed this value 64.18 percent of the times.
 
Arithmetic mean value of data bytes is 127.4657 (127.5 = random).
Monte Carlo value for Pi is 3.142236505 (error 0.02 percent).
Serial correlation coefficient is -0.000197 (totally uncorrelated = 0.0).
So,

Code: Select all

chiSquare2g => 64.231%
ENT         => 64.18%
<1% : >99% => Very strong evidence to suspect that data is not random
<5%, >=1% : >95%, <=99% => Strong evidence to suspect that data is not random
<10%, >=5% : >90%, <=95% => Some evidence to suspect that data is not random

otherwise no grounds to question data.

With FB #3 at 64%, we have a Chi-square pass.

*** The question of the day: Have I restored your faith? ***
dafhi
Posts: 1641
Joined: Jun 04, 2005 9:51

Re: FreeBASIC's PRNG #2

Post by dafhi »

Think I've got something of cryptographic quality.

Less buggy now. [comments updated]

krr .. code doesn't run (looking at it from the future) .. will leave code up

Code: Select all

/' -- low granularity sequencer

  When you run this, keep in mind i'm using a 2-bit internal state.
  Bit size is adjustable; loop is at the bottom.
 
    - More Info -

  The base class lets me use <= 8 bits of granularity to represent
  the output (usually lower grain than internal state)
 
  Threw in 2 extra state vars in the sequencer class
  output:  early repeats are fine, and in fact preferred.

  This demo may contain errors.  I just kind of threw everything together
 
'/


type MiniBits
 
    /' -- 2018 Sep 10 - by dafhi
     
      simulate <= 8 bit integers
   
    '/
 
    as ubyte          u
    as ushort         c ''2018 Sep 9 (from ubyte)
   
    declare property  bits as ubyte
    declare property  bits( as ubyte)
 
    declare operator  let( as ubyte)
    declare operator  let( as MiniBits)
    declare operator  cast as ubyte
    declare operator  cast as string
   
    declare sub       ror(as byte) '' bit rotate
   
    declare constructor( as ubyte = 0, as ubyte = 0)
    declare constructor( as MiniBits)
   
  private:
    as ubyte          b, _bits
End Type

constructor MiniBits(i as ubyte, b as ubyte)
    if b = 0 then b = 4
    bits = b
    this = i
End Constructor
constructor MiniBits(i as MiniBits)
    constructor i.b
End Constructor

operator MiniBits.let(i as MiniBits)
    this = i.b
End Operator
operator MiniBits.let(i as ubyte)
    b = i and u
End Operator

operator MiniBits.cast as ubyte
    return b
End Operator
operator MiniBits.cast as string
    return str(cubyte(this))
End Operator

property MiniBits.bits(i as ubyte)
    i = (i-1)and 7:  _bits = i + 1:  c = 2 ^ _bits:  u = c - 1
End Property
property MiniBits.bits as ubyte
    return _bits
End Property

sub MiniBits.ror(i as byte)
    i -= _bits * int(i / _bits)
    b = (b shl (_bits-i) or  b shr i)and u
End Sub


type Sequencer
    as MiniBits             state = type(,2) 'bits, val
    as MiniBits             ret = type(,2)
   
    declare constructor(as ubyte=3, as ubyte=1, as ubyte=1, as ubyte=2, as ubyte=2)
    declare constructor(as sequencer)
   
    declare operator        cast as ubyte
    declare operator        cast as string
    declare operator        let( as sequencer)
   
    declare sub             iterate
 
    as ubyte                mul
    as ubyte                add
    as ubyte                ror
    
    '' extra state vars
    as minibits             a, b
  
  private:
    declare sub             _let(as ubyte, as ubyte, as ubyte, as ubyte, as ubyte)
End Type

sub Sequencer._let(m as ubyte, _add as ubyte, r as ubyte, retbits as ubyte, statebits as ubyte)
    mul=m: add=_add: ror=r:  ret.bits = retbits
    state.bits = statebits
    a.bits = state.bits
    b.bits = state.bits
End Sub

constructor Sequencer(m as ubyte, _add as ubyte, r as ubyte, retbits as ubyte, statebits as ubyte)
    _let m, _add, r, retbits, statebits
End Constructor

constructor Sequencer(i as sequencer)
    constructor i.mul, i.add, i.ror, i.ret.bits, i.state.bits
End Constructor

operator Sequencer.Let(i as sequencer)
    _let i.mul, i.add, i.ror, i.ret.bits, i.state.bits
End Operator

operator Sequencer.cast as ubyte
    iterate:  return ret
End Operator

operator Sequencer.cast as string
    return str(cubyte(this))
End Operator

/' -- ghost copy.  moved to vicinity of loop (bottom of page) for easy experimentation
sub Sequencer.iterate
   
    /'  internal state variables (usually higher than output)  '/

    a += 1        '- state=1
    state.ror ror '+ state shr (state.bits \ 2)
    b = mul * (state + add)
    state = a xor b
   
    /' regular strength '/
    ret = b
End Sub
'/



type myint as integer

#ifndef max
  #define max(i,j) iif(i>j, i, j)
  #define min(i,j) iif(i<j, i, j)
#EndIf


function get_period(A as sequencer, show as boolean = false)as myint
   
    dim as myint      c = min( 2^( A.state.bits*2 + 1), 1e7 )
    dim as myint      i, j, offs, bins(A.ret.u), seq(1 to c), missing
  
    for i = 1 to ubound(seq)
      seq(i) = A:  bins( seq(i) ) += 1
        '? seq(i);
    Next
    '?: ?

    for i = 0 to A.ret.u
      if bins(i) = 0 then j += 1: missing = i + 1
      if j = 1 then return 0
    Next
    'if missing then ? "value not in sequence: "; missing - 1

    '' frequency analyzer
    for i = 1 to c\2
      for j = i+a.state.c-1 to c-1
        if seq(i)=seq(j) then
          for offs = 1 to j-2:  if seq(i+offs)<>seq(j+offs) then exit for
          Next
        EndIf:  if offs = j - 1 then goto exit_i
      Next
    next
    exit_i:
    if j >= c-1 then return 0
    
    j -= i '+ 1
    
    if j <= A.state.c then return 0
   
    if show then
      ? A.mul; " "; A.add; " "; A.ror; "  ret "; A.ret.bits; " state "; A.state.bits
      ? "period: "; j
     
      for i = 1 to c
        ? seq(i);
      Next
      ?: ?
    endif
   
    return j

End Function


function tally_if_good_period(mul as ubyte, _
                          add as ubyte, _
                          ror as ubyte, _
                          retbits as ubyte = 2, _
                          statebits as ubyte = 2, _
                          show as boolean = false) as myint
   
    dim as Sequencer  A = type(mul, add, ror, retbits, statebits)
    return get_period(A, show)
End function


'
' -- Main
'

sub Sequencer.iterate
    /'  internal state variables (usually higher than output)  '/

    a += 1        - (state=1)
    state.ror ror '+ state shr (state.bits \ 2)
    b = mul * (state + add)
    state = a xor b
   
    /' regular strength '/
    ret = b
End Sub
' ------------

'
                    ' anything up to 8
var state_bits = 2


var output_bits = state_bits - 1

'#define debug
#ifdef debug
  tally_if_good_period 5,1,1, output_bits, state_bits, true
  sleep: end
#endif

''
var max_p = 0
var show_me = state_bits < 5

for mul as long = 1 to 5 step 2
  for add as long = 0 to 4 step 1
    for ror as long = 0 to 2
      var period = tally_if_good_period( mul, add, ror, output_bits, state_bits, show_me )
      if period > max_p then max_p = period
      if period >= 2^(state_bits*1+3) then show_me = state_bits < 5
    Next
  Next
Next

? "max period "; max_p
sleep
Last edited by dafhi on Sep 14, 2018 17:24, edited 10 times in total.
deltarho[1859]
Posts: 4292
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: FreeBASIC's PRNG #2

Post by deltarho[1859] »

Yours truly wrote:*** The question of the day: Have I restored your faith? ***
@dodicat

Well, did I?
dodicat
Posts: 7976
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: FreeBASIC's PRNG #2

Post by dodicat »

Hi deltarho[]
Yes, I tried it again.
success.
But now I have a question.
You use put in a loop?
So it looks like actual numbers (0 to 9) appear only in amongst the other ascii characters.

Here I have extracted them

mean 7.334707276922197
smallest 0
biggest 27201
Standard deviation 66.72545724844514
File size 16777216
Chi distance 607.0153044014867
time taken for everything 11.81932243425399
number of distinct numerical entries 629332

Here, your creator of 16mb.txt and using print # instead of put in a loop.

Code: Select all

#Include "file.bi"
 
Dim As Ulong i, n = 4*1024*1024
 
Randomize 666, 3
'===========
 Sub savefile(filename As String,p As String) 'unused
    Dim As Integer n
    n=Freefile
    If Open (filename For Binary Access Write As #n)=0 Then
        Put #n,,p
        Close
    Else
        Print "Unable to save " + filename
    End If
End Sub
Function loadfile(file as string) as String
	If FileExists(file)=0 Then Print file;" not found":Sleep:end
   var  f=freefile
    Open file For Binary Access Read As #f
    Dim As String text
    If Lof(1) > 0 Then
      text = String(Lof(f), 0)
      Get #f, , text
    End If
    Close #f
    return text
end Function
'============
If FileExists( "16MB.txt" ) Then Kill "16MB.txt"
Open "16MB.txt" For Binary As #1
For i = 1 To n
  Put #1,, Cast( Ulong, Rnd*2^32 )
Next
Close #1

print "small sample of 16MB.txt"

var sample=loadfile("16MB.txt")
print left(sample,60)

var f=freefile
Open "16MB3.txt" For binary As #f
For i = 1 To n
  print #f, Cast( Ulong, Rnd*2^32 )
Next

Close #f

print "small sample of 16MB3.txt"
sample=loadfile("16MB3.txt")
print left(sample,60)


Print "Finished"
Sleep   
jj2007
Posts: 2326
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: FreeBASIC's PRNG #2

Post by jj2007 »

jj2007 wrote:In the meantime, I studied your code (excellent btw) and found that the real work is all done here:

Code: Select all

Private Function pcg32.randse() As Double
Under the hood, in the disassembly, it looks messy, and I am tempted to rewrite it in assembly. It could be a lot shorter and faster.
Just to let you know, I am working on it, lots of testing still needed, but speedwise it looks promising - a factor 2 (with 100 Mio numbers generated):

Code: Select all

2477 ms for FB, last result: 9E777B4C 9E78660B
1220 ms for MB, last result: 9E777B4C 9E78660B
202     bytes for FB
113     bytes for MB
deltarho[1859]
Posts: 4292
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: FreeBASIC's PRNG #2

Post by deltarho[1859] »

@dodicat

The file to be analyzed needs to be binary and not ascii.

@jj2007

A factor of 2? WOW!
jj2007
Posts: 2326
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: FreeBASIC's PRNG #2

Post by jj2007 »

Yes, but I'm not happy that it's only a factor 2 <smile>

Btw have you ever tested the randomness of one of the PCG32 state DWORDs? This part generates the true random number, in 2 DWORDs:

Code: Select all

  this.state = oldstate * 6364136223846793005ULL + (this.seq Or 1)
This part just shuffles that number around to make it look more random, but I have a little doubt whether it really matters:

Code: Select all

  Dim As Ulong xorshifted = ((oldstate Shr 18u) xor oldstate) Shr 27u
  Dim As Ulong rot = oldstate Shr 59u
  Return (xorshifted Shr rot) Or (xorshifted Shl ((-rot) And 31))
deltarho[1859]
Posts: 4292
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: FreeBASIC's PRNG #2

Post by deltarho[1859] »

@jj2007

To my mind this.state is a LCG which we know has issues. The next part is a conditioning element to mitigate the LCG's issues. The conditioning obviously works because LCGs fail PractRand at either 4KB or 8KB whereas PCG32 has been shown to get to 16TB with PractRand.

I have no idea how the 18, 27, 59 and 31 are derived.

David Blackman and Sebastiano Vigna's 128-bit generators use three numbers and I have seen a paper by them which considers a whole bunch of differing triples. Here too I have no idea how they derived the various triples.
jj2007
Posts: 2326
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: FreeBASIC's PRNG #2

Post by jj2007 »

Interesting. To me it sounds a bit counter-intuitive that a conditioned version of a random number should be more random than the original number, but you are much more expert on these issues. And I admit that my math skills are not good enough to understand the theory behind it...
dafhi
Posts: 1641
Joined: Jun 04, 2005 9:51

Re: FreeBASIC's PRNG #2

Post by dafhi »

The shift by 59 is pretty straight-forward. It's the only thing I remember about pcg from
https://www.youtube.com/watch?v=45Oet5qjlms
jj2007
Posts: 2326
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: FreeBASIC's PRNG #2

Post by jj2007 »

dafhi wrote:The shift by 59 is pretty straight-forward. It's the only thing I remember about pcg from
https://www.youtube.com/watch?v=45Oet5qjlms
Paddy McCarthy
1. Gather approximately a hundred of the best tests for statistical randomness into a test suite.
2. Put most of the used pseudo-random number generators through the test suite. (Most fail).
3. Select a couple of the easiest pseudo-random number generators to reason about.
4. Try an easy extension of their state to see if and when (or if ever), they can pass.
5. Knowing that some state bits are more random than others, and that less random bits are usually dropped, use the most random state bits to (invertably) permute and select the bits returned.
6. Achieve PRNG nirvanna!
My highlighting re conditioned version.
dafhi
Posts: 1641
Joined: Jun 04, 2005 9:51

Re: FreeBASIC's PRNG #2

Post by dafhi »

this is my [incomplete] analysis of pcg

maybe someone can figure out the significance of the "extra" xor step

Code: Select all

''  'x' is a copy of 'state'
 
dim as ulongint x = state
 
dim as ubyte count = (x shr 59)     'Top bits are most changed from LCG mul.
                                    'Keeping 5 bits for bit rotation.  2^5 = 32.

                                    'The numbers below correlate with 5 and 32
 
state = x * multiplier + increment
                                   
                                    'The 18, below, you have to view in terms of
                                    'what happens in the last step, the rotate
                                       
x xor= x shr 18                     ' [still trying to understand]
                                    '
                                    'original source comment: 18 = (64 - 27)/2

return rotr32(x shr 27, count)      'first arg:  top 5 bits of low dword.
                                    'even without the shift, the low dword is
                                    '"pretty scrambled" from previous operation
Last edited by dafhi on Sep 10, 2018 19:01, edited 1 time in total.
jj2007
Posts: 2326
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: FreeBASIC's PRNG #2

Post by jj2007 »

I've implemented Pcg32 in assembler, some results here. Is there any PractRand binary for Windows?
deltarho[1859]
Posts: 4292
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: FreeBASIC's PRNG #2

Post by deltarho[1859] »

jj2007 wrote:Is there any PractRand binary for Windows?
I thought you had PractRand.

Anyway, here it is: PractRand_0.94.zip

All you need is RNG_test.

Go HERE to see how I use it.
deltarho[1859]
Posts: 4292
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: FreeBASIC's PRNG #2

Post by deltarho[1859] »

Earlier I mentioned that the multiplier in PCG32 is the same as Knuth64. The constant is not used. The 6364136223846793005 is not any old number. It will be a prime of sorts - perhaps a Sophie Germain prime. George Marsaglia used Sophie Germain primes for the multiplier in his Multiply With Carry (MWC) generators.

I found another 64-bit LCG:

Code: Select all

Rand = (2862933555777941757ull * Rand + 3037000493ull)
It was cited as being the fastest 64-bit LCG but, in fact, it's speed is the same as Knuth64. It also failed PractRand at 8KB. So, it has nothing on Knuth64 other than a completely different sequence will be output.

PCG32 is now begging to have it's multiplier changed so I did just that.

I got a bit worried as I thought I was heading for 1TB PractRand clean sweep and that would have meant my letting it run on; I am not getting any younger. <smile> Thankfully, I got a small anomaly at 1TB.

This result has not surprised me. John Gleason, of chiSquare2g fame, and I collaborated on a 64-bit MWC way back in 2008. The original code, which I called RND2, was written in BASIC. John converted much of the code to asm. He then wrote some code to generate 384 Sophie Germain primes. When RND2 was executed it chose, at random, one of those primes. This gave us 384 possible sequences to choose from. I further developed RND2 to Complimentary MWC status, base 2^32-1 with a lag array. My CMWC4096 is based upon that work.

PCG32II using 286293355577794175 as a multiplier.

Code: Select all

F:\PR64>My_RNG | RNG_test stdin32
RNG_test using PractRand version 0.94
RNG = RNG_stdin32, seed = unknown
test set = core, folding = standard (32 bit)
 
rng=RNG_stdin32, seed=unknown
length= 256 megabytes (2^28 bytes), time= 2.9 seconds
  no anomalies in 165 test result(s)
 
rng=RNG_stdin32, seed=unknown
length= 512 megabytes (2^29 bytes), time= 6.1 seconds
  no anomalies in 178 test result(s)
 
rng=RNG_stdin32, seed=unknown
length= 1 gigabyte (2^30 bytes), time= 12.2 seconds
  no anomalies in 192 test result(s)
 
rng=RNG_stdin32, seed=unknown
length= 2 gigabytes (2^31 bytes), time= 24.0 seconds
  no anomalies in 204 test result(s)
 
rng=RNG_stdin32, seed=unknown
length= 4 gigabytes (2^32 bytes), time= 46.7 seconds
  no anomalies in 216 test result(s)
 
rng=RNG_stdin32, seed=unknown
length= 8 gigabytes (2^33 bytes), time= 93.8 seconds
  no anomalies in 229 test result(s)
 
rng=RNG_stdin32, seed=unknown
length= 16 gigabytes (2^34 bytes), time= 186 seconds
  no anomalies in 240 test result(s)
 
rng=RNG_stdin32, seed=unknown
length= 32 gigabytes (2^35 bytes), time= 366 seconds
  no anomalies in 251 test result(s)
 
rng=RNG_stdin32, seed=unknown
length= 64 gigabytes (2^36 bytes), time= 740 seconds
  no anomalies in 263 test result(s)
 
rng=RNG_stdin32, seed=unknown
length= 128 gigabytes (2^37 bytes), time= 1487 seconds
  no anomalies in 273 test result(s)
 
rng=RNG_stdin32, seed=unknown
length= 256 gigabytes (2^38 bytes), time= 2997 seconds
  no anomalies in 284 test result(s)
 
rng=RNG_stdin32, seed=unknown
length= 512 gigabytes (2^39 bytes), time= 6097 seconds
  no anomalies in 295 test result(s)
 
rng=RNG_stdin32, seed=unknown
length= 1 terabyte (2^40 bytes), time= 12202 seconds
  Test Name                         Raw       Processed     Evaluation
  [Low1/32]DC6-9x1Bytes-1           R=  +4.6  p =  4.4e-3   unusual
  ...and 303 test result(s) without anomalies
Post Reply