PCG32II Help file
-
- Posts: 4292
- Joined: Jan 02, 2017 0:34
- Location: UK
- Contact:
Re: PCG32II Help file
@Provoni
Your query has come in at an opportune moment because dodicat has come up with a method for an integral number in a range which is coming in at 62% faster in 32-bit mode and 25% faster in 64-bit mode than PCG32II's published method. It still may not be fast enough for you, but 595MHz in 32-bit mode is blindingly fast and as fast as a random [0,1). I have further testing to do, but if all is well I will publish a new PCG32II.bas and let you know.
Re: PCG32II Help file
hey deltarho[1859],
An integer is needed, not a float. I think the overhead of the function call may also be a factor, so I am trying to convert the code so that it can be used locally in the thread.
You say that it is running at 300 to 600 MHz. My program does about 100,000,000 iterations per second (100 MHz) with the Lehmer generator, where one random number is needed per iteration. The Lehmer generator is a small part of the code and does not influence the speed of the program much. After replacing the Lehmer with "random_number=pcg.Range(1,s)" the speed drops to 20,000,000 iterations per second. If it is running at 300 to 600 MHz then it should not slow down my program by a factor of 5, so I do not understand what is going on.
Actually the same thing happens when calling any of the standard FreeBASIC generators and that's why I had to resort to a local generator.
I replaced:
Code: Select all
state=48271*state and 2147483647
random_number=1+s*state shr 31
With:
Code: Select all
random_number=pcg.Range(1,s)
I don't understand how to convert this.state or this.sequence to local variables in the following code:
Code: Select all
Function pcg32.range( Byval One As Double, Byval Two As Double ) As Double
    Dim TempVar As Ulong
    Dim As Ulongint oldstate = this.state
    this.state = oldstate * 6364136223846793005ULL + this.sequence
    Dim As Ulong xorshifted = ((oldstate Shr 18u) xor oldstate) Shr 27u
    Dim As Ulong rot = oldstate Shr 59u
    TempVar = (xorshifted Shr rot) Or (xorshifted Shl ((-rot) And 31))
    Return TempVar/4294967296.0*( Two - One ) + One
end function
Thanks
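A rough sketch of what "local" could look like, assuming the posted float function is a fair guide to the generator step: `this.state` and `this.sequence` simply become ordinary variables declared in the thread's own procedure. The names `state` and `seq` and the seed values are made up for illustration, and the multiply-and-shift mapping to 1..s is an assumption, since PCG32II's integer range code is not shown in this thread.

```freebasic
' Sketch only: the generator step from the posted function with the
' member variables replaced by locals. Seeds and names are illustrative.
Dim As ULongInt state = 88172645463325252ULL    ' arbitrary example seed
Dim As ULongInt seq   = 1442695040888963407ULL  ' increment, must be odd
Dim As Long s = 123, random_number

For i As Long = 1 To 10
    ' Same state update and output permutation as the posted function.
    Dim As ULongInt oldstate = state
    state = oldstate * 6364136223846793005ULL + seq
    Dim As ULong xorshifted = ((oldstate Shr 18) Xor oldstate) Shr 27
    Dim As ULong rot = oldstate Shr 59
    Dim As ULong r = (xorshifted Shr rot) Or (xorshifted Shl ((-rot) And 31))
    ' Assumed mapping of the 32-bit output onto 1..s (multiply, then shift).
    random_number = 1 + CLng((CULngInt(r) * CULngInt(s)) Shr 32)
    Print random_number
Next
Sleep
```

Because `state` and `seq` live in the procedure that uses them, nothing is shared between threads and there is no function call per number.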
Re: PCG32II Help file
Your "pcg.Range(1,s)" does about 300 MHz for me on a single thread. Obviously, the multi-threading is causing a massive slowdown somehow.
Re: PCG32II Help file
Here's some (very bad) code to illustrate the problem. It generates (300,000,000 / threads) random numbers per thread. Change threads from 1 to 4 to see the effect.
- At 1 thread, PCG32II and FreeBASIC rng 4 each take 1 second for the 300,000,000 random numbers.
- At 4 threads, PCG32II takes 3.3 seconds and rng 4 about 7 seconds for the 300,000,000. This kind of slowdown should not occur since the 300,000,000 is divided by the number of threads.
- The custom rnd shows that the slowdown from sleep 10 and threadwait is minimal in my (very bad) code.
From 2017: viewtopic.php?f=3&t=25603&p=231065&hili ... ng#p231065
Code: Select all
screenres 640,480,32
randomize timer,4
#include "PCG32II.bas"
Dim shared pcg as pcg32
declare sub freebasic_rnd(byval nopointer as any ptr)
declare sub pcg32_rnd(byval nopointer as any ptr)
declare sub custom_rnd(byval nopointer as any ptr)
dim as integer i,j,k
dim shared as integer threads=4 'set to 1 or 4 and compare timings
dim as any ptr thread_ptr(threads)
dim as double t=timer
for i=1 to threads
    thread_ptr(i)=threadcreate(@freebasic_rnd,0)
    sleep 10
next i
for i=1 to threads
    threadwait(thread_ptr(i))
next i
print "FreeBASIC rnd timing: "+str(timer-t)
sleep 100
t=timer
for i=1 to threads
    thread_ptr(i)=threadcreate(@pcg32_rnd,0)
    sleep 10
next i
for i=1 to threads
    threadwait(thread_ptr(i))
next i
print "PCG32II rnd timing: "+str(timer-t)
sleep 100
t=timer
for i=1 to threads
    thread_ptr(i)=threadcreate(@custom_rnd,0)
    sleep 10
next i
for i=1 to threads
    threadwait(thread_ptr(i))
next i
print "Custom rnd timing: "+str(timer-t)
sleep
sub freebasic_rnd(byval nopointer as any ptr)
    dim as longint i,j
    for i=1 to 300000000/threads
        j+=rnd*123
    next i
end sub
sub pcg32_rnd(byval nopointer as any ptr)
    dim as longint i,j,s=123
    for i=1 to 300000000/threads
        j+=pcg.range(1,s)
    next i
end sub
sub custom_rnd(byval nopointer as any ptr)
    dim as longint i,j,m=123
    for i=1 to 300000000/threads
        m=(214013*m+2531011)mod 2147483648
        j+=((m shr 16)/32768)*123 '32767
    next i
end sub
Re: PCG32II Help file
The Lehmer random number generator is a type of linear congruential generator (LCG) and they are blindingly fast. However, the 32-bit versions fail the PractRand test very shortly after they start to generate numbers. The 64-bit versions are better, but they fail PractRand not long after the 32-bit versions. LCGs are OK if the quality of randomness is not an issue or when just a few KB are needed before their lack of randomness manifests itself.
"Obviously, the multi-threading is causing a massive slowdown somehow."
Hmmm, it shouldn't because PCG32II is thread safe. PCG32 was not thread safe. I have tested PCG32II with multi-threading and the throughput was the same for all threads.
"If it is running at 300 to 600 MHz then it should not slow down my program by a factor of 5 so I do not understand what is going on."
Neither do I.
"I don't understand how to convert this.state or this.sequence to local variables"
You have posted the float version.
PCG32 used a function and if called from different threads we got massive collisions as would happen with all of FreeBASIC's generators because none of them are thread safe.
What you want is a macro and I will look into that.
Re: PCG32II Help file
Your last post came in whilst I was composing my last post. I'll have a look at it.
Re: PCG32II Help file
I have not read your code yet but have run it.
One thread:
FreeBASIC 4.86
PCG32II 0.61
Custom 5.47
Four threads:
FreeBASIC 6.23
PCG32II 0.23
Custom 1.59
This seems to contradict your results.
I will now read your code.
Re: PCG32II Help file
Without optimizations on FreeBASIC-1.07.1-win64-gcc-5.2.0:
1 thread:
FreeBASIC rng 4: 10.65
PCG32II: 3.34
Custom: 10.22
4 threads:
FreeBASIC rng 4: 13.39
PCG32II: 7.28
Custom: 2.81
8 threads:
FreeBASIC rng 4: 13.32
PCG32II: 8.28
Custom: 1.58
The PCG32II was downloaded from your source a few days ago.
Re: PCG32II Help file
Sorry to cause all this stir, deltarho[1859], but PCG32II is performing up to par now! I was simply not using a unique generator for every thread.
Re: PCG32II Help file
OK, I have spotted your problem.
Dim shared pcg as pcg32
All your threads are sharing the same generator so you will get collisions.
For four threads you need
Dim shared as pcg32 pcgA, pcgB, pcgC, pcgD
pcgA should be given to the primary thread and the others to the other threads.
Not quite sure how to introduce them to your code yet but you may know how to do that since you are better acquainted with it.
The principle, though, is that each thread should invoke its own generator and that way we do not get collisions.
If you look in the Help file, about a third down in the 'Usage examples' section, you will see a multi-threading example.
Re: PCG32II Help file
Hey deltarho[1859],
I read your comment in the other thread and
Code: Select all
Dim shared pcg(threads) as pcg32
has fixed the speed issue for me. But is that okay to do?
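A minimal sketch of the one-generator-per-thread idea, under stated assumptions: `toy_rng` is a made-up stand-in type (a plain 64-bit LCG, not PCG32II) so that the example compiles on its own, and the thread index is handed to each worker through the `ThreadCreate` parameter. With PCG32II the array would presumably be declared as `pcg32` instead.

```freebasic
' Stand-in generator type, NOT PCG32II. Each instance owns its own state.
Type toy_rng
    As ULongInt state
    Declare Function range (ByVal lo As Long, ByVal hi As Long) As Long
End Type

Function toy_rng.range (ByVal lo As Long, ByVal hi As Long) As Long
    ' 64-bit LCG step, then scale the top 32 bits onto lo..hi.
    this.state = this.state * 6364136223846793005ULL + 1442695040888963407ULL
    Dim As ULongInt top = this.state Shr 32
    Return lo + CLng((top * CULngInt(hi - lo + 1)) Shr 32)
End Function

Const NUM_THREADS = 4
Dim Shared As toy_rng gen(1 To NUM_THREADS)    ' one generator per thread
Dim Shared As LongInt total(1 To NUM_THREADS)  ' one result slot per thread

Sub worker (ByVal param As Any Ptr)
    Dim As Integer id = Cast(Integer, param)   ' thread index passed by value
    Dim As LongInt sum
    For i As Long = 1 To 1000000
        sum += gen(id).range(1, 123)           ' uses only this thread's state
    Next
    total(id) = sum
End Sub

Dim As Any Ptr handle(1 To NUM_THREADS)
For i As Integer = 1 To NUM_THREADS
    gen(i).state = 12345 + i                   ' distinct seed per generator
    handle(i) = ThreadCreate(@worker, Cast(Any Ptr, i))
Next
For i As Integer = 1 To NUM_THREADS
    ThreadWait(handle(i))
    Print "thread"; i; " sum ="; total(i)
Next
Sleep
```

Each worker indexes the array with its own id, so no two threads ever update the same generator state.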
Re: PCG32II Help file
You keep posting whilst I am composing my posts.
Anyway, PCG32II's reputation has been restored and seems to be coming up trumps in your code.
"But is that okay to do?"
Yep.
The amazing thing is that with four generators working your PC is actually generating four times as many random numbers. Aren't threads wonderful?
Re: PCG32II Help file
Hey deltarho[1859],
deltarho[1859] wrote: You keep posting whilst I am composing my posts.
Haha, sorry.
Probably a noob question, but how does your PCG32II know whether to use Function pcg32.range( Byval One As Long, Byval Two As Long ) As Long or Function pcg32.range( Byval One As Double, Byval Two As Double ) As Double? I just want to make sure that my program is using the integer version. Thanks.
deltarho[1859] wrote: The amazing thing is that with four generators working your PC is actually generating four times as many random numbers, aren't threads wonderful?
How about 30. :)
Re: PCG32II Help file
"Probably a noob question"
Code: Select all
Declare Function range Overload( Byval One As Long, Byval Two As Long ) As Long
Declare Function range Overload ( Byval One As Double, Byval Two As Double ) As Double
"How about 30. :)"
You are joking, right?
I would like to see an Intel Core i9-9900 with 8 cores/16 threads doing its stuff.
Re: PCG32II Help file
Nope, my program (AZdecrypt) can use up to 65,536 threads (current artificial limit) and the problems it solves can be parallelized by dividing the restarts over the number of threads. Here is the project page: http://www.zodiackillersite.com/viewtop ... =81&t=3198
I am currently testing whether the improved randomness from PCG32II over the Lehmer is worth taking. If so, I will let you know and credit your PCG32II in the readme file. Thanks for your help so far!
Code: Select all
Declare Function range Overload( Byval One As Long, Byval Two As Long ) As Long
Declare Function range Overload ( Byval One As Double, Byval Two As Double ) As Double
I do not understand the Overload functionality. If I specify pcg.range(1,123) will it then invoke the Long version?
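For anyone unsure how the two declarations interact, here is a minimal sketch that is not taken from PCG32II: `demo_range` is a made-up stand-in with the same two signatures, and each version just reports which one ran. As far as I know, FreeBASIC picks the overload at compile time from the types of the arguments, so an exact type match (`Long` values or `Double` values) leaves no doubt about which version is called. Running the sketch shows what your compiler does with bare integer literals.

```freebasic
' Hypothetical stand-ins for the two range overloads, not PCG32II's code.
Declare Function demo_range Overload (ByVal one As Long, ByVal two As Long) As String
Declare Function demo_range Overload (ByVal one As Double, ByVal two As Double) As String

Function demo_range (ByVal one As Long, ByVal two As Long) As String
    Return "Long version"
End Function

Function demo_range (ByVal one As Double, ByVal two As Double) As String
    Return "Double version"
End Function

Dim As Long lo = 1, hi = 123
Print "integer literals:   "; demo_range(1, 123)
Print "Long variables:     "; demo_range(lo, hi)
Print "explicit CLng:      "; demo_range(CLng(1), CLng(123))
Print "Double literals:    "; demo_range(1.0, 123.0)
Sleep
```

If the integer version matters, passing `Long` variables or wrapping the arguments in `CLng` is a simple way to make the choice explicit rather than relying on how literals are typed.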