Threadsafe RANDOMIZE and RND

Forum for discussion about the documentation project.
coderJeff
Site Admin
Posts: 4323
Joined: Nov 04, 2005 14:23
Location: Ontario, Canada
Contact:

Re: Threadsafe RANDOMIZE and RND

Post by coderJeff »

Thanks paul. I'm kind of happy with last rewrite. It has some generic qualities so it will be easy to add in additional generators. But the performance is still bad due the function pointers, by like a factor of 10.

Code: Select all

namespace fbc
extern "rtlib"
	type RndMSWS32 extends FB_RNDSTATE
		declare constructor cdecl alias"fb_RndCtxInitMSWS32" ( byval seed as ulongint = &hb5ad4eceda1ce2a9ull )
		declare constructor cdecl alias"fb_RndCtxCopy" ( byref src as const FB_RNDSTATE )
		declare function rnd alias "fb_RndCtx" ( byval n as single = 1.0 ) as double
		declare function rnd32 alias "fb_RndCtx32" () as ulong
		declare destructor cdecl alias "fb_RndCtxExit" ()
	end type
end extern
end namespace
I'm thinking if I remove the function pointers and and direct call the procedures directly, gcc should be able to inline the functions.
adeyblue
Posts: 300
Joined: Nov 07, 2019 20:08

Re: Threadsafe RANDOMIZE and RND

Post by adeyblue »

If you want new RNGs but you want to keep with the same Randomize/RND interface, an obvious extensible solution would be to eschew hardcoding them into rtlib and instead provide an RNGCreate API (like ThreadCreate) where RNG writers pass in a function and some user data (or a class derived from an interface, whatever) and get back an id.
The app writer then passes that ID to Randomize and calls to RND just forward to the thing provided to RNGCreate.

That's extensible, it doesn't require new versions of FB for new rngs, the algos can be written in FB, it wouldn't be rtlib's job to ensure the RNG is thread safe or has any other fancy features, the implementation in rtlib is a single resizable array, and best of all I think even our own Del The Funky Homosapien here has managed to call ThreadCreate without needing his hand holding, so he should be gucci and can add all the fancy RNGs to FB he wants.

Or you know, Jeff, let's not and do something more important than a 6th RNG. If the first four are crap for the intended purpose and real is too slow - oh well, we tried, time to write and/or include your own. FB is not missing out on a deluge of users who really care about and have super knowledge of RNG theory but who were also scared to go to the door on halloween in case anybody came dressed as the word 'include'
deltarho[1859]
Posts: 4305
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: Threadsafe RANDOMIZE and RND

Post by deltarho[1859] »

adeyblue wrote:but who were also scared to go to the door on halloween in case anybody came dressed as the word 'include'
Image
coderJeff
Site Admin
Posts: 4323
Joined: Nov 04, 2005 14:23
Location: Ontario, Canada
Contact:

Re: Threadsafe RANDOMIZE and RND

Post by coderJeff »

The gcc optimizations didn't work out the way I thought. So I need to regroup and figure out my exit strategy.

fbc 1.07 is neither thread-safe nor has rnd32. So my base line test timings are limited.
On -O2 production code built-in RND() - it doesn't really matter much if it is gas or gcc because it's a function call in the rtlib:
- 32-bit mtwist ~43 Mhz
- 64-bit mtwist ~77 Mhz

With the API changes I've been playing around with, it's just slower no matter what I do, so yeah, doesn't really doesn't seem worth it to spend any more time on fbc's built-in RND's. However, I think my last change to math_rnd.c will be to just clean it up so it's easier to read, and test that I get the same timings in fbc-1.08 as fbc-1.07.

To get best optimization with gcc is to inline the functions, what can only be done with #included source code (as already suggested).

For a very simple 'inc/fbmath.bi' starting point:

Code: Select all

#ifndef __FBMATH_BI__
#define __FBMATH_BI__

# if __FB_LANG__ = "qb"
# error not supported in qb dialect
# endif

namespace FB

	''
	'' built-in random number generators
	'' to be used with:
	''    RANDOMIZE seed, algorithm
	''
	enum FB_RND_ALGORITHMS
		FB_RND_AUTO
		FB_RND_CRT
		FB_RND_FAST
		FB_RND_MTWIST
		FB_RND_QB
		FB_RND_REAL
	end enum

	'' extern "rtlib"
	''	'' declare built-in RANDOMIZE, RND, & RND32 in the FB namespace
	''	declare sub randomize alias "fb_Randomize" ( byval seed as double = -1.0, byval algorithm as long = FB.FB_RND_AUTO )
	''	declare function rnd alias "fb_Rnd" ( byval n as single = 1.0 ) as double
	''  declare function rnd32 alias "fb_Rnd32" ( ) as ulong
	'' end extern

	''
	'' "FAST" PRNG from 'Numerical recipes in C' chapter 7.1
	''
	type RndFAST32
		iseed as ulong = 327680
		declare function rnd() as double
		declare function rnd32() as ulong
	end type

	private function RndFAST32.rnd32() as ulong
		this.iseed = this.iseed * 1664525 + 1013904223
		return this.iseed
	end function

	private function RndFAST32.rnd() as double
		return this.rnd32()/cdbl(4294967296ull)
	end function

	''
	'' Middle Square Weyl Sequence PRNG / Bernard Widynski / 20 May 2020
	'' https://arxiv.org/abs/1704.00358v5
	'' 
	''
	type RndMSWS32
		s as ulongint = &hb5ad4eceda1ce2a9ull
		w as ulongint
		x as ulongint
		declare function rnd() as double
		declare function rnd32() as ulong
	end type

	private function RndMSWS32.rnd32() as ulong
		with this
			.x *= .x
			.w += .s
			.x += .w
			.x = ( .x shl 32 ) or ( .x shr 32 )
			return .x
		end with
	end function

	private function RndMSWS32.rnd() as double
		return this.rnd32()/cdbl(4294967296ull)
	end function

	''
	'' Squares: A Fast Counter Based PRNG / Bernard Widynski / 5 May 2020
	'' https://arxiv.org/abs/2004.06278
	''
	type RndSquares32
		key as ulongint = &hb5ad4eceda1ce2a9ull
		ctr as ulongint
		declare function rnd() as double
		declare function rnd32() as ulong
	end type

	private function RndSquares32.rnd32() as ulong
		dim as ulongint x = this.key * this.ctr
		dim as ulongint y = x
		dim as ulongint z = y + key
		x = x * x + y
		x = ( x shr 32 ) or ( x shl 32 )
		x = x * x + z
		x = ( x shr 32 ) or ( x shl 32 )
		this.ctr += 1
		return (x * x + y) shr 32
	end function

	private function RndSquares32.rnd() as double
		return this.rnd32()/cdbl(4294967296ull)
	end function

	''
	'' *Really* minimal PCG32 code / (c) 2014 M.E. O'Neill / pcg-random.org
	'' Licensed under Apache License 2.0 (NO WARRANTY, etc. see website)
	''
	type RndPCG32
		state as ulongint
		inc_ as ulongint
		declare function rnd() as double
		declare function rnd32() as ulong
	end type

	private function RndPCG32.rnd32() as ulong
		with this
			var oldstate = this.state
			'' Advance internal state
			.state = oldstate * 6364136223846793005ULL + ( .inc_ or 1 )
			'' Calculate output function (XSH RR), uses old state for max ILP
			dim as ulong xorshifted = ((oldstate shr 18u) xor oldstate) shr 27u
			dim as ulong rot = oldstate shr 59u
			return (xorshifted shr rot) or (xorshifted shl ((-rot) and 31))
		end with
	end function

	private function RndPCG32.rnd() as double
		return this.rnd32()/cdbl(4294967296ull)
	end function

end namespace

#endif '' __FBMATH_BI__
Timings in MHz (millions of rnd per second):

Code: Select all

64-bit - gcc - fbc 1.08.0
Max Threads      : 1
fast_rnd (1)               565
fast_rnd32 (1)             564
msws_rnd (1)               386
msws_rnd32 (1)             387
squares_rnd (1)            218
squares_rnd32 (1)          278
pcg32_rnd (1)              329
pcg32_rnd32 (1)            330

64-bit - gcc - fbc 1.08.0
Max Threads      : 4
fast_rnd (4)              1335
fast_rnd32 (4)            1336
msws_rnd (4)              1041
msws_rnd32 (4)            1057
squares_rnd (4)            680
squares_rnd32 (4)          836
pcg32_rnd (4)              933
pcg32_rnd32 (4)            940
deltarho[1859]
Posts: 4305
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: Threadsafe RANDOMIZE and RND

Post by deltarho[1859] »

Very nice, coderJeff.

Give the following a try. Code composed in your style.

This is so fast your system clocks may slow down - Time dilation. Image

Code: Select all

''
'' xoroshiro128** Written in 2018 by David Blackman and Sebastiano Vigna (vigna@acm.org)
''  http://prng.di.unimi.it/xoroshiro128starstar.c
''
''

#define rotl(x,k) ( (x shl k) or (x shr(64-k)) ) ' Note the extra outer parentheses
 
type Rndxoroshiro128
  as ulongint s(0 To 1)
  declare function rnd() as double
  declare function rnd64() as ulongint
end type
 
private function Rndxoroshiro128.rnd64() as ulongint
dim as ulongint s0, s1, result
s0 = this.s(0)
s1 = this.s(1)
result = rotl(s0*5,7)*9
s1 xor= s0
this.s(0) =  rotl(s0,24) xor s1 xor (s1 shl 16)
this.s(1) = rotl(s1,37)
return result
end function
 
private function Rndxoroshiro128.rnd() as double
  return this.rnd64()/2^64
end function
deltarho[1859]
Posts: 4305
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: Threadsafe RANDOMIZE and RND

Post by deltarho[1859] »

@coderJeff

I've just realized that I have upset the apple cart by using rnd64(), chosen because xoroshiro128** outputs 64-bit. I will leave it to you on how to deal with that.

Needless to say xoroshiro128** is PractRand resistant, I would not have mentioned it otherwise. Rnd() also has a granularity of 53-bit unlike the rnd32() generators with a granularity of 32-bit.
coderJeff
Site Admin
Posts: 4323
Joined: Nov 04, 2005 14:23
Location: Ontario, Canada
Contact:

Re: Threadsafe RANDOMIZE and RND

Post by coderJeff »

Naming of "rnd64()" looks good to me. I can add xoroshiro128 to fbmath.bi header also. Maybe tonight.

At first I was seeing truly mind boggling timings with xoroshiro128. Like it defied reality. ...Evidently, if seeded with {0,0}, seems that gcc optimizes xoroshiro128 out of existence. :)

The pattern I posted for fbmath.bi is very simple starting block. It will have to get fleshed out by others with correct seed values - beyond my expertise. In xoroshiro128 "this.rnd64/(2^64)" seems slow, and I get 3x better timings with "cdbl(this.rnd64() shr 11)/cdbl(2^53)" - which I think is correct for double.

Timings in MHz:

Code: Select all

64-bit - gcc - fbc 1.08.0
Max Threads      : 1
xoroshiro128_rnd (1)            292
xoroshiro128_rnd64 (1)          382

64-bit - gcc - fbc 1.08.0
Max Threads      : 4
xoroshiro128_rnd (4)           1114
xoroshiro128_rnd64 (4)         1442
deltarho[1859]
Posts: 4305
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: Threadsafe RANDOMIZE and RND

Post by deltarho[1859] »

Blackman and Vigna write: "The state must be seeded so that it is not everywhere zero."
In xoroshiro128 "this.rnd64/(2^64)" seems slow, and I get 3x better timings with "cdbl(this.rnd64() shr 11)/cdbl(2^53)"
On my machine I get the same timings. I use '/2^64' and let gcc replace with its best shot. We may need to 'tweak' things with gas but I don't use gas.

With PRNGs I am not a fan of functions calling functions - I'd rather define an engine and use code repetition.
coderJeff
Site Admin
Posts: 4323
Joined: Nov 04, 2005 14:23
Location: Ontario, Canada
Contact:

Re: Threadsafe RANDOMIZE and RND

Post by coderJeff »

I've merged new changes with https://github.com/freebasic/fbc/pull/267
It adds the new header ./inc/fbmath.bi that has the new PRNG's. I'm hoping that with the source in fb itself there will be contributors from the community that can expand on the very simple starting point.

I'll update this thread (previous posts too) with examples soon to cover the new changes.

Going back to the earlier change of thread-safety for built-in RND - this is the only point where I'm not sure how users will react. If there is a lot of complaints about RND now being slower with multi-threaded programs due the mutex locking (no change for single threaded) then will have to choose one of these responses:
1) keep thread-safe RND as-is (i.e. by design) and recommend using the new PRNGs in fbmath.bi for multi-threaded programs
2) remove the automatic locking for RND but have fb.MathLock & fb.MathUnlock available for convenience, and let user decide how to deal with it in their multi-threaded program
3) scrap thread-safe RND altogether and document as not thread-safe. Let user deal with everything.
deltarho[1859]
Posts: 4305
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: Threadsafe RANDOMIZE and RND

Post by deltarho[1859] »

This post should come with a health warning - it is a bit distressing.

The time has come,' the Walrus said,
To talk of many things:
Of shoes — and ships — and sealing-wax —
Of cabbages — and kings —
And why the sea is boiling hot —
And whether pigs have wings;
Except random number generators.' Image

If my experience of dropping a random number generator into the forum, from time to time, is anything to go by I shouldn't be surprised at the apparent lack of interest in coderJeff's providing access to four generators which are PractRand resistant to at least 1TB and are very much faster than those that come with fbc-1.07.

I am sorry to say but I reckon that fbmath.bi is destined to be found dead in the water.

coderJeff has not wasted his time any more than I have this last few years.

The reason for an 'apparent lack of interest': Top drawer randomness or speed of generation is not a requirement for most, if not all, of code written by members.

Anyway I'd just like to thank coderJeff for highlighting the fact that there are other generators besides those that already existed within FreeBASIC before this thread disappears into the ether.
paul doe
Moderator
Posts: 1732
Joined: Jul 25, 2017 17:22
Location: Argentina

Re: Threadsafe RANDOMIZE and RND

Post by paul doe »

deltarho1859 wrote:...
The reason for an 'apparent lack of interest': Top drawer randomness or speed of generation is not a requirement for most, if not all, of code written by members.
...

No, it is not. For example, I have seen very little work done on PCG here (one of the areas where fast, quality PRNGs are a must). Not everyone will drool over the prospect of super-fast, quality PRNGs, though...
deltarho[1859]
Posts: 4305
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: Threadsafe RANDOMIZE and RND

Post by deltarho[1859] »

Hi Paul
No, it is not. For example, I have seen very little work done on PCG here (one of the areas where fast, quality PRNGs are a must).
When you write 'on PCG' did you mean 'with PCG' and when you write 'here' do you mean this thread or the forum? "one of the areas" - what areas?

I have read that quote quite a few times and I don't understand a word of it. Perhaps it is time for a cup of tea. Image
paul doe
Moderator
Posts: 1732
Joined: Jul 25, 2017 17:22
Location: Argentina

Re: Threadsafe RANDOMIZE and RND

Post by paul doe »

deltarho[1859] wrote:When you write 'on PCG' did you mean 'with PCG' and when you write 'here' do you mean this thread or the forum? "one of the areas" - what areas?
PCG stands for Procedural Content Generation, and by 'here' I mean the forum at large. Similarly, 'areas' stands for 'areas where a good, fast Pseudo-Random Number Generator might be applicable'.
deltarho[1859]
Posts: 4305
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: Threadsafe RANDOMIZE and RND

Post by deltarho[1859] »

PCG stands for Procedural Content Generation
Image
I read PCG as Permuted Congruential Generator. I have never heard of Procedural Content Generation and had to look it up; but then I would because I am not 'into' games.

There are, of course, many other areas where "a good, fast Pseudo-Random Number Generator might be applicable" but that relies on the user knowing when they are applicable.

Added:

I am currently reading a book on Psychology, a subject that I have been interested since my early twenties, and came across this: Truth can be tolerated only if you discover it yourself, Fritz Perls (1893-1970). So trying to persuade people to do something may not be a good idea. A better tack would be trying to persuade people to try something. If they like it and stay with it, then that will be their choice and not ours. If we are really clever such that they think it was their idea in the first place then all the better. The secret then is for us to then keep our mouth shut. Saying "I told you so." is not a good idea. "Try gcc 8.3 and let me know what you think." is better than "Use gcc 8.3, it is usually faster than gcc 5.2 and has smaller binaries." Image
paul doe
Moderator
Posts: 1732
Joined: Jul 25, 2017 17:22
Location: Argentina

Re: Threadsafe RANDOMIZE and RND

Post by paul doe »

deltarho[1859] wrote: I read PCG as Permuted Congruential Generator. I have never heard of Procedural Content Generation and had to look it up; but then I would because I am not 'into' games.
It is not only applicable into games (though it is one of the more known uses). It is used in movies, architecture, biology, just to name a few disciplines. The term is not only applicable to 'content' for games but for any application that might need to simulate vast amounts of data with minimal storage (or just for sheer convenience). The algorithms used are another matter entirely.
...
There are, of course, many other areas where "a good, fast Pseudo-Random Number Generator might be applicable" but that relies on the user knowing when they are applicable.
...
Isn't this always the case? If you can't figure out what use something could have, it is not useful for you yet.
Post Reply