Single layer neural net with 2 way nonlinearity
I have just finished this and am trying it out at the moment:
https://drive.google.com/file/d/0BwsgML ... sp=sharing
It is a chore to run such code, I find. What can you do though?
Not everything is linearly separable; that much is well known in machine learning. The idea here is to allow separation of both the input and the target of a simple single-layer neural net with non-linear functions. This should allow the net to associate regions of the input with regions of the target, rather than, say, regions of the input with the entirety of the target.
Who knows if it makes much sense, I have to try it out some more to see if it is working the way I think it is.
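To put the idea in rough FB terms (this is only an illustration typed out here, not the code behind the link; the names nonlin() and train2way(), the size n and the learning rate are all made up for the sketch): both the input and the target go through the same fixed elementwise nonlinearity, and a single weight matrix is then fitted between the two transformed vectors by the delta rule. To read a prediction back in the original target space you would undo nonlin() (square it back, keeping the sign).
Code: Select all
const as integer n = 8   ' vector length for the sketch

' elementwise nonlinearity: signed square root (invertible)
function nonlin(v as single) as single
    return sgn(v) * sqr(abs(v))
end function

' one delta-rule step fitting v ~ W*u, where u and v are the
' nonlinearly transformed input and target
sub train2way(w() as single, u() as single, v() as single, lr as single)
    for r as integer = 0 to n - 1
        var o = 0!
        for c as integer = 0 to n - 1
            o += w(r, c) * u(c)
        next
        var e = v(r) - o   ' error measured in the transformed target space
        for c as integer = 0 to n - 1
            w(r, c) += lr * e * u(c)
        next
    next
end sub

' usage: transform a training pair, then take one step
dim as single w(n - 1, n - 1), xin(n - 1), t(n - 1), u(n - 1), v(n - 1)
' ... fill xin() and t() with a training example ...
for i as integer = 0 to n - 1
    u(i) = nonlin(xin(i))
    v(i) = nonlin(t(i))
next
train2way(w(), u(), v(), 0.05)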
Re: Single layer neural net with 2 way nonlinearity
It's only for X86_64 :-(
Joshy
Re: Single layer neural net with 2 way nonlinearity
I should have flagged that. You only get around a 4-fold speed-up from the assembly language. I probably already have all the pieces in straight FB on my computer. If I can find the time I'll do an all-FB version, though I think tomorrow I'll do some kind of self-structuring tree.
Re: Single layer neural net with 2 way nonlinearity
All the morning gone!!!
Edit:
Code: Select all
namespace rptk

'Walsh Hadamard transform. Array must have 2^n elements (2,4,8,16....)
sub wht(x() as single)
    dim as ulongint hs=1,n=ubound(x)+1
    dim as single sc=1.0/sqr(n)
    if (n and (n-1))<>0 then error(1) 'x must have 2^n elements
    while hs<n
        var i=0ULL
        while i<n
            var k=i+hs
            while i<k
                var a=x(i)
                var b=x(i+hs)
                x(i)=a+b
                x(i+hs)=a-b
                i+=1
            wend
            i=k+hs
        wend
        hs+=hs
    wend
    for i as ulongint=0 to n-1
        x(i)*=sc
    next
end sub
'recomputable random sign flip
sub hashflip(x() as single, h as ulongint)
    const as ulongint PHI= &h9E3779B97F4A7C15ULL
    const as ulongint FHI= &h6A09E667F3BCC908ULL
    dim as ulong m=ubound(x)
    var xptr=cptr(ulong ptr,@x(0))
    for i as ulong=0 to m
        h+=FHI
        h*=PHI
        xptr[i]=xptr[i] xor ((h shr 32) and &h80000000ULL)
    next
end sub

' Sphere the data (give it a constant vector length.)
sub adjust(x() as single,y() as single)
    var sumsq=0!
    var m=ubound(x)
    for i as ulong=0 to m
        sumsq+=y(i)*y(i)
    next
    var vl=sqr(sumsq/(m+1))
    if vl<1e-30 then
        vl=0
    else
        vl=1.0/vl
    end if
    for i as ulong=0 to m
        x(i)=y(i)*vl
    next
end sub
' Hash function Stafford Mix 13
function hash64(h as ulongint) as ulongint
    h = ( h xor ( h shr 30 ) ) * &hBF58476D1CE4E5B9
    h = ( h xor ( h shr 27 ) ) * &h94D049BB133111EB
    return h xor ( h shr 31 )
end function

sub zero(x() as single)
    var m=ubound(x)
    for i as ulong=0 to m:x(i)=0!:next
end sub

sub copy(x() as single,y() as single)
    var m=ubound(x)
    for i as ulong=0 to m:x(i)=y(i):next
end sub

sub add(x() as single,y() as single,z() as single)
    var m=ubound(x)
    for i as ulong=0 to m:x(i)=y(i)+z(i):next
end sub

sub subtract(x() as single,y() as single,z() as single)
    var m=ubound(x)
    for i as ulong=0 to m:x(i)=y(i)-z(i):next
end sub

sub multiply(x() as single,y() as single,z() as single)
    var m=ubound(x)
    for i as ulong=0 to m:x(i)=y(i)*z(i):next
end sub

sub multiplyscalar(x() as single,y() as single,z as single)
    var m=ubound(x)
    for i as ulong=0 to m:x(i)=y(i)*z:next
end sub
' array must have 2^n elements (2,4,8,16....)
sub xorpermute(x() as single,h as ulongint)
    var m=ubound(x)
    if (m and (m+1)) <> 0 then error(1) 'need 2^n elements
    dim as ulong k=hash64(h) and m
    for i as ulong=0 to m
        var r=k xor i
        if r>i then swap x(i),x(r)
    next
end sub

sub absolute(x() as single,y() as single)
    var m=ubound(x)
    var xptr=cptr(ulong ptr,@x(0)),yptr=cptr(ulong ptr,@y(0))
    for i as ulong=0 to m
        xptr[i]=yptr[i] and &h7fffffff
    next
end sub

'http://www.codeproject.com/Articles/69941/Best-Square-Root-Method-Algorithm-Function-Precisi
function approxSignedSqrt(x as single) as single
    var y=*cptr(ulong ptr,@x)
    var s=y and &h80000000
    y and=&h7fffffff
    y+=127 shl 23
    y shr=1
    y or=s
    return *cptr(single ptr,@y)
end function

sub signedsqrt(x() as single,y() as single)
    var m=ubound(x)
    for i as ulong=0 to m:x(i)=approxSignedSqrt(y(i)):next
end sub
'Quick random projection
sub quickrp(x() as single,h as ulongint)
    hashflip(x(),h)
    xorpermute(x(),h)
    wht(x())
end sub

sub quickrpinv(x() as single,h as ulongint)
    wht(x())
    xorpermute(x(),h)
    hashflip(x(),h)
end sub

end namespace
/'
print "Test approxSignedSqrt(x)"
for i as ulong=0 to 100
print rptk.approxSignedSqrt(-i*0.01),sqr(i*0.01)
next
print "Test WHT 2"
dim as single x(1)=>{1,1}
rptk.wht(x())
print x(0),x(1)
rptk.wht(x())
print x(0),x(1)
print "Test WHT 4"
dim as single y(3)=>{1,1,1,1}
rptk.wht(y())
print y(0),y(1),y(2),y(3)
rptk.wht(y())
print
print y(0),y(1),y(2),y(3)
print
print "Test WHT 8"
dim as single z(7)=>{1,2,3,4,5,6,7,8}
rptk.wht(z())
print z(0),z(1),z(2),z(3),z(4),z(5),z(6),z(7)
rptk.wht(z())
print
print z(0),z(1),z(2),z(3),z(4),z(5),z(6),z(7)
print
print "Test WHT 16"
dim as single u(15)=>{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16}
rptk.wht(u())
print u(0),u(1),u(2),u(3),u(4),u(5),u(6),u(7),u(8),u(9),u(10),u(11),u(12),u(13),u(14),u(15)
rptk.wht(u())
print
print u(0),u(1),u(2),u(3),u(4),u(5),u(6),u(7),u(8),u(9),u(10),u(11),u(12),u(13),u(14),u(15)
print
print "Test HashFlip()"
rptk.hashflip(u(),1001)
print
print u(0),u(1),u(2),u(3),u(4),u(5),u(6),u(7),u(8),u(9),u(10),u(11),u(12),u(13),u(14),u(15)
rptk.hashflip(u(),1001)
print
print u(0),u(1),u(2),u(3),u(4),u(5),u(6),u(7),u(8),u(9),u(10),u(11),u(12),u(13),u(14),u(15)
print
print "Test XorPermute()"
rptk.xorpermute(u(),1003)
print
print u(0),u(1),u(2),u(3),u(4),u(5),u(6),u(7),u(8),u(9),u(10),u(11),u(12),u(13),u(14),u(15)
rptk.xorpermute(u(),1003)
print
print u(0),u(1),u(2),u(3),u(4),u(5),u(6),u(7),u(8),u(9),u(10),u(11),u(12),u(13),u(14),u(15)
print
print "Test Absolute()"
rptk.hashflip(u(),100001)
print u(0),u(1),u(2),u(3),u(4),u(5),u(6),u(7),u(8),u(9),u(10),u(11),u(12),u(13),u(14),u(15)
rptk.absolute(u(),u())
print
print u(0),u(1),u(2),u(3),u(4),u(5),u(6),u(7),u(8),u(9),u(10),u(11),u(12),u(13),u(14),u(15)
print
print "Test QuickRP()"
rptk.quickrp(u(),2222)
print u(0),u(1),u(2),u(3),u(4),u(5),u(6),u(7),u(8),u(9),u(10),u(11),u(12),u(13),u(14),u(15)
print
rptk.quickrpinv(u(),2222)
print u(0),u(1),u(2),u(3),u(4),u(5),u(6),u(7),u(8),u(9),u(10),u(11),u(12),u(13),u(14),u(15)
print
print "Test Adjust"
rptk.adjust(u(),u())
print u(0),u(1),u(2),u(3),u(4),u(5),u(6),u(7),u(8),u(9),u(10),u(11),u(12),u(13),u(14),u(15)
print
print "Test Zero"
rptk.zero(u())
print u(0),u(1),u(2),u(3),u(4),u(5),u(6),u(7),u(8),u(9),u(10),u(11),u(12),u(13),u(14),u(15)
print
'/
Re: Single layer neural net with 2 way nonlinearity
FB-friendly version (only a 2-minute test so far, so surely still alpha):
https://drive.google.com/file/d/0BwsgML ... sp=sharing
It feels around 10 times slower than the assembly language version but I haven't done timings, though the random projections should be higher quality.
Re: Single layer neural net with 2 way nonlinearity
I forgot to change the compile option from -exx (for testing) to -O 3. That's why it was so slow. A 10-to-1 difference in speed would be decisive; 4-to-1 leaves me in a dilemma as to which to proceed with (assembly or straight code). A GPU would take things to light speed; however, there is a discipline to cheap hardware that forces you to be creative if you want any results.
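For anyone compiling it themselves, the only difference is the switch; net2way.bas below just stands in for whichever source file you grab from the link:
Code: Select all
fbc -exx net2way.bas
fbc -gen gcc -O 3 net2way.bas
-exx turns on all the runtime checks, which is what makes it crawl; -O is handled by the gcc emitter, so add -gen gcc if your setup defaults to the gas emitter.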
Re: Single layer neural net with 2 way nonlinearity
I'll see if training this type of repeating net for a long time is useful:
https://drive.google.com/file/d/0BwsgML ... sp=sharing
If not I will experiment with some options there are for learning via evolution.
Re: Single layer neural net with 2 way nonlinearity
After running the 2 different ideas for a while, the 2-way nonlinear net seems the more interesting. Maybe you could even extend that idea with longer chains of random projections, applying the nonlinearity in the forward direction only. There are too many options and it is difficult to sort them all out.
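Something along these lines, reusing the rptk routines above, is roughly what I have in mind by a longer chain; the depth and the seed offsets are arbitrary here, just to show the shape of it:
Code: Select all
' forward pass through a chain of fixed random projections, with the
' nonlinearity applied in the forward direction only (untested sketch)
' x() must have 2^n elements, as with wht()
sub chainforward(x() as single, depth as integer, seed as ulongint)
    for d as integer = 0 to depth - 1
        rptk.quickrp(x(), seed + d)   ' fixed, recomputable random projection
        rptk.signedsqrt(x(), x())     ' forward-only nonlinearity
    next
end sub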
Re: Single layer neural net with 2 way nonlinearity
PosPos Neural Net (Linux AMD64 only):
https://drive.google.com/file/d/0BwsgML ... sp=sharing
I didn't start a new message thread about it because it is so new and untested. It is a perhaps naive attempt at a deep neural network (the kind that is proving so successful these days).
I'll do a non-assembly-language version later so that Mr Peters can run it on a Raspberry Pi Zero.
Re: Single layer neural net with 2 way nonlinearity
PosPos Neural Net (General):
https://drive.google.com/file/d/0BwsgML ... sp=sharing
Re: Single layer neural net with 2 way nonlinearity
Triggered Delta neural net (Linux AMD64 only):
https://drive.google.com/file/d/0BwsgML ... sp=sharing
Re: Single layer neural net with 2 way nonlinearity
Triggered delta 2 way nonlinear neural net (Linux AMD64 only):
https://drive.google.com/file/d/0BwsgML ... sp=sharing
I had a cool idea a few minutes ago. Rather than triggering on each weight individually when it is strongly signalling that it wants to change, why not accumulate over patterns of weights and, when a pattern wants to change, change those weights as a group? That is easy to arrange and maybe it could work out better. I'll ask Justin Bieber.
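A rough sketch of what I mean, treating each row of the weight matrix as one "pattern": the per-weight deltas just accumulate, and a whole row is only committed once its accumulated pattern has built up enough energy. The threshold, the row-as-pattern grouping and the names are made up for illustration; they are not what the linked binary does.
Code: Select all
' accumulate delta-rule updates into acc(); commit a whole row of weights
' as a group only when that row's accumulated pattern is strong enough
sub groupedupdate(w() as single, acc() as single, trigger as single)
    var rmax = ubound(w, 1)
    var cmax = ubound(w, 2)
    for r as integer = 0 to rmax
        var energy = 0!
        for c as integer = 0 to cmax
            energy += acc(r, c) * acc(r, c)
        next
        if energy > trigger then        ' the pattern wants to change
            for c as integer = 0 to cmax
                w(r, c) += acc(r, c)    ' change the weights as a group
                acc(r, c) = 0!
            next
        end if
    next
end sub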
Re: Single layer neural net with 2 way nonlinearity
Here is one way of doing error pattern accumulation:
https://drive.google.com/open?id=0BwsgM ... 0VHOWdPOGs
It's rather slow, however. If you look at smaller subsets of the errors, you should be able to get it to go faster.
It's too slow for me to evaluate whether it's doing anything interesting, but I presume it brings benefits.
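In outline, one cheap way to get the "smaller subsets" speed-up would be to only touch a random (but recomputable) handful of the error components each step, something like the sketch below. This is just how I would attack the speed problem, not what the linked file currently does; the names and the subset size are placeholders.
Code: Select all
' accumulate only a small random subset of the error components each step;
' err() and acc() must have 2^n elements so the mask trick works
sub accumulatesubset(acc() as single, err() as single, subsetsize as integer, h as ulongint)
    var m = ubound(err)
    for s as integer = 0 to subsetsize - 1
        var i = rptk.hash64(h + s) and m   ' recomputable pseudorandom index
        acc(i) += err(i)
    next
end sub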
Re: Single layer neural net with 2 way nonlinearity
This is the last in this particular line of experiments:
https://drive.google.com/open?id=0BwsgM ... TVCdE82Q1k
(Linux AMD64 only).
It is really 2 neural nets, where one neural net learns the weights for the other. I actually implemented the idea a few years ago in Java but never got around to testing it. I'll run it through some basic tests today to see if there is any advantage.
The idea can be extended from 2 nets to n nets.
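Stripped right down, the arrangement is something like this: net B looks at the input and produces the full weight matrix that net A then applies to that same input, and during training only net B's own weights get adjusted. The sizes and names below are placeholders for illustration, not the layout in the Linux binary.
Code: Select all
const as integer insize = 16, outsize = 16

' net B's own (learned) weights: one row per generated weight of net A
dim shared as single wb(outsize * insize - 1, insize - 1)

' net B: map the input to a flat weight vector for net A
sub makeweights(wa() as single, xin() as single)
    for r as integer = 0 to outsize * insize - 1
        var s = 0!
        for c as integer = 0 to insize - 1
            s += wb(r, c) * xin(c)
        next
        wa(r) = s
    next
end sub

' net A: apply the generated weights to the same input
sub forwardA(outp() as single, wa() as single, xin() as single)
    for r as integer = 0 to outsize - 1
        var s = 0!
        for c as integer = 0 to insize - 1
            s += wa(r * insize + c) * xin(c)
        next
        outp(r) = s
    next
end sub
Extending from 2 nets to n nets would just mean another net producing wb() in the same way.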