Need speedup

New to FreeBASIC? Post your questions here.
12val12newakk
Posts: 6
Joined: Nov 14, 2019 17:04

Need speedup

Postby 12val12newakk » Nov 14, 2019 18:22

I need speedup iteration sub
Time complexity O(n2)

physik not simple 1/r^2
therefore, finished libraries are not suitable
+ and I do not want someone else's
I will probably correct the laws of interaction between the bodies of the system, so the possibility of further corrections should remain

Code: Select all

 sub vals () 
     for n=0 to Nmass
        dim as single dx,dy ,a,df,dfy,dfx,xm,ym
        dim as single Rglue ,Rgquadro
     for m=0 to Nmass
        if n<>m then 
                 dx=x(m)-x(n):dy=y(m)-y(n)
                      Rgquadro=(dx*dx+dy*dy): Rglue=sqr (Rgquadro)
             Select Case   Rglue
                 Case is >140   
                    df=9512/((Rgquadro))                             
                 Case 10 to 140
                     df=((-1e12 /Rgquadro)/Rgquadro/Rgquadro)+.4
                 Case 0 to 10   
                     df=0 : NUM_ERROR=NUM_ERROR+1
                           x_ERR_m(m) =x(m):x_ERR_m(m) =y(m):
                           x_ERR_n(n) =x(n):x_ERR_n(n) =y(n):
                           n_ERR_part =n
                           m_ERR_part =m
             End Select
                       df=df*mc(m)
                      dfy=dfy+df*(dy/Rglue):dfx=dfx+df*(dx/Rglue)                             
           end if
       next m
     dfy=dfy-y(n)*Kcnt:dfx=dfx-x(n) *Kcnt   rem   Centering power
     vy(n)=vy(n)*.99999:vx(n)=vx(n)*.99999          rem dissipation
     vy(n)=vy(n)+dfy*dt:vx(n)=vx(n)+dfx*dt           
     y(n)=y(n)+vy(n)*dt   :   x(n)=x(n)+vx(n)*dt
     dx=x(m)-x(n):dy=y(m)-y(n) 
 next n
 end sub

badidea
Posts: 1608
Joined: May 24, 2007 22:10
Location: The Netherlands

Re: Need speedup

Postby badidea » Nov 14, 2019 21:18

Hi, it is collision detection between many particles right?
What is often done is: Make a grid to divide the particles.
Different type of grids can be used. Most simple is a square/cubic grid.
Then only check collisions between particles within 1 grid cell AND with all neighboring cells. (or something like that).
The idea is that assigning the particles to the grid cell requires only 1 iteration. Then the number of collision checks needed is less. Particles in separated cells can never collide.
There are examples online, but I don't have a direct link at the moment.


Also, for small speed improvement, see if you can eliminate the 'square root' call. Do comparison against the 'squared' value.

Edit:
Here is 1 link that does not use a fixed grid, but a 'quad tree':
https://gamedevelopment.tutsplus.com/tu ... amedev-374
Another technique seems to be 'spacial hashing', which I am not familiar with.
https://conkerjo.wordpress.com/2009/06/ ... ollisions/
More reading material:
http://buildnewgames.com/broad-phase-co ... detection/
Last edited by badidea on Nov 15, 2019 10:58, edited 2 times in total.
12val12newakk
Posts: 6
Joined: Nov 14, 2019 17:04

Re: Need speedup

Postby 12val12newakk » Nov 15, 2019 7:00

There are no explicit conflicts - this is what distinguishes this system
just at a distance of less than 140
the repulsive force grows in proportion to the 6th degree
there are no clear boundaries of bodies (balls).
    Radii of bodies are drawn conditionally.
D.J.Peters
Posts: 7852
Joined: May 28, 2005 3:28

Re: Need speedup

Postby D.J.Peters » Nov 15, 2019 10:49

compare yours :-(

Code: Select all

for n = first to last
   for m = first to last
    if (n<>m) then calc_forces_gravitiy_with_cell(n,m)
  next
next
with this :-)

Code: Select all

for n = first to last-1
  for m = n+1 to last
    calc_forces_gravitiy_with_cell(n,m)
  next
next
dodicat
Posts: 6012
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: Need speedup

Postby dodicat » Nov 15, 2019 11:41

You have
dim as single dx,dy ,a,df,dfy,dfx,xm,ym
dim as single Rglue ,Rgquadro
all within the outer loop.

You could just do all these dims before starting the loops.
i.e.
dim as single dx,dy ,a,df,dfy,dfx,xm,ym
dim as single Rglue ,Rgquadro
for n =0 to Nmass
dfx=0
dfy=0
for m =0 to Nmass

. . .
. . .
12val12newakk
Posts: 6
Joined: Nov 14, 2019 17:04

Re: Need speedup

Postby 12val12newakk » Nov 15, 2019 13:21

D.J.Peters your code wrong )
big boom in system diese code

dodicat tnx maybe it will add 1 percent to speed
deltarho[1859]
Posts: 2099
Joined: Jan 02, 2017 0:34
Location: UK

Re: Need speedup

Postby deltarho[1859] » Nov 15, 2019 13:42

Where we put Dim has scope implications but I do not think that it has any noticeable effect on speed. I looked at Dim inside loops a while ago and concluded that they are only executed once and not on every pass of the loop.
Last edited by deltarho[1859] on Nov 15, 2019 13:51, edited 1 time in total.
deltarho[1859]
Posts: 2099
Joined: Jan 02, 2017 0:34
Location: UK

Re: Need speedup

Postby deltarho[1859] » Nov 15, 2019 13:50

maybe it will add 1 percent to speed

With multi-tasking operating systems we can get 1% fluctuations in speed without doing anything.
badidea
Posts: 1608
Joined: May 24, 2007 22:10
Location: The Netherlands

Re: Need speedup

Postby badidea » Nov 15, 2019 13:51

This: dfy = dfy + df * (dy / Rglue)
Can be: dfy += df * (dy / Rglue)
But I don't expect speed improvement from this.
Likewise: dfy -= y(n) * Kcnt
And: df *= mc(m)

Not tested, maybe access to x(n) and y(n) inside the 'm-loop' can be speed up with an intermediate variable assigned in the 'n-loop'. But maybe optimized already by the compiler.

When compiling via GCC (default for 64-bit), there are also compiler options to optimize speed (-O1, -O2, -O3).

Something else to try: / ( Rgquadro * Rgquadro * Rgquadro)
Instaed of: / Rgquadro) / Rgquadro / Rgquadro

At the end: dx=x(m)-x(n):dy=y(m)-y(n) seems to have no purpose?

This part can have 1 multiplication less: dfy=dfy+df*(dy/Rglue):dfx=dfx+df*(dx/Rglue)
Via: dfy += (df / Rglue) * dy : dfx += (df / Rglue) * dx
To: extra_var = df / Rglue : dfy += extra_var * dy : dfx += extra_var * dx

BTW: Which force scales with the 6th power of 1/R ?
D.J.Peters
Posts: 7852
Joined: May 28, 2005 3:28

Re: Need speedup

Postby D.J.Peters » Nov 15, 2019 15:44

12val12newakk wrote:D.J.Peters your code wrong
may be you don't understand my hint in scope of speed !

let say your n,m loop's count like this

n 0,1,2,3,4
m 0,1,2,3,4

n=0 (m 1..4)
if n<>m then calculate(0,1)
if n<>m then calculate(0,2)
if n<>m then calculate(0,3)
if n<>m then calculate(0,4)
n=1 (m 0,2..4)
if n<>m then calculate(1,0) ' error 1 is the same as calculate(0,1)
if n<>m then calculate(1,2)
if n<>m then calculate(1,3)
if n<>m then calculate(1,4)
n=2 (m 0..1,3..4)
if n<>m then calculate (2,0) ' error 2 is the same as calculate(0,2)
if n<>m then calculate (2,1) ' error 3 is the same as calculate(1,2)
if n<>m then calculate (2,3)
if n<>m then calculate (2,4)
n=3 (m 0..2,4)
if n<>m then calculate (3,0) ' error 4 is the same as calculate(0,3)
if n<>m then calculate (3,1) ' error 5 is the same as calculate(1,3)
if n<>m then calculate (3,2) ' error 6 is the same as calculate(2,3)
if n<>m then calculate (3,4)
n=4 (m 0..3)
if n<>m then calculate (4,0) ' error 7 is the same as calculate(0,4)
if n<>m then calculate (4,1) ' error 8 is the same as calculate(1,4)
if n<>m then calculate (4,2) ' error 9 is the same as calculate(2,4)
if n<>m then calculate (4,3) ' error 10 is the same as calculate(3,4)

etc.

i posted this

n 0,1,2,3
m n+1 to 4

n=0 (m=1..4)
calculate (0,1)
calculate (0,2)
calculate (0,3)
calculate (0,4)

n=1 (m=2..4)
calculate (1,2)
calculate (1,3)
calculate (1,4)

n=2 (m=3..4)
calculate (2,3)
calculate (2,4)

n=3 (m=4..4)
calculate (3,4)

With other words you caluclate pairs of particles more than once !

Joshy
12val12newakk
Posts: 6
Joined: Nov 14, 2019 17:04

Re: Need speedup

Postby 12val12newakk » Nov 15, 2019 19:47

D.J.Peters ))
in my procedure, one-pass calculation
 increments of velocity and body coordinates N
in relation to M .. but not back
(1,0) not the same (0,1) !
Provoni
Posts: 346
Joined: Jan 05, 2014 12:33
Location: Belgium

Re: Need speedup

Postby Provoni » Nov 15, 2019 20:30

@12val12newakk

You may want to try to replace:

Code: Select all

Select Case Rglue

with:

Code: Select all

Select Case Rgquadro

and change the select case values accordingly. Removing the Rglue dependency may improve the CPU's out of order processing and speed up the code.

- Can you use a look up table to get the Rglue value instead of SQR?
- 64-bit FreeBASIC? If not try it.
- What compiler flags are you using? Try "-gen gcc -Wc -march=native,-Ofast,-funroll-loops,-fomit-frame-pointer,-ftree-loop-im,-fivopts".
RockTheSchock
Posts: 226
Joined: Mar 12, 2006 16:25

Re: Need speedup

Postby RockTheSchock » Nov 15, 2019 20:44

Maybe it's faster if you change this by using less divisions

Code: Select all

df=((-1e12 /Rgquadro)/Rgquadro/Rgquadro)+.4


also you could change the select case:

Code: Select all

'Maybe ? Dim as Double Rgquadro
If Rglue >140 Then
   df=9512/((Rgquadro))
Else
   If rglue>= 10 Then
      df=((-1e12 /(Rgquadro*Rgquadro*Rgquadro)+.4
   Else
      df=0
      NUM_ERROR=NUM_ERROR+1
      x_ERR_m(m) =x(m)  ' value in x_ERR_m(m)
      x_ERR_m(m) =y(m)  ' is overwritten here ???
      x_ERR_n(n) =x(n)
      x_ERR_n(n) =y(n)  ' same here ???
      n_ERR_part =n
      m_ERR_part =m
   EndIf
EndIf
12val12newakk
Posts: 6
Joined: Nov 14, 2019 17:04

Re: Need speedup

Postby 12val12newakk » Nov 15, 2019 20:45

quote="badidea"


At the end: dx=x(m)-x(n):dy=y(m)-y(n) seems to have no purpose?



BTW: Which force scales with the 6th power of 1/R ?[/quote]

Thanks not used deleted

6th power of 1/R roughly similar but can always be fixed
https://dic.academic.ru/pictures/enc_ph ... stvie1.jpg
RockTheSchock
Posts: 226
Joined: Mar 12, 2006 16:25

Re: Need speedup

Postby RockTheSchock » Nov 15, 2019 20:58

Another option is to use ASM with AVX / AVX2 instructions:
Each YMM register can hold and do simultaneous operations on eight 32-bit single-precision floating point numbers. So basically if the algorithm could be optimally transformed you could gain almost 8 times the speed. Realistically it could be around twice as fast.
https://en.wikipedia.org/wiki/Advanced_Vector_Extensions

About what size are the those arrays?

If you use freebasic with gcc and full optimizations maybe the compiler will use AVX automatically. At least there is a compiler switch -mavx2

Return to “Beginners”

Who is online

Users browsing this forum: No registered users and 1 guest