undefined reference to `__popcnt'

General FreeBASIC programming questions.
deltarho[1859]
Posts: 4292
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: undefined reference to `__popcnt'

Post by deltarho[1859] »

We have an issue with x = 0

We don't with this:

Code: Select all

Function __popcnt( x As ULongInt ) As ULongInt
Dim As ULongInt c
  While x<>0
    x And= x-1
    c+=1
  Wend
  Return c
End Function
Yours truly wrote:but they are not worth pursuing unless we go past 'bit_count' millions of times
Not true, the 'x And= x-1' approach is blindingly fast.
deltarho[1859]
Posts: 4292
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: undefined reference to `__popcnt'

Post by deltarho[1859] »

As we get older, our short-term memory tends to go out of the window. I used the 'x And= x-1' method when developing the optimal Hamming Weight approach for seeding PRNGS back in June 2021. I got the tip from adeyblue and probably forgot because I was concentrating on Hamming and not a popcnt replacement.

'x And= x-1' is one of the neatest pieces of code that I have ever seen. Given a value of 9223372036854776321 how many times do you think we go past the While statement? Well, it is only three because that is the value of __popcnt. How about a value of 4091. Here we go past While eleven times because that is its value of __popcnt. :)
dodicat
Posts: 7976
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: undefined reference to `__popcnt'

Post by dodicat »

Nevertheless, using no loop at all beats looping.

Code: Select all

#cmdline "-gen gcc -O 2"
Function popcnt(x As Ulongint) As Ulong
      If x=18446744073709551615 Then Return 64
      x -= ((x Shr 1) And &h5555555555555555ull)
      x = (((x Shr 2) And &h3333333333333333ull) + (x And &h3333333333333333ull))
      x = (((x Shr 4) + x) And &hf0f0f0f0f0f0f0full)
      x += (x Shr 8)
      x += (x Shr 16)
      x+= (x Shr 32)
     return  x And &h0000003full
End function

Function __popcnt( x As ULongInt ) As ULongInt
Dim As ULongInt c
  While x<>0
    x And= x-1
    c+=1
  Wend
  Return c
End Function

dim as double t
var lim=10000000
#define i int(rnd*20000000)
var p=0

for k as long=1 to 5
      randomize 1
t=timer
for n as long=1 to lim
p=popcnt(i)
next
print timer-t,p,"no loop"
randomize 1
t=timer
for n as long=1 to lim
p=__popcnt(i)
next
print timer-t,p,"loop"
print
next k
sleep


 
deltarho[1859]
Posts: 4292
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: undefined reference to `__popcnt'

Post by deltarho[1859] »

dodicat wrote:Nevertheless, using no loop at all beats looping.
Not analytical enough. Both are sensitive to the size of the bit count. Looping is faster for a few bit counts – about less than 10.

Not many applications will be looking at 10 million computations. How many computations will it take to notice a performance difference?

Your popcnt is not exactly elegant, is it? :)
dodicat
Posts: 7976
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: undefined reference to `__popcnt'

Post by dodicat »

You said it deltarho
"Yours truly wrote:
but they are not worth pursuing unless we go past 'bit_count' millions of times
..
Not true, the 'x And= x-1' approach is blindingly fast."

Not me.
Anyway, anymore pop count my eyes will pop from their bottle stops.
I'll let the OP choose from all these methods.
deltarho[1859]
Posts: 4292
Joined: Jan 02, 2017 0:34
Location: UK
Contact:

Re: undefined reference to `__popcnt'

Post by deltarho[1859] »

Just out of curiosity I had a look at an asm version of the 'while' method. To cut a long story short, it was not worthwhile. For small bit counts it was the fastest method, but suffered as the bit count increased. Being Ulong based using 32-bit registers it, clearly, was not as versatile as the other methods above.

I checked dodicat's code and found an issue.

With p=popcnt(i) and __popcount(i) 'i' is determined from '#define i int(rnd*20000000)'. We have then a calculation inside the Timer calculation. Goodness knows what effect this will have on the timing, even though 'Randomize 1' is used to use the same random number sequence.

I rewrote the timing code to consider bit counts 1,2,4,8,16,and 32 with three tests each rather than five.

With the same value being calculated within a loop an optimized gcc will, but not always, rip the loop out, so I inserted 'Asm nop' to stop that. That does not always work. gcc have a pragma to stop selected sections being optimized, but fbc isn't using it. IT SHOULD !

So, what do we have now?

The 'while' method gets faster as the bit count reduces. dodicat's methd is not sensitive to the bit count.

dodicat's method is the faster, and noticeably so as the bit count increases.

Don't forget that we are using 'Var lim=10000000'. At a bit count of 32 dodicat's method is 10 times faster than the 'while' method. With a lim of 100 10 times faster is 10 times faster than virtually nothing which is – well – virtually nothing. I am reminded of someone who spent a whole afternoon getting a Function to execute twice as fast. When asked, how much of the application session time did the original take. After a few minutes, I was told: “About one percent”. I told him that he had wasted a whole afternoon, and he should profile and find something else to look at.

Code: Select all

#cmdline "-gen gcc -O 2"
Function popcnt(x As ULongInt) As ULong
      If x=18446744073709551615 Then Return 64
      x -= ((x Shr 1) And &h5555555555555555ull)
      x = (((x Shr 2) And &h3333333333333333ull) + (x And &h3333333333333333ull))
      x = (((x Shr 4) + x) And &hf0f0f0f0f0f0f0full)
      x += (x Shr 8)
      x += (x Shr 16)
      x+= (x Shr 32)
     Return  x And &h0000003full
End Function
 
Function __popcnt( x As ULongInt ) As ULongInt
Dim As ULongInt c
  While x<>0
    x And= x-1
    c+=1
  Wend
  Return c
End Function
 
Dim As Double t
Var lim=10000000
Dim BitCount(...) As ULong = {&b11111111111111111111111111111111, &b11101010001110101010100100101001 _
,&b1110101001100010, &b1010001000000010, &b101, &b010}
 
Var p=0
 
For j As Long = LBound(BitCount) To UBound(BitCount)
  For k As Long=1 To 3
    t=Timer
    For n As Long=1 To lim
    Asm nop
    p=popcnt(BitCount(j))
  Next
  Print Timer-t,p,"no loop"
  t=Timer
  For n As Long=1 To lim
    Asm nop
    p=__popcnt(BitCount(j))
    Next
    Print Timer-t,p,"loop"
    Print
  Next
Next
 
Sleep
Post Reply