Why is there a Known Compiler Bug from 2012 still in?

Manpcnin
Posts: 46
Joined: Feb 25, 2007 1:43
Location: N.Z.

Post by Manpcnin »

I just wasted a stupid amount of time trying to track down a bug in my code, only to find that it was a (known) compiler bug.

Pretty sure it's bug #572: https://sourceforge.net/p/fbc/bugs/572

I know FBC is free software written by volunteers, but it's still pretty frustrating.
Anyway, doing my bit to help: here is the offending code reduced to some easy (not minimal) test cases:

Code: Select all

'Compile with -fpu sse -vec 2
'#DEFINE TRY_NAMESPACE 'A namespace doesn't trigger the issue. Seems to be specific to UDTs
#Ifdef TRY_NAMESPACE
    namespace X
        dim as single h0,h1,h2
    end namespace
#ELSE
    type vec
        as single h0,h1,h2
    end type
    dim as vec X
#endif
dim as single f0,f1,f2
dim as double d0

X.h0 = 1f:X.h1 = 2f:X.h2 =3f

f0 = X.h0+X.h1+X.h2
f1 = X.h0+X.h2
f2 = X.h2+X.h1+X.h0
d0 = X.h0+X.h1+X.h2
print X.h0;" +";X.h1;" +";X.h2;" =",f0    'in_order Expected:6, Got: 1
print X.h0;" +";X.h2;" =",f1                'Skipped Expected:4, Got: 4
print X.h2;" +";X.h1;" +";X.h0;" =",f2    'inverted Expected:6, Got: 6
print X.h0;" +";X.h1;" +";X.h2;" =",d0    'Double Expected:6, Got: 1
print X.h0;" +";X.h1;" +";X.h2;" =",X.h0+X.h1+X.h2 'in_order_but_passed_to_a_function Expected:6, Got: 6
sleep:end
What's really bad about this bug is how silent it is. Unless you are doing FP math to calculate pointers or some other shenanigans that might segfault (texture lookup might actually do that, so it's not that far-fetched), you might never even notice that something is wrong. It'll just give you the wrong result. You'd have to check your computer's math. And how many people do that?

Oh yeah, tested with fbc-1.04.0-win32 and fbc-1.07.1-win64. The changelog for 1.07.2/1.07.3 doesn't mention a vector optimization bug fix, and issue 572 is still open on SourceForge, so I didn't bother downloading them to test.
counting_pine
Site Admin
Posts: 6323
Joined: Jul 05, 2005 17:32
Location: Manchester, Lancs

Re: Why is there a Known Compiler Bug from 2012 still in?

Post by counting_pine »

Sorry for the time this bug has cost you.
The vectorisation changes were added a long time ago by Bryan Stoeberl, and I don't think he's been around for some time.

We should perhaps deprecate the -vec functionality and mark it as unstable/unmaintained.
It's perhaps obsolete now with the gcc emitter, which should, I believe, do vectorisation optimisations with -gen gcc -O3.
coderJeff
Site Admin
Posts: 4326
Joined: Nov 04, 2005 14:23
Location: Ontario, Canada

Re: Why is there a Known Compiler Bug from 2012 still in?

Post by coderJeff »

@Manpcnin, thanks for the post with additional information. Yeah, the original author has not been around in 10+ years, and we're not experts in everything. Usually, when I go back to those old bugs, I have to do a lot of self-teaching before I can even begin to understand what's going on. I think I understand what the algorithm is trying to do / supposed to do.

The vectorisation changes predate the gcc/llvm backends. The vectorisation should only be applied to -gen gas -fpu sse; it makes no sense for gcc/llvm. This -vec N AST optimization should never be applied to gcc/llvm. Neither of those backends knows what to do with it, and so bad code gets generated even in gcc. (Internally the AST node is an AST_OP_HADD, which is only implemented for gas + x86 + sse.)

I have a couple fixes to apply later today:
- disable vectorisation option for gcc/llvm, they don't do anything with it.
- I think I found a fix for the -vec 2 bug. I have written many notes in the source; after 15+ hrs it's like 2 changed lines, lol.
- Expand the tests in ./tests/optimizations/vector.bas (still todo)
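
A minimal sketch of what a self-checking version of the single-precision test case from the first post could look like. This is not the contents of ./tests/optimizations/vector.bas; the type name, variable names and file name are made up for illustration, and the expected values are simply the ones stated in the original report. On an unaffected build it should report zero failures.

```freebasic
' Compile with: fbc -gen gas -fpu sse -vec 2 hadd_check.bas   (file name is hypothetical)
Type vec3
    As Single h0, h1, h2
End Type

Dim As vec3 X
Dim As Single inOrder, skipped, inverted
Dim As Double asDouble
Dim As Integer failures = 0

X.h0 = 1 : X.h1 = 2 : X.h2 = 3

inOrder  = X.h0 + X.h1 + X.h2   ' reported as 1 instead of 6 on affected builds
skipped  = X.h0 + X.h2          ' reported as correct (4)
inverted = X.h2 + X.h1 + X.h0   ' reported as correct (6)
asDouble = X.h0 + X.h1 + X.h2   ' reported as 1 instead of 6 on affected builds

' Small integers are exactly representable, so exact comparison is safe here.
If inOrder  <> 6 Then failures += 1
If skipped  <> 4 Then failures += 1
If inverted <> 6 Then failures += 1
If asDouble <> 6 Then failures += 1

Print "failures:"; failures
```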

@counting_pine, I agree,
- mark it in the documentation as unmaintained, only applies to gas + x86 + SSE
- remove from 'fbc -help', and move to 'fbc -help -v' with the unmaintained warning.

Because, even using only -fpu sse (without -vec), there are other bugs I notice with SSE code gen. There appears to be loss of precision in some conversions, maybe only to ulongint? Not 100% sure. The SSE code emitter uses the x87 processor for conversions. I fear it will take many, many hours, and I feel like my volunteer hours could be better spent on other areas.
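
One way to start narrowing down the suspected conversion problem is a round-trip check. The sketch below is an assumption about what such a probe might look like, not a confirmed reproduction: the helper name RoundTripOk and the sample values are invented for illustration. All samples are powers of two that a Double holds exactly, so any mismatch would point at the conversion rather than at rounding.

```freebasic
' Build twice and compare the output, for example:
'   fbc -fpu x87 conv_check.bas
'   fbc -fpu sse conv_check.bas        (file name is hypothetical)

' Converts d to ULongInt and back; True when nothing was lost on the way.
Function RoundTripOk(ByVal d As Double) As Boolean
    Dim As ULongInt u = CULngInt(d)
    Return CDbl(u) = d
End Function

' 1, 2^32, 2^53 and 2^63: all exactly representable as Double and as ULongInt.
Dim As Double samples(0 To 3) = {1.0, 4294967296.0, 9007199254740992.0, 9223372036854775808.0}

For i As Integer = 0 To UBound(samples)
    Print samples(i), CULngInt(samples(i)), RoundTripOk(samples(i))
Next
```

If a row only fails under one of the two builds, that row is a candidate for a reduced bug report.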
Manpcnin
Posts: 46
Joined: Feb 25, 2007 1:43
Location: N.Z.

Re: Why is there a Known Compiler Bug from 2012 still in?

Post by Manpcnin »

Removing support for -vec 2 would solve it I guess, if it's too much work to fix it.
If the new backend is better, why bother.
Maybe it's time for me to switch to gcc? :(
I've been using GAS for a really long time. I get really poor performance from GCC. I don't know why.
I just tested the code I'm working on right now, and a performance critical part went from 80ms to 530ms. OUCH!

Any ideas on why the GCC emitted code would perform so much worse?

*EDIT*

Ok I tested with the 32 bit version ( 1.07.2 -gen gcc [-o 3] ) and it's nowhere near as slow as "-gen gcc" 64bit. 115ms. *ONLY* 40% slower, instead of 560% slower.
And you may be right about other sse optimization bugs still being in, even without -vec 2.
I don't have a test case for it (other than my entire program, lol) at this point; but I got noticeable artifacts in my output when compiled with "-gen gcc -fpu sse -vec 1"
Last edited by Manpcnin on Jan 11, 2021 22:18, edited 1 time in total.
MrSwiss
Posts: 3910
Joined: Jun 02, 2013 9:27
Location: Switzerland

Re: Why is there a Known Compiler Bug from 2012 still in?

Post by MrSwiss »

Try it with optimisations: -gen gcc -O 2 (capitalized "O", doesn't work with FBIDE!)
Don't use -O 3 (or higher) because then, -vec 2 is back 'in play'.
caseih
Posts: 2157
Joined: Feb 26, 2007 5:32

Re: Why is there a Known Compiler Bug from 2012 still in?

Post by caseih »

Is there a problem with GCC's -O3 optimization? I ran his test code with GCC and -O 3 and it works fine. Maybe I'm not reading this correctly but the -vec 2 issue only applies to the gas backend, does it not?
MrSwiss
Posts: 3910
Joined: Jun 02, 2013 9:27
Location: Switzerland

Re: Why is there a Known Compiler Bug from 2012 still in?

Post by MrSwiss »

Not really certain except: on GCC's website it states that only up to -O 2 is considered 'production code ready'.
Whatever that may mean to say ...
AFAIK, FB-devs also use -O 2 (to compile FB's internal libraries).
Last edited by MrSwiss on Jan 11, 2021 22:27, edited 1 time in total.
coderJeff
Site Admin
Posts: 4326
Joined: Nov 04, 2005 14:23
Location: Ontario, Canada

Re: Why is there a Known Compiler Bug from 2012 still in?

Post by coderJeff »

caseih wrote:Is there a problem with GCC's -O3 optimization? I ran his test code with GCC and -O 3 and it works fine. Maybe I'm not reading this correctly but the -vec 2 issue only applies to the gas backend, does it not?
The problem occurs when using '-fpu sse -vec 2' command line options with any backend. The compiler incorrectly tries to do vector optimizations on all backends when '-vec 2' should only be applied to gas+x86+sse. So when using '-gen gcc -fpu sse -vec 2' there's bad code generation.

'-vec N' should have no effect on '-gen gcc' and is a completely separate option from '-O N' (optimize level N).
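
Until a fixed compiler is available, a program can guard itself against being built with an unsupported combination. The sketch below relies on the compiler's intrinsic defines __FB_VECTORIZE__, __FB_BACKEND__ and __FB_FPU__; the policy it enforces (reject -vec on anything but gas) is just one reading of the explanation above, not something the project prescribes.

```freebasic
' Stop the build when -vec N is combined with a backend other than gas.
#if __FB_VECTORIZE__ > 0
    #if __FB_BACKEND__ = "gas"
        #print Note: building with -vec, check floating-point results carefully
    #else
        #error -vec N is only meaningful with -gen gas -fpu sse
    #endif
#endif

' Report the options this executable was actually built with.
Print "backend   : "; __FB_BACKEND__
Print "fpu       : "; __FB_FPU__
Print "vectorize : "; __FB_VECTORIZE__
```

Printing the settings at startup also makes it easier to attach the exact build configuration to a bug report.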
coderJeff
Site Admin
Posts: 4326
Joined: Nov 04, 2005 14:23
Location: Ontario, Canada

Re: Why is there a Known Compiler Bug from 2012 still in?

Post by coderJeff »

Manpcnin wrote:And you may be right about other sse optimization bugs still being in, even without -vec 2.
I think warning the user in docs and '-help -v' is still not a bad idea.

So that it isn't dropped completely, the bug reports really do help. And if you want to help out the project, narrowing down the test case is helpful. If you happen to understand what's supposed to happen and can read the assembly to narrow it down, even better.

I just finished writing the new tests for the single precision horizontal add optimization and they all pass, so that's a good sign.
Manpcnin
Posts: 46
Joined: Feb 25, 2007 1:43
Location: N.Z.

Re: Why is there a Known Compiler Bug from 2012 still in?

Post by Manpcnin »

@MrSwiss Thanks for the tip about capital O and FBIDE! I was using that. Compiling from command line doesn't help though. It just reveals another bug in "-O 2" & "-O 3" optimization... OTL

I'm getting bad code with "-gen gcc -O 2" and no other parameters. "-gen gcc -O 1" seems fine. No test-cases yet

"fbc-1.07.2-win32 file.bas -gen gcc -O 1"
Image
"fbc-1.07.2-win32 file.bas -gen gcc -O 2"
Image

Something's definitely #%$@. (And I'm sure it's bad optimization of my vector cross product code. All the calculated surface normals are dead. ( the water just loads a vec3 "UP" constant.))
counting_pine
Site Admin
Posts: 6323
Joined: Jul 05, 2005 17:32
Location: Manchester, Lancs

Re: Why is there a Known Compiler Bug from 2012 still in?

Post by counting_pine »

I don’t want to make any assumptions here, but it’s possible that errors with -O2 could be that something in the program is causing Undefined Behaviour, which works as desired in -O1 but not in -O2.
Something harder to detect, like a buffer overrun clobbering an important part of memory.
(Just speculating/offering an alternative theory here..)
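
To illustrate the kind of overrun meant here, the sketch below guards a deliberately off-by-one loop with an explicit bounds test. The helper InBounds and the array name are invented for the example. Building the real program with fbc's -exx option, which enables runtime array bounds checking, is another way to catch such a write without changing the source.

```freebasic
' True when i is a valid index for the one-dimensional array a().
Function InBounds(a() As Single, ByVal i As Integer) As Boolean
    Return (i >= LBound(a)) AndAlso (i <= UBound(a))
End Function

Dim As Single normals(0 To 3)
Dim As Integer rejected = 0

' Deliberate off-by-one: the loop runs to 4, one past the last element.
For i As Integer = 0 To 4
    If InBounds(normals(), i) Then
        normals(i) = 1
    Else
        ' An unchecked write here could clobber neighbouring memory and
        ' only show up as wrong results at some optimisation levels.
        rejected += 1
    End If
Next

Print "rejected writes:"; rejected
```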
dodicat
Posts: 7983
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: Why is there a Known Compiler Bug from 2012 still in?

Post by dodicat »

I have some cross products here.

Code: Select all

Type v3
    As Single x,y,z
    As Ulong col
    flag As Long
    Declare Function length As Single
    Declare Function unit As v3
End Type

Type Line
    As v3 v1,v2
End Type
#define cross ^
#define dot *
Operator + (Byref v1 As v3,Byref v2 As v3) As v3
Return Type(v1.x+v2.x,v1.y+v2.y,v1.z+v2.z)
End Operator
Operator -(Byref v1 As v3,Byref v2 As v3) As v3
Return Type(v1.x-v2.x,v1.y-v2.y,v1.z-v2.z)
End Operator
Operator * (Byval f As Single,Byref v1 As v3) As v3
Return Type(f*v1.x,f*v1.y,f*v1.z)
End Operator
Operator * (Byref v1 As v3,Byref v2 As v3) As Single 
Return v1.x*v2.x+v1.y*v2.y+v1.z*v2.z
End Operator
Operator ^ (Byref v1 As v3,Byref v2 As v3) As v3 
Return Type(v1.y*v2.z-v2.y*v1.z,-(v1.x*v2.z-v2.x*v1.z),v1.x*v2.y-v2.x*v1.y)
End Operator
Operator <>(Byref v1 As V3,Byref v2 As V3) As Integer
Return (v1.x<>v2.x) Or (v1.y<>v2.y)
End Operator

Function v3.length As Single
    Return Sqr(x*x+y*y+z*z)
End Function

Function v3.unit As v3
    Dim n As Single=length
    If n=0 Then n=1e-20
    Return Type(x/n,y/n,z/n)
End Function

Type _float As V3

Dim Shared As Const v3 eyepoint=Type(512,768\2,600)
#define map(a,b,x,c,d) ((d)-(c))*((x)-(a))/((b)-(a))+(c)
#define incircle(cx,cy,radius,x,y) (cx-x)*(cx-x) +(cy-y)*(cy-y)<= radius*radius
'<><><><><><><><><><><> Quick SORT <><><><><><><><><><>
#define up <,>
#define down >,<
#macro SetQsort(datatype,fname,b1,b2,dot)
Sub fname(array() As datatype,begin As Long,Finish As Ulong)
    Dim As Long i=begin,j=finish
    Dim As datatype x =array(((I+J)\2))
    While  I <= J
        While array(I)dot b1 X dot:I+=1:Wend
        While array(J)dot b2 X dot:J-=1:Wend
        If I<=J Then Swap array(I),array(J): I+=1:J-=1
    Wend
    If J > begin Then fname(array(),begin,J)
    If I < Finish Then fname(array(),I,Finish)
End Sub
#endmacro
        
        Sub GetCircle(xm As Single, ym As Single,zm As Single, r As Integer,p() As v3)
            #define CIRC(r)  ( ( Int( (r)*(1 + Sqr(2)) ) - (r) ) Shl 2 )
            Dim As Long x = -r, y = 0, e = 2 - r Shl 1,count
            Redim p(1 To CIRC(r)+4 )
            Do
                count+=1:p(count)=Type<v3>(xm-x, ym+y,zm)
                count+=1:p(count)=Type<v3>(xm-y, ym-x,zm)
                count+=1:p(count)=Type<v3>(xm+x, ym-y,zm)
                count+=1:p(count)=Type<v3>(xm+y, ym+x,zm)
                r = e
                If r<=y Then
                    y+=1
                    e+=y Shl 1+1
                End If
                If r>x Or e>y Then
                    x+=1
                    e+=x Shl 1+1
                End If
            Loop While x<0
            Redim Preserve p(1 To count-1)
        End Sub
        
        Sub RotateArray(wa() As V3,result() As V3,angle As _float,centre As V3,flag As Long=0)
            Dim As Single dx,dy,dz,w
            Dim As Single SinAX=Sin(angle.x)
            Dim As Single SinAY=Sin(angle.y)
            Dim As Single SinAZ=Sin(angle.z)
            Dim As Single CosAX=Cos(angle.x)
            Dim As Single CosAY=Cos(angle.y)
            Dim As Single CosAZ=Cos(angle.z)
            Redim result(Lbound(wa) To Ubound(wa))
            For z As Long=Lbound(wa) To Ubound(wa)
                dx=wa(z).x-centre.x
                dy=wa(z).y-centre.y
                dz=wa(z).z-centre.z
                Result(z).x=((Cosay*Cosaz)*dx+(-Cosax*Sinaz+Sinax*Sinay*Cosaz)*dy+(Sinax*Sinaz+Cosax*Sinay*Cosaz)*dz)+centre.x
                result(z).y=((Cosay*Sinaz)*dx+(Cosax*Cosaz+Sinax*Sinay*Sinaz)*dy+(-Sinax*Cosaz+Cosax*Sinay*Sinaz)*dz)+centre.y
                result(z).z=((-Sinay)*dx+(Sinax*Cosay)*dy+(Cosax*Cosay)*dz)+centre.z
                #macro perspective()
                w = 1 + (result(z).z/eyepoint.z)
                result(z).x = (result(z).x-eyepoint.x)/w+eyepoint.x 
                result(z).y = (result(z).y-eyepoint.y)/w+eyepoint.y 
                result(z).z = (result(z).z-eyepoint.z)/w+eyepoint.z
                #endmacro
                If flag Then: perspective():End If
                result(z).col=wa(z).col
                result(z).flag=wa(z).flag
            Next z
        End Sub
        
        Sub inc(a() As v3,b() As v3,clr As Ulong) 'increment an array
            Var u=Ubound(a)
            Redim Preserve a(1 To u+ Ubound(b)) 
            For n As Long=1 To Ubound(b)
                b(n).col=clr
                a(u+n)= b(n)
            Next n
        End Sub
        
        Sub createdisc(xc As Single,yc As Single,zc As Single,rad As Long,d() As v3)'ends
            Redim d(1 To 4*rad^2)
            Dim As Long ctr
            For x As Long=xc-rad To xc+rad
                For y As Long=yc-rad To yc+rad  
                    If incircle(xc,yc,rad,x,y) Then
                        ctr+=1
                        d(ctr)=Type(x,y,zc,0,1)
                    End If
                Next y
            Next x
            Redim Preserve d(1 To ctr)     
        End Sub
        
        Function segment_distance( l As Line, p As v3, ip As v3=Type(0,0,0)) As Single
            Var s=l.v1,f=l.v2
            Dim As Single linelength=(s-f).length
            Dim As Single dist= ((1/linelength)*((s-f) cross (p-s))).length
            Dim As Single lpf=(p-f).length,lps=(p-s).length
            If lps >= lpf Then
                Var temp=Sqr(lps*lps-dist*dist)/linelength
                If temp>=1 Then temp=1:dist=lpf
                ip=s+(temp)*(f-s)
                Return dist
            Else
                Var temp=Sqr(lpf*lpf-dist*dist)/linelength
                If temp>=1 Then temp=1:dist=lps
                ip=f+(temp)*(s-f)
                Return dist
            End If
            Return dist
        End Function
        
        Function Regulate(Byval MyFps As Long,Byref fps As Long=0) As Long
            Static As Double timervalue,_lastsleeptime,t3,frames
            Var t=Timer
            frames+=1
            If (t-t3)>=1 Then t3=t:fps=frames:frames=0
            Var sleeptime=_lastsleeptime+((1/myfps)-T+timervalue)*1000
            If sleeptime<1 Then sleeptime=1
            _lastsleeptime=sleeptime
            timervalue=T
            Return sleeptime
        End Function
        '======================== set up ============= 
        
        Screen 20,32
        
        Dim As Any Ptr i=Imagecreate(1024,768)
         For n As Long=0 To 768
        Var red=map(768,0,n,0,255)
        Var green=map(768,0,n,0,255)
        Var blue=map(768,0,n,100,255)
        Line i,(0,n)-(1024,n),Rgb(red,green,blue)
        Next
        
        Redim As v3 e1(),e2() 'ends
        Redim As v3 c(),a(0)  'cylinder
        
        For z As Long=-200 To 200 'fill cylinder
            getcircle(512,768\2,z,20,c())
            inc(a(),c(),Rgb(0,200,0))
        Next
        
        createdisc(512,768\2,-201,18,e1()) 'ends
        createdisc(512,768\2, 201,18,e2())
        inc(a(),e1(),Rgb(155,50,0))  'add them to the array
        inc(a(),e2(),Rgb(0,50,155))
        Dim As v3 L(1 To 2)={Type(512,768\2,-205),Type(512,768\2,205)}'ends of central axis
        inc(a(),L(),0) 'add them to array
        
        SetQsort(V3,QsortZ,down,.z)'initiate quicksort
        
        Redim As v3 result()'working array
        Dim As Single ang
        Dim As Single r,g,b,rad,dt
        Dim As v3 light=Type(512,-1000,0)
        Dim As v3 ip 
        Dim As Line ln
        Dim As Long fps
        Do
            ang+=.015
            RotateArray(a(),result(),Type<_float>(1.2*ang,2*ang,ang),Type(512,768\2,0),1)
            Qsortz(result(),Lbound(result),Ubound(result)-2)
            Screenlock
            Cls
            put(0,0),i,pset
            Draw String(20,20),"FPS " &fps,0
            For n As Long=Lbound(result) To Ubound(result)-2
                If result(n).flag=0 Then 'curved bit shader
                    Dim As v3 d=Type(result(n).x-light.x,result(n).y-light.y,result(n).z-light.z)'point to light
                    ln=Type<Line>(result(Ubound(result)-1),result(Ubound(result))) 'the central cylinder axis (line)
                    segment_distance(ln,result(n),ip) 'need ip (intercept of central axis)
                    Dim As v3 c=Type(result(n).x-ip.x,result(n).y-ip.y,result(n).z-ip.z)  'cylinder normals at point
                    Var q=c.unit dot d.unit        'shade by dot product
                    dt=map(-1,1,q,1,0)             'map dot product to [1,0]    
                    r=Cast(Ubyte Ptr,@result(n).col)[2]*dt
                    g=Cast(Ubyte Ptr,@result(n).col)[1]*dt
                    b=Cast(Ubyte Ptr,@result(n).col)[0]*dt
                Else 'ends
                    dt=map(600,200,result(n).y,.3,1) 'shade by .y
                    r=Cast(Ubyte Ptr,@result(n).col)[2]*dt
                    g=Cast(Ubyte Ptr,@result(n).col)[1]*dt
                    b=Cast(Ubyte Ptr,@result(n).col)[0]*dt  
                End If
                
                rad=map(-200,200,result(n).z,2,1) 
                Circle(result(n).x,result(n).y),rad,Rgb(r,g,b),,,,f
            Next n
            
            Screenunlock
            Sleep regulate(60,fps)
        Loop Until Inkey=Chr(27)
        imagedestroy i
        Sleep
        
        
         
My results

Code: Select all


32 bit compiler
-gen gcc 21 fps
gcc O1 26 fps
gcc O2 28 fps
gcc O3 36 fps
fpu sse -vec2 28 fps
-gen gas 27 fps



64 bit compiler
-gen gcc 29 fps
gcc O1 39 fps
gcc O2 37 fps
gcc O3 48 fps
fpu sse -vec2 30 fps  GRAPHICS FAIL ON CYLINDER!


 
Why is your -O2 slower than your O1??
Or is your time elapsed not a measure of speed?

Win 10 64 bits.
D.J.Peters
Posts: 8586
Joined: May 28, 2005 3:28

Re: Why is there a Known Compiler Bug from 2012 still in?

Post by D.J.Peters »

@Manpcnin
do you use double or float (single) ?
do you run 32-bit exe on 64-bit windows ?
are the 3D data created on the fly or loaded from file ?
can you upload the code ?

Joshy
TeeEmCee
Posts: 375
Joined: Jul 22, 2006 0:54
Location: Auckland

Re: Why is there a Known Compiler Bug from 2012 still in?

Post by TeeEmCee »

The -O 2 build which gives faulty results and is also slower may be because denormals, NaNs or infinities are creeping into the calculations, which can greatly slow down some instructions due to causing the CPU to use a slow path. Maybe this is happening due to differences in rounding or precision rather than undefined behaviour or a compiler bug.

But gcc's -O 2 doesn't enable -ffast-math (non-standards-compliant calculation optimisations), which can often be blamed for such problems.

I see that Jeff commented on sf.net bug #458 about double vs long double precision, maybe because he was thinking about this. One difference between the gcc and gas backends is how intermediate results are rounded, because on x87 they are kept at 80-bit precision unless spilled to memory. To test whether this is the problem, you could pass "-fpu sse" and see whether the difference between "-gen gcc -O 1" and "-gen gcc -O 2" persists.
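
To check whether NaNs, infinities or denormals really are creeping in, the intermediate arrays can be scanned directly. The sketch below inspects the IEEE-754 bit pattern of each Single instead of relying on floating-point comparisons, so the check itself does not depend on how the backend compiles comparisons. All function names are invented for this example.

```freebasic
' True when x is neither NaN nor +/-Inf (exponent field is not all ones).
Function IsFiniteSingle(ByVal x As Single) As Boolean
    Dim As ULong bits = *CPtr(ULong Ptr, @x)
    Return ((bits Shr 23) And &hFF) <> &hFF
End Function

' True when x is a denormal (exponent field zero, mantissa non-zero).
Function IsDenormalSingle(ByVal x As Single) As Boolean
    Dim As ULong bits = *CPtr(ULong Ptr, @x)
    Return (((bits Shr 23) And &hFF) = 0) AndAlso ((bits And &h7FFFFF) <> 0)
End Function

' Counts NaN, Inf and denormal entries in a one-dimensional array.
Function CountSuspicious(a() As Single) As Integer
    Dim As Integer n = 0
    For i As Integer = LBound(a) To UBound(a)
        If (Not IsFiniteSingle(a(i))) OrElse IsDenormalSingle(a(i)) Then n += 1
    Next
    Return n
End Function

' Sample data: one ordinary value, +Inf and a quiet NaN built from bit patterns.
Dim As ULong infBits = &h7F800000, nanBits = &h7FC00000
Dim As Single samples(0 To 2)
samples(0) = 1.5
samples(1) = *CPtr(Single Ptr, @infBits)
samples(2) = *CPtr(Single Ptr, @nanBits)

Print "suspicious values:"; CountSuspicious(samples())
```

Calling such a counter after each stage of the calculation, under both -O 1 and -O 2, would show where the first bad value appears.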
Manpcnin
Posts: 46
Joined: Feb 25, 2007 1:43
Location: N.Z.

Re: Why is there a Known Compiler Bug from 2012 still in?

Post by Manpcnin »

Sorry I got too busy to focus on coding this week, but I finally dove into my program and found a small piece of code that can replicate some of the weirdness.

Code: Select all

const size = 20
dim shared as single hMap(size,size),nMap(size,size)

for j as integer = 0 to size
    for i as integer = 0 to size
        hmap(i,j)=rnd*10
    next i
next j

dim as single l,j0,j1,j2,i0,i1,i2

j0 = 0
for j1 = 1 to size-1
    i0 = 0:j2 = j1+1
    for i1 = 1 to size-1
    i2 = i1+1
        nMap(i1,j1) = hMap(i0,j0)+hMap(i1,j1) + hMap(i2,j2)			'Error is probably generated from this code.
        'nMap(i1,j1) = hMap(i1-1,j1-1)+hMap(i1,j1) + hMap(i2+1,j2+1)    '<= This, instead of the above, generates working code under gen gcc.
        'print nMap(i1,j1)                                              '<= un-commenting the print ALSO prevents the weird behavior
        i0 = i1
    next i1
    j0 = j1
next j1
for j1 = 1 to size step 5 'print a sample of the array
    for i1 = 1 to size step 5
        print nMap(i1,j1)
    next i1
next j1
print "done"
sleep: end
Yes, there's no reason to use singles for those variables, and I've switched to ints in my actual program, which fixes the problem. BUT there's still a problem here. There is no undefined behavior, but this code fails and crashes under "-gen gcc -O 1" (and -O 2 / -O 3), and was obviously the reason NaNs and INFs were messing up my previous code.
And adding a print REALLY shouldn't affect the emitted ASM in any meaningful way, yet it does.
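
For comparison, here is a sketch of the same loop with Integer index variables, which is the change the post above says avoided the problem in the real program. It is an illustration only, not a fix for the compiler issue. One deliberate difference from the test case: the height map is filled with a deterministic linear pattern instead of rnd, an assumption made so the result can be checked exactly (for a linear fill, each output cell equals three times the centre cell).

```freebasic
Const size = 20
Dim Shared As Single hMap(size, size), nMap(size, size)

' Deterministic fill (assumption for this sketch; the original used rnd*10).
For j As Integer = 0 To size
    For i As Integer = 0 To size
        hMap(i, j) = i + j
    Next i
Next j

' Same traversal as the test case, but every index is an Integer.
Dim As Integer i0, i2, j0, j2
j0 = 0
For j1 As Integer = 1 To size - 1
    i0 = 0 : j2 = j1 + 1
    For i1 As Integer = 1 To size - 1
        i2 = i1 + 1
        nMap(i1, j1) = hMap(i0, j0) + hMap(i1, j1) + hMap(i2, j2)
        i0 = i1
    Next i1
    j0 = j1
Next j1

' Print a sample of the result next to the value expected for the linear fill.
For j1 As Integer = 1 To size - 1 Step 5
    For i1 As Integer = 1 To size - 1 Step 5
        Print nMap(i1, j1), 3 * hMap(i1, j1)
    Next i1
Next j1
```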