using gcc quadmath

General FreeBASIC programming questions.
srvaldez
Posts: 2344
Joined: Sep 25, 2005 21:54

using gcc quadmath

Postby srvaldez » Aug 01, 2017 13:35

[edit] see viewtopic.php?p=270387#p270387
I have tried this before and failed to get it to work in 32-bit, but the topic at http://masm32.com/board/index.php?topic=6412.0 inspired me to try again.
it works OK in 64-bit but not in 32-bit; I tried all variations of compile options but it always fails. this is a puzzle that has me stumped and I would like to know the solution.
here's a minimal test example

Code: Select all

#inclib "quadmath-0" 'you need libquadmath-0.dll

type float128
   as long f(0 to 3) '__float128 is 16 bytes in both 32 and 64 bit
end type

declare function addf128 cdecl alias "__addtf3" (byval a as float128, byval b as float128) as float128
declare function subf128 cdecl alias "__subtf3" (byval a as float128, byval b as float128) as float128
declare function mulf128 cdecl alias "__multf3" (byval a as float128, byval b as float128) as float128
declare function divf128 cdecl alias "__divtf3" (byval a as float128, byval b as float128) as float128

declare function strtoflt128 cdecl alias "strtoflt128" (byval as zstring ptr, byval as byte ptr ptr) as float128
declare function quadmath_snprintf cdecl alias "quadmath_snprintf" (byval st as zstring ptr, byval size as integer, byval form as zstring ptr, ...) as long

dim as float128 x
dim as zstring ptr s= allocate(256)

x = strtoflt128 ("3.1415926535897932384626433832795029Q", 0)
'x = addf128(x, x)
for i as long =3 to 0 step -1
   print hex(x.f(i),8),
next
print
quadmath_snprintf (s, 256, "%+-#46.*Qe", 33, @x)
print *s

deallocate(s)
sleep

here's the 64-bit output

Code: Select all

4000921F      B54442D1      8469898C      C51701B8
+3.141592653589793238462643383279503e+00

the 32-bit output

Code: Select all

4000921F      B54442D1      8469898C      C51701B8
+4.004067261107258206686834059245884e-4913

strtoflt128 works in both versions, as you can see from the identical hex output, but everything else fails miserably in 32-bit
Last edited by srvaldez on Apr 07, 2020 0:10, edited 1 time in total.
dodicat
Posts: 6390
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: using gcc quadmath

Postby dodicat » Aug 01, 2017 15:41

I think there is a bug somewhere.
If you dim a string before printing the hex numbers, you get a different answer from quadmath_snprintf on each run

Code: Select all

#inclib "quadmath-0" 'you need libquadmath-0.dll

type float128
   as long f(0 to 3) '__float128 is 16 bytes in both 32 and 64 bit
end type

declare function addf128 cdecl alias "__addtf3" (byval a as float128, byval b as float128) as float128
declare function subf128 cdecl alias "__subtf3" (byval a as float128, byval b as float128) as float128
declare function mulf128 cdecl alias "__multf3" (byval a as float128, byval b as float128) as float128
declare function divf128 cdecl alias "__divtf3" (byval a as float128, byval b as float128) as float128

declare function strtoflt128 cdecl alias "strtoflt128" (byval as zstring ptr, byval as byte ptr ptr) as float128
declare function quadmath_snprintf cdecl alias "quadmath_snprintf" (byval st as zstring ptr, byval size as integer, byval form as zstring ptr, ...) as long

dim as float128 x
dim as zstring ptr s= allocate(256)

x = strtoflt128 ("3.1415926535897932384626433832795029Q", 0)

dim as string a '<----------- HERE
'x = addf128(x, x)
for i as long =3 to 0 step -1
   print hex(x.f(i),8),
next
print
print a

quadmath_snprintf (s, 256, "%+-#46.*Qe", 33, @x)
print *s

deallocate(s)
sleep

I have seen this type of bug before.
I just cannot remember how to fix it.
srvaldez
Posts: 2344
Joined: Sep 25, 2005 21:54

Re: using gcc quadmath

Postby srvaldez » Aug 01, 2017 15:53

thank you for looking into it. it's strange that it works without problems with FB-win-64, but the same example won't run on OS X or Linux.
one could probably make a wrapper in C, but then why not use another quad library that's more friendly?
fxm
Posts: 9559
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

Re: using gcc quadmath

Postby fxm » Aug 01, 2017 16:45

srvaldez wrote:I have tried this before and failed to get it to work in 32-bit, ...

For people who are interested in this problem, the previous topic is here:
quadmath lib works with fb-64 not with fb-32
dodicat
Posts: 6390
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: using gcc quadmath

Postby dodicat » Aug 01, 2017 17:06

using the masm32 link from srvaldez
I got it running in C 32 bits. (First example)

Code: Select all


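#include <C:\MinGW\lib\gcc\mingw32\4.7.2\include\quadmath.h>
#include <stdio.h>
#include <conio.h>   // for _getch()
// OPT_Linker C:\MinGW\lib\gcc\mingw32\4.8.1\libquadmath-0.dll (Dodicat ---not needed, used another folder for the dll)

int main ()
{
  __float128 r;
  char buf[128];

  r = strtoflt128("3.1415926535897932384626433832795029", NULL);
  // printf has no conversion for __float128 (passing r to "%s" is undefined behaviour),
  // so format the value into a buffer with quadmath_snprintf first
  quadmath_snprintf(buf, sizeof buf, "%+-#46.*Qe", 33, r);
  printf("%s\n", buf);
  // r = 13.1415926535897932384626433832795029;   // OK

  _getch();
  return 0;
}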

 
srvaldez
Posts: 2344
Joined: Sep 25, 2005 21:54

Re: using gcc quadmath

Postby srvaldez » Aug 01, 2017 17:50

I strongly suspect that this is an alignment problem. does anyone have experience in this area?
fxm
Posts: 9559
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

Re: using gcc quadmath

Postby fxm » Aug 01, 2017 18:35

Off chance:
dim as zstring ptr s = callocate(256+1)
srvaldez
Posts: 2344
Joined: Sep 25, 2005 21:54

Re: using gcc quadmath

Postby srvaldez » Aug 01, 2017 18:58

I am not sure, fxm. I tried your suggestion but the result did not change.
here's the output on OS X FBx64

Code: Select all

4000921F      B54442D1      0000921F      B54442D1     
+1.681023999950894566514408287245474e-4932   

for reference the win-64 output

Code: Select all

4000921F      B54442D1      8469898C      C51701B8
+3.141592653589793238462643383279503e+00

notice the overlapping, but the structure is 16 bytes, so why this strange behavior?
jj2007
Posts: 1403
Joined: Oct 23, 2016 15:28
Location: Roma, Italia

Re: using gcc quadmath

Postby jj2007 » Aug 01, 2017 19:11

Hi everybody,
I am the author of MasmBasic. The main problem here, as I see it, is the very awkward way GCC passes parameters. Here is a "simple" example that just does a+b:

Code: Select all

CPU Disasm
Address         Hex dump                 Command                             Comments
0040139A        │.  90                   nop
0040139B        │.  CC                   int3
0040139C        │.  C705 20504000 5AFE70 mov dword ptr [405020], 7C70FE5A    ; load numSmall
004013A6        │.  C705 24504000 D91BDD mov dword ptr [405024], 74DD1BD9
004013B0        │.  C705 28504000 6B7E48 mov dword ptr [405028], 87487E6B
004013BA        │.  C705 2C504000 800E1D mov dword ptr [40502C], 401D0E80
004013C4        │.  90                   nop
004013C5        │.  C705 30504000 5AFE70 mov dword ptr [405030], 7C70FE5A    ; load NumBig
004013CF        │.  C705 34504000 D91BDD mov dword ptr [405034], 74DD1BD9
004013D9        │.  C705 38504000 6B7E48 mov dword ptr [405038], 0B487E6B
004013E3        │.  C705 3C504000 58261D mov dword ptr [40503C], 401D2658
004013ED        │.  90                   nop
004013EE        │.  A1 20504000          mov eax, [405020]                   ; numSmall
004013F3        │.  894424 40            mov [esp+40], eax
004013F7        │.  A1 24504000          mov eax, [405024]
004013FC        │.  894424 44            mov [esp+44], eax
00401400        │.  A1 28504000          mov eax, [405028]
00401405        │.  894424 48            mov [esp+48], eax
00401409        │.  A1 2C504000          mov eax, [40502C]
0040140E        │.  894424 4C            mov [esp+4C], eax
00401412        │.  A1 30504000          mov eax, [405030]                   ; numBig
00401417        │.  894424 30            mov [esp+30], eax
0040141B        │.  A1 34504000          mov eax, [405034]
00401420        │.  894424 34            mov [esp+34], eax
00401424        │.  A1 38504000          mov eax, [405038]
00401429        │.  894424 38            mov [esp+38], eax
0040142D        │.  A1 3C504000          mov eax, [40503C]
00401432        │.  894424 3C            mov [esp+3C], eax
00401436        │.  8D4424 50            lea eax, [esp+50]                   ; dest
0040143A        │.  8B4C24 30            mov ecx, [esp+30]
0040143E        │.  894C24 20            mov [esp+20], ecx                   ; copy big
00401442        │.  8B4C24 34            mov ecx, [esp+34]
00401446        │.  894C24 24            mov [esp+24], ecx
0040144A        │.  8B4C24 38            mov ecx, [esp+38]
0040144E        │.  894C24 28            mov [esp+28], ecx
00401452        │.  8B4C24 3C            mov ecx, [esp+3C]
00401456        │.  894C24 2C            mov [esp+2C], ecx
0040145A        │.  8B5424 40            mov edx, [esp+40]
0040145E        │.  895424 10            mov [esp+10], edx                   ; copy small
00401462        │.  8B4C24 44            mov ecx, [esp+44]
00401466        │.  894C24 14            mov [esp+14], ecx
0040146A        │.  8B5424 48            mov edx, [esp+48]
0040146E        │.  895424 18            mov [esp+18], edx
00401472        │.  8B4C24 4C            mov ecx, [esp+4C]
00401476        │.  894C24 1C            mov [esp+1C], ecx
0040147A        │.  890424               mov [esp], eax
0040147D        │.  E8 2E0D0000          call 004021B0                       ; numBig+numSmall
00401482        │.  8B4424 50            mov eax, [esp+50]                   ; shuffle result around
00401486        │.  894424 30            mov [esp+30], eax
0040148A        │.  8B4424 54            mov eax, [esp+54]
0040148E        │.  894424 34            mov [esp+34], eax
00401492        │.  8B4424 58            mov eax, [esp+58]
00401496        │.  894424 38            mov [esp+38], eax
0040149A        │.  8B4424 5C            mov eax, [esp+5C]
0040149E        │.  894424 3C            mov [esp+3C], eax
004014A2        │.  8B4424 30            mov eax, [esp+30]
004014A6        │.  894424 40            mov [esp+40], eax                   ; one more shuffle
004014AA        │.  8B4424 34            mov eax, [esp+34]
004014AE        │.  894424 44            mov [esp+44], eax
004014B2        │.  8B4424 38            mov eax, [esp+38]
004014B6        │.  894424 48            mov [esp+48], eax
004014BA        │.  8B4424 3C            mov eax, [esp+3C]
004014BE        │.  894424 4C            mov [esp+4C], eax
004014C2        │.  8B4424 40            mov eax, [esp+40]
004014C6        │.  A3 50504000          mov [405050], eax                   ; wow, finally we move it to destination!
004014CB        │.  8B4424 44            mov eax, [esp+44]
004014CF        │.  A3 54504000          mov [405054], eax
004014D4        │.  8B4424 48            mov eax, [esp+48]
004014D8        │.  A3 58504000          mov [405058], eax
004014DD        │.  8B4424 4C            mov eax, [esp+4C]
004014E1        │.  A3 5C504000          mov [40505C], eax
004014E6        │.  90                   nop
004014E7        │.  90                   nop
004014E8        │.  CC                   int3


Comments are mine, added after a week of trial and error.

Here is the source:

Code: Select all

  __asm("nop");
  __asm("int $3");
  numSmall=   1.1345678901234567890123456789012345678901234567890e9q;
  __asm("nop");
  numBig=   1.2345678901234567890123456789012345678901234567890e9q;
  __asm("nop");
  destMod=numSmall+numBig;   // uses internal GCC function; precision?
  __asm("nop");
  __asm("nop");
  __asm("int $3");


What GCC does there is pretty hilarious, and all this shuffling around of quads on the stack is completely unnecessary. However, I have never worked with FB, so I can't give you much advice on how to pass your parameters...
fxm
Posts: 9559
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

Re: using gcc quadmath

Postby fxm » Aug 02, 2017 6:29

Have you already tried compiling the FreeBASIC code with the -exx option to see whether a run-time error is reported (by running the program from a command window or using an IDE that does that)?
srvaldez
Posts: 2344
Joined: Sep 25, 2005 21:54

Re: using gcc quadmath

Postby srvaldez » Aug 02, 2017 12:02

hello fxm
a new member of this forum by the name of jj2007 said he posted a long and detailed explanation but his post has not been published, I would like to read his explanation before I go any further.
srvaldez
Posts: 2344
Joined: Sep 25, 2005 21:54

Re: using gcc quadmath

Postby srvaldez » Aug 02, 2017 23:39

thank you very much jj2007 :-)
this will take some time to study.
srvaldez
Posts: 2344
Joined: Sep 25, 2005 21:54

Re: using gcc quadmath

Postby srvaldez » Aug 03, 2017 0:05

jj2007, from looking at the asm code it's not clear how the arguments are passed, will sleep on this.
jj2007
Posts: 1403
Joined: Oct 23, 2016 15:28
Location: Roma, Italia

Re: using gcc quadmath

Postby jj2007 » Aug 03, 2017 0:22

Here it is with some highlighting:

00401436 │. 8D4424 50 lea eax, [esp+50] ; dest

0040143A │. 8B4C24 30 mov ecx, [esp+30]
0040143E │. 894C24 20 mov [esp+20], ecx ; copy big

0040145A │. 8B5424 40 mov edx, [esp+40]
0040145E │. 895424 10 mov [esp+10], edx ; copy small

0040147A │. 890424 mov [esp], eax ; dest
0040147D │. E8 2E0D0000 call 004021B0 ; numBig+numSmall

So it seems that GCC allocates a lot of shadow space somewhere earlier, and then passes the parameters in REAL16 (16-byte) steps at
[esp+00]
[esp+10]
[esp+20]
My implementation works just fine, but I pass everything in xmm regs. No idea what you can do in FB, though.
srvaldez
Posts: 2344
Joined: Sep 25, 2005 21:54

Re: using gcc quadmath

Postby srvaldez » Aug 03, 2017 0:39

using the C compiler explorer at https://godbolt.org
with compiler x86-64 gcc 7.1 and compiler options -m64 -O0
the following C code

Code: Select all

#include <quadmath.h>

int main ()
{
  __float128 a, b, r;
   r=a+b;
  return 0;
}

produces

Code: Select all

        push    rbp
        mov     rbp, rsp
        sub     rsp, 48
        movdqa  xmm1, XMMWORD PTR [rbp-32]
        movdqa  xmm0, XMMWORD PTR [rbp-16]
        call    __addtf3
        movaps  XMMWORD PTR [rbp-48], xmm0
        mov     eax, 0
        leave
        ret

it looks to me that the quads are loaded into xmm1 and xmm0 from the stack (I didn't know that the xmm# registers were 128-bit)
with compiler options -m32 -O0, we get

Code: Select all

main:
        lea     ecx, [esp+4]
        and     esp, -16
        push    DWORD PTR [ecx-4]
        push    ebp
        mov     ebp, esp
        push    ecx
        sub     esp, 84
        lea     eax, [ebp-72]
        sub     esp, 16
        mov     edx, DWORD PTR [ebp-40]
        mov     DWORD PTR [esp], edx
        mov     edx, DWORD PTR [ebp-36]
        mov     DWORD PTR [esp+4], edx
        mov     edx, DWORD PTR [ebp-32]
        mov     DWORD PTR [esp+8], edx
        mov     edx, DWORD PTR [ebp-28]
        mov     DWORD PTR [esp+12], edx
        sub     esp, 16
        mov     edx, DWORD PTR [ebp-24]
        mov     DWORD PTR [esp], edx
        mov     edx, DWORD PTR [ebp-20]
        mov     DWORD PTR [esp+4], edx
        mov     edx, DWORD PTR [ebp-16]
        mov     DWORD PTR [esp+8], edx
        mov     edx, DWORD PTR [ebp-12]
        mov     DWORD PTR [esp+12], edx
        sub     esp, 12
        push    eax
        call    __addtf3
        add     esp, 44
        mov     eax, DWORD PTR [ebp-72]
        mov     DWORD PTR [ebp-88], eax
        mov     eax, DWORD PTR [ebp-68]
        mov     DWORD PTR [ebp-84], eax
        mov     eax, DWORD PTR [ebp-64]
        mov     DWORD PTR [ebp-80], eax
        mov     eax, DWORD PTR [ebp-60]
        mov     DWORD PTR [ebp-76], eax
        mov     eax, DWORD PTR [ebp-88]
        mov     DWORD PTR [ebp-56], eax
        mov     eax, DWORD PTR [ebp-84]
        mov     DWORD PTR [ebp-52], eax
        mov     eax, DWORD PTR [ebp-80]
        mov     DWORD PTR [ebp-48], eax
        mov     eax, DWORD PTR [ebp-76]
        mov     DWORD PTR [ebp-44], eax
        mov     eax, 0
        mov     ecx, DWORD PTR [ebp-4]
        leave
        lea     esp, [ecx-4]
        ret

it looks like the arguments are copied from the stack to a local stack area and then a call to __addtf3 is made; not sure how to interpret the rest.
edit: not sure why this code appears just before __addtf3

Code: Select all

        sub     esp, 12 ;<<< why?
        push    eax
        call    __addtf3
