x64 Win asm naked function return value?

General FreeBASIC programming questions.
srvaldez
Posts: 3373
Joined: Sep 25, 2005 21:54

x64 Win asm naked function return value?

Post by srvaldez »

I was experimenting with x64 asm. Complex addition can be implemented as a naked sub like the following:

Code: Select all

type double_complex
    re as double
    im as double
end type

sub cadd naked cdecl(byref result as double_complex, byref x as double_complex, byref y as double_complex)
    asm
        movsd   xmm0, QWORD PTR [rdx+8]
        movsd   xmm1, QWORD PTR [rdx]
        addsd   xmm0, QWORD PTR [r8+8]
        addsd   xmm1, QWORD PTR [r8]
        movsd   QWORD PTR [rcx+8], xmm0
        movsd   QWORD PTR [rcx], xmm1
        ret
    end asm
end sub
Now, how about making it a function?
When writing naked functions, you don't have the convenience of accessing the arguments by name, and local variables must live either in registers or on the stack.
Also, unlike non-naked functions, you can't simply store the return value in [function]. If it's a floating-point value, you put the return value in xmm0 before ret, but how do you return a simple structure like the one above?
I thought that it would be returned in the registers xmm0 and xmm1, but no.
By trial and error I found that, for this example, the address for the return value is stored in register rcx and also in register rax. Now I would like to know: which of these two registers is the right one?
It was only by chance that rax had the value of rcx; one must copy rcx to rax before ret.
I also found that the register usage doesn't match the MS documentation https://docs.microsoft.com/en-us/cpp/bu ... ster-usage
rcx is for 1st argument
rdx is for 2nd argument
r8 is for 3rd argument
but I found that the arguments were passed in registers rdx and r8, and apparently rcx is used for the address of the return value. OK, here's the function:

Code: Select all

function cadd naked cdecl(byref x as double_complex, byref y as double_complex) as double_complex
    asm
        movsd   xmm0, QWORD PTR [rdx]
        movsd   xmm1, QWORD PTR [rdx+8]
        addsd   xmm0, QWORD PTR [r8]
        addsd   xmm1, QWORD PTR [r8+8]
        movsd   QWORD PTR [rcx], xmm0	're
        movsd   QWORD PTR [rcx+8], xmm1	'im
        mov      rax, rcx
        ret
    end asm
end function
and a small test

Code: Select all

dim as double_complex x, y, z

x.re=0.5
x.im=0.3
y.re=.8
y.im=.9

z=cadd(x, y)
print z.re,z.im
sleep

Code: Select all

 1.3           1.2 
btw, the sub and function are identical except for the argument count
Last edited by srvaldez on May 30, 2019 11:53, edited 2 times in total.
srvaldez
Posts: 3373
Joined: Sep 25, 2005 21:54

Re: x64 Win asm naked function return value?

Post by srvaldez »

Found out that the right register for the address of the return value is rcx. Also, when using AT&T syntax with -O 2 the program crashes, but it is OK with -O 0; there have been problems before when combining inline asm and optimization.
rcx does indeed hold the address of the temporary variable, but one must copy rcx to rax before ret.
Last edited by srvaldez on May 30, 2019 11:50, edited 1 time in total.
SARG
Posts: 1755
Joined: May 27, 2005 7:15
Location: FRANCE

Re: x64 Win asm naked function return value?

Post by SARG »

Hi srvaldez,

That's not entirely accurate.
Rcx (hidden parameter) contains the address of a temporary variable (of type double_complex) which receives the values (your movsd); they are then copied to the final variable (result). Rax is not really used.

The extract below, from gas64, shows in detail how it works.

Code: Select all

# basic --> result=cadd2(dx, dy)
   # -----------------------------------------
   # Info --> addrof _RESULT+-168 [struct DOUBLE_COMPLEX]
   # Info --> v1=var RESULT ofs=-168 [struct DOUBLE_COMPLEX] symbdump=var local accessed declared RESULT [struct DOUBLE_COMPLEX]
   # Info --> vr=reg 11 [struct DOUBLE_COMPLEX ptr]
   # Info --> virtual register =11 real register=r11
   # Info --> marked as used register=r11
   lea r11, -168[rbp]
   
   # Info --> addrof _DY+-152 [struct DOUBLE_COMPLEX]
   # Info --> v1=var DY ofs=-152 [struct DOUBLE_COMPLEX] symbdump=var local accessed declared DY [struct DOUBLE_COMPLEX]
   # Info --> vr=reg 12 [struct DOUBLE_COMPLEX ptr]
   # Info --> virtual register =12 real register=r10
   # Info --> marked as used register=r10
   
   lea r10, -152[rbp]
   # Info --> addrof _DX+-136 [struct DOUBLE_COMPLEX]
   # Info --> v1=var DX ofs=-136 [struct DOUBLE_COMPLEX] symbdump=var local accessed declared DX [struct DOUBLE_COMPLEX]
   # Info --> vr=reg 13 [struct DOUBLE_COMPLEX ptr]
   # Info --> virtual register =13 real register=r8
   # Info --> marked as used register=r8
   lea r8, -136[rbp]
   
   # Info --> addrof _LT_0029+-184 [struct DOUBLE_COMPLEX]
   # Info --> v1=var LT_0029 ofs=-184 [struct DOUBLE_COMPLEX] symbdump=var local temp accessed implicit LT_0029 [struct DOUBLE_COMPLEX]
   # Info --> vr=reg 14 [struct DOUBLE_COMPLEX ptr]
   # Info --> virtual register =14 real register=r9
   # Info --> marked as used register=r9
#O5lea r9, -184[rbp]

   # Info --> memclear vr14 [struct DOUBLE_COMPLEX ptr]
   # Info --> v1=reg 14 [struct DOUBLE_COMPLEX ptr]
   # Info --> v2=imm 16 [integer]
   # Info --> virtual register =14 real register=r9
   # Info --> Release done for register=r9
   # Info --> OPTIMIZATION 5 (lea)
   # Info --> END OF OPTIMIZATION5
   #O5mov rdx, r9
   lea rdx, -184[rbp] #Optim 5
   mov QWORD PTR 0[rdx], 0
   mov QWORD PTR 8[rdx], 0
   
   # Info --> addrof _LT_0029+-184 [struct DOUBLE_COMPLEX]
   # Info --> v1=var LT_0029 ofs=-184 [struct DOUBLE_COMPLEX] symbdump=var local temp accessed implicit LT_0029 [struct DOUBLE_COMPLEX]
   # Info --> vr=reg 15 [struct DOUBLE_COMPLEX ptr]
   # Info --> virtual register =15 real register=r9
   # Info --> marked as used register=r9
#O5lea r9, -184[rbp]

   # Info --> call _CADD2 / mang=CADD2
   # Info --> symbdump=proc shared public naked accessed declared parsed procemitted CADD2 cdecl [struct DOUBLE_COMPLEX]
   # Info --> vr=reg 16 [struct DOUBLE_COMPLEX ptr]
   # Info --> level=1
   # Info --> arg vr15 [struct DOUBLE_COMPLEX ptr]
   # Info --> arg=reg 15 [struct DOUBLE_COMPLEX ptr] vreg=15
   # Info --> virtual register =15 real register=r9
   # Info --> Release done for register=r9
   # Info --> OPTIMIZATION 5 (lea)
   # Info --> END OF OPTIMIZATION5
   #O5mov rcx, r9
   lea rcx, -184[rbp] #Optim 5
   
   # Info --> arg vr13 [struct DOUBLE_COMPLEX ptr]
   # Info --> arg=reg 13 [struct DOUBLE_COMPLEX ptr] vreg=13
   # Info --> virtual register =13 real register=r8
   # Info --> Release done for register=r8
   mov rdx, r8
   
   # Info --> arg vr12 [struct DOUBLE_COMPLEX ptr]
   # Info --> arg=reg 12 [struct DOUBLE_COMPLEX ptr] vreg=12
   # Info --> virtual register =12 real register=r10
   # Info --> Release done for register=r10
   mov r8, r10
   
   mov QWORD PTR -80[rbp], r11  #NO_FREE
   call _CADD2
   mov r11, QWORD PTR -80[rbp] #NO_FREE
   # Info --> virtual register =16 real register=r10
   # Info --> marked as used register=r10
   mov r10, rax
   
   # Info --> addrof _LT_0029+-184 [struct DOUBLE_COMPLEX]
   # Info --> v1=var LT_0029 ofs=-184 [struct DOUBLE_COMPLEX] symbdump=var local temp accessed implicit LT_0029 [struct DOUBLE_COMPLEX]
   # Info --> vr=reg 17 [struct DOUBLE_COMPLEX ptr]
   # Info --> virtual register =17 real register=r8
   # Info --> marked as used register=r8
#O5lea r8, -184[rbp]

   # Info --> memcopy vr11 [struct DOUBLE_COMPLEX ptr] <= vr17 [struct DOUBLE_COMPLEX ptr]
   # Info --> v1=reg 11 [struct DOUBLE_COMPLEX ptr]
   # Info --> v2=reg 17 [struct DOUBLE_COMPLEX ptr]
   # Info --> nb bytes=16
   # Info --> virtual register =11 real register=r11
   # Info --> virtual register =17 real register=r8
   # Info --> Release done for register=r8
   # Info --> OPTIMIZATION 5 (lea)
   # Info --> END OF OPTIMIZATION5
   #O5mov r9, r8
   lea r9, -184[rbp] #Optim 5
   # Info --> Release done for register=r11
   mov rdx, r11
   mov rax, 0[r9]
   mov 0[rdx], rax
   mov rax, 8[r9]
   mov 8[rdx], rax
   # Info --> registers released
For naked procs the parameter names are not filled in; that is the reason why we got an issue (compiler crash) with the last example you provided for gas64. It's fixed in the next release.
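
For readers who prefer not to trace the listing, here is a rough FreeBASIC-level paraphrase of what the caller does for z = cadd(x, y). This is only a sketch: cadd_into and tmp are made-up names, and the sub is a plain (non-naked) stand-in for the asm routine so that it builds on any target.

```freebasic
type double_complex
    re as double
    im as double
end type

' Hypothetical stand-in for the naked routine: writes the sum through the
' first (in the real call, hidden) pointer argument.
sub cadd_into(byref result as double_complex, byref x as double_complex, byref y as double_complex)
    result.re = x.re + y.re
    result.im = x.im + y.im
end sub

dim as double_complex x, y, z
x.re = 0.5 : x.im = 0.3
y.re = 0.8 : y.im = 0.9

' Roughly what the listing does for z = cadd(x, y):
dim as double_complex tmp      ' the temporary (LT_0029 in the listing), zero-initialised
cadd_into(tmp, x, y)           ' rcx = @tmp, rdx = @x, r8 = @y
z = tmp                        ' 16-byte copy from the temporary to the destination

print z.re, z.im
```

The point is only that the first register carries the address of the temporary, which is why the visible arguments appear shifted to rdx and r8.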
srvaldez
Posts: 3373
Joined: Sep 25, 2005 21:54

Re: x64 Win asm naked function return value?

Post by srvaldez »

thank you SARG
I will study your information later; right now I have a bad headache. Would it be safe to say that assigning a value to this hidden variable is equivalent to the statement Function = value?
Here's another silly example:

Code: Select all

type bar
   as double d
   as long l
   as longint ld
   as zstring*19 sz
end type

function foo naked () as bar
   asm
      fldpi
      fstp qword ptr   [rcx]
      mov	eax, 123
      mov	dword	ptr [rcx+8],eax
      mov	rax,123456789
      mov	qword	ptr [rcx+16],rax
      mov	rax,.L0
      mov	qword	ptr [rcx+24],rax
      mov	rax,.L0+8
      mov	qword	ptr [rcx+32],rax
      mov   rax, rcx
      ret
      .L0: .asciz "hello world\n"
      
   end asm
end function

dim as bar y

y=foo
? y.d, y.l, y.ld, y.sz
Last edited by srvaldez on May 30, 2019 11:51, edited 3 times in total.
SARG
Posts: 1755
Joined: May 27, 2005 7:15
Location: FRANCE

Re: x64 Win asm naked function return value?

Post by SARG »

It should be safe. However, I will check tomorrow (it is nearly 2h00 now in France...).
But there are only 4 fields in the type, so writing to [rcx+32] doesn't look right.

By the way, CDECL is not useful.

edit

function foo naked () as bar
...

y=foo

are enough; there is no need for a parameter.
Last edited by SARG on May 28, 2019 0:07, edited 1 time in total.
srvaldez
Posts: 3373
Joined: Sep 25, 2005 21:54

Re: x64 Win asm naked function return value?

Post by srvaldez »

thanks SARG :-)
mov rax,.L0+8 'only copies 8 characters
mov qword ptr [rcx+32],rax 'copies the remaining characters of the string
SARG
Posts: 1755
Joined: May 27, 2005 7:15
Location: FRANCE

Re: x64 Win asm naked function return value?

Post by SARG »

Ok understood.

Have you seen my edit ?
srvaldez
Posts: 3373
Joined: Sep 25, 2005 21:54

Re: x64 Win asm naked function return value?

Post by srvaldez »

yes, I corrected my post.
srvaldez
Posts: 3373
Joined: Sep 25, 2005 21:54

Re: x64 Win asm naked function return value?

Post by srvaldez »

hello SARG
is there a way to know the offset of the elements of a structure as in the foo-bar example?
I am having no luck on macOS, except for the first element.
SARG
Posts: 1755
Joined: May 27, 2005 7:15
Location: FRANCE

Re: x64 Win asm naked function return value?

Post by SARG »

Hi srvaldez,

Checked, all is ok.

However, for this example it's better to use a sub with the result variable as a parameter. It's nearly the same, but this avoids the creation of a temporary variable and the copy.

Code: Select all

print __FB_BACKEND__

type bar
   as double d
   as long l
   as longint ld
   as zstring*19 sz
end type
sub foo2 naked (y as bar)
   asm
      fldpi
      fstp qword ptr   [rcx]
      mov   eax, 123
      mov   dword   ptr [rcx+8],eax
      mov   rax,123456789
      mov   qword   ptr [rcx+16],rax
      mov   rax,.L1
      mov   qword   ptr [rcx+24],rax
      mov   rax,.L1+8
      mov   qword   ptr [rcx+32],rax
      ret
      .L1: .asciz "hello world\n"
      
   end asm
end sub

dim as bar y
foo2(y)
? y.d, y.l, y.ld, y.sz
sleep
srvaldez
Posts: 3373
Joined: Sep 25, 2005 21:54

Re: x64 Win asm naked function return value?

Post by srvaldez »

yes, I kind of thought so. :-)
[edit] Resolved the problem on macOS: I was doing "movl 123, %eax" when it should be "movl $123, %eax". Also, FB has the offsetof function to get the offset of the different members of a type, but I was looking for a way to retrieve that information in asm.
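
As a small illustration of the offsetof route, the following sketch prints the member offsets of the bar type from the earlier example, so they can be compared with the displacements typed by hand in the asm. It does not solve the "from inside asm" part; it only shows how to check the numbers on a given target.

```freebasic
type bar
   as double d
   as long l
   as longint ld
   as zstring*19 sz
end type

' Print the byte offset of each member, plus the total size of the type.
print "d  at "; offsetof(bar, d)
print "l  at "; offsetof(bar, l)
print "ld at "; offsetof(bar, ld)
print "sz at "; offsetof(bar, sz)
print "sizeof(bar) = "; sizeof(bar)
```

On a 64-bit target with default field alignment the offsets should match the 0, 8, 16 and 24 used in the asm above; other targets or a non-default field alignment may give different values, so it is worth running this on each platform.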
srvaldez
Posts: 3373
Joined: Sep 25, 2005 21:54

Re: x64 Win asm naked function return value?

Post by srvaldez »

@SARG
I found that for naked functions one must copy rcx to rax before ret, I edited my posts above.
SARG
Posts: 1755
Joined: May 27, 2005 7:15
Location: FRANCE

Re: x64 Win asm naked function return value?

Post by SARG »

Hi srvaldez,

If the returned value is a simple datatype:
- integer number --> rax
- float (single/double) --> xmm0

As far as I understand, in the case of a structure (udt) rax is not used. Maybe byref/byval leads to different behaviour.

Do you get a problem when not filling rax in your example?
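
A minimal sketch of the two simple cases listed above, assuming 64-bit Windows and Intel syntax as in the rest of the thread. The names add_int and add_dbl are made up for illustration. Under the Microsoft x64 convention the first two integer arguments arrive in rcx and rdx, and the first two floating-point arguments in xmm0 and xmm1.

```freebasic
#if defined(__FB_WIN32__) and defined(__FB_64BIT__)

' Integer result: leave the value in rax before ret.
function add_int naked (byval a as longint, byval b as longint) as longint
    asm
        lea     rax, [rcx+rdx]      ' rax = a + b
        ret
    end asm
end function

' Floating-point result: leave the value in xmm0 before ret.
function add_dbl naked (byval a as double, byval b as double) as double
    asm
        addsd   xmm0, xmm1          ' xmm0 = a + b
        ret
    end asm
end function

print add_int(40, 2)
print add_dbl(0.5, 0.8)

#else
print "This sketch assumes 64-bit Windows."
#endif
```

Neither function touches the stack or any non-volatile register, so no prologue or epilogue is needed in the naked body.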
srvaldez
Posts: 3373
Joined: Sep 25, 2005 21:54

Re: x64 Win asm naked function return value?

Post by srvaldez »

I had a test case where it crashed without copying rcx to rax, but now I can't replicate the problem. If I find out more, I will let you know.