I have this little console program here, which works like a charm unless optimization is applied:
'#compiler freebasic
'#compile console 32 exe /o "-gen gcc"
'#compile console 32 exe /o "-gen gcc -Wc -O3"
'***********************************************************************************************
'***********************************************************************************************
private sub do_copy(byval dest as any ptr, byval src as any ptr, byval count as Ulong)
'***********************************************************************************************
' copy count bytes from src to dest
'***********************************************************************************************
    asm
        push ecx
        mov edi, [dest]     'destination
        mov esi, [src]      'source
        mov ecx, [count]    '# of bytes to copy
        shr ecx, 2          '/4 -> # of dwords to copy
        cld
        rep movsd           'copy dwords
        mov ecx, [count]
        and ecx, 3          'mod 4
        rep movsb           'copy remaining bytes
        pop ecx
    end asm
end sub
'***********************************************************************************************
function fb_main as long
'***********************************************************************************************
' main
'***********************************************************************************************
    dim s as string = "123456789"
    dim s1 as string
    s1 = space(len(s))
    do_copy(strptr(s1), strptr(s), len(s))
    print s
    print s1
    sleep
    function = 0
end function
'***********************************************************************************************
end fb_main
'***********************************************************************************************
'***********************************************************************************************
'***********************************************************************************************
It does nothing useful except demonstrate my problem. The sub "do_copy" is written in assembler and will be used in another application to copy data from one place to another as fast as possible. Basically this works, but as soon as gcc optimization is applied (-Wc -O1, -O2 or -O3), it gets optimized to death. The code crashes at:
rep movsd
At that point ESI or EDI holds an incorrect value. This happens only with the "-Wc -O..." switch; without it, the program runs as expected.
How can I have gcc optimization without this fatal side effect? In other words, how can I prevent optimization in specific places and still have it everywhere else? Is there a special coding style or meta statement that ensures certain code sequences remain untouched?
Thanks
JK