ASM function to reverse endianness

fzabkar
Posts: 154
Joined: Sep 29, 2018 2:52
Location: Australia

Post by fzabkar »

I've been using the following function to reverse the endian-ness of a single 32-bit dword. Could someone please show me how to adapt it to reverse a block of 32-bit data? I'm currently iterating through a For-Next loop.

Code: Select all

' Assembly language function to reverse the endian-ness of a 32-bit dword

Function EndianRev32( ByVal dwNum As uLong ) As uLong
    
	ASM
        
	  mov eax, [dwNum]
	  bswap eax
	  mov [Function], eax
      
	End ASM
    
End Function
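
For comparison, the same swap can be written without inline assembly, which may be useful on targets where the x86 bswap instruction is not available. This is only a sketch: the names EndianRev32Portable and EndianRev32Block are made up for illustration and are not part of the code posted in this thread.

```freebasic
' Portable sketch (no inline ASM). Function names are illustrative only.

' Reverse the byte order of one 32-bit value using shifts and masks.
Function EndianRev32Portable( ByVal dwNum As ULong ) As ULong
    Return ((dwNum Shr 24) And &hFF)     Or _
           ((dwNum Shr 8)  And &hFF00)   Or _
           ((dwNum Shl 8)  And &hFF0000) Or _
           ((dwNum And &hFF) Shl 24)
End Function

' Reverse the byte order of count dwords; src and dst may be the same block.
Sub EndianRev32Block( ByVal src As ULong Ptr, ByVal dst As ULong Ptr, ByVal count As Integer )
    For i As Integer = 0 To count - 1   ' loop body is skipped when count <= 0
        dst[i] = EndianRev32Portable( src[i] )
    Next
End Sub

Dim s(0 To 3) As ULong = { &h11223344, &h00000001, 100, &hAABBCCDD }
Dim d(0 To 3) As ULong

EndianRev32Block( @s(0), @d(0), 4 )
For i As Integer = 0 To 3
    Print Hex( s(i), 8 ); " -> "; Hex( d(i), 8 )   ' first line: 11223344 -> 44332211
Next
```

The block version is just the For-Next loop mentioned above wrapped in a Sub; it is likely slower than an ASM loop, but it does not depend on the calling convention or on 32-bit registers.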
jj2007
Posts: 2326
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: ASM function to reverse endianness

Post by jj2007 »

Code: Select all

' Assembly language function to reverse the endian-ness of a block of 32-bit dwords

Sub EndianRev32 naked ( ByRef srcNums As uLong, ByRef destNums As uLong, ByVal elements as uLong )
  ASM
	push esi
	push edi
	mov ecx, [esp+8+12]	'elements
	mov esi, [esp+8+4]	 'srcNums
	mov edi, [esp+8+8]	 'destNums
  L1:	lodsd
	bswap eax
	stosd
	dec ecx
	jg L1
	pop edi
	pop esi
	ret 12
  End ASM   
End Sub

dim s(10) as ulong=>{100, 101, 102, 103, 104, 105, 106, 107, 108, 109}
dim d(10) as ulong

Print "Swapping src->dest:"
EndianRev32(s(0), d(0), 10)
  For ct As long=0 To 9
	Print ct, d(ct)
  Next

Print "Swapping dest->dest:"
EndianRev32(d(0), d(0), 10)
  For ct As long=0 To 9
	Print ct, d(ct)
  Next
sleep

Post by fzabkar »

Many thanks.

I am referencing a block of data in RAM via a pointer rather than an explicit array variable. Would the following code work the same way?

Code: Select all

Sub EndianRev32 naked ( ByVal srcBlock As uLong Ptr, ByVal destBlock As uLong Ptr, ByVal numElements as uLong )
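
One way to see why the pointer version should behave the same: for a ByRef parameter the compiler passes the address of the argument, and for a ByVal pointer it passes the pointer value, so in both cases the ASM code finds an address on the stack. A small check, using hypothetical helper names AddrByRef and AddrByVal that are not part of the thread's code:

```freebasic
' Illustrative helpers only: each returns the address it actually received.
Function AddrByRef( ByRef x As ULong ) As UInteger
    Return Cast( UInteger, @x )      ' address of the referenced variable
End Function

Function AddrByVal( ByVal p As ULong Ptr ) As UInteger
    Return Cast( UInteger, p )       ' the pointer value itself
End Function

Dim s(0 To 9) As ULong

' Both lines should show the same address.
Print "ByRef s(0):  "; Hex( AddrByRef( s(0) ) )
Print "ByVal @s(0): "; Hex( AddrByVal( @s(0) ) )
```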

Post by jj2007 »

Test it...

Post by fzabkar »

Thanks, it works fine.

http://www.users.on.net/~fzabkar/FreeBa ... rdrev2.bas
http://www.users.on.net/~fzabkar/FreeBa ... rdrev2.exe

I'm also using this code to byte swap a 16-bit word:

Code: Select all

' Assembly language function to reverse the endian-ness of a 16-bit word

Function EndianRev16( ByVal num As UShort ) As UShort
    
	ASM
        
	  mov ax, [num]
	  xchg ah, al
	  mov [Function], ax
      
	End ASM
    
End Function

Could I use this code instead?

Code: Select all

Sub EndianRev16 Naked ( ByVal wdSrcPtr As uShort Ptr, ByVal wdDestPtr As uShort Ptr, ByVal dwNumElements as uLong )
    
    ASM
        push si                 ' save SI onto stack
        push di                 ' save DI onto stack
        mov ecx, [sp+8+6]       ' elements
        mov si, [sp+8+2]        ' wdSrcPtr
        mov di, [sp+8+4]        ' wdDestPtr
L1:     lodsw                   ' load AX with contents of address pointed to by SI - SI increments by 2
        xchg ah, al             ' byte swap AX
        stosw                   ' copy contents of AX to word address pointed to by DI - DI increments by 2
        dec ecx                 ' decrement CX
        jg L1                   ' jump to L1 if greater than 0
        pop di                  ' restore SI from stack
        pop si                  ' restore DI from stack
        ret 8                   ' discard 6 bytes from stack
   
    End ASM
  
End Sub

I may have misinterpreted the "MOV reg, [sp+8+n]" instructions. Could you explain how the stack pointer is affected? Am I correct in specifying "sp" instead of "esp"?

Why do we need to discard bytes from the stack? I didn't need to do this for the original single word/dword EndianRev function. Does this have something to do with the Naked parameter, i.e. the compiler prologue/epilogue?

Post by jj2007 »

fzabkar wrote: Am I correct in specifying "sp" instead of "esp"?
No. This is still 32-bit code, so esp is the stack pointer, and esi + edi are pointers, too.

Code: Select all

' Assembly language function to reverse the endian-ness of a block of 16-bit words (->fzabkar)

Sub EndianRev16 naked ( ByRef srcNums As uShort, ByRef destNums As uShort, ByVal elements as uLong )
  ASM
	mov ecx, [esp+12]	'elements
	push esi
	push edi
	mov esi, [esp+8+4]	'srcNums
	mov edi, [esp+8+8]	'destNums
  L1:	lodsw
	bswap eax
	shr eax, 16    ' important
	stosw
	dec ecx
	jg L1
	pop edi
	pop esi
	ret 12
  End ASM   
End Sub

dim s(10) as ushort=>{100, 101, 102, 103, 104, 105, 106, 107, 108, 109}
dim d(10) as ushort

Print "Swapping src->dest:"
EndianRev16(s(0), d(0), 10)
  For ct As long=0 To 9
	Print ct, d(ct)
  Next

Print "Swapping dest->dest:"
EndianRev16(d(0), d(0), 10)
  For ct As long=0 To 9
	Print ct, d(ct)
  Next
sleep
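
The function does two pushes to save the non-volatile registers esi and edi. Each push lowers esp by 4 bytes, so after both pushes the arguments sit 8 bytes further from esp, which is why the offsets are written as [esp+8+4] and [esp+8+8].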
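
To make the offsets concrete, here is the stack as the naked Sub sees it. This assumes the 32-bit stdcall convention that FreeBASIC uses by default on Windows (arguments pushed right to left, callee removes them); it is a sketch for orientation, not compiler output.

```freebasic
' Stack on entry to the naked Sub, before any push (32-bit stdcall assumed):
'   [esp+ 0]   return address (pushed by CALL)
'   [esp+ 4]   srcNums   (an address, 4 bytes)
'   [esp+ 8]   destNums  (an address, 4 bytes)
'   [esp+12]   elements  (4 bytes)
'
' After "push esi" and "push edi", esp is 8 lower, so every offset grows by 8:
'   [esp+8+ 4]   srcNums
'   [esp+8+ 8]   destNums
'   [esp+8+12]   elements
'
' "ret 12" pops the return address and then removes the 3 x 4 = 12 bytes of
' arguments. In a Sub that is not declared Naked, the compiler emits this itself.
```

The size of the data being swapped (16 or 32 bits) does not change any of this, because the three arguments are still two 32-bit addresses and a 32-bit count.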

Btw you are lucky that I had saved your code. Do not assume that experienced coders are eager to put the missing headers etc around your snippets to make it work. POST COMPLETE CODE, as I have done above, or your help requests will be ignored.
marcov
Posts: 3462
Joined: Jun 16, 2005 9:45
Location: Netherlands
Contact:

Re: ASM function to reverse endianness

Post by marcov »


Post by jj2007 »

For a high speed application, that's indeed a good choice. It's SSSE3, though (older machines can't use it), and requires 16-byte alignment.

Post by fzabkar »

This code works, too. I'm guessing that the "Naked" prologue and epilogue compiler overheads involve Pushing and Popping the stack, plus the Return. (The FB docs state that the ESI and EDI registers are automatically pushed and popped by ASM).

Code: Select all

Sub EndianRev16( ByVal wdSrcPtr As uShort Ptr, ByVal wdDestPtr As uShort Ptr, ByVal dwNumElements as uLong )
    
    ASM
        mov ecx, [dwNumElements]
        mov esi, [wdSrcPtr]
        mov edi, [wdDestPtr]
L1:     lodsw
        bswap eax
        shr eax, 16         ' important
        stosw
        dec ecx
        jg L1
   
    End ASM
  
End Sub

Thanks again.

Post by marcov »

jj2007 wrote:
For a high speed application, that's indeed a good choice. It's SSSE3, though (older machines can't use it), and requires 16-byte alignment.
Those machines are getting pretty old though (8+ years). Afaik that code uses movdqu so that is unaligned. (heavy penalties on Pentium-D's though iirc, but you wouldn't be doing anything speed dependent on those anyway, there are better machines in the bin)

In the past I used similar code for swapping certain cameras' 16-bit images. But in the end I changed the calculation to swap on the fly, and used shaders to do the swapping while displaying.

Post by jj2007 »

fzabkar wrote: This code works, too. I'm guessing that the "Naked" prologue and epilogue compiler overheads...
"Naked" means "no overhead", and going by the listings below the Sub is only 31 bytes instead of 36. You should one day insert an asm int 3 into your code, before calling the sub or function, and let the just-in-time debugger show you what happens under the hood. I recommend OllyDbg, it's easy to learn: you just need to press F7 repeatedly. Example:

Code: Select all

asm int 3
EndianRev16(s(0), d(0), 10)

Code: Select all

CPU Disasm
Address        Hex dump              Command                                 Comments
004016A7       │.  CC                int3                                    ; │
004016A8       │.  6A 0A             push 0A                                 ; │┌Arg3 = 0A
004016AA       │.  8D45 AC           lea eax, [ebp-54]                       ; ││
004016AD       │.  50                push eax                                ; ││Arg2 => offset LOCAL.21
004016AE       │.  8D45 E4           lea eax, [ebp-1C]                       ; ││
004016B1       │.  50                push eax                                ; ││Arg1 => offset LOCAL.7
004016B2       │.  E8 D9FEFFFF       call 00401590                           ; │└TmpFb.00401590
...
00401590       ┌$  8B4C24 0C         mov ecx, [esp+0C]                       ; TmpFb.00401590(guessed Arg1,Arg2,Arg3)
00401594       │.  56                push esi
00401595       │.  57                push edi
00401596       │.  8B7424 0C         mov esi, [esp+0C]
0040159A       │.  8B7C24 10         mov edi, [esp+10]
0040159E       │>  66:AD             ┌lodsw
004015A0       │.  0FC8              │bswap eax
004015A2       │.  C1E8 10           │shr eax, 10
004015A5       │.  66:AB             │stosw
004015A7       │.  49                │dec ecx
004015A8       │.  7F F4             └jg short 0040159E
004015AA       │.  5F                pop edi
004015AB       │.  5E                pop esi
004015AC       └.  C2 0C00           retn 0C

The same but not naked:

Code: Select all

CPU Disasm
Address        Hex dump              Command                                 Comments
004016D7       │.  CC                int3
004016D8       │.  6A 0A             push 0A
004016DA       │.  8D45 AC           lea eax, [ebp-54]
004016DD       │.  50                push eax                                ; ┌Arg3 => offset LOCAL.21
004016DE       │.  8D45 E4           lea eax, [ebp-1C]                       ; │
004016E1       │.  50                push eax                                ; │Arg2 => offset LOCAL.7
004016E2       │.  E8 C9FEFFFF       call 004015B0                           ; │
...
004015B0       ┌$  55                push ebp
004015B1       │.  89E5              mov ebp, esp
004015B3       │.  53                push ebx
004015B4       │.  56                push esi
004015B5       │.  57                push edi
004015B6       │.  8B4D 10           mov ecx, [ebp+10]
004015B9       │.  8B75 08           mov esi, [ebp+8]
004015BC       │.  8B7D 0C           mov edi, [ebp+0C]
004015BF       │>  66:AD             ┌lodsw
004015C1       │.  0FC8              │bswap eax
004015C3       │.  C1E8 10           │shr eax, 10
004015C6       │.  66:AB             │stosw
004015C8       │.  49                │dec ecx
004015C9       │.  7F F4             └jg short 004015BF
004015CB       │.  5F                pop edi
004015CC       │.  5E                pop esi
004015CD       │.  5B                pop ebx
004015CE       │.  89EC              mov esp, ebp
004015D0       │.  5D                pop ebp
004015D1       └.  C2 0C00           retn 0C

marcov wrote: that code uses movdqu so that is unaligned
Yes, you can use movdqu or movups to load the source operand into an xmm reg. Pshufb works on a 128-bit memory operand, too, but then it needs to be aligned.
srvaldez
Posts: 3379
Joined: Sep 25, 2005 21:54

Re: ASM function to reverse endianness

Post by srvaldez »

if your OS is Windows then you could try the CRT function _swab https://docs.microsoft.com/en-us/cpp/c- ... w=msvc-160
Remarks

If n is even, the _swab function copies n bytes from src, swaps each pair of adjacent bytes, and stores the result at dest. If n is odd, _swab copies and swaps the first n-1 bytes of src, and the final byte is not copied. The _swab function is typically used to prepare binary data for transfer to a machine that uses a different byte order.

Code: Select all

extern "c"
	declare sub _swab( byval src as zstring ptr, byval dest as zstring ptr, byval n as long)
end extern

dim as zstring ptr a=callocate(64)
dim as zstring ptr b=callocate(64)

*a="0123456789"
_swab( a, b, 10)
Print *a
Print *b
deallocate(a)
deallocate(b)
sleep

Post by jj2007 »

srvaldez wrote: if your OS is Windows then you could try the CRT function _swab https://docs.microsoft.com/en-us/cpp/c- ... w=msvc-160
Interesting, thanks! It seems to be a very old function, as there is no DWORD equivalent. Here is its innermost loop (esi is the word counter, ecx the source pointer, eax the destination pointer; in-place swapping is possible):

Code: Select all

L0:
mov dl, [ecx]
inc ecx
mov bl, [ecx]
inc ecx
mov [eax], bl
inc eax
mov [eax], dl
inc eax
dec esi
jnz L0
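
Read as FreeBASIC, that loop copies two bytes at a time and writes them back in the opposite order. A rough equivalent is sketched below; SwabSketch is a made-up name, and the handling of an odd byte count follows the Remarks quoted earlier rather than the disassembly.

```freebasic
' Illustrative re-creation of the byte-pair loop; not the CRT source.
Sub SwabSketch( ByVal src As UByte Ptr, ByVal dest As UByte Ptr, ByVal n As Long )
    Dim As Long pairs = n \ 2                 ' an odd trailing byte is not copied
    For i As Long = 0 To pairs - 1
        Dim As UByte b0 = src[2 * i]          ' mov dl, [ecx]
        Dim As UByte b1 = src[2 * i + 1]      ' mov bl, [ecx]
        dest[2 * i]     = b1                  ' mov [eax], bl
        dest[2 * i + 1] = b0                  ' mov [eax], dl
    Next
End Sub

Dim As ZString * 16 a = "0123456789"
Dim As ZString * 16 b                         ' zero-filled, so the result stays terminated

SwabSketch( Cast( UByte Ptr, @a ), Cast( UByte Ptr, @b ), 10 )
Print a
Print b                                       ' prints 1032547698
```

Both bytes of a pair are read before either is written, which is why passing the same buffer as source and destination still works.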

Post by srvaldez »

it puzzles me that they didn't use bswap

Post by jj2007 »

@marcov: here is the pshufb test - a minor improvement on the CRT _swab by a factor of 7.4 ;-)

(running the exe requires the Masm32 SDK - the test is with \Masm32\include\Windows.inc)