Allocate+DeAllocate vs GlobalAlloc/HeapAlloc

Windows specific questions.
Post Reply
jj2007
Posts: 2326
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Allocate+DeAllocate vs GlobalAlloc/HeapAlloc

Post by jj2007 »

The useGA and useHA switches determine which allocation function to use. With GlobalAlloc and HeapAlloc it works perfect, but with the native FB functions Allocate and DeAllocate it throws segfaults after a while.

I am stuck. For me, these functions look interchangeable, and I can't see a reason why Allocate does not work. Any ideas?

Code: Select all

#include "Windows.bi"	' HeapAlloc + HeapFree
#include "crt.bi"	' _aligned_malloc + _aligned_free
#define useGA 1	' use GlobalAlloc
#define useHA 0	' use HeapAlloc
#define elements 500	' array elements

extern "c"
  declare function _aligned_malloc(byval as size_t, byval as size_t) as integer
  declare sub _aligned_free(byval as integer)
end extern

Dim shared as integer ProcHeap
ProcHeap=GetProcessHeap()

function Alloc16(bytes as long) As integer	' provides a 16-byte aligned pointer
  #if useGA
	Dim As integer a16=GlobalAlloc(GMEM_FIXED, bytes+16)
  #elseif useHA
	Dim As integer a16=HeapAlloc(ProcHeap, HEAP_GENERATE_EXCEPTIONS or HEAP_ZERO_MEMORY, bytes+16)
  #else
	Dim As integer a16=Allocate(bytes+16)	' what's wrong?
  #endif
  if a16 and 8 Then *cast(integer ptr, a16) = 8: return a16+8	' set the flag/offset, return ptr+8
  return a16+16	' flag not set, ret ptr+16
end function

Sub Free16(oldptr as integer)
  #if useGA
	GlobalFree(oldptr+*cast(integer ptr, oldptr-8)-16)
  #elseif useHA
	HeapFree(ProcHeap, 0, oldptr+*cast(integer ptr, oldptr-8)-16)
  #else
	DeAllocate(oldptr+*cast(integer ptr, oldptr-8)-16)
  #endif
End Sub

Dim as long somearray(elements)
Dim As double t
Dim As long useAlloc16
for tloop as Long=0 to 7
	t=timer
	For outerloop as long=1 to 2000
	For x as long=0 To elements
		if useAlloc16 then
			somearray(x)=Alloc16(Rnd()*1000)
		else
			somearray(x)=_aligned_malloc(Rnd()*1000, 16)
		endif
		' printf("%x ", somearray(x))
	Next 
	' print
	For x as long=0 To elements
		if useAlloc16 then
			' if x<2 Then print hex(somearray(x)); " "
			Free16(somearray(x))
		else
			' if x<2 Then print hex(somearray(x)); " "
			_aligned_free(somearray(x))
		endif
	Next
	Next
	t=timer-t
	print "Alloc16=";useAlloc16;": ";int(t*1000);" ms"
	useAlloc16=useAlloc16 xor 1
next
sleep
(this is a spinoff from this thread)
fxm
Moderator
Posts: 12107
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

Re: Allocate+DeAllocate vs GlobalAlloc/HeapAlloc

Post by fxm »

'Allocate' does not clear the memory, so the flag/offset is not initialized to '0'.
I don't know for 'GlobalAlloc' and 'HeapAlloc' ((but I think it should be the same for 'GlobalAlloc' at least).

General proposed solution:

Code: Select all

function Alloc16(bytes as long) As integer   ' provides a 16-byte aligned pointer
  #if useGA
   Dim As integer a16=GlobalAlloc(GMEM_FIXED, bytes+16)
  #elseif useHA
   Dim As integer a16=HeapAlloc(ProcHeap, HEAP_GENERATE_EXCEPTIONS or HEAP_ZERO_MEMORY, bytes+16)
  #else
   Dim As integer a16=Allocate(bytes+16)   ' what's wrong?
  #endif
  if a16 and 8 Then *cast(integer ptr, a16) = 8: return a16+8   ' set the flag/offset, return ptr+8
  *cast(integer ptr, a16+8) = 0   ' reset the flag/offset
  return a16+16   ' flag not set, ret ptr+16
end function
jj2007
Posts: 2326
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: Allocate+DeAllocate vs GlobalAlloc/HeapAlloc

Post by jj2007 »

fxm wrote:'Allocate' does not clear the memory, so the flag/offset is not initialized to '0'
Thanks, that was indeed the reason. Replace Alloc16 above with this:

Code: Select all

function Alloc16(bytes as long) As integer	' inspired by MasmBasic's Alloc16
  #if useGA
	Dim As integer a16=GlobalAlloc(GMEM_FIXED or GMEM_ZEROINIT, bytes+16)
  #elseif useHA
	Dim As integer a16=HeapAlloc(ProcHeap, HEAP_GENERATE_EXCEPTIONS or HEAP_ZERO_MEMORY, bytes+16)
  #else
	Dim As integer a16=Allocate(bytes+16)	 ' no zeroinit here
	*cast(integer ptr, a16+8) = 0				' clear the flag
  #endif
  if a16 and 8 Then *cast(integer ptr, a16) = 8: return a16+8	' set the flag/offset, return ptr+8
  return a16+16	' ret ptr+16
end function
fxm
Moderator
Posts: 12107
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

Re: Allocate+DeAllocate vs GlobalAlloc/HeapAlloc

Post by fxm »

Or use 'Callocate' instead of 'Allocate'.
jj2007
Posts: 2326
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: Allocate+DeAllocate vs GlobalAlloc/HeapAlloc

Post by jj2007 »

The fastest solution, btw, is HeapAlloc(ProcHeap, HEAP_GENERATE_EXCEPTIONS, bytes+16). It's over 25% faster than CRT's _aligned_malloc. Even when using the HEAP_ZERO_MEMORY flag, HeapAlloc is faster. The documentation of _aligned_malloc is scarse, but it seems that it does not initialize (i.e. zeroinit) the returned memory block.
adeyblue
Posts: 300
Joined: Nov 07, 2019 20:08

Re: Allocate+DeAllocate vs GlobalAlloc/HeapAlloc

Post by adeyblue »

The code of the _aligned functions are here. They've never been fast
https://github.com/Chuyu-Team/VC-LTL/bl ... /align.cpp

You can get HeapAlloc to always align to 16 if you use RtlCreateHeap to create a new one with the HEAP_CREATE_ALIGN_16 flag (HeapCreate will filter it out, so yoou have to use the ntdll one).

Also, not that you can do this with FB, but if you create a static IMAGE_LOAD_CONFIG_DIRECTORY struct in the code and set the IMAGE_DIRECTORY_ENTRY_LOAD_CONFIG NT OptionalHeader directory to point to it, you might also be able to force the ALIGN_16 flag onto the default process heap with the ProcessHeapFlags member of the struct.

Noting that HeapAlloc seems to default to align 16 in 64-bit apps anyway.

(and yes, I know this is a testing toy, but those are ways you can get the alignment without having code to do it for every allocation)
jj2007
Posts: 2326
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: Allocate+DeAllocate vs GlobalAlloc/HeapAlloc

Post by jj2007 »

Thanks, adeyblue - very interesting. I've tested 64-bit HeapAlloc, and yes, it always aligns 16. Where do you get this info??? Same for RtlCreateHeap - Microsoft doesn't document the flags...
nastasa eodor
Posts: 182
Joined: Dec 18, 2018 16:37
Location: Germany, Hessdorf
Contact:

Re: Allocate+DeAllocate vs GlobalAlloc/HeapAlloc

Post by nastasa eodor »

memory.inc
i wrote this unit 10 years ago for rapidq compiler & interpreter, cand be easyly used in FreeBasic, i hope that this help...

Code: Select all

$DEFINE __RQMEMORY

$IFNDEF __WIN32API

Const HEAP_CREATE_ALIGN_16          = &H10000
Const HEAP_CREATE_ENABLE_TRACING    = &H20000
Const HEAP_DISABLE_COALESCE_ON_FREE = &H80
Const HEAP_FREE_CHECKING_ENABLED    = &H40
Const HEAP_GENERATE_EXCEPTIONS      = &H4
Const HEAP_GROWABLE                 = &H2
Const HEAP_MAXIMUM_TAG              = &HFFF
Const HEAP_NO_SERIALIZE             = &H1
Const HEAP_PSEUDO_TAG_FLAG          = &H8000
Const HEAP_REALLOC_IN_PLACE_ONLY    = &H10
Const HEAP_TAG_SHIFT                = 18
Const HEAP_TAIL_CHECKING_ENABLED    = &H20
Const HEAP_ZERO_MEMORY              = &H8

Declare Function HeapCreate Lib "kernel32" Alias "HeapCreate" (ByVal flOptions As Long, ByVal dwInitialSize As Long, ByVal dwMaximumSize As Long) As Long
Declare Function HeapDestroy Lib "kernel32" Alias "HeapDestroy" (ByVal hHeap As Long) As Long
Declare Function HeapAlloc Lib "kernel32" Alias "HeapAlloc" (ByVal hHeap As Long, ByVal dwFlags As Long, ByVal dwBytes As Long) As Long
Declare Function HeapReAlloc Lib "kernel32" Alias "HeapReAlloc" (ByVal hHeap As Long, ByVal dwFlags As Long, lpMem As Long, ByVal dwBytes As Long) As Long
Declare Function HeapFree Lib "kernel32" Alias "HeapFree" (ByVal hHeap As Long, ByVal dwFlags As Long, lpMem As Long) As Long
Declare Function HeapSize Lib "kernel32" Alias "HeapSize" (ByVal hHeap As Long, ByVal dwFlags As Long, lpMem As Long) As Long
Declare Function GetProcessHeap Lib "kernel32" Alias "GetProcessHeap" () As Long

$ENDIF

Function PtrSize(BYREF P&) As Integer
    DefInt Size = 0 
    If P& Then
       Size = HeapSize(GetProcessHeap,HEAP_NO_SERIALIZE,P&)
    End If
    Result = Size
End Function

Sub PtrFree(P&)
    If P& Then 
       HeapFree(GetProcessHeap,HEAP_NO_SERIALIZE Or HEAP_ZERO_MEMORY,P&)
       P& = 0
    End If
End Sub

Function ReAllocate(BYREF P&,Count&) As Integer
    If P& Then
       P& = HeapReAlloc(GetProcessHeap,HEAP_NO_SERIALIZE Or HEAP_ZERO_MEMORY,P&,PtrSize(P&)*Count&)
    End If
    Result = P&
End Function

Function Allocate(Count&) As Integer
    Result = HeapAlloc(GetProcessHeap,HEAP_NO_SERIALIZE Or HEAP_ZERO_MEMORY,Count&)
End Function

Function Ptr(P&,Size&,Index&) As Integer
    Result = 0
    If P& Then
       DefInt Count = PtrSize(P&)/Size&
       If Index& < Count Then
          Result = P& + Index&*Size&
       End If
    End If
End Function

Function NewZStr(S As String) As Integer
    DefStr cs = s + Chr$(0)
    DefInt Ptr = HeapAlloc(GetProcessHeap,HEAP_NO_SERIALIZE Or HEAP_ZERO_MEMORY,Len(cs))
    If Ptr Then
       If cs <> "" Then
          MemCpy Ptr,VarPTR(cs),Len(cs)
       Else
          Ptr = 0
       End If
    End If
    Result = Ptr
End Function

Function NewStr(S As String) As Integer
    DefStr Cs = s
    DefInt Ptr = HeapAlloc(GetProcessHeap,HEAP_NO_SERIALIZE Or HEAP_ZERO_MEMORY,Len(Cs))
    If Ptr Then MemCpy Ptr,VarPTR(Cs),Len(Cs)
    Result = Ptr
End Function

Function StrFromPtr(P&) As String
    DefStr s = ""
    If P& Then
       DefInt L = HeapSize(GetProcessHeap,HEAP_NO_SERIALIZE,P&)
       If L <> &HFFFFFFFF Then
          s = Space$(L)
          MemCpy VarPTR(s),P&,L
       Else
          s = ""
       End If
    End If
    Result = s
End Function

Function NewPtr(Size&,Typ&) As Integer
    DefInt Ptr = HeapAlloc(GetProcessHeap,HEAP_NO_SERIALIZE Or HEAP_ZERO_MEMORY,Size&)
    If Ptr Then If Typ& Then MemCpy Ptr,Typ&,Size&
    Result = Ptr
End Function

Function NewDouble(Typ#) As Integer
    DefDbl F = Typ#
    DefInt Ptr = HeapAlloc(GetProcessHeap,HEAP_NO_SERIALIZE Or HEAP_ZERO_MEMORY,SizeOf(Double))
    If Ptr Then MemCpy Ptr,VarPTR(F),SizeOf(Double)
    Result = Ptr
End Function

Function DoubleFromPtr(P&) As Double
    Dim M As QMemoryStream
    DefDbl F = 0
    If P& Then
       M.Position = 0
       M.MemCopyFrom(P&,8)
       M.Position = 0
       F = M.ReadNum(8)
       M.Close
       Result = F
    Else
       Result = 0
    End If
End Function

Function NewSingle(Typ!) As Integer
    DefSng F = Typ!
    DefInt Ptr = HeapAlloc(GetProcessHeap,HEAP_NO_SERIALIZE Or HEAP_ZERO_MEMORY,SizeOf(Single))
    If Ptr Then MemCpy Ptr,VarPTR(F),SizeOf(Single)
    Result = Ptr
End Function

Function SingleFromPtr(P&) As Single
    Dim M As QMemoryStream
    DefDbl F = 0
    If P& Then
       M.Position = 0
       M.MemCopyFrom(P&,4)
       M.Position = 0
       F = M.ReadNum(4)
       M.Close
       Result = F
    Else
       Result = 0
    End If
End Function

Function NewInteger(Typ&) As Integer
    DefInt I = Typ&
    DefInt Ptr = HeapAlloc(GetProcessHeap,HEAP_NO_SERIALIZE Or HEAP_ZERO_MEMORY,SizeOf(Integer))
    If Ptr Then MemCpy Ptr,VarPTR(I),SizeOf(Integer)
    Result = Ptr
End Function

Function IntFromPtr(P&) As Integer
    Dim M As QMemoryStream
    DefInt I = 0
    If P& Then
       M.Position = 0
       M.MemCopyFrom(P&,4)
       M.Position = 0
       I = M.ReadNum(4)
       M.Close
       Result = I
    Else
       Result = 0
    End If
End Function

Function NewShort(Typ%) As Integer
    DefShort I = Typ%
    DefInt Ptr = HeapAlloc(GetProcessHeap,HEAP_NO_SERIALIZE Or HEAP_ZERO_MEMORY,SizeOf(Integer))
    If Ptr Then MemCpy Ptr,VarPTR(I),SizeOf(Short)
    Result = Ptr
End Function

Function ShortFromPtr(P&) As Short
    Dim M As QMemoryStream
    DefInt I = 0
    If P& Then
       M.Position = 0
       M.MemCopyFrom(P&,2)
       M.Position = 0
       I = M.ReadNum(2)
       M.Close
       Result = I
    Else
       Result = 0
    End If
End Function

Function NewByte(Typ?) As Integer
    DefByte I = Typ?
    DefInt Ptr = HeapAlloc(GetProcessHeap,HEAP_NO_SERIALIZE Or HEAP_ZERO_MEMORY,SizeOf(Byte))
    If Ptr Then MemCpy Ptr,VarPTR(I),SizeOf(Byte)
    Result = Ptr
End Function

Function ByteFromPtr(P&) As Integer
    Dim M As QMemoryStream
    DefInt I = 0
    If P& Then
       M.Position = 0
       M.MemCopyFrom(P&,1)
       M.Position = 0
       I = M.ReadNum(1)
       M.Close
       Result = I
    Else
       Result = 0
    End If
End Function

Function NewByteArray(Count&) As Integer
    DefInt Ptr = HeapAlloc(GetProcessHeap,HEAP_NO_SERIALIZE Or HEAP_ZERO_MEMORY,SizeOf(Byte)*Count&)
    Result = Ptr
End Function

Function NewIntArray(Count&) As Integer
    DefInt Ptr = HeapAlloc(GetProcessHeap,HEAP_NO_SERIALIZE Or HEAP_ZERO_MEMORY,SizeOf(Integer)*Count&)
    Result = Ptr
End Function

Function NewSngArray(Count&) As Integer
    DefInt Ptr = HeapAlloc(GetProcessHeap,HEAP_NO_SERIALIZE Or HEAP_ZERO_MEMORY,SizeOf(Single)*Count&)
    Result = Ptr
End Function

Function NewDblArray(Count&) As Integer
    DefInt Ptr = HeapAlloc(GetProcessHeap,HEAP_NO_SERIALIZE Or HEAP_ZERO_MEMORY,SizeOf(Double)*Count&)
    Result = Ptr
End Function

Function NewStrArray(Count&) As Integer
    DefInt Ptr = HeapAlloc(GetProcessHeap,HEAP_NO_SERIALIZE Or HEAP_ZERO_MEMORY,SizeOf(String)*Count&)
    Result = Ptr
End Function

marcov
Posts: 3462
Joined: Jun 16, 2005 9:45
Location: Netherlands
Contact:

Re: Allocate+DeAllocate vs GlobalAlloc/HeapAlloc

Post by marcov »

adeyblue wrote:The code of the _aligned functions are here. They've never been fast
Noting that HeapAlloc seems to default to align 16 in 64-bit apps anyway.
Probably because 64-bit on Windows prescribes SSE-2 for floating point. SSE2 is part of the x86_64 ABI, while it is optional for the x32 ABI.
Post Reply