OpenCL (Open Computing Language) for FreeBASIC

Headers, Bindings, Libraries for use with FreeBASIC, Please include example of use to help ensure they are tested and usable.

Post by D.J.Peters »

the posted code was an early test on 32-bit (fixed now)

the line:
dim as cl_int nChars
must be:
dim as size_t nChars
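
Why the type matters: in the OpenCL C API the lengths argument of clCreateProgramWithSource points to size_t values, which are pointer-sized, while cl_int is always 32 bits. On a 64-bit build the call would read 8 bytes from a 4-byte variable. A minimal sketch of the size difference follows. It uses FreeBASIC's built-in long and uinteger as stand-ins, which is an assumption; the actual aliases in cl.bi may be declared differently.

```freebasic
' Stand-ins (assumptions, not taken from cl.bi):
'   long     ~ cl_int  (always 32-bit)
'   uinteger ~ size_t  (same width as a pointer)
dim as long     asClInt
dim as uinteger asSizeT

print "cl_int stand-in : " & sizeof(asClInt) & " bytes"
print "size_t stand-in : " & sizeof(asSizeT) & " bytes"
print "pointer         : " & sizeof(any ptr) & " bytes"

if sizeof(asClInt) < sizeof(any ptr) then
  print "64-bit build: a cl_int is too small to hold a size_t length"
else
  print "32-bit build: both are 4 bytes, so the bug stays hidden"
end if
```

This would explain why the early 32-bit test worked with cl_int.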

Joshy

Post by h4tt3n »

speedfixer wrote: Simply needed to install the dev packages, that's all.
Would you mind providing a link for that? Have been messing around with this and can't get anything to work.

Cheers,
Mike

Post by speedfixer »

Hi, Mike

To tell the truth, I have no clue which I chose: I simply selected from Synaptic those that looked appropriate.

I'm on Lubuntu (call it little Ubuntu: same kernel, smaller/lighter desktop, etc.)

If you are NOT in an Ubuntu system, let me know. Also, tell me what video card you have and its driver.

In Synaptic or whatever your software package manager may be - search for opencl and note the installed packages.
Something should be present for the video card you have installed.

If there is not - you should still have a large list of choices - libraries, tools, interfaces.

Something should match your vendor and/or your current video driver.
This particular system I am on is Nvidia.

Using the opencl search, I NOW show installed:

1 - clinfo - an OpenCL info tool - it will verify OpenCL capability
x - gdb - handy, but if you need this you know it, if you don't know it - don't bother
x - (gdbserver)
x - nvidia-cuda-doc
* - nvidia-opencl-dev
2 - nvidia-opencl-icd-367 (says transitional for 375 - I chose this)
x - nvidia-opencl-icd-375 (my nvidia driver is 375) --- choose one for YOUR driver
x - nvidia-profiler
x - nvidia-visual-profiler
* - ocl-icd-libopencl1 (should be installed with an icd)
3 - opencl-headers

x - should not be needed, but may be installed with some other choice.

Install clinfo - see what it says. Command line - no sudo needed. It may not run if your driver doesn't support OpenCL.

I would suggest installing one at a time until the compile works. No need for more than required, right?

We would all like to know how it worked out.


David

Post by h4tt3n »

Thanks for the detailed answer!

This gives me something to work with. I could really use the extra speed this would provide.

I'm running Win 10 on a newer Lenovo laptop with an AMD Radeon graphics card with a R6 chipset and 10 cores.

Cheers,
Mike

Post by D.J.Peters »

@h4tt3n you need the Windows AMD OpenCL SDK for your Radeon chip, that's all.

Joshy

Post by 12val12newakk »

Could you please post a working example
that transfers two or three float4 values to the kernel and back?

Post by D.J.Peters »

@12val12newakk I won't teach you OpenCL, you have to learn it yourself ;-)

But you can use my FreeBASIC OpenCL "template" as a starting point for your own experiments.

Try to understand every step of the template!

Once you have got the details,
save it under a new name
and use float4 in the kernel!

Allocate the input / output buffers with the right size for float4!

type cl_float4
as cl_float x, y, z, w
end type


NOTE: float4 in host memory must be 16 byte aligned !
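
One way to meet the alignment note is to over-allocate and round the address up to the next multiple of 16. The following is a minimal sketch, not part of the OpenCL package: AllocAligned16 is a hypothetical helper name, and single is used as a stand-in for cl_float so the snippet compiles without cl.bi.

```freebasic
' single stands in for cl_float (assumption: cl.bi maps it to a 32-bit float)
type cl_float4
  as single x, y, z, w
end type

' Hypothetical helper: returns a 16-byte aligned pointer inside a larger raw block.
' Free the block with "deallocate pRaw", never with the aligned pointer.
function AllocAligned16(byval nBytes as uinteger, byref pRaw as any ptr) as any ptr
  pRaw = allocate(nBytes + 15)
  if pRaw = 0 then return 0
  dim as uinteger address = cast(uinteger, pRaw)
  ' round up to the next multiple of 16
  address = (address + 15) and (not cast(uinteger, 15))
  return cast(any ptr, address)
end function

const N_OBJ = 4
dim as any ptr pRaw
dim as cl_float4 ptr p = AllocAligned16(N_OBJ * sizeof(cl_float4), pRaw)

print "sizeof(cl_float4) = " & sizeof(cl_float4)
print "16-byte aligned   = " & iif((cast(uinteger, p) and 15) = 0, "yes", "no")

deallocate pRaw
```

Many allocators already return 16-byte aligned blocks on 64-bit systems, but the check costs nothing.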

By the way, on github.com there are sample OpenCL kernels that solve the simple N-body problem you are working on.

Learning by doing, have fun!

Joshy

file: template.bas

Code: Select all

' file: template.bas
#include "cl.bi"

#ifndef NULL
#define NULL cptr(any ptr,0)
#endif

' GLOBALS

' Input data is stored here.
dim shared as cl_float ptr g_input

' Output data is stored here.
dim shared as cl_float ptr g_output

' Multiplier is stored in this variable 
dim shared as cl_float     g_multiplier

' problem size (width) for 1D algorithm 
dim shared as cl_uint      g_width
' problem size for 2D algorithm (width x height)
dim shared as cl_uint      g_height

' The memory buffer that is used 
' as input/output for OpenCL kernel
dim shared as cl_mem            g_inputBuffer
dim shared as cl_mem            g_outputBuffer

dim shared as cl_context        g_context
dim shared as cl_device_id ptr  g_devices
dim shared as cl_command_queue  g_commandQueue
dim shared as cl_program        g_program

' This program uses only one kernel 
' and this serves as a handle to it
dim shared as cl_kernel         g_kernel

' FUNCTION DECLARATIONS
' OpenCL related initialisations are done here.
' Context, Device list, Command Queue are set up.
' Calls are made to set up OpenCL memory buffers that this program uses
' and to load the programs into memory and get kernel handles.
declare sub initializeCL()

' This is called once the OpenCL context, memory etc. are set up,
' the program is loaded into memory and the kernel handles are ready.
' It sets the values for kernels' arguments and enqueues calls to the kernels
' on to the command queue and waits till the calls have finished execution.
'
' It also gets kernel start and end time if profiling is enabled.
declare sub runCLKernels()

' Releases OpenCL resources (Context, Memory etc.)
declare sub cleanupCL()

' Releases program's resources
declare sub cleanupHost()

' Prints no more than 256 elements of the given array.
' Prints full array if length is less than 256.
' Prints Array name followed by elements.
declare sub print1DArray(arrayName as string, _
                         arrayData as integer ptr, _ 
                         length    as uinteger)

declare sub print1DFloatArray(arrayName as string, _
                              arrayData as cl_float ptr, _ 
                              length    as uinteger)

' Host Initialization
' Allocate and initialize memory on the host
sub initializeHost
  print "initializeHost"
  g_width  = 16
  g_input  = NULL
  g_output = NULL
  g_multiplier = 1.5

  ' Allocate and initialize memory used by host
  dim as cl_uint sizeInBytes = g_width * sizeof(cl_float)
  g_input = allocate(sizeInBytes)
  if (g_input= NULL) then
    print "Error: Failed to allocate input memory on host"
    return
  end if

  g_output = callocate(sizeInBytes)
  if (g_output=NULL) then
    print "Error: Failed to allocate output memory on host"
    return
  end if

  ' fill the input array
  for i as cl_uint = 0  to g_width-1
    g_input[i] = i
  next
  
end sub

' OpenCL related initialization
' Create Context, Device list, Command Queue
' Create OpenCL memory buffer objects
' Load, compile and link the CL source
' Build program and kernel objects
sub initializeCL
  print "initializeCL"
  dim as cl_int status = 0
  dim as size_t deviceListSize

  dim as cl_uint numPlatforms
  status = clGetPlatformIDs(0, NULL, @numPlatforms)
  if (status<>CL_SUCCESS) or (numPlatforms<1) then
    print "Error: Getting Platforms. (clGetPlatformIDs)"
    return
  end if

  dim as cl_platform_id platform
  status = clGetPlatformIDs(1, @platform, NULL)
  if (status<>CL_SUCCESS) then
    print "Error: Getting Platform Ids. (clGetPlatformIDs)"
    return
  end if


  dim as cl_context_properties cps(2)
  cps(0) = CL_CONTEXT_PLATFORM
  cps(1) = cast(cl_context_properties, platform)
  cps(2) = NULL
  ' Create an OpenCL context from platform ID
  g_context = clCreateContextFromType(@cps(0), _
                                      CL_DEVICE_TYPE_GPU, _
                                      NULL, _
                                      NULL, _
                                      @status)
  if (status<>CL_SUCCESS) then
    print "Error: Creating Context. (clCreateContextFromType)"
    return
  end if

  ' First, get the size of device list data
  status = clGetContextInfo(g_context, _
                            CL_CONTEXT_DEVICES, _
                            0, _
                            NULL, _
                            @deviceListSize)
  if(status <> CL_SUCCESS) then
    print "Error: Getting Context Info (device list size, clGetContextInfo)"
    return
  else
    print "Info: deviceListSize = " & deviceListSize
  end if

  ' Detect OpenCL devices
  g_devices = callocate(deviceListSize)
  ' Now, get the device list data
  status = clGetContextInfo(g_context, _
                            CL_CONTEXT_DEVICES, _
                            deviceListSize, _
                            g_devices, _
                            NULL)
  if (status<>CL_SUCCESS) then
    print "Error: Getting Context Info (device list, clGetContextInfo)"
    return
  end if

  ' Create an OpenCL command queue
  g_commandQueue = clCreateCommandQueue(g_context, _
                                        g_devices[0], _
                                        0, _
                                        @status)
  if (status<>CL_SUCCESS) then
    print "Error: Creating Command Queue. (clCreateCommandQueue)"
    return
  end if

  ' Create OpenCL memory buffers
  g_inputBuffer = clCreateBuffer(g_context, _
                                 CL_MEM_READ_WRITE or CL_MEM_USE_HOST_PTR, _
                                 sizeof(cl_float) * g_width, _
                                 g_input, _
                                 @status)
  if (status<>CL_SUCCESS) then
    print "Error: clCreateBuffer (inputBuffer)"
    return
  end if

  g_outputBuffer = clCreateBuffer(g_context, _
                                  CL_MEM_READ_WRITE or CL_MEM_USE_HOST_PTR, _
                                  sizeof(cl_float) * g_width, _
                                  g_output, _
                                  @status)
  if (status<>CL_SUCCESS) then
    print "Error: clCreateBuffer (outputBuffer)"
    return
  end if


  ' build CL program object
  ' create CL kernel object
  dim as string strSource
  strSource &= !"// Sample kernel which multiplies every element of the input array \n"
  strSource &= !"// with a constant and stores it at the corresponding output array \n"
  strSource &= !"__kernel void FloatTestKernel(__global  float * output, \n"
  strSource &= !"                              __global  float * input, \n"
  strSource &= !"                              const     float multiplier) { \n"
  strSource &= !"  uint tid = get_global_id(0); \n"
  strSource &= !"  output[tid] = input[tid] * multiplier; \n"
  strSource &= !"} \n"
  
  dim as zString ptr pSource = strptr(strSource)
  dim as size_t nChars = len(strSource)
  g_program = clCreateProgramWithSource(g_context, _
                                        1, _
                                        @pSource, _
                                        @nChars, _
                                        @status)
  if (status<>CL_SUCCESS) then
    print "Error: (clCreateProgramWithSource) !"
    beep:sleep:return
  end if

  ' create a cl program executable
  ' for the devices specified (one device in this case)
  status = clBuildProgram(g_program, 1, g_devices, NULL, NULL, NULL)
  if (status<>CL_SUCCESS) then
    print "Error: Building Program (clBuildProgram)"
    beep:sleep:return
  end if

  ' get a kernel object handle for
  ' a kernel with the given name
  g_kernel = clCreateKernel(g_program, "FloatTestKernel", @status)
  if (status<>CL_SUCCESS) then
    print "Error: Creating Kernel from program. (clCreateKernel)"
    return
  end if
end sub

' Run OpenCL program
' Bind host variables to kernel arguments
' Run the CL kernel
sub runCLKernels ()
  print "runCLKernels"
  dim as cl_int status
  dim as cl_event events(1)
  dim as size_t globalThreads(1)
  dim as size_t localThreads(0)

  globalThreads(0) = g_width
  localThreads(0)  = 1

  ' Set appropriate arguments to the kernel
  
  ' ARG 1 the output array to the kernel
  status = clSetKernelArg(g_kernel, _
                          0, _
                          sizeof(cl_mem), _
                          @g_outputBuffer)
  if (status<>CL_SUCCESS) then
    print "Error: Setting kernel argument. (output)"
    return
  end if

  ' ARG 2 the input array to the kernel
  status = clSetKernelArg(g_kernel, _
                          1, _
                          sizeof(cl_mem), _
                          @g_inputBuffer )
  if (status<>CL_SUCCESS) then
    print "Error: Setting kernel argument. (input)"
    return
  end if

  ' ARG 3 the multiplier
  status = clSetKernelArg(g_kernel, _
                          2, _
                          sizeof(cl_float), _
                          @g_multiplier )
  if (status<>CL_SUCCESS) then
    print "Error: Setting kernel argument. (multiplier)"
    return
  end if

  ' Enqueue a kernel run call.
  status = clEnqueueNDRangeKernel(g_commandQueue, _
                                  g_kernel, _
                                  1, _
                                  NULL, _
                                  @globalThreads(0), _
                                  @localThreads(0), _
                                  0, _
                                  NULL, _
                                  @events(0))
  if (status<>CL_SUCCESS) then
    print "Error: Enqueueing kernel onto command queue. (clEnqueueNDRangeKernel)"
    return
  end if

  ' wait for the kernel call to finish execution
  status = clWaitForEvents(1, @events(0))
  if (status<>CL_SUCCESS) then
    print "Error: Waiting for kernel run to finish. (clWaitForEvents)"
    return
  end if

  status = clReleaseEvent(events(0))
  if (status<>CL_SUCCESS) then
    print "Error: clReleaseEvent. (events[0])"
    return
  end if

  ' Enqueue readBuffer
  status = clEnqueueReadBuffer(g_commandQueue, _
                               g_outputBuffer, _
                               CL_TRUE, _
                               0, _
                               g_width * sizeof(cl_float), _
                               g_output, _
                               0, _
                               NULL, _
                               @events(1) )

  if (status<>CL_SUCCESS) then
    print "Error: clEnqueueReadBuffer failed. (clEnqueueReadBuffer)"
    ' events(1) is not valid after a failed enqueue, so stop here
    return
  end if

  ' Wait for the read buffer to finish execution
  status = clWaitForEvents(1, @events(1))
  if (status<>CL_SUCCESS) then
    print "Error: Waiting for read buffer call to finish. (clWaitForEvents)"
    return
  end if

  status = clReleaseEvent(events(1))
  if (status<>CL_SUCCESS) then
    print "Error: clReleaseEvent. (events[1])"
    return
  end if
end sub

' Release OpenCL resources (Context, Memory etc.) 
sub cleanupCL 
  print "cleanupCL"
  dim as cl_int status
  status = clReleaseKernel(g_kernel)
  if (status<>CL_SUCCESS) then
    print "Error: In clReleaseKernel"
    return
  end if
  status = clReleaseProgram(g_program)
  if (status<>CL_SUCCESS) then
    print "Error: In clReleaseProgram"
    return
  end if
  status = clReleaseMemObject(g_inputBuffer)
  if (status<>CL_SUCCESS) then
    print "Error: In clReleaseMemObject (inputBuffer)"
    return
  end if
  status = clReleaseMemObject(g_outputBuffer)
  if (status<>CL_SUCCESS) then
    print "Error: In clReleaseMemObject (outputBuffer)"
    return
  end if
  status = clReleaseCommandQueue(g_commandQueue)
  if (status<>CL_SUCCESS) then
    print "Error: In clReleaseCommandQueue"
    return
  end if
  status = clReleaseContext(g_context)
  if (status<>CL_SUCCESS) then
    print "Error: In clReleaseContext"
    return
  end if
end sub

' Releases program's resources
sub cleanupHost ()
 print "cleanupHost"
  if (g_input<>NULL) then
    deallocate g_input
    g_input = NULL
  end if
  if (g_output<>NULL) then
    deallocate g_output
    g_output = NULL
  end if
  if (g_devices<>NULL) then
    deallocate g_devices
    g_devices = NULL
  end if
end sub

' Print no more than 256 elements of the given array.
' Print Array name followed by elements.
sub print1DArray(arrayName as string, _
                 arrayData as integer ptr, _
                 length    as uinteger)
  dim as cl_uint i
  dim as cl_uint numElementsToPrint = iif(length>256,256,length)
  print arrayName
  for i = 0 to numElementsToPrint-1
    print arrayData[i] & " ";
  next
  print
end sub

sub print1DFloatArray(arrayName as string, _
                      arrayData as cl_float ptr, _
                      length    as uinteger)
  dim as cl_uint i
  dim as cl_uint numElementsToPrint = iif(length>256,256,length)
  print arrayName
  for i = 0 to numElementsToPrint-1
    print arrayData[i] & " ";
  next
  print
end sub

'
' main
'
' Initialize Host application 
initializeHost()

' Initialize OpenCL resources
initializeCL()

' print input array
print1DFloatArray("Input",g_input,g_width)

' Run the CL program
runCLKernels()

' Print output array
print1DFloatArray("Output",g_output,g_width)

' Releases OpenCL resources 
cleanupCL()

' Release host resources
cleanupHost()
sleep

Post by D.J.Peters »

I tested float4 successfully for you!

Joshy

Code: Select all

' file: float4.bas 
' test of: float4

#include once "cl.bi"

sub fail(errcode as cl_int,msg as string)
  if (errcode<>CL_SUCCESS) then
    if ScreenPtr() then sleep 2000,1:screen 0
    print msg
    print getCLErrorString(errcode)
    beep:sleep:end
  end if
end sub

function GetPlatformID() as cl_platform_id
  dim as cl_platform_id platformid
  dim as cl_int failed = clGetPlatformIDs(1,@platformid, NULL)
  if failed then fail(failed,"Unable to enumerate platform IDs")
  return platformid
end function

function GetDeviceID(byval platformid as cl_platform_id, _
                     byval devicetype as cl_device_type=CL_DEVICE_TYPE_GPU) as cl_device_id
  dim as cl_device_id deviceid          
  dim as cl_int failed = clGetDeviceIDs(platformid,devicetype,1,@deviceid,NULL)
  'if (failed = CL_DEVICE_NOT_FOUND) andalso (devicetype=CL_DEVICE_TYPE_GPU) then
    ' try CPU also  
  '  failed = clGetDeviceIDs(platformid,CL_DEVICE_TYPE_CPU,1,@deviceid,NULL)
  'end if
  if failed then fail(failed,"Unable to enumerate GPU/CPU device IDs")
  return deviceid
end function

function CreateContext(deviceid as cl_device_id) as cl_context
  dim as cl_int failed
  dim as cl_context context = clCreateContext(0,1,@deviceid,NULL,NULL,@failed)
  if failed then fail(failed,"Unable to create context")
  return context
end function


function CreateCommandQueue(byval context as cl_context, _
                            byval deviceid as cl_device_id) as cl_command_queue
  dim as cl_int failed
  dim as cl_command_queue queue = clCreateCommandQueue(context,deviceid,0,@failed)
  if failed then fail(failed,"Unable to create command queue")
  return queue
end function

function CreateProgram(byval context as cl_context, _
                       byval deviceid as cl_device_id, _
                       byval sourcecode as string) as cl_program
  
  dim as cl_int failed
  dim as zstring ptr pCode = strptr(sourcecode)
  dim as cl_program program = clCreateProgramWithSource(context,1,@pCode,NULL,@failed)
  if failed or program=NULL then 
    fail(failed,"Unable to create program from source code")
  end if  
  failed = clBuildProgram(program,0,NULL,NULL,NULL,NULL)
  if failed then
    dim as size_t nChars
    clGetProgramBuildInfo(program,deviceid,CL_PROGRAM_BUILD_LOG,0,NULL,@nChars)
    dim as string strLog=space(nChars)
    clGetProgramBuildInfo(program,deviceid,CL_PROGRAM_BUILD_LOG,nChars,strptr(strLog),NULL)
    fail(failed,!"Unable to build program " & strLog)
  end if
  return program
end function

function CreateKernel(byval program as cl_program, _
                      byval kernelname as zstring ptr) as cl_kernel
  dim as cl_int failed
  dim as cl_kernel kernel = clCreateKernel(program,kernelname,@failed)
  if failed then
    clReleaseProgram(program)
    fail(failed,"Unable to create kernel")
  end if
  return kernel
end function

function CreateReadBuffer(byval context as cl_context, _
                          byval nBytes as size_t) as cl_mem
  dim as cl_int failed
  dim as cl_mem p = clCreateBuffer(context,CL_MEM_READ_ONLY,nBytes,NULL,@failed)
  if failed then fail(failed,"Unable to create READ_ONLY buffer")
  return p
end function

function CreateWriteBuffer(byval context as cl_context, _
                           byval nBytes as size_t) as cl_mem
  dim as cl_int failed
  dim as cl_mem p = clCreateBuffer(context,CL_MEM_WRITE_ONLY,nBytes,NULL,@failed)
  if failed then fail(failed,"Unable to create WRITE_ONLY buffer")
  return p
end function

function CreateReadWriteBuffer(byval context as cl_context, _
                               byval nBytes as size_t) as cl_mem
  dim as cl_int failed
  dim as cl_mem p = clCreateBuffer(context,CL_MEM_READ_WRITE,nBytes,NULL,@failed)
  if failed then fail(failed,"Unable to create READ_WRITE buffer")
  return p
end function

sub SetKernelArg(byval kernel    as cl_kernel, _
                 byval nArgument as cl_uint, _
                 byval ArgSize   as size_t, _
                 byval pValue    as any ptr)
  dim as cl_int failed = clSetKernelArg(kernel,nArgument,ArgSize,pValue)
  if failed then fail(failed,"Unable to set kernel arg " & nArgument)
end sub

function GetKernelWorkGroupSize(byval kernel as cl_kernel, _
                                byval deviceid as cl_device_id) as size_t
  dim as size_t WorkGroupSize
  dim as cl_int failed = clGetKernelWorkGroupInfo(kernel,deviceid,CL_KERNEL_WORK_GROUP_SIZE, sizeof(size_t),@WorkGroupSize,NULL)
  if failed then fail(failed,"Unable to get kernel work-group size")
  return WorkGroupSize
end function

function GetDeviceMaxWorkItemDimensions(byval deviceid as cl_device_id) as cl_uint
  ' CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS is returned as a cl_uint
  dim as cl_uint dimensions
  clGetDeviceInfo(deviceid,CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS,sizeof(cl_uint),@dimensions,NULL)
  return dimensions
end function

function GetDeviceMaxWorkItemSizes(byval deviceid as cl_device_id, byval dimension as cl_uint) as size_t
  var maxdims = GetDeviceMaxWorkItemDimensions(deviceid)
  if dimension>=maxdims then return 0 ' valid indices are 0..maxdims-1
  dim as size_t sizes(maxdims-1)
  clGetDeviceInfo(deviceid,CL_DEVICE_MAX_WORK_ITEM_SIZES,maxdims*sizeof(size_t),@sizes(0),NULL)
  return sizes(dimension)
end function

'
' main
'
#if 1
type cl_float4
  as cl_float x,y,z,w
end type 
#else
type cl_float4
  union  : as cl_float x,r,s : end union
  union  : as cl_float y,g,t : end union
  union  : as cl_float z,b,u : end union
  union  : as cl_float w,a,v : end union
end type 
#endif

dim as string code
code  = !"__kernel void testFloat4( __global float4 *pBuffer, uint uWidth, uint uHeight) {  \n"
code &= !"  uint id = get_global_id(0); \n"
code &= !"  float x = (float)(id % uWidth) / (float)(uWidth); \n"
code &= !"  float y = (float)(id / uWidth) / (float)(uHeight); \n"
code &= !"  pBuffer[id] = (float4)(x,y,1-x,1); \n"
code &= !"} \n"

var platform      = GetPlatformID()
var device        = GetDeviceID(platform)
var context       = CreateContext(device)
var queue         = CreateCommandQueue(context,device)
var program       = CreateProgram(context,device,code)
var kernel        = CreateKernel(program,"testFloat4")
var workGroupSize = GetKernelWorkGroupSize(kernel,device)

'var nDimensions = GetDeviceMaxWorkItemDimensions(device)
'for i as integer=0 to nDimensions-1
'  print "workdim[" & i & "] = " & GetDeviceMaxWorkItemSizes(device,i)
'next



dim as integer w=256,h=256
'screeninfo w,h
'w*=.5 : h*=.5
dim as size_t globalSize = w*h
dim as cl_uint uWidth=w,uHeight=h

screenres w,h,32
screenset 1,0

var img = ImageCreate(w,h)
dim as ubyte ptr pPixel,pImagePixels
dim as integer iPitch
imageinfo img,,,,iPitch,pImagePixels
iPitch shr=2 ' bytes to pixels

' the kernel writes to this buffer, so it must not be CL_MEM_READ_ONLY
var pBuffer = CreateWriteBuffer(context,globalSize*sizeof(cl_float4))
SetKernelArg(kernel, 0, sizeof(cl_mem) ,@pBuffer)
SetKernelArg(kernel, 1, sizeof(cl_uint),@uWidth)
SetKernelArg(kernel, 2, sizeof(cl_uint),@uHeight)

dim as integer frames,fps
dim as double tNow,tLast=Timer

while inkey()=""

  ' run the kernel
  dim as cl_int failed = clEnqueueNDRangeKernel(queue,kernel,1,NULL,@globalSize, NULL, 0, NULL, NULL)
  if failed then fail(failed,"Unable to enqueue kernel")
  
  ' map the GPU buffer in HOST memory space  
  dim as cl_float4 ptr pFloat4 = clEnqueueMapBuffer(queue, pBuffer,CL_TRUE,CL_MAP_READ,0,globalSize*sizeof(cl_float4), 0, NULL, NULL, @failed)
  if failed then fail(failed,"Unable to map buffer") 
  
  ' read the mapped memory
  pPixel = pImagePixels
  for i as integer = 0 to globalSize-1
    pPixel[2] = pFloat4[i].x*255
    pPixel[1] = pFloat4[i].y*255
    pPixel[0] = pFloat4[i].z*255
    pPixel[3] = pFloat4[i].w*255
    pPixel+=4
  next
  ' put the image on screen  
  draw string img,(0,0),"frame: " & frames & " fps: " & fps,RGB(255,255,255)
  put (0,0),img,PSET
  
  ' unmap the memory
  failed = clEnqueueUnmapMemObject( queue, pBuffer, pFloat4, 0, NULL, NULL)
  if failed then fail(failed,"Unable to unmap buffer") 
  ' prepare the queue for next loop 
  failed = clFinish(queue)
  if failed then fail(failed,"Unable to finish the queue") 

  flip : frames+=1
  if frames mod 100=0 then
    tNow=timer() : fps=100/(tNow-tLast) : tLast=tNow
  end if
  ' sleep 10
wend

' free all resources
ImageDestroy(img)
clReleaseMemObject(pBuffer)
clReleaseKernel(kernel)
clReleaseProgram(program)
clReleaseCommandQueue(queue)
clReleaseContext(context)
Last edited by D.J.Peters on Dec 21, 2019 19:33, edited 1 time in total.

Post by 12val12newakk »

D.J.Peters, thank you for your help!
I could not run this code,
but test01 and test02 work well.
"
Unable to build program C:\Users\I54590~1\AppData\Local\Temp\\OCL6836T1.cl:1:15: error: kernel cannot be called 'main'
__kernel void main( __global float4 *dst, uint width, uint height) {
^
1 error generated.
error: Clang front-end compilation failed!
Frontend phase failed compilation.
Error: Compiling CL to IR "
PS:
Is "main" a forbidden name for a __kernel?
I renamed it to "openCL_sub" and it works.
-----
I did not understand float4.
Could you give a specific example for dummies?
How do I directly convert x(i), y(i), z(i), w(i) --> float4
and Vx(i), Vy(i), Vz(i), Vw(i) --> float4?

I want to run this kernel:
https://github.com/ubernaut/stableorbit ... y/part2.cl
and then change the gravitational potential to the one I need (an L-J potential),
then enter events and watch the reaction of the system.

Post by badidea »

12val12newakk wrote: how directly convert x(i), y(i), z(i), w(i)--> float4
Vx(i),Vy(i),Vz(i),Vw(i)--> float4 ?
Don't use x(i), etc. Use something(i).x, see:

Code: Select all

#define cl_float single '<-- I don't have cl_float

type cl_float4
	as cl_float x, y, z, w
end type

const N_OBJ = 100

dim as cl_float4 p(N_OBJ-1) 'position
dim as cl_float4 v(N_OBJ-1) 'velocity

for i as integer = 0 to N_OBJ-1
	print p(i).x, p(i).y, p(i).z, p(i).w
	print v(i).x, v(i).y, v(i).z, v(i).w
next
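
If the data already lives in separate x(), y(), z(), w() arrays, it can be packed once into an array of cl_float4 before it is handed to OpenCL. A minimal sketch with made-up example values follows; single stands in for cl_float (an assumption), so the snippet compiles without cl.bi.

```freebasic
type cl_float4
  as single x, y, z, w   ' single stands in for cl_float (assumption)
end type

const N_OBJ = 3

' existing separate component arrays (example data)
dim as single x(N_OBJ-1) => {1, 2, 3}
dim as single y(N_OBJ-1) => {10, 20, 30}
dim as single z(N_OBJ-1) => {100, 200, 300}
dim as single w(N_OBJ-1) => {1, 1, 1}

' one packed array, laid out in memory as x0 y0 z0 w0 x1 y1 z1 w1 ...
dim as cl_float4 p(N_OBJ-1)

for i as integer = 0 to N_OBJ-1
  p(i).x = x(i) : p(i).y = y(i) : p(i).z = z(i) : p(i).w = w(i)
next

' @p(0) is the host pointer OpenCL would receive, with this byte size:
print "bytes: " & N_OBJ * sizeof(cl_float4)
print p(1).x, p(1).y, p(1).z, p(1).w
```

The velocities would be packed the same way into a second cl_float4 array.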

Post by 12val12newakk »

D.J.Peters,
-----
your examples do some direct copying; I can't use that. What I need is:
array host_pos (float4) to CreateReadWriteBuffer
array host_vel (float4) to CreateReadWriteBuffer
iteration calculation
CreateReadWriteBuffer to array host_pos (float4)
CreateReadWriteBuffer to array host_vel (float4)

Please do it without "clEnqueueMapBuffer" and "clEnqueueUnmapMemObject".
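
A copy-based round trip of the kind asked for here could be sketched as follows. It uses clEnqueueWriteBuffer and clEnqueueReadBuffer instead of mapping. Assumptions: cl.bi declares clEnqueueWriteBuffer with the standard OpenCL C signature (the thread only shows clEnqueueReadBuffer), size_t is available (newer versions of the header use cl_size_t instead), and the context, queue and kernel have already been created as in the earlier examples. The names StepOnDevice, g_advanceSource and the kernel "advance" are made up for illustration, and the code has not been run against the package.

```freebasic
#include once "cl.bi"

#ifndef NULL
#define NULL cptr(any ptr,0)
#endif

type cl_float4
  as cl_float x, y, z, w
end type

' Hypothetical kernel: pos += vel * dt, one work-item per particle.
' Build it with clCreateProgramWithSource / clBuildProgram as shown earlier.
dim shared as string g_advanceSource
g_advanceSource  = !"__kernel void advance(__global float4 *pos, \n"
g_advanceSource &= !"                      __global float4 *vel, \n"
g_advanceSource &= !"                      const float dt) { \n"
g_advanceSource &= !"  size_t i = get_global_id(0); \n"
g_advanceSource &= !"  pos[i] += vel[i] * dt; \n"
g_advanceSource &= !"} \n"

' Hypothetical helper: host arrays -> device, one kernel run, device -> host arrays.
function StepOnDevice(byval context as cl_context, _
                      byval queue   as cl_command_queue, _
                      byval kernel  as cl_kernel, _
                      byval hostPos as cl_float4 ptr, _
                      byval hostVel as cl_float4 ptr, _
                      byval nObj    as cl_uint, _
                      byval dt      as cl_float) as cl_int
  dim as cl_int status
  dim as size_t nBytes     = nObj * sizeof(cl_float4)
  dim as size_t globalSize = nObj   ' no width/height, just the particle count

  dim as cl_mem posBuf = clCreateBuffer(context, CL_MEM_READ_WRITE, nBytes, NULL, @status)
  if status <> CL_SUCCESS then return status
  dim as cl_mem velBuf = clCreateBuffer(context, CL_MEM_READ_WRITE, nBytes, NULL, @status)
  if status <> CL_SUCCESS then
    clReleaseMemObject(posBuf)
    return status
  end if

  ' host -> device (blocking writes)
  status = clEnqueueWriteBuffer(queue, posBuf, CL_TRUE, 0, nBytes, hostPos, 0, NULL, NULL)
  if status = CL_SUCCESS then
    status = clEnqueueWriteBuffer(queue, velBuf, CL_TRUE, 0, nBytes, hostVel, 0, NULL, NULL)
  end if

  ' bind the three kernel arguments
  if status = CL_SUCCESS then status = clSetKernelArg(kernel, 0, sizeof(cl_mem),   @posBuf)
  if status = CL_SUCCESS then status = clSetKernelArg(kernel, 1, sizeof(cl_mem),   @velBuf)
  if status = CL_SUCCESS then status = clSetKernelArg(kernel, 2, sizeof(cl_float), @dt)

  ' run one iteration; NULL lets the driver choose the local size
  if status = CL_SUCCESS then
    status = clEnqueueNDRangeKernel(queue, kernel, 1, NULL, @globalSize, NULL, 0, NULL, NULL)
  end if

  ' device -> host (blocking reads; an in-order queue runs them after the kernel)
  if status = CL_SUCCESS then
    status = clEnqueueReadBuffer(queue, posBuf, CL_TRUE, 0, nBytes, hostPos, 0, NULL, NULL)
  end if
  if status = CL_SUCCESS then
    status = clEnqueueReadBuffer(queue, velBuf, CL_TRUE, 0, nBytes, hostVel, 0, NULL, NULL)
  end if

  clReleaseMemObject(posBuf)
  clReleaseMemObject(velBuf)
  return status
end function
```

For many iterations it would be cheaper to create the buffers once, write them once, enqueue the kernel in a loop and read back only when the host needs the data.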

Post by 12val12newakk »

I have temporarily abandoned float4.
I can't extend your examples, no way.
When I try to increase the number of transferred arrays
by cloning the kernel arguments,
everything falls apart.
I tearfully ask for an example of bidirectional (ReadWrite) parameter transfer for a simplified kernel like this:

Code: Select all

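__kernel void SimpleKernel  ( __global float * posx,
                              __global float * posy,
                              __global float * posz,
                              __global float * velx,
                              __global float * vely,
                              __global float * velz,
                              __global float * mass,
                              __global int   * clr,
                              const    float   dt)
      {
        // one work-item per particle: index the arrays, do not move the pointers
        size_t i = get_global_id(0);
        posx[i] = posx[i] + dt;
        posy[i] = posy[i] + dt;
        posz[i] = posz[i] + dt;
        velx[i] = velx[i] - dt;
        vely[i] = vely[i] - dt;
        velz[i] = velz[i] - dt;
        mass[i] = mass[i] + dt;
        clr[i]  = clr[i] - 1;
      }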
Can you specify the array sizes directly, as N_OBJ * 4,
without width and height? I do not draw pictures. I arrange the particles and calculate their total interaction from the pair interactions.
I draw only circles with the radius of zero force.

Post by D.J.Peters »

D.J.Peters wrote: The Open Computing Language for FreeBASIC.

download: OpenCL.zip from: Jan 03, 2020

Import libs (32/64-bit) and the OpenCL specs are included.

Don't forget the "readme" ;-)

Joshy

Simple test:

The default OpenGL shader from shadertoy.com as an OpenCL version:

Code: Select all

#include "cl.bi"

dim as string CODE
CODE &= !"kernel void shader (global uchar4 * pixels, uint width, uint height, float time) \n"
CODE &= !"{\n"
CODE &= !"  uint ix = get_global_id(0); \n"
CODE &= !"  uint iy = get_global_id(1); \n"
CODE &= !"  uint i  = ix+width*iy;      \n"
CODE &= !"  float x = ix/(float)width;  \n"
CODE &= !"  float y = iy/(float)height; \n"
CODE &= !"  uchar r = (uchar)(x*255.f); \n"
CODE &= !"  uchar g = (uchar)(y*255.f); \n" 
CODE &= !"  uchar b = (uchar)(255.f * (0.5f + 0.5f*sin(time))); \n" 
CODE &= !"  pixels[i] = (uchar4)(b,g,r,255); \n"
CODE &= !"}\n"

dim as zstring ptr pCode = strptr(CODE)
dim as cl_platform_id platform
' get first platform
dim as cl_int status = clGetPlatformIDs(1,@platform,NULL)


' get first device from platform try a GPU device at first
dim as cl_device_id device
status = clGetDeviceIDs(platform,CL_DEVICE_TYPE_GPU,1,@device,NULL)
if status = CL_DEVICE_NOT_FOUND then
  ' fallback: try a CPU device also
  status = clGetDeviceIDs(Platform,CL_DEVICE_TYPE_CPU,1,@device,NULL)
end if
if status then
  print "error: can't create OpenCL device !"
  flip : beep : sleep : end 1
end if

' create a device context
dim as cl_context context = clCreateContext(0,1,@device,NULL,NULL,@status)

' create program object in context from sourcecode
dim as cl_program program = clCreateProgramWithSource(context,1,@pCode,NULL,@status)

' compile and link the program
status = clBuildProgram(program, 0, NULL, NULL, NULL, NULL)
if status then
  print "error: can't compile and link kernel !"
  dim as size_t nChars ' get size of log
  clGetProgramBuildInfo(program,device,CL_PROGRAM_BUILD_LOG,0,0,@nChars)
  dim as string strLog=space(nChars)
  ' get log from compiler/linker
  clGetProgramBuildInfo(program,device,CL_PROGRAM_BUILD_LOG,nChars,strptr(strLog),NULL)
  print strLog 
  flip : beep : sleep : end 1
end if

' create a kernel object
dim as cl_kernel kernel = clCreateKernel(program, "shader", @status)

' get size of workgroup from device
dim as size_t WorkGroupSize
status = clGetKernelWorkGroupInfo(kernel,device,CL_KERNEL_WORK_GROUP_SIZE, sizeof(size_t),@WorkGroupSize,NULL)

' calculate local worksize
'(must be workgroupsize div power of two)
dim as size_t LocalSize = WorkGroupSize
while LocalSize*LocalSize > WorkGroupSize
  LocalSize shr=1
wend

dim as cl_int GlobalWidth=640,GlobalHeight = GlobalWidth/16*9

' calculate global worksize (should be a multiple of local worksize)
GlobalWidth -=GlobalWidth  mod LocalSize
GlobalHeight-=GlobalHeight mod LocalSize

screenres GlobalWidth,GlobalHeight,32,2
screenset 1,0

windowtitle "global: " & GlobalWidth & " x " & GlobalHeight & " local: " & LocalSize & " x " & LocalSize & " worksize"

dim as any ptr pixels,img=imagecreate(GlobalWidth,GlobalHeight)
imageinfo img,,,,,pixels
dim as size_t bufSize = GlobalWidth*GlobalHeight*4
dim as size_t globalSizes(...) => {GlobalWidth,GlobalHeight}
dim as size_t localSizes (...) => {LocalSize  ,LocalSize}

' create a memory buffer in the context
dim as cl_mem buffer = clCreateBuffer(context, CL_MEM_READ_WRITE, bufSize, NULL, @status)

' set the parameters of the kernel
status = clSetKernelArg(kernel, 0, sizeof(cl_mem ),@buffer)
status = clSetKernelArg(kernel, 1, sizeof(cl_uint),@GlobalWidth)
status = clSetKernelArg(kernel, 2, sizeof(cl_uint),@GlobalHeight)

' create command queue
dim as cl_command_queue queue = clCreateCommandQueue(context,device,0,@status)
if status then
  print "error: can't create a command queue !"
  flip : beep : sleep : end 1
end if

dim as integer frames,fps
dim as double  tStart=timer
dim as double  tLast =tStart
while inkey()=""
  dim as single runtime = timer()-tStart
  ' set the time parameter
  status = clSetKernelArg(kernel, 3, sizeof(cl_float),@runtime)
  ' execute the "2D" kernel
  status = clEnqueueNDRangeKernel(queue,kernel,2, NULL, @globalSizes(0), @localSizes(0),0,NULL,NULL)
  ' read/copy the device buffer in the image buffer
  status = clEnqueueReadBuffer   (queue,buffer,CL_TRUE,0,bufSize,pixels,0,NULL,NULL)
  ' draw the runtime and fps in the image
  draw string  img,(0,0),"fps: " & fps & " time: " & runtime 
  ' put it on the screen
  put (0,0),img,PSET
  ' make hidden page visible
  flip
  ' count the frames
  frames+=1
  ' update the fps every 60 frames
  if frames mod 60=0 then
    var tNow=timer()
    fps=60/(tNow-tLast)
    tLast=tNow
  end if
wend

' free all resources
imagedestroy          img
clReleaseKernel       kernel
clReleaseCommandQueue queue
clReleaseProgram      program
clReleaseMemObject    buffer
clReleaseContext      context
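
The work-size arithmetic from the example above can be tried on its own, without any OpenCL calls. When an explicit local size is passed to clEnqueueNDRangeKernel, the product of the local sizes has to fit into one work-group, and in OpenCL 1.x each global size has to be a multiple of the matching local size. In this sketch the work-group size of 256 is an assumed value; a real program would take it from clGetKernelWorkGroupInfo.

```freebasic
' Pure host-side arithmetic; 256 is an assumed work-group size.
dim as uinteger workGroupSize = 256

' halve until a localSize x localSize square fits into one work-group
dim as uinteger localSize = workGroupSize
while localSize * localSize > workGroupSize
  localSize shr= 1
wend

dim as uinteger globalWidth  = 640
dim as uinteger globalHeight = 360   ' 640/16*9

' round down so that each global size is a multiple of the local size
globalWidth  -= globalWidth  mod localSize
globalHeight -= globalHeight mod localSize

print "local : " & localSize   & " x " & localSize      ' prints "local : 16 x 16"
print "global: " & globalWidth & " x " & globalHeight   ' prints "global: 640 x 352"
```

Rounding down loses a few rows of the image; rounding up and guarding the kernel with a bounds check is the usual alternative.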

Post by srvaldez »

Hello D.J.Peters,
size_t is not defined; I had to #include "crt\ctype.bi", after that it compiled and ran OK.
Thank you for this and a happy new year :-)

Post by D.J.Peters »

The latest version (Jan 2020) is independent of the C runtime!

You have to use "cl_size_t" as a replacement for "size_t" now.

Joshy

from file "cl.bi"

Code: Select all

'#include "crt/stdint.bi"
'#include "crt/stdlib.bi"
...
' added
type cl_size_t as uinteger ' 32/64 bit