Try to use the ZMM register

Windows specific questions.
fifoul
Posts: 15
Joined: Oct 17, 2005 13:20
Location: France

Post by fifoul »

I tried to replace the YMM registers with ZMM registers in this program, but when the program starts it crashes immediately, and I don't know why.
(No error appears during FreeBASIC compilation.)
Compiler command: "<$fbc>" "<$file>" -arch x86-64

Code:

screenres 800,600,32
    
dim as single num1(0 to 19),num2(0 to 19),num3(0 to 19)

num1(0)=1.1
num1(1)=3.3
num1(2)=5.5
num1(3)=7.7
num1(4)=9.9
num1(5)=11.1
num1(6)=13.3
num1(7)=15.5      ' end of ymm
num1(8)=17.7      ' out of ymm

num2(0)=2.2
num2(1)=4.4
num2(2)=6.6
num2(3)=8.8
num2(4)=10.0
num2(5)=12.2
num2(6)=14.4
num2(7)=16.6      ' end of ymm
num2(8)=20.0      ' out of ymm

asm
 vmovups ymm0,ymmword ptr[num1]
 vmovups ymm1,ymmword ptr[num2]              
 vaddps  ymm2,ymm0,ymm1                         
 vmovups ymmword ptr[num3],ymm2                 
end asm

'asm
' vmovups zmm0,zmmword ptr[num1]
' vmovups zmm1,zmmword ptr[num2]                  
' vaddps  zmm2,zmm0,zmm1                          
' vmovups zmmword ptr[num3],zmm2
'end asm

?:?" 8 Single float addition test with YMM register"
?
?:? num1(0);" +";num2(0);" =";num3(0);"   -> first addition, it's the beginning of YMM register"
?:? num1(1);" +";num2(1);" =";num3(1)
?:? num1(2);" +";num2(2);" =";num3(2)
?:? num1(3);" +";num2(3);" =";num3(3)
?:? num1(4);" +";num2(4);" =";num3(4)
?:? num1(5);" +";num2(5);" =";num3(5)
?:? num1(6);" +";num2(6);" =";num3(6)
?:? num1(7);" +";num2(7);" =";num3(7);"   -> last addition because it's the end of YMM register"
?:? num1(8);" +";num2(8);" =";num3(8);"   -> no addition because it's outside of YMM register"

sleep
Last edited by fxm on Apr 20, 2023 19:40, edited 1 time in total.
Reason: Added code tags.
adeyblue
Posts: 299
Joined: Nov 07, 2019 20:08

Re: Try to use the ZMM register

Post by adeyblue »

The first question would probably be - do your OS & Processor support AVX-512?

Code:

#include once "windows.bi"

#define XSTATE_AVX512_KMASK                 (5)
#define XSTATE_AVX512_ZMM_H                 (6)
#define XSTATE_AVX512_ZMM                   (7)
#define PF_AVX512F_INSTRUCTIONS_AVAILABLE   (41)

#define XSTATE_MASK_AVX512                  ((1ull Shl (XSTATE_AVX512_KMASK)) Or _
                                             (1ull Shl (XSTATE_AVX512_ZMM_H)) Or _
                                             (1ull Shl (XSTATE_AVX512_ZMM)))

dim as Long fails

If (GetEnabledXStateFeatures() And XSTATE_MASK_AVX512) <> XSTATE_MASK_AVX512 Then
    Print "OS support for AVX512 is not enabled"
    fails += 1
End If
If IsProcessorFeaturePresent(PF_AVX512F_INSTRUCTIONS_AVAILABLE) = 0 Then
    Print "CPU Support for AVX512 is not present"
    fails += 1
ElseIf fails = 0 Then
    Print "AVX512 should be enabled"
End If 
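
If both tests pass, one way to avoid the crash is to pick the code path at run time. The following is only a sketch, not tested on AVX-512 hardware. It assumes that windows.bi declares GetEnabledXStateFeatures and IsProcessorFeaturePresent (as the check above relies on), that the assembler accepts ZMM operands (the original post reports no compile error), and that the CPU has at least AVX for the fallback. The names AVX512_XSTATE_MASK and useZmm are made up for illustration.

```freebasic
#include once "windows.bi"

' Same value as in the check above; defined here only if the headers lack it
#ifndef PF_AVX512F_INSTRUCTIONS_AVAILABLE
#define PF_AVX512F_INSTRUCTIONS_AVAILABLE 41
#endif

' XState bits 5, 6 and 7 (opmask, upper ZMM halves, ZMM16-31), as in the check above.
' AVX512_XSTATE_MASK is an illustrative name, not a Windows SDK identifier.
Const AVX512_XSTATE_MASK As ULongInt = (1ull Shl 5) Or (1ull Shl 6) Or (1ull Shl 7)

Dim As Single num1(0 To 19), num2(0 To 19), num3(0 To 19)
For i As Integer = 0 To 19
    num1(i) = i + 0.5
    num2(i) = i * 2
Next

' Use the ZMM path only when both the OS and the CPU report AVX-512F
Dim As Boolean useZmm = False
If (GetEnabledXStateFeatures() And AVX512_XSTATE_MASK) = AVX512_XSTATE_MASK Then
    If IsProcessorFeaturePresent(PF_AVX512F_INSTRUCTIONS_AVAILABLE) <> 0 Then useZmm = True
End If

If useZmm Then
    ' 512-bit path: one instruction adds 16 singles (elements 0 to 15)
    Asm
        vmovups zmm0, zmmword ptr [num1]
        vmovups zmm1, zmmword ptr [num2]
        vaddps  zmm2, zmm0, zmm1
        vmovups zmmword ptr [num3], zmm2
    End Asm
    Print "ZMM path: 16 additions"
Else
    ' 256-bit fallback from the original program: adds 8 singles (elements 0 to 7).
    ' This still assumes AVX is present; it is not checked here.
    Asm
        vmovups ymm0, ymmword ptr [num1]
        vmovups ymm1, ymmword ptr [num2]
        vaddps  ymm2, ymm0, ymm1
        vmovups ymmword ptr [num3], ymm2
    End Asm
    Print "YMM path: 8 additions"
End If

' Elements past the register width stay at 0 in num3
For i As Integer = 0 To 19
    Print num1(i); " +"; num2(i); " ="; num3(i)
Next

Sleep
```

On a machine without AVX-512 this should take the YMM branch instead of raising an illegal-instruction exception, because the ZMM instructions are never executed.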
SARG
Posts: 1756
Joined: May 27, 2005 7:15
Location: FRANCE

Re: Try to use the ZMM register

Post by SARG »

Very few processors support this technology (mostly i9 and Xeon),
and Intel is dropping it.

So it would not be surprising if your processor doesn't have it.

On my PC I get an exception: EXCEPTION_ILLEGAL_INSTRUCTION
marcov
Posts: 3455
Joined: Jun 16, 2005 9:45
Location: Netherlands

Re: Try to use the ZMM register

Post by marcov »

The processors that support it are mostly these.

On the Intel side:
- Certain 8th generation Cannon Lake products (low-end server?) have AVX-512.
- The 10th and 11th generation Core parts with the codenames "Ice Lake" and "Rocket Lake", typically with a letter in the model name.
The Skylake-derived desktop parts of those generations don't.
- The P-cores of the 12th and 13th generation do support AVX-512. P-core-only models might have a BIOS option for it, but Intel clamped down on that.

Some more Xeon parts might have AVX-512 too (and some even have more execution units (pipes) than the desktop parts).

I expect Intel will come up with something to get at least some AVX-512 (not necessarily performant) into the E-cores in some future update.

On the AMD side:
- The Ryzen 7xxx series implements AVX-512 using two 256-bit AVX pipes. Despite that, many *nix libraries enable AVX-512 for these CPUs because it simply performs better (avoids a decoding bottleneck?).

The easiest check is to look in /proc/cpuinfo on Linux, or to download CPU-Z on Windows and look at the supported instruction sets.