Also: Why do you push and pop the rdi register? Is this good practice but not really necessary here or is this actually needed here as well?
I assume that the assembler instructions are for 64-bit.
I get a pile of errors here on 32-bit Windows.
Yes, this is for 64-bit. It uses the 8-byte registers. This should work for 32-bit:
Code:
mov edi, [p]
mov eax, 0x05050505
mov ecx, DIMSIZE * DIMSIZE * DIMSIZE * DIMSIZE * DIMSIZE / 4
rep stosd
dodicat wrote:A stroke of luck that Provoni needs only byte or ubyte arrays, otherwise looping in the values would be only way I reckon!
MichaelW's assembler code solves this problem. He just puts the 1-byte number 5, eight times into the 8-byte register rax. This is then written to the memory as in memset but in 8-byte steps with "rep stosq". You could also put an 8-byte number once or a 4-byte number twice into this register.
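For readers who want to see the register trick without assembler, here is a minimal C sketch of the same idea. It is not MichaelW's code; the function names repeat_byte, repeat_dword and fill_qwords are made up for illustration, and the plain loop only mimics what "rep stosq" does in hardware.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: replicate one byte into all eight bytes of a
   64-bit word, e.g. 5 -> 0x0505050505050505 (the value placed in rax). */
static uint64_t repeat_byte(uint8_t b)
{
    return (uint64_t)b * UINT64_C(0x0101010101010101);
}

/* Hypothetical helper: put a 4-byte number twice into a 64-bit word. */
static uint64_t repeat_dword(uint32_t v)
{
    return ((uint64_t)v << 32) | v;
}

/* Hypothetical helper: store `pattern` into `count` consecutive 8-byte
   slots starting at `dst`, i.e. the job "rep stosq" does with rcx = count. */
static void fill_qwords(void *dst, uint64_t pattern, size_t count)
{
    unsigned char *p = dst;
    for (size_t i = 0; i < count; i++) {
        /* memcpy avoids alignment and aliasing problems */
        memcpy(p + i * sizeof pattern, &pattern, sizeof pattern);
    }
}
```

Calling fill_qwords(buf, repeat_byte(5), n) sets 8 * n bytes to 5, and fill_qwords(arr, repeat_dword(7), n) sets 2 * n 32-bit integers to 7. The buffer size has to be a multiple of 8 bytes for this to cover it exactly.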
The downside, of course, is the reliance on the assembler code. I can think of another alternative to a simple loop. You can use memcpy. Write the first 32 or so bytes in a loop (or even hardcode them) and then copy those 32 bytes with memcpy to get a total of 64 bytes filled. Then copy the 64 bytes to get a total of 128 bytes filled. And so on. This isn't terribly fast, but it is faster than a simple loop. I get the following timings for writing integer values in 32-bit:
simple loop 4.46 seconds
memcpy 1.37 seconds
rep stosd 0.45 seconds
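The memcpy variant described above could look roughly like this in C. This is only a sketch and not the code that produced the timings; the name fill_by_doubling is hypothetical, it seeds the buffer with a single element instead of 32 bytes, and it assumes that elem_size * count does not overflow and that `value` does not point into `dst`.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: fill `count` elements of `elem_size` bytes at `dst`
   with copies of `*value`, doubling the filled region with memcpy. */
static void fill_by_doubling(void *dst, const void *value,
                             size_t elem_size, size_t count)
{
    unsigned char *base = dst;
    size_t total = elem_size * count;   /* assumed not to overflow */

    if (total == 0) {
        return;
    }

    /* Seed: write the first element by hand. */
    memcpy(base, value, elem_size);
    size_t filled = elem_size;

    /* Copy the already filled prefix behind itself: 1, 2, 4, 8, ... elements. */
    while (filled < total) {
        size_t chunk = filled;
        if (chunk > total - filled) {
            chunk = total - filled;     /* last step may be shorter */
        }
        /* Source [0, chunk) and destination [filled, filled + chunk)
           never overlap because chunk <= filled. */
        memcpy(base + filled, base, chunk);
        filled += chunk;
    }
}
```

For an integer array this would be called as fill_by_doubling(a, &v, sizeof v, n). The number of memcpy calls grows only with the logarithm of the buffer size, which is where the gain over a simple loop comes from.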
a really nice piece of code. However, you've used what I call "The Disc-Manufacturers Cheat" to calculate Bytes:
which, should actually be used ... to obtain correct results.
To be fair, kilo, mega, giga and so on are SI prefixes standing for 10^3, 10^6, 10^9 and so on. The IEC has introduced binary prefixes to resolve this problem. So now it is up to us to change our habits and accept that 10^3 bytes are one kB and 2^10 bytes are one KiB ;-)
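To make the difference concrete, here is a small C sketch of both conversions. The names bytes_to_mb and bytes_to_mib are hypothetical and not taken from the code discussed in this thread.

```c
#include <stdint.h>

/* SI (decimal) prefix: 1 MB = 10^6 bytes. */
#define BYTES_PER_MB  UINT64_C(1000000)
/* IEC (binary) prefix: 1 MiB = 2^20 bytes. */
#define BYTES_PER_MIB UINT64_C(1048576)

/* Hypothetical helper: byte count expressed in SI megabytes. */
static double bytes_to_mb(uint64_t bytes)
{
    return (double)bytes / (double)BYTES_PER_MB;
}

/* Hypothetical helper: byte count expressed in IEC mebibytes. */
static double bytes_to_mib(uint64_t bytes)
{
    return (double)bytes / (double)BYTES_PER_MIB;
}
```

The same buffer therefore gives two different numbers: 2^30 bytes are exactly 1024 MiB but about 1073.74 MB, so a throughput figure only makes sense if it states which unit was used.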