Programming Languages Benchmark

caseih · Post by **caseih** » Sep 15, 2022 1:15

Looks good, @Vinion. Do you think that FB could use a compile-time generic typing system for use with containers, or are macros sufficient for your needs? It's a bit of a loaded question I guess because the last thing most people want is to make FB a dialect of C++.

Vinion · Post by **Vinion** » Sep 15, 2022 5:40

Ohh yes! A generic typing at compile time would be better than a macro. Perhaps something like ArrayList<String> or ArrayList<MyType>... I find Macros excellent declaring methods and stuff...

marcov · Post by **marcov** » Sep 15, 2022 12:26

While the benchmark is of course already perfect with Free Pascal occupying the top spot, I still have a minor discussion point:

your benchmark benchmarks both the filling as the multiplication of the matrix, and that makes it sensitive to the implementation (and relative speed) of random. I suggest to limit to only testing the multiplication

FPC has a relative slow random (Mersenne twister), and in the past often lost to older compilers with relatively fast (but imprecise) random implementations this way, making some benchmarks more about random() than the actual calculation.

Another minor discussion point might be turning on optimization for GCC of course, and stating what backend you use for FB (gas or gcc)

Vinion · Post by **Vinion** » Sep 15, 2022 15:43

To be honest I measured random number generation only with java and I saw that it wasn't so significant but I guess you are right. I will change it without measuring the random generation.

marcov wrote: ↑Sep 15, 2022 12:26 While the benchmark is of course already perfect with Free Pascal occupying the top spot, I still have a minor discussion point:

your benchmark benchmarks both the filling as the multiplication of the matrix, and that makes it sensitive to the implementation (and relative speed) of random. I suggest to limit to only testing the multiplication

FPC has a relative slow random (Mersenne twister), and in the past often lost to older compilers with relatively fast (but imprecise) random implementations this way, making some benchmarks more about random() than the actual calculation.

Another minor discussion point might be turning on optimization for GCC of course, and stating what backend you use for FB (gas or gcc)

dodicat · Post by **dodicat** » Sep 15, 2022 18:41

For those interested, here is gsl solving linear equations.
The LU bit is probably lower and upper triangular matrices representing the matrix, this method is faster than Gaussian elimination.
Also, for gls_blas, I found this page which explains some of the seemingly strange keywords in gsl.
https://sites.google.com/a/phys.buruniv ... h-gsl-blas

Code: Select all




#include "gsl/gsl_linalg.bi"
#include "gsl/gsl_blas.bi"



Dim As Double a_data(3,3) = {{ 0.18, 0.60, 0.57, 0.96}, _
                             {0.41, 0.24, 0.99, 0.58}, _
                             {0.14, 0.30, 0.97, 0.66}, _
                             {0.51, 0.13, 0.19, 0.85 }}

Dim As Double copy(3,3) = {{ 0.18, 0.60, 0.57, 0.96}, _
                           {0.41, 0.24, 0.99, 0.58}, _
                           {0.14, 0.30, 0.97, 0.66}, _
                           {0.51, 0.13, 0.19, 0.85 }}                             

Dim As  Double b_data(3) = { 1.0, 2.0, 3.0, 4.0 }

Print "original matrix"

For i As long = 0 To 3
      For j As long = 0 To 3
            Print copy(i,j);" ";
      Next
      Print
Next

Print "original right side vector"
For n As Long=0 To 3
      Print b_data(n) 
Next
print


Dim As   gsl_matrix_view m 
m = gsl_matrix_view_array (@a_data(0,0), 4, 4)

Dim As   gsl_vector_view b
b  = gsl_vector_view_array (@b_data(0), 4)

Dim As gsl_vector Ptr x = gsl_vector_alloc (4)

Dim As Long s

Dim As gsl_permutation Ptr p = gsl_permutation_alloc (4)

gsl_linalg_LU_decomp (@m.matrix, p, @s)

gsl_linalg_LU_solve (@m.matrix, p, @b.vector, x)

Print "answer vector"
For n As Long=0 To 3
      Print gsl_vector_get (x,n) 
Next


gsl_permutation_free (p)

Dim As gsl_matrix Ptr m2 = gsl_matrix_alloc(4,4)

For i As long = 0 To 3
      For j As long = 0 To 3
            gsl_matrix_set (m2, i, j, copy(i,j))
      Next
Next

Dim As gsl_vector Ptr result = gsl_vector_alloc(4)

gsl_blas_dgemv(CblasNoTrans, 1.0,m2,x,0.0,result)

print
Print "check"
printf (!"result = \n")
gsl_vector_fprintf (stdout, result, "%g")

Sleep

I had to copy the main matrix, it is seriously convoluted by the time the calculations are done.
I cannot make the matrix a constant in fb, which might negate the need for a copy (MIGHT)

caseih · Post by **caseih** » Sep 15, 2022 19:22

By the by, the PyPy JIT implementation of Python ran the original nested loop benchmark in 44 seconds on my machine, only twice as slow as FB. Pretty impressive for an interpreted language. I hadn't used PyPy in quite a while. PyPy managed the list comprehension version in 41 seconds.

Also with the -O 3 compiler flag, FB was 16.4 seconds on my computer. -O 2 was slightly faster, at 16.3 seconds. -O 1 was 16.4 seconds. I can't recall what all the differences in the optimization levels are. EDIT: This is 64-bit fbc on Linux, using the GCC backend.

@marcov's comment about the random number generator performance reminded me of this quote I saw recently:
"Anyone attempting to generate random numbers by deterministic means is, of course, living in a state of sin." -- John Von Neumann
Too funny.

@dodicat, maybe we should create a new tips and tricks topic about using GSL and include the examples posted here.

StarByte · Post by **StarByte** » Oct 08, 2022 17:57

Hello,

greetings to all. This is my first post.
Blitzmax-ng runs code about 25% faster than FreeBasic.

Code: Select all

Global fCount:Int=1024

Global fArrayA:Double[fCount,fCount]
Global fArrayB:Double[fCount,fCount]
Global fArrayC:Double[fCount,fCount]
Global benchmarkTests,i,j,k: Int
Global tmp:String
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Print ("BlitzMax-ng Matrix Multiplication Benchmark started. Please wait...")
Local Start:Int = MilliSecs()

For benchmarkTests = 0 To 4
				' Add random values to Arrays A and B (0 to 1)
   For i=0 To fCount - 1
      For j=0 To fCount - 1
          fArrayA[i,j] = Rnd()
          fArrayB[i,j] = Rnd()
          fArrayC[i,j] = 0
      Next
   Next 
  				' Multiply A to B values and add them to C
   For i = 0 To fCount - 1
      For j = 0 To fCount - 1
         For k = 0 To fCount - 1
		                    fArrayC[i,j]  = (fArrayC[i,j] + fArrayA[i,k]) * fArrayB[k,j]
         Next
      Next
   Next
Next ' benchmarkTests
Print "BlitzMax Matrix Muliplication Benchmark finished in:  "+(MilliSecs()-start)+" milliseconds."
Input "Press any key to exit"

I think Blitzmax-ng (no build options) is even faster than Pascal.

deltarho[1859] · Post by **deltarho[1859]** » Oct 08, 2022 23:55

In Windows 32-bit mode I get 28% faster (Quick Build) (FB: -w all -fpmode fast -fpu sse -arch 686 -gen gcc -O 2)

That is impressive, but we cannot write 'Blitzmax-ng runs code about 25% faster than FreeBasic.' with only one code test. Needless to say, the same goes with the opening post.

I replaced FB's Rnd default with MSWSII but it did not make much difference. The double 'For' using FB's Rnd used 0.028ms compared with MSWSII using 0.01ms.

StarByte · Post by **StarByte** » Oct 09, 2022 8:37

Either the benchmark is meaningful or it is not.
In this one case it is ...

marcov · Post by **marcov** » Oct 09, 2022 11:56

deltarho[1859] wrote: ↑Oct 08, 2022 23:55 That is impressive, but we cannot write 'Blitzmax-ng runs code about 25% faster than FreeBasic.' with only one code test.

Stronger even, using one file math problems are a pretty bad test (unless your target is one file math problems). It allows the compiler to perform optimizations that won't work in more sizeable programs. (or only with complex profiling and feedback)

caseih · Post by **caseih** » Oct 09, 2022 15:35

StarByte wrote: ↑Oct 09, 2022 8:37Either the benchmark is meaningful or it is not.

Correct. Most are not terribly relevant.

In this one case it is ...

What does this benchmark mean exactly? What exactly are you benchmarking? How does this apply to larger programs and projects?

I've never heard of Blitzmax-ng before. I'm unlikely to ever use it but it's neat that the language creator (you?) has made it freely available and provided a really nice web site, tutorial, and language reference for it. It's really cool that there are so many freely available languages and compiler implementations these days. It's wonderful for those wanting to learn about compiler design and programming language theory.

caseih · Post by **caseih** » Oct 09, 2022 15:47

marcov wrote: ↑Oct 09, 2022 11:56Stronger even, using one file math problems are a pretty bad test (unless your target is one file math problems). It allows the compiler to perform optimizations that won't work in more sizeable programs. (or only with complex profiling and feedback)

And even then most math problems are best solved with purpose-built libraries and tools like GSL, Numpy, etc as we've already shown. The naive approaches are too slow to be useful. In fact it is algorithms and heuristics that make computing fast, not the programming language, lest we end up measuring and optimizing all the wrong things. The most useful things I learned about programming in Uni were data structures, algorithms and heuristics, big O run-time analysis, and optimization strategies (caching, problem space search algorithms). Gave me a rigger that I didn't have before even though I'd been writing computer programs for many years.

Speaking of measuring and benchmarking the wrong things, in a uni class I was tasked with doing big O analysis of various sorting algorithms empirically. Running them and timing them, and then deriving the likely O() runtime. I wanted to make my programs a bit more interactive so I put a lot of print statements in my loops as the sort progressed. My TA laughed his head off when I apparently discovered that all the sorting algorithms were O(n)! Turns out I was benchmarking the print statement, not the sort itself. That was an embarrassing lesson to me! For those unfamiliar with big-O, the best general sorting algorithms are O(n log n), the worst are O(n^2).

deltarho[1859] · Post by **deltarho[1859]** » Oct 09, 2022 19:07

Quote of the month:

caseih wrote:In fact it is algorithms and heuristics that make computing fast, not the programming language,

I had a quick look at Blitzmax and found that it had two data types missing: 32-bit and 64-bit unsigned integers. Ouch!

The binaries are built similarly to FreeBASIC; C++ with gcc 8.1 as the backend.

marcov · Post by **marcov** » Oct 09, 2022 19:10

caseih wrote: ↑Oct 09, 2022 15:47
marcov wrote: ↑Oct 09, 2022 11:56Stronger even, using one file math problems are a pretty bad test (unless your target is one file math problems). It allows the compiler to perform optimizations that won't work in more sizeable programs. (or only with complex profiling and feedback)
And even then most math problems are best solved with purpose-built libraries and tools like GSL, Numpy, etc as we've already shown.

Only the class of math programs that are standalone, and you might as well go one step further and outsource it completely to a math academic

But from an application perspective such solutions often have issues, GSL has a license that isn't for everyone (GPL), and numpy pulls in an interpreter, and might cause problems if you try to run the solution in your own application's threads.

In fact it is algorithms and heuristics that make computing fast, not the programming language, lest we end up measuring and optimizing all the wrong things.

The biggest issue is always nailing the exact problem with some first and second order borderconditions. Because often they matter.

The most useful things I learned about programming in Uni were data structures, algorithms and heuristics, big O run-time analysis, and optimization strategies (caching, problem space search algorithms). Gave me a rigger that I didn't have before even though I'd been writing computer programs for many years.

Funny, one of the classes I remember fondly was a deep dive into a RDBMS internals, and it actually nuanced big O a bit. It was for guestimating DB query performance, and learned you deal also with cases where N might not be infinite (like big O does). That said, in such cases quadratic behaviour is usually solved with indexes, and then linear approximations worked fine, and also regarded the constant term.

dodicat · Post by **dodicat** » Oct 10, 2022 11:31

Hi StarByte.
fb is a little slow at accessing array elements, so I have to optimize for that by minimizing calls to farray.
Also got rid of rnd which will throw off comparison.
So testing double addition, multiplication, and arrays.

Code: Select all

#cmdline "-gen gcc -O 2"

dim shared fCount as Integer=1024

dim shared as double fArrayA(fCount,fCount)
dim shared as double fArrayB(fCount,fCount)
dim shared as double fArrayC(fCount,fCount)
dim shared as integer benchmarkTests,i,j,k
dim shared as string tmp
Print ("Fb-ng Matrix Multiplication Benchmark started. Please wait...")'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
var Start=timer
dim as double k2=0
For benchmarkTests = 0 To 4
' Add  values to Arrays A and B (0 to 1)
   For i=0 To fCount - 1
      For j=0 To fCount - 1
          fArrayA(i,j) = i/(1004.5+benchmarkTests)
          fArrayB(i,j) =i/(1023.6+benchmarkTests)
          fArrayC(i,j) = 0
      Next
   Next 
  				' Multiply A to B values and add them to C
   For i = 0 To fCount - 1
      For j = 0 To fCount - 1
            k2=0
         For k = 0 To fCount - 1
      k2 =  (k2+fArrayA(i,k)) * fArrayB(k,j)
      'fArrayC(i,j)  = (fArrayC(i,j) + fArrayA(i,k)) * fArrayB(k,j)
   Next
   fArrayC(i,j)=k2
      Next
   Next
Next ' benchmarkTests

Print "fb Matrix Muliplication Benchmark finished in:  "+str((timer-start)*1000)+" milliseconds."
print "sample"
print farrayc(2,2)
print "Press any key to exit"
sleep

likewise I change the blitz code

Code: Select all

Global fCount:Int=1024

Global fArrayA:Double[fCount,fCount]
Global fArrayB:Double[fCount,fCount]
Global fArrayC:Double[fCount,fCount]
Global benchmarkTests,i,j,k: Int
Global tmp:String
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''


Print ("BlitzMax-ng Matrix Multiplication Benchmark started. Please wait...")
Local Start:Int = MilliSecs()
Local k2:Double=0

For benchmarkTests = 0 To 4
	' Add  values to Arrays A and B (0 to 1)
   For i=0 To fCount - 1
      For j=0 To fCount - 1
          fArrayA[i,j] = i/(1004.5+benchmarkTests)
          fArrayB[i,j] = i/(1023.6+benchmarkTests)
          fArrayC[i,j] = 0
      Next
   Next 
  				' Multiply A to B values and add them to C
   For i = 0 To fCount - 1
      For j = 0 To fCount - 1
k2=0
         For k = 0 To fCount - 1
k2 =  (k2+fArrayA[i,k]) * fArrayB[k,j]
 ' fArrayC[i,j]  = (fArrayC[i,j] + fArrayA[i,k]) * fArrayB[k,j]
         Next
 fArrayC[i,j] = k2
      Next
   Next
Next ' benchmarkTests
Print "BlitzMax Matrix Muliplication Benchmark finished in:  "+(MilliSecs()-start)+" milliseconds."
Print "Sample "
Print farrayc[2,2]
Input "Press any key to exit"

results

Code: Select all

Fb-ng Matrix Multiplication Benchmark started. Please wait...
fb Matrix Muliplication Benchmark finished in:  42060.83159998525 milliseconds.
sample
 0.07049489402038511
Press any key to exit


Building untitled4
[ 23%] Processing:untitled4.bmx
[ 86%] Compiling:untitled4.bmx.console.release.win32.x64.c
[100%] Linking:untitled4.exe
Executing:untitled4.exe
BlitzMax-ng Matrix Multiplication Benchmark started. Please wait...
BlitzMax Matrix Muliplication Benchmark finished in:  47438 milliseconds.
Sample 
0.070494978113596257
Press any key to exit

conclusion.
An optimisation for one compiler doesn't mean the same optimisation for another compiler is beneficial (or something like that)
But thank you for the benchmarking.
I'll try freepascal later.

Programming Languages Benchmark

Re: Programming Languages Benchmark

Re: Programming Languages Benchmark

Re: Programming Languages Benchmark

Re: Programming Languages Benchmark

Re: Programming Languages Benchmark

Re: Programming Languages Benchmark

Re: Programming Languages Benchmark

Re: Programming Languages Benchmark

Re: Programming Languages Benchmark

Re: Programming Languages Benchmark

Re: Programming Languages Benchmark

Re: Programming Languages Benchmark

Re: Programming Languages Benchmark

Re: Programming Languages Benchmark

Re: Programming Languages Benchmark