Skip navigation
NASA Logo, National Aeronautics and Space Administration
Currently Being Moderated

Basic Comparison of Python, Julia, Matlab, IDL and Java (2018 Edition)

VERSION 74  Click to view document history
Created on: Feb 5, 2018 2:51 PM by Jules Kouatchou - Last Modified:  Sep 14, 2018 10:33 AM by Jules Kouatchou

Announcement: We have started the process of making this project open source. The source codes are being rewritten for clarity, simplicity and consistency. As soon as the process is completed, all the new codes and running scripts will be made available.

 

-------------------------------------------------------------------------------- -----------------------------------

 

HISTORY:

  • September 13, 2018: Added R numbers for the Fibonacci Number test case (Problem 1)
  • September 13, 2018: Corrected R numbers for the Laplace Equation test case (Problem 5)

 

 

This report is the continuation of the work done in:

 

Basic Comparison of Python, Julia, R, Matlab and IDL

 

 

Here we:

  1. Add new versions of languages
  2. Add JAVA
  3. Add more test cases.
  4. For each language, consistantly use the same method to measure the elapsed time.
  5. Provide source codes for all the test cases.
  6. Present all the timing results to the fourth digit accuracy (any number less tha 0.0001 is rounded to 0).

 

While reading this report, be mindful of the following:

  • Our intention is not to claim that one language is better than the other.
  • In our work, we are often asked to address users’ issues on the computing languages Python, Matlab, IDL, etc. We only have few hours to understand the coding principles of those languages and quickly write codes that resolve users’ issues. We present results in the point of view of a novice programmer.
  • If you are an advanced programmer or a language developer and you have results (obtained with optimization techniques) you want to share, feel free to contact us (with a web link) and we wil provide a link to your results here.

 

All the experiments presented here were done on Intel Xeon Haswell processor node. Each node has 28 cores (2.6 GHz each) and 128 Gb of available memory. We consider the following versions of the languages:


LanguageVersion
Free?
Python2.7.1Yes
Julia0.6.2Yes
JAVA1.8.0_92Yes
IDL8.5No
MatlabR2017aNo
R
Yes
GNU Compilers 7.3Yes
Intel Compilers18.0.1.163No
Scala2.12.4Yes

 

 

Problem 1: Fibonacci Number

The Fibonacci numbers are the sequence of numbers {F_n}_(n=1)^infty defined by the linear recurrence equation:

 

 

 F_n=F_(n-1)+F_(n-2)

 

 

with F_1=F_2=1.

 

Fibonacci numbers find applications in the fields of economics, computer science, biology, combinatoric, etc.

 

We implemented both the iterative method and the recursive one, and we record the elepased time for generating the Fibonacci numbers for a given n.

 

 

LanguageOptionn=25n=35n=45
Python
000
Python + Numba
000
Julia
000
IDL
000
Matlab
0.00980.00320.0025
R
0.0340.0340.034
JAVA
000
Scala
000
Fortrangfortran000

gfortran -O3000

ifort000

ifort -O3000
Cgcc000

gcc -Ofast000

icc000

icc -Ofast000

 

Table 1.1: Elapsed times (in seconds) obtained by computing Fibonacci numbers using then iterative method.

 

 

Language
Optionn=25n=35n=45
Python
0.02112.5284311.2046
Python + Numba
0.030.18.82
Julia
00.03354.130
IDL
0.03012.2573304.2285
Matlab
0.01280.514958.9283
R
0.0080.0080.008
JAVA
0.00160.04144.8609
Scala
0.0010.0455.289
Fortrangfortran00.084010.4326

gfortran -O300.02803.4602

ifort000

ifort -O3000
Cgcc00.045.07

gcc -Ofast00.011.66

icc00.023.15

icc -Ofast00.023.07

 

Table 1.2: Elapsed times (in seconds) obtained by computing Fibonacci numbers using the recursive method.

 

Problem 2: Copy Arrays

This test case is meant to show how fast languages access non-contiguous memory locations.

 

Consider an arbitrary nxnx3 matrix A. We want to perform the following operations on A:

A(i,j,1) = A(i,j,2)
A(i,j,3) = A(i,j,1)
A(i,j,2) = A(i,j,3)

For instance, in Python the code looks like:

 

for i in range(n):
     for j in range(n):
          A[i,j,1] = A[i,j,2]
          A[i,j,3] = A[i,j,1]
          A[i,j,2] = A[i,j,3]

 

The above code segment uses loops. We are also interested on how the same operations are done using vectorization:

 

A[:,:,1] = A[:,:,2]
A[:,:,3] = A[:,:,1]
A[:,:,2] = A[:,:,3]

The problem allows us to see how each language handles loops and vectorization. We record the elapsed time needed to do the array assignments.

 

 

LanguageOptionn=5000n=7000n=9000
Python
18.605537.127961.0172
Python + Numba
0.260.260.34
Julia
0.09070.13860.2274
IDL
6.877313.242221.9349
Matlab
0.27870.52230.8437
R
19.75038.63563.820
JAVA
0.14200.26800.4350
Scala
0.2040.3490.51
Fortrangfortran0.17600.34800.5760

gfortran -O30.07200.13600.2200

ifort0.06800.13600.2120

ifort -O30.06800.13200.2120
Cgcc0.17000.34000.5700

gcc -Ofast0.09000.18000.2900

icc0.09000.18000.3000

icc -Ofast0.09000.18000.3000

 

Table 2.1: Elapsed times (in seconds) obtained by copying a matrix using loops.

 

 

LanguageOptionn=5000n=7000n=9000
Python
0.49530.96891.5962
Python + Numba
0.8341.291.96
Julia
0.29260.54710.8964
IDL
0.40910.80931.3315
Matlab
0.28450.58410.9193
R
2.9565.7859.566
Fortrangfortran0.09600.24800.3080

gfortran -O30.09200.18400.3040

ifort0.12000.23200.3760

ifort -O30.12000.23200.3880

 

Table 2.2: Elapsed times (in seconds) obtained by copying a matrix using vectorization.

 

Problem 3: Matrix Multiplication

We multiply two randomly generated nxn matrices A and B:

 

C=AxB

 

This problem shows the importance of taking advantage of built-in libraries available in each language. The elapsed times presented here only measure the times spent on the multiplication (as the size of the matrix varies).

 

 

LanguageOptionn=1500n=1750n=2000
Pythonintrinsic0.580.960.97
Python + Numba(loop)3.646.3313.57
Juliaintrinsic0.14940.23910.3497
IDLintrinsic0.30280.36130.4797
Matlabintrinsic0.95670.25750.2943
R
0.9201.1580.951
JAVA(loop)6.853013.470029.2320
Scala(loop)9.25814.48223.363
Fortrangfortran (loop)17.245031.229960.1837

gfortran -O3 (loop)3.32025.304312.3367

gfortran (matmul)0.35200.56000.8280

gfortran -O3 (matmult)0.34800.55600.7840

ifort (loop)1.14001.80813.1001

ifort -O3 (loop)0.52000.82401.2760

ifort (matmul)1.14001.81212.9001

ifort -O3 (matmul)1.14001.81212.9881

ifort (DGEMM)0.21200.22800.3320
Cgcc (loop)13.490020.960031.4800

gcc -Ofast (loop)1.35002.39004.3700

icc (loop)1.21002.16004.0200

icc -Ofast (loop)1.15001.70002.6600

 

Table 3.1: Elapsed times (in seconds) obtained by multiplying two randomly generated matrices.

 

 

Problem 4: Gauss-Legendre Quadrature

The Gauss-Legendre quadrature formulas approximate the integral of a functionby a weighted sum of function-values. When m function-values are used, the formula is exact for polynomials of degree zero through 2m — 1.

 

 

LanguageOptionn=50n=75n=100
Python
0.13450.01830.0186
Julia
1.29621.35531.3556
IDL
0.00060.00090.0014
R



JAVA



Matlab
0.77390.71970.0853
Fortrangfortran00.0040.008

gfortran -O300.0040.008

ifort00.0040.008

ifort -O300.0040.008
Cgcc



gcc -Ofast



icc



icc -Ofast


 

Table 4.1: Elapsed times (in seconds) obtained by performing the Gauss-Legendre qudrature.

 

 

 

Problem 5: Numerical Approximation of the 2D Laplace Equation

We find the numerical solution of the 2D Laplace equation:

 

Uxx + Uyy = 0

 

We use the Jacobi iterative solver. We are interested in fourth-order compact finite difference scheme (Gupta, 1984):

 

Ui,j =  (4(Ui-1,j + Ui,j-1 + Ui+1,j + Ui,j+1) + Ui-1,j-1 + Ui+1,j-1 + Ui+1,j+1 + Ui-1,j+1 )/20

 

The Jacobi iterative solver stops when the difference of two consecutive approximations falls below 10^{-6}.

 

LanguageOptionn=100n=150n=200
Python
142.7886705.2682188.007
Python + Numba
1.27645.426216.396
Julia
1.03095.172416.1657

optimized1.09875.503917.1473

optimized_smind0.62153.02899.4964
IDL
83.6360416.55231298.777
Matlab
1.81994.99149.1465
R
128.131635.6741971.329
JAVA
0.48502.02105.5980
Scala
0.5452.2896.202
Fortrangfortran0.8403.80010.945

gfortran -O30.6683.0688.881

ifort0.53602.46807.1520

ifort -O30.53602.46407.1520
Cgcc0.5002.42007.7000

gcc -Ofast0.21001.04003.1800

gcc -fPIC -Ofast -O3 -xc -shared 1.14105.595317.3381

icc0.45002.23006.7900

icc -Ofast0.32001.60004.8700

 

Table 5.1: Elapsed times (in seconds) obtained by numerically solving the Posson equation using a Jacobi iterative solver with loops.

 

 

 

LanguageOptionn=100n=150n=200
Python
2.320910.763841.2477
Python + Numba
3.502112.518636.1285
Juliaoptimized_vectorized2.378714.094442.1255
IDL
1.915910.132032.2211
Matlab
3.51026.471016.4999
R
21.177102.229333.366
Fortrangfortran0.8763.94811.329

gfortran -O30.35601.78805.0880

ifort0.30001.54404.4400

ifort -O30.28401.56804.4520

 

Table 5.2: Elapsed times (in seconds) obtained by numerically solving the Posson equation using a Jacobi iterative solver with vectorization.

 

 

Problem 6: Belief Propagation

The Belief Propagation can be applied to fields such as speech recognition, computer vision, image processing, medical diagnostics, parity check codes, etc. Its calculations involve a repeated sequence of matrix multiplications, followed by normalization. The Matlab, C and Julia codes are shown in the Justin Domke's weblog (Domke 2012). We report the computing times for various values of the number of iterations (N) when the matrix dimension is 5000x5000.


 

LanguageOptionn=250n=500n=1000
Python
4.81867.324013.9176
Julia
3.9577.68414.855
IDL
18.322935.897771.0820
Matlab
2.62994.07086.8691
R
25.46346.98592.654
JAVA
321.403642.3951284.106
Fortrangfortran22.557439.922489.9696

gfortran -O35.16039.588518.7051

ifort4.60828.860517.3810

ifort -O34.63228.732517.4130
Cgcc2.64005.280010.5700

gcc -Ofast2.35004.72009.4400

icc1.45002.90005.8000

icc -Ofast1.44002.90005.8100

Table 6.1: Elapsed times (in seconds) obtained by doing the Belief Propagation computations.

 

Problem 7: Metropolis-Hastings Algorithm

The Metropolis–Hastings (M–H) algorithm is a method for obtaining random samples from a probability distribution. We perform calculations for the implementation of a Metropolis-Hastings algorithm using a two dimensional distribution (Domke 2012). Results are shown when the number of iterations (N) varies.


 

LanguageOptionn=5000n=10000n=15000
Python
0.026420.06370.0937
Julia
0.00020.00040.0006
IDL
0.00580.02190.0291
Matlab
0.01640.01940.0276
R
0.1050.1660.24
JAVA
0.0060.0070.009
Scala
0.0090.0120.014
Fortrangfortran00.00400.0040

gfortran -O3000

ifort000

ifort -O300.00400
Cgcc000

gcc -Ofast000

icc000

icc -Ofast000

 

Table 7.1: Elapsed times (in seconds) obtained by doing the Metropolis algorithm computations.

 

 

 

Problem 8: Manipulation of netCDF Files

We have a set of daily NetCDF files (7305) covering a period of 20 years (1990-2009). The files for a given year are in a sub-directory labeled YYYY (for instance 1990, 1991, 1992, etc.). We want to write a script that  opens each file, reads a three-dimensional variable (longitude/latitude/level), and manipulates it. A pseudo code for the script reads:

 

 

Loop over the years
         Obtain the list of NetCDF files
         Loop over the files
                  Read the variable (longitude/latitude/level)
                  Compute the zonal mean average (new array of latitude/level)
                  Extract the column array at latitude 86 degree South
                  Append the column array to a "master" array (or matrix)

 

The goal here is to be able to do a generate the data to do a contour plot that looks like (obtained with Python):


 

This is the kind of problems that a typical user we support faces: a collection of thousands of files that needs to be manipulated to extract the desired information. Having tools that allow us to quickly read data from files (in formats such as NetCDF, HDF4, HDF5, grib) is critical for the work we do.

 

Note that unlike  in Problem 4 of the previous report (where the daily files are in directories associated with months), the daily files to be read in in this case are stored in directories associated with the years. The access to the files is easier in this current problem and we expect the timing numbers to be reduced.

 

We report in Table 8.1 the elapsed times it took to solve Problem 8 with the various languages.

 

 

 

Language
Elapsed Time (s)
Python558.4496
Julia580.5683
IDL504.5634
Matlab646.2261

Table 8.1: Elapsed time (in seconds) obtained by manipulating 7305 NetCDF files on a single processor.

 

 

All the above runs were conducted on a node that has 28 cores. Basically, only one core was used. We want to take advantage of all the available cores by spreading the reading of the files and making sure that the data of interest are gathered in the proper order. We use the multi-processing capabilities of the various languages to slightly modify the scripts. For each year, the daily files are read in by different threads (cores).The results are shown in Table 8.2.


 

LanguagenumThreads=2
numThreads=4
numThreads=8
numThreads=16
Python352.7964238.1065170.9945105.3949

Table 8.2: Elapsed time (in seconds) obtained by manipulating 7305 NetCDF files using multiple threading.

 

 

Problem 9: Function Evaluations

We  create an array x of length n and loop several times to perform the six operations:

 

y = sin(x)
x = asin(y)
y = cos(x)
x = acos(y)
y = tan(x)
x = atan(y)

 

 

Languagen=80000n=90000n=100000
Python52.101458.459164.8276
Julia55.555062.345069.2350
IDL37.479842.018734.8829
Matlab5.18665.65234.6116
R89.500101.439112.269

 

 

 

Problem 10: Simple FFT

We create a nxn random complex matrix M and compute the following:

 

r = fft(M)
r = abs(r)

 

 

Languagen=10000n=15000n=20000
Python10.508725.576445.1959
Julia3.91611.48920.632
IDL16.615436.571173.3394
Matlab2.66066.029310.7011
R60.722157.626269.651

 

 

Problem 11: Square Root of a Matrix

We consider an nxn matrix A with 6s on the diagonal and 1s everywhele else. We are lloking for the matrix B such that BxB = A. We record the time for determining B.

 

 

LanguageOptionn=1000n=2000n=4000
PythonSciPy sqrtm 2.22275.281445.7643
Juliasqrtm0.41292.51119.111
Matlabsqrtm0.96831.39162.3767
R
1.0573.60219.122

 

 

Problem 12: Look and Say Sequence

We write codes to determine the look and say number of order n. Instead of starting with a single digit, we begin with 1223334444.

This test case highlights how languages manipulate strings of arbitrary length.

 

 

LanguageOptionsn=40n=45n=48
Python
2.292137.4429224.4362
Julia
2.76944.333345.069
IDL
19.9563296.47681570.4234
Matlab
412.59934501.6751
R
0.5091.6783.611
Java
0.04870.09470.1582
Scala
0.03900.10200.1720
Fortrangfortran0.01600.02000.0200

gfortran -O30.02000.02400.0240

ifort0.01200.01600.0120

ifort -O30.01600.02000.0080
Cgcc0.08000.26000.5300

gcc -Ofast0.04000.25000.5000

icc0.07000.26000.4800

icc -Ofast0.07000.21000.4600

 

 

References

  1. Justin Domke, Julia, Matlab and C, September 17, 2012.
  2. Michael Hirsch, Speed of Matlab vs. Python Numpy Numba CUDA vs Julia vs IDL, June 2016.
  3. Murli M. Gupta, A fourth Order poisson solver, Journal of Computational Physics, 55(1):166-172, 1984.
  4. Jean Francois Puget, A Speed Comparison Of C, Julia, Python, Numba, and Cython on LU Factorization, January 2016.
  5. Alex Rogozhnikov, Log-likelihood benchmark, September 2015.
  6. Sebastian Raschka, Numeric matrix manipulation - The cheat sheet for MATLAB, Python Nympy, R and Julia, June 2014.
  7. Yousef Saad, Iterative Methods for Sparse Linear Systems (2 ed.), SIAM, ISBN 0898715342, 200366

 

 

Source Files

All the source files for the problems presented here are in the attached file: sourceFiles2018.tar.gz

 

If you have a comment/suggestion/question, contact Jules Kouatchou (Jules.Kouatchou@nasa.gov)

Attachments:
Comments (0)
USAGov logo NASA Logo - nasa.gov