
Comparing Python, NumPy, Matlab, Fortran, etc.

Created on: Oct 23, 2009 10:33 AM by Jules Kouatchou - Last Modified:  Nov 2, 2009 10:33 AM by bvanaart

We carry out a series of basic experiments to compare Python-related packages (Python, NumPy) and compilers (GNU Fortran, Intel Fortran). We also include Matlab and Java in our study. The primary objective of this exercise is to determine how NumPy performs with respect to the other packages and compilers. All the calculations were carried out on dali.

 

 

Compilers/Packages       Version
Python                   2.5.4
NumPy                    1.3.0
Matlab                   matlab-R2008a
GNU Fortran (gfortran)   4.1.2
Intel Fortran (ifort)    intel-11.1.038
Java                     1.6.0

 

 

Problem 1

This example shows the importance of avoiding loops (as far as possible) when manipulating arrays in NumPy. We have a 5000x5000x3 array A and we want to perform the following operations on A for every pair (i,j):

 

             A(i,j,1) = A(i,j,2)

             A(i,j,3) = A(i,j,1)

             A(i,j,2) = A(i,j,3)

 

For instance, in Python the code looks like:

 

for i in range(5000):
    for j in range(5000):
        a[i,j,0] = a[i,j,1]
        a[i,j,2] = a[i,j,0]
        a[i,j,1] = a[i,j,2]

 

We record the elapsed time needed to do the array assignments. The results are summarized in the table below.

 

 

 

Packages/Compilers     Elapsed Time (s)
Python                 48.55
NumPy                  0.96
Matlab                 2.398
gfortran               0.540
gfortran with -O3      0.280
ifort                  0.276
ifort with -O3         0.2600
Java                   12.5518

 

 

In Fortran, NumPy and Matlab we did not use loops to carry out the desired operations; we used array indexing instead. Note how slow Python was (48.55 s) and how efficient NumPy was (0.96 s).
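To make the contrast concrete, here is a minimal sketch of the same three assignments written once with explicit loops and once with NumPy slices. The function names and the small 50x50x3 test array are assumptions made for illustration; the study itself used a 5000x5000x3 array and its exact NumPy code is not shown here.

```python
import numpy as np


def assign_with_loops(a):
    """Element-by-element version, mirroring the loop shown above."""
    rows, cols, _ = a.shape
    for i in range(rows):
        for j in range(cols):
            a[i, j, 0] = a[i, j, 1]
            a[i, j, 2] = a[i, j, 0]
            a[i, j, 1] = a[i, j, 2]
    return a


def assign_with_slices(a):
    """Same three assignments, each applied to a whole plane at once."""
    a[:, :, 0] = a[:, :, 1]
    a[:, :, 2] = a[:, :, 0]
    a[:, :, 1] = a[:, :, 2]
    return a


# Hypothetical small test array (assumption: the study used 5000x5000x3).
rng = np.random.default_rng(0)
a = rng.random((50, 50, 3))
looped = assign_with_loops(a.copy())
sliced = assign_with_slices(a.copy())
```

Both versions leave every plane equal to the original second plane; the sliced version simply hands each whole-plane copy to NumPy instead of iterating in the interpreter.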

 

Problem 2

Here, we want to multiply two randomly generated nxn matrices A and B:

 

C=AxB

 

For NumPy and Matlab, we use the predefined matrix multiplication functions (for Matlab we also time a triple loop). In Fortran, we wrote a triple do-loop to perform the multiplication and compared it with the intrinsic matmul and, for ifort, with DGEMM. The elapsed times presented here measure only the time spent on the multiplication, for several matrix sizes n.
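The two approaches compared in this problem can be sketched in Python as follows. This is an illustration only: the helper names `matmul_triple_loop` and `timed` are hypothetical, the matrix size is kept small so the loop finishes quickly, and a Python-level triple loop was not one of the configurations measured in the study.

```python
import time

import numpy as np


def matmul_triple_loop(a, b):
    """Hand-written C = A x B using three nested loops."""
    n, m = a.shape
    _, p = b.shape
    c = np.zeros((n, p))
    for i in range(n):
        for j in range(p):
            s = 0.0
            for k in range(m):
                s += a[i, k] * b[k, j]
            c[i, j] = s
    return c


def timed(func, *args):
    """Return (result, elapsed seconds) for a single call of func."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


# Hypothetical small size (assumption: the study used n = 1000, 1200, 1500).
n = 60
rng = np.random.default_rng(0)
a = rng.random((n, n))
b = rng.random((n, n))

c_loop, t_loop = timed(matmul_triple_loop, a, b)
c_dot, t_dot = timed(np.dot, a, b)
print("triple loop: %.4f s, np.dot: %.4f s" % (t_loop, t_dot))
```

Only the call itself is timed, which matches the statement that the elapsed times cover the multiplication and not the generation of the random matrices.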

 

Compilers/Packages  Configuration            n=1000     n=1200     n=1500
NumPy               intrinsic function       6.1        10.48      29.31
Matlab              intrinsic function       0.548      0.714      0.97
                    triple do-loop           116.90     201.16     385.95
gfortran            triple do-loop           7.844      13.736     28.077
                    matmul                   1.328      2.540      18.484
                    -O3 and triple do-loop   1.71213.2362029.156573
                    -O3 and matmul           1.328      2.532      8.364
ifort               triple do-loop           0.9960630  2.112131   7.372461
                    -O3 and triple do-loop   0.9800609  2.056128   7.356460
                    matmul                   3.56426.22412.6927
                    -O3 and matmul           0.3760     0.6480     1.3560
                    DGEMM(*)                 0.2360     0.4480     0.8080
Java                                         1.92805    3.6075     10.2055

 

(*) We did not use any compilation option and only linked with "-lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread".

 

The above table suggests that built-in functions are more appropriate for matrix multiplication, and DGEMM was the fastest option in our tests. It is important to note that DGEMM is more suitable for large matrices: if, for instance, n=100, the function matmul outperforms DGEMM. An interesting discussion on the performance of DGEMM and matmul using the Intel Fortran compiler can be read at:

 

 

 

NumPy is very slow for matrix multiplication in these tests: at n=1500 its elapsed time was 29.31, against 0.97 for Matlab. It was suggested that NumPy should be built against ATLAS BLAS for better performance.
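One way to check which BLAS/LAPACK libraries a given NumPy installation was built against is `numpy.show_config()`, which prints the build information. The sketch below pairs it with a small timing helper; `time_dot` and its parameters are hypothetical names introduced here, and the printed configuration depends entirely on the local installation.

```python
import time

import numpy as np


def time_dot(n, repeats=3, seed=0):
    """Return the best elapsed time (seconds) of np.dot on two n x n matrices."""
    rng = np.random.default_rng(seed)
    a = rng.random((n, n))
    b = rng.random((n, n))
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        np.dot(a, b)
        best = min(best, time.perf_counter() - start)
    return best


# Print the libraries this NumPy build is linked against (output varies).
np.show_config()

# Hypothetical size chosen to run quickly; the study used n up to 1500.
elapsed = time_dot(200)
print("np.dot, n=200: %.4f s" % elapsed)
```

Running the same helper before and after rebuilding NumPy against an optimized BLAS would show whether the suggestion above helps on a given machine.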

 

Problem 3

We find the numerical solution of the 2D Laplace equation:

 

U_{xx} + U_{yy} = 0

 

We use the Jacobi iterative solver. We are interested in two finite difference schemes:

 

The second-order scheme:

 

U_{i,j} = (U_{i-1,j} + U_{i,j-1} + U_{i+1,j} + U_{i,j+1})/4

and the fourth-order compact scheme:

U_{i,j} = (4(U_{i-1,j} + U_{i,j-1} + U_{i+1,j} + U_{i,j+1}) + U_{i-1,j-1} + U_{i+1,j-1} + U_{i+1,j+1} + U_{i-1,j+1})/20

 

The Jacobi iterative solver stops when the difference of two consecutive approximations falls below 10^{-6}.
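The two stencils and the stopping rule can be sketched with NumPy slices as below. Several details are assumptions, since the text does not specify them: the difference between consecutive approximations is measured in the maximum norm, the boundary values come from the harmonic function u = x + y on the unit square, and the names `jacobi_sweep` and `jacobi_solve` are hypothetical.

```python
import numpy as np


def jacobi_sweep(u, scheme="second"):
    """Return a new grid after one Jacobi sweep over the interior points."""
    new = u.copy()
    # Sum of the four edge neighbours: (i-1,j), (i,j-1), (i+1,j), (i,j+1).
    edges = u[:-2, 1:-1] + u[1:-1, :-2] + u[2:, 1:-1] + u[1:-1, 2:]
    if scheme == "second":
        new[1:-1, 1:-1] = edges / 4.0
    elif scheme == "fourth":
        # Sum of the four corner neighbours of each interior point.
        corners = u[:-2, :-2] + u[2:, :-2] + u[2:, 2:] + u[:-2, 2:]
        new[1:-1, 1:-1] = (4.0 * edges + corners) / 20.0
    else:
        raise ValueError("scheme must be 'second' or 'fourth'")
    return new


def jacobi_solve(u0, scheme="second", tol=1e-6, max_iter=100000):
    """Iterate until the max-norm change between sweeps falls below tol."""
    u = u0.copy()
    for iteration in range(1, max_iter + 1):
        new = jacobi_sweep(u, scheme)
        diff = np.max(np.abs(new - u))  # assumption: maximum norm
        u = new
        if diff < tol:
            break
    return u, iteration


# Hypothetical test problem: boundary values taken from u = x + y.
n = 20
x = np.linspace(0.0, 1.0, n)
exact = x[:, None] + x[None, :]
u0 = exact.copy()
u0[1:-1, 1:-1] = 0.0  # zero initial guess in the interior

u2, it2 = jacobi_solve(u0, "second")
u4, it4 = jacobi_solve(u0, "fourth")
print("second-order:", it2, "iterations; fourth-order:", it4, "iterations")
```

Each sweep reads only the previous iterate, which is what distinguishes Jacobi from Gauss-Seidel and is why the update can be written as whole-array slice arithmetic.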

 

Second-Order Scheme

 

Compilers/Packages     n=50        n=100
Python                 31.9193     509.859
NumPy                  0.41324     4.19535
Matlab                 0.460252    5.483992
gfortran               0.212       2.496
gfortran with -O3      0.056       0.696
ifort                  0.036       0.368
ifort with -O3         0.036       0.368
Java                   0.0537      0.9012


 

Fourth-Order Compact Scheme

 

Compilers/Packages     n=50        n=100
Python                 46.15203    751.783
NumPy                  0.610216    6.38891
Matlab                 0.640044    6.531990
gfortran               0.236       3.248
gfortran with -O3      0.088       1.256
ifort                  0.052       0.656
ifort with -O3         0.052       0.672
Java                   0.12180     2.2022

 

Once again, we observe that pure Python loops are not suitable for manipulating arrays. NumPy and Matlab have comparable results, whereas the Intel Fortran compiler displays the best performance. Java did not use array indexing as NumPy, Matlab and Fortran did, but it still did better than NumPy and Matlab.
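The gap between the Python and NumPy rows can be illustrated with one second-order Jacobi sweep written in both styles. This is a sketch under assumptions: the study's actual Python and NumPy sources are not reproduced here, and the names `sweep_loops` and `sweep_slices` as well as the random 100x100 grid are hypothetical.

```python
import time

import numpy as np


def sweep_loops(u):
    """One second-order Jacobi sweep using explicit Python loops."""
    n_rows, n_cols = u.shape
    new = u.copy()
    for i in range(1, n_rows - 1):
        for j in range(1, n_cols - 1):
            new[i, j] = (u[i - 1, j] + u[i, j - 1]
                         + u[i + 1, j] + u[i, j + 1]) / 4.0
    return new


def sweep_slices(u):
    """The same sweep expressed with NumPy slices (array indexing)."""
    new = u.copy()
    new[1:-1, 1:-1] = (u[:-2, 1:-1] + u[1:-1, :-2]
                       + u[2:, 1:-1] + u[1:-1, 2:]) / 4.0
    return new


# Hypothetical grid filled with random values, used only for timing.
rng = np.random.default_rng(0)
u = rng.random((100, 100))

start = time.perf_counter()
loop_result = sweep_loops(u)
t_loops = time.perf_counter() - start

start = time.perf_counter()
slice_result = sweep_slices(u)
t_slices = time.perf_counter() - start

print("loops: %.5f s, slices: %.5f s" % (t_loops, t_slices))
```

The two functions compute the same update; only the place where the iteration happens differs (the Python interpreter versus NumPy's compiled array routines).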

 

The results presented above are consistent with those obtained by other groups:
