Timing
We can start with the most basic form of measuring performance: timing how long the code takes to run. Modern computers have internal clocks to keep track of time, and measuring with them is often referred to as measuring the wall-time of an algorithm. These clocks are usually accurate on the scale of nanoseconds, because the CPU has to synchronise its behaviour very precisely. We can take advantage of this built-in feature by recording the current time before a piece of code executes and again after it has finished. Taking the difference of these two measurements gives the elapsed time of the code.
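As a minimal sketch of this idea, we can read the clock manually with Julia's built-in time_ns() function (a monotonic clock reporting nanoseconds) around the code we want to measure:

```julia
# Record a timestamp before and after the work, then take the difference.
start = time_ns()          # clock reading before the code runs
A = rand(1000, 1000)       # the code we want to measure
stop = time_ns()           # clock reading after the code finishes

elapsed_seconds = (stop - start) / 1e9   # convert nanoseconds to seconds
println("Elapsed: $(elapsed_seconds) seconds")
```

This is exactly what timing macros do under the hood, just with the bookkeeping automated.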
Let’s write an example in Julia. Take the example of creating a 1000 by 1000 array of random numbers, which we can do with the rand(1000,1000) function call. We are not interested in the result of this function, but rather in the time it takes to run. Julia provides the @time macro, which measures the time taken for the given code to execute¹. Let’s run this code multiple times and see what results we get.
@time rand(1000,1000);
@time rand(1000,1000);
@time rand(1000,1000);
@time rand(1000,1000);
0.001297 seconds (3 allocations: 7.629 MiB)
0.001369 seconds (3 allocations: 7.629 MiB)
0.001302 seconds (3 allocations: 7.629 MiB)
0.001291 seconds (3 allocations: 7.629 MiB)
Notice that the time taken is different on each execution, even though the piece of code is the same. This is where we encounter the first issue of timing - the execution time cannot be guaranteed to be consistent between runs. This is because a modern computer is a complex machine that is usually running the operating system and multiple applications at the same time. The operating system runs a scheduler, which tells the CPU which piece of code to run at any given moment. What is important here is that a program can be interrupted (its execution paused) while the CPU switches to a different task. This causes the same piece of code to have different measured times, as seen above.
Another reason for the discrepancy is that modern CPUs dynamically adjust their operating speed, especially on devices such as laptops and phones, because running at full speed draws a lot of power even when there is little work to do. The CPU will try to operate at full speed when it is required, and switch back to a low-power mode when the performance is not needed.
This method of timing is also problematic in Julia because the language is just-in-time compiled, which means that the first call to a function includes some additional compilation time. This only happens on the first call; subsequent calls will be much faster. For these reasons, it is strongly discouraged to time your code with this primitive method - instead, benchmark the function, which is the topic of the next section.
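We can see the compilation overhead directly by timing a freshly defined function twice. The function below (sum_of_squares is just an illustrative name, not anything from the standard library) is compiled on its first call, so the first measurement includes compilation time while the second does not:

```julia
# A small function defined for illustration; the first call triggers
# JIT compilation, subsequent calls reuse the compiled code.
sum_of_squares(n) = sum(i^2 for i in 1:n)

t_first  = @elapsed sum_of_squares(10_000)   # includes compile time
t_second = @elapsed sum_of_squares(10_000)   # compiled code only

println("first call:  $(t_first) s")
println("second call: $(t_second) s")
```

On a typical run the first call is orders of magnitude slower than the second, which is why a single naive timing of a Julia function can be wildly misleading.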
- One can also use the @elapsed macro to store the execution time (in seconds) of some code, which can be used to plot a graph.
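As a sketch of this, @elapsed returns the wall-time as a plain Float64, so repeated measurements can be collected into an array and summarised (or plotted) afterwards:

```julia
# Collect the wall-time of 10 repeated runs into a vector.
times = [(@elapsed rand(1000, 1000)) for _ in 1:10]

# Summarise the spread of the measurements.
println("min:  $(minimum(times)) s")
println("max:  $(maximum(times)) s")
println("mean: $(sum(times) / length(times)) s")
```

Looking at the minimum and the spread, rather than a single run, already gives a much more honest picture of the variability discussed above.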