JM

Multithreading in Julia

Fortunately, since Julia v1.3, multithreading has been fairly simple to use inside of Julia. Performance of multithreading has also increased as Julia develops. For this reason, results shown in this book may vary greatly, depending on when it is read, and the current version of Julia you are using.

This section will cover a basic use of multithreading to speed up an embarrassingly parallel algorithm. An embarrassingly parallel problem is one where we have a one-to-one relationship between our inputs (parameters) and our desired outputs (results). This sort of problem can be distilled down to repeating the following operation n n times:

ri=f(xi) r_i = f(x_i)

where xi x_i represents one element of the n n elements that are to be processed, f f is the common function that performs the mapping and ri r_i represents the ith i^{\text{th}} element of the results vector. To use Julia specific language, any operation that can be written in vector format (broadcasted) is likely a good candidate to be embarrassingly parallel.

Monte Carlo Estimation of π

For a case study, let’s look at a way that we can calculate π \pi. We will choose a Monte-Carlo method of estimation, since this will prove to be an excellent case study for multithreading. Monte-Carlo methods are sampling techniques that can be used to estimate quantities via random simulations. They are often very easy to implement and can be used as a simple technique to provide numerical estimates for quantities that difficult or impossible to calculate analytically. In the case of calculating π \pi, one can estimate its value by playing darts on a square block, with a circle touching the edge of the square. If the radius of the circle is r r, then the area is πr2 \pi r^2 and the area of the square it lies within is 4r2 4r^2. We know that the ratio of the area of the circle compared with the square is ρ=πr24r2=π4 \rho = \frac{\pi r^2}{4 r^2}=\frac{\pi}{4}. If we can calculate an estimate for ρ \rho, we can approximate π \pi as π=4ρ \pi = 4 \rho.

We can now randomly play darts on this setup, making sure that each point in the square is equally likely to be somewhere a dart hits. Then we can count how many of the darts land in the circle, compared with landing outside the circle and in the square. If we throw n n darts uniformly randomly in the square and only nc n_c land inside the circle, then we know that ρncn \rho \approx \frac{n_c}{n}, which will approach the true value of ρ \rho when n n\rightarrow \infty, or in other terms:

π=4ρ=4limnncn \pi = 4\rho = 4 \lim_{n\rightarrow \infty} {\frac{n_c}{n}}

First, let’s write an algorithm to do this in serial.

julia
function est_pi_mc_serial(n)
    n_c = zero(typeof(n))
    for _ in 1:n
        # Choose random numbers between -1 and +1 for x and y
        x = rand() * 2 - 1
        y = rand() * 2 - 1
        # Work out the distance from origin using Pythagoras
        r2 = x*x+y*y
        # Count point if it is inside the circle (r^2=1)
        if r2 <= 1
            n_c += 1
        end
    end
    return 4 * n_c / n
end

One can see that the estimate for π \pi improves with the number of darts thrown, by using a much higher number of points:

julia
function est_pi_mc_serial(n)
    n_c = zero(typeof(n))
    for _ in 1:n
        # Choose random numbers between -1 and +1 for x and y
        x = rand() * 2 - 1
        y = rand() * 2 - 1
        # Work out the distance from origin using Pythagoras
        r2 = x*x+y*y
        # Count point if it is inside the circle (r^2=1)
        if r2 <= 1
            n_c += 1
        end
    end
    return 4 * n_c / n
end

pi
est_pi_mc_serial(100)
est_pi_mc_serial(10_000)
Figure 1: The graph shows the estimate of the relative error of the estimated value of π. This graph shows the standard deviation (using 30 repeats) of the quantity (π'(n) - π)/π where π'(n) represents an estimate for π constructed using the Monte Carlo method with n samples.

Additionally, one can plot the standard deviation of the relative errors of each estimate of π \pi against the number of darts thrown. This can be seen in Figure 1, which shows that the relative error approaches zero as n n\rightarrow \infty.

While this is not the most efficient way of calculating π \pi, heavily relying on the quality of the random number generator, it does provide a case study for a parallel speed up.