§ A visualization of Pi digits

Following tradition, as a tribute to pi day, I will attempt to recreate a particular visual representation of the digits of \(\pi\). The original creation is due to Martin Krzywinski, and can be found here. The digits of \(\pi\) were computed using the Chudnovsky algorithm. Due to the CPU-intensive nature of such a computation, I precomputed the digits and saved to them a .dat file for reuse. This is less time consuming than computing the digits every time when visualizing them (or if you write buggy programs like I do). Below is the python script that computes the \(n\) first digits of \(\pi\) and saves as them to .dat file:

def pi_digits(n):
    Computes PI using the Chudnovsky algorithm from

    # Set precision

    getcontext().prec = n

    t = Decimal(0)
    pi = Decimal(0)
    d = Decimal(0)

    # Chudnovsky algorithm

    for k in range(n):
        t = ((-1)**k)*(factorial(6*k))*(13591409+545140134*k)
        d = factorial(3*k)*(factorial(k)**3)*(640320**(3*k))

        pi += Decimal(t) / Decimal(d)

    pi = pi * Decimal(12) / Decimal(640320**(Decimal(1.5)))
    pi = 1 / pi

    return str(pi)

Since the digits will be displayed on a grid, the python script takes as input the dimensions of the grid, the number of rows and number of columns respectively.

Now that we have computed the \(n = rc\) digits of \(\pi\), we want to visualize them. The visualization was done using scatter from matplotlib where the digits are laid out on a \(r \times c\) evenly-spaced meshgrid and the values of the digits are used to fill the colors of the points on the grid, these colors are taken from a discrete color map of size \(10\). The python script used to read the digits from a file and visualize them as explained above is as follows:

def discrete_cmap(N, base_cmap=None):
    """Create an N-bin discrete colormap from the specified input map from:
    By Jake VanderPlas
    License: BSD-style

    # Note that if base_cmap is a string or None, you can simply do
    #    return plt.cm.get_cmap(base_cmap, N)
    # The following works for string, None, or a colormap instance:

    base = plt.cm.get_cmap(base_cmap)
    color_list = base(np.linspace(0, 1, N))
    cmap_name = base.name + str(N)
    return color_list, base.from_list(cmap_name, color_list, N)

if __name__ == '__main__':
    r, c = np.shape(pi_digits) # pi digits in meshgrid
    x, y = np.meshgrid(np.arange(r), np.arange(c))

    # initialize figure
    fig, axes = plt.subplots(figsize=(8, 4))

    color_list, dcmap = discrete_cmap(n, "terrain")

    # swap y and x
    scat = axes.scatter(
      y, x, c=digits[x, y], s=3, cmap=dcmap

    legend1 = axes.legend(*scat.legend_elements(), loc='upper center',
                          bbox_to_anchor=(0.5, 1.1), ncol=n, fontsize=4.5,
                          markerscale=.5, frameon=False)

    # show figure

Above code snippet should be able to reproduce something along these lines

23x23 meshgrid
23x23 meshgrid of the first 529 pi digits

What would be even more visually appealing is if we could connect the same-valued digits that are one point away from each other in either direction with edges as in the original poster. For example, for a \(3\) by \(3\) grid

\begin{align} &3,1,4 \newline &1,5,9 \newline &2,6,6 \end{align}

In this case the \(1\)s across the diagonal from the first row and second row will be connected with an edge. Similarly the \(6\)s along the third row will be connected with an edge.

To do this, we need to able to find the indices of the connected clusters of numbers across the whole grid and use them to draw the edges between the corresponding grid points. Scouring around the internet, I found that such a computation is related to a concept in graph theory, which is the concept of connected components of an undirected graph. Finding the connected components of all the digits from \(0-9\) across the grid will give us what we seek. At first sight, this seemed like a daunting task and I was apprehensive of whether a vanilla python implementation would be up to the task. However, it turns out this task is related to labeling components in a pixel array. Consider the following \(3\times3\) pixel array.

\begin{align} &0,1,0 \newline &1,0,1 \newline &0,1,0 \end{align}

The four \(1\)s in the grid would be deemed as a label/component and we would have only one component in the grid. The scipy library has a high performance C implementation, scipy.ndimage.label, to label the components and subsequently extract their indices. The single downfall of this method is that it also considers isolated groups (\(1\)s surrounded by \(0\)s in all directions) as connected components. However, one can filter out isolated groups from the original array as done here. Putting this all together, we can produce the stunning visuals below. Here are the visualizations for \(23\times23\)

23x23 meshgrid
23x23 meshgrid of the first 529 pi digits with matching neareast-neighbor edges

The complete source code and even more higher quality images can be found here.