This past π day, I reflected on the other mathematical-ish occurrences that I idly observe. For instance, I often find myself noticing prime numbers I come across, like those hidden in prices, ages, or dates. I recalled seeing this Tweet about a beautiful visualization of the prime factors of the first 100 whole numbers, designed by Sondra Eklund.

A postcard of a figure created by Sondra Eklund depicting the prime factorizations of the first 100 integers. [Image courtesy of Steven Strogatz and Sondra Eklund via Twitter.]

The visualization illustrates every number with a square box. Each box that corresponds to a prime number is entirely filled in with a unique color. Within each non-prime box, colored rectangular slices for its prime factors are neatly fitted together to fill the entire square. For example, 2, the first prime number, is colored in blue. It appears as a prime factor across many numbers; in the box for 12, for example, two blue rectangles are paired with a red rectangle that represents the prime number 3. In this way, a striking mosaic is formed: your eye scans back and forth, multiplying the colors to generate numbers, 2 × 2 × 3. I truly love this piece: the bold yet cohesive colors, the neat lines, composition, everything…! *chef’s kiss* 👩‍🍳

Following the initial Twitter thread, people recreated the art with Python, even portraying the colored squares monochromatically. Feeling inspired, I set out to re-imagine and depict the work a bit differently. I wanted to make a scalable design that captured the relationships across the numbers more immediately. I also wanted to try out Streamlit’s new sharing platform where you can readily deploy apps. So, here’s what I did about it!

Identifying and storing prime factors

First, I had to figure out how to identify the prime factors of a given number. I created a function named prime_factors which took in a numerical input and returned a list of numbers that are its prime factors.

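def prime_factors(number):
    """ Returns a list of prime factors of an input number. """

    factors = []
    dividend = number
    divisor = 2

    # Keep dividing until the dividend has been whittled down to 1.
    # Inputs below 2 skip the loop and return an empty list.
    while dividend > 1:
        if dividend % divisor == 0:
            # divisor is a prime factor: record it and try the same divisor again
            factors.append(divisor)
            dividend = dividend // divisor  # integer division keeps the values as ints
        else:
            divisor += 1

    return factors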

In this function, the modulo operator (%) comes in handy to get to the bottom of things. First, you take the input number (the dividend) and compute it modulo 2 (the first prime number). If the remainder is 0, then 2 is appended to the list as a prime factor, and the resulting quotient becomes the new dividend and goes through the cycle once more with the same divisor. This repetition is critical because it allows the same prime factor to be recorded as many times as it occurs. Only when the remainder is not 0 do we move onward, adding 1 to the divisor and looping through the process again. A composite divisor never produces a remainder of 0 at this stage, because its own prime factors have already been divided out, so only primes end up in the list. For example, for the number 12, the logic would follow 12%2, 6%2, 3%2 != 0, thus adding 1 to the divisor and arriving finally at 3%3. Once the divisor has grown enough to match the slowly whittled dividend, that last division leaves a dividend of 1 and the function returns the final list of prime factors.
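To make the walk-through for 12 concrete, here is a minimal sketch that records every modulo check along the way. The helper name trace_prime_factors and the steps list are hypothetical additions for illustration only; the loop follows the same trial-division idea described above.

```python
def trace_prime_factors(number):
    """Trial division that also records each (dividend, divisor, remainder) check."""
    factors, steps = [], []
    dividend, divisor = number, 2
    while dividend > 1:
        remainder = dividend % divisor
        steps.append((dividend, divisor, remainder))
        if remainder == 0:
            # Same divisor is retried on the next pass, so repeats are captured.
            factors.append(divisor)
            dividend //= divisor
        else:
            divisor += 1
    return factors, steps


factors, steps = trace_prime_factors(12)
for dividend, divisor, remainder in steps:
    print(f"{dividend} % {divisor} = {remainder}")
print(factors)
```

For 12, the recorded checks are 12 % 2, 6 % 2, 3 % 2, and 3 % 3, and the resulting list is [2, 2, 3].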

After creating a function that returns the prime factors for a single given number, I wanted to create a function that returned the prime factors for a range of number values.

import numpy as np
import pandas as pd

def prime_df(number):
    """ Returns a dataframe of the prime factors (and their powers) of
    each number in the range from 2 to the input number. """
    # first row: the number 2 has the single prime factor 2, to the power 1
    num_prime_array = np.array([2, 2, 1])

    if number > 2:
        for this_num in range(3, number + 1):
            num_primes = prime_factors(this_num)

            # saving out numbers' unique primes and their counts
            unique, counts = np.unique(num_primes, return_counts=True)
            freq = np.asarray((list(unique), list(counts))).T
            # one row per unique prime: [number, prime, power]
            num_id = np.array([[this_num] * len(freq)]).T
            num_id_unique_counts = np.hstack((num_id, freq))
            num_prime_array = np.vstack((num_prime_array,
                                         num_id_unique_counts))
        num_prime_df = pd.DataFrame(num_prime_array)
    else:
        num_prime_df = pd.DataFrame([num_prime_array], index=[0])

    num_prime_df.columns = ['Numbers', 'Primes', 'Powers']
    return num_prime_df

Thus, I created a function called prime_df wherein I called the prime_factors function for each number in a desired range, from 2 to an input number. For each number in that range, I tabulated its unique prime factors and the number of times each factor was repeated. With that, I built a vertically stacked array with three columns: one for the number itself, one for each of its unique prime factors, and one for the count of that factor. The number of rows a number takes up therefore corresponds to the number of unique prime factors it has; 12, for instance, takes up two rows, one for 2 (repeated twice) and one for 3 (appearing once). Finally, I converted this array to a Pandas dataframe and returned it at the end of the function. This dataframe would allow me to easily manipulate the numbers and their prime factors in subsequent visualizations.
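As a rough illustration of that three-column layout, here is a standard-library sketch that builds the same kind of (number, prime, power) rows with collections.Counter instead of NumPy stacking. This is a swapped-in approach, not the code used for the dashboard; prime_rows is a hypothetical name, and prime_factors is redefined compactly so the snippet runs on its own.

```python
from collections import Counter


def prime_factors(number):
    """Compact trial division, redefined here so the sketch is self-contained."""
    factors, divisor = [], 2
    while number > 1:
        if number % divisor == 0:
            factors.append(divisor)
            number //= divisor
        else:
            divisor += 1
    return factors


def prime_rows(limit):
    """One (number, prime, power) tuple per unique prime factor, for 2..limit."""
    rows = []
    for n in range(2, limit + 1):
        # Counter tallies how many times each prime repeats for this number.
        for prime, power in sorted(Counter(prime_factors(n)).items()):
            rows.append((n, prime, power))
    return rows


rows = prime_rows(12)
for row in rows[-4:]:
    print(row)
```

The last two rows printed are (12, 2, 2) and (12, 3, 1). A list of tuples like this could be handed to pd.DataFrame with columns=['Numbers', 'Primes', 'Powers'] to get the same shape of table.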

Depicting prime factors

To envision the prime factors of a set of numbers, I wanted to use a typical Cartesian graph, with the set of numbers on the x-axis and their prime factors on the y-axis. This would allow one to readily see how often each factor appears and, similarly to Sondra’s striking depiction, which factors are commonly shared across numbers.

I decided to use Seaborn’s relplot to depict this information; with it, I could not only show the factors and numbers with points along each axis, but I could also depict the number of times a factor was repeatedly multiplied, using colored bubbles whose size varies with the number of repetitions. In the figure, I called these powers because these repetitions are really the exponents each prime factor gets, e.g. 12 = 2² × 3¹.
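For a sense of what such a call can look like, here is a minimal sketch, not the dashboard's actual code. It assumes a dataframe with Numbers, Primes, and Powers columns, filled here with hand-written rows for 2 through 12; the sizes range, the axis labels, and the Agg backend line are my own assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import pandas as pd
import seaborn as sns

# Hand-written (Numbers, Primes, Powers) rows for the numbers 2..12.
rows = [
    (2, 2, 1), (3, 3, 1), (4, 2, 2), (5, 5, 1), (6, 2, 1), (6, 3, 1),
    (7, 7, 1), (8, 2, 3), (9, 3, 2), (10, 2, 1), (10, 5, 1),
    (11, 11, 1), (12, 2, 2), (12, 3, 1),
]
num_prime_df = pd.DataFrame(rows, columns=["Numbers", "Primes", "Powers"])

# One bubble per (number, prime) pair; color and size both encode the power.
g = sns.relplot(
    data=num_prime_df,
    x="Numbers",
    y="Primes",
    hue="Powers",
    size="Powers",
    sizes=(40, 200),  # assumed bubble-size range
)
g.set_axis_labels("Number", "Prime factor")
```

Mapping both hue and size to the same column is a deliberate redundancy here: the repeated factors stand out whether you read the plot by color or by bubble size.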

To customize this plot, I chose a set of complementary hex-coded color pairs that were personally pleasing to me (👩‍🎨) and created a color map by blending the range between the two colors, with transition steps between the two colors to match the greatest number of powers:

cmap = sns.blend_palette([color_1,color_2],n_colors=max(num_prime_df.Powers))
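To show the idea behind blending between two colors, here is a small standard-library sketch that linearly interpolates between two hex codes in RGB space. It is a conceptual illustration only, not how Seaborn implements blend_palette; the helpers hex_to_rgb and blend_hex and the two example colors are hypothetical.

```python
def hex_to_rgb(hex_color):
    """Convert a '#rrggbb' string to an (r, g, b) tuple of ints."""
    hex_color = hex_color.lstrip("#")
    return tuple(int(hex_color[i:i + 2], 16) for i in (0, 2, 4))


def blend_hex(color_1, color_2, n_colors):
    """Return n_colors hex codes stepping evenly from color_1 to color_2."""
    start, end = hex_to_rgb(color_1), hex_to_rgb(color_2)
    palette = []
    for step in range(n_colors):
        # t runs from 0.0 (first color) to 1.0 (second color).
        t = step / (n_colors - 1) if n_colors > 1 else 0.0
        rgb = tuple(round(s + (e - s) * t) for s, e in zip(start, end))
        palette.append("#{:02x}{:02x}{:02x}".format(*rgb))
    return palette


# Example colors are placeholders; one step per power, as in the cmap line above.
palette = blend_hex("#2a9d8f", "#e76f51", 3)
print(palette)
```

With n_colors set to the largest power in the data, each power gets its own step along the gradient, running from the first color for single factors to the second color for the most-repeated ones.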

Then I essentially dropped my code into my GitHub and loaded it on Streamlit! Without further ado, here is my dashboard. On the dashboard, I was able to control the range of numbers that’s depicted on the graph at a time; this way users can take a moment to really observe the patterns that bear out at different intervals. I also created a button that switches up the color palette of the graph for a bit of added pizzazz.

This was a fun little exercise to lightly work my creative muscles and finally experience the simple effectiveness of Streamlit’s platform for hosting applications. I’m looking forward to building bigger and better things and sharing them with you, dear reader!

I’m a data scientist with neuroscientific roots. Learn more about me here: https://athpud.com/