Prime numbers are special. I've got no doubt that you've heard of them. And even if you aren't interested in maths, I'm certain that you can appreciate the beauty around a number that's only divisible by itself and 1. Without primes, everything from modern communications to encryption schemes and even the lottery wouldn't be what it is today. But this article isn't about primes. Not really. It's more about finding them algorithmically using the Sieve of Eratosthenes.
Eratosthenes is a name you won't soon forget. He was a Greek mathematician and astronomer (276 BCE - 194 BC), who lived in present-day Libya. A prominent scholar in the Library of Alexandra and at one time served as its chief librarian, he's more widely known for being the first person to apply mathematics to geography, accurately estimating the Earth's circumference in a time before advanced measuring tools or modern cartography existed.
The Sieve of Eratosthenes shows but the smallest of glimpses into the mathematical rigor of the Hellenistic period. Unfortunately, all of Eratosthenes' work was lost in the destruction of the library.
The algorithm works a bit like this:
Let's say that n = 99
. Starting with a list that contains all integers from 2
to n
, we will systematically eliminate all non-primes.
Initially, we set p
as the first prime number: p = 2
, we start marking multiples of 2
from p ^ 2
, which is 4
. We mark 4
, 6
, 8
, 10
, and so on, as composite. After marking multiples of 2
, we move to the next unmarked number, which is 3
. Now, p = 3
, and p ^ 2
which is 9
. We mark multiples of 3
from p ^ 2
, such as 9
, 12
, 15
, 18
, and so on, as composite. This algorithm keeps marking composite numbers based on the current prime value of p
and, in the end, leaves only prime numbers unmarked. By iterating through each prime number p
and marking multiples from p ^ 2
, the algorithm identifies and marks all composite numbers in the range without repetition.
You can watch the process of eliminating non-primes, as it might be a bit tricky to wrap your head around the first time around:
We can see that after every iteration, all the remaining unmarked numbers in the list are primes, and by the end of this procedure, the only remaining numbers in our vector are primes. Pretty nifty.
The algorithm knows to stop when p
reaches the square root of n
. This has the benefit of ensuring that no composite number stays unmarked, which works because a composite number is a number that has factors other than 1
and itself, i.e., not a prime. Every composite number has at least one prime factor that's less than or equal to its square root. So, the square root of n
is the largest possible number x
, where x ≤ n
must have at least one prime factor less than or equal to the square root of x
. Since we know that at least one prime factor of a composite number must be less than or equal to the square root of our limit, by the time our algorithm gets to the square root of n
, it will have marked all composite numbers in the range.
This can be written as code (in Python) like so:
def sieve(limit):
primes = [True] * (limit + 1)
primes[0] = primes[1] = False
for p in range(2, int(limit ** 0.5) + 1):
if primes[p]:
for i in range(p * p, limit + 1, p):
primes[i] = False
return [p for p in range(2, limit + 1) if primes[p]]
print(sieve(1000))
Let's go over what this program does in four steps:
Create a list of boolean values initialized to
True
, representing all numbers as primes, except0
and1
.Iterate through all numbers
p
from2
up to the square root of the limit,limit ** 0.5
For each prime number
p
, mark multiples ofp
starting fromp ^ 2
as composite (non-prime).Collect all prime (unmarked) numbers and returns them.
The computational complexity of the Sieve of Eratosthenes algorithm is an important factor in understanding its efficiency. The time complexity of this algorithm is usually expressed in terms of its input size (n)
, which is the upper limit of the range within which we want to find prime numbers, and is O(n log(log n))
, with the space complexity being O(n)
. This makes it one of the most efficient algorithms for finding prime numbers within a given range, especially for smaller values of n
.
Now that you understand this, try to solve this Leetcode problem without looking back at the code. Let me know how it goes!