Simulating Euler's number

The number \(e\), also called Euler's number, is a well-known mathematical constant. To 20 decimal places, its value is:

$$e = 2.71828182845904523536$$

It is best known as the base of the natural logarithm, but it also appears in many other areas, such as compound interest, probability theory, combinatorics, and asymptotics.

Here we will conduct a small experiment to estimate the value of \(e\).

The experiment

We pick a uniformly random real number between 0 and 1. We continue picking numbers until the sum of all picked numbers is greater than 1. How many numbers do you think we need to pick, on average?

Let's see an example:

  1. We pick a number, let's say 0.7.
  2. Since this is less than 1, we will pick another number.
  3. Let's pick 0.2.
  4. Since the sum of all previously picked numbers is still less than 1 (0.9 in this example), we will pick another number.
  5. We now pick 0.3.
  6. The sum is now 1.2, so we stop.
  7. In this case we needed to pick 3 numbers.

So, as you might now see, we will always pick at least two numbers, and often several more. This means that, on average, the number of picked values must be greater than 2. Furthermore, a single run has no upper bound: if very small numbers keep being picked, we may need arbitrarily many of them. Since two picks suffice in about half the outcomes and the other half need 3 or more, the average cannot be lower than 2.5.

On the other hand, the average number of picked values should be less than 3: each picked number has an expected value of 0.5, so on average the sum reaches 1 after about two picks.

From this analysis we have established that the average number of picked values must be between 2.5 and 3.
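The lower bound can be made precise. Writing \(N\) for the number of picks, the event \(N = 2\) occurs exactly when the first two numbers already sum to more than 1, which for two uniform numbers happens with probability \(1/2\); every other run needs at least three picks. Hence:

$$E(N) \ge 2 \cdot P(N = 2) + 3 \cdot P(N \ge 3) = 2 \cdot \frac{1}{2} + 3 \cdot \frac{1}{2} = 2.5$$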

The correct answer for this experiment is \(e\). We will, on average, have to pick 2.71828 values so that their sum is greater than 1.

Simulating

Now let's check this by simulation using Python.

First we will import the relevant Python modules. The random module allows us to pick random numbers from a uniform distribution, and the math module provides the constant \(e\) as math.e.

import random
import math

Now let's create a function that will pick random numbers until their sum is greater than 1. The function will return the number of values that were required.

def pick_numbers_until_sum_one():
    # Pick uniform numbers until their sum exceeds 1,
    # counting how many picks were needed.
    sum_ = 0
    number_count = 0
    while sum_ <= 1:
        number = random.uniform(0, 1)
        sum_ += number
        number_count += 1
    return number_count
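As a quick sanity check, we can call the function a few times; each call should return a small count, most often 2 or 3 (the output will vary between runs):

for _ in range(5):
    print(pick_numbers_until_sum_one())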

Finally, let's repeat this process a large number of times (N). For each iteration we will record the number of values needed, the running average over all runs so far, and the relative error with respect to \(e\). We keep a running total so the average can be updated in constant time per iteration.

N = 100000

numbers_needed = []
averages = []
errors = []
total = 0

for i in range(N):
    # Run experiment
    picks = pick_numbers_until_sum_one()
    numbers_needed.append(picks)

    # Update the running average
    total += picks
    avg = total / (i + 1)
    averages.append(avg)

    # Relative error with respect to e, in percent
    err = (avg - math.e) / math.e * 100
    errors.append(err)

print(f"Average of {len(averages)} runs is {averages[-1]}")
print(f"Error after {len(errors)} runs is {errors[-1]} %")

Running this code, we find that after 100 thousand runs we picked, on average, 2.72032 numbers:

Average of 100000 runs is 2.72032
Error after 100000 runs is 0.07498014074980558 %

Note that the final result will change slightly every time the program is run, as different random numbers are selected each time.

We can see the evolution of our simulation in the following plot. Here we performed 15 simulations, each with 100 thousand runs. Note that the x-axis (representing the run index) is on a logarithmic scale.

[Figure: Simulation — running averages of 15 simulations, each with 100 thousand runs]

And focusing on a single simulation we can clearly see the average tending to \(e\).

[Figure: Simulation detail — a single simulation's running average tending to \(e\)]
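A plot along these lines can be produced with matplotlib. This is a minimal sketch for a single simulation, reusing the averages list from the code above; the code and styling behind the original figures are not shown here:

import math
import matplotlib.pyplot as plt

# Running average of one simulation, x-axis on a log scale
plt.semilogx(range(1, len(averages) + 1), averages, label="running average")
plt.axhline(math.e, color="red", linestyle="--", label="e")
plt.xlabel("run")
plt.ylabel("average number of picks")
plt.legend()
plt.show()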

As you can see, our estimated value of 2.72032 is very close to the actual value of \(e = 2.71828\). The difference corresponds to an error of about 0.07 %. Naturally, as we increase the number of runs (N), we get a better estimate of \(e\).

The mathematical proof

We have seen that the simulation confirms our expected result. Now let's check the math.

We start by defining \(N_x\), for \(x \in [0, 1]\), as the minimum number of uniform values \(U_k \in [0, 1]\) that we need to pick so that \(\sum_k U_k > x\).

$$N_x = \min \left\{ n : \sum_{k=1}^{n} U_k > x \right\}$$

\(m_x\) will be the expected value of \(N_x\).

$$m_x = E(N_x)$$

In our experiment, we want to calculate \(m_1\).

Assume the first pick is \(U_1 = u\). If \(u \ge x\), then \(N_x = 1\). Otherwise, if \(u < x\), then \(N_x = 1 + N'\), where \(N'\) is distributed like \(N_{x-u}\). Taking expectations and integrating over the density of \(U_1\) (which is 1 on \([0, 1]\)) gives:

$$m_x = 1 + \int_0^x m_{x-u} du = 1 + \int_0^x m_u du$$

The equation above is satisfied by \(m_x = e^x\), as we can check by substituting it into the right-hand side:
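$$1 + \int_0^x e^u \, du = 1 + \left( e^x - 1 \right) = e^x$$

Moreover, differentiating both sides of the recursion gives \(m_x' = m_x\) with \(m_0 = 1\), so \(e^x\) is in fact the only solution.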

So, for \(m_1\) we get:

$$m_1 = E(N_1) = e^1 = e$$
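We can also check the general identity \(m_x = e^x\) numerically by generalizing the simulation to an arbitrary threshold \(x \in [0, 1]\). This is a minimal sketch; the helper picks_until_sum_exceeds is made up for illustration and is not part of the article's original code:

import random
import math

def picks_until_sum_exceeds(x):
    # Pick uniform numbers until their sum exceeds the threshold x,
    # counting how many picks were needed.
    total = 0.0
    count = 0
    while total <= x:
        total += random.uniform(0, 1)
        count += 1
    return count

runs = 100000
for x in (0.25, 0.5, 1.0):
    estimate = sum(picks_until_sum_exceeds(x) for _ in range(runs)) / runs
    print(f"x = {x}: estimate = {estimate:.5f}, e^x = {math.exp(x):.5f}")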

Source code

The source code for this article can be found in this GitHub repository.

References

Inspired by this Tweet by Fermat's Library.

Wikipedia - e (mathematical constant)