kees-study/cds-monte-carlo-methods

Fork 0

Files

Kees van Kempen bc85efdb6f 01: Add Metropolis algorithm attempt

2022-09-13 09:38:53 +02:00

44 KiB

Raw Blame History

Exercise sheet¶

Some general remarks about the exercises:

For your convenience functions from the lecture are included below. Feel free to reuse them without copying to the exercise solution box.
For each part of the exercise a solution box has been added, but you may insert additional boxes. Do not hesitate to add Markdown boxes for textual or LaTeX answers (via Cell > Cell Type > Markdown). But make sure to replace any part that says YOUR CODE HERE or YOUR ANSWER HERE and remove the raise NotImplementedError().
Please make your code readable by humans (and not just by the Python interpreter): choose informative function and variable names and use consistent formatting. Feel free to check the PEP 8 Style Guide for Python for the widely adopted coding conventions or this guide for explanation.
Make sure that the full notebook runs without errors before submitting your work. This you can do by selecting Kernel > Restart & Run All in the jupyter menu.
Each sheet has 100 points worth of exercises. Note that only the grades of sheets number 2, 4, 6, 8 count towards the course examination. Submitting sheets 1, 3, 5, 7 & 9 is voluntary and their grades are just for feedback.

Please fill in your name here:

In [1]:

NAME = "Kees van Kempen"

Exercise sheet 1

Code from the lecture:

In [2]:

import numpy as np
import matplotlib.pylab as plt

rng = np.random.default_rng()
%matplotlib inline

def random_in_square():
    """Returns a random position in the square [-1,1)x[-1,1)."""
    return rng.uniform(-1,1,2)  

def is_in_circle(x):
    return np.dot(x,x) < 1

def simulate_number_of_hits(N):
    """Simulates number of hits in case of N trials in the pebble game."""
    number_hits = 0
    for i in range(N):
        position = random_in_square()
        if is_in_circle(position):
            number_hits += 1
    return number_hits

def random_in_disk():
    """Returns a uniform point in the unit disk via rejection."""
    position = random_in_square()
    while not is_in_circle(position):
        position = random_in_square()
    return position

def is_in_square(x):
    """Returns True if x is in the square (-1,1)^2."""
    return np.abs(x[0]) < 1 and np.abs(x[1]) < 1
    
def sample_next_position_naively(position,delta):
    """Keep trying a throw until it ends up in the square."""
    while True:
        next_position = position + delta*random_in_disk()
        if is_in_square(next_position):
            return next_position
    
def naive_markov_pebble(start,delta,N):
    """Simulates the number of hits in the naive Markov-chain version 
    of the pebble game."""
    number_hits = 0
    position = start
    for i in range(N):
        position = sample_next_position_naively(position,delta)
        if is_in_circle(position):
            number_hits += 1
    return number_hits

def naive_markov_pebble_generator(start,delta,N):
    """Same as naive_markov_pebble but only yields the positions."""
    position = start
    for i in range(N):
        position = sample_next_position_naively(position,delta)
        yield position

def sample_next_position(position,delta):
    """Attempt a throw and reject when outside the square."""
    next_position = position + delta*random_in_disk()
    if is_in_square(next_position):
        return next_position  # accept!
    else:
        return position  # reject!

def markov_pebble(start,delta,N):
    """Simulates the number of hits in the proper Markov-chain version of the pebble game."""
    number_hits = 0
    position = start
    for i in range(N):
        position = sample_next_position(position,delta)
        if is_in_circle(position):
            number_hits += 1
    return number_hits

def markov_pebble_generator(start,delta,N):
    """Same as markov_pebble but only yields the positions."""
    position = start
    for i in range(N):
        position = sample_next_position(position,delta)
        yield position

Empirical convergence rate in the pebble game¶

(a) Write a function pi_stddev that estimates the standard deviation $\sigma$ of the $\pi$-estimate using $n$ trials by running the direct-sampling pebble game $m$ times. Store this data for $n=2^4,2^5,\ldots,2^{14}$ and $m=200$ in an array. Hint: you may use the NumPy function np.std. (12 pts)

In [3]:

def pi_stddev(n,m):
    """Estimate the standard deviation in the pi estimate in the case of n trials,
    based on m runs of the direct-sampling pebble game."""
    
    return np.std([simulate_number_of_hits(n)/n*4 for i in range(m)])
    
stddev_data = np.array([[2**k,pi_stddev(2**k,200)] for k in range(4,12)])
stddev_data

Out[3]:

array([[1.60000000e+01, 3.96949304e-01],
       [3.20000000e+01, 2.84220425e-01],
       [6.40000000e+01, 2.15515371e-01],
       [1.28000000e+02, 1.59275704e-01],
       [2.56000000e+02, 1.00798933e-01],
       [5.12000000e+02, 6.87799562e-02],
       [1.02400000e+03, 4.87684636e-02],
       [2.04800000e+03, 3.85210905e-02]])

In [4]:

# If done correctly, your code should pass the following tests
from nose.tools import assert_almost_equal
assert_almost_equal(pi_stddev(2**3,1000),0.582,delta=0.03)
assert_almost_equal(pi_stddev(2**4,1000),0.411,delta=0.03)

(b) Write a function fit_power_law that takes an array of $(n,\sigma)$ pairs and determines best-fit parameters $a,p$ for the curve $\sigma = a n^p$. This is best done by fitting a straight line to the data on a log-log scale ($\log\sigma=\log a+p\log n$). Hint: use curve_fit from SciPy. (12 pts)

In [5]:

import scipy.optimize

def fit_power_law(stddev_data):
    """Compute the best fit parameters a and p."""
    
    # Define the to be fitted function. We use the log, due to the calculation of the uncertainties.
    log_f = lambda n, a, p: np.log(a) + p*np.log(n)
    
    n, 𝜎 = stddev_data.T
    a_fit, p_fit = scipy.optimize.curve_fit(log_f, n, np.log(𝜎))[0]
    
    return a_fit, p_fit

In [6]:

from nose.tools import assert_almost_equal
assert_almost_equal(fit_power_law(np.array([[16.0,1.4],[32.0,1.1],[64.0,0.9]]))[0],3.36,delta=0.05)
assert_almost_equal(fit_power_law(np.array([[16.0,1.4],[32.0,1.1],[64.0,0.9]]))[1],-0.31,delta=0.03)

(c) Plot the data against the fitted curve $\sigma = a n^p$ in a log-log plot. Don't forget to properly label your axes. Hint: use loglog from matplotlib. (12 pts)

In [7]:

a_fit, p_fit = fit_power_law(stddev_data)

plt.figure()
plt.title("Convergence of standard deviation of direct-sampled $\pi$ guesses over $n$ throws among $m = 200$ trials")
plt.loglog(*stddev_data.T, label="data")
n_range = np.logspace(1, 4, 10)
plt.loglog(n_range, a_fit*n_range**p_fit,
           label="fitted $\sigma = an^p$\n($a = {:.3f}, p = {:.3f})$".format(a_fit, p_fit))
plt.xlabel("$n$")
plt.ylabel("$\sigma$")
plt.legend()
plt.show()

Volume of a unit ball in other dimensions¶

In this exercise you will adapt the direct-sampling Monte Carlo method of the pebble game to higher dimensions to estimate the $d$-dimensional volume of the $d$-dimensional ball of radius $1$.

(a) Adapt the direct-sampling Monte Carlo method to the pebble game in a $d$-dimensional hypercube $(-1,1)^d$ with an inscribed $d$-dimensional unit ball. (12 pts)

In [8]:

def number_of_hits_in_d_dimensions(N,d):
    """Simulates number of hits in case of N trials in the d-dimensional direct-sampling pebble game."""

    def random_in_d_dimensional_box():
        """Returns a random position in the box [-1,1)^d."""
        return rng.uniform(-1,1,d)
        
    number_hits = 0
    for i in range(N):
        position = random_in_d_dimensional_box()
        if is_in_circle(position):
            number_hits += 1
    return number_hits

In [9]:

from nose.tools import assert_almost_equal
assert_almost_equal(number_of_hits_in_d_dimensions(100,1),100,delta=1)
assert_almost_equal(number_of_hits_in_d_dimensions(2000,3),1045,delta=50)

(b) Compare your estimates for the volume $V_d$ of the $d$-dimensional unit ball for $N=10000$ trials and $d=1,2,\ldots,7$ to the exact formula $V_d = \frac{\pi^{d/2}}{\Gamma(\tfrac{d}{2}+1)}$ in a plot. Hint: the Gamma function is available in scipy.special.gamma. (12 pts)

In [10]:

import scipy.special

def volume_d_ball(d):
    """Returns the volume of a d-dimensional unit ball."""
    return np.pi**(d/2)/scipy.special.gamma(d/2 + 1)

N = 10000
for d in range(1, 8):
    print("N = {}".format(N),
          "\td = {}".format(d),
          "\tV_d = {:.3f}".format(volume_d_ball(d)),
          "\tdirect-sampled V_d = {:5.3f}".format(number_of_hits_in_d_dimensions(N, d)/N*2**d)
         )
# TODO: Plot the results as is asked.

N = 10000 	d = 1 	V_d = 2.000 	direct-sampled V_d = 2.000
N = 10000 	d = 2 	V_d = 3.142 	direct-sampled V_d = 3.144
N = 10000 	d = 3 	V_d = 4.189 	direct-sampled V_d = 4.161
N = 10000 	d = 4 	V_d = 4.935 	direct-sampled V_d = 4.898
N = 10000 	d = 5 	V_d = 5.264 	direct-sampled V_d = 5.357
N = 10000 	d = 6 	V_d = 5.168 	direct-sampled V_d = 5.146
N = 10000 	d = 7 	V_d = 4.725 	direct-sampled V_d = 4.659

Efficiency of the Metropolis algorithm and the 1/2-rule¶

In the lecture we mentioned the "1/2 rule" for Metropolis algorithms: if moves are rarely rejected then typically you do not explore the domain efficiently, while if most moves are rejected you are wasting efforts proposing moves. A good rule of thumb then is to aim for a rejection rate of $1/2$. In this exercise you are asked to test whether this rule of thumb makes any sense for the Markov-chain pebble game by varying the throwing range $\delta$.

(a) Estimate the mean square deviation $\mathbb{E}[(\tfrac{4\text{hits}}{\text{trials}} - \pi)^2 ]$ from $\pi$ for different values of $\delta$ ranging between $0$ and $3$, but fixed number of trials (say, 2000). For a decent estimate of the mean you will need at least $100$ repetitions. (14 pts)

In [14]:

N = 2000 # number of trials
m = 200  # number of repetitions

p0 = np.array([0., 0.]) # starting position in origin

for delta in np.linspace(0, 3, 10):
    pi_estimate = 0
    for i in range(m):
        hits = markov_pebble(p0, delta, N)
        pi_estimate += hits/N*4
    pi_estimate /= m
    print("N = {}".format(N),
          "\tdelta = {:.3f}".format(delta),
          "\tpi_estimate = {:.3f}".format(pi_estimate)
         )

# TODO: Investigate the rejection rate and the mean square deviation. Maybe only the latter, as the next question concerns the former.

N = 2000 	delta = 0.000 	pi_estimate = 4.000
N = 2000 	delta = 0.333 	pi_estimate = 3.149
N = 2000 	delta = 0.667 	pi_estimate = 3.139
N = 2000 	delta = 1.000 	pi_estimate = 3.135
N = 2000 	delta = 1.333 	pi_estimate = 3.146
N = 2000 	delta = 1.667 	pi_estimate = 3.150
N = 2000 	delta = 2.000 	pi_estimate = 3.150
N = 2000 	delta = 2.333 	pi_estimate = 3.141
N = 2000 	delta = 2.667 	pi_estimate = 3.146
N = 2000 	delta = 3.000 	pi_estimate = 3.146

(b) Measure the rejection rate for the same simulations as in (a). (14 pts)

In [ ]:

# YOUR CODE HERE
raise NotImplementedError()

(c) Plot both the mean square deviation and the rejection rate as function of $\delta$. How well does the 1/2 rule apply in this situation? (12 pts)