Files

Kees van Kempen 86e473fd07 01: Implement direct sampling things and fitting

Currently, only the test data is compared to the fit. Will change it to
the pebble things shortly.

2022-09-08 11:42:22 +02:00

35 KiB

Raw Blame History

Exercise sheet¶

Some general remarks about the exercises:

For your convenience functions from the lecture are included below. Feel free to reuse them without copying to the exercise solution box.
For each part of the exercise a solution box has been added, but you may insert additional boxes. Do not hesitate to add Markdown boxes for textual or LaTeX answers (via Cell > Cell Type > Markdown). But make sure to replace any part that says YOUR CODE HERE or YOUR ANSWER HERE and remove the raise NotImplementedError().
Please make your code readable by humans (and not just by the Python interpreter): choose informative function and variable names and use consistent formatting. Feel free to check the PEP 8 Style Guide for Python for the widely adopted coding conventions or this guide for explanation.
Make sure that the full notebook runs without errors before submitting your work. This you can do by selecting Kernel > Restart & Run All in the jupyter menu.
Each sheet has 100 points worth of exercises. Note that only the grades of sheets number 2, 4, 6, 8 count towards the course examination. Submitting sheets 1, 3, 5, 7 & 9 is voluntary and their grades are just for feedback.

Please fill in your name here:

In [1]:

NAME = "Kees van Kempen"

Exercise sheet 1

Code from the lecture:

In [2]:

import numpy as np
import matplotlib.pylab as plt

rng = np.random.default_rng()
%matplotlib inline

def random_in_square():
    """Returns a random position in the square [-1,1)x[-1,1)."""
    return rng.uniform(-1,1,2)  

def is_in_circle(x):
    return np.dot(x,x) < 1

def simulate_number_of_hits(N):
    """Simulates number of hits in case of N trials in the pebble game."""
    number_hits = 0
    for i in range(N):
        position = random_in_square()
        if is_in_circle(position):
            number_hits += 1
    return number_hits

def random_in_disk():
    """Returns a uniform point in the unit disk via rejection."""
    position = random_in_square()
    while not is_in_circle(position):
        position = random_in_square()
    return position

def is_in_square(x):
    """Returns True if x is in the square (-1,1)^2."""
    return np.abs(x[0]) < 1 and np.abs(x[1]) < 1
    
def sample_next_position_naively(position,delta):
    """Keep trying a throw until it ends up in the square."""
    while True:
        next_position = position + delta*random_in_disk()
        if is_in_square(next_position):
            return next_position
    
def naive_markov_pebble(start,delta,N):
    """Simulates the number of hits in the naive Markov-chain version 
    of the pebble game."""
    number_hits = 0
    position = start
    for i in range(N):
        position = sample_next_position_naively(position,delta)
        if is_in_circle(position):
            number_hits += 1
    return number_hits

def naive_markov_pebble_generator(start,delta,N):
    """Same as naive_markov_pebble but only yields the positions."""
    position = start
    for i in range(N):
        position = sample_next_position_naively(position,delta)
        yield position

def sample_next_position(position,delta):
    """Attempt a throw and reject when outside the square."""
    next_position = position + delta*random_in_disk()
    if is_in_square(next_position):
        return next_position  # accept!
    else:
        return position  # reject!

def markov_pebble(start,delta,N):
    """Simulates the number of hits in the proper Markov-chain version of the pebble game."""
    number_hits = 0
    position = start
    for i in range(N):
        position = sample_next_position(position,delta)
        if is_in_circle(position):
            number_hits += 1
    return number_hits

def markov_pebble_generator(start,delta,N):
    """Same as markov_pebble but only yields the positions."""
    position = start
    for i in range(N):
        position = sample_next_position(position,delta)
        yield position

Empirical convergence rate in the pebble game¶

(a) Write a function pi_stddev that estimates the standard deviation $\sigma$ of the $\pi$-estimate using $n$ trials by running the direct-sampling pebble game $m$ times. Store this data for $n=2^4,2^5,\ldots,2^{14}$ and $m=200$ in an array. Hint: you may use the NumPy function np.std. (12 pts)

In [3]:

def pi_stddev(n,m):
    """Estimate the standard deviation in the pi estimate in the case of n trials,
    based on m runs of the direct-sampling pebble game."""
    
    return np.std([simulate_number_of_hits(n)/n*4 for i in range(m)])
    
stddev_data = np.array([[2**k,pi_stddev(2**k,200)] for k in range(4,12)])
stddev_data

Out[3]:

array([[1.60000000e+01, 4.14095400e-01],
       [3.20000000e+01, 2.99924470e-01],
       [6.40000000e+01, 2.19101280e-01],
       [1.28000000e+02, 1.49149198e-01],
       [2.56000000e+02, 9.15196946e-02],
       [5.12000000e+02, 7.09785726e-02],
       [1.02400000e+03, 5.33523935e-02],
       [2.04800000e+03, 3.70486705e-02]])

In [4]:

# If done correctly, your code should pass the following tests
from nose.tools import assert_almost_equal
assert_almost_equal(pi_stddev(2**3,1000),0.582,delta=0.03)
assert_almost_equal(pi_stddev(2**4,1000),0.411,delta=0.03)

(b) Write a function fit_power_law that takes an array of $(n,\sigma)$ pairs and determines best-fit parameters $a,p$ for the curve $\sigma = a n^p$. This is best done by fitting a straight line to the data on a log-log scale ($\log\sigma=\log a+p\log n$). Hint: use curve_fit from SciPy. (12 pts)

In [5]:

import scipy.optimize
def fit_power_law(stddev_data):
    """Compute the best fit parameters a and p."""
    
    # Define the to be fitted function. We use the log, due to the calculation of the uncertainties.
    log_f = lambda n, a, p: np.log(a) + p*np.log(n)
    
    n, 𝜎 = stddev_data.T
    a_fit, p_fit = scipy.optimize.curve_fit(log_f, n, np.log(𝜎))[0]
    
    return a_fit, p_fit

In [6]:

from nose.tools import assert_almost_equal
assert_almost_equal(fit_power_law(np.array([[16.0,1.4],[32.0,1.1],[64.0,0.9]]))[0],3.36,delta=0.05)
assert_almost_equal(fit_power_law(np.array([[16.0,1.4],[32.0,1.1],[64.0,0.9]]))[1],-0.31,delta=0.03)

(c) Plot the data against the fitted curve $\sigma = a n^p$ in a log-log plot. Don't forget to properly label your axes. Hint: use loglog from matplotlib. (12 pts)

In [7]:

from matplotlib import pyplot as plt

data = np.array([[16.0,1.4],[32.0,1.1],[64.0,0.9]])
a_fit, p_fit = fit_power_law(data)

plt.figure()
plt.title("Data versus fit of $\sigma = an^p$")
plt.loglog(*data.T, label="data")
n_range = np.logspace(1, 2, 10)
plt.loglog(n_range, a_fit*n_range**p_fit, label="fit")
plt.xlabel("$n$")
plt.ylabel("$\sigma$")
plt.legend()
plt.show()

Volume of a unit ball in other dimensions¶

In this exercise you will adapt the direct-sampling Monte Carlo method of the pebble game to higher dimensions to estimate the $d$-dimensional volume of the $d$-dimensional ball of radius $1$.

(a) Adapt the direct-sampling Monte Carlo method to the pebble game in a $d$-dimensional hypercube $(-1,1)^d$ with an inscribed $d$-dimensional unit ball. (12 pts)

In [ ]:

def number_of_hits_in_d_dimensions(N,d):
    """Simulates number of hits in case of N trials in the d-dimensional direct-sampling pebble game."""
    # YOUR CODE HERE
    raise NotImplementedError()
    return number_hits

In [ ]:

from nose.tools import assert_almost_equal
assert_almost_equal(number_of_hits_in_d_dimensions(100,1),100,delta=1)
assert_almost_equal(number_of_hits_in_d_dimensions(2000,3),1045,delta=50)

(b) Compare your estimates for the volume $V_d$ of the $d$-dimensional unit ball for $N=10000$ trials and $d=1,2,\ldots,7$ to the exact formula $V_d = \frac{\pi^{d/2}}{\Gamma(\tfrac{d}{2}+1)}$ in a plot. Hint: the Gamma function is available in scipy.special.gamma. (12 pts)

In [ ]:

# YOUR CODE HERE
raise NotImplementedError()

Efficiency of the Metropolis algorithm and the 1/2-rule¶

In the lecture we mentioned the "1/2 rule" for Metropolis algorithms: if moves are rarely rejected then typically you do not explore the domain efficiently, while if most moves are rejected you are wasting efforts proposing moves. A good rule of thumb then is to aim for a rejection rate of $1/2$. In this exercise you are asked to test whether this rule of thumb makes any sense for the Markov-chain pebble game by varying the throwing range $\delta$.

(a) Estimate the mean square deviation $\mathbb{E}[(\tfrac{4\text{hits}}{\text{trials}} - \pi)^2 ]$ from $\pi$ for different values of $\delta$ ranging between $0$ and $3$, but fixed number of trials (say, 2000). For a decent estimate of the mean you will need at least $100$ repetitions. (14 pts)

In [ ]:

# YOUR CODE HERE
raise NotImplementedError()

(b) Measure the rejection rate for the same simulations as in (a). (14 pts)

In [ ]:

# YOUR CODE HERE
raise NotImplementedError()

(c) Plot both the mean square deviation and the rejection rate as function of $\delta$. How well does the 1/2 rule apply in this situation? (12 pts)

In [ ]:

# YOUR CODE HERE
raise NotImplementedError()

35 KiB Raw Blame History Unescape Escape

Exercise sheet¶

Empirical convergence rate in the pebble game¶

Volume of a unit ball in other dimensions¶

Efficiency of the Metropolis algorithm and the 1/2-rule¶

35 KiB

Raw Blame History