{ "cells": [ { "cell_type": "markdown", "id": "4c431265", "metadata": {}, "source": [ "# Exercise sheet\n", "\n", "Some general remarks about the exercises:\n", "* For your convenience, functions from the lecture are included below. Feel free to reuse them without copying them into the exercise solution box.\n", "* For each part of the exercise a solution box has been added, but you may insert additional boxes. Do not hesitate to add Markdown boxes for textual or LaTeX answers (via `Cell > Cell Type > Markdown`). But make sure to replace any part that says `YOUR CODE HERE` or `YOUR ANSWER HERE` and remove the `raise NotImplementedError()`.\n", "* Please make your code readable by humans (and not just by the Python interpreter): choose informative function and variable names and use consistent formatting. Feel free to check the [PEP 8 Style Guide for Python](https://www.python.org/dev/peps/pep-0008/) for the widely adopted coding conventions or [this guide for explanation](https://realpython.com/python-pep8/).\n", "* Make sure that the full notebook runs without errors before submitting your work. You can do this by selecting `Kernel > Restart & Run All` in the Jupyter menu.\n", "* For some exercises, test cases have been provided in a separate cell in the form of `assert` statements. When run, a successful test will give no output, whereas a failed test will display an error message.\n", "* Each sheet has 100 points worth of exercises. Note that only the grades of sheets 2, 4, 6 and 8 count towards the course examination. 
Submitting sheets 1, 3, 5, 7 & 9 is voluntary and their grades are just for feedback.\n", "\n", "Please fill in your name here:" ] }, { "cell_type": "code", "execution_count": null, "id": "026433a4", "metadata": {}, "outputs": [], "source": [ "NAME = \"\"\n", "NAMES_OF_COLLABORATORS = \"\"" ] }, { "cell_type": "markdown", "id": "3b1bff64", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "id": "41d26cde", "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "markdown", "checksum": "de05c5cadee95d63f1acb0ab3f82894f", "grade": false, "grade_id": "cell-f29a87a28188c3d0", "locked": true, "schema_version": 3, "solution": false, "task": false } }, "source": [ "__Exercise sheet 2__\n", "\n", "Code from the lecture:" ] }, { "cell_type": "code", "execution_count": null, "id": "cb41d2a1", "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "5435cd2800cbe70e733a364b79e86c9b", "grade": false, "grade_id": "cell-a6520f459483332d", "locked": true, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "import numpy as np\n", "import matplotlib.pylab as plt\n", "from scipy.integrate import quad\n", "\n", "rng = np.random.default_rng()\n", "%matplotlib inline\n", "\n", "def inversion_sample(f_inverse):\n", " '''Obtain an inversion sample based on the inverse-CDF f_inverse.'''\n", " return f_inverse(rng.random())\n", "\n", "def compare_plot(samples,pdf,xmin,xmax,bins):\n", " '''Draw a plot comparing the histogram of the samples to the expectation coming from the pdf.'''\n", " xval = np.linspace(xmin,xmax,bins+1)\n", " binsize = (xmax-xmin)/bins\n", " # Calculate the expected numbers by numerical integration of the pdf over the bins\n", " expected = np.array([quad(pdf,xval[i],xval[i+1])[0] for i in range(bins)])/binsize\n", " measured = np.histogram(samples,bins,(xmin,xmax))[0]/(len(samples)*binsize)\n", " 
plt.plot(xval,np.append(expected,expected[-1]),\"-k\",drawstyle=\"steps-post\")\n", " plt.bar((xval[:-1]+xval[1:])/2,measured,width=binsize)\n", " plt.xlim(xmin,xmax)\n", " plt.legend([\"expected\",\"histogram\"])\n", " plt.show()\n", " \n", "def gaussian(x):\n", " return np.exp(-x*x/2)/np.sqrt(2*np.pi)" ] }, { "cell_type": "markdown", "id": "3317e002", "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "markdown", "checksum": "d2c3d8374cf18fd1a12c91353f28dbcf", "grade": false, "grade_id": "cell-e6c28b1e3e8371c3", "locked": true, "schema_version": 3, "solution": false, "task": false } }, "source": [ "## Sampling random variables via the inversion method \n", "__(35 Points)__\n", "\n", "Recall from the lecture that for any real random variable $X$ we can construct an explicit random variable via the inversion method that is identically distributed. This random variable is given by $F_X^{-1}(U)$ where $F_X$ is the CDF of $X$ and $U$ is a uniform random variable on $(0,1)$ and \n", "\n", "$$\n", "F_X^{-1}(p) := \\inf\\{ x\\in\\mathbb{R} : F_X(x) \\geq p\\}.\n", "$$\n", "\n", "This gives a very general way of sampling $X$ in a computer program, as you will find out in this exercise.\n", "\n", "__(a)__ Let $X$ be an **exponential random variable** with **rate** $\\lambda$, i.e. a continuous random variable with probability density function $f_X(x) = \\lambda e^{-\\lambda x}$ for $x > 0$. Write a function `f_inv_exponential` that computes $F_X^{-1}(p)$. Illustrate the corresponding sampler with the help of the function `compare_plot` above. 
__(10 pts)__" ] }, { "cell_type": "markdown", "id": "6f2c475a", "metadata": { "deletable": false, "nbgrader": { "cell_type": "markdown", "checksum": "4292b1a356454d496a93ef6555f0a7ae", "grade": true, "grade_id": "cell-311fd25e116f5066", "locked": false, "points": 5, "schema_version": 3, "solution": true, "task": false } }, "source": [ "YOUR ANSWER HERE" ] }, { "cell_type": "code", "execution_count": null, "id": "e6b6428c", "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "90de5b60de4e43881ab85442cdff704a", "grade": false, "grade_id": "cell-06ef7d054d38f5c6", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "def f_inv_exponential(lam,p):\n", " # YOUR CODE HERE\n", " raise NotImplementedError()\n", " \n", "# plotting\n", "# YOUR CODE HERE\n", "raise NotImplementedError()" ] }, { "cell_type": "code", "execution_count": null, "id": "804aedbf", "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "bce45fa412ba32138080832767338e9d", "grade": true, "grade_id": "cell-2022e00546cf1bb0", "locked": true, "points": 5, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "from nose.tools import assert_almost_equal\n", "assert_almost_equal(f_inv_exponential(1.0,0.6),0.916,delta=0.001)\n", "assert_almost_equal(f_inv_exponential(0.3,0.2),0.743,delta=0.001)" ] }, { "cell_type": "markdown", "id": "d590b09d", "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "markdown", "checksum": "08fdb1c6ca42806566800f06d7ffb22b", "grade": false, "grade_id": "cell-f7e0d9b58c948be5", "locked": true, "schema_version": 3, "solution": false, "task": false } }, "source": [ "__(b)__ Let now $X$ have the **Pareto distribution** of **shape** $\\alpha > 0$ on $(b,\\infty)$, which has probability density function $f_X(x) = \\alpha b^{\\alpha} x^{-\\alpha-1}$ for $x > b$. 
Write a function `f_inv_pareto` that computes $F_X^{-1}(p)$. Compare a histogram with a plot of $f_X(x)$ to verify your function numerically. __(10 pts)__" ] }, { "cell_type": "markdown", "id": "47c7a42f", "metadata": { "deletable": false, "nbgrader": { "cell_type": "markdown", "checksum": "1d1fc6a16462f0d238005fdb33a99857", "grade": true, "grade_id": "cell-199713328dcd510d", "locked": false, "points": 5, "schema_version": 3, "solution": true, "task": false } }, "source": [ "YOUR ANSWER HERE" ] }, { "cell_type": "code", "execution_count": null, "id": "e177f32d", "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "eb07f40a935275cf5883204fc817beaa", "grade": false, "grade_id": "cell-074f6a1fd6375c22", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "### Solution\n", "def f_inv_pareto(alpha,b,p):\n", " # YOUR CODE HERE\n", " raise NotImplementedError()\n", "\n", "# plotting\n", "# YOUR CODE HERE\n", "raise NotImplementedError()" ] }, { "cell_type": "code", "execution_count": null, "id": "c0e1426f", "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "62920089752d067b0945eb1d6d98135f", "grade": true, "grade_id": "cell-726b321246679d28", "locked": true, "points": 5, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "from nose.tools import assert_almost_equal\n", "assert_almost_equal(f_inv_pareto(1.0,1.5,0.6),3.75,delta=0.0001)\n", "assert_almost_equal(f_inv_pareto(2.0,2.25,0.3),2.689,delta=0.001)" ] }, { "cell_type": "markdown", "id": "66d91446", "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "markdown", "checksum": "0f3c9abbe9fe756c5cf4bdd6a8a37ac2", "grade": false, "grade_id": "cell-50306550727804ca", "locked": true, "schema_version": 3, "solution": false, "task": false } }, "source": [ "__(c)__ Let $X$ be a discrete random variable taking values in 
$\\{1,2,\\ldots,n\\}$. Write a Python function `f_inv_discrete` that takes the probability mass function $p_X$ as a list `prob_list` given by $[p_X(1),\\ldots,p_X(n)]$ together with a number $p\\in(0,1)$ and returns $F_X^{-1}(p)$, so that evaluating it at a uniform random number yields a random sample with the distribution of $X$ via the inversion method. Verify numerically on an example that your function works. __(15 pts)__" ] }, { "cell_type": "code", "execution_count": null, "id": "210f1302", "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "93d51c9c889dd5ba3490e0ee298d4240", "grade": false, "grade_id": "cell-694eb1261c2dc217", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "def f_inv_discrete(prob_list,p):\n", " # YOUR CODE HERE\n", " raise NotImplementedError()\n", "\n", "# plotting\n", "# YOUR CODE HERE\n", "raise NotImplementedError()" ] }, { "cell_type": "code", "execution_count": null, "id": "3c691f0a", "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "b11d87e414ba9dfe2741d73dd95a2f12", "grade": true, "grade_id": "cell-140af6b31464fbef", "locked": true, "points": 15, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "assert f_inv_discrete([0.5,0.5],0.4)==1\n", "assert f_inv_discrete([0.5,0.5],0.8)==2\n", "assert f_inv_discrete([0,0,1],0.1)==3" ] }, { "cell_type": "markdown", "id": "47546d37", "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "markdown", "checksum": "32dd38f0f963c6132fcbe3ef1f5b9682", "grade": false, "grade_id": "cell-49fd13dc534dfa28", "locked": true, "schema_version": 3, "solution": false, "task": false } }, "source": [ "## Central limit theorem? \n", "__(35 Points)__\n", "\n", "In this exercise we will have a closer look at central limits of the Pareto distribution, for which you implemented a random sampler in the previous exercise. 
By performing the appropriate integrals it is straightforward to show that \n", "\n", "$$ \n", "\\mathbb{E}[X] = \\begin{cases} \\infty & \\text{for }\\alpha \\leq 1 \\\\ \\frac{\\alpha b}{\\alpha - 1} & \\text{for }\\alpha > 1 \\end{cases}, \\qquad \\operatorname{Var}(X) = \\begin{cases} \\infty & \\text{for }\\alpha \\leq 2 \\\\ \\frac{\\alpha b^2}{(\\alpha - 1)^2(\\alpha-2)} & \\text{for }\\alpha > 2 \\end{cases}.\n", "$$\n", "\n", "This shows in particular that the distribution is **heavy tailed**, in the sense that some moments $\\mathbb{E}[X^k]$ diverge." ] }, { "cell_type": "markdown", "id": "ccae582d", "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "markdown", "checksum": "e6d5659ef88eccfb693b35a088d0d50f", "grade": false, "grade_id": "cell-a05e255c144ef6c5", "locked": true, "schema_version": 3, "solution": false, "task": false } }, "source": [ "__(a)__ Write a function `sample_Zn` that produces a random sample for $Z_n= \\frac{\\sqrt{n}}{\\sigma_X}(\\bar{X}_n - \\mathbb{E}[X])$ given $\\alpha>2$, $b>0$ and $n\\geq 1$. Visually verify the central limit theorem for $\\alpha = 4$, $b=1$ and $n=1000$ by comparing a histogram of $Z_n$ to the standard normal distribution (you may use `compare_plot`). 
__(10 pts)__" ] }, { "cell_type": "code", "execution_count": null, "id": "82fe6efd", "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "177917ec75361799067d6c23a28569cd", "grade": false, "grade_id": "cell-b7186322b09717f8", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "def sample_Zn(alpha,b,n):\n", " # YOUR CODE HERE\n", " raise NotImplementedError()\n", "\n", "# Plotting\n", "# YOUR CODE HERE\n", "raise NotImplementedError()" ] }, { "cell_type": "code", "execution_count": null, "id": "b5360d77", "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "e50b33644ddd6bce391b36cefcc2e308", "grade": true, "grade_id": "cell-5d16b014bef9d86f", "locked": true, "points": 10, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "assert_almost_equal(np.mean([sample_Zn(3.5,2.1,100) for _ in range(100)]),0,delta=0.3)\n", "assert_almost_equal(np.std([sample_Zn(3.5,2.1,100) for _ in range(100)]),1,delta=0.3)" ] }, { "cell_type": "markdown", "id": "6192f05d", "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "markdown", "checksum": "08ece68d59de21d798d9a955f59be690", "grade": false, "grade_id": "cell-3e7a23657e9b8374", "locked": true, "schema_version": 3, "solution": false, "task": false } }, "source": [ "__(b)__ Now take $\\alpha = 3/2$ and $b=1$. 
\n", "With some work (which you do not have to do) one can show that the characteristic function of $X$ admits the following expansion around $t=0$,\n", "\n", "$$\n", "\\varphi_X(t) = 1 + 3 i t - (|t|+i t)\\,\\sqrt{2\\pi|t|} + O(t^{2}).\n", "$$\n", "\n", "Based on this, prove the **generalized CLT** for this particular distribution $X$ which states that $Z_n = c\\, n^{1/3} (\\bar{X}_n - \\mathbb{E}[X])$ in the limit $n\\rightarrow\\infty$ converges in distribution, with a to-be-determined choice of overall constant $c$, to a limiting random variable $\\mathcal{S}$ with characteristic function \n", "\n", "$$\n", "\\varphi_{\\mathcal{S}}(t) = \\exp\\big(-(|t|+it)\\sqrt{|t|}\\big).\n", "$$\n", "\n", "__(15 pts)__" ] }, { "cell_type": "markdown", "id": "9735cd88", "metadata": { "deletable": false, "nbgrader": { "cell_type": "markdown", "checksum": "dfd8683eea5663baa81f138a2809722b", "grade": true, "grade_id": "cell-b25551eca32c4807", "locked": false, "points": 15, "schema_version": 3, "solution": true, "task": false } }, "source": [ "YOUR ANSWER HERE" ] }, { "cell_type": "markdown", "id": "5b1d9f54", "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "markdown", "checksum": "342020128f929d47eabfdf9c075ff20c", "grade": false, "grade_id": "cell-d1701433c3c77172", "locked": true, "schema_version": 3, "solution": false, "task": false } }, "source": [ "__(c)__ The random variable $\\mathcal{S}$ has a [stable Lévy distribution](https://en.wikipedia.org/wiki/Stable_distribution) with index $\\alpha = 3/2$ and skewness $\\beta = 1$. Its probability density function $f_{\\mathcal{S}}(x)$ does not admit a simple expression, but can be accessed numerically using SciPy's `scipy.stats.levy_stable.pdf(x,1.5,1.0)`. Verify numerically that the generalized CLT of part (b) holds by comparing an appropriate histogram to this PDF. 
__(10 pts)__" ] }, { "cell_type": "code", "execution_count": null, "id": "b06896e5", "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "c6fe081427f342c354ee8a9b3b3331e7", "grade": true, "grade_id": "cell-e08d054985cfa762", "locked": false, "points": 10, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "from scipy.stats import levy_stable\n", "\n", "# YOUR CODE HERE\n", "raise NotImplementedError()" ] }, { "cell_type": "markdown", "id": "f49856d8", "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "markdown", "checksum": "d8c57e5a527eaad8318e7d31dba01694", "grade": false, "grade_id": "cell-bc80caacda124bf9", "locked": true, "schema_version": 3, "solution": false, "task": false } }, "source": [ "## Joint probability density functions and sampling the normal distribution \n", "__(30 Points)__\n", "\n", "Let $\\Phi$ be a uniform random variable on $(0,2\\pi)$ and $R$ an independent continuous random variable with probability density function $f_R(r) = r\\,e^{-r^2/2}$ for $r>0$. Set $X = R \\cos \\Phi$ and $Y = R \\sin \\Phi$. This is called the **Box-Muller transform**.\n", "\n", "__(a)__ Since $\\Phi$ and $R$ are independent, the joint probability density of $\\Phi$ and $R$ is $f_{\\Phi,R}(\\phi,r) = f_\\Phi(\\phi)f_R(r) = \\frac{1}{2\\pi}\\, r\\,e^{-r^2/2}$. Show by change of variables that $X$ and $Y$ are also independent and both distributed as a standard normal distribution $\\mathcal{N}$. 
__(15 pts)__" ] }, { "cell_type": "markdown", "id": "aa3821de", "metadata": { "deletable": false, "nbgrader": { "cell_type": "markdown", "checksum": "2514e6664aeb4e24a9e881522a8f3a0f", "grade": true, "grade_id": "cell-4f20e3b730ba0d23", "locked": false, "points": 15, "schema_version": 3, "solution": true, "task": false } }, "source": [ "YOUR ANSWER HERE" ] }, { "cell_type": "markdown", "id": "5d064cef", "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "markdown", "checksum": "1af73334332fe512ef7d0edb5803a58d", "grade": false, "grade_id": "cell-2f07fdb2a906bb71", "locked": true, "schema_version": 3, "solution": false, "task": false } }, "source": [ "__(b)__ Write a function to sample a pair of independent normal random variables using the Box-Muller transform. Hint: to sample $R$ you can use the inversion method of the first exercise. Produce a histogram to check the distribution of your normal variables. __(15 pts)__" ] }, { "cell_type": "code", "execution_count": null, "id": "e4023f99", "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "86173970c865da7b0cb8ab78ec4a87b6", "grade": true, "grade_id": "cell-9bf8873cce1d179c", "locked": false, "points": 15, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "def random_normal_pair():\n", " '''Return two independent normal random variables.'''\n", " # YOUR CODE HERE\n", " raise NotImplementedError()\n", " return x, y\n", "\n", "# Plotting\n", "# YOUR CODE HERE\n", "raise NotImplementedError()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.12" } }, "nbformat": 4, "nbformat_minor": 5 }