{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "4c431265",
   "metadata": {},
   "source": [
    "# Exercise sheet\n",
    "\n",
    "Some general remarks about the exercises:\n",
    "* For your convenience, functions from the lecture are included below. Feel free to reuse them without copying them into the exercise solution box.\n",
    "* For each part of the exercise a solution box has been added, but you may insert additional boxes. Do not hesitate to add Markdown boxes for textual or LaTeX answers (via `Cell > Cell Type > Markdown`). But make sure to replace any part that says `YOUR CODE HERE` or `YOUR ANSWER HERE` and remove the `raise NotImplementedError()`.\n",
    "* Please make your code readable by humans (and not just by the Python interpreter): choose informative function and variable names and use consistent formatting. Feel free to check the [PEP 8 Style Guide for Python](https://www.python.org/dev/peps/pep-0008/) for the widely adopted coding conventions or [this guide for explanation](https://realpython.com/python-pep8/).\n",
    "* Make sure that the full notebook runs without errors before submitting your work. You can check this by selecting `Kernel > Restart & Run All` in the Jupyter menu.\n",
    "* For some exercises test cases have been provided in a separate cell in the form of `assert` statements. When run, a successful test will give no output, whereas a failed test will display an error message.\n",
    "* Each sheet has 100 points worth of exercises. Note that only the grades of sheets 2, 4, 6 and 8 count towards the course examination. Submitting sheets 1, 3, 5, 7 and 9 is voluntary, and their grades are just for feedback.\n",
    "\n",
    "Please fill in your name here:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "026433a4",
   "metadata": {},
   "outputs": [],
   "source": [
    "NAME = \"\"\n",
    "NAMES_OF_COLLABORATORS = \"\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3b1bff64",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "41d26cde",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "markdown",
     "checksum": "de05c5cadee95d63f1acb0ab3f82894f",
     "grade": false,
     "grade_id": "cell-f29a87a28188c3d0",
     "locked": true,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "source": [
    "__Exercise sheet 2__\n",
    "\n",
    "Code from the lecture:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cb41d2a1",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "5435cd2800cbe70e733a364b79e86c9b",
     "grade": false,
     "grade_id": "cell-a6520f459483332d",
     "locked": true,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "from scipy.integrate import quad\n",
    "\n",
    "rng = np.random.default_rng()\n",
    "%matplotlib inline\n",
    "\n",
    "def inversion_sample(f_inverse):\n",
    "    '''Obtain an inversion sample based on the inverse-CDF f_inverse.'''\n",
    "    return f_inverse(rng.random())\n",
    "\n",
    "def compare_plot(samples,pdf,xmin,xmax,bins):\n",
    "    '''Draw a plot comparing the histogram of the samples to the expectation coming from the pdf.'''\n",
    "    xval = np.linspace(xmin,xmax,bins+1)\n",
    "    binsize = (xmax-xmin)/bins\n",
    "    # Calculate the expected numbers by numerical integration of the pdf over the bins\n",
    "    expected = np.array([quad(pdf,xval[i],xval[i+1])[0] for i in range(bins)])/binsize\n",
    "    measured = np.histogram(samples,bins,(xmin,xmax))[0]/(len(samples)*binsize)\n",
    "    plt.plot(xval,np.append(expected,expected[-1]),\"-k\",drawstyle=\"steps-post\")\n",
    "    plt.bar((xval[:-1]+xval[1:])/2,measured,width=binsize)\n",
    "    plt.xlim(xmin,xmax)\n",
    "    plt.legend([\"expected\",\"histogram\"])\n",
    "    plt.show()\n",
    "    \n",
    "def gaussian(x):\n",
    "    '''Probability density function of the standard normal distribution.'''\n",
    "    return np.exp(-x*x/2)/np.sqrt(2*np.pi)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3317e002",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "markdown",
     "checksum": "d2c3d8374cf18fd1a12c91353f28dbcf",
     "grade": false,
     "grade_id": "cell-e6c28b1e3e8371c3",
     "locked": true,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "source": [
    "## Sampling random variables via the inversion method \n",
    "__(35 Points)__\n",
    "\n",
    "Recall from the lecture that for any real random variable $X$ we can construct an explicit random variable via the inversion method that is identically distributed. This random variable is given by $F_X^{-1}(U)$ where $F_X$ is the CDF of $X$ and $U$ is a uniform random variable on $(0,1)$ and \n",
    "\n",
    "$$\n",
    "F_X^{-1}(p) := \\inf\\{ x\\in\\mathbb{R} : F_X(x) \\geq p\\}.\n",
    "$$\n",
    "\n",
    "This gives a very general way of sampling $X$ in a computer program, as you will find out in this exercise.\n",
    "\n",
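    "As a small illustration of this definition (an example added here for orientation, with the interval endpoints $a<b$ chosen as illustrative parameters), suppose $X$ is uniform on $(a,b)$. Then $F_X(x) = (x-a)/(b-a)$ for $x\\in[a,b]$, and solving $F_X(x) = p$ for $p\\in(0,1)$ gives\n",
    "\n",
    "$$\n",
    "F_X^{-1}(p) = a + (b-a)\\,p,\n",
    "$$\n",
    "\n",
    "so that $F_X^{-1}(U)$ is simply $U$ rescaled and shifted to the interval $(a,b)$.\n",
    "\n",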
    "__(a)__ Let $X$ be an **exponential random variable** with **rate** $\\lambda$, i.e. a continuous random variable with probability density function $f_X(x) = \\lambda e^{-\\lambda x}$ for $x > 0$. Write a function `f_inv_exponential` that computes $F_X^{-1}(p)$. Illustrate the corresponding sampler with the help of the function `compare_plot` above. __(10 pts)__"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6f2c475a",
   "metadata": {
    "deletable": false,
    "nbgrader": {
     "cell_type": "markdown",
     "checksum": "4292b1a356454d496a93ef6555f0a7ae",
     "grade": true,
     "grade_id": "cell-311fd25e116f5066",
     "locked": false,
     "points": 5,
     "schema_version": 3,
     "solution": true,
     "task": false
    }
   },
   "source": [
    "YOUR ANSWER HERE"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e6b6428c",
   "metadata": {
    "deletable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "90de5b60de4e43881ab85442cdff704a",
     "grade": false,
     "grade_id": "cell-06ef7d054d38f5c6",
     "locked": false,
     "schema_version": 3,
     "solution": true,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "def f_inv_exponential(lam,p):\n",
    "    # YOUR CODE HERE\n",
    "    raise NotImplementedError()\n",
    "    \n",
    "# plotting\n",
    "# YOUR CODE HERE\n",
    "raise NotImplementedError()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "804aedbf",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "bce45fa412ba32138080832767338e9d",
     "grade": true,
     "grade_id": "cell-2022e00546cf1bb0",
     "locked": true,
     "points": 5,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "from nose.tools import assert_almost_equal\n",
    "assert_almost_equal(f_inv_exponential(1.0,0.6),0.916,delta=0.001)\n",
    "assert_almost_equal(f_inv_exponential(0.3,0.2),0.743,delta=0.001)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d590b09d",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "markdown",
     "checksum": "08fdb1c6ca42806566800f06d7ffb22b",
     "grade": false,
     "grade_id": "cell-f7e0d9b58c948be5",
     "locked": true,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "source": [
    "__(b)__ Let now $X$ have the **Pareto distribution** of **shape** $\\alpha > 0$ on $(b,\\infty)$, which has  probability density function $f_X(x) = \\alpha b^{\\alpha} x^{-\\alpha-1}$ for $x > b$. Write a function `f_inv_pareto` that computes $F_X^{-1}(p)$. Compare a histogram with a plot of $f_X(x)$ to verify your function numerically. __(10 pts)__"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "47c7a42f",
   "metadata": {
    "deletable": false,
    "nbgrader": {
     "cell_type": "markdown",
     "checksum": "1d1fc6a16462f0d238005fdb33a99857",
     "grade": true,
     "grade_id": "cell-199713328dcd510d",
     "locked": false,
     "points": 5,
     "schema_version": 3,
     "solution": true,
     "task": false
    }
   },
   "source": [
    "YOUR ANSWER HERE"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e177f32d",
   "metadata": {
    "deletable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "eb07f40a935275cf5883204fc817beaa",
     "grade": false,
     "grade_id": "cell-074f6a1fd6375c22",
     "locked": false,
     "schema_version": 3,
     "solution": true,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "def f_inv_pareto(alpha,b,p):\n",
    "    # YOUR CODE HERE\n",
    "    raise NotImplementedError()\n",
    "\n",
    "# plotting\n",
    "# YOUR CODE HERE\n",
    "raise NotImplementedError()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c0e1426f",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "62920089752d067b0945eb1d6d98135f",
     "grade": true,
     "grade_id": "cell-726b321246679d28",
     "locked": true,
     "points": 5,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "from nose.tools import assert_almost_equal\n",
    "assert_almost_equal(f_inv_pareto(1.0,1.5,0.6),3.75,delta=0.0001)\n",
    "assert_almost_equal(f_inv_pareto(2.0,2.25,0.3),2.689,delta=0.001)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "66d91446",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "markdown",
     "checksum": "0f3c9abbe9fe756c5cf4bdd6a8a37ac2",
     "grade": false,
     "grade_id": "cell-50306550727804ca",
     "locked": true,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "source": [
    "__(c)__ Let $X$ be a discrete random variable taking values in $\\{1,2,\\ldots,n\\}$. Write a Python function `f_inv_discrete` that takes the probability mass function $p_X$ as a list `prob_list` given by $[p_X(1),\\ldots,p_X(n)]$ together with a probability $p$, and returns $F_X^{-1}(p)$, so that it can be used to produce random samples with the distribution of $X$ via the inversion method. Verify the working of your function numerically on an example. __(15 pts)__"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "210f1302",
   "metadata": {
    "deletable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "93d51c9c889dd5ba3490e0ee298d4240",
     "grade": false,
     "grade_id": "cell-694eb1261c2dc217",
     "locked": false,
     "schema_version": 3,
     "solution": true,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "def f_inv_discrete(prob_list,p):\n",
    "    # YOUR CODE HERE\n",
    "    raise NotImplementedError()\n",
    "\n",
    "# plotting\n",
    "# YOUR CODE HERE\n",
    "raise NotImplementedError()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3c691f0a",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "b11d87e414ba9dfe2741d73dd95a2f12",
     "grade": true,
     "grade_id": "cell-140af6b31464fbef",
     "locked": true,
     "points": 15,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "assert f_inv_discrete([0.5,0.5],0.4)==1\n",
    "assert f_inv_discrete([0.5,0.5],0.8)==2\n",
    "assert f_inv_discrete([0,0,1],0.1)==3"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "47546d37",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "markdown",
     "checksum": "32dd38f0f963c6132fcbe3ef1f5b9682",
     "grade": false,
     "grade_id": "cell-49fd13dc534dfa28",
     "locked": true,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "source": [
    "## Central limit theorem? \n",
    "__(35 Points)__\n",
    "\n",
    "In this exercise we will have a closer look at central limits of the Pareto distribution, for which you implemented a random sampler in the previous exercise. By performing the appropriate integrals it is straightforward to show that \n",
    "\n",
    "$$ \n",
    "\\mathbb{E}[X] = \\begin{cases} \\infty & \\text{for }\\alpha \\leq 1 \\\\ \\frac{\\alpha b}{\\alpha - 1} & \\text{for }\\alpha > 1 \\end{cases}, \\qquad \\operatorname{Var}(X) = \\begin{cases} \\infty & \\text{for }\\alpha \\leq 2 \\\\ \\frac{\\alpha b^2}{(\\alpha - 1)^2(\\alpha-2)} & \\text{for }\\alpha > 2 \\end{cases}.\n",
    "$$\n",
    "\n",
    "This shows in particular that the distribution is **heavy tailed**, in the sense that some moments $\\mathbb{E}[X^k]$ diverge."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ccae582d",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "markdown",
     "checksum": "e6d5659ef88eccfb693b35a088d0d50f",
     "grade": false,
     "grade_id": "cell-a05e255c144ef6c5",
     "locked": true,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "source": [
    "__(a)__ Write a function `sample_Zn` that produces a random sample of $Z_n= \\frac{\\sqrt{n}}{\\sigma_X}(\\bar{X}_n - \\mathbb{E}[X])$, where $\\bar{X}_n$ is the average of $n$ independent copies of $X$ and $\\sigma_X = \\sqrt{\\operatorname{Var}(X)}$, given $\\alpha>2$, $b>0$ and $n\\geq 1$. Visually verify the central limit theorem for $\\alpha = 4$, $b=1$ and $n=1000$ by comparing a histogram of $Z_n$ to the standard normal distribution (you may use `compare_plot`). __(10 pts)__"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "82fe6efd",
   "metadata": {
    "deletable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "177917ec75361799067d6c23a28569cd",
     "grade": false,
     "grade_id": "cell-b7186322b09717f8",
     "locked": false,
     "schema_version": 3,
     "solution": true,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "def sample_Zn(alpha,b,n):\n",
    "    # YOUR CODE HERE\n",
    "    raise NotImplementedError()\n",
    "\n",
    "# Plotting\n",
    "# YOUR CODE HERE\n",
    "raise NotImplementedError()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b5360d77",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "e50b33644ddd6bce391b36cefcc2e308",
     "grade": true,
     "grade_id": "cell-5d16b014bef9d86f",
     "locked": true,
     "points": 10,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "assert_almost_equal(np.mean([sample_Zn(3.5,2.1,100) for _ in range(100)]),0,delta=0.3)\n",
    "assert_almost_equal(np.std([sample_Zn(3.5,2.1,100) for _ in range(100)]),1,delta=0.3)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6192f05d",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "markdown",
     "checksum": "08ece68d59de21d798d9a955f59be690",
     "grade": false,
     "grade_id": "cell-3e7a23657e9b8374",
     "locked": true,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "source": [
    "__(b)__ Now take $\\alpha = 3/2$ and $b=1$. \n",
    "With some work (which you do not have to do) one can show that the characteristic function of $X$ admits the following expansion around $t=0$,\n",
    "\n",
    "$$\n",
    "\\varphi_X(t) = 1 + 3 i t - (|t|+i t)\\,\\sqrt{2\\pi|t|} + O(t^{2}).\n",
    "$$\n",
    "\n",
    "Based on this, prove the **generalized CLT** for this particular distribution, which states that, for a suitable choice of the overall constant $c$ (to be determined), $Z_n = c\\, n^{1/3} (\\bar{X}_n - \\mathbb{E}[X])$ converges in distribution as $n\\rightarrow\\infty$ to a limiting random variable $\\mathcal{S}$ with characteristic function \n",
    "\n",
    "$$\n",
    "\\varphi_{\\mathcal{S}}(t) = \\exp\\big(-(|t|+it)\\sqrt{|t|}\\big).\n",
    "$$\n",
    "\n",
    "__(15 pts)__"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9735cd88",
   "metadata": {
    "deletable": false,
    "nbgrader": {
     "cell_type": "markdown",
     "checksum": "dfd8683eea5663baa81f138a2809722b",
     "grade": true,
     "grade_id": "cell-b25551eca32c4807",
     "locked": false,
     "points": 15,
     "schema_version": 3,
     "solution": true,
     "task": false
    }
   },
   "source": [
    "YOUR ANSWER HERE"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5b1d9f54",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "markdown",
     "checksum": "342020128f929d47eabfdf9c075ff20c",
     "grade": false,
     "grade_id": "cell-d1701433c3c77172",
     "locked": true,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "source": [
    "__(c)__ The random variable $\\mathcal{S}$ has a [stable Lévy distribution](https://en.wikipedia.org/wiki/Stable_distribution) with index $\\alpha = 3/2$ and skewness $\\beta = 1$. Its probability density function $f_{\\mathcal{S}}(x)$ does not admit a simple expression, but can be accessed numerically using SciPy's `scipy.stats.levy_stable.pdf(x,1.5,1.0)`. Verify numerically that the generalized CLT of part (b) holds by comparing an appropriate histogram to this PDF. __(10 pts)__"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b06896e5",
   "metadata": {
    "deletable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "c6fe081427f342c354ee8a9b3b3331e7",
     "grade": true,
     "grade_id": "cell-e08d054985cfa762",
     "locked": false,
     "points": 10,
     "schema_version": 3,
     "solution": true,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "from scipy.stats import levy_stable\n",
    "\n",
    "# YOUR CODE HERE\n",
    "raise NotImplementedError()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f49856d8",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "markdown",
     "checksum": "d8c57e5a527eaad8318e7d31dba01694",
     "grade": false,
     "grade_id": "cell-bc80caacda124bf9",
     "locked": true,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "source": [
    "## Joint probability density functions and sampling the normal distribution \n",
    "__(30 Points)__\n",
    "\n",
    "Let $\\Phi$ be a uniform random variable on $(0,2\\pi)$ and $R$ an independent continuous random variable with probability density function $f_R(r) = r\\,e^{-r^2/2}$ for $r>0$. Set $X = R \\cos \\Phi$ and $Y = R \\sin \\Phi$. This is called the **Box-Muller transform**.\n",
    "\n",
    "__(a)__ Since $\\Phi$ and $R$ are independent, the joint probability density of $\\Phi$ and $R$ is $f_{\\Phi,R}(\\phi,r) = f_\\Phi(\\phi)f_R(r) = \\frac{1}{2\\pi}\\, r\\,e^{-r^2/2}$. Show by change of variables that $X$ and $Y$ are also independent and both distributed as a standard normal distribution $\\mathcal{N}$. __(15 pts)__"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "aa3821de",
   "metadata": {
    "deletable": false,
    "nbgrader": {
     "cell_type": "markdown",
     "checksum": "2514e6664aeb4e24a9e881522a8f3a0f",
     "grade": true,
     "grade_id": "cell-4f20e3b730ba0d23",
     "locked": false,
     "points": 15,
     "schema_version": 3,
     "solution": true,
     "task": false
    }
   },
   "source": [
    "YOUR ANSWER HERE"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5d064cef",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "markdown",
     "checksum": "1af73334332fe512ef7d0edb5803a58d",
     "grade": false,
     "grade_id": "cell-2f07fdb2a906bb71",
     "locked": true,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "source": [
    "__(b)__ Write a function to sample a pair of independent normal random variables using the Box-Muller transform. Hint: to sample $R$ you can use the inversion method of the first exercise. Produce a histogram to check the distribution of your normal variables. __(15 pts)__"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e4023f99",
   "metadata": {
    "deletable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "86173970c865da7b0cb8ab78ec4a87b6",
     "grade": true,
     "grade_id": "cell-9bf8873cce1d179c",
     "locked": false,
     "points": 15,
     "schema_version": 3,
     "solution": true,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "def random_normal_pair():\n",
    "    '''Return two independent normal random variables.'''\n",
    "    # YOUR CODE HERE\n",
    "    raise NotImplementedError()\n",
    "    return x, y\n",
    "\n",
    "# Plotting\n",
    "# YOUR CODE HERE\n",
    "raise NotImplementedError()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}