
CDS: Numerical Methods Assignments

  • See lecture notes and documentation on Brightspace for Python and Jupyter basics. If you are stuck, try to google or get in touch via Discord.

  • Solutions must be submitted via the Jupyter Hub.

  • Make sure you fill in any place that says YOUR CODE HERE or "YOUR ANSWER HERE".

Submission

  1. Name all team members in the cell below.
  2. Make sure everything runs as expected.
  3. Restart the kernel (in the menubar, select Kernel$\rightarrow$Restart).
  4. Run all cells (in the menubar, select Cell$\rightarrow$Run All).
  5. Check all outputs (Out[*]) for errors and resolve them if necessary.
  6. Submit your solutions in time (before the deadline).
team_members = "Koen Vendrig, Kees van Kempen"

Linear Equation Systems

In the following you will implement the Gauss-Seidel (GS), Steepest Descent (SD) and the Conjugate Gradient (CG) algorithms to solve linear equation systems of the form

$$A \mathbf{x} = \mathbf{b},$$

with $A$ being an $n \times n$ matrix.

In [7]:
import numpy as np

# Generate a random test system; a random matrix is not guaranteed to yield
# a solvable (non-singular) system.
n = 6
A =  7*(np.random.rand(n,n) - .5)
b = 13*(np.random.rand(n)   - .5)
A
Out[7]:
array([[ 1.75781401,  1.51265934, -2.74738978, -1.07039898,  0.8011521 ,
         1.4647544 ],
       [-2.71620167, -0.86457733,  0.10081129, -0.73161518, -2.78224414,
         0.41565052],
       [-0.67163289,  1.56932333,  2.48595706, -2.40628501,  0.71051658,
        -0.69975624],
       [ 1.58537064,  1.68053005,  2.06468208,  2.35512153, -0.66624755,
         2.62150694],
       [-3.14413521, -2.23161046,  1.14029891, -1.55518494,  2.72967878,
        -2.42673772],
       [-2.97430088, -2.15350692, -2.82033592,  2.92079141,  2.94935414,
        -0.35649436]])
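As an optional sanity check (not part of the assignment), the rank and condition number of the random matrix indicate whether the generated system is actually solvable:

# Optional check: A is non-singular iff it has full rank; a very large
# condition number would also warn of numerical trouble.
print("rank:", np.linalg.matrix_rank(A), "out of", n)
print("condition number:", np.linalg.cond(A))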

Task 1

First, you need to implement a Python function $\text{diff(a,b)}$, which returns the difference $\text{d}$ between two $n$-dimensional vectors $\text{a}$ and $\text{b}$ according to

$$ d = || \mathbf{a} - \mathbf{b}||_\infty = \underset{i=1,2,\dots,n}{\operatorname{max}} |a_i - b_i|. $$
In [6]:
def diff(a, b):
    # Infinity norm of the difference: the largest componentwise absolute difference.
    return np.max(np.abs(a - b))
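A quick example with made-up vectors (our own values, not part of the task) illustrates the behavior; here the largest componentwise difference is $|1 - 3| = 2$:

u = np.array([1.0, 5.0, -2.0])
v = np.array([3.0, 4.0, -2.5])
print(diff(u, v))   # max(|1-3|, |5-4|, |-2+2.5|) = 2.0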

Task 2

The Gauss-Seidel iteration scheme to solve the linear equation system

$$A \mathbf{x} = \mathbf{b}$$

is defined by

$$x_i^{(k)} = \frac{1}{a_{ii}} \left[ -\sum_{j=0}^{i-1} a_{ij} x_j^{(k)} -\sum_{j=i+1}^{n-1} a_{ij} x_j^{(k-1)} + b_i \right].$$

Note especially the difference in the sums: the first one involves $x_j^{(k)}$ and the second one $x_j^{(k-1)}$.

Give the outline of the derivation in LaTeX math notation in the markdown cell below. (Double click on "YOUR ANSWER HERE" to open the cell, and ctrl+enter to compile.)

Hint: Similar to the Jacobi scheme, start by separating the matrix $A$ into its diagonal ($D$), strictly lower triangular ($L$) and strictly upper triangular ($U$) parts, such that $A = D - L - U$.

YOUR ANSWER HERE
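One possible outline of the derivation (a sketch of the argument, not a complete answer): with $A = D - L - U$, where $L$ and $U$ are the negated strictly lower and strictly upper triangular parts of $A$,

$$A\mathbf{x} = \mathbf{b} \;\Longleftrightarrow\; (D - L)\mathbf{x} = U\mathbf{x} + \mathbf{b} \;\Longrightarrow\; D\mathbf{x}^{(k)} = L\mathbf{x}^{(k)} + U\mathbf{x}^{(k-1)} + \mathbf{b}.$$

Because $D - L$ is lower triangular, row $i$ only involves the already-updated components $x_j^{(k)}$ with $j < i$ and the previous iterate $x_j^{(k-1)}$ with $j > i$; dividing row $i$ by $a_{ii}$ then gives the componentwise scheme stated above.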

Task 3

Implement the Gauss-Seidel iteration scheme derived above

$$x_i^{(k)} = \frac{1}{a_{ii}} \left[ -\sum_{j=0}^{i-1} a_{ij} x_j^{(k)} -\sum_{j=i+1}^{n-1} a_{ij} x_j^{(k-1)} + b_i \right],$$

where $a_{ij}$ are the elements of the matrix $A$, and $x_i$ and $b_i$ the elements of vectors $\mathbf{x}$ and $\mathbf{b}$, respectively.

Write a Python function $\text{GS(A, b, eps)}$, where $\text{A}$ represents the $n \times n$ matrix $A$, $\text{b}$ represents the $n$-dimensional right-hand-side vector $\mathbf{b}$, and $\text{eps}$ is a scalar $\varepsilon$ defining the accuracy up to which the iteration is performed. Your function should return both the solution vector $\mathbf{x}^{(k)}$ from the last iteration step and the corresponding iteration index $k$.

Use an assertion to make sure the diagonal elements of $A$ are all non-zero. Initialize your iteration with $\mathbf{x}^{(0)} = \mathbf{0}$ (or with $\mathbf{x}^{(1)} = D^{-1}\mathbf{b}$, with $D$ the diagonal of $A$) and increase $k$ until $|| \mathbf{x}^{(k)} - \mathbf{x}^{(k-1)}||_\infty < \varepsilon$.

In [ ]:
def GS(A, b, eps):
    # YOUR CODE HERE
    raise NotImplementedError()
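For orientation, a minimal sketch of how such a function might look (the name `GS_sketch`, the loop structure, and the use of `diff` from Task 1 are our own choices, not prescribed by the assignment):

def GS_sketch(A, b, eps):
    """One possible Gauss-Seidel implementation; returns (x, k)."""
    n = len(b)
    assert np.all(np.diag(A) != 0), "A has a zero diagonal element"
    x_old = np.zeros(n)                      # x^(0) = 0
    k = 0
    while True:
        k += 1
        x_new = x_old.copy()
        for i in range(n):
            # j < i: already-updated entries, j > i: entries from the previous iterate
            s = np.dot(A[i, :i], x_new[:i]) + np.dot(A[i, i+1:], x_new[i+1:])
            x_new[i] = (b[i] - s) / A[i, i]
        if diff(x_new, x_old) < eps:
            return x_new, k
        x_old = x_new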

Task 4

Verify your implementation by comparing your approximate result to an exact solution. Use $\text{numpy.linalg.solve()}$ to obtain the exact solution of the system

$$ \begin{align*} \begin{pmatrix} 10 & -1 & 2 & 0 \\ -1 & 11 &-1 & 3 \\ 2 & -1 & 10&-1 \\ 0 & 3 & -1& 8 \end{pmatrix} \mathbf{x}^* = \begin{pmatrix} 6 \\ 25 \\ -11\\ 15 \end{pmatrix} \end{align*} $$

Then compare your approximate result $\mathbf{\tilde{x}}$ to the exact result $\mathbf{x^*}$ by plotting $|| \mathbf{x}^* - \mathbf{\tilde{x}}||_\infty$ for different accuracies $\varepsilon = 10^{-1}, 10^{-2}, 10^{-3}, 10^{-4}$.

Implement a unit test for your function using this system.

In [ ]:
# System matrix and right-hand side for Task 4.
A = np.array([[10, -1,  2,  0],
              [-1, 11, -1,  3],
              [ 2, -1, 10, -1],
              [ 0,  3, -1,  8]])
b = np.array([ 6, 25, -11, 15])
# Do your plotting here ...

# YOUR CODE HERE
raise NotImplementedError()
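One possible shape for this comparison (a sketch that assumes a working `GS` from Task 3 and matplotlib; the plot style is a free choice):

import matplotlib.pyplot as plt

x_exact = np.linalg.solve(A, b)
eps_values = [1e-1, 1e-2, 1e-3, 1e-4]
errors = [diff(x_exact, GS(A, b, eps)[0]) for eps in eps_values]

plt.loglog(eps_values, errors, "o-")
plt.xlabel(r"$\varepsilon$")
plt.ylabel(r"$\|\mathbf{x}^* - \tilde{\mathbf{x}}\|_\infty$")
plt.show()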
In [ ]:
def test_GS():
    # YOUR CODE HERE
    raise NotImplementedError()
    
test_GS()
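A unit test could, for example, compare the Gauss-Seidel result against `numpy.linalg.solve` on this system (a sketch assuming `GS` and `diff`; the tolerance values are our own choice):

def test_GS_sketch():
    A = np.array([[10., -1.,  2.,  0.],
                  [-1., 11., -1.,  3.],
                  [ 2., -1., 10., -1.],
                  [ 0.,  3., -1.,  8.]])
    b = np.array([6., 25., -11., 15.])
    x_exact = np.linalg.solve(A, b)
    x_approx, k = GS(A, b, 1e-6)
    assert diff(x_exact, x_approx) < 1e-5

test_GS_sketch()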

Task 5

Next, implement the Steepest Descent algorithm in a similar Python function $\text{SD(A, b, eps)}$, which calculates

\begin{align*} \mathbf{v}^{(k)} &= \mathbf{b} - A \mathbf{x}^{(k-1)} \\ t_k &= \frac{ \langle \mathbf{v}^{(k)}, \mathbf{v}^{(k)} \rangle }{ \langle \mathbf{v}^{(k)}, A \mathbf{v}^{(k)}\rangle } \\ \mathbf{x}^{(k)} &= \mathbf{x}^{(k-1)} + t_k \mathbf{v}^{(k)} . \end{align*}

Initialize your iteration again with $\mathbf{x}^{(0)} = \mathbf{0}$ and increase $k$ until $|| \mathbf{x}^{(k)} - \mathbf{x}^{(k-1)}||_\infty < \varepsilon$. Return the solution vector $\mathbf{x}^{(k)}$ from the last iteration step and the corresponding iteration index $k$. Implement a unit test for your implementation by comparing your result to the exact solution of the system in task 4. Use $\text{numpy.dot()}$ for all needed vector/vector and matrix/vector products.

In [ ]:
def SD(A, b, eps):
    # YOUR CODE HERE
    raise NotImplementedError()
In [ ]:
def test_SD():
    # YOUR CODE HERE
    raise NotImplementedError()
    
test_SD()
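A minimal sketch of the Steepest Descent loop (the naming is our own; it uses `numpy.dot` as requested and `diff` from Task 1):

def SD_sketch(A, b, eps):
    """One possible Steepest Descent implementation; returns (x, k)."""
    x = np.zeros(len(b))                 # x^(0) = 0
    k = 0
    while True:
        k += 1
        v = b - np.dot(A, x)             # residual = search direction
        t = np.dot(v, v) / np.dot(v, np.dot(A, v))
        x_new = x + t * v
        if diff(x_new, x) < eps:
            return x_new, k
        x = x_new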

Task 6

Finally, based on your steepest descent implementation from task 5, implement the Conjugate Gradient algorithm in a Python function $\text{CG(A, b, eps)}$ in the following way:

Initialize your procedure with:

\begin{align*} \mathbf{x}^{(0)} &= \mathbf{0} \\ \mathbf{r}^{(0)} &= \mathbf{b} - A \mathbf{x}^{(0)} \\ \mathbf{v}^{(0)} &= \mathbf{r}^{(0)} \end{align*}

Then increase $k$ and repeat the following until $|| \mathbf{x}^{(k)} - \mathbf{x}^{(k-1)}||_\infty < \varepsilon$.

\begin{align*} t_k &= \frac{ \langle \mathbf{r}^{(k)}, \mathbf{r}^{(k)} \rangle }{ \langle \mathbf{v}^{(k)}, A \mathbf{v}^{(k)} \rangle } \\ \mathbf{x}^{(k+1)} &= \mathbf{x}^{(k)} + t_k \mathbf{v}^{(k)} \\ \mathbf{r}^{(k+1)} &= \mathbf{r}^{(k)} - t_k A \mathbf{v}^{(k)} \\ s_k &= \frac{ \langle \mathbf{r}^{(k+1)}, \mathbf{r}^{(k+1)} \rangle }{ \langle \mathbf{r}^{(k)}, \mathbf{r}^{(k)} \rangle } \\ \mathbf{v}^{(k+1)} &= \mathbf{r}^{(k+1)} + s_k \mathbf{v}^{(k)} \end{align*}

Return the solution vector $\mathbf{x}^{(k)}$ from the last iteration step and the corresponding iteration index $k$. Implement a unit test for your implementation by comparing your result to the exact solution of the system in task 4. Use $\text{numpy.dot()}$ for all needed vector/vector and matrix/vector products.

How do you expect the number of needed iteration steps to behave when changing the accuracy $\varepsilon$? What do you see?

In [ ]:
def CG(A, b, eps):
    # YOUR CODE HERE
    raise NotImplementedError()
In [ ]:
def test_CG():
    # YOUR CODE HERE
    raise NotImplementedError()
    
test_CG()
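A sketch of one way to translate the update formulas into code (the naming is our own; it uses `numpy.dot` throughout and `diff` from Task 1):

def CG_sketch(A, b, eps):
    """One possible Conjugate Gradient implementation; returns (x, k)."""
    x = np.zeros(len(b))                 # x^(0) = 0
    r = b - np.dot(A, x)                 # r^(0)
    v = r.copy()                         # v^(0) = r^(0)
    k = 0
    while True:
        k += 1
        Av = np.dot(A, v)
        t = np.dot(r, r) / np.dot(v, Av)
        x_new = x + t * v
        r_new = r - t * Av
        s = np.dot(r_new, r_new) / np.dot(r, r)
        v = r_new + s * v
        if diff(x_new, x) < eps:
            return x_new, k
        x, r = x_new, r_new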

Task 7

Apply all three methods to the following system

\begin{align*} \begin{pmatrix} 0.2& 0.1& 1.0& 1.0& 0.0 \\ 0.1& 4.0& -1.0& 1.0& -1.0 \\ 1.0& -1.0& 60.0& 0.0& -2.0 \\ 1.0& 1.0& 0.0& 8.0& 4.0 \\ 0.0& -1.0& -2.0& 4.0& 700.0 \end{pmatrix} \mathbf{x}^* = \begin{pmatrix} 1 \\ 2 \\ 3 \\ 4 \\ 5 \end{pmatrix}. \end{align*}

Plot the number of needed iterations for each method as a function of $\varepsilon$, using $\varepsilon = 10^{-1}, 10^{-2}, ..., 10^{-8}$.

Explain the observed behavior with the help of the condition number (which you can calculate using $\text{numpy.linalg.cond()}$).

In [ ]:
A = np.array([[ 0.2,  0.1,  1.0,  1.0,   0.0],
              [ 0.1,  4.0, -1.0,  1.0,  -1.0],
              [ 1.0, -1.0, 60.0,  0.0,  -2.0],
              [ 1.0,  1.0,  0.0,  8.0,   4.0],
              [ 0.0, -1.0, -2.0,  4.0, 700.0]])
b = np.array([1, 2, 3, 4, 5])
# YOUR CODE HERE
raise NotImplementedError()
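A sketch of one way to collect the iteration counts (this assumes working `GS`, `SD`, and `CG` functions and matplotlib; the plotting choices are our own):

import matplotlib.pyplot as plt

eps_values = [10.0**(-p) for p in range(1, 9)]      # 1e-1 ... 1e-8

for name, method in [("GS", GS), ("SD", SD), ("CG", CG)]:
    iterations = [method(A, b, eps)[1] for eps in eps_values]
    plt.semilogx(eps_values, iterations, "o-", label=name)

plt.xlabel(r"$\varepsilon$")
plt.ylabel("number of iterations")
plt.legend()
plt.show()

print("condition number of A:", np.linalg.cond(A))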

Task 8

Try to achieve better convergence behavior by preconditioning the matrix $A$. Instead of $A$, use

$$ \tilde{A} = C A C,$$

where $C = \sqrt{D^{-1}}$. If you do so, you will need to replace $\mathbf{b}$ by

$$\mathbf{\tilde{b}} = C \mathbf{b}$$

and the vector $\mathbf{\tilde{x}}$ returned by your function will have to be transformed back via

$$\mathbf{x} = C \mathbf{\tilde{x}}.$$

What is the effect of $C$ on the condition number and why?

In [ ]:
# YOUR CODE HERE
raise NotImplementedError()
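A sketch of the preconditioning step itself (assuming the matrix $A$ and vector $\mathbf{b}$ from Task 7 and, for example, the `CG` function from Task 6):

D = np.diag(np.diag(A))                  # diagonal part of A
C = np.sqrt(np.linalg.inv(D))            # C = sqrt(D^-1); D is diagonal, so the
                                         # elementwise sqrt equals the matrix square root
A_tilde = C @ A @ C
b_tilde = C @ b

print("cond(A)       =", np.linalg.cond(A))
print("cond(A_tilde) =", np.linalg.cond(A_tilde))

x_tilde, k = CG(A_tilde, b_tilde, 1e-8)
x = C @ x_tilde                          # transform back to the original variables
print("residual check:", diff(np.dot(A, x), b))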