---
geometry: margin=2cm
output: pdf_document
title: Exam 2
subtitle: CSCI 5607
date: \today
author: |
  | Michael Zhang
  | zhan4854@umn.edu $\cdot$ ID: 5289259
---
\renewcommand{\c}[1]{\textcolor{gray}{#1}}
\newcommand{\now}[1]{\textcolor{blue}{#1}}
\newcommand{\todo}[0]{\textcolor{red}{\textbf{TODO}}}
## Reflection and Refraction
1. \c{Consider a sphere $S$ made of solid glass ($\eta$ = 1.5) that has radius $r =
3$ and is centered at the location $s = (2, 2, 10)$ in a vacuum ($\eta =
1.0$). If a ray emanating from the point $e = (0, 0, 0)$ intersects $S$ at a
point $p = (1, 4, 8)$:}
a. \c{(2 points) What is the angle of incidence $\theta_i$?}
The incoming ray is in the direction $I = p - e = (1, 4, 8)$, and the outward
normal at that point is $N = p - s = (1, 4, 8) - (2, 2, 10) = (-1, 2, -2)$. The
angle can be found by taking the opposite of the incoming ray $-I$ and using the
formula $\cos \theta_i = \frac{-I \cdot N}{|I| |N|} = \frac{(-1, -4, -8)
\cdot (-1, 2, -2)}{9 \cdot 3} = \frac{1 - 8 + 16}{27} = \frac{1}{3}$. So the
angle $\boxed{\theta_i = \cos^{-1}(\frac{1}{3})} \approx 70.5^\circ$.
b. \c{(1 point) What is the angle of reflection $\theta_r$?}
The angle of reflection always equals the angle of incidence, $\theta_r =
\theta_i = \boxed{\cos^{-1}(\frac{1}{3})}$.
c. \c{(3 points) What is the direction of the reflected ray?}
The reflected ray can be found by first projecting the reversed incident ray
$-I = (-1, -4, -8)$ onto the unit normal $\hat{N} = \frac{p - s}{|p - s|} =
(-\frac{1}{3}, \frac{2}{3}, -\frac{2}{3})$, which gives $v = (-I \cdot
\hat{N})\hat{N} = 3 \times (-\frac{1}{3}, \frac{2}{3}, -\frac{2}{3}) = (-1,
2, -2)$. The foot of this projection on the normal line through $p$ is $p' =
p + v = (1, 4, 8) + (-1, 2, -2) = (0, 6, 6)$.
Next, mirror the ray origin $e$ through $p'$. The offset from $e$ to $p'$ is
$p' - e = (0, 6, 6)$ since the ray starts at the origin, and adding it to $p'$
once more gives $e' = (0, 12, 12)$, the mirror image of the origin about the
normal line.
Finally, the reflected direction points from $p$ toward $e'$: $(0, 12, 12) -
(1, 4, 8) = \boxed{(-1, 8, 4)}$. This agrees with the usual formula $R = I -
2(I \cdot \hat{N})\hat{N}$.
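
The reflection step above can be cross-checked numerically. The sketch below is only an illustration, not part of the exam answer: the helper names `dot` and `reflect` are made up here, and it uses the closed-form $R = I - 2\frac{I \cdot N}{N \cdot N}N$ rather than the geometric construction, so the normal does not need to be normalized first.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def reflect(incident, normal):
    """Reflect `incident` about `normal` (the normal need not be unit length)."""
    scale = 2.0 * dot(incident, normal) / dot(normal, normal)
    return tuple(i - scale * n for i, n in zip(incident, normal))


e = (0.0, 0.0, 0.0)   # ray origin
s = (2.0, 2.0, 10.0)  # sphere center
p = (1.0, 4.0, 8.0)   # hit point

I = tuple(pi - ei for pi, ei in zip(p, e))  # incident direction p - e
N = tuple(pi - si for pi, si in zip(p, s))  # outward normal p - s
R = reflect(I, N)
print(R)
```

For these inputs the result should match the boxed direction $(-1, 8, 4)$.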
d. \c{(3 points) What is the angle of transmission $\theta_t$?}
Using Snell's law, $\eta_i \sin \theta_i = \eta_t \sin \theta_t$ with $\eta_i
= 1.0$ and $\eta_t = 1.5$, so $1.0 \times \sin \theta_i = 1.5 \times \sin
\theta_t$. Since $\sin \theta_i = \sqrt{1 - \cos^2 \theta_i} = \sqrt{1 -
\frac{1}{9}} = \frac{2\sqrt{2}}{3}$, we can solve for $\theta_t =
\sin^{-1}(\frac{2}{3} \times \frac{2\sqrt{2}}{3}) =
\sin^{-1}(\frac{4\sqrt{2}}{9}) \approx \boxed{0.6797}$ (in radians, about
$38.9^\circ$).
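
As a quick numeric check of the Snell's law step, here is a small sketch. The function name `transmission_angle` is hypothetical, and the total-internal-reflection branch is included only for completeness (it cannot trigger when going from vacuum into glass).

```python
import math


def transmission_angle(cos_theta_i, eta_i, eta_t):
    """Return the refraction angle in radians, or None on total internal reflection."""
    sin_i = math.sqrt(1.0 - cos_theta_i ** 2)
    sin_t = eta_i / eta_t * sin_i  # Snell: eta_i sin(theta_i) = eta_t sin(theta_t)
    if sin_t > 1.0:
        return None
    return math.asin(sin_t)


# |cos(theta_i)| = 1/3 for the ray and sphere in this question.
theta_t = transmission_angle(1.0 / 3.0, 1.0, 1.5)
print(theta_t, math.degrees(theta_t))
```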
e. \c{(4 points) What is the direction of the transmitted ray?}
## Geometric Transformations
2. \c{(8 points) Consider the airplane model below, defined in object
coordinates with its center at $(0, 0, 0)$, its wings aligned with the $\pm
x$ axis, its tail pointing upwards in the $+y$ direction and its nose facing
in the $+z$ direction. Derive a sequence of model transformation matrices
that can be applied to the vertices of the airplane to position it in space
at the location $p = (4, 4, 7)$, with a direction of flight $w = (2, 1, -2)$
and the wings aligned with the direction $d = (-2, 2, -1)$.}
The translation matrix is
$$
\begin{bmatrix}
1 & 0 & 0 & x \\
0 & 1 & 0 & y \\
0 & 0 & 1 & z \\
0 & 0 & 0 & 1 \\
\end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0 & 4 \\
0 & 1 & 0 & 4 \\
0 & 0 & 1 & 7 \\
0 & 0 & 0 & 1 \\
\end{bmatrix}
$$
Since the direction of flight was originally $(0, 0, 1)$, we have to
transform it to $(2, 1, -2)$.
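
One way to sketch the remaining rotation is to build an orthonormal frame from the two given directions and use it as the columns of a rotation matrix, then append the translation. This is only a sketch under assumptions: it maps the object's $+x$ (wing) axis to $d$ rather than $-d$, it relies on $w \cdot d = 0$ (which holds for the given vectors), and the helper names are made up for illustration.

```python
import math


def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def model_matrix(position, flight_dir, wing_dir):
    """Rotation (columns = new x, y, z axes) followed by translation."""
    z_axis = normalize(flight_dir)   # nose (+z) maps to the flight direction
    x_axis = normalize(wing_dir)     # wings (+x) map to the wing direction
    y_axis = cross(z_axis, x_axis)   # tail (+y) completes a right-handed frame
    return [
        [x_axis[0], y_axis[0], z_axis[0], position[0]],
        [x_axis[1], y_axis[1], z_axis[1], position[1]],
        [x_axis[2], y_axis[2], z_axis[2], position[2]],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform(m, point):
    """Apply a 4x4 matrix to a 3D point (homogeneous w = 1)."""
    h = (*point, 1.0)
    return tuple(sum(m[r][c] * h[c] for c in range(4)) for r in range(3))


M = model_matrix((4, 4, 7), (2, 1, -2), (-2, 2, -1))
nose = transform(M, (0, 0, 1))    # should land at p + w/|w|
center = transform(M, (0, 0, 0))  # should land at p
print(nose, center)
```

Under these assumptions the tail direction comes out as $\hat{w} \times \hat{d} = (\frac{1}{3}, \frac{2}{3}, \frac{2}{3})$.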
3.
## The Camera/Viewing Transformation
4. \c{Consider the viewing transformation matrix $V$ that enables all of the
vertices in a scene to be expressed in terms of a coordinate system in which
the eye is located at $(0, 0, 0)$, the viewing direction ($-n$) is aligned
with the $-z$ axis $(0, 0, -1)$, and the camera's 'up' direction (which
controls the roll of the view) is aligned with the $y$ axis (0, 1, 0).}
a. \c{(4 points) When the eye is located at $e = (2, 3, 5)$, the camera is
pointing in the direction $(1, -1, -1)$, and the camera's 'up' direction is
$(0, 1, 0)$, what are the entries in $V$?}
First we can calculate $n$ and $u$:
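- Viewing direction is $(1, -1, -1)$. Since the viewing direction is $-n$, $n$ points the opposite way.
- Normalized $n = (-\frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}})$.
- $u = up \times n$, normalized, is $(\frac{1}{\sqrt{2}}, 0, \frac{1}{\sqrt{2}})$.
- $v = n \times u = (\frac{\sqrt{6}}{6}, \frac{\sqrt{6}}{3}, -\frac{\sqrt{6}}{6})$
- $d = (-u \cdot e, -v \cdot e, -n \cdot e) = (-\frac{7}{\sqrt{2}}, -\frac{\sqrt{6}}{2}, -2\sqrt{3})$
$$
V =
\begin{bmatrix}
\frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} & -\frac{7}{\sqrt{2}} \\
\frac{\sqrt{6}}{6} & \frac{\sqrt{6}}{3} & -\frac{\sqrt{6}}{6} & -\frac{\sqrt{6}}{2} \\
-\frac{1}{\sqrt{3}} & \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{3}} & -2\sqrt{3} \\
0 & 0 & 0 & 1 \\
\end{bmatrix}
$$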
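
The $n$, $u$, $v$ construction can be sketched in code to check the signs. This assumes the convention stated in the question (the viewing direction is $-n$) and a translation column $d = (-u \cdot e, -v \cdot e, -n \cdot e)$. The function and helper names are hypothetical.

```python
import math


def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def view_matrix(eye, view_dir, up):
    """Rows are u, v, n; the last column moves the eye to the origin."""
    n = normalize(tuple(-c for c in view_dir))  # n opposes the viewing direction
    u = normalize(cross(up, n))
    v = cross(n, u)
    return [
        [*u, -dot(u, eye)],
        [*v, -dot(v, eye)],
        [*n, -dot(n, eye)],
        [0.0, 0.0, 0.0, 1.0],
    ]


eye = (2, 3, 5)
V = view_matrix(eye, (1, -1, -1), (0, 1, 0))
# Sanity check: the eye itself should map to the origin of camera space.
eye_in_camera = [sum(V[r][c] * (*eye, 1)[c] for c in range(4)) for r in range(3)]
for row in V:
    print([round(c, 4) for c in row])
```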
b. \c{(2 points) How will this matrix change if the eye moves forward in the
direction of view? [which elements in V will stay the same? which elements
will change and in what way?]}
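If the eye moves forward, the eye _position_ and everything that depends on it
will change, while everything else stays the same.
| $n$ | $u$ | $v$ | $d$ |
| ---- | ---- | ---- | --------- |
| same | same | same | different |
The $n$ is the same because the viewing direction does not change, and $u$ and
$v$ are derived only from $n$ and 'up'. Within $d = (-u \cdot e, -v \cdot e,
-n \cdot e)$, only the third entry changes: moving the eye a distance $t$
along $-n$ leaves $u \cdot e$ and $v \cdot e$ untouched (both axes are
perpendicular to $n$) and increases $-n \cdot e$ by $t$.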
c. \c{(2 points) How will this matrix change if the viewing direction spins
in the clockwise direction around the camera's 'up' direction? [which
elements in V will stay the same? which elements will change and in what
way?]}
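In this case, the eye _position_ $e$ stays the same, but the camera axes rotate
about 'up'. Because $d = (-u \cdot e, -v \cdot e, -n \cdot e)$ is built from
those axes, it changes as well (unless the eye sits at the origin).
| $n$ | $u$ | $v$ | $d$ |
| --------- | --------- | --------- | --------- |
| different | different | different | different |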
d. \c{(2 points) How will this matrix change if the viewing direction rotates
directly upward, within the plane defined by the viewing and 'up' directions?
[which elements in V will stay the same? which elements will change and in
what way?]}
In this case, the eye _position_ stays the same. The viewing direction stays
in the plane spanned by $n$ and 'up', so $u = up \times n$ (normalized) keeps
its direction, while $n$ and $v$ both tilt about $u$. In $d = (-u \cdot e, -v
\cdot e, -n \cdot e)$, the first entry is unchanged and the other two change.
| $n$ | $u$ | $v$ | $d$ |
| --------- | ---- | --------- | --------- |
| different | same | different | different |
5. \c{Suppose a viewer located at the point $(0, 0, 0)$ is looking in the $-z$
direction, with no roll ['up' = $(0, 1 ,0)$], towards a cube of width 2,
centered at the point $(0, 0, -5)$, whose sides are colored: red at the plane
$x = 1$, cyan at the plane $x = -1$, green at the plane $y = 1$, magenta at
the plane $y = -1$, blue at the plane $z = -4$, and yellow at the plane $z =
-6$.}
a. \c{(1 point) What is the color of the cube face that the user sees?}
\boxed{\textrm{Blue}}
b. \c{(3 points) Because the eye is at the origin, looking down the $-z$ axis
with 'up' = $(0,1,0)$, the viewing transformation matrix $V$ in this case is
the identity $I$. What is the model matrix $M$ that you could use to rotate
the cube so that when the image is rendered, it shows the red side of the
cube?}
You would have to do a combination of (1) translate to the origin, (2) rotate
around the origin, and then (3) untranslate back. This way, the eye position
doesn't change.
$$
M =
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & -5 \\
0 & 0 & 0 & 1 \\
\end{bmatrix} \cdot
\begin{bmatrix}
0 & 0 & -1 & 0 \\
0 & 1 & 0 & 0 \\
1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 \\
\end{bmatrix} \cdot
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 5 \\
0 & 0 & 0 & 1 \\
\end{bmatrix}
=
\boxed{\begin{bmatrix}
0 & 0 & -1 & -5 \\
0 & 1 & 0 & 0 \\
1 & 0 & 0 & -5 \\
0 & 0 & 0 & 1 \\
\end{bmatrix}}
$$
To verify this, testing with an example point $(1, 1, -4)$ yields:
$$
\begin{bmatrix}
0 & 0 & -1 & -5 \\
0 & 1 & 0 & 0 \\
1 & 0 & 0 & -5 \\
0 & 0 & 0 & 1 \\
\end{bmatrix}
\cdot
\begin{bmatrix}
1 \\ 1 \\ -4 \\ 1
\end{bmatrix}
=
\begin{bmatrix}
-1 \\ 1 \\ -4 \\ 1
\end{bmatrix}
$$
c. \c{(4 points) Suppose now that you want to leave the model matrix $M$ as
the identity. What is the viewing matrix $V$ that you would need to use to
render an image of the scene from a re-defined camera configuration so that
when the scene is rendered, it shows the red side of the cube? Where is the
eye in this case and in what direction is the camera looking?}
For this, a different eye position will have to be used. Instead of looking
from the origin, you could view it from the red side, and then change the
direction so it's still pointing at the cube.
- eye is located at $(5, 0, -5)$
- viewing direction is $(-1, 0, 0)$
- $n = (1, 0, 0)$
- $u = up \times n = (0, 0, -1)$
- $v = n \times u = (0, 1, 0)$
- $d = (-5, 0, -5)$
The final viewing matrix is $\boxed{\begin{bmatrix}
0 & 0 & -1 & -5 \\
0 & 1 & 0 & 0 \\
1 & 0 & 0 & -5 \\
0 & 0 & 0 & 1 \\
\end{bmatrix}}$. Turns out it's the same matrix! Wow!
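
The coincidence between the two answers can be checked with a few lines of code. This is just a sketch with made-up helper names: it composes the translate, rotate, untranslate product from part (b), builds $V$ from the $u$, $v$, $n$ and eye values listed in part (c), and compares them.

```python
def matmul(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def translation(tx, ty, tz):
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Part (b): move the cube to the origin, rotate about y, move it back.
ROTATE = [[0, 0, -1, 0], [0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1]]
M = matmul(translation(0, 0, -5), matmul(ROTATE, translation(0, 0, 5)))

# Part (c): rows u, v, n with translation column (-u.e, -v.e, -n.e).
eye = (5, 0, -5)
u, v, n = (0, 0, -1), (0, 1, 0), (1, 0, 0)
V = [[*axis, -dot(axis, eye)] for axis in (u, v, n)] + [[0, 0, 0, 1]]

# The center of the red face (x = 1) should end up in front of the viewer.
red_center = [1, 0, -5, 1]
moved = [sum(M[r][c] * red_center[c] for c in range(4)) for r in range(4)]
print(M == V, moved)
```

The center of the red face lands at $(0, 0, -4)$, where the blue face's center used to be.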
## The Projection Transformation
6.
7.
8. \c{Consider the perspective projection-normalization matrix $P$ which maps
the contents of the viewing frustum into a cube that extends from -1 to 1 in
$x, y, z$ (called normalized device coordinates).}
\c{Suppose you want to define a square, symmetric viewing frustum with a near
clipping plane located 0.5 units in front of the camera, a far clipping plane
located 20 units from the front of the camera, a $60^\circ$ vertical field of
view, and a $60^\circ$ horizontal field of view.}
a. \c{(2 points) What are the entries in $P$?}
The left / right values are found by using the tangent of the field-of-view
triangle. The $60^\circ$ field of view is split evenly about the viewing axis,
so the half-angle is $30^\circ$: $\tan(30^\circ) = \frac{\textrm{right}}{0.5}$,
so $\textrm{right} = \tan(30^\circ) \times 0.5 = \boxed{\frac{\sqrt{3}}{6}}$.
The same goes for the vertical, which also yields $\frac{\sqrt{3}}{6}$.
$$
\begin{bmatrix}
\frac{2\times near}{right - left} & 0 & \frac{right + left}{right - left} & 0 \\
0 & \frac{2\times near}{top - bottom} & \frac{top + bottom}{top - bottom} & 0 \\
0 & 0 & -\frac{far + near}{far - near} & -\frac{2\times far\times near}{far - near} \\
0 & 0 & -1 & 0
\end{bmatrix}
$$
$$
= \begin{bmatrix}
\frac{2\times 0.5}{\frac{\sqrt{3}}{6} - (-\frac{\sqrt{3}}{6})} & 0 & \frac{\frac{\sqrt{3}}{6} + (-\frac{\sqrt{3}}{6})}{\frac{\sqrt{3}}{6} - (-\frac{\sqrt{3}}{6})} & 0 \\
0 & \frac{2\times 0.5}{\frac{\sqrt{3}}{6} - (-\frac{\sqrt{3}}{6})} & \frac{\frac{\sqrt{3}}{6} + (-\frac{\sqrt{3}}{6})}{\frac{\sqrt{3}}{6} - (-\frac{\sqrt{3}}{6})} & 0 \\
0 & 0 & -\frac{20 + 0.5}{20 - 0.5} & -\frac{2\times 20\times 0.5}{20 - 0.5} \\
0 & 0 & -1 & 0
\end{bmatrix}
$$
$$
= \boxed{\begin{bmatrix}
\sqrt{3} & 0 & 0 & 0 \\
0 & \sqrt{3} & 0 & 0 \\
0 & 0 & -\frac{41}{39} & -\frac{40}{39} \\
0 & 0 & -1 & 0
\end{bmatrix}}
$$
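
The frustum-to-matrix substitution can be sketched as a small function. It assumes each $60^\circ$ field of view is the full angle (so the half-angle of $30^\circ$ sets the window extents) and uses the same general matrix as above. The name `perspective` is made up for illustration.

```python
import math


def perspective(fov_x_deg, fov_y_deg, near, far):
    """Symmetric perspective projection-normalization matrix (row-major)."""
    right = near * math.tan(math.radians(fov_x_deg) / 2.0)  # half-angle!
    top = near * math.tan(math.radians(fov_y_deg) / 2.0)
    left, bottom = -right, -top
    return [
        [2 * near / (right - left), 0.0, (right + left) / (right - left), 0.0],
        [0.0, 2 * near / (top - bottom), (top + bottom) / (top - bottom), 0.0],
        [0.0, 0.0, -(far + near) / (far - near), -2 * far * near / (far - near)],
        [0.0, 0.0, -1.0, 0.0],
    ]


P = perspective(60, 60, 0.5, 20)
for row in P:
    print([round(c, 4) for c in row])
```

For a symmetric frustum the first two diagonal entries reduce to $\cot$ of the half-angle, which is $\sqrt{3}$ here.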
b. \c{(3 points) How should the matrix $P$ be re-defined if the viewing window
is re-sized to be twice as tall as it is wide?}
c. \c{(3 points) What are the new horizontal and vertical fields of view
after this change has been made?}
## Clipping
9. \c{Consider the triangle whose vertex positions, after the viewport
transformation, lie in the centers of the pixels: $p_0 = (3, 3), p_1 = (9,
5), p_2 = (11, 11)$.}
Starting at $p_0$, the three vectors are:
- $v_0 = p_1 - p_0 = (9 - 3, 5 - 3) = (6, 2)$
- $v_1 = p_2 - p_1 = (11 - 9, 11 - 5) = (2, 6)$
- $v_2 = p_0 - p_2 = (3 - 11, 3 - 11) = (-8, -8)$
The first edge vector $e$ would be $(6, 2)$, and the edge normal would be
that rotated by $90^\circ$.
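
The edge vectors above lead to one edge function per side. A minimal sketch follows, assuming the common 2D cross-product form $E(x, y) = e_x (y - s_y) - e_y (x - s_x)$ for an edge starting at $s$ with direction $e$. The function names are hypothetical, and the rule for pixels that land exactly on an edge is deliberately left out, since it depends on the convention used in class.

```python
def edge_function(start, end, point):
    """Positive left of start->end, negative right of it, zero on the line."""
    ex, ey = end[0] - start[0], end[1] - start[1]
    px, py = point[0] - start[0], point[1] - start[1]
    return ex * py - ey * px


def edge_values(triangle, point):
    """Evaluate all three edge functions, walking p0 -> p1 -> p2 -> p0."""
    p0, p1, p2 = triangle
    return (edge_function(p0, p1, point),
            edge_function(p1, p2, point),
            edge_function(p2, p0, point))


TRIANGLE = ((3, 3), (9, 5), (11, 11))
for pixel in ((6, 4), (7, 7), (10, 8)):
    print(pixel, edge_values(TRIANGLE, pixel))
```

With this vertex order, a pixel whose three values are all positive is strictly inside. A value of exactly zero means the pixel center lies on that edge, which is where a tie-breaking rule has to decide.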
a. \c{(6 points) Define the edge equations and tests that would be applied,
during the rasterization process, to each pixel $(x, y)$ within the bounding
rectangle $3 \le x \le 11, 3 \le y \le 11$ to determine if that pixel is
inside the triangle or not.}
b. \c{(3 points) Consider the three pixels $p_4 = (6, 4), p_5 = (7, 7)$, and
$p_6 = (10, 8)$. Which of these would be considered to lie inside the
triangle, according to the methods taught in class?}
10. \c{When a model contains many triangles that form a smoothly curving surface
patch, it can be inefficient to separately represent each triangle in the
patch independently as a set of three vertices because memory is wasted when
the same vertex location has to be specified multiple times. A triangle
strip offers a memory-efficient method for representing connected 'strips'
of triangles. For example, in the diagram below, the six vertices v0 .. v5
define four adjacent triangles: (v0, v1, v2), (v2, v1, v3), (v2, v3, v4),
(v4, v3, v5). [Notice that the vertex order is switched in every other
triangle to maintain a consistent counter-clockwise orientation.] Ordinarily
one would need to pass 12 vertex locations to the GPU to represent this
surface patch (three vertices for each triangle), but when the patch is
encoded as a triangle strip, only the six vertices need to be sent and the
geometry they represent will be interpreted using the correspondence pattern
just described.}
\c{(5 points) When triangle strips are clipped, however, things
can get complicated. Consider the short triangle strip shown below in the
context of a clipping cube.}
- \c{After the six vertices v0 .. v5 are sent to be clipped, what will the
vertex list be after clipping process has finished?}
- \c{How can this new result be expressed as a triangle strip? (Try to be as
efficient as possible)}
- \c{How many triangles will be encoded in the clipped triangle strip?}
## Ray Tracing vs Scan Conversion
11. \c{(8 points) List the essential steps in the scan-conversion (raster
graphics) rendering pipeline, starting with vertex processing and ending
with the assignment of a color to a pixel in a displayed image. For each
step briefly describe, in your own words, what is accomplished and how. You
do not need to include steps that we did not discuss in class, such as
tessellation (subdividing an input triangle into multiple subtriangles),
instancing (creating new geometric primitives from existing input vertices),
but you should not omit any steps that are essential to the process of
generating an image of a provided list of triangles.}
12. \c{(6 points) Compare and contrast the process of generating an image of a
scene using ray tracing versus scan conversion. Include a discussion of
outcomes that can be achieved using a ray tracing approach but not using a
scan-conversion approach, or vice versa, and explain the reasons why and why
not.}
With ray tracing, the process of generating pixels is very hierarchical.
The basic ray tracer was very simple, but the moment we even added shadows,
there were recursive rays that needed to be cast, not to mention the
jittering. None of those could be parallelized with the primary ray, because
a secondary ray cannot even be started until the intersection it originates
from has been computed. (For my ray tracer implementation, I already
parallelized as much as I could using the work-stealing library `rayon`.)
But with scan conversion, the majority of the work is just matrix
transformations over the geometry, which can be performed completely in
parallel with minimal branching (only depth testing is not exactly
parallel). The rasterization process is also massively parallelizable. This
makes it faster to do on GPUs, which are able to do a lot of independent
operations.