csci5607/exam-1/exam1.md
2023-03-03 01:58:45 -06:00

---
geometry: margin=2cm
output: pdf_document
title: Exam 1
subtitle: CSCI 5607
date: \today
author: |
| Michael Zhang
| zhan4854@umn.edu $\cdot$ ID: 5289259
---
\renewcommand{\c}[1]{\textcolor{gray}{#1}}
## Image File Formats
1. \c{(8 points) Suppose you have used your raytracing program to create an
image in the ASCII PPM file format. Other potential image file format options include:
JPEG, TIFF, and GIF. Discuss the main advantages and disadvantages in
converting your image from ASCII PPM to these other file types. Try to
identify at least one strength and one weakness of each option.}
ASCII PPM has the obvious deficiency of not having a concise representation
on disk (although the binary P6 variant mitigates this problem). On the plus
side, it is easy to lay out in a way that's easy to debug: for example, you can
put a blank line between the rows of the image and add a comment that makes
each row quick to identify.
JPEG is typically a lossy format, which means that while you can get very
high compression ratios and therefore small files, you also get compression
artifacts that degrade the quality of the image. ASCII PPM, by contrast, is
uncompressed and preserves the original image exactly.
## Color Spaces
2. \c{(8 points) Your raytracing program generates images using an RGB color
model. But RGB is not the only color space that artists or programmers use
when creating or working with digital images. Identify two other color spaces
that are commonly used in computer graphics or related fields. Describe their
most important characteristics, and discuss how they are similar to and/or
different from using an (r, g, b) representation. For each color space, try
to identify at least one use case where it might be preferred over using an
RGB representation and explain.}
Two other color spaces that are commonly used in computer graphics are CMYK
and YUV.
CMYK uses the secondary colors cyan, magenta, and yellow, as well as black.
This is primarily used for printing, because as opposed to when light is
emitted from a computer screen, reflected light from paper combines
differently. When applying more ink or combining ink together, the color
tends to get darker rather than lighter, so lighter base colors are
preferred. The black is added since it is not possible to create pure black
otherwise.
YUV is used more commonly in video formats. It stores luma (brightness) plus
two chroma components, the blue and red color differences, rather than
storing R, G, and B directly. While equally expressive as RGB, YUV is more
efficient in video since the chroma of an image doesn't change as frequently
over video.
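
To make the two conversions concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a production color pipeline: `rgb_to_cmyk` uses the naive profile-free formula (real print workflows use ICC color profiles), and `rgb_to_yuv` assumes the BT.601 luma weights. Both function names are hypothetical.

```python
def rgb_to_cmyk(r, g, b):
    """Naive RGB -> CMYK for components in [0, 1] (assumption: no color profile)."""
    k = 1.0 - max(r, g, b)            # black covers what the brightest channel lacks
    if k == 1.0:                      # pure black: avoid dividing by zero
        return (0.0, 0.0, 0.0, 1.0)
    c = (1.0 - r - k) / (1.0 - k)
    m = (1.0 - g - k) / (1.0 - k)
    y = (1.0 - b - k) / (1.0 - k)
    return (c, m, y, k)


def rgb_to_yuv(r, g, b):
    """RGB -> YUV using the BT.601 luma weights (an assumed variant of YUV)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luma
    u = 0.492 * (b - y)                     # blue-difference chroma
    v = 0.877 * (r - y)                     # red-difference chroma
    return (y, u, v)


print(rgb_to_cmyk(1.0, 0.0, 0.0))   # pure red needs magenta and yellow ink
print(rgb_to_yuv(1.0, 1.0, 1.0))    # white is all luma, with chroma near zero
```

Note that any gray input gives chroma values of (approximately) zero in YUV, which is what makes the luma/chroma split convenient to compress.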
## Color Compositing
3. \c{"Alpha-blending" or "color compositing" is a method for combining the
colors of one image with the colors of another. A typical alpha-blending
function is: $C_{final} = C_{fg} \cdot a_{fg} + C_{bg} \cdot (1 - a_{fg})$
where $C_{fg}$ is the color of the foreground pixel, $C_{bg}$ is the color of
the background pixel, $a_{fg}$ is the opacity of the foreground pixel (where
$a_{fg} = 1$ represents "fully opaque" and $a_{fg} = 0$ represents "fully
transparent"), and $a_{bg}$ is assumed to be 1. When $a_{fg} = 0.5$,
$C_{final} = \frac{C_{fg} + C_{bg}}{2}$ which is equivalent to taking the
average of the foreground and background colors. Yet, the results of using
alpha-blending to composite multiple transparent surfaces is different than
the results of averaging their colors.}
\c{Consider a pure green image and a pure blue image, each with $\alpha =
0.5$. If $C_{bg} = (0,0,0,1)$, what color would be obtained by:}
a) \c{(3 points) compositing the blue image over background, then the green
image over the result?}
In this case, compositing the blue $(0, 0, 1, 0.5)$ over the background
$(0, 0, 0, 1)$ would be $C_{final} = (0, 0, 1) \cdot 0.5 + (0, 0, 0) \cdot
(1 - 0.5) = (0, 0, 0.5)$.
Then, compositing green over this gives us: $C_{final} = (0, 1, 0) \cdot
0.5 + (0, 0, 0.5) \cdot (1 - 0.5) = (0, 0.5, 0) + (0, 0, 0.25) =
\boxed{(0, 0.5, 0.25)}$.
b) \c{(2 points) compositing the green image over background, then the blue
image over the result?}
The result will be the same, except with the blue and green components
reversed: $\boxed{(0, 0.25, 0.5)}$.
\c{When $a_{bg} \ne 1$, the alpha blending function
becomes slightly more complicated: $a_{final} = a_{fg} + a_{bg} \cdot (1 -
a_{fg})$; $C_{final} = \frac{C_{fg} \cdot a_{fg} + C_{bg} \cdot a_{bg} \cdot
(1 - a_{fg})}{a_{final}}$.}
c) \c{(3 points) What result do you obtain by compositing the blue image over
the green image, and then compositing that result over the background?}
First, we want to composite the blue image over the green image, which
both have an alpha of $0.5$:
$a_{final} = 0.5 + 0.5 \cdot (1 - 0.5) = 0.75$.
Then
\begin{align*}
C_{final} &= \frac{C_{fg} \cdot a_{fg} + C_{bg} \cdot a_{bg} \cdot (1 -
a_{fg})}{a_{final}} \\
&= \frac{(0, 0, 1) \cdot 0.5 + (0, 1, 0) \cdot 0.5 \cdot (1 - 0.5)}{0.75}
\\
&= \frac{(0, 0, 0.5) + (0, 0.25, 0)}{0.75} \\
&= \frac{(0, 0.25, 0.5)}{0.75} \\
&= (0, 0.\overline{333}, 0.\overline{666})
\end{align*}
Now, composite this over the background:
\begin{align*}
C_{final} &= C_{fg} \cdot a_{fg} + C_{bg} \cdot (1 - a_{fg}) \\
&= (0, 0.\overline{333}, 0.\overline{666}) \cdot 0.75 + (0, 0, 0) \cdot (1
- 0.75) \\
&= \boxed{(0, 0.25, 0.5)}
\end{align*}
Alpha is 1 at the end because of blending with the background.
d) \c{(2 points) How does your answer in (c) differ from what you would get
by simply averaging the blue and green images, and then superimposing that
result over the background?}
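Compositing does not weight the two layers equally. In (c), the blue image
on top contributes with weight $a_{fg} = 0.5$, while the green image beneath
it contributes only $a_{bg} \cdot (1 - a_{fg}) = 0.25$, so the combined layer
is $(0, 0.\overline{333}, 0.\overline{666})$ with an alpha of $0.75$. Simply
averaging the two images weights them equally, giving the color $(0, 0.5,
0.5)$, and if the averaged layer keeps an alpha of $0.5$ then more of the
background shows through when it is superimposed. Averaging therefore loses
both the bias toward the top layer and the extra opacity that comes from
stacking two semi-transparent surfaces.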
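
The compositing formulas in this section can be checked with a short Python sketch. The function name `over` and the `(r, g, b, a)` tuple layout are choices made for this illustration; colors are assumed to be non-premultiplied, matching the general formula given for $a_{bg} \ne 1$.

```python
def over(fg, bg):
    """Composite fg over bg; each is (r, g, b, a) with non-premultiplied color."""
    a_fg, a_bg = fg[3], bg[3]
    a_out = a_fg + a_bg * (1.0 - a_fg)
    if a_out == 0.0:                       # both layers fully transparent
        return (0.0, 0.0, 0.0, 0.0)
    rgb = tuple(
        (cf * a_fg + cb * a_bg * (1.0 - a_fg)) / a_out
        for cf, cb in zip(fg[:3], bg[:3])
    )
    return rgb + (a_out,)


blue = (0.0, 0.0, 1.0, 0.5)
green = (0.0, 1.0, 0.0, 0.5)
black = (0.0, 0.0, 0.0, 1.0)

print(over(green, over(blue, black)))   # part (a): blue first, then green on top
print(over(blue, over(green, black)))   # part (b): green first, then blue on top
print(over(over(blue, green), black))   # part (c): blue over green, then over black
```

When $a_{bg} = 1$ the general formula reduces to the simple one, so a single function covers all three parts. The operator is associative, so grouping the layers differently does not change the result as long as their front-to-back order stays the same.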
## Viewing Parameter Specifications
4. \c{(10 points) Suppose that you are trying to help a marketing manager come
up with the largest possible number to use to advertise the field of view
offered by their new head-mounted display device. If the vertical field of
view is $90^\circ$ and the horizontal field of view is $110^\circ$, what is
the diagonal field of view?}
The relationship between FOV angle and the length (whichever way we're
measuring) is $\tan(\frac{1}{2} \theta) = \frac{L}{2d}$, where $d$ is the
distance the viewer is from the screen.
Because the viewer will remain the same distance from the screen while all
these calculations are being made, we can just ignore it completely by
setting it to 1. So we have $\tan(\frac{1}{2} \theta) = \frac{L}{2}$.
Now, we can find $w$ and $h$ by reversing this into $L = 2\tan(\frac{1}{2}
\theta)$:
- $w = 2\tan(55^\circ)$
- $h = 2\tan(45^\circ)$
The diagonal $D$ can be calculated using the Pythagorean theorem:
- $D = \sqrt{w^2 + h^2} = 2\sqrt{\tan^2(55^\circ) + \tan^2(45^\circ)} \approx
3.49$
```
In [67]: from math import atan, degrees, radians, sqrt, tan
In [68]: w = 2 * tan(radians(55))
In [69]: h = 2 * tan(radians(45))
In [71]: sqrt(pow(w, 2) + pow(h, 2))
Out[71]: 3.486893591242196
```
Now, we can convert this length back into an angle by turning the original
equation around: $\theta = 2\tan^{-1}(\frac{1}{2} L)$, which gives us a value
of around $120^\circ$.
```
In [72]: 2 * atan(0.5 * _)
Out[72]: 2.1000651019312633
In [73]: degrees(_)
Out[73]: 120.32486704337242
```
## Geometry for Computer Graphics
5. \c{(4 points) Consider a triangle defined by the vertices: $v_0 = (0, 1, 2),
v_1 = (4, 1, -2), v_2 = (2, 2, 0)$, specified in counter-clockwise order.
What is the area of this triangle?}
The area of a triangle given two of its sides is $\frac{1}{2} |e_1 \times
e_2|$. In this case, $e_1 = v_1 - v_0 = (4, 0, -4)$, and $e_2 = v_2 - v_0 =
(2, 1, -2)$.
The cross product is $(4, 0, 4)$ (I used numpy to calculate this, see the
output below), which has magnitude $\sqrt{32}$. Half of this is
$\boxed{\sqrt{8}} = 2\sqrt{2} \approx 2.83$, which is the area.
```
In [1]: import numpy as np
In [2]: v0 = np.array([0,1,2])
In [3]: v1 = np.array([4,1,-2])
In [4]: v2 = np.array([2,2,0])
In [5]: e1 = v1 - v0 # (4, 0, -4)
In [6]: e2 = v2 - v0 # (2, 1, -2)
In [8]: np.cross(e1, e2)
Out[8]: array([4, 0, 4])
```
## Point-in-Polygon Testing
6. \c{Consider a triangle defined by $v_0 = (3, 0, -2)$, $v_1 = (2, 4, 0)$, $v_2
= (-2, 2, 2)$, in counterclockwise order, and a point $p = (2, 1, -1)$.}
a) \c{(9 points) What are the barycentric coordinates that define the
location of the point $p$ with respect to the locations of $v_0$, $v_1$,
and $v_2$?}
Using $v_0$ as the base, we can write $e_1 = v_1 - v_0 = (-1, 4, 2)$, and
$e_2 = v_2 - v_0 = (-5, 2, 4)$. We also need $e_p = p - v_0 = (-1, 1, 1)$.
Now we can proceed to find $\beta$ and $\gamma$.
Using the above, we get:
- $d_{11} = e_1 \cdot e_1 = (-1, 4, 2) \cdot (-1, 4, 2) = 1 + 16 + 4 = 21$
- $d_{12} = e_1 \cdot e_2 = (-1, 4, 2) \cdot (-5, 2, 4) = 5 + 8 + 8 = 21$
- $d_{22} = e_2 \cdot e_2 = (-5, 2, 4) \cdot (-5, 2, 4) = 25 + 4 + 16 =
45$
- $d_{1p} = e_1 \cdot e_p = (-1, 4, 2) \cdot (-1, 1, 1) = 1 + 4 + 2 = 7$
- $d_{2p} = e_2 \cdot e_p = (-5, 2, 4) \cdot (-1, 1, 1) = 5 + 2 + 4 = 11$
Plugging this into the formula, we get:
\begin{align*}
\begin{bmatrix}
e_1 \cdot e_1 & e_1 \cdot e_2 \\
e_1 \cdot e_2 & e_2 \cdot e_2 \\
\end{bmatrix}
\begin{bmatrix}
\beta \\ \gamma
\end{bmatrix}
&=
\begin{bmatrix}
e_1 \cdot e_p \\ e_2 \cdot e_p
\end{bmatrix} \\
\begin{bmatrix} 21 & 21 \\ 21 & 45 \end{bmatrix}
\begin{bmatrix} \beta \\ \gamma \end{bmatrix}
&=
\begin{bmatrix} 7 \\ 11 \end{bmatrix} \\
\end{align*}
Checking the determinant $\det \begin{vmatrix} 21 & 21 \\ 21 & 45
\end{vmatrix} = 21 \times 45 - 21 \times 21 = 504$, which is non-zero.
This means the matrix is invertible and we can solve for $\beta$ and
$\gamma$. Now we have:
- $\beta = (d_{22} d_{1p} - d_{12} d_{2p}) / \det = (45 \times 7 - 21
\times 11) / 504 = \frac{1}{6}$
- $\gamma = (d_{11} d_{2p} - d_{12} d_{1p}) / \det = (21 \times 11 - 21
\times 7) / 504 = \frac{1}{6}$
Then, $\alpha = 1 - (\beta + \gamma) = \frac{2}{3}$.
b) \c{(1 point) Does p lie: inside the triangle; outside of the triangle; or
on the edge of the triangle? (please circle the correct answer)}
Using these values, we can see that all three coordinates lie strictly
between 0 and 1 (and sum to 1), which means the point is inside the triangle.
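
The dot-product system above can be written as a small Python sketch. The helper names (`dot`, `sub`, `barycentric`) are introduced here for illustration only, and the sketch assumes that $p$ lies in the plane of the triangle, as it does in this problem.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def barycentric(p, v0, v1, v2):
    """Return (alpha, beta, gamma) of p with respect to triangle v0, v1, v2."""
    e1, e2, ep = sub(v1, v0), sub(v2, v0), sub(p, v0)
    d11, d12, d22 = dot(e1, e1), dot(e1, e2), dot(e2, e2)
    d1p, d2p = dot(e1, ep), dot(e2, ep)
    det = d11 * d22 - d12 * d12
    if det == 0:
        raise ValueError("degenerate triangle: edges are parallel")
    beta = (d22 * d1p - d12 * d2p) / det     # Cramer's rule on the 2x2 system
    gamma = (d11 * d2p - d12 * d1p) / det
    return (1.0 - beta - gamma, beta, gamma)


alpha, beta, gamma = barycentric((2, 1, -1), (3, 0, -2), (2, 4, 0), (-2, 2, 2))
print(alpha, beta, gamma)
print(all(0.0 < w < 1.0 for w in (alpha, beta, gamma)))   # strictly inside?
```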
## Smooth Shading
7. \c{(6 points) Consider a triangle defined by $v_0 = (1, 1, -2)$, $v_1 = (-1,
0, 1)$, $v_2 = (2, -1, -1)$, in counter-clockwise order, where the associated
normal directions are: $vn_0 = (\frac{2}{3}, -\frac{2}{3}, \frac{1}{3})$,
$vn_1 = (-\frac{2}{3}, \frac{1}{3}, \frac{2}{3})$, $vn_2 = (-\frac{1}{3},
\frac{2}{3}, \frac{2}{3})$. If the barycentric coordinates $\alpha = 0.3$,
$\beta = 0.6$, $\gamma = 0.1$ describe the location of $p$ with respect to
$v_0$, $v_1$, $v_2$, what is the smooth-shading surface normal direction at
$p$ (interpolated from the vertex normals) that should be used when computing
the Phong illumination at $p$?}
This is simply applying each of the barycentric coordinates (which is really
just a weight of how close $p$ is to a particular vertex) to the
corresponding vertex, and summing them. The corresponding vertex is the one
_opposite_ from the side that the coordinate defines.
\begin{align*}
n &= \alpha \cdot vn_0 + \beta \cdot vn_1 + \gamma \cdot vn_2 \\
&= 0.3 \cdot vn_0 + 0.6 \cdot vn_1 + 0.1 \cdot vn_2 \\
&= 0.3 \cdot \left(\frac{2}{3}, -\frac{2}{3}, \frac{1}{3} \right) + 0.6
\cdot \left(-\frac{2}{3}, \frac{1}{3}, \frac{2}{3} \right) + 0.1 \cdot
\left(-\frac{1}{3}, \frac{2}{3}, \frac{2}{3} \right) \\
&= (0.2, -0.2, 0.1) + (-0.4, 0.2, 0.4) + \left(-\frac{0.1}{3},
\frac{0.2}{3}, \frac{0.2}{3} \right) \\
&= \boxed{(-0.2\overline{333}, 0.0\overline{666}, 0.5\overline{666})}
\end{align*}
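
The weighted sum above is generally not unit length, even when every vertex normal is. Illumination equations usually assume a unit normal, so a renderer would typically normalize the interpolated vector before using it. The sketch below shows both steps; the function name `interpolate_normal` is hypothetical, and the normalization is an extra step beyond the calculation above.

```python
import math


def interpolate_normal(weights, normals):
    """Blend vertex normals by barycentric weights; return (raw, unit) vectors."""
    raw = tuple(
        sum(w * vn[i] for w, vn in zip(weights, normals)) for i in range(3)
    )
    length = math.sqrt(sum(c * c for c in raw))
    unit = tuple(c / length for c in raw)
    return raw, unit


vn0 = (2 / 3, -2 / 3, 1 / 3)
vn1 = (-2 / 3, 1 / 3, 2 / 3)
vn2 = (-1 / 3, 2 / 3, 2 / 3)

raw, unit = interpolate_normal((0.3, 0.6, 0.1), (vn0, vn1, vn2))
print(raw)    # the unnormalized weighted sum
print(unit)   # the same direction scaled to length 1
```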
## Phong Illumination
8. \c{(10 points) Suppose you are rendering a simple scene that contains a
single sphere of radius 3 centered at (0, 0, 0) and illuminated by a single
white point light source located at (0, 4, 2). Use the Blinn-Phong
illumination model to determine what color a viewer at (6, -2, 3) will
perceive at the point $p = (2, 2, 1)$ if the material properties of the
sphere are $k_a = 0.2$, $k_d = 0.6$, $k_s = 0.3$, $n=2$, $O_{d\lambda} = (1,
0, 0)$ and $O_{s\lambda} = (1, 1, 0)$. Please provide the $(r, g, b)$ color
definition and also name the color that this value corresponds to.}
First, we have to find the vectors needed for the Blinn-Phong equation. Each
one must be normalized to unit length before taking dot products:
- For the normal $\vec{N}$, we can just subtract the sphere center from the
intersection point, yielding $(2, 2, 1)$. This has length 3, so $\vec{N} =
(\frac{2}{3}, \frac{2}{3}, \frac{1}{3})$.
- For the vector to the viewer $\vec{V}$, we can subtract the intersection
point from the viewer location, yielding $(6, -2, 3) - (2, 2, 1) = (4, -4,
2)$. This has length 6, so $\vec{V} = (\frac{2}{3}, -\frac{2}{3},
\frac{1}{3})$.
- For the light direction $\vec{L}$, subtract the intersection point from the
light: $(0, 4, 2) - (2, 2, 1) = (-2, 2, 1)$. This has length 3, so $\vec{L} =
(-\frac{2}{3}, \frac{2}{3}, \frac{1}{3})$.
- The halfway direction is the normalized sum of the light and viewer
directions: $\vec{L} + \vec{V} = (0, 0, \frac{2}{3})$, so $\vec{H} = (0, 0,
1)$.
The dot products are then $\vec{N} \cdot \vec{L} = \frac{-4 + 4 + 1}{9} =
\frac{1}{9}$ and $\vec{N} \cdot \vec{H} = \frac{1}{3}$. Next, substitute the
values into the Blinn-Phong equation (taking the white light to have unit
intensity):
\begin{align*}
I_\lambda &= k_a O_{d\lambda} + k_d O_{d\lambda} (\vec{N} \cdot \vec{L}) +
k_s O_{s\lambda} (\vec{N} \cdot \vec{H})^n \\
I_\lambda &= 0.2 \cdot (1, 0, 0) + 0.6 \cdot (1, 0, 0) \cdot \frac{1}{9} +
0.3 \cdot (1, 1, 0) \cdot \left(\frac{1}{3}\right)^2 \\
I_\lambda &= (0.2, 0, 0) + (0.0\overline{6}, 0, 0) + (0.0\overline{3},
0.0\overline{3}, 0) \\
I_\lambda &= (0.3, 0.0\overline{3}, 0) \\
\end{align*}
No channel exceeds 1, so no clamping is needed and the result is
$\boxed{(0.3, 0.0\overline{3}, 0)}$. This is a dark red color.
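
As a cross-check, here is a minimal Python sketch of the Blinn-Phong evaluation that normalizes $\vec{N}$, $\vec{L}$, $\vec{V}$ and $\vec{H}$ before taking any dot product. The function and parameter names are hypothetical. It assumes a white light of unit intensity with no attenuation or shadowing, and it takes the viewer position $(6, -2, 3)$ used in the calculation of $\vec{V}$ above.

```python
import math


def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def blinn_phong(p, center, light, eye, ka, kd, ks, n, od, os_):
    """Blinn-Phong color at p on a sphere, for one white unit-intensity light."""
    N = normalize(sub(p, center))            # sphere normal points away from center
    L = normalize(sub(light, p))
    V = normalize(sub(eye, p))
    H = normalize(tuple(l + v for l, v in zip(L, V)))
    diffuse = max(dot(N, L), 0.0)            # clamp so back-facing light adds nothing
    specular = max(dot(N, H), 0.0) ** n
    return tuple(
        min(1.0, ka * d + kd * d * diffuse + ks * s * specular)
        for d, s in zip(od, os_)
    )


color = blinn_phong(
    p=(2, 2, 1), center=(0, 0, 0), light=(0, 4, 2), eye=(6, -2, 3),
    ka=0.2, kd=0.6, ks=0.3, n=2, od=(1, 0, 0), os_=(1, 1, 0),
)
print(color)
```

Skipping the normalization scales each dot product by the product of the two vector lengths, which inflates the diffuse and specular terms.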
## Ray-Object Intersection
9. \c{(14 points) The set of all points on the surface of an infinite,
circularly symmetric, one-sheet hyperboloid, centered at the origin and
aligned with the z axis, can be defined by the implicit equation:
$\frac{x^2}{r^2} + \frac{y^2}{r^2} - \frac{z^2}{c^2} = 1$ where $r$ and $c$
are parameters that control the radius and curvature of the hyperboloid, and
the set of all points $(x, y, z)$ for which the implicit equation is true lie
on the surface of the hyperboloid. Let $r = 2$ and $c = 1$, to obtain the
surface shown at the right.}
\c{Consider a viewing ray that starts at $p_0 = (-2, 1, -1)$ and travels in
the direction $d = (0, 1, 1)$. Does this ray intersect this surface? If so,
at what point(s)?}
![](./problem-9.png){width=300px}
Starting by substituting in the ray equation, we would get:
\begin{align*}
\frac{(x_0 + tx_d)^2}{r^2} +
\frac{(y_0 + ty_d)^2}{r^2} -
\frac{(z_0 + tz_d)^2}{c^2} &= 1 \\
\frac{x_0^2 + 2tx_0x_d + t^2x_d^2}{r^2} +
\frac{y_0^2 + 2ty_0y_d + t^2y_d^2}{r^2} -
\frac{z_0^2 + 2tz_0z_d + t^2z_d^2}{c^2} &= 1 \\
\left(
\frac{x_d^2}{r^2} + \frac{y_d^2}{r^2} - \frac{z_d^2}{c^2}
\right) t^2 +
2 \left(
\frac{x_0x_d}{r^2} + \frac{y_0y_d}{r^2} - \frac{z_0z_d}{c^2}
\right) t +
\left(
\frac{x_0^2}{r^2} + \frac{y_0^2}{r^2} - \frac{z_0^2}{c^2} - 1
\right) &= 0
\end{align*}
Then, substituting in the values for $r$, $c$, and the ray parameters gives:
\begin{align*}
\left(
\frac{0^2}{2^2} + \frac{1^2}{2^2} - \frac{1^2}{1^2}
\right) t^2 +
2 \left(
\frac{-2 \cdot 0}{2^2} + \frac{1 \cdot 1}{2^2} - \frac{-1 \cdot 1}{1^2}
\right) t +
\left(
\frac{(-2)^2}{2^2} + \frac{1^2}{2^2} - \frac{(-1)^2}{1^2} - 1
\right) = 0 \\
\left( 0 + \frac{1}{4} - 1 \right) t^2 +
2 \left( \frac{1}{4} + 1 \right) t +
\left( 1 + \frac{1}{4} - 1 - 1 \right) = 0 \\
-\frac{3}{4} t^2 + \frac{5}{2} t -\frac{3}{4} = 0
\end{align*}
Now we can just use the quadratic formula with $A = -\frac{3}{4}$, $B =
\frac{5}{2}$, and $C = -\frac{3}{4}$. Quickly checking the discriminant: $B^2 -
4AC = \left(\frac{5}{2}\right)^2 - 4 \cdot \left(-\frac{3}{4}\right)^2 =
\frac{25}{4} - \frac{9}{4} = \frac{16}{4} = 4$, which is positive so there are
\boxed{\textrm{two solutions}}.
To get the points themselves, we solve for $t$:
\begin{align*}
t &= \frac{-\frac{5}{2} \pm \sqrt{4}}{2 \cdot -\frac{3}{4}} \\
t &= \frac{-\frac{5}{2} \pm 2}{-\frac{3}{2}} \\
t &= \frac{-5 \pm 4}{-3} \\
t &\in \{\frac{1}{3}, 3\}
\end{align*}
Both values of $t$ are positive, so both intersections lie in front of the ray
origin. Now, just substitute each $t$ back into the ray equation $p(t) = p_0 +
td$ to get
\begin{align*}
p\left(\tfrac{1}{3}\right) &= (-2, 1, -1) + \frac{1}{3} (0, 1, 1) \\
&= (-2, 1, -1) + (0, \frac{1}{3}, \frac{1}{3}) \\
&= (-2, \frac{4}{3}, -\frac{2}{3})
\end{align*}
\begin{align*}
p(3) &= (-2, 1, -1) + 3 (0, 1, 1) \\
&= (-2, 1, -1) + (0, 3, 3) \\
&= (-2, 4, 2)
\end{align*}
The ray intersects the hyperboloid at \boxed{(-2, \frac{4}{3}, -\frac{2}{3})
\textrm{ and } (-2, 4, 2)}.
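
The derivation above translates directly into code. The following Python sketch builds the quadratic coefficients from the ray and returns the hits with $t \ge 0$; the function name `intersect_hyperboloid` and its return format are choices made for this illustration.

```python
import math


def intersect_hyperboloid(p0, d, r, c):
    """Hits of the ray p0 + t*d (t >= 0) with x^2/r^2 + y^2/r^2 - z^2/c^2 = 1.

    Returns a list of (t, point) pairs sorted by increasing t.
    """
    x0, y0, z0 = p0
    xd, yd, zd = d
    A = (xd * xd + yd * yd) / r**2 - zd * zd / c**2
    B = 2.0 * ((x0 * xd + y0 * yd) / r**2 - z0 * zd / c**2)
    C = (x0 * x0 + y0 * y0) / r**2 - z0 * z0 / c**2 - 1.0
    if A == 0.0:                       # equation degenerates to B*t + C = 0
        ts = [] if B == 0.0 else [-C / B]
    else:
        disc = B * B - 4.0 * A * C
        if disc < 0.0:                 # no real roots: the ray misses the surface
            return []
        root = math.sqrt(disc)
        ts = sorted({(-B - root) / (2.0 * A), (-B + root) / (2.0 * A)})
    return [(t, tuple(p + t * di for p, di in zip(p0, d))) for t in ts if t >= 0.0]


for t, point in intersect_hyperboloid((-2, 1, -1), (0, 1, 1), r=2, c=1):
    print(t, point)
```

Unlike a sphere, the coefficient $A$ can be zero or negative here, which is why the sketch handles the linear case separately and sorts the roots instead of assuming which one is nearer.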
## Texture Mapping
10. \c{(10 points) Suppose that a texture image has been mapped onto the surface
of a truncated cylinder such that the values of $u$ increase from 0 to 1 as
the values of $\theta$ increase from $0$ to $2\pi$, and the values of $v$
decrease from 1 to 0 as the position along the cylinder's height increases
from its base to its top. Consider a cylinder of radius 2 and height 4,
whose base is centered at $(0, 0, 0)$. What is the texture coordinate $(u,
v)$ at the point $p = (-\sqrt{2}, \sqrt{2}, 1)$?}
The point $(-\sqrt{2}, \sqrt{2})$ occurs at $\theta = \frac{3}{4}\pi$, so
when mapped to $u$, this corresponds to a value of $u = \frac{3}{4}\pi / 2\pi
= \frac{3}{8}$.
The height of 1 is a quarter of the way up the cylinder (which spans heights
0 to 4). Since $v$ decreases from 1 at the base to 0 at the top, this
corresponds to $v = 1 - \frac{1}{4} = 0.75$.
The coordinate is $\boxed{(\frac{3}{8}, 0.75)}$.
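
The mapping can be sketched in Python as follows. This assumes the cylinder's axis runs along $z$ with the base at $z = 0$, and that $\theta$ is measured counter-clockwise from the $+x$ axis. The problem does not state these conventions explicitly, and the function name `cylinder_uv` is hypothetical.

```python
import math


def cylinder_uv(p, height):
    """Texture coordinate for point p on a z-aligned cylinder with base at z = 0."""
    x, y, z = p
    theta = math.atan2(y, x) % (2.0 * math.pi)   # wrap the angle into [0, 2*pi)
    u = theta / (2.0 * math.pi)                  # u grows from 0 to 1 with theta
    v = 1.0 - z / height                         # v falls from 1 (base) to 0 (top)
    return (u, v)


print(cylinder_uv((-math.sqrt(2), math.sqrt(2), 1.0), height=4.0))
```

The radius does not appear in the mapping because only the angle around the axis and the height along it matter.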
11. \c{Consider a texture image that is 1024 pixels wide and 512 pixels tall. If
bi-linear interpolation is used to retrieve a color from this image
corresponding to the texture coordinate (u, v) = (0.6, 0.2):}
a) \c{(4 points) From which four pixels $[i, j]$ in the image will the
texture colors be retrieved?}
Multiplying it out, we get the exact coordinates (0.6 * 1024, 0.2 * 512)
= (614.4, 102.4). This means, we will be sampling from:
- (614, 102)
- (614, 103)
- (615, 102)
- (615, 103)
b) \c{(6 points) Suppose that the colors in the texture image have been
defined to be white when $i$ and $j$ are both even numbers, red when $i$
is even and $j$ is odd, green when $i$ is odd and $j$ is even, and blue
when both $i$ and $j$ are odd. What color will be returned from the
texture lookup, after the retrieved colors have been appropriately
combined using bilinear interpolation?}
Using this information, the colors corresponding to each of the above
are:
- (614, 102): white
- (614, 103): red
- (615, 102): green
- (615, 103): blue
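Then, we need to weight these according to how close the sample point is to
each of those four corners. The fractional offsets are 0.4 in both $i$ and
$j$, so along each axis the nearer pixel gets weight 0.6 and the farther one
gets weight 0.4: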
- (614, 102): 0.6 * 0.6 * (1, 1, 1) = 0.36 * (1, 1, 1) = (0.36, 0.36,
0.36)
- (614, 103): 0.6 * 0.4 * (1, 0, 0) = 0.24 * (1, 0, 0) = (0.24, 0, 0)
- (615, 102): 0.4 * 0.6 * (0, 1, 0) = 0.24 * (0, 1, 0) = (0, 0.24, 0)
- (615, 103): 0.4 * 0.4 * (0, 0, 1) = 0.16 * (0, 0, 1) = (0, 0, 0.16)
Taking the sum, we get (0.36 + 0.24, 0.36 + 0.24, 0.36 + 0.16) =
\boxed{(0.6, 0.6, 0.52)}.
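
The lookup can be reproduced with a short Python sketch. It uses the same scaling as the answer above (pixel position = $u \cdot$ width, $v \cdot$ height); other conventions, such as scaling by width $- 1$, would select different pixels. The names `texel` and `bilinear_lookup` are hypothetical, and `texel` implements the parity rule from part (b) instead of reading a real image.

```python
import math


def texel(i, j):
    """Procedural texture from part (b): color depends on the parity of i and j."""
    if i % 2 == 0:
        return (1.0, 1.0, 1.0) if j % 2 == 0 else (1.0, 0.0, 0.0)   # white / red
    return (0.0, 1.0, 0.0) if j % 2 == 0 else (0.0, 0.0, 1.0)       # green / blue


def bilinear_lookup(u, v, width, height):
    x, y = u * width, v * height
    i, j = math.floor(x), math.floor(y)       # lower-index corner of the 2x2 block
    fx, fy = x - i, y - j                     # fractional offsets inside the block
    corners = [
        ((i, j), (1 - fx) * (1 - fy)),
        ((i, j + 1), (1 - fx) * fy),
        ((i + 1, j), fx * (1 - fy)),
        ((i + 1, j + 1), fx * fy),
    ]
    color = [0.0, 0.0, 0.0]
    for (ci, cj), weight in corners:
        for k, channel in enumerate(texel(ci, cj)):
            color[k] += weight * channel
    return tuple(color)


print(bilinear_lookup(0.6, 0.2, 1024, 512))
```

The four weights always sum to 1, so the result stays within the range of the corner colors.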