Michael Zhang 2023-10-21 16:42:47 -05:00
parent a74a9d6e0f
commit ac35e842d8
4 changed files with 21 additions and 6 deletions

Eigenfaces.m

@@ -8,5 +8,6 @@ function [] = Eigenfaces(training_data, test_data)
% TODO: perform PCA
% TODO: show the first 5 eigenvectors (see homework for example)
imagesc(reshape(faces_data(i,1:end-1),32,30)')
end % Function end
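A hedged sketch of one way to complete the two TODOs above — not the committed solution; it assumes, as the `imagesc` call suggests, that each row of `faces_data` holds one 32x30 image plus a trailing label column:

pixels = faces_data(:, 1:end-1);            % drop the trailing label column
[coeff, ~, ~] = pca(pixels);                % columns of coeff are the eigenvectors ("eigenfaces")
figure;
for j = 1:5
    subplot(1, 5, j);
    imagesc(reshape(coeff(:, j), 32, 30)'); % same reshape/transpose as the raw faces
    colormap gray; axis image off;
    title(sprintf('Eigenface %d', j));
end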

Typst report

@@ -37,4 +37,8 @@
b. #c[*(20 points)* Generate a plot of proportion of variance (see Figure 6.4 (b) in the main textbook) on the training data, and select the minimum number ($K$) of eigenvectors that explain at least 90% of the variance. Show both the plot and $K$ in the report. This can be accomplished by completing the _TODO_ headers in the `ProportionOfVariance.m` script. Project the training and test data to the $K$ principal components and run KNN on the projected data for $k = {1, 3, 5, 7}$. Print out the error rate on the test set. Implement your own version of a K-Nearest Neighbor (KNN) classifier for this problem. Classify each test point using a majority rule, i.e., by choosing the most common class among the $k$ training points it is closest to. In the case where two classes are equally frequent, perform a tie-breaker by choosing whichever class has, on average, a smaller distance to the test point. This can be accomplished by completing the _TODO_ comment headers in the `KNN.m` and `KNN_Error.m` scripts.]
#figure(image("images/prop_var.png"))
I used $K = 41$. (Hedged sketches of the back-projection and of the tie-breaking KNN follow after part c below.)
c. #c[*(20 points)* Use the first $K = {10, 50, 100}$ principal components to approximate the first five images of the training set (first rows of the data matrix) by projecting the centered data onto the first $K$ principal components, then "back project" (weighted sum of the components) to the original space and add the mean. For each $K$, plot the reconstructed image. This can be accomplished by completing the _TODO_ comment headers in the `Back_Project.m` script. Explain your observations in the report.]
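The reconstruction in part c reduces to: center, project onto the first $K$ components, back-project, and re-add the mean. A minimal sketch, not the actual `Back_Project.m`; it assumes `X` is an n x 960 matrix of pixel rows with labels already stripped:

mu = mean(X, 1);
[coeff, ~, ~] = pca(X);                         % columns of coeff are the principal components
for K = [10 50 100]
    V = coeff(:, 1:K);                          % first K components (960 x K)
    X_hat = (X - mu) * V * V' + mu;             % center, project, back-project, re-add the mean
    figure;
    for j = 1:5
        subplot(1, 5, j);
        imagesc(reshape(X_hat(j,:), 32, 30)');  % same 32x30-transposed layout as the raw faces
        colormap gray; axis image off;
    end
end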
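And a minimal sketch of the tie-breaking KNN from part b — the function signature, Euclidean metric, and data layout (rows are points, labels in a separate vector) are assumptions, not the actual `KNN.m`:

function labels = KNN(train_X, train_y, test_X, k)
% Classify each row of test_X by majority vote among its k nearest training
% points; break ties by the smaller mean distance to the test point.
m = size(test_X, 1);
labels = zeros(m, 1);
for i = 1:m
    d = sqrt(sum((train_X - test_X(i,:)).^2, 2)); % Euclidean distances (implicit expansion, R2016b+)
    [d_sorted, order] = sort(d);
    neigh = train_y(order(1:k));                  % labels of the k nearest neighbors
    d_k = d_sorted(1:k);
    classes = unique(neigh);
    votes = arrayfun(@(c) sum(neigh == c), classes);
    tied = classes(votes == max(votes));          % one or more majority classes
    if numel(tied) == 1
        labels(i) = tied;
    else
        % tie-breaker: tied class with the smaller average distance to the test point
        avg_d = arrayfun(@(c) mean(d_k(neigh == c)), tied);
        [~, j] = min(avg_d);
        labels(i) = tied(j);
    end
end
end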

ProportionOfVariance.m

@@ -7,11 +7,21 @@ function [neigenvectors] = ProportionOfVariance(training_data)
% stack data
data = vertcat(training_data);
% perform PCA
[~,~,latent] = pca(data);
% compute proportion of variance explained
all_eigs = sum(latent);
prop_var = cumsum(latent) / all_eigs;
% show figure of proportion of variance explained where the x-axis is the number of eigenvectors and the y-axis is the percentage of
% variance explained
subplot(2,1,1);
plot(1:numel(latent), latent); % variance (eigenvalue) per component
subplot(2,1,2);
plot(1:numel(prop_var), prop_var); % cumulative proportion of variance explained
neigenvectors = find(prop_var >= 0.9, 1); % smallest K explaining at least 90% of the variance
end % Function end
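For context, a hedged sketch of how this function could feed part b's projection step — `train_pixels` and `test_pixels` are assumed names, not from the repo:

K = ProportionOfVariance(train_pixels);            % 41 on this data, per the report
[coeff, ~, ~] = pca(train_pixels);
mu = mean(train_pixels, 1);
train_proj = (train_pixels - mu) * coeff(:, 1:K);  % project onto the first K components
test_proj  = (test_pixels  - mu) * coeff(:, 1:K);  % center the test set with the training mean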

images/prop_var.png (binary image added, 28 KiB; not shown)