Michael Zhang 2023-10-09 04:30:22 -05:00
parent e0d966087d
commit bfcb5764d6
3 changed files with 8 additions and 5 deletions

View file

@@ -7,7 +7,7 @@ CFLAGS := -std=c11 -fopenmp \
 LDFLAGS := -std=c11 -fopenmp -L/opt/homebrew/opt/libomp/lib -O3
 RUST_SOURCES := $(shell find . -name "*.rs")
-all: lc_openmp lc_pthreads handin
+all: lc_openmp lc_pthreads
 handin: zhan4854.tar.gz

View file

@@ -38,7 +38,7 @@ int main(int argc, char **argv) {
 #pragma omp parallel for default(shared)
 for (int i = 0; i < data->dimensions; i++) {
-// #pragma omp parallel for default(shared)
+#pragma omp parallel for default(shared) if (thread_count > data->dimensions)
 for (int j = 0; j < data->rows; j++) {
 FLOAT x_ni_w_ni = 0;
@@ -56,6 +56,7 @@ int main(int argc, char **argv) {
 FLOAT numer = 0, denom = 0;
 // #pragma omp parallel for default(shared) reduction(+ : numer, denom)
+// if(thread_count > data->dimensions)
 for (int j = 0; j < data->rows; j++) {
 FLOAT xij = data->buf[data->rows * i + j];
 numer += xij * inner_calc[data->rows * i + j];
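
For reference, here is a minimal standalone sketch (not the assignment code) of how OpenMP's `if()` and `reduction()` clauses behave on a nested loop shaped like the one above. `n_outer`, `n_inner`, and `thread_count` are placeholder names; the apparent intent of `if (thread_count > data->dimensions)` is to parallelize the inner loop only when the outer loop alone cannot keep every thread busy.

```c
// Sketch only, not the assignment code: n_outer stands in for
// data->dimensions, n_inner for data->rows.
#include <omp.h>
#include <stdio.h>

int main(void) {
  const int n_outer = 4;
  const long n_inner = 1000000;
  const int thread_count = omp_get_max_threads();
  double total = 0.0;

  // Outer loop: always parallel, one group of "dimensions" per thread.
  #pragma omp parallel for default(shared) reduction(+ : total)
  for (int i = 0; i < n_outer; i++) {
    double partial = 0.0;
    // Inner loop: the if() clause only requests a nested parallel team when
    // the outer loop alone cannot occupy every thread; when the condition is
    // false the region runs on a single thread, i.e. effectively serially.
    #pragma omp parallel for default(shared) reduction(+ : partial) if (thread_count > n_outer)
    for (long j = 0; j < n_inner; j++) {
      partial += (double)(i + 1) / (double)(j + 1);
    }
    total += partial;
  }

  printf("total = %f (max threads = %d)\n", total, thread_count);
  return 0;
}
```

Note that the nested inner team only actually forks when nested parallelism is enabled (for example `OMP_MAX_ACTIVE_LEVELS=2` in the environment); otherwise the inner pragma falls back to one thread regardless of the `if()` condition.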

View file

@@ -63,8 +63,10 @@ author: |
 This data was generated using the `run_benchmark.sh > out.txt` script.
-Small note: There's a part at the end of the program that validates the trained model using a train/test split of the data. I didn't count this towards execution time, but felt it was important enough to keep since it ensured that my program was still behaving correctly.
 ## NOTES
 ```
 I noticed that the loss sometimes fluctuates rather wildly. I think this is because there's no fixed learning rate: instead of moving incrementally, we just take each dimension's minimizer and combine them all at once. In Wikipedia's description[^1] of the algorithm, they pick the single $w_i$ whose update gives the minimal loss _by itself_, and only apply that $w_i$ in that outer iteration. I wonder whether that would converge better, but for the sake of the assignment I stuck to the algorithm described in the PDF, since I'm guessing the effectiveness of the machine learning model isn't the important part here.
 ```
 [^1]: https://en.wikipedia.org/wiki/Coordinate_descent
+Also, there's a part at the end of the program that validates the trained model using a train/test split of the data. I didn't count this towards execution time, but felt it was important enough to keep since it ensured that my program was still behaving correctly.
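
To make the comparison in the notes concrete, here is a minimal standalone sketch (not the assignment code) of the two update strategies, assuming a plain least-squares loss, which may not match the assignment's actual objective; `X`, `y`, `best_wi`, and the other names are illustrative only.

```c
// Sketch only, not the assignment code: contrasts the two coordinate-descent
// strategies described above on a tiny least-squares problem.
#include <stdio.h>

#define ROWS 4
#define DIMS 2

// loss(w) = sum_j (x_j . w - y_j)^2
static double loss(const double X[ROWS][DIMS], const double y[ROWS],
                   const double w[DIMS]) {
  double total = 0.0;
  for (int j = 0; j < ROWS; j++) {
    double pred = 0.0;
    for (int i = 0; i < DIMS; i++) pred += X[j][i] * w[i];
    total += (pred - y[j]) * (pred - y[j]);
  }
  return total;
}

// Closed-form minimizer of the loss in coordinate i with every other
// coordinate held fixed ("each dimension's minimum").
static double best_wi(const double X[ROWS][DIMS], const double y[ROWS],
                      const double w[DIMS], int i) {
  double numer = 0.0, denom = 0.0;
  for (int j = 0; j < ROWS; j++) {
    double others = 0.0;
    for (int k = 0; k < DIMS; k++)
      if (k != i) others += X[j][k] * w[k];
    numer += X[j][i] * (y[j] - others);
    denom += X[j][i] * X[j][i];
  }
  return denom != 0.0 ? numer / denom : w[i];
}

int main(void) {
  const double X[ROWS][DIMS] = {{1, 2}, {2, 1}, {3, 4}, {4, 3}};
  const double y[ROWS] = {5, 4, 11, 10}; // consistent with w* = (1, 2)
  double w_sweep[DIMS] = {0, 0};  // strategy described in the notes
  double w_greedy[DIMS] = {0, 0}; // Wikipedia-style variant

  for (int iter = 0; iter < 20; iter++) {
    // Strategy A: compute every dimension's minimizer from the current
    // weights, then apply them all at once.
    double next[DIMS];
    for (int i = 0; i < DIMS; i++) next[i] = best_wi(X, y, w_sweep, i);
    for (int i = 0; i < DIMS; i++) w_sweep[i] = next[i];

    // Strategy B: try each single-coordinate update, keep only the one that
    // yields the smallest loss by itself, and leave the rest untouched.
    int best_i = 0;
    double best_loss = 0.0, best_val = 0.0;
    for (int i = 0; i < DIMS; i++) {
      double trial[DIMS];
      for (int k = 0; k < DIMS; k++) trial[k] = w_greedy[k];
      trial[i] = best_wi(X, y, w_greedy, i);
      double l = loss(X, y, trial);
      if (i == 0 || l < best_loss) {
        best_loss = l;
        best_i = i;
        best_val = trial[i];
      }
    }
    w_greedy[best_i] = best_val;
  }

  printf("all-at-once: loss = %g\n", loss(X, y, w_sweep));
  printf("greedy:      loss = %g\n", loss(X, y, w_greedy));
  return 0;
}
```

Because each greedy step keeps only a single-coordinate minimizer, that variant never increases the loss from one iteration to the next, whereas applying all per-dimension minimizers simultaneously can overshoot, which is consistent with the fluctuation described in the notes.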