Michael Zhang 2021-09-06 16:04:46 -05:00
parent 609785d503
commit 33437b95c1
Signed by: michael
GPG key ID: BDA47A31A3C8EE6B
21 changed files with 363011 additions and 363011 deletions

@ -1,216 +1,216 @@
# Non-Personalized Recommender Assignment

In this assignment, you will implement some non-personalized recommenders. In particular, you will
implement raw and damped item mean recommenders and simple and advanced association rule
recommenders.

You will implement these recommenders in the LensKit toolkit.

## Downloads and Resources

- Project template (from Coursera)
- [LensKit for Teaching website](http://mooc.lenskit.org) (links to relevant documentation)
- [JavaDoc for included code](http://mooc.lenskit.org/assignments/nonpers/javadoc/)
- [Fastutil API docs](http://fastutil.di.unimi.it/docs/) documents the Fastutil optimized data
  structure classes that are used in portions of LensKit.

The project template contains support code, the build file, and the input data that you will use.

## Input Data

The input data contains the following files:

- `ratings.csv` contains user ratings of movies
- `movies.csv` contains movie titles
- `movielens.yml` is a LensKit data manifest that describes the other input files

## Getting Started

To get started with this assignment, unpack the template and import it into your IDE as a Gradle
project. The assignment video demonstrates how to do this in IntelliJ IDEA.

## Mean-Based Recommendation

The first two recommenders you will implement will recommend items with the highest average rating.
With LensKit's scorer-model-builder architecture, you will just need to write the recommendation
logic once, and you will implement two different mechanisms for computing item mean ratings.

You will work with the following classes:

- `MeanItemBasedItemRecommender` (the *item recommender*) computes top-*N* recommendations based
  on mean ratings. You will implement the logic to compute such recommendation lists.
- `ItemMeanModel` is a *model class* that stores precomputed item means. You will not need to
  modify this class, but you will write code to construct instances of it and use it in your
  item recommender implementation.
- `ItemMeanModelProvider` computes item mean ratings from rating data and constructs the model.
  It computes raw means with no damping.
- `DampedItemMeanModelProvider` is an alternate builder for item mean models that computes
  damped means instead of raw means. It takes the damping term as a parameter. The configuration
  file we provide you uses a damping term of 5.

There are `// TODO` comments in all places where you need to write new code.

### Computing Item Means

Modify the `ItemMeanModelProvider` class to compute the mean rating for each item.
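
As a rough illustration of the computation (not the LensKit implementation), the sketch below accumulates a sum and a count per item and then divides. The class name `ItemMeanSketch`, the method `itemMeans`, and the parallel-array input are all hypothetical stand-ins for the rating data the template gives you. It avoids Java 8 features because the provided build file sets `sourceCompatibility = 1.7`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the assignment template.
final class ItemMeanSketch {
    /**
     * Compute the raw mean rating per item.
     * items[k] is the item ID of the k-th rating and values[k] is its value.
     */
    static Map<Long, Double> itemMeans(long[] items, double[] values) {
        Map<Long, Double> sums = new HashMap<>();
        Map<Long, Integer> counts = new HashMap<>();
        for (int k = 0; k < items.length; k++) {
            Long item = items[k];
            Double oldSum = sums.get(item);
            Integer oldCount = counts.get(item);
            sums.put(item, (oldSum == null ? 0.0 : oldSum) + values[k]);
            counts.put(item, (oldCount == null ? 0 : oldCount) + 1);
        }
        // Divide each sum by its count to get the mean.
        Map<Long, Double> means = new HashMap<>();
        for (Map.Entry<Long, Double> e : sums.entrySet()) {
            means.put(e.getKey(), e.getValue() / counts.get(e.getKey()));
        }
        return means;
    }

    public static void main(String[] args) {
        long[] items = {1L, 1L, 2L};
        double[] values = {4.0, 5.0, 3.0};
        // Item 1 has mean 4.5 and item 2 has mean 3.0.
        System.out.println(itemMeans(items, values));
    }
}
```

In the real provider you would read ratings from the data source the template supplies and store the result in an `ItemMeanModel`; the accumulation step is the same idea.
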

### Recommending Items

Modify the `MeanItemBasedItemRecommender` class to compute recommendations based on item mean
ratings. For this, you need to:

1. Obtain the mean rating for each item
2. Order the items in decreasing order of mean rating
3. Return the *N* highest-rated items
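
The three steps above can be sketched in plain Java as follows. This is only an illustration under assumptions: `TopNSketch` and `topN` are hypothetical names, the means are passed in as an ordinary `Map` rather than an `ItemMeanModel`, ties are broken by item ID, and a negative `n` is treated as "no limit" purely as a choice of this sketch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the assignment template.
final class TopNSketch {
    /** Return the IDs of the n items with the highest mean, best first. */
    static List<Long> topN(final Map<Long, Double> means, int n) {
        // Step 1: the candidate items are the keys of the mean map.
        List<Long> items = new ArrayList<>(means.keySet());
        // Step 2: sort by mean, descending; break ties by item ID.
        Collections.sort(items, new Comparator<Long>() {
            @Override
            public int compare(Long a, Long b) {
                int c = Double.compare(means.get(b), means.get(a));
                return c != 0 ? c : a.compareTo(b);
            }
        });
        // Step 3: keep the first n (assumption: negative n means no limit).
        int k = (n < 0) ? items.size() : Math.min(n, items.size());
        return new ArrayList<>(items.subList(0, k));
    }

    public static void main(String[] args) {
        Map<Long, Double> means = new HashMap<>();
        means.put(1L, 3.0);
        means.put(2L, 4.5);
        means.put(3L, 4.0);
        System.out.println(topN(means, 2)); // prints [2, 3]
    }
}
```
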

### Computing Damped Item Means

Modify the `DampedItemMeanModelProvider` class to compute the damped mean rating for each item.
This formula uses a damping factor $\alpha$, which is the number of 'fake' ratings at the global
mean to assume for each item. In the Java code, this is available as the field `damping`.

The damped mean formula, as you may recall, is:

$$s(i) = \frac{\sum_{u \in U_i} r_{ui} + \alpha\mu}{|U_i| + \alpha}$$

where $\mu$ is the *global* mean rating.
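
A minimal sketch of the formula, assuming the item's ratings and the global mean have already been collected (the names `DampedMeanSketch` and `dampedMean` are hypothetical). With two ratings of 5.0, a global mean of 3.5, and $\alpha = 5$, the score is $(10 + 17.5) / (2 + 5) \approx 3.93$, well below the raw mean of 5.0, which is the intended effect for items with few ratings.

```java
// Hypothetical helper, not part of the assignment template.
final class DampedMeanSketch {
    /**
     * Damped mean: (sum of the item's ratings + damping * globalMean)
     * divided by (number of ratings + damping).
     */
    static double dampedMean(double[] itemRatings, double globalMean, double damping) {
        double sum = 0.0;
        for (double r : itemRatings) {
            sum += r;
        }
        return (sum + damping * globalMean) / (itemRatings.length + damping);
    }

    public static void main(String[] args) {
        // Two 5-star ratings are pulled toward the global mean of 3.5.
        System.out.println(dampedMean(new double[]{5.0, 5.0}, 3.5, 5.0));
    }
}
```

Note that $\mu$ here is the mean over all ratings in the data set, not the mean of the per-item means.
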

### Example Outputs

To help you see if your output is correct, we have provided the following example correct values:

| ID | Title | Mean | Damped Mean |
| :-: | :---- | :--: | :---------: |
| 2959 | *Fight Club* | 4.259 | 4.252 |
| 1203 | *12 Angry Men* | 4.246 | 4.227 |

## Association Rules

In the second part of the assignment, you will implement two versions of an association rule
recommender.

The association rule implementation consists of the following code:

- `AssociationItemBasedItemRecommender` recommends items using association rules. Unlike the mean
  recommenders, this recommender uses a *reference item* to compute the recommendations.
- `AssociationModel` stores the association rule scores between pairs of items. You will not need
  to modify this class.
- `BasicAssociationModelProvider` computes an association rule model using the basic association
  rule formula ($P(X \wedge Y) / P(X)$).
- `LiftAssociationModelProvider` computes an association rule model using the lift formula ($P(X \wedge Y) / (P(X) P(Y))$).

### Computing Association Scores

As with the mean-based recommender, we pre-compute product association scores and store them in
a model before recommendation. We compute the scores between *all pairs* of items, so that the
model can be used to score any item. When computing a single recommendation from the command line,
this does not provide much benefit, but is useful in the general case so that the model can be used
to very quickly compute many recommendations.

The `BasicAssociationModelProvider` class computes the association rule scores using the following
formula:

$$P(i|j) = \frac{P(i \wedge j)}{P(j)} = \frac{|U_i \cap U_j|/|U|}{|U_j|/|U|}$$

In this case, $j$ is the *reference* item and $i$ is the item to be scored.
We estimate probabilities by counting: $P(i)$ is the fraction of users in the system
who purchased item $i$; $P(i \wedge j)$ is the fraction that purchased both $i$ and $j$.

**Implement the association rule computation in this class.**
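
Because the $|U|$ terms cancel, the basic score reduces to $|U_i \cap U_j| / |U_j|$. The sketch below computes that for one pair of items, assuming you have already built the set of users for each item. `BasicAssocSketch` and `basicAssociation` are hypothetical names, and returning 0 when the reference item has no users is a choice made for this sketch only.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of the assignment template.
final class BasicAssocSketch {
    /** P(i|j) = |U_i intersect U_j| / |U_j|, where j is the reference item. */
    static double basicAssociation(Set<Long> usersOfI, Set<Long> usersOfJ) {
        if (usersOfJ.isEmpty()) {
            return 0.0; // assumption: no evidence for the reference item
        }
        // Copy U_i so the caller's set is not modified, then intersect.
        Set<Long> both = new HashSet<>(usersOfI);
        both.retainAll(usersOfJ);
        return (double) both.size() / usersOfJ.size();
    }

    public static void main(String[] args) {
        Set<Long> usersOfI = new HashSet<>(Arrays.asList(1L, 2L, 3L));
        Set<Long> usersOfJ = new HashSet<>(Arrays.asList(2L, 3L, 4L, 5L));
        // Two of the four users of j also have i.
        System.out.println(basicAssociation(usersOfI, usersOfJ)); // prints 0.5
    }
}
```

The score is not symmetric: swapping the two arguments in the example divides by $|U_i| = 3$ instead of $|U_j| = 4$.
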

### Computing Recommendations

Implement the recommendation logic in `AssociationItemBasedItemRecommender` to recommend items
related to a given reference item. As with the mean recommender, it should compute the top *N*
recommendations and return them.

### Computing Advanced Association Rules

The `LiftAssociationModelProvider` class uses the *lift* metric, which measures how
much more likely someone is to rate a movie $i$ when they have rated $j$ than they would be if we knew nothing about whether they have rated $j$:

$$s(i|j) = \frac{P(j \wedge i)}{P(i) P(j)}$$
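
Estimating each probability by counting, as in the basic rule, the lift becomes $|U_i \cap U_j| \cdot |U| / (|U_i| \cdot |U_j|)$. The sketch below is one way to compute it for a single pair; `LiftSketch`, `lift`, and the `totalUsers` parameter are hypothetical, and returning 0 when either item has no users is an assumption of the sketch.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of the assignment template.
final class LiftSketch {
    /** Lift = P(i and j) / (P(i) * P(j)), with probabilities estimated by counting users. */
    static double lift(Set<Long> usersOfI, Set<Long> usersOfJ, int totalUsers) {
        if (usersOfI.isEmpty() || usersOfJ.isEmpty() || totalUsers <= 0) {
            return 0.0; // assumption: undefined lift is reported as 0
        }
        Set<Long> both = new HashSet<>(usersOfI);
        both.retainAll(usersOfJ);
        double pBoth = (double) both.size() / totalUsers;
        double pI = (double) usersOfI.size() / totalUsers;
        double pJ = (double) usersOfJ.size() / totalUsers;
        return pBoth / (pI * pJ);
    }

    public static void main(String[] args) {
        Set<Long> usersOfI = new HashSet<>(Arrays.asList(1L, 2L, 3L));
        Set<Long> usersOfJ = new HashSet<>(Arrays.asList(2L, 3L, 4L, 5L));
        // 0.2 / (0.3 * 0.4), roughly 1.67: i is more likely given j than overall.
        System.out.println(lift(usersOfI, usersOfJ, 10));
    }
}
```

Unlike the basic rule, lift is symmetric in $i$ and $j$, and values above 1 indicate a positive association.
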

### Example Outputs

Following is the correct output for the basic association rules with reference item 260 (*Star Wars*), as generated with `./gradlew runBasicAssoc -PreferenceItem=260`:

    2571 (Matrix, The (1999)): 0.916
    1196 (Star Wars: Episode V - The Empire Strikes Back (1980)): 0.899
    4993 (Lord of the Rings: The Fellowship of the Ring, The (2001)): 0.892
    1210 (Star Wars: Episode VI - Return of the Jedi (1983)): 0.847
    356 (Forrest Gump (1994)): 0.843
    5952 (Lord of the Rings: The Two Towers, The (2002)): 0.841
    7153 (Lord of the Rings: The Return of the King, The (2003)): 0.830
    296 (Pulp Fiction (1994)): 0.828
    1198 (Raiders of the Lost Ark (Indiana Jones and the Raiders of the Lost Ark) (1981)): 0.791
    480 (Jurassic Park (1993)): 0.789

And lift-based association rules for item 2761 (*The Iron Giant*):

    631 (All Dogs Go to Heaven 2 (1996)): 4.898
    2532 (Conquest of the Planet of the Apes (1972)): 4.810
    3615 (Dinosaur (2000)): 4.546
    1649 (Fast, Cheap & Out of Control (1997)): 4.490
    340 (War, The (1994)): 4.490
    1016 (Shaggy Dog, The (1959)): 4.490
    2439 (Affliction (1997)): 4.490
    332 (Village of the Damned (1995)): 4.377
    2736 (Brighton Beach Memoirs (1986)): 4.329
    3213 (Batman: Mask of the Phantasm (1993)): 4.317

## Running your code

The Gradle build file we have provided is set up to automatically run all four of your recommenders.
The following Gradle targets will do this:

- `runMean` runs the raw mean recommender
- `runDampedMean` runs the damped mean recommender
- `runBasicAssoc` runs the basic association rule recommender
- `runLiftAssoc` runs the advanced (lift-based) association rule recommender

You can run these using the IntelliJ Gradle runner (open the Gradle panel, browse the tree to find
a task, and double-click it), or from the command line:

    ./gradlew runMean

The association rule recommenders can also take the reference item ID on the command line as a
`referenceItem` parameter. For example:

    ./gradlew runLiftAssoc -PreferenceItem=1

The IntelliJ Run Configuration dialog will allow you to specify additional script parameters to
the Gradle invocation.

### Debugging

If you run the Gradle tasks using IntelliJ's Gradle runner, you can run them under the debugger to debug your code.

The Gradle file also configures LensKit to write log output to log files under the `build`
directory. If you use the SLF4J logger (the `logger` field on the classes we provide) to emit debug
messages, you can find them there when you run one of the recommender tasks such as `runDampedMean`.

## Submitting

You will submit a compiled `jar` file containing your solution. To prepare your project for
submission, run the Gradle `prepareSubmission` task:

    ./gradlew prepareSubmission

This will create the file `nonpers-submission.jar` under `build/distributions` that contains your final
solution code in a format the grader will understand. Upload this `jar` file to the Coursera
assignment grader.

## Grading

Your grade for each part will be based on two components:

- Outputting items in the correct order: 75%
- Computing correct scores for items (within an error tolerance): 25%

The parts themselves are weighted equally.

@ -1,87 +1,87 @@
apply plugin: 'java'

ext.lenskitVersion = '3.0-M1'
if (!hasProperty('dataDir')) {
    ext.dataDir = 'data'
}

sourceCompatibility = 1.7

apply from: "$rootDir/gradle/repositories.gradle"

dependencies {
    compile "org.lenskit:lenskit-core:$lenskitVersion"
    runtime "org.lenskit:lenskit-cli:$lenskitVersion"
}

dependencies {
    testCompile group: 'junit', name: 'junit', version: '4.11'
}

task runMean(type: JavaExec, group: 'run') {
    description "Run the simple mean recommender."
    classpath sourceSets.main.runtimeClasspath
    main 'org.lenskit.cli.Main'
    args '--log-file', file("$buildDir/recommend-mean.log"), '--log-file-level', 'DEBUG'
    args 'global-recommend'
    args '--data-source', "$dataDir/movielens.yml"
    args '-c', file('etc/mean.groovy')
    args '-n', 10
    if (project.hasProperty('lenskit.maxMemory')) {
        maxHeapSize project.getProperty('lenskit.maxMemory')
    }
}

task runDampedMean(type: JavaExec, group: 'run') {
    description "Run the damped mean recommender."
    mustRunAfter runMean
    classpath sourceSets.main.runtimeClasspath
    main 'org.lenskit.cli.Main'
    args '--log-file', file("$buildDir/recommend-damped-mean.log"), '--log-file-level', 'DEBUG'
    args 'global-recommend'
    args '--data-source', "$dataDir/movielens.yml"
    args '-c', file('etc/damped-mean.groovy')
    if (project.hasProperty('lenskit.maxMemory')) {
        maxHeapSize project.getProperty('lenskit.maxMemory')
    }
}

task runBasicAssoc(type: JavaExec, group: 'run') {
    description "Run the basic association rule recommender."
    mustRunAfter runDampedMean
    classpath sourceSets.main.runtimeClasspath
    main 'org.lenskit.cli.Main'
    args '--log-file', file("$buildDir/recommend-basic-assoc.log"), '--log-file-level', 'DEBUG'
    args 'global-recommend'
    args '--data-source', "$dataDir/movielens.yml"
    args '-c', file('etc/simple-assoc.groovy')
    args findProperty('referenceItem') ?: 260
    if (project.hasProperty('lenskit.maxMemory')) {
        maxHeapSize project.getProperty('lenskit.maxMemory')
    }
}

task runLiftAssoc(type: JavaExec, group: 'run') {
    description "Run the lift association rule recommender."
    classpath sourceSets.main.runtimeClasspath
    mustRunAfter runBasicAssoc
    main 'org.lenskit.cli.Main'
    args '--log-file', file("$buildDir/recommend-lift-assoc.log"), '--log-file-level', 'DEBUG'
    args 'global-recommend'
    args '--data-source', "$dataDir/movielens.yml"
    args '-c', file('etc/lift-assoc.groovy')
    args findProperty('referenceItem') ?: 2761
    if (project.hasProperty('lenskit.maxMemory')) {
        maxHeapSize project.getProperty('lenskit.maxMemory')
    }
}

task runAll(group: 'run') {
    dependsOn runMean, runDampedMean
    dependsOn runBasicAssoc, runLiftAssoc
}

task prepareSubmission(type: Copy) {
    from jar
    into distsDir
    rename(/-assignment/, '-submission')
}

@ -1,28 +1,28 @@
ratings:
  type: textfile
  file: ratings.csv
  format: csv
  entity_type: rating
  header: true
movies:
  type: textfile
  file: movies.csv
  format: csv
  entity_type: item
  header: true
  columns: [id, name]
tags:
  type: textfile
  file: tags.csv
  format: csv
  entity_type: item-tag
  header: true
  columns:
    - name: item
      type: long
    - name: user
      type: long
    - name: tag
      type: string
    - name: timestamp
      type: long

@ -1,14 +1,14 @@
import org.lenskit.api.ItemBasedItemRecommender
import org.lenskit.baseline.MeanDamping
import org.lenskit.mooc.nonpers.mean.DampedItemMeanModelProvider
import org.lenskit.mooc.nonpers.mean.ItemMeanModel
import org.lenskit.mooc.nonpers.mean.MeanItemBasedItemRecommender

// set up the recommender
bind ItemBasedItemRecommender to MeanItemBasedItemRecommender
// this time, we will use the damped mean model
bind ItemMeanModel toProvider DampedItemMeanModelProvider
// use a mean damping of 5
set MeanDamping to 5

@ -1,7 +1,7 @@
import org.lenskit.api.ItemBasedItemRecommender
import org.lenskit.mooc.nonpers.assoc.LiftAssociationModelProvider
import org.lenskit.mooc.nonpers.assoc.AssociationItemBasedItemRecommender
import org.lenskit.mooc.nonpers.assoc.AssociationModel

bind ItemBasedItemRecommender to AssociationItemBasedItemRecommender
bind AssociationModel toProvider LiftAssociationModelProvider

@ -1,4 +1,4 @@
import org.lenskit.mooc.nonpers.mean.MeanItemBasedItemRecommender
import org.lenskit.api.ItemBasedItemRecommender

bind ItemBasedItemRecommender to MeanItemBasedItemRecommender

@ -1,7 +1,7 @@
import org.lenskit.api.ItemBasedItemRecommender
import org.lenskit.mooc.nonpers.assoc.AssociationItemBasedItemRecommender
import org.lenskit.mooc.nonpers.assoc.AssociationModel
import org.lenskit.mooc.nonpers.assoc.BasicAssociationModelProvider

bind ItemBasedItemRecommender to AssociationItemBasedItemRecommender
bind AssociationModel toProvider BasicAssociationModelProvider

@ -1,6 +1,6 @@
repositories {
    mavenCentral()
    maven {
        url 'https://oss.sonatype.org/content/repositories/snapshots/'
    }
}

@ -1,6 +1,6 @@
#Fri Mar 25 17:48:43 CDT 2016
distributionBase=GRADLE_USER_HOME
distributionPath=wrapper/dists
zipStoreBase=GRADLE_USER_HOME
zipStorePath=wrapper/dists
distributionUrl=https\://services.gradle.org/distributions/gradle-2.14-bin.zip

@ -1,90 +1,90 @@
@if "%DEBUG%" == "" @echo off
@rem ##########################################################################
@rem
@rem Gradle startup script for Windows
@rem
@rem ##########################################################################
@rem Set local scope for the variables with windows NT shell
if "%OS%"=="Windows_NT" setlocal
@rem Add default JVM options here. You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.
set DEFAULT_JVM_OPTS=
set DIRNAME=%~dp0
if "%DIRNAME%" == "" set DIRNAME=.
set APP_BASE_NAME=%~n0
set APP_HOME=%DIRNAME%
@rem Find java.exe
if defined JAVA_HOME goto findJavaFromJavaHome
set JAVA_EXE=java.exe
%JAVA_EXE% -version >NUL 2>&1
if "%ERRORLEVEL%" == "0" goto init
echo.
echo ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.
echo.
echo Please set the JAVA_HOME variable in your environment to match the
echo location of your Java installation.
goto fail
:findJavaFromJavaHome
set JAVA_HOME=%JAVA_HOME:"=%
set JAVA_EXE=%JAVA_HOME%/bin/java.exe
if exist "%JAVA_EXE%" goto init
echo.
echo ERROR: JAVA_HOME is set to an invalid directory: %JAVA_HOME%
echo.
echo Please set the JAVA_HOME variable in your environment to match the
echo location of your Java installation.
goto fail
:init
@rem Get command-line arguments, handling Windowz variants
if not "%OS%" == "Windows_NT" goto win9xME_args
if "%@eval[2+2]" == "4" goto 4NT_args
:win9xME_args
@rem Slurp the command line arguments.
set CMD_LINE_ARGS=
set _SKIP=2
:win9xME_args_slurp
if "x%~1" == "x" goto execute
set CMD_LINE_ARGS=%*
goto execute
:4NT_args
@rem Get arguments from the 4NT Shell from JP Software
set CMD_LINE_ARGS=%$
:execute
@rem Setup the command line
set CLASSPATH=%APP_HOME%\gradle\wrapper\gradle-wrapper.jar
@rem Execute Gradle
"%JAVA_EXE%" %DEFAULT_JVM_OPTS% %JAVA_OPTS% %GRADLE_OPTS% "-Dorg.gradle.appname=%APP_BASE_NAME%" -classpath "%CLASSPATH%" org.gradle.wrapper.GradleWrapperMain %CMD_LINE_ARGS%
:end
@rem End local scope for the variables with windows NT shell
if "%ERRORLEVEL%"=="0" goto mainEnd
:fail
rem Set variable GRADLE_EXIT_CONSOLE if you need the _script_ return code instead of
rem the _cmd.exe /c_ return code!
if not "" == "%GRADLE_EXIT_CONSOLE%" exit 1
exit /b 1
:mainEnd
if "%OS%"=="Windows_NT" endlocal
:omega


@ -1,73 +1,73 @@
package org.lenskit.mooc.nonpers.assoc; package org.lenskit.mooc.nonpers.assoc;
import it.unimi.dsi.fastutil.longs.LongSet; import it.unimi.dsi.fastutil.longs.LongSet;
import org.lenskit.api.Result; import org.lenskit.api.Result;
import org.lenskit.api.ResultList; import org.lenskit.api.ResultList;
import org.lenskit.basic.AbstractItemBasedItemRecommender; import org.lenskit.basic.AbstractItemBasedItemRecommender;
import org.lenskit.results.Results; import org.lenskit.results.Results;
import org.lenskit.util.collections.LongUtils; import org.lenskit.util.collections.LongUtils;
import org.slf4j.Logger; import org.slf4j.Logger;
import org.slf4j.LoggerFactory; import org.slf4j.LoggerFactory;
import javax.annotation.Nullable; import javax.annotation.Nullable;
import javax.inject.Inject; import javax.inject.Inject;
import java.util.ArrayList; import java.util.ArrayList;
import java.util.List; import java.util.List;
import java.util.Set; import java.util.Set;
/** /**
* An item-based item recommender that uses association rules. * An item-based item recommender that uses association rules.
*/ */
public class AssociationItemBasedItemRecommender extends AbstractItemBasedItemRecommender { public class AssociationItemBasedItemRecommender extends AbstractItemBasedItemRecommender {
private static final Logger logger = LoggerFactory.getLogger(AssociationItemBasedItemRecommender.class); private static final Logger logger = LoggerFactory.getLogger(AssociationItemBasedItemRecommender.class);
private final AssociationModel model; private final AssociationModel model;
/** /**
* Construct the item scorer. * Construct the item scorer.
* *
* @param m The association rule model. * @param m The association rule model.
*/ */
@Inject @Inject
public AssociationItemBasedItemRecommender(AssociationModel m) { public AssociationItemBasedItemRecommender(AssociationModel m) {
model = m; model = m;
} }
@Override @Override
public ResultList recommendRelatedItemsWithDetails(Set<Long> basket, int n, @Nullable Set<Long> candidates, @Nullable Set<Long> exclude) { public ResultList recommendRelatedItemsWithDetails(Set<Long> basket, int n, @Nullable Set<Long> candidates, @Nullable Set<Long> exclude) {
LongSet items; LongSet items;
if (candidates == null) { if (candidates == null) {
items = model.getKnownItems(); items = model.getKnownItems();
} else { } else {
items = LongUtils.asLongSet(candidates); items = LongUtils.asLongSet(candidates);
} }
if (exclude != null) { if (exclude != null) {
items = LongUtils.setDifference(items, LongUtils.asLongSet(exclude)); items = LongUtils.setDifference(items, LongUtils.asLongSet(exclude));
} }
if (basket.isEmpty()) { if (basket.isEmpty()) {
return Results.newResultList(); return Results.newResultList();
} else if (basket.size() > 1) { } else if (basket.size() > 1) {
logger.warn("Reference set has more than 1 item, picking arbitrarily."); logger.warn("Reference set has more than 1 item, picking arbitrarily.");
} }
long refItem = basket.iterator().next(); long refItem = basket.iterator().next();
return recommendItems(n, refItem, items); return recommendItems(n, refItem, items);
} }
/** /**
* Recommend items with an association rule. * Recommend items with an association rule.
* @param n The number of recommendations to produce. * @param n The number of recommendations to produce.
* @param refItem The reference item. * @param refItem The reference item.
* @param candidates The candidate items (set of items that can possibly be recommended). * @param candidates The candidate items (set of items that can possibly be recommended).
* @return The list of results. * @return The list of results.
*/ */
private ResultList recommendItems(int n, long refItem, LongSet candidates) { private ResultList recommendItems(int n, long refItem, LongSet candidates) {
List<Result> results = new ArrayList<>(); List<Result> results = new ArrayList<>();
// TODO Compute the n highest-scoring items from candidates // TODO Compute the n highest-scoring items from candidates
return Results.newResultList(results); return Results.newResultList(results);
} }
} }
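
The TODO left in `recommendItems` amounts to scoring every candidate against the reference item and keeping the `n` best. The following is a minimal sketch of that step, not the template's solution. It assumes plain `java.util` collections in place of `AssociationModel` and LensKit's `Result` types; the names `TopNSketch` and `Scored` are hypothetical, and treating a negative `n` as "no limit" is an assumption about the caller's convention.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class TopNSketch {
    /** Hypothetical stand-in for a scored item (the template uses LensKit's Result here). */
    public static final class Scored {
        public final long item;
        public final double score;

        public Scored(long item, double score) {
            this.item = item;
            this.score = score;
        }
    }

    /**
     * Score each candidate against refItem and return the n highest-scoring ones.
     * scores.get(x).get(y) is assumed to hold the score of y with respect to x.
     */
    public static List<Scored> topN(int n, long refItem, Set<Long> candidates,
                                    Map<Long, Map<Long, Double>> scores) {
        List<Scored> results = new ArrayList<>();
        Map<Long, Double> row = scores.get(refItem);
        if (row == null) {
            return results; // unknown reference item: nothing to recommend
        }
        for (long item : candidates) {
            if (item == refItem) {
                continue; // never recommend the reference item itself
            }
            Double s = row.get(item);
            if (s != null) {
                results.add(new Scored(item, s));
            }
        }
        // Sort by score, highest first.
        results.sort(Comparator.comparingDouble((Scored r) -> r.score).reversed());
        // Assumption: a negative n means "return everything".
        if (n >= 0 && results.size() > n) {
            return new ArrayList<>(results.subList(0, n));
        }
        return results;
    }
}
```

Sorting the full candidate list is the simplest approach; for large candidate sets a bounded heap would avoid sorting items that can never make the top `n`.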


@ -1,90 +1,90 @@
package org.lenskit.mooc.nonpers.assoc; package org.lenskit.mooc.nonpers.assoc;
import com.google.common.base.Preconditions; import com.google.common.base.Preconditions;
import it.unimi.dsi.fastutil.longs.LongSet; import it.unimi.dsi.fastutil.longs.LongSet;
import org.lenskit.inject.Shareable; import org.lenskit.inject.Shareable;
import org.lenskit.util.keys.SortedKeyIndex; import org.lenskit.util.keys.SortedKeyIndex;
import org.slf4j.Logger; import org.slf4j.Logger;
import org.slf4j.LoggerFactory; import org.slf4j.LoggerFactory;
import java.io.Serializable; import java.io.Serializable;
import java.util.Map; import java.util.Map;
/** /**
* An association rule model, storing item-item association scores. * An association rule model, storing item-item association scores.
* *
* <p>You <strong>should not</strong> need to change this class. It has some internal optimizations to reduce * <p>You <strong>should not</strong> need to change this class. It has some internal optimizations to reduce
* the memory requirements after the model is built.</p> * the memory requirements after the model is built.</p>
*/ */
@Shareable @Shareable
public class AssociationModel implements Serializable { public class AssociationModel implements Serializable {
private static final Logger logger = LoggerFactory.getLogger(AssociationModel.class); private static final Logger logger = LoggerFactory.getLogger(AssociationModel.class);
private static final long serialVersionUID = 1L; private static final long serialVersionUID = 1L;
private final SortedKeyIndex index; private final SortedKeyIndex index;
private final double[][] scores; private final double[][] scores;
/** /**
* Construct a new association model. * Construct a new association model.
* @param assocScores The association scores. The outer map's keys are the X items, and the inner map's keys are * @param assocScores The association scores. The outer map's keys are the X items, and the inner map's keys are
* the Y items. So {@code assocScores.get(x).get(y)} should return the score for {@code y} * the Y items. So {@code assocScores.get(x).get(y)} should return the score for {@code y}
* with respect to {@code x}. * with respect to {@code x}.
*/ */
public AssociationModel(Map<Long, ? extends Map<Long,Double>> assocScores) { public AssociationModel(Map<Long, ? extends Map<Long,Double>> assocScores) {
index = SortedKeyIndex.fromCollection(assocScores.keySet()); index = SortedKeyIndex.fromCollection(assocScores.keySet());
int n = index.size(); int n = index.size();
logger.debug("transforming input map for {} items into log data", n); logger.debug("transforming input map for {} items into log data", n);
scores = new double[n][n]; scores = new double[n][n];
for (int i = 0; i < n; i++) { for (int i = 0; i < n; i++) {
long itemX = index.getKey(i); long itemX = index.getKey(i);
for (int j = 0; j < n; j++) { for (int j = 0; j < n; j++) {
if (i == j) { if (i == j) {
continue; // skip self-similarities continue; // skip self-similarities
} }
long itemY = index.getKey(j); long itemY = index.getKey(j);
Double score = assocScores.get(itemX).get(itemY); Double score = assocScores.get(itemX).get(itemY);
if (score == null) { if (score == null) {
logger.error("no score found for items {} and {}", itemX, itemY); logger.error("no score found for items {} and {}", itemX, itemY);
String msg = String.format("no score found for x=%d, y=%d", itemX, itemY); String msg = String.format("no score found for x=%d, y=%d", itemX, itemY);
throw new IllegalArgumentException(msg); throw new IllegalArgumentException(msg);
} }
scores[i][j] = score; scores[i][j] = score;
} }
} }
} }
/** /**
* Get the set of known items. * Get the set of known items.
* @return The set of known item IDs. * @return The set of known item IDs.
*/ */
public LongSet getKnownItems() { public LongSet getKnownItems() {
return index.keySet(); return index.keySet();
} }
/** /**
* Query whether the model knows about an item. * Query whether the model knows about an item.
* @param item The item ID. * @param item The item ID.
* @return {@code true} if the model knows about the item {@code item}, {@code false} otherwise. * @return {@code true} if the model knows about the item {@code item}, {@code false} otherwise.
*/ */
public boolean hasItem(long item) { public boolean hasItem(long item) {
return index.containsKey(item); return index.containsKey(item);
} }
/** /**
* Get the association between two items. * Get the association between two items.
* @param ref The reference item (X). * @param ref The reference item (X).
* @param item The item to score (Y). * @param item The item to score (Y).
* @return The score between X and Y. * @return The score between X and Y.
* @throws IllegalArgumentException if either item is invalid. * @throws IllegalArgumentException if either item is invalid.
*/ */
public double getItemAssociation(long ref, long item) { public double getItemAssociation(long ref, long item) {
// look up item positions // look up item positions
int refIndex = index.tryGetIndex(ref); int refIndex = index.tryGetIndex(ref);
Preconditions.checkArgument(refIndex >= 0, "unknown reference item %d", ref); Preconditions.checkArgument(refIndex >= 0, "unknown reference item %d", ref);
int itemIndex = index.tryGetIndex(item); int itemIndex = index.tryGetIndex(item);
Preconditions.checkArgument(itemIndex >= 0, "unknown target item %d", item); Preconditions.checkArgument(itemIndex >= 0, "unknown target item %d", item);
return scores[refIndex][itemIndex]; return scores[refIndex][itemIndex];
} }
} }
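
The model above stores scores in a dense `n × n` array and translates item IDs to row and column positions through a key index. The following sketch shows that layout in isolation. It assumes the index behaves like a binary search over a sorted array of IDs, which is an illustration rather than a statement about how `SortedKeyIndex` is implemented; the class name `DenseAssocSketch` is hypothetical.

```java
import java.util.Arrays;

public class DenseAssocSketch {
    private final long[] keys;       // item IDs, sorted ascending
    private final double[][] scores; // scores[i][j]: score of keys[j] with respect to keys[i]

    public DenseAssocSketch(long[] sortedKeys, double[][] scores) {
        if (scores.length != sortedKeys.length) {
            throw new IllegalArgumentException("score matrix must have one row per key");
        }
        this.keys = sortedKeys.clone();
        this.scores = scores;
    }

    /** Map an item ID to its position in the matrix, rejecting unknown items. */
    private int indexOf(long item) {
        int idx = Arrays.binarySearch(keys, item);
        if (idx < 0) {
            throw new IllegalArgumentException("unknown item " + item);
        }
        return idx;
    }

    /** Look up the score of item (Y) with respect to ref (X). */
    public double get(long ref, long item) {
        return scores[indexOf(ref)][indexOf(item)];
    }
}
```

The dense array trades memory (quadratic in the number of items) for constant-time lookups once both positions are known.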


@ -1,82 +1,82 @@
package org.lenskit.mooc.nonpers.assoc; package org.lenskit.mooc.nonpers.assoc;
import it.unimi.dsi.fastutil.longs.*; import it.unimi.dsi.fastutil.longs.*;
import org.lenskit.data.dao.DataAccessObject; import org.lenskit.data.dao.DataAccessObject;
import org.lenskit.data.entities.CommonAttributes; import org.lenskit.data.entities.CommonAttributes;
import org.lenskit.data.ratings.Rating; import org.lenskit.data.ratings.Rating;
import org.lenskit.inject.Transient; import org.lenskit.inject.Transient;
import org.lenskit.util.IdBox; import org.lenskit.util.IdBox;
import org.lenskit.util.collections.LongUtils; import org.lenskit.util.collections.LongUtils;
import org.lenskit.util.io.ObjectStream; import org.lenskit.util.io.ObjectStream;
import javax.inject.Inject; import javax.inject.Inject;
import javax.inject.Provider; import javax.inject.Provider;
import java.util.List; import java.util.List;
/** /**
* Build a model for basic association rules. This class computes the association for all pairs of items. * Build a model for basic association rules. This class computes the association for all pairs of items.
*/ */
public class BasicAssociationModelProvider implements Provider<AssociationModel> { public class BasicAssociationModelProvider implements Provider<AssociationModel> {
private final DataAccessObject dao; private final DataAccessObject dao;
@Inject @Inject
public BasicAssociationModelProvider(@Transient DataAccessObject dao) { public BasicAssociationModelProvider(@Transient DataAccessObject dao) {
this.dao = dao; this.dao = dao;
} }
@Override @Override
public AssociationModel get() { public AssociationModel get() {
// First step: map each item to the set of users who have rated it. // First step: map each item to the set of users who have rated it.
// This map will map each item ID to the set of users who have rated it. // This map will map each item ID to the set of users who have rated it.
Long2ObjectMap<LongSortedSet> itemUsers = new Long2ObjectOpenHashMap<>(); Long2ObjectMap<LongSortedSet> itemUsers = new Long2ObjectOpenHashMap<>();
LongSet allUsers = new LongOpenHashSet(); LongSet allUsers = new LongOpenHashSet();
// Open a stream, grouping ratings by item ID // Open a stream, grouping ratings by item ID
try (ObjectStream<IdBox<List<Rating>>> ratingStream = dao.query(Rating.class) try (ObjectStream<IdBox<List<Rating>>> ratingStream = dao.query(Rating.class)
.groupBy(CommonAttributes.ITEM_ID) .groupBy(CommonAttributes.ITEM_ID)
.stream()) { .stream()) {
// Process each item's ratings // Process each item's ratings
for (IdBox<List<Rating>> item: ratingStream) { for (IdBox<List<Rating>> item: ratingStream) {
// Build a set of users. We build an array first, then convert to a set. // Build a set of users. We build an array first, then convert to a set.
LongList users = new LongArrayList(); LongList users = new LongArrayList();
// Add each rating's user ID to the user sets // Add each rating's user ID to the user sets
for (Rating r: item.getValue()) { for (Rating r: item.getValue()) {
long user = r.getUserId(); long user = r.getUserId();
users.add(user); users.add(user);
allUsers.add(user); allUsers.add(user);
} }
// put this item's user set into the item user map // put this item's user set into the item user map
// a frozen set will be very efficient later // a frozen set will be very efficient later
itemUsers.put(item.getId(), LongUtils.frozenSet(users)); itemUsers.put(item.getId(), LongUtils.frozenSet(users));
} }
} }
// Second step: compute all association rules // Second step: compute all association rules
// We need a map to store them // We need a map to store them
Long2ObjectMap<Long2DoubleMap> assocMatrix = new Long2ObjectOpenHashMap<>(); Long2ObjectMap<Long2DoubleMap> assocMatrix = new Long2ObjectOpenHashMap<>();
// then loop over 'x' items // then loop over 'x' items
for (Long2ObjectMap.Entry<LongSortedSet> xEntry: itemUsers.long2ObjectEntrySet()) { for (Long2ObjectMap.Entry<LongSortedSet> xEntry: itemUsers.long2ObjectEntrySet()) {
long xId = xEntry.getLongKey(); long xId = xEntry.getLongKey();
LongSortedSet xUsers = xEntry.getValue(); LongSortedSet xUsers = xEntry.getValue();
// set up a map to hold the scores for each 'y' item for this 'x' // set up a map to hold the scores for each 'y' item for this 'x'
Long2DoubleMap itemScores = new Long2DoubleOpenHashMap(); Long2DoubleMap itemScores = new Long2DoubleOpenHashMap();
// loop over the 'y' items // loop over the 'y' items
for (Long2ObjectMap.Entry<LongSortedSet> yEntry: itemUsers.long2ObjectEntrySet()) { for (Long2ObjectMap.Entry<LongSortedSet> yEntry: itemUsers.long2ObjectEntrySet()) {
long yId = yEntry.getLongKey(); long yId = yEntry.getLongKey();
LongSortedSet yUsers = yEntry.getValue(); LongSortedSet yUsers = yEntry.getValue();
// TODO Compute P(Y & X) / P(X) and store in itemScores // TODO Compute P(Y & X) / P(X) and store in itemScores
} }
// save the score map to the main map // save the score map to the main map
assocMatrix.put(xId, itemScores); assocMatrix.put(xId, itemScores);
} }
return new AssociationModel(assocMatrix); return new AssociationModel(assocMatrix);
} }
} }
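
The TODO in the inner loop asks for P(Y & X) / P(X). With N users in total, P(Y & X) = |X ∩ Y| / N and P(X) = |X| / N, so N cancels and the score reduces to |X ∩ Y| / |X|. The following is a minimal sketch of that computation using plain `java.util.Set` in place of the fastutil `LongSortedSet` used above; the class name `BasicAssocSketch` is hypothetical, and returning 0 for an empty X set is an assumption.

```java
import java.util.Set;

public class BasicAssocSketch {
    /**
     * Basic association score of Y with respect to X:
     * P(Y and X) / P(X) = |X intersect Y| / |X|.
     */
    public static double score(Set<Long> xUsers, Set<Long> yUsers) {
        if (xUsers.isEmpty()) {
            return 0.0; // assumption: no raters for X means no evidence
        }
        int both = 0;
        for (Long user : xUsers) {
            if (yUsers.contains(user)) {
                both++; // this user rated both X and Y
            }
        }
        return (double) both / xUsers.size();
    }
}
```

For example, if X was rated by users {1, 2, 3, 4} and Y by users {2, 4, 5}, two users rated both and the score of Y with respect to X is 2 / 4 = 0.5. The score is not symmetric: with the roles swapped it is 2 / 3.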


@ -1,83 +1,83 @@
package org.lenskit.mooc.nonpers.assoc; package org.lenskit.mooc.nonpers.assoc;
import it.unimi.dsi.fastutil.longs.*; import it.unimi.dsi.fastutil.longs.*;
import org.lenskit.data.dao.DataAccessObject; import org.lenskit.data.dao.DataAccessObject;
import org.lenskit.data.entities.CommonAttributes; import org.lenskit.data.entities.CommonAttributes;
import org.lenskit.data.ratings.Rating; import org.lenskit.data.ratings.Rating;
import org.lenskit.inject.Transient; import org.lenskit.inject.Transient;
import org.lenskit.util.IdBox; import org.lenskit.util.IdBox;
import org.lenskit.util.collections.LongUtils; import org.lenskit.util.collections.LongUtils;
import org.lenskit.util.io.ObjectStream; import org.lenskit.util.io.ObjectStream;
import org.slf4j.Logger; import org.slf4j.Logger;
import org.slf4j.LoggerFactory; import org.slf4j.LoggerFactory;
import javax.inject.Inject; import javax.inject.Inject;
import javax.inject.Provider; import javax.inject.Provider;
import java.util.List; import java.util.List;
/** /**
* Build an association rule model using a lift metric. * Build an association rule model using a lift metric.
*/ */
public class LiftAssociationModelProvider implements Provider<AssociationModel> { public class LiftAssociationModelProvider implements Provider<AssociationModel> {
private static final Logger logger = LoggerFactory.getLogger(LiftAssociationModelProvider.class); private static final Logger logger = LoggerFactory.getLogger(LiftAssociationModelProvider.class);
private final DataAccessObject dao; private final DataAccessObject dao;
@Inject @Inject
public LiftAssociationModelProvider(@Transient DataAccessObject dao) { public LiftAssociationModelProvider(@Transient DataAccessObject dao) {
this.dao = dao; this.dao = dao;
} }
@Override @Override
public AssociationModel get() { public AssociationModel get() {
// First step: map each item to the set of users who have rated it. // First step: map each item to the set of users who have rated it.
// While we're at it, compute the set of all users. // While we're at it, compute the set of all users.
// This set contains all users. // This set contains all users.
LongSet allUsers = new LongOpenHashSet(); LongSet allUsers = new LongOpenHashSet();
// This map will map each item ID to the set of users who have rated it. // This map will map each item ID to the set of users who have rated it.
Long2ObjectMap<LongSortedSet> itemUsers = new Long2ObjectOpenHashMap<>(); Long2ObjectMap<LongSortedSet> itemUsers = new Long2ObjectOpenHashMap<>();
// Open a stream, grouping ratings by item ID // Open a stream, grouping ratings by item ID
try (ObjectStream<IdBox<List<Rating>>> ratingStream = dao.query(Rating.class) try (ObjectStream<IdBox<List<Rating>>> ratingStream = dao.query(Rating.class)
.groupBy(CommonAttributes.ITEM_ID) .groupBy(CommonAttributes.ITEM_ID)
.stream()) { .stream()) {
// Process each item's ratings // Process each item's ratings
for (IdBox<List<Rating>> item: ratingStream) { for (IdBox<List<Rating>> item: ratingStream) {
// Build a set of users. We build an array first, then convert to a set. // Build a set of users. We build an array first, then convert to a set.
LongList users = new LongArrayList(); LongList users = new LongArrayList();
// Add each rating's user ID to the user sets // Add each rating's user ID to the user sets
for (Rating r: item.getValue()) { for (Rating r: item.getValue()) {
long user = r.getUserId(); long user = r.getUserId();
users.add(user); users.add(user);
allUsers.add(user); allUsers.add(user);
} }
// put this item's user set into the item user map // put this item's user set into the item user map
// a frozen set will be very efficient later // a frozen set will be very efficient later
itemUsers.put(item.getId(), LongUtils.frozenSet(users)); itemUsers.put(item.getId(), LongUtils.frozenSet(users));
} }
} }
// Second step: compute all association rules // Second step: compute all association rules
// We need a map to store them // We need a map to store them
Long2ObjectMap<Long2DoubleMap> assocMatrix = new Long2ObjectOpenHashMap<>(); Long2ObjectMap<Long2DoubleMap> assocMatrix = new Long2ObjectOpenHashMap<>();
// then loop over 'x' items // then loop over 'x' items
for (Long2ObjectMap.Entry<LongSortedSet> xEntry: itemUsers.long2ObjectEntrySet()) { for (Long2ObjectMap.Entry<LongSortedSet> xEntry: itemUsers.long2ObjectEntrySet()) {
long xId = xEntry.getLongKey(); long xId = xEntry.getLongKey();
LongSortedSet xUsers = xEntry.getValue(); LongSortedSet xUsers = xEntry.getValue();
// set up a map to hold the scores for each 'y' item // set up a map to hold the scores for each 'y' item
Long2DoubleMap itemScores = new Long2DoubleOpenHashMap(); Long2DoubleMap itemScores = new Long2DoubleOpenHashMap();
// TODO Compute lift association formulas for all other 'Y' items with respect to this 'X' // TODO Compute lift association formulas for all other 'Y' items with respect to this 'X'
// save the score map to the main map // save the score map to the main map
assocMatrix.put(xId, itemScores); assocMatrix.put(xId, itemScores);
} }
return new AssociationModel(assocMatrix); return new AssociationModel(assocMatrix);
} }
} }
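
The lift TODO above can be sketched with the standard definition of lift, P(X & Y) / (P(X) P(Y)). With N users in total this becomes |X ∩ Y| · N / (|X| · |Y|). The following is an illustrative sketch, assuming the assignment uses this standard form; it uses plain `java.util.Set` instead of fastutil sets, the class name `LiftAssocSketch` is hypothetical, and returning 0 when either set is empty is an assumption.

```java
import java.util.Set;

public class LiftAssocSketch {
    /**
     * Lift of X and Y: P(X and Y) / (P(X) * P(Y)),
     * which simplifies to |X intersect Y| * N / (|X| * |Y|).
     */
    public static double lift(Set<Long> xUsers, Set<Long> yUsers, int totalUsers) {
        if (xUsers.isEmpty() || yUsers.isEmpty()) {
            return 0.0; // assumption: undefined lift is reported as 0
        }
        int both = 0;
        for (Long user : xUsers) {
            if (yUsers.contains(user)) {
                both++; // this user rated both items
            }
        }
        return ((double) both * totalUsers) / ((double) xUsers.size() * yUsers.size());
    }
}
```

Unlike the basic score, lift is symmetric in X and Y, and values above 1 indicate that the two items are rated together more often than independence would predict.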


@ -1,97 +1,97 @@
package org.lenskit.mooc.nonpers.mean; package org.lenskit.mooc.nonpers.mean;
import it.unimi.dsi.fastutil.longs.Long2DoubleOpenHashMap; import it.unimi.dsi.fastutil.longs.Long2DoubleOpenHashMap;
import it.unimi.dsi.fastutil.longs.Long2IntOpenHashMap; import it.unimi.dsi.fastutil.longs.Long2IntOpenHashMap;
import it.unimi.dsi.fastutil.longs.LongSet; import it.unimi.dsi.fastutil.longs.LongSet;
import org.lenskit.baseline.MeanDamping; import org.lenskit.baseline.MeanDamping;
import org.lenskit.data.dao.DataAccessObject; import org.lenskit.data.dao.DataAccessObject;
import org.lenskit.data.ratings.Rating; import org.lenskit.data.ratings.Rating;
import org.lenskit.inject.Transient; import org.lenskit.inject.Transient;
import org.lenskit.util.io.ObjectStream; import org.lenskit.util.io.ObjectStream;
import org.slf4j.Logger; import org.slf4j.Logger;
import org.slf4j.LoggerFactory; import org.slf4j.LoggerFactory;
import javax.inject.Inject; import javax.inject.Inject;
import javax.inject.Provider; import javax.inject.Provider;
/** /**
* Provider class that builds the mean rating item scorer, computing damped item means from the * Provider class that builds the mean rating item scorer, computing damped item means from the
* ratings in the DAO. * ratings in the DAO.
*/ */
public class DampedItemMeanModelProvider implements Provider<ItemMeanModel> { public class DampedItemMeanModelProvider implements Provider<ItemMeanModel> {
/** /**
* A logger that you can use to emit debug messages. * A logger that you can use to emit debug messages.
*/ */
private static final Logger logger = LoggerFactory.getLogger(DampedItemMeanModelProvider.class); private static final Logger logger = LoggerFactory.getLogger(DampedItemMeanModelProvider.class);
/** /**
* The data access object, to be used when computing the mean ratings. * The data access object, to be used when computing the mean ratings.
*/ */
private final DataAccessObject dao; private final DataAccessObject dao;
/** /**
* The damping factor. * The damping factor.
*/ */
private final double damping; private final double damping;
/** /**
* Constructor for the mean item score provider. * Constructor for the mean item score provider.
* *
* <p>The {@code @Inject} annotation tells LensKit to use this constructor. * <p>The {@code @Inject} annotation tells LensKit to use this constructor.
* *
* @param dao The data access object (DAO), where the builder will get ratings. The {@code @Transient} * @param dao The data access object (DAO), where the builder will get ratings. The {@code @Transient}
* annotation on this parameter means that the DAO will be used to build the model, but the * annotation on this parameter means that the DAO will be used to build the model, but the
* model will <strong>not</strong> retain a reference to the DAO. This is standard procedure * model will <strong>not</strong> retain a reference to the DAO. This is standard procedure
* for LensKit models. * for LensKit models.
* @param damping The damping factor for Bayesian damping. This is the number of fake global-mean ratings to * @param damping The damping factor for Bayesian damping. This is the number of fake global-mean ratings to
* assume. It is provided as a parameter so that it can be reconfigured. See the file * assume. It is provided as a parameter so that it can be reconfigured. See the file
* {@code damped-mean.groovy} for how it is used. * {@code damped-mean.groovy} for how it is used.
*/ */
@Inject @Inject
public DampedItemMeanModelProvider(@Transient DataAccessObject dao, public DampedItemMeanModelProvider(@Transient DataAccessObject dao,
@MeanDamping double damping) { @MeanDamping double damping) {
this.dao = dao; this.dao = dao;
this.damping = damping; this.damping = damping;
} }
/** /**
* Construct an item mean model. * Construct an item mean model.
* *
* <p>The {@link Provider#get()} method constructs whatever object the provider class is intended to build.</p> * <p>The {@link Provider#get()} method constructs whatever object the provider class is intended to build.</p>
* *
* @return The item mean model with mean ratings for all items. * @return The item mean model with mean ratings for all items.
*/ */
@Override @Override
public ItemMeanModel get() { public ItemMeanModel get() {
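// Compute damped means: (rating sum + damping * global mean) / (rating count + damping) // Compute damped means: (rating sum + damping * global mean) / (rating count + damping)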
Long2DoubleOpenHashMap means = new Long2DoubleOpenHashMap(); Long2DoubleOpenHashMap means = new Long2DoubleOpenHashMap();
Long2IntOpenHashMap lens = new Long2IntOpenHashMap(); Long2IntOpenHashMap lens = new Long2IntOpenHashMap();
double globalMean = 0; double globalMean = 0;
int globalLen = 0; int globalLen = 0;
try (ObjectStream<Rating> ratings = dao.query(Rating.class).stream()) { try (ObjectStream<Rating> ratings = dao.query(Rating.class).stream()) {
for (Rating r: ratings) { for (Rating r: ratings) {
// this loop will run once for each rating in the data set // this loop will run once for each rating in the data set
means.addTo(r.getItemId(), r.getValue()); means.addTo(r.getItemId(), r.getValue());
lens.addTo(r.getItemId(), 1); lens.addTo(r.getItemId(), 1);
globalMean += r.getValue(); globalMean += r.getValue();
globalLen += 1; globalLen += 1;
} }
} }
globalMean /= globalLen; globalMean /= globalLen;
LongSet keys = means.keySet(); LongSet keys = means.keySet();
for (long key : keys) { for (long key : keys) {
double val = means.get(key); double val = means.get(key);
val = (val + damping * globalMean) / (lens.get(key) + damping); val = (val + damping * globalMean) / (lens.get(key) + damping);
means.put(key, val); means.put(key, val);
if (key == 2959 || key == 1203) { if (key == 2959 || key == 1203) {
logger.info("Damped mean for item {} is {}", key, val); logger.info("Damped mean for item {} is {}", key, val);
} }
} }
return new ItemMeanModel(means); return new ItemMeanModel(means);
} }
} }
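
The damped mean computed in `get()` above is (sum of an item's ratings + damping × global mean) / (number of ratings + damping). The following sketch reproduces that arithmetic with plain `java.util` maps so it can be checked by hand; the class name `DampedMeanSketch` and the parallel-array input format are hypothetical, and returning an empty map when there are no ratings is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

public class DampedMeanSketch {
    /**
     * Compute damped item means from parallel arrays of item IDs and rating values.
     * Each item is treated as if it had `damping` extra ratings equal to the global mean.
     */
    public static Map<Long, Double> dampedMeans(long[] items, double[] values, double damping) {
        Map<Long, Double> sums = new HashMap<>();
        Map<Long, Integer> counts = new HashMap<>();
        double total = 0;
        for (int i = 0; i < items.length; i++) {
            sums.merge(items[i], values[i], Double::sum);
            counts.merge(items[i], 1, Integer::sum);
            total += values[i];
        }
        Map<Long, Double> means = new HashMap<>();
        if (items.length == 0) {
            return means; // assumption: no ratings means no item means
        }
        double globalMean = total / items.length;
        for (Map.Entry<Long, Double> e : sums.entrySet()) {
            int n = counts.get(e.getKey());
            means.put(e.getKey(), (e.getValue() + damping * globalMean) / (n + damping));
        }
        return means;
    }
}
```

As a worked case, take ratings 5 and 3 for item 1 and rating 1 for item 2, with damping 5. The global mean is 9 / 3 = 3, so item 1 gets (8 + 15) / (2 + 5) = 23 / 7 ≈ 3.29 and item 2 gets (1 + 15) / (1 + 5) = 16 / 6 ≈ 2.67. Both are pulled toward the global mean, and the item with fewer ratings is pulled harder.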


@ -1,68 +1,68 @@
package org.lenskit.mooc.nonpers.mean; package org.lenskit.mooc.nonpers.mean;
import com.google.common.base.Preconditions; import com.google.common.base.Preconditions;
import it.unimi.dsi.fastutil.longs.Long2DoubleMap; import it.unimi.dsi.fastutil.longs.Long2DoubleMap;
import it.unimi.dsi.fastutil.longs.LongSet; import it.unimi.dsi.fastutil.longs.LongSet;
import org.grouplens.grapht.annotation.DefaultProvider; import org.grouplens.grapht.annotation.DefaultProvider;
import org.lenskit.inject.Shareable; import org.lenskit.inject.Shareable;
import org.lenskit.util.collections.LongUtils; import org.lenskit.util.collections.LongUtils;
import javax.annotation.concurrent.Immutable; import javax.annotation.concurrent.Immutable;
import java.io.Serializable; import java.io.Serializable;
import java.util.Map; import java.util.Map;
/** /**
* A <em>model</em> class that stores item mean ratings. * A <em>model</em> class that stores item mean ratings.
* *
* <p>The {@link Shareable} annotation is common for model objects, and tells LensKit that the class can be shared * <p>The {@link Shareable} annotation is common for model objects, and tells LensKit that the class can be shared
* between multiple recommender instances.</p> * between multiple recommender instances.</p>
* *
* <p>The {@link DefaultProvider} annotation tells LensKit to use a <em>provider class</em> &mdash; the mean item scorer * <p>The {@link DefaultProvider} annotation tells LensKit to use a <em>provider class</em> &mdash; the mean item scorer
* provider &mdash; to create instances of this class.</p> * provider &mdash; to create instances of this class.</p>
* *
* <p>You <strong>should not</strong> need to make any changes to this class.</p> * <p>You <strong>should not</strong> need to make any changes to this class.</p>
*/ */
@Shareable @Shareable
@Immutable @Immutable
@DefaultProvider(ItemMeanModelProvider.class) @DefaultProvider(ItemMeanModelProvider.class)
public class ItemMeanModel implements Serializable { public class ItemMeanModel implements Serializable {
private static final long serialVersionUID = 1L; private static final long serialVersionUID = 1L;
private final Long2DoubleMap itemMeans; private final Long2DoubleMap itemMeans;
/** /**
* Construct a new item mean model. * Construct a new item mean model.
* @param means A map of item IDs to their mean ratings. * @param means A map of item IDs to their mean ratings.
*/ */
public ItemMeanModel(Map<Long, Double> means) { public ItemMeanModel(Map<Long, Double> means) {
itemMeans = LongUtils.frozenMap(means); itemMeans = LongUtils.frozenMap(means);
} }
/** /**
* Get the set of items known by the model. * Get the set of items known by the model.
* @return The set of items known by the model. * @return The set of items known by the model.
*/ */
public LongSet getKnownItems() { public LongSet getKnownItems() {
return itemMeans.keySet(); return itemMeans.keySet();
} }
/** /**
* Query whether this model knows about an item. * Query whether this model knows about an item.
* @param item The item ID. * @param item The item ID.
* @return {@code true} if the item is known by the model, {@code false} otherwise. * @return {@code true} if the item is known by the model, {@code false} otherwise.
*/ */
public boolean hasItem(long item) { public boolean hasItem(long item) {
return itemMeans.containsKey(item); return itemMeans.containsKey(item);
} }
/** /**
* Get the mean rating for an item. * Get the mean rating for an item.
* @param item The item ID. * @param item The item ID.
* @return The mean rating. * @return The mean rating.
* @throws IllegalArgumentException if the item is not a known item. * @throws IllegalArgumentException if the item is not a known item.
*/ */
public double getMeanRating(long item) { public double getMeanRating(long item) {
Preconditions.checkArgument(hasItem(item), "unknown item " + item); Preconditions.checkArgument(hasItem(item), "unknown item " + item);
return itemMeans.get(item); return itemMeans.get(item);
} }
} }


@ -1,80 +1,80 @@
package org.lenskit.mooc.nonpers.mean;

import it.unimi.dsi.fastutil.longs.Long2DoubleOpenHashMap;
import it.unimi.dsi.fastutil.longs.Long2IntOpenHashMap;
import it.unimi.dsi.fastutil.longs.LongSet;
import org.lenskit.data.dao.DataAccessObject;
import org.lenskit.data.ratings.Rating;
import org.lenskit.inject.Transient;
import org.lenskit.util.io.ObjectStream;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import javax.inject.Inject;
import javax.inject.Provider;

/**
 * Provider class that builds the mean rating item scorer, computing item means from the
 * ratings in the DAO.
 */
public class ItemMeanModelProvider implements Provider<ItemMeanModel> {
    /**
     * A logger that you can use to emit debug messages.
     */
    private static final Logger logger = LoggerFactory.getLogger(ItemMeanModelProvider.class);

    /**
     * The data access object, to be used when computing the mean ratings.
     */
    private final DataAccessObject dao;

    /**
     * Constructor for the mean item score provider.
     *
     * <p>The {@code @Inject} annotation tells LensKit to use this constructor.
     *
     * @param dao The data access object (DAO), where the builder will get ratings. The {@code @Transient}
     *            annotation on this parameter means that the DAO will be used to build the model, but the
     *            model will <strong>not</strong> retain a reference to the DAO. This is standard procedure
     *            for LensKit models.
     */
    @Inject
    public ItemMeanModelProvider(@Transient DataAccessObject dao) {
        this.dao = dao;
    }

    /**
     * Construct an item mean model.
     *
     * <p>The {@link Provider#get()} method constructs whatever object the provider class is intended to build.</p>
     *
     * @return The item mean model with mean ratings for all items.
     */
    @Override
    public ItemMeanModel get() {
        // Per-item rating sums (turned into means below) and per-item rating counts.
        Long2DoubleOpenHashMap means = new Long2DoubleOpenHashMap();
        Long2IntOpenHashMap lens = new Long2IntOpenHashMap();

        try (ObjectStream<Rating> ratings = dao.query(Rating.class).stream()) {
            for (Rating r: ratings) {
                // this loop will run once for each rating in the data set
                means.addTo(r.getItemId(), r.getValue());
                lens.addTo(r.getItemId(), 1);
            }
        }

        // Divide each item's rating sum by its rating count to get the raw mean.
        LongSet keys = means.keySet();
        for (long key : keys) {
            double val = means.get(key);
            val /= lens.get(key);
            means.put(key, val);
            if (key == 2959 || key == 1203) {
                logger.info("Mean rating for item {} is {}", key, val);
            }
        }

        logger.info("computed mean ratings for {} items", means.size());
        return new ItemMeanModel(means);
    }
}
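
The sum-then-divide computation in `get()` can be tried out without LensKit. The following is only a minimal sketch under stated assumptions: `ItemMeanSketch` and `computeMeans` are hypothetical names that exist neither in LensKit nor in the assignment template, and plain `java.util` maps stand in for the Fastutil maps and the DAO stream used above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-alone illustration; not part of LensKit or the project template. */
final class ItemMeanSketch {
    private ItemMeanSketch() {}

    /**
     * Compute the raw mean rating of each item.
     * itemIds[i] and values[i] together describe the i-th rating.
     */
    static Map<Long, Double> computeMeans(long[] itemIds, double[] values) {
        if (itemIds.length != values.length) {
            throw new IllegalArgumentException("itemIds and values must have the same length");
        }
        Map<Long, Double> sums = new HashMap<>();
        Map<Long, Integer> counts = new HashMap<>();
        // First pass: accumulate the rating sum and rating count of every item.
        for (int i = 0; i < itemIds.length; i++) {
            sums.merge(itemIds[i], values[i], Double::sum);
            counts.merge(itemIds[i], 1, Integer::sum);
        }
        // Second pass: divide each sum by its count.
        Map<Long, Double> means = new HashMap<>();
        for (Map.Entry<Long, Double> e : sums.entrySet()) {
            means.put(e.getKey(), e.getValue() / counts.get(e.getKey()));
        }
        return means;
    }
}
```

For example, ratings of 3.0 and 5.0 for item 1 and a single rating of 4.5 for item 2 yield means of 4.0 and 4.5. An item with no ratings never enters the map, so the division cannot hit a zero count.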


@@ -1,92 +1,92 @@
package org.lenskit.mooc.nonpers.mean;

import it.unimi.dsi.fastutil.longs.LongSet;
import org.lenskit.api.Result;
import org.lenskit.api.ResultList;
import org.lenskit.api.ResultMap;
import org.lenskit.basic.AbstractItemBasedItemRecommender;
import org.lenskit.results.Results;
import org.lenskit.util.collections.LongUtils;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import javax.annotation.Nullable;
import javax.inject.Inject;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/**
 * An item recommender that recommends the items with the highest mean ratings.
 */
public class MeanItemBasedItemRecommender extends AbstractItemBasedItemRecommender {
    private static final Logger logger = LoggerFactory.getLogger(MeanItemBasedItemRecommender.class);

    private final ItemMeanModel model;

    /**
     * Construct a mean-based item recommender.
     *
     * <p>The {@code @Inject} annotation tells LensKit to use this constructor.</p>
     *
     * @param m The model containing item mean ratings. LensKit will automatically build an {@link ItemMeanModel}
     *          object. Its use as a parameter type in this constructor declares it as a <em>dependency</em> of the
     *          mean-based item recommender.
     */
    @Inject
    public MeanItemBasedItemRecommender(ItemMeanModel m) {
        model = m;
    }

    /**
     * {@inheritDoc}
     *
     * This is the LensKit recommend method. It takes several parameters; we implement it for you in terms of a
     * simpler method ({@link #recommendItems(int, LongSet)}).
     */
    @Override
    public ResultList recommendRelatedItemsWithDetails(Set<Long> basket, int n, @Nullable Set<Long> candidates, @Nullable Set<Long> exclude) {
        // Start from the explicit candidates, or from every item the model knows about.
        LongSet items;
        if (candidates == null) {
            items = model.getKnownItems();
        } else {
            items = LongUtils.asLongSet(candidates);
        }

        // Remove any items the caller asked to exclude.
        if (exclude != null) {
            items = LongUtils.setDifference(items, LongUtils.asLongSet(exclude));
        }

        logger.info("computing {} recommendations from {} items", n, items.size());
        return recommendItems(n, items);
    }

    /**
     * Recommend some items from a set of candidate items.
     *
     * <p>Your code needs to obtain the mean rating, if one is available, for each item, and return a list of the
     * {@code n} highest-rated items, in decreasing order of score.</p>
     *
     * <p>To create the {@link ResultList} data structure, do the following:</p>
     *
     * <ol>
     * <li>Create a {@link List} to hold {@link Result} objects.</li>
     * <li>Create a result object for each item that can be scored. Use {@link Results#create(long, double)} to
     * create the result object. If an item cannot be scored (because there is no mean available), ignore it and
     * do not add a result to the list.</li>
     * <li>Convert the list of results to a {@link ResultList} using {@link Results#newResultList(List)}.</li>
     * </ol>
     *
     * @param n The number of items to recommend. If this is negative, then recommend all possible items.
     * @param items The items to score.
     * @return A {@link ResultList} containing the recommended items and their scores.
     */
    private ResultList recommendItems(int n, LongSet items) {
        List<Result> results = new ArrayList<>();

        // TODO Find the top N items by mean rating

        return Results.newResultList(results);
    }
}
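
The `recommendItems` Javadoc describes a top-N selection: score each candidate that has a mean, skip the rest, sort by decreasing score, and keep at most `n` results. Below is a minimal sketch of that selection in plain Java, not the assignment solution. `TopNSketch` and `topN` are hypothetical names, a `Map<Long, Double>` stands in for `ItemMeanModel`, and `Map.Entry` stands in for LensKit's `Result`. The tie-break on item ID is an added assumption to keep the output deterministic; the source does not specify one.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-alone illustration; not part of LensKit or the project template. */
final class TopNSketch {
    private TopNSketch() {}

    /**
     * Return up to n (item, mean) pairs in decreasing order of mean rating.
     * A negative n means "return every candidate that can be scored".
     */
    static List<Map.Entry<Long, Double>> topN(int n, Set<Long> candidates, Map<Long, Double> means) {
        List<Map.Entry<Long, Double>> scored = new ArrayList<>();
        for (Long item : candidates) {
            Double mean = means.get(item);
            // Items without a known mean cannot be scored, so they are skipped.
            if (mean != null) {
                scored.add(new AbstractMap.SimpleImmutableEntry<>(item, mean));
            }
        }
        // Highest mean first; ties are broken by ascending item ID (an assumption).
        scored.sort((a, b) -> {
            int byScore = Double.compare(b.getValue(), a.getValue());
            return byScore != 0 ? byScore : Long.compare(a.getKey(), b.getKey());
        });
        if (n >= 0 && n < scored.size()) {
            return new ArrayList<>(scored.subList(0, n));
        }
        return scored;
    }
}
```

In the LensKit class itself, the same shape would presumably add `Results.create(item, mean)` objects to the `results` list and pass the sorted, truncated list to `Results.newResultList(results)`, as the Javadoc steps describe.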