Problems
Known Condition: 
Set C: users 
Set S: possible recommended items 
Utility function u: measures the usefulness of item s to user c 
Matrix R: user-item rating matrix
Problem: 
For each user c∈C, choose s'∈S that maximizes the user's utility.
Example: 
C = {Alice, Bob, Cindy, David}; 
S = {K-PAX, Life of Brian, Memento, Notorious};
R: 
A recommender system predicts the unknown ratings (marked Φ) in the matrix R.
| | K-PAX | Life of Brian | Memento | Notorious |
|---|---|---|---|---|
| Alice | 4 | 3 | 2 | 4 | 
| Bob | Φ | 4 | 5 | 5 | 
| Cindy | 2 | 2 | 4 | Φ | 
| David | 3 | Φ | 5 | 2 | 
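The example can be written down directly as an array. The sketch below is only an illustration: it stores the matrix R with `np.nan` in place of Φ, lists the unknown (user, item) pairs, and picks for each user the unrated item with the highest predicted utility. The item-mean predictor and the helper names `unknown_pairs` and `recommend` are assumptions made for this sketch, not part of the problem statement.

```python
import numpy as np

users = ["Alice", "Bob", "Cindy", "David"]
items = ["K-PAX", "Life of Brian", "Memento", "Notorious"]

# Rating matrix R from the example; np.nan stands for the unknown rating Φ.
R = np.array([
    [4.0, 3.0, 2.0, 4.0],
    [np.nan, 4.0, 5.0, 5.0],
    [2.0, 2.0, 4.0, np.nan],
    [3.0, np.nan, 5.0, 2.0],
])


def unknown_pairs(R):
    """Return (user index, item index) pairs whose rating is missing."""
    rows, cols = np.where(np.isnan(R))
    return list(zip(rows.tolist(), cols.tolist()))


def recommend(R_hat, R):
    """For each user, pick the unrated item with the highest predicted utility."""
    unrated = np.isnan(R)
    masked = np.where(unrated, R_hat, -np.inf)  # only unrated items compete
    best = masked.argmax(axis=1)
    has_unrated = unrated.any(axis=1)
    return {u: (int(s) if ok else None)
            for u, (s, ok) in enumerate(zip(best, has_unrated))}


# Placeholder predictor (an assumption for illustration): the item's mean rating.
item_mean = np.nanmean(R, axis=0)
R_hat = np.where(np.isnan(R), item_mean, R)

for u, s in recommend(R_hat, R).items():
    print(users[u], "->", items[s] if s is not None else "nothing left to recommend")
```

Any of the methods listed later in these notes could replace the item-mean predictor; only `R_hat` would change.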
Glossary
- Recommendation system (RS): attempts to suggest items that are likely to interest the users.
 
- Heuristic-based RS: defines the utility function directly and validates its performance empirically.
- Model-based RS: estimates the utility function by optimizing a performance criterion, such as the mean squared error.
- Content-based RS: The user is recommended items similar to the ones the user preferred in the past. Known limitations:
  - Limited content analysis
  - Overspecialization
  - New user problem
 
- Collaborative RS: The user is recommended items that people with similar tastes and preferences liked in the past. Known limitations:
  - New user problem
  - New item problem
  - Sparsity
 
- Hybrid RS: These methods combine collaborative and content-based methods.
- Utility function: a function that measures the usefulness of item s to user c. In a content-based system, the utility function is defined as:
 u(c, s) = score(ContentBasedProfile(c), Content(s)).
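One possible reading of the content-based utility function is sketched below. It assumes that Content(s) is a numeric feature vector (for example keyword weights), that ContentBasedProfile(c) is the rating-weighted average of the feature vectors of the items the user has rated, and that score is cosine similarity. The feature values and function names are hypothetical and chosen only for illustration.

```python
import numpy as np


def cosine(a, b):
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    na, nb = np.linalg.norm(a), np.linalg.norm(b)
    return 0.0 if na == 0 or nb == 0 else float(a @ b / (na * nb))


def content_based_profile(ratings, item_features):
    """Rating-weighted average of the feature vectors of the rated items.

    Assumes positive ratings (for example 1 to 5), so the weights never sum to zero.
    """
    rated = list(ratings)
    w = np.array([ratings[s] for s in rated], dtype=float)
    F = np.array([item_features[s] for s in rated], dtype=float)
    return w @ F / w.sum()


def utility(ratings, item, item_features):
    """u(c, s) = score(ContentBasedProfile(c), Content(s)) with score = cosine."""
    profile = content_based_profile(ratings, item_features)
    return cosine(profile, np.asarray(item_features[item], dtype=float))


# Hypothetical two-dimensional features: (thriller weight, comedy weight).
item_features = {
    "Memento": [1.0, 0.0],
    "Notorious": [0.9, 0.1],
    "Life of Brian": [0.0, 1.0],
    "K-PAX": [0.5, 0.5],
}
ratings = {"Memento": 5, "Life of Brian": 1}  # one user's past ratings

for s in ("Notorious", "K-PAX"):
    print(s, round(utility(ratings, s, item_features), 3))
```

With these made-up features the user's profile leans toward the first feature, so Notorious scores higher than K-PAX.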
Papers
Tutorial
| Recommendation Approach | Heuristic-based | Model-based | 
|---|---|---|
| Content-based | Commonly used techniques: * TF-IDF * Clustering Representative research examples: * Lang 1995 * Balabanovic 1997 * Pazzani & Billsus 1997 | Commonly used techniques: * Bayesian classifiers * Clustering * Decision trees * Artificial neural networks Representative research examples: * Pazzani & Billsus 1997 * Mooney 1998 * Billsus & Pazzani 1999, 2000 * Zhang et al. 2002 | 
| Collaborative | Commonly used techniques: * Nearest neighbor (cosine, correlation) * Clustering * Graph theory Representative research examples: * Resnick et al. 1994 * Hill et al. 1995 * Shardanand & Maes 1995 * Breese et al. 1998 * Nakamura & Abe 1998 * Aggarwal et al. 1999 * Delgado & Ishii 1999 * Pennock & Horvitz 1999 * Sarwar et al. 2001 | Commonly used techniques: * Bayesian classifiers * Clustering * Decision trees * Artificial neural networks * Linear regression * Probabilistic models Representative research examples: * Billsus & Pazzani 1998 * Breese et al. 1998 * Ungar & Foster 1998 * Chien & George 1999 * Getoor & Sahami 1999 * Pennock & Horvitz 1999 * Goldberg et al. 2001 * Kumar et al. 2001 * Pavlov & Pennock 2002 * Shani et al. 2002 * Yu et al. 2002, 2004 * Hofmann 2003, 2004 * Marlin 2003 * Si & Jin 2003 | 
| Hybrid | Combining content-based and collaborative components by: * Linear combination of predicted ratings * Various voting schemes * Incorporating one component as a part of the heuristic for the other Representative research examples: * Balabanovic & Shoham 1997 * Claypool et al. 1999 * Good et al. 1999 * Pazzani 1999 * Billsus & Pazzani 2000 * Tran & Cohen 2000 * Melville et al. 2002 | Combining content-based and collaborative components by: * Incorporating one component as a part of the model for the other * Building one unifying model Representative research examples: * Basu et al. 1998 * Condliff et al. 1999 * Soboroff & Nicholas 1999 * Ansari et al. 2000 * Popescul et al. 2001 * Schein et al. 2002 | 
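As an illustration of the "nearest neighbor (correlation)" entry in the collaborative, heuristic-based cell, here is a minimal sketch of user-based prediction on the example matrix R. It assumes Pearson correlation computed over co-rated items as the similarity, and a prediction equal to the user's mean rating plus the similarity-weighted average of the neighbors' deviations from their own means. Using all other users as neighbors and returning similarity 0 when fewer than two items are co-rated are simplifications chosen for this sketch.

```python
import numpy as np

# Example matrix R (rows: Alice, Bob, Cindy, David); np.nan marks an unknown rating.
R = np.array([
    [4.0, 3.0, 2.0, 4.0],
    [np.nan, 4.0, 5.0, 5.0],
    [2.0, 2.0, 4.0, np.nan],
    [3.0, np.nan, 5.0, 2.0],
])


def pearson(R, a, b):
    """Pearson correlation between users a and b over the items both have rated."""
    both = ~np.isnan(R[a]) & ~np.isnan(R[b])
    if both.sum() < 2:
        return 0.0
    x = R[a, both] - R[a, both].mean()
    y = R[b, both] - R[b, both].mean()
    denom = np.sqrt((x ** 2).sum() * (y ** 2).sum())
    return 0.0 if denom == 0 else float((x * y).sum() / denom)


def predict(R, c, s):
    """Predict the rating of user c for item s from all users who rated s."""
    mean_c = float(np.nanmean(R[c]))
    num = den = 0.0
    for other in range(R.shape[0]):
        if other == c or np.isnan(R[other, s]):
            continue
        w = pearson(R, c, other)
        num += w * (R[other, s] - np.nanmean(R[other]))
        den += abs(w)
    return mean_c if den == 0 else float(mean_c + num / den)


print("Predicted rating of Bob for K-PAX:", predict(R, 1, 0))
```

On such a small matrix most correlations rest on two or three co-rated items, so the numbers are not meaningful; the sketch only shows the mechanics.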
Dimensionality Reduction
- 
- Abstract: Presents two experiments on reducing the dimensionality of recommender system databases. One compares the effectiveness of the systems at predicting consumer preferences; the other compares their effectiveness at producing Top-N lists. The experiments show that dimensionality reduction has the potential to meet the challenges of recommender systems.
- Explore :
 
- 
- Abstract: Models the collaborative filtering task as a classification or regression problem in machine learning and applies SVD to reduce the dimensionality.
- Explore :
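A rough sketch of how SVD can reduce the dimensionality of a rating matrix is given below. It is not taken from either paper: it assumes one simple recipe (fill the unknown entries with item means, center each row by the user mean, keep the top k singular values, then add the user means back), and the function name `low_rank_predict` is hypothetical.

```python
import numpy as np

# Example matrix R; np.nan marks an unknown rating.
R = np.array([
    [4.0, 3.0, 2.0, 4.0],
    [np.nan, 4.0, 5.0, 5.0],
    [2.0, 2.0, 4.0, np.nan],
    [3.0, np.nan, 5.0, 2.0],
])


def low_rank_predict(R, k):
    """Rank-k reconstruction of the mean-filled, user-centered rating matrix."""
    item_mean = np.nanmean(R, axis=0)                  # one mean per column
    filled = np.where(np.isnan(R), item_mean, R)       # impute unknown entries
    user_mean = filled.mean(axis=1, keepdims=True)     # one mean per row
    U, s, Vt = np.linalg.svd(filled - user_mean, full_matrices=False)
    approx = (U[:, :k] * s[:k]) @ Vt[:k, :]            # keep top-k singular triplets
    return approx + user_mean


R_hat = low_rank_predict(R, k=2)
print(np.round(R_hat, 2))
```

The choice of k trades reconstruction accuracy against noise reduction; on real data it would be chosen on held-out ratings.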
 
Article
- SoRec: paper_cikm08_sorec_hao.pdf
Survey
Collaborative filtering system
Heuristic-based Methods
Model-based Methods
- Learning collaborative information filters (ICML, 1998) (pdf)
- Abstract: Models the collaborative filtering task as a classification or regression problem in machine learning and applies SVD to reduce the dimensionality. The training data, a sparse matrix of user ratings, is converted to Boolean feature vectors, resulting in a matrix filled with zeros and ones.
- Explore :
 
- Empirical analysis of predictive algorithms for collaborative filtering (UAI, 1998) (pdf)
- Abstract: Uses cluster models and Bayesian network models. The parameters are learned from training data and then used to predict the unknown ratings. Naive Bayes: $\Pr(C = c, v_1, \ldots, v_n) = \Pr(C = c) \prod_{i=1}^{n} \Pr(v_i \mid C = c)$
- Explore :
 
- Using Probabilistic Relational Models for Collaborative Filtering (WebKDD, 1999) (pdf)
- Abstract: Applies probabilistic relational models (PRMs), which are similar to entity-relationship models, to the system.
- Explore :
 
- Eigentaste: A Constant Time Collaborative Filtering Algorithm (Information Retrieval, 2001) (pdf)
- Abstract: Selects a gauge set (a set of items that every valid user has rated) and applies PCA to it for dimensionality reduction. Users are then clustered, and each new user is assigned to the corresponding cluster and recommended items.
- Explore: 1. The clustering method does not fit the multi-model case. 2. The gauge set may be hard to select. 3. When the gauge set is small, the recommendations are not accurate.
 
- Collaborative Filtering via Gaussian Probabilistic Latent Semantic Analysis (SIGIR, 2003) (pdf)
- Abstract:
- Explore :
 
- Modeling User Rating Profiles for Collaborative Filtering (NIPS, 2003) (pdf)
- Abstract:
- Explore :
 
- A Maximum Entropy Approach to Collaborative Filtering in Dynamic, Sparse, High-Dimensional Domains (NIPS, 2002) (pdf)
- Abstract:
- Explore :
 
- Clustering Methods for Collaborative Filtering (Technical Report, 1998) (pdf)
- Abstract:
- Explore :
 
- Probabilistic Matrix Factorization (PMF) (NIPS, 2008) (pdf), Introduction
- Abstract: PMF applies a probabilistic approach, with Gaussian assumptions on the known data and the factor matrices, to factor the rating matrix and predict the missing values. Experimental results show that PMF performs quite well.
- Superiority: Scales linearly with the number of observations and performs well on large, sparse, and imbalanced datasets.
 
- Bayesian probabilistic matrix factorization using Markov chain Monte Carlo (ICML, 2008) (pdf)
- Abstract: Presents a fully Bayesian treatment of the PMF model in which model capacity is controlled automatically by integrating over all model parameters and hyperparameters.
- Superiority: Higher prediction accuracy than PMF models.
 
- SoRec: Social Recommendation Using Probabilistic Matrix Factorization (CIKM, 2008) (pdf)
- Abstract:
- Superiority:
 
- Learning to Recommend with Social Trust Ensemble (SIGIR, 2009) (pdf)
- Abstract:
- Superiority:
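The PMF entry in the list above factors the rating matrix under Gaussian assumptions. With Gaussian noise on the observed ratings and zero-mean Gaussian priors on the factors, the MAP estimate minimizes a sum of squared errors over the observed entries plus quadratic penalties on the factors. The sketch below minimizes that objective with plain batch gradient descent on the small example matrix. The global-mean centering, the hyperparameter values, and the function name `pmf_fit` are assumptions made for this illustration, and the logistic mapping of ratings used in the paper is left out.

```python
import numpy as np

# Example matrix R; np.nan marks an unknown rating.
R = np.array([
    [4.0, 3.0, 2.0, 4.0],
    [np.nan, 4.0, 5.0, 5.0],
    [2.0, 2.0, 4.0, np.nan],
    [3.0, np.nan, 5.0, 2.0],
])


def pmf_fit(R, d=2, lam=0.1, lr=0.01, epochs=1000, seed=0):
    """MAP-style matrix factorization: R ≈ mu + U @ V.T on observed entries."""
    rng = np.random.default_rng(seed)
    n, m = R.shape
    mask = ~np.isnan(R)                      # indicator of observed ratings
    R0 = np.where(mask, R, 0.0)
    mu = R[mask].mean()                      # global mean (an assumption of this sketch)
    U = 0.1 * rng.standard_normal((n, d))    # user factors
    V = 0.1 * rng.standard_normal((m, d))    # item factors
    history = []                             # training RMSE per epoch
    for _ in range(epochs):
        err = mask * (R0 - mu - U @ V.T)     # residuals, zero where unobserved
        history.append(float(np.sqrt((err ** 2).sum() / mask.sum())))
        grad_U = -err @ V + lam * U          # gradient of the regularized objective
        grad_V = -err.T @ U + lam * V
        U -= lr * grad_U
        V -= lr * grad_V
    return U, V, mu, history


U, V, mu, history = pmf_fit(R)
R_hat = mu + U @ V.T
print("training RMSE: first epoch", round(history[0], 3), "last epoch", round(history[-1], 3))
print(np.round(R_hat, 2))
```

Each epoch costs time proportional to the matrix size here because the sketch uses dense arrays; an implementation over (user, item, rating) triplets would touch only the observed entries.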
 
Content-based Methods
Resources
PMF: Training PMF (MATLAB)
Datasets
- MovieLens :
- MovieData : Sample data that contains triplets (user_id, movie_id, rating).
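Triplet data of this kind can be turned into a user-item rating matrix in a few lines. The sketch below assumes the triplets are already loaded as Python tuples; the sample values and the function name `triplets_to_matrix` are made up for illustration and do not come from the dataset.

```python
import numpy as np


def triplets_to_matrix(triplets):
    """Build a dense rating matrix from (user_id, movie_id, rating) triplets.

    Unobserved entries are np.nan. Returns the matrix and the sorted id lists
    that map row and column indices back to user and movie ids.
    """
    users = sorted({u for u, _, _ in triplets})
    movies = sorted({m for _, m, _ in triplets})
    u_idx = {u: i for i, u in enumerate(users)}
    m_idx = {m: j for j, m in enumerate(movies)}
    R = np.full((len(users), len(movies)), np.nan)
    for u, m, r in triplets:
        R[u_idx[u], m_idx[m]] = r
    return R, users, movies


# Made-up sample triplets: (user_id, movie_id, rating).
sample = [(1, 10, 4.0), (1, 20, 3.0), (2, 10, 5.0)]
R, users, movies = triplets_to_matrix(sample)
print(users, movies)
print(R)
```

A dense array is only practical for small samples; for a dataset the size of MovieLens a sparse representation would be the more natural choice.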
works
List of some papers:
1. Relational learning via collective matrix factorization (Ajit P. Singh)
2. Locality preserving nonnegative matrix factorization (Deng Cai)
3. Relation regularized matrix factorization (Wu-Jun Li)
4. Modeling user rating profiles for collaborative filtering
5. Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions
6. Application of dimensionality reduction in recommender system: a case study
7. Collaborative filtering via Gaussian probabilistic latent semantic analysis
8. Item-based collaborative filtering recommendation algorithms
9. Maximum likelihood estimation of intrinsic dimension
10. Optimization algorithms in machine learning (Stephen Wright)
11. Global analytic solution for variational Bayesian matrix factorization
12. Variational Bayesian approach to movie rating prediction
13. Implicit regularization in variational Bayesian matrix factorization
14. Sparse inverse covariance estimation with the graphical lasso
15. Matrix factorization techniques for recommender systems
16. Probabilistic sparse matrix factorization
17. Learning with local and global consistency
18.
















 
  





