ReHLine: Matrix Factorization

This tutorial shows how to perform matrix factorization (MF) with a range of PLQ loss functions using ReHLine.

Mathematical Formulation

Given a User-Item-Rating triplet dataset \((u, i, r_{ui})\) derived from a target sparse matrix, the corresponding optimization problem is:

\[\begin{split}\min_{\substack{ \mathbf{P} \in \mathbb{R}^{n \times k}\ \pmb{\alpha} \in \mathbb{R}^n \\ \mathbf{Q} \in \mathbb{R}^{m \times k}\ \pmb{\beta} \in \mathbb{R}^m }} \left[ \sum_{(u,i)\in \Omega} C \cdot \text{PLQ}(r_{ui}, \ \mathbf{p}_u^T \mathbf{q}_i + \alpha_u + \beta_i) \right] + \left[ \frac{\rho}{n}\sum_{u=1}^n(\|\mathbf{p}_u\|_2^2 + \alpha_u^2) + \frac{1-\rho}{m}\sum_{i=1}^m(\|\mathbf{q}_i\|_2^2 + \beta_i^2) \right]\end{split}\]
\[\begin{split}\ \text{ s.t. } \ \mathbf{A}_{\text{user}} \begin{pmatrix} \alpha_u \\ \mathbf{p}_u \end{pmatrix} + \mathbf{b}_{\text{user}} \geq \mathbf{0},\ u = 1,\dots,n \quad \text{and} \quad \mathbf{A}_{\text{item}} \begin{pmatrix} \beta_i \\ \mathbf{q}_i \end{pmatrix} + \mathbf{b}_{\text{item}} \geq \mathbf{0},\ i = 1,\dots,m\end{split}\]

where

  • \(\text{PLQ}(\cdot , \cdot)\) is a convex piecewise linear-quadratic loss function. You can find built-in loss functions in the Loss section.

  • \(\mathbf{A}_{\text{user}}\) is a \(d \times (k+1)\) matrix and \(\mathbf{b}_{\text{user}}\) is a \(d\)-dimensional vector; together they represent \(d\) linear constraints on the user-side parameters. See Constraints for more details.

  • \(\mathbf{A}_{\text{item}}\) is a \(d \times (k+1)\) matrix and \(\mathbf{b}_{\text{item}}\) is a \(d\)-dimensional vector; together they represent \(d\) linear constraints on the item-side parameters. See Constraints for more details.

  • \(\Omega\) is the set of user-item pairs \((u, i)\) observed in the training data

  • \(n\) is the number of users, \(m\) is the number of items

  • \(k\) is the length of the latent factors (the rank of the MF)

  • \(C\) is the regularization parameter; it scales the loss term, so a smaller \(C\) gives relatively stronger regularization. \(\rho\) balances the regularization strength between users and items

  • \(\mathbf{p}_u\) and \(\alpha_u\) are the latent vector and individual bias of the u-th user. Specifically, \(\mathbf{p}_u\) is the u-th row of \(\mathbf{P}\), and \(\alpha_u\) is the u-th element of \(\pmb{\alpha}\)

  • \(\mathbf{q}_i\) and \(\beta_i\) are the latent vector and individual bias of the i-th item. Specifically, \(\mathbf{q}_i\) is the i-th row of \(\mathbf{Q}\), and \(\beta_i\) is the i-th element of \(\pmb{\beta}\)
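To make the notation concrete, the objective above can be evaluated directly with NumPy. The following is a minimal sketch, not ReHLine's implementation: the function name mf_objective is hypothetical, the absolute loss stands in for the generic PLQ term, the constraints are ignored, and X is assumed to hold 0-based integer (user, item) index pairs.

```python
import numpy as np

def mf_objective(P, Q, alpha, beta, X, y, C=1.0, rho=0.5):
    """Evaluate the regularized MF objective, using absolute loss as the PLQ term."""
    n, m = P.shape[0], Q.shape[0]
    users, items = X[:, 0], X[:, 1]
    # Prediction for each observed pair: p_u^T q_i + alpha_u + beta_i
    pred = np.sum(P[users] * Q[items], axis=1) + alpha[users] + beta[items]
    # Data-fitting term: C times the sum of |r_ui - prediction| over observed pairs
    loss = C * np.sum(np.abs(y - pred))
    # Ridge penalties, weighted by rho/n (users) and (1 - rho)/m (items)
    user_pen = rho / n * (np.sum(P ** 2) + np.sum(alpha ** 2))
    item_pen = (1.0 - rho) / m * (np.sum(Q ** 2) + np.sum(beta ** 2))
    return loss + user_pen + item_pen

# Toy example: 2 users, 3 items, rank 2
P = np.array([[1.0, 0.0], [0.0, 1.0]])
Q = np.array([[1.0, 1.0], [2.0, 0.0], [0.0, 2.0]])
alpha = np.array([0.5, -0.5])
beta = np.zeros(3)
X = np.array([[0, 0], [0, 1], [1, 2]])   # observed (user, item) index pairs
y = np.array([2.0, 2.0, 1.0])            # observed ratings
obj = mf_objective(P, Q, alpha, beta, X, y, C=1.0, rho=0.5)
print(f"objective = {obj:.4f}")
```

In this toy case each of the three predictions is off by 0.5, so the loss term is 1.5; the remainder of the objective comes from the two ridge penalties.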

Implementation Guide

A simple synthetic dataset is used for illustration. The implementation can be easily adapted to your specific triplet data, allowing you to experiment with various loss functions.

Setup

To proceed, ensure that you have already installed rehline:

pip install rehline

Basic Usage

# 1. Necessary Packages
import numpy as np
from rehline import plqMF_Ridge, make_mf_dataset
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error


# 2. Data Preparation
# Generate synthetic data (replace with your own data in practice)
user_num, item_num = 1200, 4000
ratings = make_mf_dataset(n_users=user_num, n_items=item_num,
                          n_interactions=50000, seed=42)

# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    ratings['X'], ratings['y'], test_size=0.3, random_state=42)


# 3. Model Construction
clf = plqMF_Ridge(
    C=0.001,                             ## Regularization strength
    rank=6,                              ## Latent factor dimension
    loss={'name': 'mae'},                ## Use absolute loss
    n_users=user_num,                    ## Number of users
    n_items=item_num,                    ## Number of items
)
clf.fit(X_train, y_train)


# 4. Evaluation
y_pred = clf.decision_function(X_test)
mae_score = mean_absolute_error(y_test, y_pred)
print(f"Test MAE: {mae_score:.3f}")

Advanced Configuration

Choose a different loss function through the loss parameter:

# Square loss
clf_mse = plqMF_Ridge(
    C=0.001,
    rank=6,
    loss={'name': 'mse'},                ## Choose square loss
    n_users=user_num,
    n_items=item_num,
)

# Hinge loss (suitable for binary data)
clf_hinge = plqMF_Ridge(
    C=0.001,
    rank=6,
    loss={'name': 'hinge'},              ## Choose hinge loss
    n_users=user_num,
    n_items=item_num,
)
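For reference, the three losses named above can be written out in their standard textbook forms. This is only a sketch: the function names are hypothetical, the exact scaling used inside ReHLine may differ (see the Loss section), and the hinge form shown assumes labels coded as -1/+1.

```python
import numpy as np

def mae_loss(r, f):
    # Absolute loss: |r - f|
    return np.abs(r - f)

def mse_loss(r, f):
    # Square loss: (r - f)^2
    return (r - f) ** 2

def hinge_loss(r, f):
    # Hinge loss: max(0, 1 - r * f), with r assumed to be in {-1, +1}
    return np.maximum(0.0, 1.0 - r * f)

r = np.array([1.0, -1.0, 1.0])   # targets (binary labels for the hinge case)
f = np.array([0.5, 0.5, 2.0])    # model outputs p_u^T q_i + alpha_u + beta_i
print("mae  :", mae_loss(r, f))
print("mse  :", mse_loss(r, f))
print("hinge:", hinge_loss(r, f))
```

All three are convex and piecewise linear or quadratic in the model output, which is what makes them PLQ losses.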

Linear constraints can be applied via constraint_user and constraint_item:

# Implement a linear constraint
clf_nonnegative = plqMF_Ridge(
    C=0.001,
    rank=6,
    loss={'name': 'mae'},
    n_users=user_num,
    n_items=item_num,
    constraint_user=[{'name': '>=0'}],   ## Use nonnegative constraint
    constraint_item=[{'name': '>=0'}],
)
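In the notation of the mathematical formulation, a nonnegativity constraint on a user's parameters can be read as \(\mathbf{A}_{\text{user}} = \mathbf{I}_{k+1}\) and \(\mathbf{b}_{\text{user}} = \mathbf{0}\). The sketch below checks feasibility under that reading; it assumes the constraint covers both the bias and the latent vector, and the helper is_feasible is hypothetical, not part of ReHLine.

```python
import numpy as np

k = 6                                  # rank, as in the example above
A_user = np.eye(k + 1)                 # one constraint row per parameter
b_user = np.zeros(k + 1)

def is_feasible(A, b, bias, latent, tol=1e-10):
    """Check A @ (bias, latent) + b >= 0 elementwise, up to a small tolerance."""
    theta = np.concatenate(([bias], latent))
    return bool(np.all(A @ theta + b >= -tol))

ok = is_feasible(A_user, b_user, 0.2, np.full(k, 0.1))
bad = is_feasible(A_user, b_user, 0.2, np.array([0.1, -0.3, 0.0, 0.0, 0.0, 0.0]))
print(ok, bad)
```

The same check applies on the item side with \(\beta_i\) and \(\mathbf{q}_i\) in place of \(\alpha_u\) and \(\mathbf{p}_u\).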

The algorithm includes the bias terms \(\pmb{\alpha}\) and \(\pmb{\beta}\) by default. To disable them, that is, to fix \(\pmb{\alpha} = \mathbf{0}\) and \(\pmb{\beta} = \mathbf{0}\), set biased=False:

# Exclude user and item biases
clf_unbiased = plqMF_Ridge(
    C=0.001,
    rank=6,
    loss={'name': 'mae'},
    n_users=user_num,
    n_items=item_num,
    biased=False,                        ## Disable bias terms
)

Impose different regularization strengths on users and items through rho:

# Imbalanced penalty
clf_asymmetric = plqMF_Ridge(
    C=0.001,
    rank=6,
    loss={'name': 'mae'},
    n_users=user_num,
    n_items=item_num,
    rho=0.7,                             ## Add heavier penalties for user parameters
)
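The effect of rho is easiest to see from the penalty weights in the objective: each user's parameters are weighted by \(\rho / n\) and each item's by \((1-\rho) / m\). The short calculation below compares rho=0.5 with rho=0.7 for the dataset sizes used in this tutorial; the helper penalty_weights is hypothetical and only restates the formula.

```python
def penalty_weights(rho, n_users, n_items):
    """Per-user and per-item ridge weights from the objective: rho/n and (1 - rho)/m."""
    return rho / n_users, (1.0 - rho) / n_items

for rho in (0.5, 0.7):
    w_user, w_item = penalty_weights(rho, n_users=1200, n_items=4000)
    print(f"rho={rho}: user weight={w_user:.2e}, "
          f"item weight={w_item:.2e}, ratio={w_user / w_item:.2f}")
```

Because the weights are also divided by the number of users and items, the per-user weight already exceeds the per-item weight at rho=0.5 when there are fewer users than items; raising rho widens that gap.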

Parameter Tuning

The model complexity is mainly controlled by C and rank.

for C_value in [0.0002, 0.001, 0.005]:
    clf = plqMF_Ridge(
        C=C_value,                       ## Try different regularization strengths
        rank=6,
        loss={'name': 'mae'},
        n_users=user_num,
        n_items=item_num,
    )
    clf.fit(X_train, y_train)
    y_pred = clf.decision_function(X_test)
    mae = mean_absolute_error(y_test, y_pred)
    print(f"C={C_value}: MAE = {mae:.3f}")


for rank_value in [4, 8, 12]:
    clf = plqMF_Ridge(
        C=0.001,
        rank=rank_value,                 ## Try different latent factor dimensions
        loss={'name': 'mae'},
        n_users=user_num,
        n_items=item_num,
    )
    clf.fit(X_train, y_train)
    y_pred = clf.decision_function(X_test)
    mae = mean_absolute_error(y_test, y_pred)
    print(f"rank={rank_value}: MAE = {mae:.3f}")
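The loops above score each setting on the test set, which is fine for illustration. For model selection it is usually safer to tune C and rank jointly on a separate validation split and keep the test set for the final estimate. The sketch below is one possible way to do this; grid_search and ToyModel are hypothetical names, and ToyModel is only a stand-in that mimics the fit / decision_function interface so the example runs without ReHLine.

```python
import itertools
import numpy as np

def grid_search(make_model, grid, X_tr, y_tr, X_val, y_val):
    """Return (best_params, best_mae) over the Cartesian product of `grid`."""
    best_params, best_mae = None, np.inf
    names = list(grid)
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        model = make_model(**params)
        model.fit(X_tr, y_tr)
        # Validation MAE for this parameter combination
        mae = float(np.mean(np.abs(y_val - model.decision_function(X_val))))
        if mae < best_mae:
            best_params, best_mae = params, mae
    return best_params, best_mae

class ToyModel:
    """Hypothetical stand-in estimator: predicts a shrunken training mean."""
    def __init__(self, C, rank):
        self.C, self.rank = C, rank
    def fit(self, X, y):
        # Smaller C shrinks the prediction more strongly towards zero
        self.value_ = np.mean(y) * self.C / (self.C + 1.0)
        return self
    def decision_function(self, X):
        return np.full(len(X), self.value_)

X_tr, y_tr = np.zeros((2, 2), dtype=int), np.array([4.0, 4.0])
X_val, y_val = np.zeros((2, 2), dtype=int), np.array([3.0, 3.0])
best_params, best_mae = grid_search(
    ToyModel, {"C": [1.0, 3.0, 100.0], "rank": [4]}, X_tr, y_tr, X_val, y_val)
print(best_params, best_mae)
```

With ReHLine, make_model could be a function such as lambda C, rank: plqMF_Ridge(C=C, rank=rank, loss={'name': 'mae'}, n_users=user_num, n_items=item_num), with the validation split obtained by applying train_test_split a second time to X_train and y_train.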

Practical Guidance

  • The first column of X corresponds to users, and the second column corresponds to items. Make sure the indices in these columns are consistent with the n_users and n_items parameters.

  • The default penalty strength is relatively weak. Because C scales the loss term, a smaller C gives relatively stronger regularization, so it is recommended to start with a relatively small C value.

  • When using larger C values, consider increasing max_iter to avoid a ConvergenceWarning.
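When your own data uses raw identifiers (names, product codes) rather than indices, the two columns of X need to be built first. The following sketch shows one way to do that with NumPy; it assumes the model expects 0-based contiguous integer indices, which you should verify against the API documentation, and all variable names and values are made up for illustration.

```python
import numpy as np

# Hypothetical raw triplets: user IDs, item IDs, ratings
raw_users = np.array(["ann", "bob", "ann", "cid"])
raw_items = np.array([101, 205, 205, 101])
ratings = np.array([4.0, 3.0, 5.0, 2.0])

# Map each distinct raw ID to an index in 0..n-1 (sorted order of the IDs)
user_ids, user_idx = np.unique(raw_users, return_inverse=True)
item_ids, item_idx = np.unique(raw_items, return_inverse=True)

# First column: user index, second column: item index
X = np.column_stack((user_idx, item_idx))
y = ratings
n_users, n_items = len(user_ids), len(item_ids)
print(X)
print(n_users, n_items)
```

Keeping user_ids and item_ids around lets you translate indices back to the original identifiers after prediction.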

Example