Getting Started

This page walks through a starter example that introduces the rehline package and showcases its main features.

To proceed, ensure that you have already installed rehline:

pip install rehline

rehline is a versatile solver for machine learning problems, particularly effective for Empirical Risk Minimization (ERM) with non-smooth objectives. We use ERM as the starting example to illustrate the following point:

Note

With rehline, you can easily transform different loss functions and add constraints to your ERM with no tears!

Let’s begin by generating a toy dataset with scikit-learn’s make_regression and splitting it into training and test sets with train_test_split.

# Import necessary libraries
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

np.random.seed(1024)
# Generate toy data
n, d = 1000, 5
scaler = StandardScaler()
X, y = make_regression(n_samples=n, n_features=d, noise=1.0)
# Normalize X and add intercept
X = scaler.fit_transform(X)
X = np.hstack((X, np.ones((n, 1))))

# Split data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=50)

Quantile Regression

Next, let’s use rehline to fit a quantile regression (QR) at quantile level 0.95 (\(\kappa=0.95\)).

The ridge-regularized QR solves the following optimization problem:

\[\min_{\beta \in \mathbb{R}^{d}} \ C \sum_{i=1}^n \rho_\kappa ( y_i - x_i^\intercal \beta ) + \frac{1}{2} \| \beta \|^2,\]

where \(\rho_\kappa(u) = u \cdot (\kappa - \mathbf{1}(u < 0))\) is the check loss, \(x_i \in \mathbb{R}^d\) is a feature vector, and \(y_i \in \mathbb{R}\) is the response variable.
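To build intuition for the check loss, the following minimal sketch evaluates \(\rho_\kappa\) in plain Python and checks that, for a constant predictor, minimizing the summed check loss picks out an empirical \(\kappa\)-quantile. The helper names check_loss and total_check_loss and the toy responses are illustrative assumptions; they are not part of rehline.

```python
def check_loss(u, kappa):
    """Check loss rho_kappa(u) = u * (kappa - 1(u < 0))."""
    indicator = 1.0 if u < 0 else 0.0
    return u * (kappa - indicator)


def total_check_loss(c, ys, kappa):
    """Summed check loss of the constant prediction c on responses ys."""
    return sum(check_loss(y - c, kappa) for y in ys)


kappa = 0.95
# Positive residuals are weighted by kappa, negative ones by (1 - kappa)
print(check_loss(2.0, kappa), check_loss(-2.0, kappa))

# Toy responses 1, 2, ..., 99; search for the best constant among the observed values
ys = [float(v) for v in range(1, 100)]
best_c = min(ys, key=lambda c: total_check_loss(c, ys, kappa))
print(best_c)
```

At \(\kappa=0.95\), an under-prediction costs 19 times as much as an over-prediction of the same size, so the best constant sits near the top of the sample (95.0 for these toy responses).
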

Since the check loss is a piecewise linear-quadratic (PLQ) function, this problem can be solved using rehline.plqERM_Ridge:

from rehline import plqERM_Ridge
# Define a QR estimator
clf = plqERM_Ridge(loss={'name': 'QR', 'qt': 0.95}, C=1.0)
clf.fit(X=X_train, y=y_train)
# Make predictions
q_predict = clf.decision_function(X_test)

# Plot results
import matplotlib.pyplot as plt
plt.scatter(x=X_test[:, 0], y=y_test, label='y_true')
plt.scatter(x=X_test[:, 0], y=q_predict, alpha=0.5, label='q_95')
plt.legend(loc="upper left")
plt.show()

Huber Regression

Huber regression can be fitted in the same way, since the Huber loss is also a PLQ function.

The ridge-regularized Huber minimization solves the following optimization problem:

\[\min_{\mathbf{\beta}} C \sum_{i=1}^n H_\kappa( y_i - \mathbf{x}_i^\intercal \mathbf{\beta} ) + \frac{1}{2} \| \mathbf{\beta} \|_2^2,\]

where \(H_\kappa(\cdot)\) is the Huber loss defined as follows:

\[H_\kappa(z) = \begin{cases} z^2/2, & |z| \leq \kappa, \\ \kappa ( |z| - \kappa/2 ), & |z| > \kappa. \end{cases}\]

The threshold \(\kappa\) is passed as the tau entry of the loss dictionary (tau=0.5 in the code below):

from rehline import plqERM_Ridge
# Define a Huber estimator
clf = plqERM_Ridge(loss={'name': 'huber', 'tau': 0.5}, C=1.0)
clf.fit(X=X_train, y=y_train)
# Make predictions
y_huber = clf.decision_function(X_test)

# Plot results
import matplotlib.pyplot as plt
plt.scatter(x=X_test[:, 0], y=y_test, label='y_true')
plt.scatter(x=X_test[:, 0], y=y_huber, alpha=0.5, label='y_huber')
plt.legend(loc="upper left")
plt.show()
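
As a sanity check on the definition of \(H_\kappa\), the sketch below evaluates the Huber loss in plain Python and confirms that its quadratic and linear pieces meet at \(|z| = \kappa\). The function name huber_loss is illustrative and not part of rehline, and the value 0.5 is assumed to play the same role as tau in the example above.

```python
def huber_loss(z, kappa):
    """Huber loss: quadratic for |z| <= kappa, linear beyond."""
    a = abs(z)
    if a <= kappa:
        return 0.5 * z * z
    return kappa * (a - 0.5 * kappa)


kappa = 0.5  # assumed to match tau in the example above
for z in (0.25, 0.5, 2.0, -2.0):
    print(z, huber_loss(z, kappa))

# The two pieces agree at |z| = kappa, so the loss is continuous there
quadratic_at_kappa = 0.5 * kappa * kappa
linear_at_kappa = kappa * (kappa - 0.5 * kappa)
print(quadratic_at_kappa == linear_at_kappa)
```

Large residuals grow only linearly in the loss, which is what makes the Huber fit less sensitive to outliers than least squares.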

Fairness Constraints

Suppose now that the fitted Huber regression must also satisfy a fairness constraint with respect to the first feature \(\mathbf{X}_{1}\). Specifically, the absolute correlation between the prediction \(\hat{Y}\) and \(\mathbf{X}_{1}\) must not exceed a tolerance \(\rho = 0.1\), that is,

\[\min_{\mathbf{\beta}} C \sum_{i=1}^n H_\kappa( y_i - \mathbf{x}_i^\intercal \mathbf{\beta} ) + \frac{1}{2} \| \mathbf{\beta} \|_2^2, \quad \text{s.t.} \quad \Big | \frac{1}{n} \sum_{i=1}^n \mathbf{z}_i \mathbf{\beta}^\intercal \mathbf{x}_i \Big| \leq \rho,\]

where \(\mathbf{z}_i\) is the value of the sensitive feature for the \(i\)-th sample (here the first column of the feature matrix, selected by sen_idx) and \(\rho\) is the tolerance, set by tol_sen in the code below.

With rehline, you can easily add a fairness constraint to your ERM.

from rehline import plqERM_Ridge
from scipy.stats import pearsonr
# Define a Huber estimator with fairness constraint
clf = plqERM_Ridge(loss={'name': 'huber', 'tau': 0.5},
                   constraint=[{'name': 'fair', 'sen_idx': [0], 'tol_sen': 0.1}],
                   C=1.0,
                   max_iter=10000)
clf.fit(X=X_train, y=y_train)
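# Make predictions
y_huber_fair = clf.decision_function(X_test)
# Compare the correlation with the first feature before and after adding the constraint
corr_huber, _ = pearsonr(X_test[:, 0], y_huber)
corr_fair, _ = pearsonr(X_test[:, 0], y_huber_fair)
print('corr (huber):', corr_huber, '| corr (huber, fair):', corr_fair)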

# Plot results
import matplotlib.pyplot as plt
plt.scatter(x=X_test[:, 0], y=y_test, label='y_true')
plt.scatter(x=X_test[:, 0], y=y_huber, alpha=0.5, label='y_huber')
plt.scatter(x=X_test[:, 0], y=y_huber_fair, alpha=0.5, label='y_huber_fair')
plt.legend(loc="upper left")
plt.show()
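
As a rough check of what the constraint measures, the sketch below evaluates its left-hand side, \(\big| \frac{1}{n} \sum_i \mathbf{z}_i \hat{y}_i \big|\), for given predictions \(\hat{y}_i = \mathbf{\beta}^\intercal \mathbf{x}_i\). The function name fairness_gap and the toy values are hypothetical. In practice, z and y_hat would presumably be the sensitive column of the training data and the corresponding values of clf.decision_function.

```python
def fairness_gap(z, y_hat):
    """Absolute value of (1/n) * sum_i z_i * y_hat_i, the quantity bounded by the constraint."""
    n = len(z)
    return abs(sum(zi * yi for zi, yi in zip(z, y_hat)) / n)


# Hypothetical centered sensitive feature and two sets of predictions
z = [-1.0, 0.0, 1.0, 0.0]
y_tracking = [-2.0, 0.1, 2.0, -0.1]  # moves together with z
y_flat = [0.3, -0.2, 0.3, -0.4]      # does not vary with z

tol = 0.1
for name, y_hat in (("tracking", y_tracking), ("flat", y_flat)):
    gap = fairness_gap(z, y_hat)
    # The constraint is satisfied when the gap does not exceed the tolerance
    print(name, gap, gap <= tol)
```

Predictions that move with the sensitive feature produce a large gap and violate the tolerance, while predictions that do not vary with it satisfy the constraint.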