Getting Started
===============

This page provides a starter example to introduce users to the ``rehline`` package and showcase its primary features, facilitating exploration and familiarization.

To proceed, ensure that you have already installed ``rehline``:

.. code:: bash

    pip install rehline

--------------------------------

``rehline`` is a versatile solver for machine learning problems, particularly effective for Empirical Risk Minimization (ERM) with `non-smooth` objectives. We will use ERM as our starting example to demonstrate that:

.. admonition:: Note
   :class: tip

   With ``rehline``, you can easily transform different `loss functions` and add `constraints` to your ERM with no tears!

Let's begin by generating a toy dataset and splitting it into training and test sets using scikit-learn's `make_regression`.

.. code:: python

    # Import necessary libraries
    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler

    np.random.seed(1024)
    # Generate toy data
    n, d = 1000, 5
    scaler = StandardScaler()
    X, y = make_regression(n_samples=n, n_features=d, noise=1.0)
    # Normalize X and add intercept
    X = scaler.fit_transform(X)
    X = np.hstack((X, np.ones((n, 1))))

    # Split data into training and test sets
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=50)

Quantile Regression
-------------------

Next, let's use ``rehline`` to fit a quantile regression (QR) at quantile level 0.95 (:math:`\kappa=0.95`).

The ridge-regularized QR solves the following optimization problem:

.. math::

    \min_{\beta \in \mathbb{R}^{d}} \ C \sum_{i=1}^n  \rho_\kappa ( y_i - x_i^\intercal \beta ) + \frac{1}{2} \| \beta \|^2,

where :math:`\rho_\kappa(u) = u \cdot (\kappa - \mathbf{1}(u < 0))` is the `check loss`, :math:`x_i \in \mathbb{R}^d` is a feature vector, and :math:`y_i \in \mathbb{R}` is the response variable.

Since the `check loss` is a piecewise linear quadratic function (PLQ), it can be solved using ``rehline.plqERM_Ridge``:

.. code:: python

    from rehline import plqERM_Ridge
    # Define a QR estimator
    clf = plqERM_Ridge(loss={'name': 'QR', 'qt': 0.95}, C=1.0)
    clf.fit(X=X_train, y=y_train)
    # Make predictions
    q_predict = clf.decision_function(X_test)

    # Plot results
    import matplotlib.pyplot as plt
    plt.scatter(x=X_test[:, 0], y=y_test, label='y_true')
    plt.scatter(x=X_test[:, 0], y=q_predict, alpha=0.5, label='q_95')
    plt.legend(loc="upper left")
    plt.show()

Huber Regression
----------------

If you prefer Huber regression, it is also a PLQ function.

The ridge-regularized Huber minimization solves the following optimization problem:

.. math::

    \min_{\mathbf{\beta}} C \sum_{i=1}^n H_\kappa( y_i - \mathbf{x}_i^\intercal \mathbf{\beta} ) + \frac{1}{2} \| \mathbf{\beta} \|_2^2,

where :math:`H_\kappa(\cdot)` is the Huber loss defined as follows:

.. math::
    \begin{equation*}
    H_\kappa(z) =
    \begin{cases}
    z^2/2, & 0 < |z| \leq \kappa, \\
    \kappa ( |z| - \kappa/2 ), & |z| > \kappa.
    \end{cases}
    \end{equation*}

.. code:: python

    from rehline import plqERM_Ridge
    # Define a Huber estimator
    clf = plqERM_Ridge(loss={'name': 'huber', 'tau': 0.5}, C=1.0)
    clf.fit(X=X_train, y=y_train)
    # Make predictions
    y_huber = clf.decision_function(X_test)

    # Plot results
    import matplotlib.pyplot as plt
    plt.scatter(x=X_test[:, 0], y=y_test, label='y_true')
    plt.scatter(x=X_test[:, 0], y=y_huber, alpha=0.5, label='y_huber')
    plt.legend(loc="upper left")
    plt.show()

Fairness Constraints
--------------------

You have now learned that the fitted Huber regression requires a fairness constraint for the first feature :math:`\mathbf{X}_{1}`. Specifically, the correlation between the predicted :math:`\hat{Y}` and :math:`\mathbf{X}_{1}` must be less than `tol=0.1`, that is,

.. math::

    \min_{\mathbf{\beta}} C \sum_{i=1}^n H_\kappa( y_i - \mathbf{x}_i^\intercal \mathbf{\beta} ) + \frac{1}{2} \| \mathbf{\beta} \|_2^2, \quad \text{s.t.} \quad \Big | \frac{1}{n} \sum_{i=1}^n \mathbf{z}_i \mathbf{\beta}^\intercal \mathbf{x}_i \Big| \leq \mathbf{\rho}

With `rehline`, you can easily add a `fairness constraint` to your ERM.

.. code:: python

    from rehline import plqERM_Ridge
    from scipy.stats import pearsonr
    # Define a Huber estimator with fairness constraint
    clf = plqERM_Ridge(loss={'name': 'huber', 'tau': 0.5},
                       constraint=[{'name': 'fair', 'sen_idx': [0], 'tol_sen': 0.1}],
                       C=1.0,
                       max_iter=10000)
    clf.fit(X=X_train, y=y_train)
    # Make predictions
    y_huber_fair = clf.decision_function(X_test)

    # Plot results
    import matplotlib.pyplot as plt
    plt.scatter(x=X_test[:, 0], y=y_test, label='y_true')
    plt.scatter(x=X_test[:, 0], y=y_huber, alpha=0.5, label='y_huber')
    plt.scatter(x=X_test[:, 0], y=y_huber_fair, alpha=0.5, label='y_huber_fair')
    plt.legend(loc="upper left")
    plt.show()

.. nblinkgallery::
   :caption: Related Examples
   :name: rst-link-gallery

   examples/QR.ipynb