{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "MFDBSDOzXDVt" }, "source": [ "# Ridge Quantile Regression\n", "\n", "[ReHLine documentation](https://rehline-python.readthedocs.io/en/latest/)\n", "\n", "Ridge-regularized quantile regression solves the following optimization problem:\n", "\n", "$$\n", "\\min_{\\mathbf{\\beta} \\in \\mathbb{R}^d} C \\sum_{i=1}^n \\rho_\\kappa (y_i - \\mathbf{x}_i^\\top \\mathbf{\\beta}) + \\frac{1}{2} \\|\\mathbf{\\beta}\\|_2^2,\n", "$$\n", "\n", "where $\\rho_\\kappa(u) = u \\cdot (\\kappa - \\mathbf{1}(u < 0))$ is the check loss at quantile level $\\kappa \\in (0, 1)$, $C > 0$ weights the loss against the ridge penalty, $\\mathbf{x}_i \\in \\mathbb{R}^d$ is a feature vector, and $y_i \\in \\mathbb{R}$ is the response variable." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> **Note.** Since the check loss is a piecewise linear-quadratic (PLQ) function, this problem can be solved with `rehline.plq_Ridge_Regressor`.\n", "> This wrapper adapts `plqERM_Ridge` into a regressor compatible with the scikit-learn API." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "rTzmChto1ltC" }, "outputs": [], "source": [ "## install rehline\n", "%pip install rehline -q" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "id": "BWLs6_gP1YHE" }, "outputs": [], "source": [ "## simulate data\n", "import numpy as np\n", "from sklearn.datasets import make_regression\n", "from sklearn.preprocessing import StandardScaler\n", "\n", "scaler = StandardScaler()\n", "\n", "## n samples, d features\n", "n, d = 10000, 5\n", "X, y = make_regression(n_samples=n, n_features=d, noise=1.0, random_state=42)\n", "## standardize the features and rescale the response to unit standard deviation\n", "X = scaler.fit_transform(X)\n", "y = y / y.std()" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 80 }, "id": "L8KI3Odl4l_Z", "outputId": "1d6664a6-aef8-4bc4-dbda-9e07a1eee914" }, "outputs": [ { "data": { "text/html": [ "
Pipeline(steps=[('scaler', StandardScaler()),\n",
" ('reg',\n",
" plq_Ridge_Regressor(C=0.001,\n",
" loss={'name': 'QR', 'qt': 0.95}))])