{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "GgZvzYp0_ltO" }, "source": [ "# Quantile Regression with ε-tolerance\n", "[ReHLine documentation](https://rehline-python.readthedocs.io/en/latest/)\n", "\n", "Quantile Regression with ε-tolerance solves the following optimization problem:\n", "\n", "$$\n", "\\min_{\\beta \\in \\mathbb{R}^d}\n", "\\sum_{i=1}^n\n", "\\left(\\rho_{\\kappa}\\!\\left(y_i - \\mathbf{x}_i^\\top \\beta\\right)-\\epsilon\\right)_+\n", "+ \\frac{\\lambda}{2}\\|\\beta\\|^2\n", "$$\n", "\n", "where\n", "\n", "- $\\rho_{\\kappa}(r)=r\\cdot\\big(\\kappa-\\mathbf{1}(r<0)\\big)$ is the check loss (quantile loss),\n", "- $\\mathbf{x}_i\\in\\mathbb{R}^d$ is a feature vector,\n", "- $y_i\\in\\mathbb{R}$ is a continuous response variable,\n", "- $\\kappa\\in(0,1)$ is the quantile level,\n", "- $\\epsilon\\ge 0$ is the tolerance parameter,\n", "- $\\lambda>0$ is the ridge penalty weight.\n", "\n", "> **Note.** Since the ε-tolerant check loss is a piecewise linear-quadratic (PLQ) function, we can optimize it using `rehline.plq_Ridge_Regressor`.\n", "> This wrapper adapts `plqERM_Ridge` into a regressor compatible with the scikit-learn API." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "id": "OYf6ieCxG5Is" }, "outputs": [], "source": [ "# Install rehline\n", "%pip install rehline -q" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "id": "54zIQq6-Bqm8" }, "outputs": [], "source": [ "# Simulate heteroscedastic data: the noise scale grows with |x|\n", "import numpy as np\n", "\n", "np.random.seed(42)\n", "n = 2000\n", "x = np.random.randn(n)\n", "noise = np.random.randn(n) * (0.3 + 0.5 * np.abs(x))\n", "y = 2 * x + noise\n", "X = x.reshape(-1, 1)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 97 }, "id": "LpdGRLz3CKFM", "outputId": "1ebabd56-fe21-4cfc-9a09-109e5cf577e6" }, "outputs": [ { "data": { "text/html": [ "
plq_Ridge_Regressor(C=0.005,\n",
" loss={'epsilon': 0.1, 'name': 'check_eps', 'qt': 0.9})\n",
"plq_Ridge_Regressor(C=0.005,\n",
" loss={'epsilon': 0.2, 'name': 'check_eps', 'qt': 0.95})\n",
"Pipeline(steps=[('scaler', StandardScaler()),\n",
" ('reg',\n",
" plq_Ridge_Regressor(C=0.005,\n",
" loss={'epsilon': 0.3, 'name': 'check_eps',\n",
" 'qt': 0.95}))])