{ "cells": [ { "cell_type": "markdown", "source": [ "# Custom QR\n", "\n", "Custom QR solves the following optimization problem:\n", "\n", "$$\n", "\\min_{\\mathbf{\\beta} \\in \\mathbb{R}^d}\n", "C\\sum_{i=1}^n \\rho_{\\kappa}(y_i-\\mathbf{x}_i^\\top\\mathbf{\\beta})\n", "+\\frac{1}{2}\\|\\mathbf{\\beta}\\|_2^2,\n", "$$\n", "\n", "$$\n", "\\text{subject to} \\quad A\\mathbf{\\beta}+b \\ge 0,\n", "$$\n", "\n", "where:\n", "\n", "* $\\mathbf{x}_i \\in \\mathbb{R}^d$ is a feature vector\n", "* $y_i \\in \\mathbb{R}$ is a continuous response variable\n", "* $\\rho_{\\kappa}(u)=u\\big(\\kappa-\\mathbf{1}(u<0)\\big)$ is the quantile (check) loss at quantile level $\\kappa \\in (0, 1)$\n", "* $C > 0$ weights the check loss against the ridge penalty $\\frac{1}{2}\\|\\mathbf{\\beta}\\|_2^2$\n", "* $A \\in \\mathbb{R}^{K \\times d}$ and $b \\in \\mathbb{R}^{K}$ define $K$ custom linear constraints\n", "\n", "The custom constraints allow any prior knowledge that can be written as linear inequalities (e.g. sign, ordering, budget, or linear relation constraints) to be incorporated into quantile regression.\n", "\n", "> **Note.** Since the check loss is a piecewise linear-quadratic (PLQ) function and the constraints are linear ($A\\mathbf{\\beta}+b \\ge 0$), the problem can be solved with `rehline.plq_Ridge_Regressor`." 
], "metadata": { "id": "VMY5iF-Wgfm9" } }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "350gRKl8oZO8" }, "outputs": [], "source": [ "## install rehline\n", "%pip install rehline -q" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "id": "MbOVhGJWC1bM" }, "outputs": [], "source": [ "## simulate data\n", "from sklearn.datasets import make_regression\n", "from sklearn.preprocessing import StandardScaler\n", "import numpy as np\n", "\n", "scaler = StandardScaler()\n", "\n", "n, d = 10000, 5\n", "X, y = make_regression(n_samples=n, n_features=d, noise=1.0, random_state=42)\n", "X = scaler.fit_transform(X)\n", "y = y / y.std()" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "id": "FGHO9F1QZK5M" }, "outputs": [], "source": [ "# Example: beta_0 + beta_1 >= 3\n", "A = np.zeros((1, d))\n", "A[0, 0] = 1\n", "A[0, 1] = 1\n", "b = np.array([-3.0])" ] }, { "cell_type": "markdown", "metadata": { "id": "jAQHQ9HRcbzd" }, "source": [ "## QR as baseline" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 80 }, "id": "F4kiv_VqJF6t", "outputId": "1af75ee7-5a23-4909-8541-6ac79ad14768" }, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "plq_Ridge_Regressor(fit_intercept=False, max_iter=10000)" ], "text/html": [ "
plq_Ridge_Regressor(fit_intercept=False, max_iter=10000)
plq_Ridge_Regressor(constraint=[{'A': array([[1., 1., 0., 0., 0.]]),\n",
" 'b': array([-3.]), 'name': 'custom'}],\n",
" fit_intercept=False, max_iter=10000)