{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "ofNokKz6XDRq" }, "source": [ "# SVR\n", "\n", "[ReHLine documentation](https://rehline-python.readthedocs.io/en/latest/)\n", "\n", "SVR (Support Vector Regression) solves the following optimization problem:\n", "\n", "$$\n", "\\min_{\\beta \\in \\mathbb{R}^d}\n", "\\sum_{i=1}^n \\left(\\left|y_i-\\mathbf{x}_i^\\top \\beta\\right|-\\epsilon\\right)_+\n", "+\\frac{\\lambda}{2}\\|\\beta\\|^2\n", "$$\n", "\n", "where $\\mathbf{x}_i \\in \\mathbb{R}^d$ is a feature vector, $y_i \\in \\mathbb{R}$ is a continuous response variable, $(u)_+ = \\max(u, 0)$ is the positive part, $\\epsilon \\ge 0$ is the half-width of the insensitive tube, and $\\lambda > 0$ is the weight of the ridge penalty." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> **Note.** Since the $\\epsilon$-insensitive loss is a piecewise linear-quadratic (plq) function, we can optimize it using `rehline.plq_Ridge_Regressor`.\n", "> This wrapper adapts `plqERM_Ridge` into a regressor compatible with the scikit-learn API." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "NjjjqDBE4pkX" }, "outputs": [], "source": [ "# Install rehline\n", "%pip install rehline -q" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "id": "sIbawm0_3iq9" }, "outputs": [], "source": [ "import warnings\n", "\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import pandas as pd\n", "import seaborn as sns\n", "from sklearn.datasets import make_regression\n", "from sklearn.preprocessing import StandardScaler" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "id": "eq-Bqnt14Xcf" }, "outputs": [], "source": [ "# Simulate data\n", "np.random.seed(42)\n", "scaler_svr = StandardScaler()\n", "\n", "n, d = 10000, 5\n", "X, y = make_regression(n_samples=n, n_features=d, noise=1.0)\n", "# Standardize the features and rescale the response to unit standard deviation\n", "X = scaler_svr.fit_transform(X)\n", "y = y / y.std()" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 80 }, "id": "HgExVVV54a58", "outputId": "e3bd1fc8-f4c4-49e9-83df-3b3ccfceb0cc" }, "outputs": [ { "data": { "text/html": [ "
plq_Ridge_Regressor(loss={'epsilon': 0.1, 'name': 'svr'})\n",
"Pipeline(steps=[('scaler', StandardScaler()),\n",
" ('reg',\n",
" plq_Ridge_Regressor(loss={'epsilon': 0.1, 'name': 'svr'}))])