{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "S7Tq3zYqOu6u" }, "source": [ "# Monotonic SVM\n", "The Monotonic SVM solves the following optimization problem:\n", "\n", "$$\n", " \\min_{\\mathbf{\\beta} \\in \\mathbb{R}^d} C\\sum_{i=1}^n (1 - y_i \\mathbf{\\beta}^\\intercal \\mathbf{x}_i)_+ + \\frac{1}{2} \\|\\mathbf{\\beta}\\|_2^2,\n", "$$\n", "$$\n", " \\text{subject to} \\quad \\beta_j \\le \\beta_{j+1} \\quad \\forall j \\in \\{1, \\dots, d-1\\} \\quad (\\text{Increasing})\n", "$$\n", "$$\n", " \\text{or} \\quad \\beta_j \\ge \\beta_{j+1} \\quad \\forall j \\in \\{1, \\dots, d-1\\} \\quad (\\text{Decreasing})\n", "$$\n", "\n", "where:\n", "\n", "* $\\mathbf{x}_i \\in \\mathbb{R}^d$ is a feature vector\n", "* $y_i \\in \\{-1, 1\\}$ is a binary label\n", "* $\\beta_j$ is the $j$-th component of the coefficient vector $\\mathbf{\\beta}$\n", "\n", "The monotonicity constraints force the learned coefficients $\\beta_1, \\dots, \\beta_d$ into non-decreasing or non-increasing order (ties are allowed), which is useful for incorporating prior domain knowledge.\n", "\n", "> **Note.** Since the hinge loss is a piecewise linear-quadratic (PLQ) function and the monotonicity constraints are linear (e.g., $\\beta_j - \\beta_{j+1} \\le 0$), the problem can be solved with `rehline.plq_Ridge_Classifier`.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "g0aAqkkRKc6z" }, "outputs": [], "source": [ "## install rehline\n", "%pip install rehline -q" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "id": "4ibK-1gsR0ZB" }, "outputs": [], "source": [ "## simulate data\n", "from sklearn.datasets import make_classification\n", "from sklearn.preprocessing import StandardScaler\n", "import numpy as np\n", "\n", "scaler = StandardScaler()\n", "\n", "n, d = 10000, 5\n", "X, y = make_classification(n_samples=n, n_features=d, n_redundant=0, random_state=42)\n", "y = 2*y - 1\n", "X = scaler.fit_transform(X)" ] }, { "cell_type": "markdown", "metadata": { "id": "Oak-k1Ps9hDS" }, "source": 
[ "## SVM as baseline" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 80 }, "id": "Uk31Pe_cg702", "outputId": "9e177265-dad9-4780-e074-700f022680e7" }, "outputs": [ { "data": { "text/html": [ "
plq_Ridge_Classifier(C=0.001, loss={'name': 'svm'}, max_iter=10000)\n",
"plq_Ridge_Classifier(C=0.001,\n",
" constraint=[{'decreasing': True, 'name': 'monotonic'}],\n",
" loss={'name': 'svm'}, max_iter=10000)