rehline
=======

.. py:module:: rehline


Overview
--------

.. list-table:: Classes
   :header-rows: 0
   :widths: auto
   :class: summarytable

   * - :py:obj:`CQR_Ridge <rehline.CQR_Ridge>`
     - Composite Quantile Regressor (CQR) with a ridge penalty.
   * - :py:obj:`ReHLine <rehline.ReHLine>`
     - ReHLine Minimization [1]_.
   * - :py:obj:`plqERM_ElasticNet <rehline.plqERM_ElasticNet>`
     - Empirical Risk Minimization (ERM) with a piecewise linear-quadratic (PLQ) objective with an elastic net penalty.
   * - :py:obj:`plqERM_Ridge <rehline.plqERM_Ridge>`
     - Empirical Risk Minimization (ERM) with a piecewise linear-quadratic (PLQ) objective with a ridge penalty.
   * - :py:obj:`plqMF_Ridge <rehline.plqMF_Ridge>`
     - Matrix Factorization (MF) with a piecewise linear-quadratic objective and ridge penalty.
   * - :py:obj:`plq_ElasticNet_Classifier <rehline.plq_ElasticNet_Classifier>`
     - Empirical Risk Minimization (ERM) Classifier with a Piecewise Linear-Quadratic (PLQ) loss
   * - :py:obj:`plq_ElasticNet_Regressor <rehline.plq_ElasticNet_Regressor>`
     - Empirical Risk Minimization (ERM) regressor with a Piecewise Linear-Quadratic (PLQ) loss
   * - :py:obj:`plq_Ridge_Classifier <rehline.plq_Ridge_Classifier>`
     - Empirical Risk Minimization (ERM) Classifier with a Piecewise Linear-Quadratic (PLQ) loss
   * - :py:obj:`plq_Ridge_Regressor <rehline.plq_Ridge_Regressor>`
     - Empirical Risk Minimization (ERM) regressor with a Piecewise Linear-Quadratic (PLQ) loss


.. list-table:: Functions
   :header-rows: 0
   :widths: auto
   :class: summarytable

   * - :py:obj:`ReHLine_solver <rehline.ReHLine_solver>`\ (X, U, V, Tau, S, T, A, b, rho, Lambda, Gamma, xi, mu, max_iter, tol, shrink, verbose, trace_freq)
     - \-
   * - :py:obj:`make_mf_dataset <rehline.make_mf_dataset>`\ (n_users, n_items, n_factors, n_interactions, density, noise_std, seed, rating_min, rating_max, return_params)
     - Generate synthetic rating data using matrix factorization model.
   * - :py:obj:`CQR_Ridge_path_sol <rehline.CQR_Ridge_path_sol>`\ (X, y, \*None, quantiles, eps, n_Cs, Cs, max_iter, tol, verbose, shrink, warm_start, return_time)
     - Compute the regularization path for Composite Quantile Regression (CQR) with ridge penalty.
   * - :py:obj:`plqERM_Ridge_path_sol <rehline.plqERM_Ridge_path_sol>`\ (X, y, \*None, loss, constraint, eps, n_Cs, Cs, max_iter, tol, verbose, shrink, warm_start, return_time)
     - Compute the PLQ Empirical Risk Minimization (ERM) path over a range of regularization parameters.


Classes
-------

.. py:class:: CQR_Ridge(quantiles, C=1.0, max_iter=1000, tol=0.0001, shrink=1, warm_start=0, verbose=0, trace_freq=100)

   Bases: :py:obj:`rehline._base._BaseReHLine`, :py:obj:`sklearn.base.BaseEstimator`

   Composite Quantile Regressor (CQR) with a ridge penalty.

   It allows for the fitting of a linear regression model that minimizes a composite quantile loss function.

   .. math::

       \min_{\mathbf{\beta} \in \mathbb{R}^d, \mathbf{\beta}_0 \in \mathbb{R}^K} \sum_{k=1}^K \sum_{i=1}^n \text{PLQ}(y_i, \mathbf{x}_i^T \mathbf{\beta} + \beta_{0k}) + \frac{1}{2} \| \mathbf{\beta} \|_2^2.
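
   The objective above can be evaluated directly for a candidate solution. The following is a
   minimal NumPy sketch, assuming that the PLQ term for the k-th quantile is the usual check
   (pinball) loss. The helper names ``check_loss`` and ``cqr_objective`` are hypothetical and
   are not part of ``rehline``.

   .. code-block:: python

      import numpy as np

      def check_loss(residual, q):
          """Check (pinball) loss at quantile level q (assumed form of the PLQ term)."""
          return np.where(residual >= 0, q * residual, (q - 1.0) * residual)

      def cqr_objective(X, y, beta, beta0, quantiles):
          """Composite quantile loss plus the ridge penalty 0.5 * ||beta||^2."""
          loss = 0.0
          # One shared coefficient vector, one intercept per quantile level.
          for q, b0 in zip(quantiles, beta0):
              loss += check_loss(y - (X @ beta + b0), q).sum()
          return loss + 0.5 * beta @ beta

   The loop makes the structure of the model explicit: every quantile level reuses the same
   ``beta`` and differs only in its intercept.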


   Parameters
   ----------
   quantiles : list of float (n_quantiles,)
       The quantiles to be estimated.

   C : float, default=1.0
       Regularization parameter. The strength of the regularization is
       inversely proportional to C. Must be strictly positive.
       `C` will be absorbed by the ReHLine parameters when `self.make_ReLHLoss` is conducted.

   verbose : int, default=0
       Enable verbose output.

   max_iter : int, default=1000
       The maximum number of iterations to be run.

   tol : float, default=1e-4
       The tolerance for the stopping criterion.

   shrink : float, default=1
       The shrinkage of dual variables for the ReHLine algorithm.

   warm_start : bool, default=False
       Whether to use the given dual params as an initial guess for the
       optimization algorithm.

   trace_freq : int, default=100
       The frequency at which to print the optimization trace.

   Attributes
   ----------
   coef\_ : array-like
       The optimized model coefficients.

   intercept\_ : array-like
       The optimized model intercepts.

   quantiles\_ : array-like
       The quantiles to be estimated.

   n_iter\_ : int
       The number of iterations performed by the ReHLine solver.

   opt_result\_ : object
       The optimization result object.

   dual_obj\_ : array-like
       The dual objective function values.

   primal_obj\_ : array-like
       The primal objective function values.

   Methods
   -------
   fit(X, y, sample_weight=None)
       Fit the model based on the given training data.

   predict(X)
       The prediction for the given dataset.


   Overview
   ========


   .. list-table:: Methods
      :header-rows: 0
      :widths: auto
      :class: summarytable

      * - :py:obj:`fit <rehline.CQR_Ridge.fit>`\ (X, y, sample_weight)
        - Fit the model based on the given training data.
      * - :py:obj:`predict <rehline.CQR_Ridge.predict>`\ (X)
        - The prediction for the given dataset.


   Members
   =======

   .. py:method:: fit(X, y, sample_weight=None)

      Fit the model based on the given training data.

      Parameters
      ----------

      X : array-like of shape (n_samples, n_features)
          Training vector, where `n_samples` is the number of samples and
          `n_features` is the number of features.

      y : array-like of shape (n_samples,)
          The target variable.

      sample_weight : array-like of shape (n_samples,), default=None
          Array of weights that are assigned to individual
          samples. If not provided, then each sample is given unit weight.

      Returns
      -------
      self : object
          An instance of the estimator.


   .. py:method:: predict(X)

      The prediction for the given dataset.

      Parameters
      ----------
      X : array-like of shape (n_samples, n_features)
          The data matrix.

      Returns
      -------
      ndarray of shape (n_samples, n_quantiles)
          Returns the predicted quantile values for the samples.
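
      The output shape follows from the model form: one shared coefficient vector and one
      intercept per quantile. The sketch below illustrates this with NumPy broadcasting,
      assuming predictions are formed as ``X @ coef + intercept``; the arrays are made-up
      stand-ins for ``coef_`` and ``intercept_``, not fitted values.

      .. code-block:: python

         import numpy as np

         X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])  # (n_samples, n_features)
         coef = np.array([0.5, -1.0])            # stand-in for coef_
         intercept = np.array([-0.2, 0.0, 0.2])  # stand-in for intercept_, one per quantile

         # Broadcast the shared linear score against the per-quantile intercepts.
         pred = (X @ coef)[:, None] + intercept[None, :]
         print(pred.shape)  # (3, 3)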


.. py:class:: ReHLine(C=1.0, U=None, V=None, Tau=None, S=None, T=None, A=None, b=None, max_iter=1000, tol=0.0001, shrink=1, warm_start=0, verbose=0, trace_freq=100)

   Bases: :py:obj:`rehline._base._BaseReHLine`, :py:obj:`sklearn.base.BaseEstimator`

   ReHLine Minimization [1]_.

   .. math::

       \min_{\mathbf{\beta} \in \mathbb{R}^d} \sum_{i=1}^n \sum_{l=1}^L \text{ReLU}( u_{li} \mathbf{x}_i^\intercal \mathbf{\beta} + v_{li}) + \sum_{i=1}^n \sum_{h=1}^H {\text{ReHU}}_{\tau_{hi}}( s_{hi} \mathbf{x}_i^\intercal \mathbf{\beta} + t_{hi}) + \frac{1}{2} \| \mathbf{\beta} \|_2^2, \\ \text{ s.t. }
       \mathbf{A} \mathbf{\beta} + \mathbf{b} \geq \mathbf{0},

   where :math:`\mathbf{U} = (u_{li}),\mathbf{V} = (v_{li}) \in \mathbb{R}^{L \times n}`
   and :math:`\mathbf{S} = (s_{hi}),\mathbf{T} = (t_{hi}),\mathbf{\tau} = (\tau_{hi}) \in \mathbb{R}^{H \times n}`
   are the ReLU-ReHU loss parameters, and :math:`(\mathbf{A},\mathbf{b})` are the constraint parameters.
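
   To make the two building blocks concrete, the sketch below evaluates the objective with NumPy.
   It assumes the ReHU definition from the referenced paper: zero for non-positive inputs,
   quadratic on ``(0, tau]`` and linear beyond ``tau``. The function names are hypothetical
   helpers and are not part of ``rehline``.

   .. code-block:: python

      import numpy as np

      def relu(z):
          return np.maximum(z, 0.0)

      def rehu(z, tau):
          """0 for z <= 0, z^2 / 2 for 0 < z <= tau, tau * (z - tau / 2) for z > tau."""
          z = np.asarray(z, dtype=float)
          quad = 0.5 * z**2
          lin = tau * (z - 0.5 * tau)
          return np.where(z <= 0, 0.0, np.where(z <= tau, quad, lin))

      def rehline_objective(X, beta, U, V, S, T, Tau):
          score = X @ beta                            # shape (n,)
          relu_part = relu(U * score + V).sum()       # U, V have shape (L, n)
          rehu_part = rehu(S * score + T, Tau).sum()  # S, T, Tau have shape (H, n)
          return relu_part + rehu_part + 0.5 * beta @ beta

   The linear constraint is not checked here; the sketch only evaluates the loss and the ridge
   penalty at a given ``beta``.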

   Parameters
   ----------
   C : float, default=1.0
       Regularization parameter. The strength of the regularization is
       inversely proportional to C. Must be strictly positive.

   _U, _V: array of shape (L, n_samples), default=np.empty(shape=(0, 0))
       The parameters pertaining to the ReLU part in the loss function.

   _Tau, _S, _T: array of shape (H, n_samples), default=np.empty(shape=(0, 0))
       The parameters pertaining to the ReHU part in the loss function.

   _A: array of shape (K, n_features), default=np.empty(shape=(0, 0))
       The coefficient matrix in the linear constraint.

   _b: array of shape (K, ), default=np.empty(shape=0)
       The intercept vector in the linear constraint.

   verbose : int, default=0
       Enable verbose output.

   max_iter : int, default=1000
       The maximum number of iterations to be run.

   tol : float, default=1e-4
       The tolerance for the stopping criterion.

   shrink : float, default=1
       The shrinkage of dual variables for the ReHLine algorithm.

   warm_start : bool, default=False
       Whether to use the given dual params as an initial guess for the
       optimization algorithm.

   trace_freq : int, default=100
       The frequency at which to print the optimization trace.

   Attributes
   ----------
   coef\_ : array-like
       The optimized model coefficients.

   n_iter\_ : int
       The number of iterations performed by the ReHLine solver.

   opt_result\_ : object
       The optimization result object.

   dual_obj\_ : array-like
       The dual objective function values.

   primal_obj\_ : array-like
       The primal objective function values.

   _Lambda: array-like
       The optimized dual variables for ReLU parts.

   _Gamma: array-like
       The optimized dual variables for ReHU parts.

   _xi: array-like
       The optimized dual variables for linear constraints.

   Examples
   --------

   >>> ## test SVM on simulated dataset
   >>> import numpy as np
   >>> from rehline import ReHLine

   >>> # simulate classification dataset
   >>> n, d, C = 1000, 3, 0.5
   >>> np.random.seed(1024)
   >>> X = np.random.randn(1000, 3)
   >>> beta0 = np.random.randn(3)
   >>> y = np.sign(X.dot(beta0) + np.random.randn(n))

   >>> # Usage of ReHLine
   >>> n, d = X.shape
   >>> U = -(C*y).reshape(1,-1)
   >>> L = U.shape[0]
   >>> V = (C*np.array(np.ones(n))).reshape(1,-1)
   >>> clf = ReHLine(C=C)
   >>> clf._U, clf._V = U, V
   >>> clf.fit(X=X)
   >>> print('sol provided by rehline: %s' %clf.coef_)
   sol provided by rehline: [ 0.7410154  -0.00615574  2.66990408]
   >>> print(clf.decision_function([[.1,.2,.3]]))
   [0.87384162]
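
   The choice of ``U`` and ``V`` in the example follows from rewriting the scaled hinge loss
   as a single ReLU term: ``C * max(0, 1 - y * s) = max(0, -C * y * s + C)`` for ``C > 0``.
   The short NumPy check below, which uses random data and does not call ``rehline``,
   confirms the identity numerically.

   .. code-block:: python

      import numpy as np

      rng = np.random.default_rng(0)
      n, d, C = 5, 3, 0.5
      X = rng.standard_normal((n, d))
      y = np.sign(rng.standard_normal(n))
      beta = rng.standard_normal(d)

      score = X @ beta
      hinge = C * np.maximum(0.0, 1.0 - y * score)

      # Same parameterization as in the example: u_i = -C * y_i and v_i = C.
      U = -(C * y).reshape(1, -1)
      V = (C * np.ones(n)).reshape(1, -1)
      relu_form = np.maximum(0.0, U * score + V)

      print(np.allclose(hinge, relu_form[0]))  # True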

   References
   ----------
   .. [1] `Dai, B., Qiu, Y. (2023). ReHLine: Regularized Composite ReLU-ReHU Loss Minimization with Linear Computation and Linear Convergence <https://openreview.net/pdf?id=3pEBW2UPAD>`_


   Overview
   ========


   .. list-table:: Methods
      :header-rows: 0
      :widths: auto
      :class: summarytable

      * - :py:obj:`fit <rehline.ReHLine.fit>`\ (X, sample_weight)
        - Fit the model based on the given training data.
      * - :py:obj:`decision_function <rehline.ReHLine.decision_function>`\ (X)
        - The decision function evaluated on the given dataset


   Members
   =======

   .. py:method:: fit(X, sample_weight=None)

      Fit the model based on the given training data.

      Parameters
      ----------

      X : array-like of shape (n_samples, n_features)
          Training vector, where `n_samples` is the number of samples and
          `n_features` is the number of features.

      sample_weight : array-like of shape (n_samples,), default=None
          Array of weights that are assigned to individual
          samples. If not provided, then each sample is given unit weight.

      Returns
      -------
      self : object
          An instance of the estimator.


   .. py:method:: decision_function(X)

      The decision function evaluated on the given dataset

      Parameters
      ----------
      X : array-like of shape (n_samples, n_features)
          The data matrix.

      Returns
      -------
      ndarray of shape (n_samples, )
          Returns the decision function of the samples.


.. py:class:: plqERM_ElasticNet(loss, constraint=None, C=1.0, l1_ratio=0.5, omega=None, U=None, V=None, Tau=None, S=None, T=None, A=None, b=None, max_iter=1000, tol=0.0001, shrink=1, warm_start=0, verbose=0, trace_freq=100)

   Bases: :py:obj:`rehline._base._BaseReHLine`, :py:obj:`sklearn.base.BaseEstimator`

   Empirical Risk Minimization (ERM) with a piecewise linear-quadratic (PLQ) objective with an elastic net penalty.

   .. math::

       \min_{\mathbf{\beta} \in \mathbb{R}^d} C \sum_{i=1}^n \text{PLQ}(y_i, \mathbf{x}_i^T \mathbf{\beta}) + \text{l1_ratio} \sum_{j=1}^d \omega_j | \beta_j | + \frac{1}{2} (1 - \text{l1_ratio})  \| \mathbf{\beta} \|_2^2, \ \text{ s.t. } \
       \mathbf{A} \mathbf{\beta} + \mathbf{b} \geq \mathbf{0},
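
   The penalty part of this objective is simple to compute by hand. The sketch below is a
   NumPy illustration of the two penalty terms; ``elastic_net_penalty`` is a hypothetical
   helper, and treating a missing ``omega`` as a vector of ones is an assumption.

   .. code-block:: python

      import numpy as np

      def elastic_net_penalty(beta, l1_ratio=0.5, omega=None):
          beta = np.asarray(beta, dtype=float)
          if omega is None:
              omega = np.ones_like(beta)  # assumption: unit weights when omega is not given
          l1 = l1_ratio * np.sum(omega * np.abs(beta))   # weighted L1 term
          l2 = 0.5 * (1.0 - l1_ratio) * np.sum(beta**2)  # ridge term
          return l1 + l2

      print(elastic_net_penalty([1.0, -2.0], l1_ratio=0.5))  # 2.75
      print(elastic_net_penalty([1.0, -2.0], l1_ratio=0.0))  # 2.5

   With ``l1_ratio = 0`` the L1 term vanishes and only the ridge term remains.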

   The class supports various loss functions, including:
       - 'hinge', 'svm' or 'SVM'
       - 'check' or 'quantile' or 'quantile regression' or 'QR'
       - 'sSVM' or 'smooth SVM' or 'smooth hinge'
       - 'TV'
       - 'huber' or 'Huber'
       - 'SVR' or 'svr'
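
   As an illustration of how one of these losses fits the ReLU-ReHU form, the Huber loss can
   be written as the sum of two ReHU terms, one for each sign of the residual. The NumPy check
   below assumes the ReHU definition from the ReHLine paper and the standard Huber loss; it
   does not show how ``rehline`` builds its parameters internally.

   .. code-block:: python

      import numpy as np

      def rehu(z, tau):
          z = np.asarray(z, dtype=float)
          return np.where(z <= 0, 0.0,
                          np.where(z <= tau, 0.5 * z**2, tau * (z - 0.5 * tau)))

      def huber(r, tau):
          r = np.asarray(r, dtype=float)
          a = np.abs(r)
          return np.where(a <= tau, 0.5 * r**2, tau * (a - 0.5 * tau))

      r = np.linspace(-3.0, 3.0, 13)
      tau = 1.0
      # Huber(r) = ReHU(r) + ReHU(-r): at most one of the two terms is nonzero.
      print(np.allclose(huber(r, tau), rehu(r, tau) + rehu(-r, tau)))  # True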

   The following constraint types are supported:
       * 'nonnegative' or '>=0': A non-negativity constraint.
       * 'fair' or 'fairness': A fairness constraint.
       * 'custom': A custom constraint, where the user must provide the constraint matrix 'A' and vector 'b'.

   Parameters
   ----------
   loss : dict
       A dictionary specifying the loss function parameters.

   constraint : list of dict
       A list of dictionaries, where each dictionary represents a constraint.
       Each dictionary must contain a 'name' key, which specifies the type of constraint.

   C : float, default=1.0
       Regularization parameter. The strength of the regularization is
       inversely proportional to C. Must be strictly positive.
       `C` will be absorbed by the ReHLine parameters when `self.make_ReLHLoss` is conducted.

   l1_ratio : float, default=0.5
       The ElasticNet mixing parameter, with 0 <= l1_ratio < 1. For l1_ratio = 0 the penalty
       is an L2 penalty. For 0 < l1_ratio < 1, the penalty is a combination of L1 and L2.

   omega : array of shape (n_features, ), default=np.empty(shape=0)
       Non-negative weight coefficients for adaptive lasso. If not provided, all coefficients receive the 
       same L1 penalty controlled by ``l1_ratio``.

   verbose : int, default=0
       Enable verbose output.

   max_iter : int, default=1000
       The maximum number of iterations to be run.

   _U, _V: array of shape (L, n_samples), default=np.empty(shape=(0, 0))
       The parameters pertaining to the ReLU part in the loss function.

   _Tau, _S, _T: array of shape (H, n_samples), default=np.empty(shape=(0, 0))
       The parameters pertaining to the ReHU part in the loss function.

   _A: array of shape (K, n_features), default=np.empty(shape=(0, 0))
       The coefficient matrix in the linear constraint.

   _b: array of shape (K, ), default=np.empty(shape=0)
       The intercept vector in the linear constraint.

   Attributes
   ----------
   coef\_ : array-like
       The optimized model coefficients.

   n_iter\_ : int
       The number of iterations performed by the ReHLine solver.

   opt_result\_ : object
       The optimization result object.

   dual_obj\_ : array-like
       The dual objective function values.

   primal_obj\_ : array-like
       The primal objective function values.

   Methods
   -------
   fit(X, y, sample_weight=None)
       Fit the model based on the given training data.

   decision_function(X)
       The decision function evaluated on the given dataset.

   Notes
   -----
   `plqERM_ElasticNet` subclasses `_BaseReHLine` and `BaseEstimator`, so it shares the ReHLine base functionality and supports the scikit-learn estimator interface (for example `get_params` and `set_params`).


   Overview
   ========


   .. list-table:: Methods
      :header-rows: 0
      :widths: auto
      :class: summarytable

      * - :py:obj:`fit <rehline.plqERM_ElasticNet.fit>`\ (X, y, sample_weight)
        - Fit the model based on the given training data.
      * - :py:obj:`decision_function <rehline.plqERM_ElasticNet.decision_function>`\ (X)
        - The decision function evaluated on the given dataset


   Members
   =======

   .. py:method:: fit(X, y, sample_weight=None)

      Fit the model based on the given training data.

      Parameters
      ----------

      X : array-like of shape (n_samples, n_features)
          Training vector, where `n_samples` is the number of samples and
          `n_features` is the number of features.

      y : array-like of shape (n_samples,)
          The target variable.

      sample_weight : array-like of shape (n_samples,), default=None
          Array of weights that are assigned to individual
          samples. If not provided, then each sample is given unit weight.

      Returns
      -------
      self : object
          An instance of the estimator.


   .. py:method:: decision_function(X)

      The decision function evaluated on the given dataset

      Parameters
      ----------
      X : array-like of shape (n_samples, n_features)
          The data matrix.

      Returns
      -------
      ndarray of shape (n_samples, )
          Returns the decision function of the samples.


.. py:class:: plqERM_Ridge(loss, constraint=None, C=1.0, U=None, V=None, Tau=None, S=None, T=None, A=None, b=None, max_iter=1000, tol=0.0001, shrink=1, warm_start=0, verbose=0, trace_freq=100)

   Bases: :py:obj:`rehline._base._BaseReHLine`, :py:obj:`sklearn.base.BaseEstimator`

   Empirical Risk Minimization (ERM) with a piecewise linear-quadratic (PLQ) objective with a ridge penalty.

   .. math::

       \min_{\mathbf{\beta} \in \mathbb{R}^d} \sum_{i=1}^n \text{PLQ}(y_i, \mathbf{x}_i^T \mathbf{\beta}) + \frac{1}{2} \| \mathbf{\beta} \|_2^2, \ \text{ s.t. } \
       \mathbf{A} \mathbf{\beta} + \mathbf{b} \geq \mathbf{0},

   The class supports various loss functions, including:
       - 'hinge', 'svm' or 'SVM'
       - 'check' or 'quantile' or 'quantile regression' or 'QR'
       - 'sSVM' or 'smooth SVM' or 'smooth hinge'
       - 'TV'
       - 'huber' or 'Huber'
       - 'SVR' or 'svr'

   The following constraint types are supported:
       * 'nonnegative' or '>=0': A non-negativity constraint.
       * 'fair' or 'fairness': A fairness constraint.
       * 'custom': A custom constraint, where the user must provide the constraint matrix 'A' and vector 'b'.
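
   Each constraint type ultimately has to take the linear form shown in the objective,
   ``A @ beta + b >= 0``. The NumPy sketch below shows matrices that express a non-negativity
   constraint and one custom constraint in that form. The matrices follow from the algebra;
   how ``rehline`` constructs them internally is not shown here, and ``is_feasible`` is a
   hypothetical helper.

   .. code-block:: python

      import numpy as np

      def is_feasible(A, b, beta, atol=1e-10):
          """Check the linear constraint A @ beta + b >= 0 elementwise."""
          return bool(np.all(A @ beta + b >= -atol))

      d = 3
      # Non-negativity: beta_j >= 0 for every j corresponds to A = I and b = 0.
      A_nonneg, b_nonneg = np.eye(d), np.zeros(d)

      # Custom: sum(beta) <= 1, rewritten as -sum(beta) + 1 >= 0.
      A_custom, b_custom = -np.ones((1, d)), np.array([1.0])

      beta = np.array([0.2, 0.3, 0.1])
      print(is_feasible(A_nonneg, b_nonneg, beta))  # True
      print(is_feasible(A_custom, b_custom, beta))  # True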

   Parameters
   ----------
   loss : dict
       A dictionary specifying the loss function parameters.

   constraint : list of dict
       A list of dictionaries, where each dictionary represents a constraint.
       Each dictionary must contain a 'name' key, which specifies the type of constraint.

   C : float, default=1.0
       Regularization parameter. The strength of the regularization is
       inversely proportional to C. Must be strictly positive.
       `C` will be absorbed by the ReHLine parameters when `self.make_ReLHLoss` is conducted.

   verbose : int, default=0
       Enable verbose output.

   max_iter : int, default=1000
       The maximum number of iterations to be run.

   _U, _V: array of shape (L, n_samples), default=np.empty(shape=(0, 0))
       The parameters pertaining to the ReLU part in the loss function.

   _Tau, _S, _T: array of shape (H, n_samples), default=np.empty(shape=(0, 0))
       The parameters pertaining to the ReHU part in the loss function.

   _A: array of shape (K, n_features), default=np.empty(shape=(0, 0))
       The coefficient matrix in the linear constraint.

   _b: array of shape (K, ), default=np.empty(shape=0)
       The intercept vector in the linear constraint.

   Attributes
   ----------
   coef\_ : array-like
       The optimized model coefficients.

   n_iter\_ : int
       The number of iterations performed by the ReHLine solver.

   opt_result\_ : object
       The optimization result object.

   dual_obj\_ : array-like
       The dual objective function values.

   primal_obj\_ : array-like
       The primal objective function values.

   Methods
   -------
   fit(X, y, sample_weight=None)
       Fit the model based on the given training data.

   decision_function(X)
       The decision function evaluated on the given dataset.

   Notes
   -----
   `plqERM_Ridge` subclasses `_BaseReHLine` and `BaseEstimator`, so it shares the ReHLine base functionality and supports the scikit-learn estimator interface (for example `get_params` and `set_params`).


   Overview
   ========


   .. list-table:: Methods
      :header-rows: 0
      :widths: auto
      :class: summarytable

      * - :py:obj:`fit <rehline.plqERM_Ridge.fit>`\ (X, y, sample_weight)
        - Fit the model based on the given training data.
      * - :py:obj:`decision_function <rehline.plqERM_Ridge.decision_function>`\ (X)
        - The decision function evaluated on the given dataset


   Members
   =======

   .. py:method:: fit(X, y, sample_weight=None)

      Fit the model based on the given training data.

      Parameters
      ----------

      X : array-like of shape (n_samples, n_features)
          Training vector, where `n_samples` is the number of samples and
          `n_features` is the number of features.

      y : array-like of shape (n_samples,)
          The target variable.

      sample_weight : array-like of shape (n_samples,), default=None
          Array of weights that are assigned to individual
          samples. If not provided, then each sample is given unit weight.

      Returns
      -------
      self : object
          An instance of the estimator.


   .. py:method:: decision_function(X)

      The decision function evaluated on the given dataset

      Parameters
      ----------
      X : array-like of shape (n_samples, n_features)
          The data matrix.

      Returns
      -------
      ndarray of shape (n_samples, )
          Returns the decision function of the samples.


.. py:class:: plqMF_Ridge(n_users, n_items, loss, biased=True, constraint_user=None, constraint_item=None, rank=10, C=1.0, rho=0.5, init_mean=0.0, init_sd=0.1, random_state=None, max_iter=10000, tol=0.0001, shrink=1, trace_freq=100, max_iter_CD=10, tol_CD=0.0001, verbose=0)

   Bases: :py:obj:`rehline._base._BaseReHLine`, :py:obj:`sklearn.base.BaseEstimator`

   Matrix Factorization (MF) with a piecewise linear-quadratic objective and ridge penalty.

   .. math::
       \min_{\substack{
           \mathbf{P} \in \mathbb{R}^{n \times k},\ \pmb{\alpha} \in \mathbb{R}^n \\
           \mathbf{Q} \in \mathbb{R}^{m \times k},\ \pmb{\beta} \in \mathbb{R}^m
       }}
       \left[
           \sum_{(u,i)\in \Omega} C \cdot \text{PLQ}(r_{ui}, \ \mathbf{p}_u^T \mathbf{q}_i + \alpha_u + \beta_i)
       \right]
       +
       \left[
           \frac{\rho}{n}\sum_{u=1}^n(\|\mathbf{p}_u\|_2^2 + \alpha_u^2)
           + \frac{1-\rho}{m}\sum_{i=1}^m(\|\mathbf{q}_i\|_2^2 + \beta_i^2)
       \right]

   .. math::
       \ \text{ s.t. } \
       \mathbf{A}_{\text{user}} \begin{pmatrix} \alpha_u \\ \mathbf{p}_u \end{pmatrix} + \mathbf{b}_{\text{user}} \geq \mathbf{0},\ u = 1,\dots,n
       \quad \text{and} \quad
       \mathbf{A}_{\text{item}} \begin{pmatrix} \beta_i \\ \mathbf{q}_i \end{pmatrix} + \mathbf{b}_{\text{item}} \geq \mathbf{0},\ i = 1,\dots,m
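
   The regularization term splits the penalty between the user side and the item side through
   ``rho``, and each side is averaged over its own count. A minimal NumPy sketch of that term
   follows; ``mf_penalty`` is a hypothetical helper, and the argument names mirror the
   attributes ``P``, ``Q``, ``bu`` and ``bi`` documented for this class.

   .. code-block:: python

      import numpy as np

      def mf_penalty(P, Q, bu, bi, rho=0.5):
          n, m = P.shape[0], Q.shape[0]
          # User side: squared norms of latent factors and biases, averaged over n users.
          user = (rho / n) * (np.sum(P**2) + np.sum(bu**2))
          # Item side: the same quantity averaged over m items, weighted by 1 - rho.
          item = ((1.0 - rho) / m) * (np.sum(Q**2) + np.sum(bi**2))
          return user + item

      P, Q = np.ones((2, 1)), 2.0 * np.ones((4, 1))
      print(mf_penalty(P, Q, np.zeros(2), np.zeros(4), rho=0.5))  # 2.5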

   The class supports various loss functions, including:
       - 'hinge', 'svm' or 'SVM'
       - 'MAE' or 'mae' or 'mean absolute error'
       - 'squared hinge' or 'squared svm' or 'squared SVM'
       - 'MSE' or 'mse' or 'mean squared error'

   The following constraint types are supported:
       * 'nonnegative' or '>=0': A non-negativity constraint.
       * 'fair' or 'fairness': A fairness constraint.
       * 'custom': A custom constraint, where the user must provide the constraint matrix 'A' and vector 'b'.

   Parameters
   ----------
   n_users : int
       Number of unique users in the dataset (or number of rows in target sparse matrix).

   n_items : int
       Number of unique items in the dataset (or number of columns in target sparse matrix).

   loss : dict
       A dictionary specifying the loss function parameters.

   constraint_user : list of dict
       A list of dictionaries, where each dictionary represents a constraint to user side parameters.
       Each dictionary must contain a 'name' key, which specifies the type of constraint.

   constraint_item : list of dict
       A list of dictionaries, where each dictionary represents a constraint to item side parameters.
       Each dictionary must contain a 'name' key, which specifies the type of constraint.

   biased : bool, default=True
       Whether to include user and item bias terms in the model.

   rank : int, default=10
       Dimensionality of the latent factor vectors (number of factors).

   C : float, default=1.0
       Regularization parameter. The strength of the regularization is
       inversely proportional to `C`. Must be strictly positive.
       `C` will be absorbed by the ReHLine parameters when `_cast_sample_weight()` is conducted.

   rho : float, default=0.5
       Regularization strength ratio between user and item factors. Must be within the range of (0,1).

   init_mean : float, default=0.0
       Mean of the Gaussian distribution for initializing latent factors.

   init_sd : float, default=0.1
       Standard deviation of the Gaussian distribution for initializing latent factors.

   random_state : int or RandomState, default=None
       Random seed for reproducible initialization of latent factors.

   max_iter : int, default=10000
       The maximum number of iterations to be run for the ReHLine solver.

   tol : float, default=1e-4
       The tolerance for the stopping criterion for the ReHLine solver.

   shrink : float, default=1
       The shrinkage of dual variables for the ReHLine solver.

   trace_freq : int, default=100
       The frequency at which to print the optimization trace for the ReHLine solver.

   max_iter_CD : int, default=10
       Maximum number of iterations for coordinate descent steps.

   tol_CD : float, default=1e-4
       The tolerance for the stopping criterion for coordinate descent steps.

   verbose : int, default=0
       Verbosity level.
         0: No output
         1: CD iteration progress information
         2: ReHLine solver optimization information
         3: All information of CD and ReHLine

   Attributes
   ----------
   n_users : int
       Number of unique users in the dataset (or number of rows in target sparse matrix).

   n_items : int
       Number of unique items in the dataset (or number of columns in target sparse matrix).

   n_ratings : int
       Number of ratings in the training data. Available after fitting.

   P : ndarray of shape (n_users, rank)
       User latent factor matrix. Learned during fitting.

   Q : ndarray of shape (n_items, rank)
       Item latent factor matrix. Learned during fitting.

   bu : ndarray of shape (n_users,) or None
       User bias terms. Learned during fitting. Only available if `biased=True`.

   bi : ndarray of shape (n_items,) or None
       Item bias terms. Learned during fitting. Only available if `biased=True`.

   Iu : list of ndarray
       List where each element contains indices of items rated by the corresponding user.
       Available after fitting.

   Ui : list of ndarray
       List where each element contains indices of users who rated the corresponding item.
       Available after fitting.

   history : ndarray of shape (max_iter_CD + 1, 2)
       Optimization history containing loss and objective values at each coordinate descent iteration.
       First column: cumulative loss term values. Second column: objective function values (with penalty).

   sample_weight : ndarray of shape (n_ratings,)
       Sample weights used during fitting. Available after fitting.

   Methods
   -------
   fit(X, y, sample_weight=None)
       Fit the model based on the given training data.

   decision_function(X)
       The decision function evaluated on the given dataset.

   obj(X, y)
       Compute the values of loss term and objective function.

   Notes
   -----
   `plqMF_Ridge` subclasses `_BaseReHLine` and `BaseEstimator`, so it shares the ReHLine base functionality and supports the scikit-learn estimator interface (for example `get_params` and `set_params`).


   Overview
   ========


   .. list-table:: Methods
      :header-rows: 0
      :widths: auto
      :class: summarytable

      * - :py:obj:`fit <rehline.plqMF_Ridge.fit>`\ (X, y, sample_weight)
        - Fit the model based on the given training data.
      * - :py:obj:`decision_function <rehline.plqMF_Ridge.decision_function>`\ (X)
        - The decision function evaluated on the given dataset
      * - :py:obj:`obj <rehline.plqMF_Ridge.obj>`\ (X, y)
        - Compute the values of loss term and objective function.


   Members
   =======

   .. py:method:: fit(X, y, sample_weight=None)

      Fit the model based on the given training data.

      Parameters
      ----------
      X : array-like of shape (n_ratings, 2)
          Input data where first column contains user ID and
          second column contains item ID.

      y : array-like of shape (n_ratings,)
          Target rating values.

      sample_weight : array-like of shape (n_ratings,), default=None
          Array of weights that are assigned to individual samples.
          If not provided, then each sample is given unit weight.

      Returns
      -------
      self : object
          An instance of the estimator.


   .. py:method:: decision_function(X)

      The decision function evaluated on the given dataset

      Parameters
      ----------
      X : array-like of shape (n_samples, 2)
          Input data where first column contains user ID and
          second column contains item ID.

      Returns
      -------
      prediction : ndarray of shape (n_samples,)
          Predicted ratings for the input pairs.
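
      Following the model form in the class objective, a predicted rating combines the inner
      product of the latent factors with the two bias terms. The sketch below reproduces that
      computation with NumPy, assuming user and item IDs are zero-based integer indices;
      ``mf_predict`` is a hypothetical helper, not a ``rehline`` function.

      .. code-block:: python

         import numpy as np

         def mf_predict(X, P, Q, bu=None, bi=None):
             users = np.asarray(X)[:, 0].astype(int)
             items = np.asarray(X)[:, 1].astype(int)
             # Row-wise inner product p_u^T q_i for each (user, item) pair.
             pred = np.sum(P[users] * Q[items], axis=1)
             if bu is not None and bi is not None:
                 pred = pred + bu[users] + bi[items]
             return pred

         P = np.array([[1.0, 0.0], [0.0, 1.0]])
         Q = np.array([[2.0, 3.0], [4.0, 5.0], [6.0, 7.0]])
         bu, bi = np.array([0.1, 0.2]), np.array([1.0, 2.0, 3.0])
         pairs = np.array([[0, 2], [1, 0]])
         print(mf_predict(pairs, P, Q, bu, bi))  # [9.1 4.2]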


   .. py:method:: obj(X, y)

      Compute the values of loss term and objective function.

      Parameters
      ----------
      X : array-like of shape (n_ratings, 2)
          User-item rating pairs.

      y : array-like of shape (n_ratings,)
          Actual rating values.

      Returns
      -------
      loss_term : float
          The data fitting term (sum of loss values).

      objective_value : float
          The total objective value including regularization.


.. py:class:: plq_ElasticNet_Classifier(loss, constraint=None, C=1.0, l1_ratio=0.5, omega=None, U=None, V=None, Tau=None, S=None, T=None, A=None, b=None, max_iter=1000, tol=0.0001, shrink=1, warm_start=0, verbose=0, trace_freq=100, fit_intercept=True, intercept_scaling=1.0, class_weight=None, multi_class=None, n_jobs=None)

   Bases: :py:obj:`rehline._class.plqERM_ElasticNet`, :py:obj:`sklearn.base.ClassifierMixin`

   Empirical Risk Minimization (ERM) Classifier with a Piecewise Linear-Quadratic (PLQ) loss
   and elastic net penalty, compatible with the scikit-learn API.

   This wrapper makes ``plqERM_ElasticNet`` behave as a classifier:
       - Accepts arbitrary binary labels in the original label space.
       - Computes class weights on original labels (if ``class_weight`` is set).
       - Encodes labels with ``LabelEncoder`` into {0,1}, then maps to {-1,+1} for training.
       - Supports optional intercept fitting (via an augmented constant feature).
       - Provides standard methods ``fit``, ``predict``, and ``decision_function``.
       - Integrates with scikit-learn ecosystem (e.g., GridSearchCV, Pipeline).
       - Supports multiclass classification via OvR or OvO method.
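
   The label handling and the intercept trick listed above can be sketched with plain NumPy.
   This is an illustration under stated assumptions, not the wrapper's actual code:
   ``np.unique`` stands in for ``LabelEncoder``, the constant column is appended as the last
   feature, and the scores are made-up decision values.

   .. code-block:: python

      import numpy as np

      y = np.array(["spam", "ham", "ham", "spam"])

      # Sorted unique labels and integer codes in {0, 1}.
      classes, y01 = np.unique(y, return_inverse=True)
      # Map {0, 1} to {-1, +1} for training.
      y_pm = 2 * y01 - 1

      # Optional intercept via an augmented constant feature column.
      X = np.array([[0.5, 1.0], [1.5, -1.0], [0.0, 2.0], [2.0, 0.0]])
      intercept_scaling = 1.0
      X_aug = np.hstack([X, np.full((X.shape[0], 1), intercept_scaling)])

      # Map the sign of a decision value back to the original label space.
      scores = np.array([0.3, -1.2, -0.1, 2.0])
      y_hat = classes[(scores > 0).astype(int)]
      print(y_pm)   # [ 1 -1 -1  1]
      print(y_hat)  # ['spam' 'ham' 'ham' 'spam']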

   Parameters
   ----------
   loss : dict
       Dictionary specifying the loss function parameters. Examples include:

       - ``{'name': 'svm'}``
       - ``{'name': 'sSVM'}``
       - ``{'name': 'huber'}``

       and other PLQ losses supported by ``plqERM_ElasticNet``.

   constraint : list of dict, default=[]
       Optional constraints. Each dictionary must include a ``'name'`` key.

   C : float, default=1.0
       Inverse regularization strength (scales the loss term).

   l1_ratio : float, default=0.5
       The elastic net mixing parameter, with ``0 <= l1_ratio < 1``:

       - ``l1_ratio = 0``: pure ridge penalty (equivalent to ``plq_Ridge_Classifier``).
       - ``0 < l1_ratio < 1``: combined L1 + L2 penalty.

       Must be strictly less than 1.0 to avoid division by zero in rho/C_eff.

   omega : array of shape (n_features, ), default=np.empty(shape=(0, 0))
       Non-negative weight coefficients for adaptive lasso. If not provided, all non-intercept coefficients 
       receive the same L1 penalty controlled by ``l1_ratio``. The penalty for the intercept 
       can be scaled via ``intercept_scaling``.

   fit_intercept : bool, default=True
       Whether to fit an intercept term via an augmented constant feature column.

   intercept_scaling : float, default=1.0
       Value of the constant feature column when ``fit_intercept=True``.

   class_weight : dict, 'balanced', or None, default=None
       Class weights applied like in LinearSVC.

   multi_class : str or list, default=[]
       Method for multiclass classification:

       - ``'ovr'``: One-vs-Rest
       - ``'ovo'``: One-vs-One
       - ``[]``, or ignored when only 2 classes are present.

   n_jobs : int or None, default=None
       Number of parallel jobs for multiclass fitting.

   max_iter : int, default=1000
   tol : float, default=1e-4
   shrink : int, default=1
   warm_start : int, default=0
   verbose : int, default=0
   trace_freq : int, default=100

   Attributes
   ----------
   ``coef_`` : ndarray of shape (n_features,) for binary, (n_estimators, n_features) for multiclass
   ``intercept_`` : float for binary, ndarray of shape (n_estimators,) for multiclass
   ``classes_`` : ndarray of shape (n_classes,)
   ``estimators_`` : list, only present for multiclass
   _label_encoder : LabelEncoder


   Overview
   ========


   .. list-table:: Methods
      :header-rows: 0
      :widths: auto
      :class: summarytable

      * - :py:obj:`fit <rehline.plq_ElasticNet_Classifier.fit>`\ (X, y, sample_weight)
        - Fit the classifier to training data.
      * - :py:obj:`decision_function <rehline.plq_ElasticNet_Classifier.decision_function>`\ (X)
        - Compute the decision function for samples in X.
      * - :py:obj:`predict <rehline.plq_ElasticNet_Classifier.predict>`\ (X)
        - Predict class labels for samples in X.


   Members
   =======

   .. py:method:: fit(X, y, sample_weight=None)

      Fit the classifier to training data.

      Parameters
      ----------
      X : array-like of shape (n_samples, n_features)
      y : array-like of shape (n_samples,)
      sample_weight : array-like of shape (n_samples,), default=None

      Returns
      -------
      self


   .. py:method:: decision_function(X)

      Compute the decision function for samples in X.

      For binary: 1D array of shape (n_samples,).
      For OvR/OvO multiclass: 2D array of shape (n_samples, n_estimators).


   .. py:method:: predict(X)

      Predict class labels for samples in X.

      - Binary: threshold the decision score at 0.
      - OvR: argmax across the K classifiers.
      - OvO: majority vote, with normalized confidence used to break ties.
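
      The binary and OvR rules can be sketched as follows. The scores are made up for illustration, and mapping a positive binary score to the second class label is an assumption of this sketch.

      .. code-block:: python

         import numpy as np

         classes = np.array(["a", "b", "c"])

         # Binary case: scores of shape (n_samples,), thresholded at 0.
         # Assumption: a positive score maps to the second class label.
         binary_classes = classes[:2]
         binary_scores = np.array([-0.3, 1.2, 0.1])
         binary_pred = binary_classes[(binary_scores > 0).astype(int)]

         # OvR case: scores of shape (n_samples, K); pick the highest-scoring class.
         ovr_scores = np.array([[0.2, -1.0, 0.5],
                                [1.5, 0.3, -0.2]])
         ovr_pred = classes[np.argmax(ovr_scores, axis=1)]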


.. py:class:: plq_ElasticNet_Regressor(loss=None, constraint=None, C=1.0, l1_ratio=0.5, omega=None, U=None, V=None, Tau=None, S=None, T=None, A=None, b=None, max_iter=1000, tol=0.0001, shrink=1, warm_start=0, verbose=0, trace_freq=100, fit_intercept=True, intercept_scaling=1.0)

   Bases: :py:obj:`rehline._class.plqERM_ElasticNet`, :py:obj:`sklearn.base.RegressorMixin`

   Empirical Risk Minimization (ERM) regressor with a Piecewise Linear-Quadratic (PLQ) loss
   and an elastic net penalty, implemented as a scikit-learn compatible estimator.

   This wrapper makes ``plqERM_ElasticNet`` behave as a regressor:
       - Supports optional intercept fitting via an augmented constant feature column.
       - Provides standard methods ``fit``, ``predict``, and ``decision_function``.
       - Integrates with the scikit-learn ecosystem (e.g., GridSearchCV, Pipeline).

   Notes
   -----
   - **Intercept handling**: if ``fit_intercept=True``, a constant column
     (value = ``intercept_scaling``) is appended to the right of the design
     matrix before calling the base solver. The last learned coefficient is
     then split out as ``intercept_``.
     Original feature indices are therefore unaffected; ``sen_idx`` in a
     ``'fair'`` constraint continues to reference the original columns.
   - **Sparse input**: not supported. Convert to dense before fitting.
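
   The intercept handling described above can be sketched with NumPy. Ordinary least squares is used here as a stand-in solver, not the ReHLine solver, and the sketch uses the default ``intercept_scaling = 1.0``; how other scaling values affect the reported ``intercept_`` is not covered.

   .. code-block:: python

      import numpy as np

      rng = np.random.default_rng(0)
      X = rng.normal(size=(50, 3))
      y = X @ np.array([1.0, -2.0, 0.5]) + 4.0

      intercept_scaling = 1.0

      # Append the constant column on the right, so original column indices are unchanged.
      X_aug = np.hstack([X, np.full((X.shape[0], 1), intercept_scaling)])

      # Stand-in solver: ordinary least squares (NOT the ReHLine solver).
      beta_aug, *_ = np.linalg.lstsq(X_aug, y, rcond=None)

      # Split the last learned coefficient out as the intercept.
      coef = beta_aug[:-1]
      intercept = beta_aug[-1]

   Because the constant column is last, ``X_aug[:, :3]`` is identical to ``X``, which is why feature indices such as ``sen_idx`` remain valid.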

   Parameters
   ----------
   loss : dict, default={'name': 'QR', 'qt': 0.5}
       PLQ loss configuration. Examples:
       ``{'name': 'QR', 'qt': 0.5}``, ``{'name': 'huber', 'tau': 1.0}``,
       ``{'name': 'SVR', 'epsilon': 0.1}``.
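
       As a rough guide to what these configurations mean, the sketch below evaluates the textbook forms of the three losses on residuals ``r = y - f(x)``. The helper functions are hypothetical and not part of rehline, whose internal parameterization and scaling may differ.

       .. code-block:: python

          import numpy as np

          def quantile_loss(r, qt):
              """Check (pinball) loss at quantile level qt."""
              return np.where(r >= 0, qt * r, (qt - 1.0) * r)

          def huber_loss(r, tau):
              """Huber loss: quadratic for |r| <= tau, linear beyond."""
              a = np.abs(r)
              return np.where(a <= tau, 0.5 * r**2, tau * (a - 0.5 * tau))

          def svr_loss(r, epsilon):
              """Epsilon-insensitive loss: zero inside the tube |r| <= epsilon."""
              return np.maximum(0.0, np.abs(r) - epsilon)

          r = np.array([-2.0, -0.05, 0.5, 3.0])
          qr = quantile_loss(r, qt=0.5)      # median regression: 0.5 * |r|
          hub = huber_loss(r, tau=1.0)
          svr = svr_loss(r, epsilon=0.1)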

   constraint : list of dict, default=[]
       Constraint specifications:
         - ``{'name': 'nonnegative'}`` or ``{'name': '>=0'}``
         - ``{'name': 'fair', 'sen_idx': list[int], 'tol_sen': list[float]}``
         - ``{'name': 'custom', 'A': ndarray[K, d], 'b': ndarray[K]}``
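
       A configuration fragment for each constraint type might look like the following. The numeric values are illustrative, and the sign convention ``A @ beta + b >= 0`` for custom constraints is an assumption based on the ReHLine formulation rather than something stated in this entry.

       .. code-block:: python

          import numpy as np

          n_features = 4

          # Non-negativity of all coefficients.
          constraint_nonneg = [{'name': 'nonnegative'}]

          # Fairness constraint on original feature column 0 with tolerance 0.1.
          constraint_fair = [{'name': 'fair', 'sen_idx': [0], 'tol_sen': [0.1]}]

          # Custom linear constraints with K = 2 rows and d = n_features columns.
          # Assumption: each row encodes A @ beta + b >= 0.
          A = np.array([[1.0, 0.0, 0.0, 0.0],      # beta_0 >= 0
                        [-1.0, -1.0, 0.0, 0.0]])   # beta_0 + beta_1 <= 1
          b = np.array([0.0, 1.0])
          constraint_custom = [{'name': 'custom', 'A': A, 'b': b}]

       Each list is meant to be passed as the ``constraint`` argument.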

   C : float, default=1.0
       Regularization parameter (scales the loss term).

   l1_ratio : float, default=0.5
       The elastic net mixing parameter, with ``0 <= l1_ratio < 1``:

       - ``l1_ratio = 0``: pure ridge penalty (equivalent to ``plq_Ridge_Regressor``).
       - ``0 < l1_ratio < 1``: combined L1 + L2 penalty.

       Must be strictly less than 1.0 to avoid division by zero in rho/C_eff.

   omega : array of shape (n_features, ), default=np.empty(shape=(0, 0))
       Non-negative weight coefficients for adaptive lasso. If not provided, all non-intercept coefficients
       receive the same L1 penalty controlled by ``l1_ratio``. The penalty for the intercept
       can be scaled via ``intercept_scaling``.

   fit_intercept : bool, default=True
       If True, append a constant column (value = ``intercept_scaling``) to
       the design matrix before solving. The last learned coefficient is then
       extracted as ``intercept_``.

   intercept_scaling : float, default=1.0
       Scaling applied to the appended constant column when
       ``fit_intercept=True``.

   max_iter : int, default=1000
   tol : float, default=1e-4
   shrink : int, default=1
   warm_start : int, default=0
   verbose : int, default=0
   trace_freq : int, default=100

   Attributes
   ----------
   ``coef_`` : ndarray of shape (n_features,)
       Learned linear coefficients (excluding the intercept term).
   ``intercept_`` : float
       Intercept term. 0.0 if ``fit_intercept=False``.
   ``n_features_in_`` : int
       Number of input features seen during :meth:`fit` (before intercept
       augmentation).


   Overview
   ========


   .. list-table:: Methods
      :header-rows: 0
      :widths: auto
      :class: summarytable

      * - :py:obj:`fit <rehline.plq_ElasticNet_Regressor.fit>`\ (X, y, sample_weight)
        - Fit the regressor to training data.
      * - :py:obj:`decision_function <rehline.plq_ElasticNet_Regressor.decision_function>`\ (X)
        - Compute f(X) = X @ ``coef_`` + ``intercept_``.
      * - :py:obj:`predict <rehline.plq_ElasticNet_Regressor.predict>`\ (X)
        - Predict target values as the linear decision function.


   Members
   =======

   .. py:method:: fit(X, y, sample_weight=None)

      Fit the regressor to training data.

      If ``fit_intercept=True``, a constant column (value =
      ``intercept_scaling``) is appended to the right of ``X`` before
      calling the base solver (``plqERM_ElasticNet.fit``). After solving,
      the last coefficient is split as ``intercept_`` and removed from
      ``coef_``.

      Parameters
      ----------
      X : ndarray of shape (n_samples, n_features)
          Training design matrix (dense). Sparse inputs are not supported.
      y : ndarray of shape (n_samples,)
          Target values.
      sample_weight : ndarray of shape (n_samples,), default=None
          Optional per-sample weights; forwarded to the underlying solver.

      Returns
      -------
      self : object
          Fitted estimator.


   .. py:method:: decision_function(X)

      Compute f(X) = X @ ``coef_`` + ``intercept_``.

      Parameters
      ----------
      X : ndarray of shape (n_samples, n_features)
          Input data (dense).

      Returns
      -------
      scores : ndarray of shape (n_samples,)
          Predicted real-valued scores.


   .. py:method:: predict(X)

      Predict target values as the linear decision function.

      Parameters
      ----------
      X : ndarray of shape (n_samples, n_features)
          Input data (dense).

      Returns
      -------
      y_pred : ndarray of shape (n_samples,)
          Predicted target values (real-valued).


.. py:class:: plq_Ridge_Classifier(loss, constraint=None, C=1.0, U=None, V=None, Tau=None, S=None, T=None, A=None, b=None, max_iter=1000, tol=0.0001, shrink=1, warm_start=0, verbose=0, trace_freq=100, fit_intercept=True, intercept_scaling=1.0, class_weight=None, multi_class=None, n_jobs=None)

   Bases: :py:obj:`rehline._class.plqERM_Ridge`, :py:obj:`sklearn.base.ClassifierMixin`

   Empirical Risk Minimization (ERM) Classifier with a Piecewise Linear-Quadratic (PLQ) loss
   and ridge penalty, compatible with the scikit-learn API.

   This wrapper makes ``plqERM_Ridge`` behave as a classifier:
       - Accepts arbitrary binary labels in the original label space.
       - Computes class weights on original labels (if ``class_weight`` is set).
       - Encodes labels with ``LabelEncoder`` into {0,1}, then maps to {-1,+1} for training.
       - Supports optional intercept fitting (via an augmented constant feature).
       - Provides standard methods ``fit``, ``predict``, and ``decision_function``.
       - Integrates with scikit-learn ecosystem (e.g., GridSearchCV, Pipeline).
       - Supports multiclass classification via OvR or OvO method.

   Parameters
   ----------
   loss : dict
       Dictionary specifying the loss function parameters. Examples include:

       - ``{'name': 'svm'}``
       - ``{'name': 'sSVM'}``
       - ``{'name': 'huber'}``

       and other PLQ losses supported by ``plqERM_Ridge``.

   constraint : list of dict, default=[]
       Optional constraints. Each dictionary must include a ``'name'`` key.
       Examples: {'name': 'nonnegative'}, {'name': 'fair'}, {'name': 'custom'}.

   C : float, default=1.0
       Inverse regularization strength.

   U, V, Tau, S, T : ndarray, default empty
       Parameters for the PLQ representation of the loss function.
       Typically built internally by helper functions.

   A : ndarray of shape (K, n_features), default empty
       Linear-constraint coefficient matrix.

   b : ndarray of shape (K,), default empty
       Linear-constraint intercept vector.

   max_iter : int, default=1000
       Maximum number of iterations for the ReHLine solver.

   tol : float, default=1e-4
       Convergence tolerance.

   shrink : int, default=1
       Shrinkage parameter for the solver.

   warm_start : int, default=0
       Whether to reuse the previous solution for initialization.

   verbose : int, default=0
       Verbosity level for the solver.

   trace_freq : int, default=100
       Frequency (in iterations) at which solver progress is traced
       when ``verbose > 0``.

   fit_intercept : bool, default=True
       Whether to fit an intercept term. If True, a constant feature column is added
       to ``X`` during training. The last learned coefficient is extracted as
       ``intercept_``.

   intercept_scaling : float, default=1.0
       Value used for the constant feature column when ``fit_intercept=True``.
       Matches the convention used in scikit-learn's ``LinearSVC``.

   class_weight : dict, 'balanced', or None, default=None
       Class weights applied like in LinearSVC:

       - ``'balanced'`` uses ``n_samples / (n_classes * n_j)``, where ``n_j`` is the
         number of samples in class ``j``.
       - A dict maps label -> weight in the original label space.
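
       The ``'balanced'`` formula can be checked by hand with NumPy. The snippet below only reproduces the stated formula on toy labels; it does not call rehline.

       .. code-block:: python

          import numpy as np

          y = np.array(["neg", "neg", "neg", "pos"])
          classes, counts = np.unique(y, return_counts=True)

          # 'balanced': n_samples / (n_classes * n_j) for each class j.
          weights = y.size / (classes.size * counts)

          # Equivalent explicit dict, keyed by the original labels.
          class_weight = dict(zip(classes.tolist(), weights.tolist()))

       The minority class ``'pos'`` receives weight 2.0 and the majority class ``'neg'`` receives 2/3.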

   multi_class : str or list, default=[]
       Method for multiclass classification. Options:

       - ``'ovo'``: One-vs-One, trains K*(K-1)/2 binary classifiers.
       - ``'ovr'``: One-vs-Rest, trains K binary classifiers.
       - ``[]``, or ignored when only 2 classes are present.
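
       The number of binary problems per strategy can be enumerated directly. The task lists below are purely illustrative and do not reflect how rehline stores its estimators.

       .. code-block:: python

          from itertools import combinations

          classes = ["a", "b", "c", "d"]
          K = len(classes)

          # OvR: one binary problem per class (class k vs. the rest).
          ovr_tasks = [(c, "rest") for c in classes]

          # OvO: one binary problem per unordered pair of classes.
          ovo_tasks = list(combinations(classes, 2))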

   n_jobs : int or None, default=None
       Number of parallel jobs for multiclass fitting.
       None means 1 (serial). -1 means use all available CPUs.
       Passed directly to joblib.Parallel.


   Attributes
   ----------
   ``coef_`` : ndarray of shape (n_features,) for binary, (n_estimators, n_features) for multiclass
       Coefficients of all fitted classifiers, excluding the intercept.

   ``intercept_`` : float for binary, ndarray of shape (n_estimators,) for multiclass
       Intercept term(s). 0.0 if ``fit_intercept=False``.

   ``classes_`` : ndarray of shape (n_classes,)
       Unique class labels in the original label space.

   ``estimators_`` : list, only present for multiclass
       For OvR: list of (coef, intercept) tuples, length K.
       For OvO: list of (coef, intercept, cls_i, cls_j) tuples, length K*(K-1)/2.

   _label_encoder : LabelEncoder
       Encodes original labels into {0,1} for internal training.


   Overview
   ========


   .. list-table:: Methods
      :header-rows: 0
      :widths: auto
      :class: summarytable

      * - :py:obj:`fit <rehline.plq_Ridge_Classifier.fit>`\ (X, y, sample_weight)
        - Fit the classifier to training data.
      * - :py:obj:`decision_function <rehline.plq_Ridge_Classifier.decision_function>`\ (X)
        - Compute the decision function for samples in X.
      * - :py:obj:`predict <rehline.plq_Ridge_Classifier.predict>`\ (X)
        - Predict class labels for samples in X.


   Members
   =======

   .. py:method:: fit(X, y, sample_weight=None)

      Fit the classifier to training data.

      Parameters
      ----------
      X : array-like of shape (n_samples, n_features)
          Training features.

      y : array-like of shape (n_samples,)
          Target labels.

      sample_weight : array-like of shape (n_samples,), default=None
          Per-sample weights. If None, uniform weights are used.

      Returns
      -------
      self : object
          Fitted estimator.


   .. py:method:: decision_function(X)

      Compute the decision function for samples in X.

      For binary classification, returns a 1D array of scores.
      For OvR multiclass, returns a 2D array of shape (n_samples, K).
      For OvO multiclass, returns a 2D array of shape (n_samples, K*(K-1)/2).

      Using ``coef_.T`` works uniformly for both binary (n_features,) and
      multiclass (n_estimators, n_features) shapes.

      Parameters
      ----------
      X : array-like of shape (n_samples, n_features)
          Input samples.

      Returns
      -------
      ndarray of shape (n_samples,) or (n_samples, n_estimators)
          Continuous scores for each sample.
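
      The remark about ``coef_.T`` can be verified with random arrays. The arrays below are stand-ins for fitted attributes, not the output of an actual fit.

      .. code-block:: python

         import numpy as np

         rng = np.random.default_rng(0)
         X = rng.normal(size=(5, 3))

         coef_binary = rng.normal(size=3)        # shape (n_features,)
         coef_multi = rng.normal(size=(4, 3))    # shape (n_estimators, n_features)
         intercept_multi = rng.normal(size=4)    # shape (n_estimators,)

         # Transposing a 1D array is a no-op, so the same expression covers both cases.
         scores_binary = X @ coef_binary.T + 0.5
         scores_multi = X @ coef_multi.T + intercept_multi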


   .. py:method:: predict(X)

      Predict class labels for samples in X.

      - For binary classification, thresholds the decision score at 0.
      - For OvR, takes the argmax across K classifiers.
      - For OvO, uses majority voting across K*(K-1)/2 classifiers.

      Parameters
      ----------
      X : array-like of shape (n_samples, n_features)
          Input samples.

      Returns
      -------
      y_pred : ndarray of shape (n_samples,)
          Predicted class labels in the original label space.
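
      OvO majority voting can be sketched as below. The pair ordering and the convention that a positive score votes for the second class of a pair are assumptions of this sketch, and ties are resolved by ``np.argmax`` (first maximum), which is a simplification.

      .. code-block:: python

         import numpy as np
         from itertools import combinations

         classes = np.array(["a", "b", "c"])
         pairs = list(combinations(range(len(classes)), 2))  # (0, 1), (0, 2), (1, 2)

         # Decision scores of shape (n_samples, K*(K-1)/2), one column per pair.
         scores = np.array([[1.0, 2.0, 0.5],
                            [-1.0, -0.5, 3.0]])

         votes = np.zeros((scores.shape[0], len(classes)), dtype=int)
         rows = np.arange(scores.shape[0])
         for col, (i, j) in enumerate(pairs):
             # Assumption: positive score votes for class j, otherwise class i.
             winner = np.where(scores[:, col] > 0, j, i)
             votes[rows, winner] += 1

         y_pred = classes[np.argmax(votes, axis=1)]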


.. py:class:: plq_Ridge_Regressor(loss=None, constraint=None, C=1.0, U=None, V=None, Tau=None, S=None, T=None, A=None, b=None, max_iter=1000, tol=0.0001, shrink=1, warm_start=0, verbose=0, trace_freq=100, fit_intercept=True, intercept_scaling=1.0)

   Bases: :py:obj:`rehline._class.plqERM_Ridge`, :py:obj:`sklearn.base.RegressorMixin`

   Empirical Risk Minimization (ERM) regressor with a Piecewise Linear-Quadratic (PLQ) loss
   and a ridge penalty, implemented as a scikit-learn compatible estimator.

   This wrapper adds standard sklearn conveniences while delegating loss/constraint construction
   to :class:`plqERM_Ridge` (via ``_make_loss_rehline_param`` / ``_make_constraint_rehline_param``).

   Notes
   -----
   - **Intercept handling**: if ``fit_intercept=True``, a constant column (value = ``intercept_scaling``)
     is appended to the right of the design matrix before calling the base solver. The last learned
     coefficient is then split out as ``intercept_``.
     The column indices of the original features are therefore unchanged, so ``sen_idx`` in a
     ``'fair'`` constraint continues to reference the original columns.
   - **Constraint handling**: constraints are passed through unchanged; the base class calls
     ``_make_constraint_rehline_param(constraint, X, y)`` on the matrix given to ``fit``.
     A ``'fair'`` constraint must be specified as
     ``{'name': 'fair', 'sen_idx': list[int], 'tol_sen': list[float]}``.

   Parameters
   ----------
   loss : dict, default={'name': 'QR', 'qt': 0.5}
       PLQ loss configuration (e.g., median Quantile Regression). Examples:
       ``{'name': 'QR', 'qt': 0.5}``, ``{'name': 'huber', 'tau': 1.0}``,
       ``{'name': 'SVR', 'epsilon': 0.1}``.
       Required keys depend on the chosen loss and are consumed by the underlying solver.
   constraint : list of dict, default=[]
       Constraint specifications supported by ``_make_constraint_rehline_param``:

       - ``{'name': 'nonnegative'}`` or ``{'name': '>=0'}``
       - ``{'name': 'fair', 'sen_idx': list[int], 'tol_sen': list[float]}``
       - ``{'name': 'custom', 'A': ndarray[K, d], 'b': ndarray[K]}``

       Note: when ``fit_intercept=True``, a constant column is appended **as the last column**;
       because ``sen_idx`` indexes sensitive columns of the *original* features, the indices stay valid.
   C : float, default=1.0
       Regularization parameter (absorbed by ReHLine parameters inside the solver).
   U, V, Tau, S, T : ndarray, default empty
       Advanced PLQ parameters for the underlying ReHLine formulation (usually left as defaults).
   A, b : ndarray, default empty
       Optional linear constraint matrices (used only if ``constraint`` contains ``{'name': 'custom'}``).
       ``_make_constraint_rehline_param`` is responsible for validating their shapes.
   max_iter : int, default=1000
       Maximum iterations for the ReHLine solver.
   tol : float, default=1e-4
       Convergence tolerance for the ReHLine solver.
   shrink : int, default=1
       Shrink parameter passed to the solver (see solver docs).
   warm_start : int, default=0
       Warm start flag passed to the solver (see solver docs).
   verbose : int, default=0
       Verbosity for the solver (0: silent).
   trace_freq : int, default=100
       Iteration frequency to trace solver internals (if ``verbose`` is enabled).
   fit_intercept : bool, default=True
       If ``True``, append a constant column (value = ``intercept_scaling``) to the design matrix
       before calling the solver. The learned last coefficient is then split as ``intercept_``.
   intercept_scaling : float, default=1.0
       Scaling applied to the appended constant column when ``fit_intercept=True``.

   Attributes
   ----------
   ``coef_`` : ndarray of shape (n_features,)
       Learned linear coefficients (excluding the intercept term).
   ``intercept_`` : float
       Intercept term extracted from the last coefficient when ``fit_intercept=True``, otherwise 0.0.
   ``n_features_in_`` : int
       Number of input features seen during :meth:`fit` (before intercept augmentation).

   Notes
   -----
   This estimator **does not support sparse input**. If you need sparse support, convert inputs to dense
   or wrap this estimator in a scikit-learn :class:`~sklearn.pipeline.Pipeline` with a transformer that
   densifies data (at the cost of memory).


   Overview
   ========


   .. list-table:: Methods
      :header-rows: 0
      :widths: auto
      :class: summarytable

      * - :py:obj:`fit <rehline.plq_Ridge_Regressor.fit>`\ (X, y, sample_weight)
        - Fit the regressor to training data.
      * - :py:obj:`decision_function <rehline.plq_Ridge_Regressor.decision_function>`\ (X)
        - Compute f(X) = X @ ``coef_`` + ``intercept_``.
      * - :py:obj:`predict <rehline.plq_Ridge_Regressor.predict>`\ (X)
        - Predict targets as the linear decision function.


   Members
   =======

   .. py:method:: fit(X, y, sample_weight=None)

      Fit the regressor to training data.

      If ``fit_intercept=True``, a constant column (value = ``intercept_scaling``) is appended
      to the **right** of ``X`` before calling the base solver. The base class
      (:class:`plqERM_Ridge`) will construct the loss and constraints via its internal helpers
      on the matrix passed here. After solving, the last coefficient is split as
      ``intercept_`` and removed from ``coef_``.

      Parameters
      ----------
      X : ndarray of shape (n_samples, n_features)
          Training design matrix (dense). Sparse inputs are not supported.
      y : ndarray of shape (n_samples,)
          Target values.
      sample_weight : ndarray of shape (n_samples,), default=None
          Optional per-sample weights; forwarded to the underlying solver.

      Returns
      -------
      self : object
          Fitted estimator.


   .. py:method:: decision_function(X)

      Compute f(X) = X @ ``coef_`` + ``intercept_``.

      Parameters
      ----------
      X : ndarray of shape (n_samples, n_features)
          Input data (dense). Must have the same number of features as seen in :meth:`fit`.

      Returns
      -------
      scores : ndarray of shape (n_samples,)
          Predicted real-valued scores.


   .. py:method:: predict(X)

      Predict targets as the linear decision function.

      Parameters
      ----------
      X : ndarray of shape (n_samples, n_features)
          Input data (dense).

      Returns
      -------
      y_pred : ndarray of shape (n_samples,)
          Predicted target values (real-valued).


Functions
---------
.. py:function:: ReHLine_solver(X, U, V, Tau=None, S=None, T=None, A=None, b=None, rho=None, Lambda=None, Gamma=None, xi=None, mu=None, max_iter=1000, tol=0.0001, shrink=1, verbose=1, trace_freq=100)

.. py:function:: make_mf_dataset(n_users, n_items, n_factors=20, n_interactions=None, density=0.01, noise_std=0.1, seed=None, rating_min=1.0, rating_max=5.0, return_params=True)

   Generate synthetic rating data using matrix factorization model.

   Creates synthetic user-item rating data based on the matrix factorization
   approach commonly used in recommender systems. The ratings are generated
   as: rating = mu + user_bias + item_bias + user_factor * item_factor + noise

   Parameters
   ----------
   n_users : int
       Number of users in the synthetic dataset

   n_items : int
       Number of items in the synthetic dataset

   n_factors : int, default=20
       Number of latent factors for user and item embeddings

   n_interactions : int, optional
       Exact number of user-item interactions. If None, calculated as density * total_pairs

   density : float, default=0.01
       Density of the rating matrix (ignored if n_interactions is specified)

   noise_std : float, default=0.1
       Standard deviation of Gaussian noise added to ratings

   seed : int, optional
       Random seed for reproducible results

   rating_min : float, default=1.0
       Minimum possible rating value

   rating_max : float, default=5.0
       Maximum possible rating value

   return_params : bool, default=True
       If True, returns the underlying model parameters (P, Q, bu, bi, mu)

   Returns
   -------
   dict
       Dictionary containing:

       - **X** : ndarray of shape (n_interactions, 2)
           User-item pairs where X[:, 0] are user indices and X[:, 1] are item indices
       - **y** : ndarray of shape (n_interactions,)
           Synthetic ratings for each user-item pair
       - **params** : dict, optional
           Only returned if return_params=True. Contains:

           * **P** : ndarray of shape (n_users, n_factors)
               User factor matrix
           * **Q** : ndarray of shape (n_items, n_factors)
               Item factor matrix
           * **bu** : ndarray of shape (n_users,)
               User biases
           * **bi** : ndarray of shape (n_items,)
               Item biases
           * **mu** : float
               Global mean rating

   Notes
   -----
   The rating generation follows the standard matrix factorization model:

       r_ui = μ + b_u + b_i + p_u · q_i^T + ε

       where ε ~ N(0, noise_std²)

   The generated ratings are clipped to stay within [rating_min, rating_max] range.
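
   The generation model above can be sketched in NumPy. The parameter distributions, the global mean, and the way user-item pairs are sampled are hypothetical choices for illustration; ``make_mf_dataset`` may draw them differently.

   .. code-block:: python

      import numpy as np

      rng = np.random.default_rng(0)
      n_users, n_items, n_factors = 30, 20, 4
      n_interactions, noise_std = 100, 0.1
      rating_min, rating_max = 1.0, 5.0

      # Hypothetical parameter draws (not necessarily those of make_mf_dataset).
      mu = 3.0
      bu = rng.normal(scale=0.5, size=n_users)
      bi = rng.normal(scale=0.5, size=n_items)
      P = rng.normal(scale=0.5, size=(n_users, n_factors))
      Q = rng.normal(scale=0.5, size=(n_items, n_factors))

      # Sample user-item pairs (duplicates are possible in this simplified sketch).
      users = rng.integers(0, n_users, size=n_interactions)
      items = rng.integers(0, n_items, size=n_interactions)
      X = np.column_stack([users, items])

      # r_ui = mu + b_u + b_i + <p_u, q_i> + eps, then clip to the rating range.
      eps = rng.normal(scale=noise_std, size=n_interactions)
      y = mu + bu[users] + bi[items] + np.sum(P[users] * Q[items], axis=1) + eps
      y = np.clip(y, rating_min, rating_max)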


.. py:function:: CQR_Ridge_path_sol(X, y, *, quantiles, eps=1e-05, n_Cs=50, Cs=None, max_iter=5000, tol=0.0001, verbose=0, shrink=1, warm_start=False, return_time=True)

   Compute the regularization path for Composite Quantile Regression (CQR) with ridge penalty.

   This function fits a series of CQR models using different values of the regularization parameter `C`.
   It reuses a single estimator and modifies `C` in-place before refitting.

   Parameters
   ----------
   X : ndarray of shape (n_samples, n_features)
       Feature matrix.

   y : ndarray of shape (n_samples,)
       Response vector.

   quantiles : list of float
       Quantile levels (e.g. [0.1, 0.5, 0.9]).

   eps : float, default=1e-5
       Log-scaled lower bound for generated `C` values (used if `Cs` is None).

   n_Cs : int, default=50
       Number of `C` values to generate.

   Cs : array-like or None, default=None
       Explicit values of regularization strength. If None, use `eps` and `n_Cs` to generate them.

   max_iter : int, default=5000
       Maximum number of solver iterations.

   tol : float, default=1e-4
       Solver convergence tolerance.

   verbose : int, default=0
       Verbosity level.

   shrink : float, default=1
       Shrinkage parameter passed to solver.

   warm_start : bool, default=False
       Use previous dual solution to initialize the next fit.

   return_time : bool, default=True
       Whether to return a list of fit durations.

   Returns
   -------
   Cs : ndarray
       Array of regularization strengths.

   models : list
       List of fitted model objects.

   coefs : ndarray of shape (n_Cs, n_quantiles, n_features)
       Coefficient matrices per quantile and `C`.

   intercepts : ndarray of shape (n_Cs, n_quantiles)
       Intercepts per quantile and `C`.

   fit_times : list of float, optional
       Elapsed fit times (if `return_time=True`).


   Example
   -------
   >>> from sklearn.datasets import make_friedman1
   >>> from sklearn.preprocessing import StandardScaler
   >>> import numpy as np
   >>> from rehline import CQR_Ridge_path_sol

   >>> # Generate the data
   >>> X, y = make_friedman1(n_samples=500, n_features=6, noise=1.0, random_state=42)
   >>> X = StandardScaler().fit_transform(X)
   >>> y = y / y.std()

   >>> # Set quantiles and Cs
   >>> quantiles = [0.1, 0.5, 0.9]
   >>> Cs = np.logspace(-5, 0, 30)

   >>> # Fit CQR path
   >>> Cs, models, coefs, intercepts, fit_times = CQR_Ridge_path_sol(
   ...     X, y,
   ...     quantiles=quantiles,
   ...     Cs=Cs,
   ...     max_iter=100000,
   ...     tol=1e-4,
   ...     verbose=1,
   ...     warm_start=True,
   ...     return_time=True
   ... )


.. py:function:: plqERM_Ridge_path_sol(X, y, *, loss, constraint=None, eps=0.001, n_Cs=100, Cs=None, max_iter=5000, tol=0.0001, verbose=0, shrink=1, warm_start=False, return_time=True)

   Compute the PLQ Empirical Risk Minimization (ERM) path over a range of regularization parameters.
   This function evaluates the model's performance for different values of the regularization parameter
   and provides structured benchmarking output.

   Parameters
   ----------
   X : ndarray of shape (n_samples, n_features)
       Training input samples.

   y : ndarray of shape (n_samples,)
       Target values corresponding to each input sample.

   loss : dict
       Dictionary describing the PLQ loss function parameters. Used to construct the loss object internally.

   constraint : list of dict, optional (default=[])
       List of constraints applied to the optimization problem. Each constraint should be represented
       as a dictionary compatible with the solver.


   eps : float, default=1e-3
       Defines the length of the regularization path when ``Cs`` is not provided.
       The values of ``C`` will range from ``10^log10(eps)`` to ``10^-log10(eps)``,
       that is, from ``eps`` to ``1/eps``.

   n_Cs : int, default=100
       Number of regularization values to evaluate if `Cs` is not provided.

   Cs : array-like of shape (n_Cs,), optional
       Explicit values of regularization strength `C` to use. If `None`, the values are generated
       logarithmically between 1e-2 and 1e3.

   max_iter : int, default=5000
       Maximum number of iterations allowed for the optimization solver at each `C`.

   tol : float, default=1e-4
       Tolerance for solver convergence.

   verbose : int, default=0
       Controls verbosity level of output. Set to higher values (e.g., 1 or 2) for detailed progress logs.
       When verbose = 1, only print path results table;
       when verbose = 2, print path results table and path solution plot.

   shrink : float, default=1
       Shrinkage factor for the solver, potentially influencing convergence behavior.

   warm_start : bool, default=False
       If True, reuse the previous solution to warm-start the next solver step, speeding up convergence.

   return_time : bool, default=True
       If True, return timing information for each value of `C`.

   Returns
   -------
   Cs : ndarray of shape (n_Cs,)
       Array of regularization parameters used in the path.

   times : list of float
       Time in seconds taken to fit the model at each `C`. Returned only if `return_time=True`.

   n_iters : list of int
       Number of iterations used by the solver at each regularization value.

   obj_values : list of float
       Final objective values (including loss and regularization terms) at each `C`.

   L2_norms : list of float
       L2 norm of the coefficients (excluding bias) at each `C`.

   coefs : ndarray of shape (n_features, n_Cs)
       Learned model coefficients at each regularization strength.
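
   Given the documented layout of ``coefs`` (one column per value of ``C``), per-``C`` quantities can be recovered column-wise. The array below is a made-up stand-in for the returned ``coefs``, not real solver output.

   .. code-block:: python

      import numpy as np

      # Stand-in for ``coefs`` with n_features = 3 and n_Cs = 4.
      coefs = np.array([[0.0, 0.3, 0.6, 3.0],
                        [0.0, 0.4, 0.8, 4.0],
                        [0.0, 0.0, 0.0, 0.0]])

      # Column k holds the coefficient vector fitted at Cs[k].
      l2_norms = np.linalg.norm(coefs, axis=0)

      # Coefficients at the last value of C on the path.
      coef_last = coefs[:, -1]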

   Example
   -------

   >>> import numpy as np
   >>> from rehline import plqERM_Ridge_path_sol

   >>> # generate data
   >>> np.random.seed(42)
   >>> n, d = 1000, 5
   >>> X = np.random.randn(n, d)
   >>> beta0 = np.random.randn(d)
   >>> y = np.sign(X.dot(beta0) + np.random.randn(n))

   >>> # define loss function, regularization grid, and constraint
   >>> loss = {'name': 'svm'}
   >>> Cs = np.logspace(-1, 3, 15)
   >>> constraint = [{'name': 'nonnegative'}]

   >>> # compute the solution path
   >>> Cs, times, n_iters, losses, norms, coefs = plqERM_Ridge_path_sol(
   ...     X, y, loss=loss, Cs=Cs, max_iter=100000, tol=1e-4, verbose=2,
   ...     warm_start=False, constraint=constraint, return_time=True
   ... )