---
title: "Solving Linear Regression using Numeric Optimization"
author: James Joseph Balamuta and Dirk Eddelbuettel
abstract: |
  In this vignette, we describe how to use RcppEnsmallen in a standalone C++ file.
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Solving Linear Regression using Numeric Optimization}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

# Overview

The `RcppEnsmallen` package provides an embedded copy of the [`ensmallen`](https://www.ensmallen.org/docs.html) C++ library of optimization functions. The optimizers it contains are state of the art and possess a high level of code quality. Each optimizer must be accessed through _C++_ by implementing the appropriate objective functions and, then, surfaced into _R_ through [`RcppArmadillo`](https://cran.r-project.org/package=RcppArmadillo). Alternatively, work has been done by Dirk Schumacher in [`armacmp`](https://github.com/dirkschumacher/armacmp) to automatically create the underlying _C++_ code from _R_.

**Note:** Optimizers in `RcppEnsmallen` only work with [`armadillo`](https://arma.sourceforge.net/docs.html) data structures. Thus, if using [`Eigen`](https://eigen.tuxfamily.org/index.php?title=Main_Page) through [`RcppEigen`](https://cran.r-project.org/package=RcppEigen), please consider the [`RcppNumerical`](https://cran.r-project.org/package=RcppNumerical) package instead.
## Linear Regression

Consider the **Residual Sum of Squares**, also known as **RSS**, defined as:

$$RSS\left( \beta \right) = \left( { \mathbf{y} - \mathbf{X} \beta } \right)^{\top} \left( \mathbf{y} - \mathbf{X} \beta \right)$$

The objective function we wish to minimize is then:

$$f(\beta) = \lVert \mathbf{y} - \mathbf{X}\beta \rVert_2^2$$

The gradient is defined as:

$$\frac{\partial RSS}{\partial \beta} = -2 \mathbf{X}^{\top} \left(\mathbf{y} - \mathbf{X} \beta \right)$$

### Two-Step Implementation

When using `ensmallen` to solve this problem, we must create a _C++_ class that computes both the objective function value and its gradient value, either together or separately, under member functions named as:

- `Evaluate()`: Compute the value of the objective function under the given parameters.
- `Gradient()`: Compute the gradient of the objective function under the given parameters.
- `EvaluateWithGradient()`: Perform both steps at the same time. (Optional)

In the linear regression scenario, we will define each step separately to emphasize the calculation occurring. Generally, the one-step `EvaluateWithGradient()` function will be faster than the two-step variant. More details on the design can be found on the [`ensmallen` documentation page for differentiable functions](https://www.ensmallen.org/docs.html#differentiable-functions).

Before writing the class, `RcppEnsmallen` requires accessing the library in a standalone C++ file with the following include and Rcpp Attributes declarations:

```{Rcpp header, eval = FALSE}
#include <RcppEnsmallen.h>
// [[Rcpp::depends(RcppEnsmallen)]]
```

The overarching linear regression class should be constructed as follows:

```{Rcpp classdef, eval = FALSE}
#include <RcppEnsmallen.h>
// [[Rcpp::depends(RcppEnsmallen)]]

// Define a differentiable objective function by implementing both Evaluate()
// and Gradient() separately.
class LinearRegressionFunction
{
 public:
  // Construct the object with the given design
  // matrix and responses.
  LinearRegressionFunction(const arma::mat& X, const arma::vec& y) :
      X(X), y(y) { }

  // Return the objective function value for model parameters beta.
  double Evaluate(const arma::mat& beta)
  {
    return std::pow(arma::norm(y - X * beta), 2.0);
  }

  // Compute the gradient for model parameters beta.
  void Gradient(const arma::mat& beta, arma::mat& g)
  {
    g = -2 * X.t() * (y - X * beta);
  }

 private:
  // The design matrix.
  const arma::mat& X;
  // The responses to each data point.
  const arma::vec& y;
};
```

From there:

1. Construct a _C++_ function that exports into _R_.
2. Within the function, determine an appropriate optimizer for the problem.
3. Combine the optimizer with the linear regression class to compute the solution to the problem.

**Note:** Make sure the definition of the linear regression class is in the same _C++_ file as the _C++_ function exported into _R_.

```{Rcpp linreg, eval = FALSE}
// [[Rcpp::export]]
arma::mat lin_reg_lbfgs(const arma::mat& X, const arma::vec& y) {

  // Construct the objective function.
  LinearRegressionFunction lrf(X, y);

  // Create the L_BFGS optimizer with default parameters.
  // The ens::L_BFGS type can be replaced with any ensmallen optimizer that can
  // handle differentiable functions.
  ens::L_BFGS lbfgs;
  lbfgs.MaxIterations() = 10;

  // Create a random starting point for the optimization.
  // The model has p parameters, so the shape is p x 1.
  arma::mat beta(X.n_cols, 1, arma::fill::randn);

  // Run the optimization.
  lbfgs.Optimize(lrf, beta);

  return beta;
}
```

### Verifying Results

Prior to using the new optimizer in mission-critical work, compare its results to methods already implemented in _R_. The best way to achieve this is to create an oracle model by specifying the parameters known to generate the data and, then, trying to recover them. Moreover, if a method is already implemented in _R_, check that the results are equal within an appropriate tolerance threshold.

Following this methodology, data must first be generated.
```{r data-generation}
n <- 1e6
beta <- c(-2, 1.5, 3, 8.2, 6.6)
p <- length(beta)
X <- cbind(1, matrix(rnorm(n), ncol = p - 1))
y <- X %*% beta + rnorm(n / (p - 1))
```

Next, the optimization procedure is used to estimate the parameters of interest. Under this example, the results of the estimation can be compared to `lm()`. That said, `lm()` may give more precise results than the optimizer, as it is implemented with a closed-form solution to linear regression and its computation is performed more rigorously.

```{r data-estimation, eval = FALSE}
coefs_lbfgs <- lin_reg_lbfgs(X, y)
coefs_lm    <- lm.fit(X, y)$coefficients
```

```{r, echo = FALSE}
coefs_lbfgs <- RcppEnsmallen::lin_reg_lbfgs(X, y)
coefs_lm    <- lm.fit(X, y)$coefficients

compare_coefs <- cbind(coefs_lbfgs, coefs_lm)
colnames(compare_coefs) <- c("LBFGS", "LM")
rownames(compare_coefs) <- paste0("Beta", seq_len(nrow(compare_coefs)))

knitr::kable(compare_coefs, longtable = FALSE,
             caption = "Comparison of Estimated Coefficients")
```