Shape-Restricted Regression Splines with R Package splines2

Splines are important tools for the ﬂexible modeling of curves and surfaces in regression analy-ses. Functions for constructing spline basis functions are available in R through the base package splines . When the curves to be modeled have known characteristics in monotonicity or curvature, more eﬃcient statistical inferences are possible with shape-restricted splines. Such splines, however, are not available in the R package splines . The package splines2 provides easy-to-use shape-restricted spline basis functions, along with their derivatives and integrals which are important tools in many inference scenarios. It also provides additional splines and features that are not available in the splines package, such as periodic splines and generalized Bernstein polynomials. The usages of the functions are illustrated with shape-restricted regression, recurrent event data analysis, and extreme-value copulas.


Introduction
Splines are piecewise polynomials that are continuously differentiable up to a certain degree, connected at a sequence of breakpoints called knots. A spline function can be represented by a linear combination of spline basis functions. A variety of spline-based regression methods such as smoothing splines, penalized splines (Eilers and Marx, 1996), and shape-restricted splines (e.g., Meyer, 2008;Wang and Ghosh, 2012) have been widely used for nonparametric regression problems. Non-uniform B-splines are often used as the basis functions in spline-based methods. Besides B-splines, I-splines (Ramsay, 1988) as integrals of M-splines (Curry and Schoenberg, 1966) are used for monotone regressions; C-splines (Meyer, 2008) as integrals of I-splines are used for shape-restricted regressions. In some applications, such as time-to-event analysis, the derivatives or integrals of the basis functions are needed in inferences. A user-friendly and computationally efficient implementation to construct the desired spline basis functions, along with their derivatives or integrals, is necessary for routine uses of spline-based methods.
There are several existing implementations providing various spline basis functions and their derivatives/integrals in R. We focus on those R packages that are available on the Comprehensive R Archive Network (CRAN). The base R package splines (R Core Team, 2020) provides functions to construct B-splines, their derivatives, and natural cubic splines. Packages ibs (Chen, 2018) and pbs (Wang, 2013) provide the integral of B-splines and periodic B-splines, respectively. The orthogonalsplinebasis package (Redd, 2015) provides functions for orthogonal B-spline basis functions and their derivatives and integrals. See also Perperoglou et al. (2019) for a recent review of other packages that contain spline-based regression routines.
The package splines2 is intended to be a comprehensive, efficient supplement to the package splines. Since its first appearance on CRAN in September 2016, splines2 has been actively developed and maintained over the years. The package contains several attractive features. First, it provides spline basis functions that are not available in splines. Such splines include M-splines, I-splines, and C-splines for shape-restricted regression; generalized Bernstein polynomials; and periodic splines. Second, it generalizes and refines the implementation of the B-splines and natural cubic spline basis functions in splines. It allows basis functions of degree zero, which seems to be trivial but is particularly useful for certain applications (Shea and Torgovitsky, 2020;Mogstad et al., 2018). The implementation of natural cubic splines utilizes a closed-form null space, which is computationally more efficient than using the QR decomposition as done in splines. The basis functions generated this way are nonnegative within the given boundary, which makes it trivial to impose nonnegativity on the resulting spline function for some applications. The third distinct feature of splines2 is that derivatives and integrals of all the basis functions (except the integrals of C-splines) are provided, which can be necessary for some applications such as time-to-event modeling; the package splines only contains derivatives but not integrals of B-splines. Lastly, the under-the-hood C++ implementation based on the packages Rcpp (Eddelbuettel, 2013) and RcppArmadillo (Eddelbuettel and Sanderson, 2014) are made accessible from C++. See Section 3 for details. The C++ engine is also why splines2 is computationally more efficient than a few existing packages in handling common tasks such as integration.
The rest of the article is organized as follows. In Section 2, we provide a practical review to those spline basis functions implemented in splines2 with a focus on M-splines, I-splines, C-splines, and periodic splines. The usage of the main functions in the package is summarized and illustrated in Section 3. In Section 4, we demonstrate the applications of the package to shape-restricted regression, time-to-event data analysis, and extreme-value copulas, respectively. Section 5 concludes with a discussion. An introduction to generalized Bernstein polynomials, B-splines, and natural cubic splines, as well as code for micro-benchmarks, are available in the Supplementary Materials.

Knot Sequences
It is necessary to define knot sequences before introducing spline basis functions. Suppose m (m 0) distinct internal knots, ξ 1 , . . . , ξ m , are placed within the boundary knots L and U satisfying L < ξ 1 < · · · < ξ m < U. Define t 1 = · · · = t d = L, t d+j = ξ j , j ∈ {1, . . . , m}, and U = t d+m+1 = · · · = t d+p , where k and d = k + 1 represent the polynomial degree and the order of a basis function, respectively, and p = d +m represent the degrees of freedom of a basis function. Such a nondecreasing sequence t 1 , . . . , t d+p is called the simple knot sequence by Ramsay (1988) or the clamped knot vector by Piegl and Tiller (1997), where each boundary knot is repeated d times (to allow basis functions to be discontinuous at the boundary) while each interior knot appears only once. Given internal knots, ξ 1 , . . . , ξ m , and boundary knots, L and U , we denote the simple knot sequence intended for splines of degree k by s k . Splines of degree k ∈ {1, 2, . . .} from the simple knot sequence s k are continuously differentiable at internal knots up to k − 1 times, which is the most commonly used definition of splines.
More generally, a knot is called an r-fold knot (Prautzsch et al., 2002, Chapter 5) or to have multiplicity r if the knot appears r times in the knot sequence, where r k is a positive integer. Multiplicities induce discontinuities. For example, B-splines are continuous up to the (k − r)th derivative at the internal knot ξ j that has multiplicity r. Further, we can relax t 1 = · · · = t d = L and t d+m+1 = · · · = t d+p = U , respectively, to t 1 · · · t d = L and U = t d+m+1 · · · t d+p . Such a generalized knot sequence, t 1 · · · t d = L < t d+1 · · · t d+m < U = t d+m+1 · · · t d+p , is referred to as the extended partition in Schumaker (2007), where m represents the total multiplicities of internal knots instead of the number of distinct internal knots. The degrees of freedom of the resulting spline basis functions equal the length of the knot sequence minus the order.

M-Splines, I-Splines, and C-Splines
M-splines are also called Curry-Schoenberg B-splines in De Boor (1978). Given a simple knot sequence s k , the ith M-spline basis function (Curry and Schoenberg, 1966) of degree k denoted by M i,k (x | s k ) can be considered as the normalized B-spline basis function satisfying so that Furthermore, the first derivative of M i,k (x | s k ) is as follows: The M-splines integrated from t 1 to x were referred to as I-splines in Ramsay (1988). To be more specific, the ith I-spline basis function of degree k denoted by where t l+k+2 and t l+1 are defined for the knot sequence s k+1 . M-splines are nonnegative for L x U . Hence, I-splines are monotonically nondecreasing from L to U by definition. Ramsay (1988) proposed using I-splines for monotone regression. A monotonically nondecreasing (nonincreasing) function can be fitted by a linear combination of I-splines and an additional intercept term, where the monotonicity is ensured by constraining the coefficients of I-splines to be nonnegative (nonpositive).
For regressions with curvature restrictions, Meyer (2008) introduced a class of convex splines known as C-splines. By integrating both sides of (3) for I-splines of degree k + 1, we obtain the formula for the integral of the ith I-spline basis function denoted byC i,k (x | s k ) as follows: We define the ith C-spline basis function denoted by C i,k (x | s k ) to be the scaledC i,k (x | s k ), so that C i,k (U | s k ) = 1 for i ∈ {1, . . . , p}. A convex or concave function can be approximated by a linear combination of C-splines, an identity function f (x) = x, and an intercept term, where the coefficients of C-splines are constrained to be nonnegative or nonpositive to guarantee convexity or concavity, respectively. A set of M-splines and the corresponding I-splines and C-splines are visualized in the first row of Figure 1.

Periodic Splines
We define a spline function s(x) of degree k to be periodic with a cyclic interval from L to U if it satisfies that s( In applications to computer-aided design (CAD), periodic basis functions are used to form closed curves or surfaces (e.g., Farin and Hansford, 2000).
Given that m k − 1, Piegl and Tiller (1997, Chapter 12) gave two sufficient conditions for a B-spline function of degree k to be periodic with a cyclic interval from L to U as follows: 1. The underlying B-spline basis functions are based on an extended knot sequence s k satisfying where t d = L and t d+m+1 = U ; 2. The first k coefficients and the last k coefficients must coincide.
The first condition essentially provides a simple procedure to construct an extended knot vector from the given m internal knots t d+1 , . . . , t d+m , and a cyclic interval from L to U . For the B-splines {B 1 , . . . , B m+d } resulting from the extended knot sequence satisfying the first condition, we incorporate the second condition by summing up the first k and the last k basis functions, respectively, to a new set of k basis functions {B 1 + B m+2 , . . . , B k + B m+d }, which along with the remaining basis functions {B k+1 , . . . , B m+1 } (if m k) form a set of periodic B-spline basis functions. A periodic spline function can then be simply represented by a linear combination of these basis functions.
Such a procedure can also be applied directly to construct periodic M-spline basis functions. The constructed periodic M-splines inherit the normalization property of regular M-splines and have unit integrals from L to U . It is thus easier to compute the integrals of the basis functions from L to any x U . For instance, a set of periodic M-splines and the corresponding first derivatives and integrals are visualized in the second row of Figure 1.

Usage
Package splines2 provides implementation of those spline basis functions reviewed in Section 2 and the Supplementary Materials. The function interfaces are designed to be similar and consistent to the function bs() of splines so that no unnecessary learning curve is required for those users familiar with this package. The common arguments are x, df, knots, degree, intercept, and Boundary.knots, all of which have the same meaning and equivalent default values as bs() except that the default value of the argument intercept is TRUE for iSpline() and cSpline(). The x represents the predictor variable and allows NA's. Either df or knots can be specified to adjust the degrees of freedom of the basis functions given their polynomial degree specified by degree. The function bSpline() from splines2 extends the bs() function by allowing degree = 0 for piecewise constant basis functions. If intercept = TRUE, the complete set of basis functions will be returned. Otherwise, the first basis function will be excluded from the returned matrix. Table 1 gives a summary of the argument list of the main functions compared to the bs() function. The argument derivs takes a nonnegative integer representing the order of derivatives of the spline basis functions, while the argument integral takes a boolean value and the integrals of the basis functions will be returned if integral = TRUE is specified. For C-splines, the argument scale takes a boolean value indicating if the scaling of the integral of the I-spline basis functions in (4) is needed.
The package also provides several S3 methods for the basis matrix returned by any of the main functions. More specifically, we implemented a series of predict() methods to evaluate the basis functions at any (new) x specified by its second argument newx following the method predict.bs() of splines. In addition, the package provides deriv() methods for the derivatives of each basis function. The order of the derivatives can be specified by its second argument derivs. The knots() methods retrieve the placed internal or boundary knots, which can be useful when the knots are determined automatically based on the input x.
We created various example spline basis functions in the following code chunk by using the package splines2. We fixed the degree of the basis functions to be 3 and placed the boundary knots at 0 and 2. For all the spline basis functions, we placed the internal knots at 0.3, 0.7, 1.2, and 1.5. The resulting basis functions are visualized by distinct colors and line types in Figure 1, where the placement of the internal knots is indicated by gray dashed vertical lines.
In addition to the interface in R, the package splines2 includes a C++ header-only library, which allows the construction of spline basis matrices directly in C++ with the help of the packages Rcpp and RcppArmadillo. The Rcpp interface is intended for package developers who would like to use the package at C++ level. An introduction is available in one of the package vignettes (accessible by vignette("splines2-wi-rcpp")).

Shape-Restricted Regression
Shape constraints in terms of monotonicity or curvature arise naturally in many applications. Examples are growth curves in population health, dose-response models in phase I clinical trials, and utility functions in economics. Consider a general nonparametric regression model E(Y | X) = f (X), where X and Y are R-valued predictor and response variables, respectively. The observed data are independent pairs of (x i , y i ), i ∈ {1, . . . , n}. The true function f (·) is assumed to be continuous with certain known shape constraints. We illustrate how to apply I-splines and C-splines to shape-restricted regression following Ramsay (1988) and Meyer (2008), respectively, to estimate f (·).
We applied mono_fit() to the age-income data used in Ruppert et al. (2003) and Meyer (2008). The dataset consists of 205 Canadian workers and is available in the SemiPar package (Wand, 2018). A nondecreasing relationship between the age of workers and the expected logarithm of the income was considered. In the following code chunk, we fitted three nondecreasing curves by I-splines and specified the degrees of freedom of the basis functions to be 6, 10, and 14, respectively. data(age.income, package = "SemiPar") dfs <-c(6, 10, 14) mfits <-lapply(dfs, function(d) { with(age.income, mono_fit(age, log.income, df = d)) }) A convex or concave curve can be fitted by a linear combination of C-spline basis functions, an identity function, and an intercept term, where the coefficients of the C-splines are constrained to be nonnegative or nonpositive, respectively (Meyer, 2008). For convex regression, we estimate the coefficients by constrained least squares to minimize the squared error loss, where the y i 's are the realizations of the Y i 's, θ = (α 0 , α 1 , β ) , α 0 is the intercept, α 1 is the coefficient of the linear term, and C(x i ) is a vector that consists of C-splines evaluated at x i . Define x = (x 1 , . . . , x n ) and X = 1 x C(X) , where C(X) = C(x 1 ) · · · C(x n ) . Problem (7), again, is a quadratic programming problem with d = y X, b 0 = 0, D = X X, and A = 0 0 I p , where I p is an identity matrix of size p × p and p is the degrees of freedom of the C-spline basis functions. The sign of A is flipped for concave regression. A combination of the curvature constraint and the monotonicity constraint can be achieved by introducing an additional constraint in terms of the first derivative at one of the boundary knots. For instance, to fit a nondecreasing convex (or nonincreasing concave) curve, it suffices to additionally impose the constraint that the first derivative of the curve is nonnegative (or nonpositive) at the left boundary. Similarly, to fit a nondecreasing concave (or nonincreasing convex) curve, we can additionally impose the constraint that the first derivative of the curve is nonnegative (or nonpositive) at the right boundary. Given the first derivatives of the Csplines, it is straightforward to add the derivative constraint at the boundary to the original quadratic programming problem for the curvature constraints only. We applied cSpline() and implemented the estimation procedure named curv_fit() in the following code chunk: ## curvature-restricted regression curv_fit <-function(x, y, monotone = c("none", "increasing", "decreasing"), curvature = c("convex", "concave"), ...) { monotone <-match.arg(monotone); curvature <-match.arg(curvature) min_x <-min(x); range_x <-max(x) -min_x x_mat <-cSpline(x, ..., intercept = TRUE) # create C-splines if (monotone != "none") { right_side <-(monotone == "decreasing" && curvature == "convex") || (monotone == "increasing" && curvature == "concave") side_knot <-knots(x_mat, "boundary")[right_side + 1] dx_mat <-cbind(1 / range_x, deriv(predict(x_mat, side_knot))) } x_mat <-cbind("(Linear)" = (x -min_x) / range_x, x_mat) standardized <-standardize(x_mat, y) A_mat <-rbind(0, diag(ncol(x_mat) -1L)) if (curvature == "concave") A_mat <--A_mat if (monotone == "decreasing") dx_mat <--dx_mat if (monotone != "none") { for (j in seq_len(ncol(dx_mat))) dx_mat[, j] <-dx_mat[, j] / standardized$x_sd[j] A_mat <-cbind(A_mat, as.numeric(dx_mat)) } qp_solve(standardized, A_mat, x_mat) } Similarly as before, we standardized each C-spline basis function and the identity term by standardize(). The linear constraints for the desired curvature (and monotonicity) can be applied to the standardized basis functions and the standardized linear term without any changes. The coefficient estimates are obtained by solving the quadratic programming problem and scaling in the helper function qp_solve(). Now suppose there is a concave relationship between age and the expected logarithm of the income. We can apply the function curv_fit() to the age-income data as follows: cfits <-lapply(dfs, function(d) { with(age.income, curv_fit(age, log.income, "none", "concave", df = d)) }) The fitted nondecreasing and concave curves are visualized, respectively, in the left and right panel of Figure 2, from which we may find that despite having different degrees of freedom, the fitted curves stay closely to each other in each panel. It implies that with shape constraints, the fitted curves are robust with respect to the choice of the degrees of freedom and the placement of the internal knots. We also fitted a natural cubic smoothing spline function (via stats::smooth.spline()) to the age-income data for comparison. The tuning parameter of the smoothing spline was selected by generalized cross-validation. The fitted smoothing spline is visualized in both panels of Figure 2, which is neither nondecreasing nor convex without shape constraints.
To demonstrate simultaneous monotone and curvature constraints, as a second example, we applied mono_fit() and curv_fit() to the onions dataset that is also available in the SemiPar package. The degrees of freedom of the I-splines and C-splines considered were similarly 6, 10, and 14. data(onions, package = "SemiPar") mfits <-lapply(dfs, function(d) { with(onions, mono_fit(dens, log(yield), "decreasing", df = d)) }) cfits <-lapply(dfs, function(d) { with(onions, curv_fit(dens, log(yield), "decreasing", "convex", df = d)) }) It is assumed that the logarithm of the onion yield (grams per plant) decreases as the areal density of plants (plants per square meter) increases. We additionally assumed a convex relationship between log(yield) and density, which means that the decreasing rate of log(yield) did not increase as the density increased. The fitted curves with monotonicity constraint are only visualized in the left panel of Figure 3 and the fitted curves with additional convex constraints are visualized in the right panel of Figure 3. With additional curvature constraints, the convex curves fitted by C-splines of different degrees of freedom stay more closely to each other than the nonincreasing curves fitted by I-splines. Similarly, we fitted a natural cubic smoothing spline and visualized it in both panels of Figure 3 for comparison, from which we found that the fitted convex curves were very close to the fitted smoothing spline function overall.

Time-to-Event Data Analysis
Splines are natural choices to model the baseline hazard function h 0 (t), the cumulative hazard function H 0 (t), or their logarithms, respectively, for time-to-event data. From t 0 h 0 (s) ds = H 0 (t), either the derivatives or the integrals of the spline basis functions are usually needed to construct the likelihood function. Similar to the shape-restricted regression, constraints can be imposed to ensure that the estimated baseline cumulative hazard function is nondecreasing over time and the corresponding baseline hazard estimates are nonnegative. A simple example was given by Rosenberg (1995), where the baseline hazard function is modeled by B-splines as where B i,k (t) represents the ith B-spline basis of degree k, and p denotes the degrees of freedom of the basis functions.
Spline-based methods have been proposed for modeling recurrent event data as well. For instance, Fu et al. (2016) proposed a gamma frailty model with a piecewise constant rate function as follows: where the r i 's are i.i.d. frailty effects from a gamma distribution with unit mean, ρ 0 (·) is the baseline rate function estimated by piecewise constant basis functions, the x i 's represent covariates, and β is the covariate coefficient vector of interest. Wang et al. (2020) extended the proposed model by modeling the baseline rate function by M-splines in the R package reda.
The model estimation is based on the maximization of the marginal likelihood function, which involves integrating the rate function. For some applications such as seasonal illness or warranty claims of snowmobiles, the rate function can be periodic, in which case, the periodic splines introduced in Section 2.3 become of interest. For illustration, we generated cyclic recurrent events from 500 processes with a periodic baseline rate function as follows:

## simulate recurrent event data from a periodic rate function library(reda) set.seed(123)
# set random number seed n <-500 # number of event process beta <-0.2 # covariate coefficient tau <-36 # end of follow-up right <-12 # end of cyclic interval foo <-function(x) # the example periodic function sin(x * pi / 6 + 1) / 50 + 0.05 dat <-simEventData(n, z = matrix(rnorm(n)), zCoef = beta, endTime = tau, rho = foo, frailty = rgamma, arguments = list(frailty = list(shape = 2, scale = 0.5))) The generation utilized the function simEventData() of the reda package. For simplicity, we considered one covariate generated from the standard normal distribution and a frailty term from the gamma distribution with shape parameter 2 and scale parameter 0.5. To resemble the seasonality trend of some practical examples, we set the time unit to be one month and let the cyclic period last 12 months. We considered the cyclic function f (x) = sin(πx/6 + 1)/50 + 0.05. The end of follow-up was fixed to be 36 months to include 3 cyclic intervals. The histogram of the simulated event times is given in Figure 4, from which we may find a clear periodic pattern of the event density. To visually compare the histogram and underlying baseline rate function, we scaled the latter so that its integral attained one at the end of the follow-up. The scaled baseline rate function represented by the red dashed line looks similar to the histogram of the event times in Figure 4 regarding the seasonality trend, which suggests that the reda::simEventData() works as expected. The splines2::mSpline() returns the integrals of the periodic basis functions when both periodic and integral are specified to be TRUE, which makes it straightforward to obtain the maximum likelihood estimates as if the piecewise constant basis functions were used to model the baseline rate function. We assumed that the length of each cyclic interval was known to be 12 months. For the simulated recurrent event data, we applied the ordinary M-splines and periodic M-splines to fit the baseline rate function, respectively, and obtained the model estimation results with the help of the function rateReg() of the reda package, which builds on the periodic splines from the splines2 package. We considered quadratic basis functions. For ordinary M-splines, we placed eight internal knots at 4, 8, . . ., 32, and placed boundary knots at 0 and 36 (the end of follow-up). For periodic M-splines, the boundary knots define the cyclic interval and thus were placed at 0 and 12. Given the boundary knots, we placed three internal knots at the 25%, 50%, and 75% quantiles of the observed event times relative to the beginning of the cyclic interval that they belong to, which turned out to be approximately 2.2, 4.8, and 9.2 for the simulated dataset.
The estimated baseline rate functions and their integrals, the estimated baseline mean cumulative functions (MCF) are visualized, respectively, in the upper panel and the lower panel of Figure 5, where the point estimates are represented by the black solid lines, the pointwise 95% confidence interval estimates are indicated by the dash-dotted lines, while the underlying true baseline rate functions and mean cumulative function are indicated by the red dashed lines, respectively.
As we expected, the estimated baseline rate function (by using periodic M-splines) followed the seasonality trend of the true baseline rate function closer over time than using ordinary M-splines. Despite the bias around the nadir and the peak of the true baseline rate function, the pointwise confidence interval (CI) estimates of both approaches contained the true baseline rate function at each time point. However, the width of the CI estimates by using periodic Msplines was smaller than when using ordinary M-splines, especially around the boundary. The comparison of the estimated models was similar in terms of the baseline MCF. The estimated mean cumulative functions were both close to the true curve over time. However, using periodic M-splines resulted in smaller bias and narrower pointwise CI estimates.

Extreme-Value Copulas
Consider two random variables X and Y with continuous cumulative distribution functions F and G, respectively. Let H (x, y) = P (X x, Y y) denote their joint distribution function. By Sklar's Theorem, there exists a two-dimensional copula C such that H (x, y) = C(F (x), G(y)) and C (u, v) Pickands (1981), a copula C is a bivariate extreme-value copula if and only if there exists a convex function A : [0, 1] → [1/2, 1] such that for any u, v ∈ (0, 1), The convex function A(t) is called the Pickands dependence function associated with C and satisfies and A(0) = A(1) = 1. Each A(t) uniquely defines an extreme-value copula. Estimation of A(t) can be done by with a nonparametric regression. For a random bivariate sample {(x i , y i )} from (X, Y ), let F n and G n denote the empirical cumulative distribution functions of X and Y , respectively. Define u i = nF n (x i )/(n + 1) and v i = nG n (y i )/(n + 1), where the asymptotically negligible scaling factor n/(n + 1) was introduced to avoid problems with density evaluation at the boundaries of [0, 1] 2 . Let C n denote the empirical copula and define Then, A(t) can be estimated by nonparametrically regressing the z i 's on the t i 's. Cormier et al. (2014) fit a regression model with the quantile smoothing splines (Koenker et al., 1994;Ng and Maechler, 2007) subject to the convexity, boundary, and endpoint constraints of A(t) with the R package cobs (Ng and Maechler, 2020). Nonetheless, the boundary constraints in (8) were imposed by imposing thatÂ(j/N ) max(j/N, 1 − j/N), where j ∈ {1, . . . , N − 1} for some large N such as N = 100. In fact, given that A(t) is convex and passes through 1 when t = 0 and t = 1, an elegant and efficient way to impose the constraints is to require where A (t) is the derivative of A(t).
To demonstrate the usage of fitA() and to compare the fitted with the true Pickands dependence function A KGH θ (t), we wrote two helper functions named kc() and getZ() to set

Discussion
The shape-restricted splines are the main part of the functionalities provided by the package splines2. Additional features that are not reviewed here include generalized Bernstein polynomials, B-splines, natural cubic splines, and an Rcpp interface; see package vignettes for details. The R interface of the main functions in splines2 follows that of the function bs() in the package splines for consistency. The derivatives and integrals of the spline basis functions are handy in applications when they are needed. The Rcpp interface allows direct access to the C++ implementation for R package developers. Flexible and powerful as the splines2 package is, it is the users' responsibility in practice to be aware of the applicability and limitations of the spline basis functions. For example, for monotone regression in general, an intercept term needs to be considered in addition to the I-spline basis functions. For a general regression problem without shape constraints, C-splines should not be used without a good reason.
The efficiency of the splines2 package was compared with other packages for the same tasks. We conducted micro-benchmarks to investigate the relative performance of our implementation with the help of the package microbenchmark (Mersmann, 2019). The tasks were to evaluate the B-splines (and their derivatives and integrals), Bernstein polynomials, natural cubic splines, and periodic splines at a sequence of equally spaced points in [0, 1] with step size 0.001. The boundary knots were set to be 0 and 1, respectively, and the internal knots were set to be {0.1, . . . , 0.9} when they are applicable. The micro-benchmark results from 1000 replicates are visualized in Figure 7. While the micro-benchmarks are by no means comprehensive, they suggest the high efficiency of the implementation in splines2. For instance, the package ibs implements the integral of the B-splines by directly following the recursive formula, while the implementation in the package splines2 takes advantage of the local support property of B-splines. Consequently, the function ibs::ibs() is much slower than the function splines2::ibs(), even though both are based on C++ behind the scene.
We restricted our attention to implementations in R. There are of course similar implementations in other programming languages. For example, the modules scipy.signal and scipy.interpolate of Scipy of Python, and the Curve Fitting Toolbox of MATLAB provide B-splines func-

Supplementary Material
We review the generalized Bernstein polynomials, B-splines, and Natural cubic splines implemented in the package splines2 but not covered in the main text. We also provide the R code that produced the micro-benchmark results.