Subsampling is an effective way to handle big data problems, and many subsampling approaches have been proposed for different models, such as leverage sampling for linear regression models and local case-control sampling for logistic regression models. In this article, we focus on optimal subsampling methods, which draw samples according to optimal subsampling probabilities obtained by minimizing some function of the asymptotic distribution of the resulting estimator. Optimal subsampling methods have been developed for logistic regression models, softmax regression models, generalized linear models, quantile regression models, and quasi-likelihood estimation. Real data examples are provided to show how optimal subsampling methods are applied.
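To make the idea of drawing samples according to data-dependent probabilities concrete, the following is a minimal sketch of a two-step subsampling scheme for logistic regression. It is an illustration under stated assumptions, not the procedure of any specific method named above: the probability formula (proportional to |y_i - p_i| times the norm of x_i) is one commonly cited form from the optimal subsampling literature and is assumed here, and the function names `fit_weighted_logistic` and `two_step_subsample` are hypothetical.

```python
import numpy as np


def fit_weighted_logistic(X, y, w, iters=50, tol=1e-8):
    """Weighted logistic regression fitted by Newton's method.

    X is used as given, so include a column of ones if an intercept is wanted.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (w * (y - p))
        hess = (X * (w * p * (1.0 - p))[:, None]).T @ X
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def two_step_subsample(X, y, n_pilot, n_sub, rng):
    """Pilot fit on a uniform subsample, then a weighted fit on a non-uniform one."""
    n = X.shape[0]

    # Step 1: uniform pilot subsample gives a rough estimate of the coefficients.
    pilot_idx = rng.choice(n, size=n_pilot, replace=True)
    beta_pilot = fit_weighted_logistic(X[pilot_idx], y[pilot_idx], np.ones(n_pilot))

    # Step 2: subsampling probabilities. ASSUMPTION: proportional to
    # |y_i - p_i| * ||x_i||, evaluated at the pilot estimate.
    p = 1.0 / (1.0 + np.exp(-X @ beta_pilot))
    score = np.abs(y - p) * np.linalg.norm(X, axis=1)
    probs = score / score.sum()

    # Draw the second-stage subsample with replacement using these probabilities.
    sub_idx = rng.choice(n, size=n_sub, replace=True, p=probs)

    # Inverse-probability weights correct for the non-uniform sampling.
    # Rescaling by the mean only improves conditioning; it does not move the maximizer.
    w = 1.0 / probs[sub_idx]
    w = w / w.mean()

    beta_hat = fit_weighted_logistic(X[sub_idx], y[sub_idx], w)
    return beta_hat, probs
```

The pilot step is needed because the probabilities depend on the unknown coefficients. The inverse-probability weights in the second fit are what keep the subsample estimator targeting the same quantity as the full-data estimator.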
This paper proposes a procedure to execute external source code from a LaTeX document and automatically include the computed output in the resulting Portable Document Format (PDF) file. It integrates programming tools into the LaTeX writing workflow to facilitate reproducible research. In the proposed approach to a LaTeX-based scientific notebook, the user can invoke any programming language or command-line program when compiling the LaTeX document, while still using their favorite LaTeX editor for writing. The required LaTeX setup, a new Python package, and the defined preamble are discussed in detail, and working examples using R, Julia, and MATLAB that reproduce existing research illustrate the proposed procedure. We also demonstrate how to include system setting information in a paper by invoking shell scripts when compiling the document.
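The general mechanism described above, running an external program and placing its output where LaTeX can include it, can be sketched in a few lines of Python. This is only an illustration of the idea and not the package the paper introduces; the function name `run_and_write` and the output file name are hypothetical.

```python
import subprocess
from pathlib import Path


def run_and_write(cmd, out_path, timeout=60):
    """Run an external command and save its stdout as a LaTeX verbatim block.

    cmd is an argument list such as ["Rscript", "analysis.R"]; any
    command-line program available on the system could be used.
    """
    # check=True raises CalledProcessError if the command exits with an error,
    # so a failed computation does not silently end up in the document.
    result = subprocess.run(
        cmd, capture_output=True, text=True, check=True, timeout=timeout
    )

    # Wrap the raw output so LaTeX typesets it literally. This simple wrapper
    # would break if the output itself contained "\end{verbatim}".
    body = (
        "\\begin{verbatim}\n"
        + result.stdout.rstrip("\n")
        + "\n\\end{verbatim}\n"
    )
    Path(out_path).write_text(body, encoding="utf-8")
    return result.stdout
```

A document could then pull in the generated file with `\input{result.tex}`. Calling such a script during compilation, rather than beforehand, generally requires shell escape to be enabled in the TeX engine (for example `pdflatex -shell-escape`), which has security implications for untrusted documents.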