An S-Curve Method for Abrupt and Gradual Changepoint Analysis

Jiang, Lan; Kennedy, Collin; Matloff, Norman

doi:10.6339/24-JDS1137

Journal of Data Science

Volume 23, Issue 1 (2025), pp. 225–242

Pub. online: 9 July 2024

Type: Statistical Data Science

Open Access

Received: 18 April 2024

Accepted: 19 April 2024

Published: 9 July 2024

Abstract

Changepoint analysis has had a striking variety of applications, and a rich methodology has been developed. Our contribution here is a new approach that uses nonlinear regression analysis as an intermediate computational device. The tool is quite versatile, covering a number of different changepoint scenarios. It is largely free of parametric model assumptions, and has the major advantage of providing standard errors for formal statistical inference. Both abrupt and gradual changes are covered.
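
The idea of using nonlinear regression as a changepoint device can be illustrated with a minimal sketch. It is not the authors' implementation: the logistic parametrization, the function names `s_curve` and `fit_changepoint`, the starting values, and the simulated data are all assumptions made for illustration. The sketch fits an S-curve to a series by nonlinear least squares with SciPy's `curve_fit`, reads the changepoint location from the curve's center, and takes standard errors from the estimated covariance matrix. A large fitted slope corresponds to an abrupt change, a small one to a gradual change.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.special import expit  # numerically stable logistic function


def s_curve(t, pre, post, center, slope):
    """Assumed logistic S-curve: the mean moves from `pre` to `post` around `center`."""
    return pre + (post - pre) * expit(slope * (t - center))


def fit_changepoint(t, y):
    """Fit the S-curve by nonlinear least squares.

    Returns two dicts: parameter estimates and their standard errors.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    half = len(y) // 2
    # Crude starting values (an assumption): means of the two halves,
    # the midpoint of the time axis, and a moderate slope.
    p0 = [y[:half].mean(), y[half:].mean(), t.mean(), 1.0]
    params, cov = curve_fit(s_curve, t, y, p0=p0, maxfev=10000)
    # Standard errors are the square roots of the covariance diagonal.
    se = np.sqrt(np.diag(cov))
    names = ("pre", "post", "center", "slope")
    return dict(zip(names, params)), dict(zip(names, se))


# Simulated series (hypothetical data): a gradual shift from 1 to 3 near t = 120.
rng = np.random.default_rng(0)
t = np.arange(200.0)
y = s_curve(t, 1.0, 3.0, 120.0, 0.3) + rng.normal(scale=0.5, size=t.size)

est, se = fit_changepoint(t, y)
print("changepoint estimate:", est["center"], "standard error:", se["center"])
```

The covariance returned by `curve_fit` is the usual nonlinear least squares one, which presumes independent errors of constant variance; a robust alternative would be needed if that assumption is doubtful.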

Supplementary material

The ZIP file contains all code needed to reproduce the figures and results of the experiments.

© 2025 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.

Open access article under the CC BY license.

Keywords

difference in difference; identifiability; nonlinear regression models; standard errors
