Journal of Data Science

Search results 892

Discussion of “Power Priors for Leveraging Historical Data: Looking Back and Looking Forward”

Minge Xie

https://doi.org/10.6339/25-JDS1161E

Pub. online: 4 Feb 2025 Type: Discussion

Open Access

Journal: Journal of Data Science Volume 23, Issue 1 (2025), pp. 59–61

Discussion of “Power Priors for Leveraging Historical Data: Looking Back and Looking Forward”

Guohui Wu

https://doi.org/10.6339/25-JDS1161F

Pub. online: 4 Feb 2025 Type: Discussion

Open Access

Journal: Journal of Data Science Volume 23, Issue 1 (2025), pp. 56–58

Discussion of “Power Priors for Leveraging Historical Data: Looking Back and Looking Forward”

Chenguang Wang

https://doi.org/10.6339/25-JDS1161G

Pub. online: 4 Feb 2025 Type: Discussion

Open Access

Journal: Journal of Data Science Volume 23, Issue 1 (2025), pp. 52–55

Discussion of “Power Priors for Leveraging Historical Data: Looking Back and Looking Forward”

Lei Nie

https://doi.org/10.6339/25-JDS1161A

Pub. online: 4 Feb 2025 Type: Discussion

Open Access

Journal: Journal of Data Science Volume 23, Issue 1 (2025), pp. 48–51

Discussion of “Power Priors for Leveraging Historical Data: Looking Back and Looking Forward”

Margaret Gamalo Heliang Shi Yuxi Zhao All authors (4)

https://doi.org/10.6339/25-JDS1161C

Pub. online: 4 Feb 2025 Type: Discussion

Open Access

Journal: Journal of Data Science Volume 23, Issue 1 (2025), pp. 38–47

Discussion of “Power Priors for Leveraging Historical Data: Looking Back and Looking Forward”

Fang Chen

https://doi.org/10.6339/25-JDS1161D

Pub. online: 4 Feb 2025 Type: Discussion

Open Access

Journal: Journal of Data Science Volume 23, Issue 1 (2025), pp. 31–37

Mortgage Prepayment Modeling via a Smoothing Spline State Space Model

Haoran Lu Huimin Cheng Ye Wang All authors (8)

https://doi.org/10.6339/25-JDS1165

Pub. online: 30 Jan 2025 Type: Statistical Data Science

Open Access

Journal: Journal of Data Science

Abstract

Loan behavior modeling is crucial in financial engineering. In particular, predicting loan prepayment from large-scale historical time series data for a massive number of customers is challenging. Existing approaches, such as logistic regression or nonparametric regression, can only model the direct relationship between the features and the prepayments. Motivated by extracting the hidden states of loan behavior, we propose the smoothing spline state space (QuadS) model based on a hidden Markov model with varying transition and emission matrices modeled by smoothing splines. In contrast to existing methods, our method benefits from capturing the loans’ unobserved state transitions, which not only improves prediction performance but also provides more interpretability. The overall model is learned by EM algorithm iterations, and within each iteration, smoothing splines are fitted with penalized least squares. Simulation studies demonstrate the effectiveness of the proposed method. Furthermore, a real-world case study using loan data from the Federal National Mortgage Association illustrates the practical applicability of our model. The QuadS model not only provides reliable predictions but also uncovers meaningful, hidden behavior patterns that can offer valuable insights for the financial industry.

A Conversation with Dr. David S. Salsburg

Haim Bar Naitee Ting

https://doi.org/10.6339/25-JDS1171

Pub. online: 29 Jan 2025 Type: Data Science Conversation

Open Access

Journal: Journal of Data Science Volume 23, Issue 1 (2025), pp. 70–89

Abstract

Dr. David S. Salsburg’s career has been an exceptional one. He was the first statistician to work in Pfizer, Inc., and later became the first statistician from the pharmaceutical industry to be elected as an ASA fellow. He played a vital role as a statistician in Pfizer, Inc. at a time when the drug approval process was developed. For his contributions, Dr. Salsburg was awarded the Career Achievement Award of the Biostatistics Section of the Pharmaceutical Research and Manufacturers of America in 1994, for “significant contributions to the advancement of biostatistics in the pharmaceutical industry”. Dr. Salsburg also managed to achieve something rare among scientists, which is to popularize his field of research and make it accessible and enjoyable to laypeople. Dr. Salsburg is possibly best known for his book “The Lady Tasting Tea – How Statistics Revolutionized the 20th Century Science”, in which he combines simple and engaging explanations of statistical methods, and why they are needed, along with personal stories told with a great deal of generosity, fondness, and humor about the people who developed them. Dr. Salsburg’s admiration for those statisticians shines through. In this interview, Dr. Salsburg shares his own stories and perspectives, from his childhood, through his service in the Navy and his long and productive career in Pfizer, Inc. to his equally productive retirement, in which he authored “The Lady Tasting Tea” and other books.

An Innovative Method of Singular Spectrum Analysis to Conduct Gap-filling and Denoising on Time Series Data

James J. Yang Anne Buu

https://doi.org/10.6339/25-JDS1164

Pub. online: 28 Jan 2025 Type: Statistical Data Science

Open Access

Journal: Journal of Data Science

Abstract

Heart rate data collected from wearable devices – one type of time series data – could provide insights into activities, stress levels, and health. Yet, consecutive missing segments (i.e., gaps) that commonly occur due to improper device placement or device malfunction could distort the temporal patterns inherent in the data and undermine the validity of downstream analyses. This study proposes an innovative iterative procedure to fill gaps in time series data that capitalizes on the denoising capability of Singular Spectrum Analysis (SSA) and eliminates SSA’s requirement of pre-specifying the window length and number of groups. The results of simulations demonstrate that the performance of SSA-based gap-filling methods depends on the choice of window length, number of groups, and the percentage of missing values. In contrast, the proposed method consistently achieves the lowest rates of reconstruction error and gap-filling error across a variety of combinations of the factors manipulated in the simulations. The simulation findings also highlight that the commonly recommended long window length – half of the time series length – may not apply to time series with varying frequencies such as heart rate data. The initialization step of the proposed method that involves a large window length and the first four singular values in the iterative singular value decomposition process not only avoids convergence issues but also facilitates imputation accuracy in subsequent iterations. The proposed method provides the flexibility for researchers to conduct gap-filling solely or in combination with denoising on time series data and thus widens the applications.

Analysis of Bilateral and Unilateral Data: A Comparative Review of Model-Based and MLE-Based Methods for the Homogeneity Test of Proportions

Xueqing Zhang Chang-Xing Ma

https://doi.org/10.6339/25-JDS1168

Pub. online: 28 Jan 2025 Type: Statistical Data Science

Open Access

Journal: Journal of Data Science

Abstract

In many medical comparative studies, subjects may provide either bilateral or unilateral data. While numerous testing procedures have been proposed for bilateral data that account for the intra-class correlation between paired organs of the same individual, few studies have thoroughly explored combined correlated bilateral and unilateral data. Ma and Wang (2021) introduced three test procedures based on the maximum likelihood estimation (MLE) algorithm for general g groups. In this article, we employ a model-based approach that treats the measurements from both eyes of each subject as repeated observations. We then compare this approach with Ma and Wang’s Score test procedure. Monte Carlo simulations demonstrate that the MLE-based Score test offers certain advantages under specific conditions. However, this model-based method lacks an explicit form for the test statistic, limiting its potential for further development of an exact test.

Journal of data science

Online ISSN: 1683-8602
Print ISSN: 1680-743X

Contact us

JDS@ruc.edu.cn
No. 59 Zhongguancun Street, Haidian District Beijing, 100872, P.R. China