Abstract: Incomplete data are a common phenomenon in research that adopts a longitudinal design. When incomplete observations are present in a longitudinal data structure, ignoring them can bias statistical inference and interpretation. We adopt the disposition model and extend it to the analysis of longitudinal binary outcomes in the presence of monotone incomplete data. The response variable is modeled using a conditional logistic regression model. The nonresponse mechanism is assumed ignorable and is developed as a combination of a Markov transition model and a logistic regression model. Maximum likelihood is used for parameter estimation. An application of our approach to rheumatoid arthritis clinical trials is presented.
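As a rough illustration of the conditional modeling idea, the sketch below fits a first-order transition (Markov) logistic model for a longitudinal binary outcome, regressing the current response on the lagged response and a baseline covariate. The simulated data and variable names are illustrative assumptions, not the authors' disposition model or their nonresponse component.

```python
# Minimal sketch: first-order transition logistic model for a longitudinal
# binary outcome. Data are simulated; this is not the paper's disposition model.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n, T = 200, 5
x = rng.normal(size=n)                      # baseline covariate
y = np.empty((n, T))
y[:, 0] = rng.binomial(1, 0.5, size=n)
for t in range(1, T):                       # response depends on previous state
    eta = -0.5 + 1.2 * y[:, t - 1] + 0.8 * x
    y[:, t] = rng.binomial(1, 1 / (1 + np.exp(-eta)))

# Stack person-time records (t >= 1) and fit a logit of y_t on (y_{t-1}, x).
prev = y[:, :-1].ravel()
curr = y[:, 1:].ravel()
X = sm.add_constant(np.column_stack([prev, np.repeat(x, T - 1)]))
fit = sm.Logit(curr, X).fit(disp=0)
print(fit.params)                           # intercept, lag effect, covariate
```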
Abstract: For estimating the bivariate survival function under random censorship, it is commonly believed that the Dabrowska estimator is among the best, while the Volterra estimator is far from computationally efficient. As we will see, the Volterra estimator is a natural extension of the Kaplan-Meier estimator to the bivariate setting. We believe that the computational ‘inefficiency’ of the Volterra estimator is largely due to the formidable computational complexity of the traditional recursion method. In this paper, we show by numerical study as well as theoretical analysis that the Volterra estimator, once computed by the dynamic programming technique, is more computationally efficient than the Dabrowska estimator. The Volterra estimator with dynamic programming is therefore quite recommendable in applications owing to its significant computational advantages.
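The computational point can be seen with a toy two-dimensional recurrence of the same shape as the Volterra recursion, where each cell depends on all cells below and to its left. The recurrence F below is an assumed stand-in, not the actual Volterra estimator; it only contrasts naive recursion with table-filling dynamic programming.

```python
# Toy contrast of naive recursion vs. dynamic programming for a 2-D
# Volterra-type recurrence. F is a stand-in, not the Volterra estimator.
import numpy as np

def f_naive(i, j):
    # Exponential cost: every sub-cell is recomputed many times.
    if i == 0 or j == 0:
        return 1.0
    return 1.0 + 0.01 * sum(f_naive(a, b) for a in range(i) for b in range(j))

def f_dp(m, n):
    # Each cell is computed once by filling the table in order.
    F = np.ones((m + 1, n + 1))
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            F[i, j] = 1.0 + 0.01 * F[:i, :j].sum()
    return F[m, n]

print(f_naive(4, 4), f_dp(4, 4))  # identical values, very different cost
```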
Abstract: The generalized exponentiated exponential Lindley distribution is a novel three-parameter distribution due to Hussain et al. (2017). They studied its properties, including estimation issues, and illustrated applications to four data sets. Here, we show that several known distributions, including some with only two parameters, can provide better fits. We also correct errors in the derivatives of the likelihood function.
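The kind of comparison made here can be sketched as follows: fit several candidate distributions by maximum likelihood and rank them by AIC. The gamma, Weibull, and lognormal candidates are illustrative assumptions; scipy does not ship the generalized exponentiated exponential Lindley distribution itself.

```python
# Minimal sketch: fit candidate distributions by MLE and compare by AIC.
import numpy as np
from scipy import stats

data = stats.gamma.rvs(a=2.0, scale=1.5, size=200, random_state=1)

candidates = {
    "gamma": stats.gamma,
    "weibull": stats.weibull_min,
    "lognormal": stats.lognorm,
}
for name, dist in candidates.items():
    params = dist.fit(data)                   # MLE (includes a location shift)
    ll = dist.logpdf(data, *params).sum()     # maximized log-likelihood
    aic = 2 * len(params) - 2 * ll
    print(f"{name:10s} AIC = {aic:.1f}")
```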
Abstract: We develop an enhanced spike-and-slab model for variable selection in linear regression models via restricted final prediction error (FPE) criteria, classic examples of which are AIC and BIC. Based on our proposed Bayesian hierarchical model, a Gibbs sampler is developed to sample models. The special structure of the prior enforces a unique mapping between sampling a model and calculating constrained ordinary least squares estimates for that model, which helps to formulate the restricted FPE criteria. Empirical comparisons are made with the lasso, adaptive lasso, and relaxed lasso, followed by a real-life data example.
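To show the Gibbs-sampling idea in the spike-and-slab family, here is a minimal stochastic search variable selection (SSVS) sampler with a continuous spike: coefficients are drawn from their Gaussian full conditional, and inclusion indicators from spike-versus-slab Bernoulli odds. The prior settings tau0, tau1, and pi are illustrative assumptions; the paper's restricted-FPE prior is more specialized.

```python
# Minimal SSVS Gibbs sampler sketch, not the paper's restricted-FPE prior.
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 8
X = rng.normal(size=(n, p))
beta_true = np.array([2.0, 0, 0, -1.5, 0, 0, 0, 1.0])
y = X @ beta_true + rng.normal(size=n)

tau0, tau1, pi, sigma2 = 0.01, 3.0, 0.5, 1.0    # assumed tuning choices
gamma = np.ones(p, dtype=bool)
keep = np.zeros(p)
XtX, Xty = X.T @ X, X.T @ y

for it in range(2000):
    # beta | gamma: Gaussian full conditional
    D_inv = np.where(gamma, 1 / tau1**2, 1 / tau0**2)
    cov = np.linalg.inv(XtX / sigma2 + np.diag(D_inv))
    beta = rng.multivariate_normal(cov @ Xty / sigma2, cov)
    # gamma_j | beta_j: Bernoulli from spike vs. slab normal densities
    log_slab = -0.5 * beta**2 / tau1**2 - np.log(tau1)
    log_spike = -0.5 * beta**2 / tau0**2 - np.log(tau0)
    prob = 1 / (1 + (1 - pi) / pi * np.exp(log_spike - log_slab))
    gamma = rng.random(p) < prob
    if it >= 500:                               # discard burn-in draws
        keep += gamma

print("posterior inclusion probabilities:", np.round(keep / 1500, 2))
```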
Abstract: Missing data are a common problem for researchers working with surveys and other types of questionnaires. Often, respondents do not respond to one or more items, making statistical analyses, as well as the calculation of scores, difficult. A number of methods have been developed for dealing with missing data, though most of these have focused on continuous variables. It is not clear that such imputation techniques are appropriate for the categorical items that make up surveys. However, methods of imputation designed specifically for categorical data are either limited in the number of variables they can accommodate, or have not been fully compared with the continuous-data approaches used with categorical variables. The goal of the current study was to compare the performance of these explicitly categorical imputation approaches with the better-established continuous method applied to categorical item responses. Results of a simulation study based on real data demonstrate that the continuous imputation approach and a categorical method based on stochastic regression both perform well, producing imputed data whose logistic regression results match those of the complete data sets.
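A minimal sketch of stochastic regression imputation for a binary item follows: fit a logistic regression on the observed cases, then impute each missing response by drawing from the predicted probability rather than rounding it, which preserves item variability. The simulated data and missingness rate are illustrative assumptions.

```python
# Minimal sketch: stochastic regression imputation for a binary survey item.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 2))                     # other items / covariates
p = 1 / (1 + np.exp(-(0.8 * X[:, 0] - 1.1 * X[:, 1])))
y = rng.binomial(1, p).astype(float)
y[rng.random(n) < 0.2] = np.nan                 # 20% missing at random

obs = ~np.isnan(y)
model = LogisticRegression().fit(X[obs], y[obs].astype(int))
p_miss = model.predict_proba(X[~obs])[:, 1]
y[~obs] = rng.binomial(1, p_miss)               # stochastic draw, not a mode
```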
Abstract: We propose two classes of nonparametric point estimators of θ = P(X < Y) in the case where (X, Y) are paired, possibly dependent, absolutely continuous random variables. The proposed estimators are based on nonparametric estimators of the joint density of (X, Y) and the distribution function of Z = Y − X. We explore the use of several density and distribution function estimators and characterise the convergence of the resulting estimators of θ. We consider the use of bootstrap methods to obtain confidence intervals. The performance of these estimators is illustrated using simulated and real data. These examples show that not accounting for pairing and dependence may lead to erroneous conclusions about the relationship between X and Y.
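One of the two estimator classes can be sketched directly: estimate θ = P(X < Y) = P(Z > 0) from a kernel density estimate of Z = Y − X, which respects the pairing, and bootstrap the pairs jointly for a percentile interval. The data-generating model below is an illustrative assumption.

```python
# Minimal sketch: estimate theta = P(X < Y) via a KDE of Z = Y - X.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
n = 200
x = rng.normal(0.0, 1.0, n)
y = 0.7 * x + rng.normal(0.3, 1.0, n)           # paired, dependent
z = y - x

theta_hat = gaussian_kde(z).integrate_box_1d(0.0, np.inf)   # P(Z > 0)

boots = []
for _ in range(500):
    idx = rng.integers(0, n, n)                 # resample pairs jointly
    boots.append(gaussian_kde(z[idx]).integrate_box_1d(0.0, np.inf))
lo, hi = np.percentile(boots, [2.5, 97.5])
print(f"theta_hat = {theta_hat:.3f}, 95% CI = ({lo:.3f}, {hi:.3f})")
```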
Abstract: In this paper, we use the generalized influence function and the generalized Cook distance to measure the local influence of minor perturbations on the modified ridge regression estimator in the ridge-type linear regression model. Diagnostics under perturbation of the constant variance and of individual explanatory variables are obtained when multicollinearity is present among the regressors. We also propose a statistic that reveals influential cases for Mallows's method, which is used to choose the biasing parameter of the modified ridge regression estimator. Two real data sets are used to illustrate our methodologies.
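The flavor of a Cook-type diagnostic for ridge estimation can be sketched by brute force: compare the full-data ridge fit with each leave-one-out fit and scale the difference. This is an assumed case-deletion analogue, not the paper's generalized influence function, which derives such measures analytically.

```python
# Minimal sketch: brute-force case-deletion (Cook-type) diagnostic for ridge.
import numpy as np

rng = np.random.default_rng(0)
n, p, k = 50, 4, 1.0                            # k: ridge biasing parameter
X = rng.normal(size=(n, p))
X[:, 1] = X[:, 0] + 0.05 * rng.normal(size=n)   # induce multicollinearity
y = X @ np.array([1.0, 1.0, 0.5, 0.0]) + rng.normal(size=n)

def ridge(Xm, ym):
    return np.linalg.solve(Xm.T @ Xm + k * np.eye(p), Xm.T @ ym)

beta_full = ridge(X, y)
s2 = np.sum((y - X @ beta_full) ** 2) / (n - p)
M = X.T @ X + k * np.eye(p)                     # scaling matrix for the distance

cook = np.empty(n)
for i in range(n):
    mask = np.arange(n) != i
    d = beta_full - ridge(X[mask], y[mask])     # shift from deleting case i
    cook[i] = d @ M @ d / (p * s2)
print("most influential cases:", np.argsort(cook)[-3:])
```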
Abstract: In this paper, the geometric process model is used for analyzing constant-stress accelerated life testing. The generalized half-logistic lifetime distribution is considered under progressive type-II censoring. Statistical inference is developed on the basis of the maximum likelihood approach for estimating the unknown parameters and for obtaining both asymptotic and bootstrap confidence intervals. In addition, predictive values of the reliability function under usual operating conditions are found. Moreover, a method for finding the optimal value of the ratio of the geometric process is presented. Finally, a simulation study is presented to illustrate the proposed procedures and to evaluate the performance of the geometric process model.
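The likelihood structure under progressive type-II censoring can be sketched concisely: each observed failure x_i with R_i units withdrawn contributes log f(x_i) + R_i log(1 − F(x_i)). As assumptions, scipy's half-logistic distribution stands in for the generalized half-logistic model, and the data below are a complete sample used only to exercise the likelihood, not a true progressively censored sample.

```python
# Minimal sketch: MLE under a progressive type-II censoring likelihood,
# with scipy's half-logistic as a stand-in lifetime distribution.
import numpy as np
from scipy import stats, optimize

x = np.sort(stats.halflogistic.rvs(scale=2.0, size=15, random_state=1))
R = np.array([2, 0, 1, 0, 0, 2, 0, 0, 1, 0, 0, 0, 1, 0, 2])  # withdrawals

def neg_loglik(log_scale):
    scale = np.exp(log_scale)                   # keep the scale positive
    logf = stats.halflogistic.logpdf(x, scale=scale)
    logS = stats.halflogistic.logsf(x, scale=scale)
    return -(logf + R * logS).sum()             # failures + censored survivors

res = optimize.minimize_scalar(neg_loglik)
print("MLE of scale:", np.exp(res.x))
```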
Abstract: The creation of data sets using observational methods for the lag-sequential study of behavior requires selection of a recording time unit. This is an important issue, because standard methods such as momentary sampling and partial-interval sampling consistently underestimate the frequency of some behaviors. This leads to inaccurate estimation of both the unconditional and conditional probabilities of the different behaviors, the basic descriptive and analytic tools of sequential analysis methodology. The purpose of this paper is to investigate the creation of data sets usable for sequential analysis. We show that such data vary with the time resolution and that inaccurate choices lead to biased estimates of transition probabilities.
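The resolution effect is easy to demonstrate: code a two-state behavior stream at a fine time unit, re-sample it at a coarser unit, and compare the estimated transition matrices. The two-state chain and the sampling rates below are illustrative assumptions.

```python
# Minimal sketch: transition probabilities estimated at two recording units.
import numpy as np

rng = np.random.default_rng(0)
T = 10_000
P_true = np.array([[0.95, 0.05],                # fine-grained transition matrix
                   [0.10, 0.90]])
s = np.empty(T, dtype=int)
s[0] = 0
for t in range(1, T):
    s[t] = rng.random() >= P_true[s[t - 1], 0]  # draw the next state

def transition_matrix(seq):
    m = np.zeros((2, 2))
    for a, b in zip(seq[:-1], seq[1:]):
        m[a, b] += 1
    return m / m.sum(axis=1, keepdims=True)     # row-normalized counts

print("unit = 1:\n", transition_matrix(s).round(3))
print("unit = 5:\n", transition_matrix(s[::5]).round(3))  # coarser unit
```

Coarsening the recording unit inflates the apparent off-diagonal (switching) probabilities, which is the bias the paper warns about.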