Abstract: The motivation behind this paper is to investigate the use of the Softmax model for classification. We show that the Softmax model is a nonlinear generalization of logistic discrimination that can approximate the posterior probabilities of the classes, an ability that other artificial neural network (ANN) models lack. We also show that the Softmax model is more flexible than logistic discrimination in terms of correct classification. To demonstrate the performance of the Softmax model, a medical data set on thyroid gland state is used. The results indicate that the Softmax model may suffer from overfitting.
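For reference, a minimal sketch (hypothetical names, not the authors' code) of the softmax computation that yields the class posterior probabilities; with two classes it reduces to ordinary logistic discrimination.

```python
import numpy as np

# Minimal sketch: the softmax output layer maps linear class scores to
# estimates of the posterior probabilities P(class k | x).
def softmax_posteriors(X, W, b):
    """X: (n, d) inputs, W: (d, K) weights, b: (K,) biases -> (n, K) posteriors."""
    scores = X @ W + b
    scores -= scores.max(axis=1, keepdims=True)     # numerical stability
    exps = np.exp(scores)
    return exps / exps.sum(axis=1, keepdims=True)

# Toy check: each row of posteriors sums to one.
X = np.array([[1.2, 0.4], [0.3, 2.1]])
W = np.array([[0.5, -0.2, 1.0], [0.1, 0.8, -0.5]])
posteriors = softmax_posteriors(X, W, np.zeros(3))
print(posteriors.sum(axis=1))    # -> [1. 1.]
```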
Abstract: Background: Brass developed a procedure for converting the proportions dead among children ever born, as reported by women of childbearing age, into estimates of the probability of dying before attaining certain exact childhood ages. The method has become very popular in less developed countries where direct mortality estimation is not possible owing to incomplete death registration. However, the estimates of q(x), the probability of dying before age x, obtained by Trussell’s variant of the Brass method are sometimes unrealistic, with q(x) not monotonically increasing in x. Method: State-level child mortality estimates obtained by Trussell’s variant of the Brass method from the 1991 and 2001 Indian census data were made monotonically increasing by logit smoothing. Using two of the smoothed child mortality estimates, an infant mortality estimate is obtained by fitting a two-parameter Weibull survival function. Results: It has been found that in many states and union territories infant mortality rates increased between 1991 and 2001. Cross-checking with the 1991 and 2001 census data on the increase/decrease of the percentage of children who died establishes the reliability of the estimates. Conclusion: We have reason to suspect the trend of declining infant mortality reported by various agencies and researchers.
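The Weibull step admits a short worked illustration (hypothetical q(x) values, not the census estimates): with survival function S(x) = exp(-(x/b)^c), two smoothed estimates q(x1) and q(x2) determine b and c in closed form, and the infant mortality rate is then read off as q(1).

```python
import numpy as np

# Illustrative sketch (not the authors' code): fit a two-parameter Weibull
# survival function S(x) = exp(-(x/b)**c) through two smoothed child
# mortality estimates q(x1), q(x2) and read off the infant mortality rate q(1).
def infant_mortality_from_two_qx(x1, q1, x2, q2):
    # q(x) = 1 - exp(-(x/b)**c)  =>  log(-log(1 - q)) = c*log(x) - c*log(b)
    y1, y2 = np.log(-np.log(1 - q1)), np.log(-np.log(1 - q2))
    c = (y2 - y1) / (np.log(x2) - np.log(x1))        # shape parameter
    b = x1 / np.exp(y1 / c)                          # scale parameter
    return 1 - np.exp(-(1 / b) ** c)                 # q(1) = infant mortality rate

# Example with hypothetical smoothed estimates q(2) and q(5):
print(infant_mortality_from_two_qx(2, 0.080, 5, 0.105))
```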
Abstract: Modeling the Internet has been an active research area over the past ten years. From the “rich get richer” behavior to the “winners don’t take all” property, existing models depend on explicit attributes described in the network. This paper discusses the modeling of non-scale-free network subsets such as bulletin forums. A new evolution mechanism, driven by implicit attributes “hidden” in the network, leads to a slight increase in the page sizes of the front-ranked forums. Because quantifying these implicit attributes is difficult, two potential models are suggested. The first model introduces a content ratio and attaches it to the lognormal model, while the second model partitions the data into groups according to their regional specialties and fits the data within each group by a power-law model. A Taiwan-based bulletin forum is used for illustration, and the data are fitted via four models. Statistical diagnostics show that the two suggested models perform better than the traditional models in data fitting and prediction. In particular, the second model generally performs better than the first.
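As a minimal sketch of the second model's group-wise power-law fitting (hypothetical board sizes, not the paper's data), the code below fits size ~ C * rank^(-alpha) within each regional group by least squares on the log-log scale.

```python
import numpy as np

# Hedged sketch: fit a rank-size power law to each group of forum boards
# separately, via linear regression on the log-log scale.
def fit_power_law(sizes):
    sizes = np.sort(np.asarray(sizes, dtype=float))[::-1]    # descending order
    ranks = np.arange(1, len(sizes) + 1)
    slope, intercept = np.polyfit(np.log(ranks), np.log(sizes), 1)
    return -slope, np.exp(intercept)                          # (alpha, C)

# Hypothetical page counts for two regional groups of boards:
groups = {"north": [5200, 1800, 950, 610, 400], "south": [3100, 1200, 700, 350]}
for name, sizes in groups.items():
    alpha, C = fit_power_law(sizes)
    print(f"{name}: alpha = {alpha:.2f}, C = {C:.0f}")
```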
Abstract: Registrations in epidemiological studies suffer from incompleteness, so the general consensus is to use capture-recapture models. Inclusion of covariates that relate to the capture probabilities has been shown to improve the estimate of population size. The covariates used have to be measured by all the registrations. In this article, we show how multiple imputation can be used in the capture-recapture problem when some lists do not measure some of the covariates, or alternatively when some covariates are unobserved for some individuals. The approach is then applied to data on neural tube defects from the Netherlands.
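A minimal sketch of the idea, on hypothetical two-list data with one partially missing binary covariate (not the article's data or its full imputation model): impute the covariate several times, estimate population size within each covariate stratum with the bias-corrected two-list (Chapman) estimator, and combine the results across imputations.

```python
import numpy as np

# Each row: (on_list_A, on_list_B, covariate), covariate possibly missing (None).
records = [(1, 1, 0), (1, 0, None), (0, 1, 1), (1, 1, 1), (1, 0, 0),
           (0, 1, None), (1, 1, 0), (1, 0, 1), (0, 1, 0), (1, 1, 1)]

def chapman(nA, nB, nAB):
    # Bias-corrected two-list estimator of total population size.
    return (nA + 1) * (nB + 1) / (nAB + 1) - 1

rng = np.random.default_rng(0)
M = 20                                                        # number of imputations
p_cov = np.mean([c for _, _, c in records if c is not None])  # crude imputation model
estimates = []
for _ in range(M):
    imputed = [(a, b, c if c is not None else rng.binomial(1, p_cov))
               for a, b, c in records]
    total = 0.0
    for stratum in (0, 1):   # estimate within covariate strata, then sum
        rows = [(a, b) for a, b, c in imputed if c == stratum]
        nA = sum(a for a, _ in rows); nB = sum(b for _, b in rows)
        nAB = sum(a * b for a, b in rows)
        total += chapman(nA, nB, nAB)
    estimates.append(total)

print("MI estimate of population size:", np.mean(estimates))
```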
Abstract: This paper considers the statistical problems of editing and imputing data of multiple time series generated by repetitive surveys. The case under study is that of the Survey of Cattle Slaughter in Mexico’s Municipal Abattoirs. The proposed procedure consists of two phases: first, the data of each abattoir are edited to correct gross inconsistencies; second, the missing data are imputed by means of restricted forecasting. This method uses all the historical and current information available for the abattoir, together with multiple time series models from which efficient estimates of the missing data are obtained. Some empirical examples are shown to illustrate the usefulness of the method in practice.
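The restricted-forecasting step can be sketched generically (hypothetical numbers, not the survey data): given an unrestricted forecast of the missing cells, its error covariance, and linear restrictions built from the information that is actually available, the restricted forecast is the standard best linear unbiased adjustment of the unrestricted one.

```python
import numpy as np

# Minimal sketch of the restricted-forecast adjustment (generic formula, not
# the paper's full procedure): y_hat is the unrestricted forecast of the
# missing cells, Sigma its error covariance, and C y = r the restrictions
# (e.g. a known municipal total).
def restricted_forecast(y_hat, Sigma, C, r):
    gain = Sigma @ C.T @ np.linalg.inv(C @ Sigma @ C.T)
    return y_hat + gain @ (r - C @ y_hat)

# Hypothetical example: three missing monthly figures whose sum is known to be 100.
y_hat = np.array([30.0, 40.0, 20.0])          # unrestricted model forecasts
Sigma = np.diag([4.0, 9.0, 1.0])              # forecast error covariance
C = np.array([[1.0, 1.0, 1.0]]); r = np.array([100.0])
print(restricted_forecast(y_hat, Sigma, C, r))   # adjusted so the values sum to 100
```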
This paper presents an empirical study of a recently compiled workforce analytics data set modeling employment outcomes of Engineering students. The contributions reported in this paper won the data challenge of the ACM IKDD 2016 Conference on Data Science. Two problems are addressed - regression using heterogeneous information types, and the extraction of insights/trends from the data to make recommendations; both goals are supported by a range of visualizations. Whereas the data set is specific to one nation, the underlying techniques and visualization methods are generally applicable. Gaussian processes are proposed to model and predict salary as a function of heterogeneous independent attributes. The key novelties the GP approach brings to the domain of workforce analytics are (a) a statistically sound, data-dependent notion of prediction uncertainty; (b) automatic relevance determination of the various independent attributes to the dependent variable (salary); (c) seamless incorporation of both numeric and string attributes within the same regression framework without dichotomization - specifically, string attributes include single-word categorical attributes (e.g., gender), nominal attributes (e.g., college tier), and multi-word attributes (e.g., specialization); and (d) treatment of all data as correlated when making predictions. Insights from both the predictive modeling and the data analysis were used to suggest factors that, if improved, might lead to better starting salaries for Engineering students. A range of visualization techniques was used to extract key employment patterns from the data.
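A hedged sketch of the modeling idea (toy features and values, not the authors' data or code): a Gaussian process with an ARD squared-exponential kernel over numeric attributes plus a simple match/no-match kernel over a string attribute, so the categorical information enters without one-hot dichotomization and each prediction carries its own variance.

```python
import numpy as np

# ARD kernel over numeric features (one lengthscale each) plus a categorical
# match kernel over a string feature; the sum is again a valid kernel.
def kernel(Xnum1, Xnum2, cat1, cat2, lengthscales, sig_num, sig_cat):
    d = (Xnum1[:, None, :] - Xnum2[None, :, :]) / lengthscales
    k_num = sig_num**2 * np.exp(-0.5 * np.sum(d**2, axis=-1))
    k_cat = sig_cat**2 * (cat1[:, None] == cat2[None, :])
    return k_num + k_cat

# Toy training data: (GPA, test score), college tier string, salary (hypothetical units).
Xnum = np.array([[8.1, 620.], [6.5, 480.], [9.0, 700.], [7.2, 550.]])
tier = np.array(["tier1", "tier2", "tier1", "tier2"])
y = np.array([650., 320., 810., 400.])

ls = np.array([1.0, 100.0])        # one lengthscale per numeric feature (ARD)
K = kernel(Xnum, Xnum, tier, tier, ls, sig_num=200., sig_cat=100.)
K += 25.0**2 * np.eye(len(y))      # observation noise
alpha = np.linalg.solve(K, y - y.mean())

# Predictive mean and standard deviation for a new candidate.
Xs, ts = np.array([[7.8, 600.]]), np.array(["tier1"])
Ks = kernel(Xs, Xnum, ts, tier, ls, 200., 100.)
Kss = kernel(Xs, Xs, ts, ts, ls, 200., 100.) + 25.0**2
mean = y.mean() + Ks @ alpha
var = Kss - Ks @ np.linalg.solve(K, Ks.T)
print(mean.item(), np.sqrt(var).item())
```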
Objective: Financial fraud has been a major concern for organizations across industries; billions of dollars are lost yearly to such fraud, so businesses employ data mining techniques to address this continuing and growing problem. This paper reviews research studies conducted over the past decade to detect financial fraud using data mining tools and communicates the current trends to academic scholars and industry practitioners.
Method: Various combinations of keywords were used to identify the pertinent articles. The majority of the articles were retrieved from ScienceDirect, but the search spanned other online databases (e.g., Emerald, Elsevier, World Scientific, IEEE, and Routledge - Taylor and Francis Group). Our search yielded a sample of 65 relevant articles (58 peer-reviewed journal articles and 7 conference papers). One-fifth of the articles were found in Expert Systems with Applications (ESA), while about one-tenth were found in Decision Support Systems (DSS).
Results: Forty-one data mining techniques were used to detect fraud across different financial applications such as health insurance and credit cards. Logistic regression appeared to be the leading data mining tool for detecting financial fraud, with 13% usage. In general, supervised learning tools have been used more frequently than unsupervised ones. Financial statement fraud and bank fraud are the two largest financial applications investigated in this area, accounting for about 63%, or 41 of the 65 reviewed articles. The two primary journal outlets for this topic are ESA and DSS.
Conclusion: This review provides a fast and easy-to-use source for both researchers and professionals, classifies financial fraud applications into a high-level and a detailed-level framework, identifies the most significant data mining techniques in this domain, and reveals the countries most exposed to financial fraud.
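As a concrete point of reference, the sketch below (entirely synthetic data, not drawn from any reviewed study) shows how logistic regression, the most frequently used technique among the reviewed articles, scores transactions by their estimated probability of being fraudulent.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic transactions with hypothetical features: amount, hour of day,
# and a foreign-merchant flag; fraud is rare and more likely for large or
# foreign transactions.
rng = np.random.default_rng(1)
n = 1000
X = np.column_stack([
    rng.exponential(100, n),          # transaction amount
    rng.integers(0, 24, n),           # hour of day
    rng.binomial(1, 0.1, n),          # foreign merchant flag
])
logit = -4 + 0.01 * X[:, 0] + 1.5 * X[:, 2]
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

model = LogisticRegression(class_weight="balanced").fit(X, y)
scores = model.predict_proba(X)[:, 1]            # fraud probability per transaction
print("flagged:", int((scores > 0.5).sum()), "of", n)
```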
In this paper, a new five-parameter extended Burr XII model called the new modified Singh-Maddala (NMSM) distribution is developed from the cumulative hazard function of the modified log extended integrated beta hazard (MLEIBH) model. The NMSM density function can be left-skewed, right-skewed, or symmetrical. The Lambert W function is used to study descriptive measures based on quantiles, moments, moments of order statistics, incomplete moments, inequality measures, and the residual life function. Different reliability and uncertainty measures are also established theoretically. The NMSM distribution is characterized via different techniques, and its parameters are estimated using the maximum likelihood method. Simulation studies are performed, with graphical results, to illustrate the performance of the maximum likelihood estimates (MLEs) of the parameters. The significance and flexibility of the NMSM distribution are demonstrated through different measures by application to two real data sets.
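The NMSM density itself is not reproduced in this abstract, so as a hedged illustration of the maximum likelihood step only, the sketch below fits the baseline Burr XII (Singh-Maddala) distribution that the NMSM extends, using scipy's burr12 on simulated data; it is a stand-in, not the NMSM model.

```python
import numpy as np
from scipy import stats

# Simulate from the baseline Burr XII (Singh-Maddala) distribution and recover
# its parameters by maximum likelihood; the NMSM extension would replace the
# density used here.
rng = np.random.default_rng(2)
data = stats.burr12.rvs(c=2.0, d=1.5, scale=3.0, size=500, random_state=rng)

c_hat, d_hat, loc_hat, scale_hat = stats.burr12.fit(data, floc=0)   # MLEs
loglik = np.sum(stats.burr12.logpdf(data, c_hat, d_hat, loc_hat, scale_hat))
print(f"c = {c_hat:.2f}, d = {d_hat:.2f}, scale = {scale_hat:.2f}, loglik = {loglik:.1f}")
```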
In this paper, we propose Bayesian estimation of the parameter and reliability function of the exponentiated gamma distribution under progressively type-II censored samples. The Bayes estimates of the parameter and reliability function are derived under the assumption of an independent gamma prior by three different approximation methods, namely Lindley’s approximation, the Tierney-Kadane method, and Markov chain Monte Carlo. Further, the Bayes estimators are compared with the corresponding maximum likelihood estimators through a simulation study. Finally, a real data set is used to illustrate the study in a realistic setting.
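As a hedged sketch of the MCMC alternative, the code below assumes a common one-parameter exponentiated gamma parametrization, F(x; theta) = (1 - exp(-x)(1 + x))^theta, which may differ from the paper's, together with a Gamma(a, b) prior on theta and hypothetical progressively type-II censored data, and samples the posterior with a random-walk Metropolis algorithm.

```python
import numpy as np

# Log-posterior under progressive type-II censoring: the likelihood is
# prod f(x_i; theta) * (1 - F(x_i; theta))**R_i, with R_i units removed at the
# i-th failure, plus a Gamma(a, b) prior on theta.
def log_posterior(theta, x, R, a=1.0, b=1.0):
    if theta <= 0:
        return -np.inf
    G = 1 - np.exp(-x) * (1 + x)                        # baseline CDF
    logf = np.log(theta) + np.log(x) - x + (theta - 1) * np.log(G)
    logS = np.log1p(-G**theta)                          # log survival
    log_prior = (a - 1) * np.log(theta) - b * theta
    return np.sum(logf + R * logS) + log_prior

rng = np.random.default_rng(3)
x = np.array([0.3, 0.7, 1.1, 1.6, 2.4])   # hypothetical observed failure times
R = np.array([1, 0, 2, 0, 3])             # hypothetical progressive removals

theta, draws = 1.0, []
for _ in range(5000):
    prop = theta + 0.3 * rng.standard_normal()          # random-walk proposal
    if np.log(rng.uniform()) < log_posterior(prop, x, R) - log_posterior(theta, x, R):
        theta = prop
    draws.append(theta)

post = np.array(draws[1000:])                            # drop burn-in
print("posterior mean of theta:", post.mean())
print("posterior mean reliability at t=1:", np.mean(1 - (1 - 2 * np.exp(-1))**post))
```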