Home
Search

Journal of Data Science

Submit your article Information

Journal home
To appear
Current issue
All issues
More
Journal home To appear Current issue All issues

Detailed search

Title

Author

Types

Abstract

Keywords

Published

Pages

Volumes

Issues

DOI

Affiliation

Search results 892

Order by:

Select: All None Download:

Training Students and Researchers in Bayesian Methods

Bruno Lecoutre

https://doi.org/10.6339/JDS.2006.04(2).246

Pub. online: 4 Aug 2022 Type: Research Article

Open Access

Journal: Journal of Data Science Volume 4, Issue 2 (2006), pp. 207–232

Abstract

Abstract: Frequentist Null Hypothesis Significance Testing (NHST) is so an integral part of scientists’ behavior that its uses cannot be discontinued by flinging it out of the window. Faced with this situation, the suggested strategy for training students and researchers in statistical inference methods for experimental data analysis involves a smooth transition towards the Bayesian paradigm. Its general outlines are as follows. (1) To present natural Bayesian interpretations of NHST outcomes to draw attention to their shortcomings. (2) To create as a result of this the need for a change of emphasis in the presentation and interpretation of results. (3) Finally to equip users with a real possibility of thinking sensibly about statistical inference problems and behaving in a more reasonable manner. The conclusion is that teaching the Bayesian approach in the context of experimental data analysis appears both desirable and feasible. This feasibility is illustrated for analysis of variance methods.

Comparing Pre-Post Change Across Groups: Guidelines for Choosing between Difference Scores, ANCOVA, and Residual Change Scores

Megan A. Jennings Robert A. Cribbie

https://doi.org/10.6339/JDS.201604_14(2).0002

Pub. online: 4 Aug 2022 Type: Research Article

Open Access

Journal: Journal of Data Science Volume 14, Issue 2 (2016), pp. 205–230

On the Extension of Inverse Lindley Distribution

Vikas Kumar Sharma Pragya Khandelwal

https://doi.org/10.6339/JDS.201704_15(2).0002

Pub. online: 4 Aug 2022 Type: Research Article

Open Access

Journal: Journal of Data Science Volume 15, Issue 2 (2017), pp. 205–220

Abstract

In this paper, we proposed another extension of inverse Lindley distribution, called extended inverse Lindley and studied its fundamental properties such as moments, inverse moments, mean deviation, stochastic ordering and entropy. The flexibility of the proposed distribution is shown by studying monotonicity properties of density and hazard functions. It is shown that the distribution belongs to the family of upside-down bathtub shaped distributions. Maximum likelihood estimators along with asymptotic confidence intervals are constructed for estimating the unknown parameters. An algorithm is presented for random number generation form the distribution. The property of consistency of MLEs has been verified on the basis of simulated samples. The applicability of the extended inverse Lindley distribution is illustrated by means of real data analysis.

Is the Banker a Myth?

Mike Cox

https://doi.org/10.6339/JDS.201104_09(2).0005

Pub. online: 4 Aug 2022 Type: Research Article

Open Access

Journal: Journal of Data Science Volume 9, Issue 2 (2011), pp. 205–219

Abstract

Abstract: The actions of the anonymous banker in the highstake television gambling programme Deal or No Deal is examined. If a model can successfully predict his behaviour it might suggest that an automatic process is employed to reach his decisions. Potential strategies associated with a number of games are investigated and a model developed for the offers the anonymous banker makes to buy out the player. This approach is developed into a selection strategy of the optimum stage at which a player should accept the money offered. This is reduced to a simple table, by knowing their current position players can rapidly arrive at an appropriate decision strategy with associated probabilities. These probabilities give a guide as to the confidence to be placed in the choice adopted.

Multiple Taxicab Correspondence Analysis of a Survey Related to Health Services

Vartan Choulakian Jacques Allard Biagio Simonetti

https://doi.org/10.6339/JDS.2013.11(2).1113

Pub. online: 4 Aug 2022 Type: Research Article

Open Access

Journal: Journal of Data Science Volume 11, Issue 2 (2013), pp. 205–229

Abstract

Abstract: We present an analysis of a health survey data by multiple cor respondence analysis (MCA) and multiple taxicab correspondence analysis (MTCA), MTCA being a robust L1 variant of MCA. The survey has one passive item, gender, and 22 active substantive items representing health services offered by municipal authorities; each active item has four answer categories: this service is used, never tried, tried with no access, non re sponse. We show that the first principal MTCA factor is perfectly charac terized by the sum score of the category this service is used over all service items. Further, we prove that such a sum score characterization always exists for any survey data.

Estimating Age-specific Prevalence of Testosterone Deficiency in Men Using Normal Mixture Models

Yungtai Lo

https://doi.org/10.6339/JDS.2009.07(2).463

Pub. online: 4 Aug 2022 Type: Research Article

Open Access

Journal: Journal of Data Science Volume 7, Issue 2 (2009), pp. 203–217

Abstract

Abstract: Testosterone levels decline as men age. There is little consensus on what testosterone levels are normal for aging men. In this paper, we estimate age-specific prevalence of testosterone deficiency in men using nor mal mixture models when no generally agreed on cut-off value for defining testosterone deficiency is available. The Box-Cox power transformation is used to determine which transformation is most appropriate for correcting skewness in data and best suits normal mixture distributions. Parametric bootstrap tests are used to determine the number of components in a normal mixture.

Small F-ratios: Red Flags in the Linear Model

Gary E. Meek Ceyhun Ozgur Kenneth A. Dunning

https://doi.org/10.6339/JDS.2007.05(2).339

Pub. online: 4 Aug 2022 Type: Research Article

Open Access

Journal: Journal of Data Science Volume 5, Issue 2 (2007), pp. 199–215

General Marginal Regression Models for the Joint Modeling of Event Frequency and Correlated Severities with Applications to Clinical Trials

Andrew S. Allen Huiman X. Barnhart

https://doi.org/10.6339/JDS.2005.03(2).195

Pub. online: 4 Aug 2022 Type: Research Article

Open Access

Journal: Journal of Data Science Volume 3, Issue 2 (2005), pp. 199–219

Abstract

Abstract: In many clinical trials, information is collected on both the frequency of event occurrence and the severity of each event. For example, in evaluating a new anti-epileptic medication both the total number of seizures a patient has during the study period as well as the severity (e.g., mild, severe) of each seizure could be measured. In order to arrive at a full picture of drug or treatment performance, one needs to jointly model the number of events and their correlated ordinal severity measures. A separate analysis is not recommended as it is inefficient and can lead to what we define as “zero length bias” in estimates of treatment effect on severity. This paper proposes a general, likelihood based, marginal regression model for jointly modeling the number of events and their correlated ordinal severity measures. We describe parameter estimation issues and derive the Fisher information matrix for the joint model in order to obtain the asymptotic covariance matrix of the parameter estimates. A limited simulation study is conducted to examine the asymptotic properties of the maximum likelihood estimators. Using this joint model, we propose tests that incorporate information from both the number of events and their correlated ordinal severity measures. The methodology is illustrated with two examples from clinical trials: the first concerning a new drug treatment for epilepsy; the second evaluating the effect of a cholesterol lowering medication on coronary artery disease.

Analysis of Bank Failure Using Published Financial Statements: The Case of Indonesia (Part 1)

Loso Judijanto E. V. Khmaladze

https://doi.org/10.6339/JDS.2003.01(2).118

Pub. online: 4 Aug 2022 Type: Research Article

Open Access

Journal: Journal of Data Science Volume 1, Issue 2 (2003), pp. 199–230

Machine Learning Algorithms To Predict The Childhood Anemia In Bangladesh

Jahidur Rahman Khan Srizan Chowdhury Humayera Islam All authors (4)

https://doi.org/10.6339/JDS.201901_17(1).0009

Pub. online: 4 Aug 2022 Type: Research Article

Open Access

Journal: Journal of Data Science Volume 17, Issue 1 (2019), pp. 195–218

Abstract

Anemia, especially among children, is a serious public health problem in Bangladesh. Apart from understanding the factors associated with anemia, it may be of interest to know the likelihood of anemia given the factors. Prediction of disease status is a key to community and health service policy making as well as forecasting for resource planning. We considered machine learning (ML) algorithms to predict the anemia status among children (under five years) using common risk factors as features. Data were extracted from a nationally representative cross-sectional survey- Bangladesh Demographic and Health Survey (BDHS) conducted in 2011. In this study, a sample of 2013 children were selected for whom data on all selected variables was available. We used several ML algorithms such as linear discriminant analysis (LDA), classification and regression trees (CART), k-nearest neighbors (k-NN), support vector machines (SVM), random forest (RF) and logistic regression (LR) to predict the childhood anemia status. A systematic evaluation of the algorithms was performed in terms of accuracy, sensitivity, specificity, and area under the curve (AUC). We found that the RF algorithm achieved the best classification accuracy of 68.53% with a sensitivity of 70.73%, specificity of 66.41% and AUC of 0.6857. On the other hand, the classical LR algorithm reached a classification accuracy of 62.75% with a sensitivity of 63.41%, specificity of 62.11% and AUC of 0.6276. Among all considered algorithms, the k-NN gave the least accuracy. We conclude that ML methods can be considered in addition to the classical regression techniques when the prediction of anemia is the primary focus.

60 61 62 63 64

Items per page

Export citation

Copy and paste formatted citation

Formatted citation

Placeholder

Citation style

Download citation in file

Export format

Authors

Placeholder

RSS

Journal of data science

Online ISSN: 1683-8602
Print ISSN: 1680-743X

About

About journal

For contributors

Submit
OA Policy
Become a Peer-reviewer

Contact us

JDS@ruc.edu.cn
No. 59 Zhongguancun Street, Haidian District Beijing, 100872, P.R. China