Abstract: Suppose that an order restriction is imposed on several means of a time series. We are interested in testing the homogeneity of these unknown means under this restriction. In the present paper, a test based on isotonic regression is developed for monotonically ordered means in a time series whose errors form a stationary, short-range dependent sequence. A test statistic is proposed using the penalized likelihood ratio (PLR) approach. Since the asymptotic null distribution of the test statistic is complicated, its critical values are computed by Monte Carlo simulation for several sample sizes and significance levels. A power study shows that the proposed test is more powerful than the test proposed by Brillinger (1989). Finally, to illustrate its application, the proposed test is applied to a real dataset containing monthly rainfall records for Iran.
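The isotonic regression step mentioned in this abstract can be sketched with the pool-adjacent-violators algorithm (PAVA), the standard way to compute a least-squares nondecreasing fit. This is a minimal illustration only: the function name `pava`, the optional weights, and the example means are assumptions, and the PLR statistic itself is not reproduced because the abstract does not define it.

```python
def pava(y, w=None):
    """Pool-adjacent-violators: weighted least-squares nondecreasing fit to y."""
    n = len(y)
    if w is None:
        w = [1.0] * n
    # Each block stores [weighted mean, total weight, number of points pooled].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Merge backwards while the last two blocks violate the ordering.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    # Expand the pooled blocks back to one fitted value per observation.
    fit = []
    for mean, _, count in blocks:
        fit.extend([mean] * count)
    return fit


# Hypothetical sample means that violate a nondecreasing order in two places.
group_means = [1.0, 3.0, 2.0, 4.0, 3.5]
isotonic_fit = pava(group_means)
print(isotonic_fit)  # [1.0, 2.5, 2.5, 3.75, 3.75]
```

Under the null hypothesis of homogeneity all fitted values would be pooled toward a single mean; a test of the kind described compares that fit with the order-restricted one.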
A new four-parameter extreme value distribution is defined and studied. Various structural properties of the proposed distribution are investigated, including ordinary and incomplete moments, generating functions, residual and reversed residual life functions, and order statistics. Some useful characterizations are presented, based on two truncated moments, on the reverse hazard function, and on certain functions of the random variable. The maximum likelihood method is used to estimate the model parameters. Further, we propose a new extended regression model based on the logarithm of the new distribution. The new distribution is applied to three real data sets to demonstrate its flexibility empirically.
In the linear regression setting, we propose a general framework, termed weighted orthogonal components regression (WOCR), which encompasses many known methods as special cases, including ridge regression and principal components regression. WOCR makes use of the monotonicity inherent in orthogonal components to parameterize the weight function. The formulation allows for efficient determination of tuning parameters and hence is computationally advantageous. Moreover, WOCR offers insights for deriving new and improved variants. Specifically, we advocate assigning weights to components based on their correlations with the response, which may lead to enhanced predictive performance. Both simulation studies and real data examples are provided to assess and illustrate the advantages of the proposed methods.
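The sense in which ridge regression and principal components regression are both weightings of orthogonal components can be written out using the standard singular value decomposition. The notation below (a centered design matrix X of full column rank, its thin SVD, weights w_j, ridge penalty \lambda, and number of retained components k) is assumed for illustration and is not taken from the paper.

```latex
\begin{align*}
X &= U D V^{\top}, \qquad D = \operatorname{diag}(d_1 \ge d_2 \ge \dots \ge d_p > 0), \\
\hat{y}(w) &= \sum_{j=1}^{p} w_j \, u_j u_j^{\top} y, \\
w_j^{\text{ridge}} &= \frac{d_j^{2}}{d_j^{2} + \lambda}, \qquad \lambda \ge 0, \\
w_j^{\text{PCR}} &= \mathbf{1}\{ j \le k \}, \qquad k \in \{1, \dots, p\}.
\end{align*}
```

Here u_j is the j-th column of U. Both weight sequences are nonincreasing in j, which is one reading of the monotonicity the abstract refers to; setting every w_j = 1 recovers ordinary least squares.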
Abstract: Cancer is a complex disease in which various types of molecular aberrations drive the development and progression of malignancies. Among these diverse molecular aberrations, inherited and somatic mutations in DNA sequences are considered major drivers of oncogenesis. The complexity of somatic alterations has been revealed by large-scale investigations of cancer genomes and by robust methods for inferring the function of genes. In this review, we describe sequence mutations of several cancer-related genes and discuss their functional implications in cancer. In addition, we introduce the online resources for accessing and analyzing sequence mutations in cancer. We also provide an overview of the statistical and computational approaches, and of future prospects, for conducting comprehensive analyses of the somatic alterations in cancer genomes.
Abstract: A randomly truncated sample arises when the independent variables T and L are observable only if L < T. The truncated version of the Kaplan-Meier estimator is known to be the standard estimation method for the marginal distribution of T or L. The inverse probability weighted (IPW) estimator was suggested as an alternative, and its agreement with the truncated version of the Kaplan-Meier estimator has been proved. This paper centers on the weak convergence of IPW estimators and on variance decomposition. The paper shows that the asymptotic variance of an IPW estimator can be decomposed into two sources. The variation of the IPW estimator using known weight functions is the primary source, and the variation due to estimated weights should be included as well. Variance decomposition establishes the connection between a truncated sample and a biased sample with known probabilities of selection. A simulation study was conducted to investigate the practical performance of the proposed variance estimators, as well as the relative magnitude of the two sources of variation for various truncation rates. A blood transfusion data set is analyzed to illustrate the nonparametric inference discussed in the paper.
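The idea of inverse probability weighting for a biased sample with known selection probabilities can be sketched as a normalized weighted empirical distribution function. The function name `ipw_cdf`, the selection-probability function, and the data are hypothetical; in the truncation setting of the paper the weights must be estimated, which this sketch does not attempt.

```python
def ipw_cdf(times, select_prob, t):
    """Normalized IPW estimate of P(T <= t) from a selection-biased sample.

    times       -- observed values of T
    select_prob -- function giving the known probability that a unit with
                   T = x enters the sample (assumed positive on the data)
    """
    # Units that were unlikely to be sampled count for more.
    weights = [1.0 / select_prob(x) for x in times]
    mass_below = sum(w for x, w in zip(times, weights) if x <= t)
    return mass_below / sum(weights)


# Hypothetical example: larger values of T are more likely to be observed.
observed = [1.0, 2.0, 4.0]
estimate = ipw_cdf(observed, lambda x: x / 4.0, t=2.0)
print(estimate)  # 6/7: weights are 4, 2, 1, and the first two fall at or below 2
```

The unweighted empirical distribution function would give 2/3 here; the weighting shifts mass back toward the small values that the sampling scheme under-represents.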
A graphical tool for choosing the number of nodes for a neural network is introduced. The idea is to fit the neural network with a range of numbers of nodes at first, and then generate a jump plot using a transformation of the mean square errors of the resulting residuals. A theorem is proven to show that the jump plot will select several candidate numbers of nodes among which one is the true number of nodes. Then a single node only test, which has been theoretically justified, is used to rule out erroneous candidates. The method has a sound theoretical background, yields good results on simulated datasets, and shows wide applicability to datasets from real research.
In this paper, we advance new families of bivariate copulas constructed by distributional distortions of existing bivariate copulas. The distortions under consideration are based on the unit gamma distribution of two forms. When the initial copula is Archimedean, the induced copula is also Archimedean under the admissible parameter space. Properties such as Kendall’s tau coefficient, tail dependence coefficients and tail orders for the new families of copulas are derived. An empirical application to economic indicator data is presented.
Abstract: Air pollution is a serious problem in the big cities of Turkey, especially in winter. Particulate atmospheric pollution in urban areas is considered to have a significant impact on human health. Therefore, the ability to make accurate predictions of ambient particulate concentrations is important for improving public awareness and air quality management. Ambient PM10 (i.e., particles less than 10 µm in diameter) pollution has negative impacts on human health and is influenced by meteorological conditions. In this study, partial least squares regression, principal component regression, ridge regression and multiple linear regression are compared for modeling and predicting daily mean PM10 concentrations on the basis of various meteorological parameters obtained for the city of Ankara, Turkey. The analysed period is February 2007. The results show that while multiple linear regression and ridge regression fit this dataset somewhat better, principal component regression and partial least squares regression are better than both in predicting PM10 values for future datasets. In addition, partial least squares regression stands out in terms of predictive ability, as it performs close to principal component regression even with fewer factors.
Abstract: A new six-parameter extension of the generalized gamma distribution, called the Kummer beta generalized gamma distribution, is introduced and studied. It contains at least 28 special models, such as the beta generalized gamma, beta Weibull, beta exponential, generalized gamma, Weibull and gamma distributions, and thus could be a better model for analyzing positively skewed data. The new density function can be expressed as a linear combination of generalized gamma densities. Various mathematical properties of the new distribution are derived, including explicit expressions for the ordinary and incomplete moments, generating function, mean deviations, entropy, and the density function of the order statistics and their moments. The elements of the observed information matrix are provided. We discuss the method of maximum likelihood and a Bayesian approach to fit the model parameters. The superiority of the new model is illustrated by means of three real data sets.
We propose distributed generalized linear models for the purpose of incorporating lagged effects. The model class provides a more accurate statistical measure of the relationship between the dependent variable and a series of covariates. The estimators from the proposed procedure are shown to be consistent. Simulation studies not only confirm the asymptotic properties of the estimators but also exhibit the adverse effects of model misspecification on the accuracy of model estimation and prediction. The approach is illustrated by analyzing data from the 2016 presidential election.
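One common way to incorporate lagged effects in a generalized linear model is to include the current and past values of a covariate as separate columns of the design matrix. The sketch below builds such a lagged design and evaluates a mean under a log link. The helper names, the Poisson-style log link, and the coefficient values are all assumptions for illustration; the abstract does not specify the model form or the estimation procedure.

```python
import math


def lag_matrix(x, max_lag):
    """Rows [x_t, x_{t-1}, ..., x_{t-max_lag}] for t = max_lag, ..., len(x) - 1."""
    rows = []
    for t in range(max_lag, len(x)):
        rows.append([x[t - k] for k in range(max_lag + 1)])
    return rows


def log_link_mean(row, intercept, lag_coefs):
    """Mean under an assumed log link: exp(intercept + sum_k beta_k * x_{t-k})."""
    eta = intercept + sum(b * v for b, v in zip(lag_coefs, row))
    return math.exp(eta)


# Hypothetical covariate series and illustrative (not estimated) coefficients.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
design = lag_matrix(x, max_lag=2)
mu = [log_link_mean(row, 0.0, [0.5, 0.3, 0.2]) for row in design]
print(design)  # [[3.0, 2.0, 1.0], [4.0, 3.0, 2.0], [5.0, 4.0, 3.0]]
```

The first max_lag observations are dropped because their lagged values are unavailable; how to treat them is a modeling choice.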