Abstract. Unemployment is one of the most important issues in macroeconomics, and it creates many economic and social problems. The condition and qualifications of a country's labor force reflect its economic development. In light of these facts, a developing country should overcome the problem of unemployment. In this study, the robust biased methods Robust Ridge Regression (RRR), Robust Principal Component Regression (RPCR) and RSIMPLS are compared with each other and with their classical versions, Ridge Regression (RR), Principal Component Regression (PCR) and Partial Least Squares Regression (PLSR), in terms of predictive ability, using the trimmed Root Mean Squared Error (TRMSE) statistic, when both multicollinearity and outliers are present in an unemployment data set for Turkey. The results show that the RRR model is the best model for determining the unemployment rate in Turkey for the period 1985-2012. The RRR method indicates that the most important independent variable affecting the unemployment rate is Purchasing Power Parities (PPP), and that the least important variables are Import Growth Rate (IMP) and Export Growth Rate (EXP). Hence, an increase in PPP causes an important increase in the unemployment rate, whereas an increase in IMP causes an unimportant increase and an increase in EXP causes an unimportant decrease.
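The TRMSE statistic used for the comparison can be illustrated with a minimal sketch. The abstract does not state the exact definition or the trimming proportion, so the version below is an assumption: it discards the largest squared prediction errors (a fraction `trim`, a hypothetical parameter name) and takes the root mean square of the rest.

```python
import math

def trimmed_rmse(y_true, y_pred, trim=0.1):
    """Root mean squared error over the smallest squared errors.

    Assumed definition: drop the floor(trim * n) largest squared errors,
    then average the remaining ones. trim=0 gives the ordinary RMSE.
    """
    squared_errors = sorted((a - b) ** 2 for a, b in zip(y_true, y_pred))
    n = len(squared_errors)
    n_keep = n - int(math.floor(trim * n))
    return math.sqrt(sum(squared_errors[:n_keep]) / n_keep)

# Nine predictions are off by 1, one is off by 100 (an outlier).
y_true = [0.0] * 10
y_pred = [1.0] * 9 + [100.0]
print(trimmed_rmse(y_true, y_pred, trim=0.1))  # outlier is trimmed away
print(trimmed_rmse(y_true, y_pred, trim=0.0))  # ordinary RMSE, dominated by the outlier
```

Trimming keeps a single badly predicted outlier from dominating the comparison, which is why a statistic of this kind suits data containing outliers.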
We propose distributed generalized linear models for the purpose of incorporating lagged effects. The model class provides a more accurate statistical measure of the relationship between the dependent variable and a series of covariates. The estimators from the proposed procedure are shown to be consistent. Simulation studies not only confirm the asymptotic properties of the estimators but also exhibit the adverse effects of model misspecification on the accuracy of model estimation and prediction. The application is illustrated by analyzing the 2016 presidential election data.
Abstract: In this study, data obtained by a nucleic acid amplification technique (polymerase chain reaction) are analyzed. The data consist of 23 transcript variables used to investigate the genetic mechanisms regulating chlamydial infection, measured against two outcomes of murine C. pneumoniae lung infection: disease, expressed as lung weight increase, and C. pneumoniae load in the lung. A model with a reduced set of transcript variables of interest at the early infection stage is obtained using traditional methods (stepwise regression, partial least squares regression (PLS)) and modern variable selection methods (least absolute shrinkage and selection operator (LASSO), forward stagewise regression and least angle regression (LARS)). Through these variable selection methods, the variables of interest are selected to investigate the genetic mechanisms that determine the outcomes of chlamydial lung infection. The transcript variables Tim3, GATA3, Lacf and Arg2 (X4, X5, X8 and X13) are detected as the main variables of interest for studying the C. pneumoniae disease (lung weight increase) and C. pneumoniae lung load outcomes. Models including these key variables may provide possible answers to the problem of the molecular mechanisms of chlamydial pathogenesis.
Abstract: A basic assumption of the general linear regression model is that there is no correlation (no multicollinearity) between the explanatory variables. When this assumption is not satisfied, the least squares estimators have large variances, become unstable and may have the wrong sign. We therefore resort to biased regression methods, which stabilize the parameter estimates. Ridge regression (RR) and principal component regression (PCR) are two of the most popular biased regression methods for use in the presence of multicollinearity. The r-k class estimator, which combines the RR estimator and the PCR estimator into a single estimator, gives better estimates of the regression coefficients than either the RR estimator or the PCR estimator alone. This paper applies multiple regression with the r-k class estimator to the relationship between the total fertility rate (TFR) and other socio-economic and demographic variables, using data from the National Family Health Survey-III (NFHS-III) for 29 states of India. The analysis shows that the use of contraceptive devices has the greatest impact on the fertility rate, followed by maternal care, use of improved water, female age at marriage and spacing between births.
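For reference, the r-k class estimator is usually written as below. This is a sketch of the standard definition, not necessarily the paper's own notation. It assumes $X$ is the standardized $n \times p$ matrix of explanatory variables, $T_r$ is the $p \times r$ matrix whose columns are the eigenvectors of $X'X$ for its $r$ largest eigenvalues, and $k \ge 0$ is the ridge parameter.

```latex
\hat{\beta}_r(k) = T_r \left( T_r' X' X T_r + k I_r \right)^{-1} T_r' X' y,
\qquad 0 < r \le p, \quad k \ge 0 .

% Special cases
\hat{\beta}_p(0) = (X'X)^{-1} X' y                      % ordinary least squares
\hat{\beta}_p(k) = (X'X + k I_p)^{-1} X' y              % ridge regression (RR)
\hat{\beta}_r(0) = T_r (T_r' X' X T_r)^{-1} T_r' X' y   % principal component regression (PCR)
```

Setting $r = p$ removes the dimension reduction and setting $k = 0$ removes the shrinkage, so RR and PCR are both special cases of this single estimator.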
Influential observations pose a major threat to the performance of a regression model. Various influence statistics, including Cook's Distance and DFFITS, have been introduced in the literature for Ordinary Least Squares (OLS). The efficiency of these measures is affected by the presence of multicollinearity in linear regression, yet both problems can occur jointly in a regression model. New diagnostic measures based on the Two-Parameter Liu-Ridge Estimator (TPE) defined by Ozkale and Kaciranlar (2007) are proposed as alternatives to the existing ones. Approximate deletion formulas for the detection of influential cases under the TPE are also proposed. Finally, the diagnostic measures are illustrated with two real-life data sets.
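As background, the classical OLS versions of the two diagnostics named above can be sketched for the simplest case, a straight-line fit with an intercept. This is only an illustration of the standard Cook's Distance and DFFITS formulas, not the TPE-based measures the paper proposes. The function names and the small data set are hypothetical.

```python
import math

def fit_line(x, y):
    """OLS fit of y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    return y_bar - b1 * x_bar, b1

def influence_simple_ols(x, y):
    """Cook's Distance and DFFITS for each case of a straight-line OLS fit."""
    n, p = len(x), 2                      # p = number of coefficients
    b0, b1 = fit_line(x, y)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    lev = [1.0 / n + (xi - x_bar) ** 2 / sxx for xi in x]   # leverages h_ii
    sse = sum(e * e for e in resid)
    s2 = sse / (n - p)                    # residual variance, all cases
    cooks, dffits = [], []
    for e, h in zip(resid, lev):
        cooks.append(e * e * h / (p * s2 * (1.0 - h) ** 2))
        # Residual variance with this case deleted (no refit needed).
        s2_del = (sse - e * e / (1.0 - h)) / (n - p - 1)
        t = e / math.sqrt(s2_del * (1.0 - h))   # externally studentized residual
        dffits.append(t * math.sqrt(h / (1.0 - h)))
    return cooks, dffits

# Hypothetical data: the last case has high leverage and lies off the trend.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 20.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 7.2, 10.0]
cooks, dffits = influence_simple_ols(x, y)
most_influential = max(range(len(x)), key=lambda i: cooks[i])
print(most_influential, cooks[most_influential], dffits[most_influential])
```

Both statistics are computed from the residuals and leverages of a single fit. With multicollinearity the OLS fit itself becomes unstable, which is the motivation for basing the diagnostics on a biased estimator instead.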