Abstract: The modified autoregressive (mAR) index has been proposed as a description of the clustering of shots of similar duration in a motion picture. In this paper we derive robust estimates of the mAR index for high-grossing films at the US box office using a rank-based autocorrelation function resistant to the influence of outliers, and compare these with estimates obtained using the classical, moment-based autocorrelation function. The results show that (1) the classical mAR index underestimates both the level of shot clustering in a film and the variation in style among the films in the sample; (2) there is a decline in shot clustering from 1935 to the 1950s, followed by an increase from the 1960s to the 1980s and a levelling off thereafter, rather than the monotonic trend indicated by the classical index, and this is mirrored in the trend of the median shot lengths and interquartile range; and (3) the rank mAR index identifies differences between genres overlooked when using the classical index.
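As a minimal sketch of the contrast between the two autocorrelation functions, the example below computes a moment-based lag-k autocorrelation and a rank-based counterpart obtained by applying the same formula to the average ranks of the series. This is an assumed, generic definition of a rank autocorrelation, not necessarily the estimator used in the paper, and it does not compute the mAR index itself. The function names and the shot-length data are hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks of values, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def autocorrelation(series, lag):
    """Classical (moment-based) sample autocorrelation at the given lag.

    Assumes the series is not constant (otherwise the denominator is zero).
    """
    n = len(series)
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    num = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return num / denom


def rank_autocorrelation(series, lag):
    """Assumed rank-based variant: the classical formula applied to the ranks."""
    return autocorrelation(average_ranks(series), lag)


# Hypothetical shot lengths in seconds, with one very long take as an outlier.
shots = [2.1, 3.4, 2.8, 3.0, 95.0, 2.5, 3.2, 2.9]
print("classical lag-1:", autocorrelation(shots, 1))
print("rank-based lag-1:", rank_autocorrelation(shots, 1))
```

Because the rank version depends only on the ordering of the shot lengths, it is unchanged by any strictly increasing transformation of the data, which is one way to see why a single extreme shot has limited influence on it.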
Abstract: In this note a new method of comparing component structural importance is introduced and compared with existing ones. In particular, relationships of the new comparison method to the H-importance due to Hwang (2001, 2005), the criticality ordering due to Boland et al. (1989) and the Birnbaum importance are obtained. Illustrative examples are given.
In semiparametric regression it is of interest to detect anomalous observations that exert an unduly large influence on the parameter estimates and fitted values. The detection of influential observations is usually complicated by the presence of collinearity. However, no influence diagnostics are available for the possible effects that collinearity can have on the influence of an observation on the estimates of the parametric and nonparametric components of semiparametric regression models. In this paper we show that when Liu estimators are used to mitigate the effects of collinearity, the influence of some observations can be drastically modified. We propose a case deletion formula to detect influential points in Liu estimators of semiparametric regression models. As an illustrative example, a real data set is analysed.
Abstract: The traditional method for processing functional magnetic resonance imaging (FMRI) data is based on a voxel-wise, general linear model. For experiments conducted using a block design, where periods of activation are interspersed with periods of rest, a haemodynamic response function (HRF) is convolved with the design function and, for each voxel, the convolution is regressed on prewhitened data. An initial analysis of the data often involves computing voxel-wise two-sample t-tests, which avoids a direct specification of the HRF. Assuming only the length of the haemodynamic delay is known, scans acquired in transition periods between activation and rest are omitted, and the two-sample t-test is used to compare mean levels during activation versus mean levels during rest. However, the validity of the two-sample t-test is based on the assumption that the data are Gaussian with equal variances. In this article, we consider the Wilcoxon rank test as well as modified versions of the classical t-test that correct for departures from these assumptions. The relative performance of the tests is assessed by applying them to simulated data and comparing their size and power; one of the modified tests (the CW test) is shown to be superior.
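To illustrate what correcting the t-test for unequal variances can look like, the sketch below compares the classical pooled-variance statistic with Welch's unequal-variance statistic for a single voxel. Welch's test is used here only as a familiar example of such a modification; the abstract does not define the CW test, and no claim is made that it coincides with Welch's. The function names and the signal values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Classical two-sample t statistic and degrees of freedom (equal variances)."""
    n1, n2 = len(a), len(b)
    # Pooled estimate of the common variance.
    sp2 = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    t = (mean(a) - mean(b)) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


def welch_t(a, b):
    """Welch's t statistic with Welch-Satterthwaite degrees of freedom."""
    n1, n2 = len(a), len(b)
    v1, v2 = variance(a) / n1, variance(b) / n2
    t = (mean(a) - mean(b)) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical signal levels for one voxel, transition scans already omitted.
activation = [102.1, 103.4, 101.8, 104.0, 102.9, 103.7]
rest = [100.2, 99.8, 100.9, 100.1, 99.5, 100.4]

print("pooled:", pooled_t(activation, rest))
print("Welch: ", welch_t(activation, rest))
```

With equal group sizes the two statistics coincide and only the degrees of freedom differ; the statistics themselves diverge when the group sizes and variances are both unequal, which is when the equal-variance assumption matters most.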
Abstract: Accelerated life testing (ALT) has gained importance because it deals with high-reliability units. As a result, there is a strong need for a goodness of fit (GOF) technique for testing the underlying lifetime distribution. A difficulty arises, however, from the existence of several stress levels with different samples of units at each level. The choice of a GOF technique is therefore based on its capability to combine the failure times from all stress levels to reach a conclusion about the adequacy of a given lifetime distribution at each stress level. In this paper, the extended Neyman’s smooth test (ENST) is chosen. It is then modified so that it can be used to validate the distributional assumption of the accelerated failure time (AFT) model. This modified method is called the adapted extended Neyman’s smooth test (AENST). It is applied to test for both Weibull and exponential distributions in the case of constant stress under complete sampling. To check the performance of the AENST, a comparison is made with the conditional probability integral transformation test (CPITT) via a simulation study. Moreover, a real data set is provided to illustrate the application of the AENST. The results reveal that the AENST is a powerful test compared with the CPITT. Thus, the AENST is recommended for testing AFT models.
Abstract: This paper describes and compares three clustering techniques: traditional clustering methods, Kohonen maps and latent class models. The paper also proposes some novel measures of the quality of a clustering. To the best of our knowledge, this is the first contribution in the literature to compare these three techniques in a context where the classes are not known in advance.
Abstract: This article concerns the Bayesian estimation of interest rate models based on the Euler-Maruyama approximation. Assuming the short-term interest rate follows the CIR model, an iterative method of Bayesian estimation is proposed. Markov chain Monte Carlo simulation based on the Gibbs sampler is used for the posterior estimation of the parameters. Maximum a posteriori estimation using a genetic algorithm is employed for finding the Bayesian estimates of the parameters. The method and the algorithm are calibrated with historical data on US Treasury bills.
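For reference, the Euler-Maruyama approximation of the CIR model dr = kappa (theta - r) dt + sigma sqrt(r) dW can be sketched as below. This only simulates a discretised path; it is not the paper's Bayesian estimation procedure. The parameter names, the parameter values, and the choice to truncate the rate at zero inside the square root (so the step stays defined if the discretised rate turns negative) are all assumptions made for illustration.

```python
import math
import random


def simulate_cir_euler(r0, kappa, theta, sigma, dt, n_steps, seed=0):
    """Simulate one CIR path with the Euler-Maruyama scheme.

    Each step uses
        r_next = r + kappa * (theta - r) * dt + sigma * sqrt(max(r, 0)) * dW,
    where dW is Gaussian with mean 0 and standard deviation sqrt(dt).
    """
    rng = random.Random(seed)
    path = [r0]
    for _ in range(n_steps):
        r = path[-1]
        r_pos = max(r, 0.0)  # assumed truncation; keeps the square root real
        dw = rng.gauss(0.0, math.sqrt(dt))
        path.append(r + kappa * (theta - r) * dt + sigma * math.sqrt(r_pos) * dw)
    return path


# Hypothetical parameter values: one year of daily steps.
path = simulate_cir_euler(r0=0.03, kappa=0.5, theta=0.05, sigma=0.1,
                          dt=1 / 252, n_steps=252, seed=1)
print("final simulated rate:", path[-1])
```

Under this discretisation each transition is conditionally Gaussian given the previous rate, which is what makes an approximate likelihood tractable for simulation-based estimation.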
Abstract: Information fusion has become a powerful tool for challenging applications such as biological prediction problems. In this paper, we apply a new information-theoretical fusion technique to HIV-1 protease cleavage site prediction, a problem that has recently been the focus of much interest and investigation in the machine learning community. It poses a difficult classification task because of its high-dimensional feature space and relatively small set of available training patterns. We also apply a new set of biophysical features to this problem and present experiments with neural networks, support vector machines, and decision trees. Application of our feature set results in high recognition rates and concise decision trees, producing manageable rule sets that can guide future experiments. In particular, we found a combination of neural networks and support vector machines to be beneficial for this problem.