The assignment problem, crucial in various real-world applications, involves optimizing the allocation of agents to tasks for maximum utility. While it has been well-studied in the optimization literature when the underlying utilities between all agent-task pairs are known, research is sparse when the utilities are unknown and need to be learned from data on the fly. This paper addresses this gap, as motivated by mentor-mentee matching programs at many U.S. universities. We develop an efficient sequential assignment algorithm, with the aim of nearly maximizing the overall utility simultaneously over different time periods. Our proposed algorithm is to use stochastic bandit feedback to adaptively estimate the unknown utilities through linear regression models, integrating the Upper Confidence Bound (UCB) algorithm in the multi-armed bandit problem with the Hungarian algorithm in the assignment problem. We provide theoretical bounds of our algorithm for both the estimation error and the total regret. Additionally, numerical studies are also conducted to demonstrate the practical effectiveness of our algorithm.
An exponentiated Weibull-geometric distribution is defined and studied. A new count data regression model, based on the exponentiated Weibull-geometric distribution, is also defined. The regression model can be applied to fit an underdispersed or an over-dispersed count data. The exponentiated Weibull-geometric regression model is fitted to two numerical data sets. The new model provided a better fit than the fit from its competitors.
Abstract: In this study, we compared various block bootstrap methods in terms of parameter estimation, biases and mean squared errors (MSE) of the bootstrap estimators. Comparison is based on four real-world examples and an extensive simulation study with various sample sizes, parameters and block lengths. Our results reveal that ordered and sufficient ordered non-overlapping block bootstrap methods proposed by Beyaztas et al. (2016) provide better results in terms of parameter estimation and its MSE compared to conventional methods. Also, sufficient non-overlapping block bootstrap method and its ordered version have the smallest MSE for the sample mean among the others.
The Lindley distribution has been generalized by many authors in recent years. However, all of the known generalizations so far have restricted tail behaviors. Here, we introduce the most flexible generalization of the Lindley distribution with its tails controlled by two independent parameters. Various mathematical properties of the generalization are derived. Maximum likelihood estimators of its parameters are derived. Fisher’s information matrix and asymptotic confidence intervals for the parameters are given. Finally, a real data application shows that the proposed generalization performs better than all known ones
In this work, we introduce a new distribution for modeling the extreme values. Some important mathematical properties of the new model are derived. We assess the performance of the maximum likelihood method in terms of biases and mean squared errors by means of a simulation study. The new model is better than some other important competitive models in modeling the repair times data and the breaking stress data.
The present paper addresses computational and numerical challenges when working with t copulas and their more complicated extensions, the grouped t and skew t copulas. We demonstrate how the R package nvmix can be used to work with these copulas. In particular, we discuss (quasi-)random sampling and fitting. We highlight the difficulties arising from using more complicated models, such as the lack of availability of a joint density function or the lack of an analytical form of the marginal quantile functions, and give possible solutions along with future research ideas.
Pub. online:16 Sep 2021Type:Statistical Data ScienceOpen Access
Journal:Journal of Data Science
Volume 21, Issue 3 (2023): Special Issue: Advances in Network Data Science, pp. 446–469
Abstract
In this paper, we study macroscopic growth dynamics of social network link formation. Rather than focusing on one particular dataset, we find invariant behavior in regional social networks that are geographically concentrated. Empirical findings suggest that the startup phase of a regional network can be modeled by a self-exciting point process. After the startup phase ends, the growth of the links can be modeled by a non-homogeneous Poisson process with a constant rate across the day but varying rates from day to day, plus a nightly inactive period when local users are expected to be asleep. Conclusions are drawn based on analyzing four different datasets, three of which are regional and a non-regional one is included for contrast.