Pub. online: 31 Mar 2023 | Type: Computing In Data Science | Open Access
Journal:Journal of Data Science
Volume 21, Issue 2 (2023): Special Issue: Symposium Data Science and Statistics 2022, pp. 333–353
Abstract
High-Order Markov Chains (HOMC) are conventional models, based on transition probabilities, that are used by the United States Department of Agriculture (USDA) National Agricultural Statistics Service (NASS) to study crop-rotation patterns over time. However, HOMCs routinely suffer from sparsity and identifiability issues because the categorical data are represented as indicator (or dummy) variables. In fact, the dimension of the parameter space increases exponentially with the order of the HOMC required for analysis. While parsimonious representations reduce the number of parameters, as has been shown in the literature, they often result in less accurate predictions. Most parsimonious models are trained on big data structures, which can be compressed and efficiently processed using alternative algorithms. Consequently, a thorough evaluation and comparison of the prediction results obtained using a new HOMC algorithm and different types of Deep Neural Networks (DNN) across a range of agricultural conditions is warranted to determine which model is most appropriate for operational crop-specific land cover prediction of United States (US) agriculture. In this paper, six neural network models are applied to crop rotation data between 2011 and 2021 from six agriculturally intensive counties, which reflect the range of major crops grown and a variety of crop rotation patterns in the Midwest and southern US. The six counties are: Renville, North Dakota; Perkins, Nebraska; Hale, Texas; Livingston, Illinois; McLean, Illinois; and Shelby, Ohio. Results show that the DNN models achieve higher overall prediction accuracy for all counties in 2021. The proposed DNN models allow for the ingestion of long time series data, and robustly achieve higher accuracy values than the new HOMC algorithm considered for predicting crop-specific land cover in the US.
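The exponential growth of the parameter space noted above can be made concrete with a small counting sketch. It assumes a full (non-parsimonious) chain of order k over m categories, in which each of the m^k possible histories has its own distribution over the next state and therefore m - 1 free probabilities. The function name and the example value of 10 crop classes are illustrative assumptions, not taken from the paper.

```python
def homc_free_parameters(num_states: int, order: int) -> int:
    """Count free transition probabilities in a full Markov chain of the given order.

    Each of the num_states**order histories has a distribution over the next
    state; one probability per distribution is fixed by the sum-to-one constraint.
    """
    if num_states < 2 or order < 1:
        raise ValueError("need at least 2 states and order >= 1")
    return num_states ** order * (num_states - 1)


# Illustrative only: 10 hypothetical crop classes, orders 1 through 4.
for order in range(1, 5):
    print(order, homc_free_parameters(10, order))
```

Under these assumptions each extra order multiplies the parameter count by the number of categories, which is why long crop histories quickly outrun the available observations.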
Abstract: A total of 1094 HIV patients were followed in a cohort study (January to December 2010) that tracked transitions in their CD4 cell counts. Patients were grouped by immunological status into the five states developed by Guiseppe Di Biase et al. (2007): State one (CD4 > 500 cells/mm³), State two (350 < CD4 ≤ 500 cells/mm³), State three (200 < CD4 ≤ 350 cells/mm³), State four (CD4 ≤ 200 cells/mm³), and State five (death). These states define the seriousness of the sickness based on the patients' CD4 cell counts. We use a non-stationary Markov chain model for the prediction. The non-stationary probabilities were estimated using the exponential smoothing technique. The prediction showed a gradual decrease in CD4 cell counts from January to December. Furthermore, the results indicated that, from December 2011 onward, the patients in the study would not survive unless they were placed on highly active antiretroviral therapy (HAART). The results also showed that the model can be used to test the efficacy of drugs administered to patients within a given period.
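As a rough illustration of how exponential smoothing can be combined with a non-stationary Markov chain, the sketch below smooths a sequence of monthly transition matrices entry by entry and uses the result for a one-step prediction. The abstract does not state which smoothing variant or smoothing constant was used, so simple exponential smoothing with a hypothetical `alpha`, the function names, and the three-state toy matrices are all assumptions made for illustration; they are not the study's data or method.

```python
def smooth_transition_matrices(monthly, alpha=0.5):
    """Simple exponential smoothing applied entrywise to row-stochastic matrices.

    Each update is a convex combination of two row-stochastic matrices,
    so every row of the result still sums to one.
    """
    smoothed = [row[:] for row in monthly[0]]
    for matrix in monthly[1:]:
        smoothed = [
            [alpha * new + (1 - alpha) * old for new, old in zip(new_row, old_row)]
            for new_row, old_row in zip(matrix, smoothed)
        ]
    return smoothed


def step(distribution, matrix):
    """Advance a state distribution by one period: d' = d P."""
    n = len(matrix)
    return [
        sum(distribution[i] * matrix[i][j] for i in range(n))
        for j in range(len(matrix[0]))
    ]


# Made-up toy matrices: two transient states plus an absorbing final state.
month_1 = [[0.9, 0.1, 0.0], [0.0, 0.8, 0.2], [0.0, 0.0, 1.0]]
month_2 = [[0.8, 0.2, 0.0], [0.0, 0.7, 0.3], [0.0, 0.0, 1.0]]

forecast_matrix = smooth_transition_matrices([month_1, month_2], alpha=0.5)
next_distribution = step([1.0, 0.0, 0.0], forecast_matrix)
print(forecast_matrix)
print(next_distribution)
```

Repeating `step` with the smoothed matrix projects the cohort's state distribution forward month by month; in a model like the one described, probability mass accumulating in the absorbing state would correspond to predicted deaths.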