README of gPOCRE 
================

This code is written in MATLAB. Please contact 
            Dabao Zhang (dabao.zhang@uci.edu) 
for any relevant questions.


First Time to Run gPOCRE
========================
Compile the C code needed to use 'ebtz':
     > CompileGEB;
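
If CompileGEB has already been run, the compiled file can be reused.
The following is a minimal sketch of a guard. It assumes the compiled
MEX function is named 'ebtz'; this name is inferred from the line above
and may differ in your copy:
     > if exist('ebtz','file') ~= 3, CompileGEB; end
exist(...,'file') returns 3 when a MEX-file of that name is on the
MATLAB path.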


Simulation Study:
=================

1. For Model 1, run
     > n = 200;
     > BatchSimuStudyCVGPOCRE(1,n);
     > n = 500;
     > BatchSimuStudyCVGPOCRE(1,n);

2. For Model 2, run
     > n = 200;
     > BatchSimuStudyCVGPOCRE(2,n);
     > n = 500;
     > BatchSimuStudyCVGPOCRE(2,n);

3. For Model 3, run
     > n = 200;
     > BatchSimuStudyCVGPOCRE(3,n);
     > n = 500;
     > BatchSimuStudyCVGPOCRE(3,n);

4. For Model 4, run
     > n = 200;
     > BatchSimuStudyCVGPOCRE(4,n);
     > n = 500;
     > BatchSimuStudyCVGPOCRE(4,n);
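
The eight runs above can also be launched from a single loop. This is
only a sketch, and it assumes the runs are independent of each other, so
that calling them one after another gives the same results as the
separate commands (the loop variable name "model" is illustrative):
     > for model = 1:4
     >     for n = [200 500]
     >         BatchSimuStudyCVGPOCRE(model,n);
     >     end
     > end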


Real Data Analysis:
===================

1. The ISOLET data from Fanty and Cole (1990) can be downloaded from OpenML (https://www.openml.org/search?type=data&sort=version&status=any&order=asc&exact_name=isolet&id=41966). The file "trainselct_iso.csv" contains the training-set indicators for each of the 50 repeated analyses.

   To apply the gPOCRE method, run
     > gpocre_iso;
   To apply the LASSO method, run
     > lasso_iso;
   For the SGPLS method, see the R code in sgpls_iso.R.

2. The breast cancer data from TCGA are available in the R package mixOmics (https://mixomics.org/). The file "trainselct_br.csv" contains the training-set indicators for each of the 50 repeated analyses.

   To apply the gPOCRE method, run
     > gpocre_breast;
   To apply the LASSO method, run
     > lasso_breast;
   For the SGPLS method, see the R code in sgpls_breast.R.


How to Use gPOCRE in General:
=============================

1. Use gPOCREScreen() to screen variables first, fit with gPOCREPath(),
   select the optimal model with SelectModel(), and calculate
   p-values using RBootsGPOCRE(). For an example, run
     > [sres,pvalue,gppres] = Ex4gPOCRE;
   Note that calculating p-values with the bootstrap method is problematic,
   so it is recommended to obtain p-values with multivariate analysis.

2. To choose the tuning parameter by cross-validation,
   read Ex4CVGPOCRE.m and run
     > res = Ex4CVGPOCRE;

3. Write a similar .m file to analyze your own data. You may want
   to adjust the settings in option to suit your data
   analysis.

4. You can also use the function gPOCRE(...) if you want to try a
   particular lambda value.


Notes
=====

1. The optimal tuning parameter can be determined using cross-validation, EBIC (Chen & Chen, 2010), BIC, AIC, or AICC.

2. gPOCRE(...) always assumes a fixed weight matrix W; if W is not provided, an identity matrix is used. It is therefore recommended to run gPOCRE(...) once to get an initial beta, call CalcOptW(...) to update W, and then run gPOCRE(...) again with the updated W. gPOCREPath(...) follows this strategy.
