README FOR "Hierarchical ridge regression for incorporating prior information in genomic studies"
Manuscript ID: JDS2107-007

Contact Info: Eric S. Kawaguchi 
Please send any questions to eric.kawaguchi@med.usc.edu 


Code to reproduce the simulation results is located in this folder.

Before running the R scripts, please run installNecessaryPackages.R, located in the /code/ directory, to install the necessary packages.
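
As a rough sketch of the step above, the following R snippet checks whether some packages are missing and, if so, sources the installer script. The helper name missing_packages and the relative path code/installNecessaryPackages.R are assumptions made for illustration; only ggplot2 and dplyr are named in this README, so the full package list lives in the installer script itself.

```r
# Hypothetical helper (not part of the repository): return the packages
# in `pkgs` that are not currently installed.
missing_packages <- function(pkgs) {
  installed <- rownames(installed.packages())
  setdiff(pkgs, installed)
}

# ggplot2 and dplyr are the only packages this README names explicitly.
needed <- c("ggplot2", "dplyr")
to_install <- missing_packages(needed)

if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # Assumed location of the installer relative to the project root.
  installer <- file.path("code", "installNecessaryPackages.R")
  if (file.exists(installer)) {
    source(installer)
  } else {
    message("Installer script not found at: ", installer)
  }
}
```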

Load the R project (JDS2107-007.Rproj), since it will automatically set the working directory. (The working directory should look like "~/.../JDS2107-007/".)
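
If the project file is not used, a quick check along these lines may help confirm the working directory. This is only a sketch: the helper in_project_root is hypothetical, and it assumes the project folder is named JDS2107-007, as in the example path above.

```r
# Hypothetical helper: TRUE if `path` ends in the project folder name.
in_project_root <- function(path, project = "JDS2107-007") {
  basename(normalizePath(path, mustWork = FALSE)) == project
}

if (!in_project_root(getwd())) {
  message("Working directory is ", getwd(),
          "; expected a path ending in JDS2107-007. Use setwd() to change it.")
}
```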

#######################################################################################
# Simulations are stored in the /code/ directory. From here on out it is assumed that we will be working in this directory.


1. simulationList.xlsx: Holds the parameters for all simulations performed in the main manuscript (and supplemental material).
	a. Sheet 1: Continuous y and binary Z  (Section 3.1)
	b. Sheet 2: Continuous y and continuous Z (Section 3.2)
	c. Sheet 3: Binary y and continuous Z (Section 3.3)

2. simulations/:  Directory that contains the code to run the simulations found in the text (more details below).

3. results/: Directory that contains ALL simulation results. The output of the R scripts that perform the simulation studies is saved as .rds files and housed in this directory.

4. eval/: Directory that contains the code to produce the plots found in the text. ggplot2 and dplyr are necessary packages. eval-xy.R denotes the code used to create the plot corresponding to Section x.y (e.g., eval-31.R produces Figure 1 in Section 3.1).

5. sourceFiles/: Directory that contains additional R scripts to compute correlation matrices, generate the outcome, etc. The scripts in this folder are typically sourced before the simulations are performed.

Note: Simulations were performed using the high-performance computing (HPC) clusters at USC. These simulations can take >24 hours if run on a local machine. Please see the paper_results.zip file, which contains the simulation results presented in the paper. The R scripts in the eval folder read the .rds files in the paper_results directory (once unzipped). Remember to comment (and uncomment) the correct directory in these scripts if you plan on running the simulations yourself.
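
The comment/uncomment step in the note above might look roughly like the sketch below. The variable results_dir, the helper read_results, and both directory paths are assumptions for illustration; the actual eval scripts may name and organize these differently.

```r
# Choose ONE results directory and comment out the other.
# Both paths are assumed locations, not confirmed by the repository.
results_dir <- "paper_results"    # unzipped paper_results.zip
# results_dir <- "results"        # output from your own simulation runs

# Hypothetical helper: read every .rds file in `dir` into a named list.
read_results <- function(dir) {
  files <- list.files(dir, pattern = "\\.rds$", full.names = TRUE)
  names(files) <- basename(files)
  lapply(files, readRDS)
}

# Returns an empty list if the directory is missing or holds no .rds files.
results <- read_results(results_dir)
message(length(results), " result file(s) read from ", results_dir)
```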

#######################################################################################
REAL DATA APPLICATION
# The real data application can be found in the RDA/ directory: Methyl (Section 4.1) and METABRIC (Section 4.2)

- run.R: R code to run the analysis.
- eval.R: R code to produce results in Table 1 and Figure 4.

Similar to the simulations, running run.R for both analyses will be computationally demanding. We include the results needed to produce the outputs (figures and tables) in the Methyl/results and METABRIC/results directories. Thus, to reproduce Table 1 and Figure 4, one only needs to source the respective eval.R files.
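
A minimal sketch of sourcing both evaluation scripts is shown here. It assumes the layout RDA/Methyl/eval.R and RDA/METABRIC/eval.R relative to the current working directory, which is inferred from the description above rather than confirmed; adjust the paths if the scripts expect a different working directory.

```r
# Assumed paths to the two evaluation scripts.
eval_scripts <- file.path("RDA", c("Methyl", "METABRIC"), "eval.R")

for (script in eval_scripts) {
  if (file.exists(script)) {
    message("Sourcing ", script)
    source(script)
  } else {
    message("Not found (check the working directory): ", script)
  }
}
```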



