<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "JATS-journalpublishing1.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">JDS</journal-id>
<journal-title-group><journal-title>Journal of Data Science</journal-title></journal-title-group>
<issn pub-type="epub">1683-8602</issn><issn pub-type="ppub">1680-743X</issn><issn-l>1680-743X</issn-l>
<publisher>
<publisher-name>School of Statistics, Renmin University of China</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">JDS1019</article-id>
<article-id pub-id-type="doi">10.6339/21-JDS1019</article-id>
<article-categories><subj-group subj-group-type="heading">
<subject>Statistical Data Science</subject></subj-group></article-categories>
<title-group>
<article-title>A Pan-Cancer Network Analysis with Integration of miRNA-Gene Targeting for Multiomics Datasets</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Linder</surname><given-names>Henry</given-names></name><xref ref-type="aff" rid="j_jds1019_aff_001">1</xref>
</contrib>
<contrib contrib-type="author">
<contrib-id contrib-id-type="orcid">https://orcid.org/0000-0001-8986-0354</contrib-id>
<name><surname>Zhang</surname><given-names>Yuping</given-names></name><email xlink:href="mailto:yuping.zhang@uconn.edu">yuping.zhang@uconn.edu</email><xref ref-type="aff" rid="j_jds1019_aff_001">1</xref><xref ref-type="corresp" rid="cor1">∗</xref>
</contrib>
<aff id="j_jds1019_aff_001"><label>1</label>Department of Statistics, <institution>University of Connecticut</institution>, Storrs, Connecticut, <country>USA</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>∗</label>Corresponding author. Email: <ext-link ext-link-type="uri" xlink:href="mailto:yuping.zhang@uconn.edu">yuping.zhang@uconn.edu</ext-link>.</corresp>
</author-notes>
<pub-date pub-type="ppub"><year>2021</year></pub-date><pub-date pub-type="epub"><day>16</day><month>8</month><year>2021</year></pub-date><volume>19</volume><issue>4</issue><fpage>555</fpage><lpage>568</lpage><supplementary-material id="S1" content-type="document" xlink:href="jds1019_s001.pdf" mimetype="application" mime-subtype="pdf">
<caption>
<title>Supplementary Material</title>
<p>The Supplementary Material includes descriptions of the data and software.</p>
</caption>
</supplementary-material><history><date date-type="received"><day>12</day><month>3</month><year>2021</year></date><date date-type="accepted"><day>17</day><month>7</month><year>2021</year></date></history>
<permissions><copyright-statement>2021 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.</copyright-statement><copyright-year>2021</copyright-year>
<license license-type="open-access" xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>Open access article under the <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">CC BY</ext-link> license.</license-p></license></permissions>
<abstract>
<p>Large-scale genomics studies give researchers access to datasets of extensive detail and unprecedented scope, encompassing not only genes but also more experimental functional units, including non-coding microRNAs (miRNAs). Analyzing these high-fidelity data while remaining faithful to the underlying biology requires statistical methods that reflect the current understanding in molecular biology yet remain flexible enough to handle a wide range of data and complex phenomena. Leveraging multiple omics datasets, miRNA-gene targets, and signaling pathway topology, we present an integrative linear model for analyzing signaling pathways. Specifically, we use a mixed linear model to characterize tumor and healthy tissue, and we perform statistical significance tests to identify pathway disturbances. We carry out a pan-cancer analysis across a wide range of signaling pathways. We discuss specific findings from this analysis and describe a publicly available interactive data visualization that presents the full set of our results.</p>
</abstract>
<kwd-group>
<label>Keywords</label>
<kwd>hypothesis testing</kwd>
<kwd>integrative statistical learning</kwd>
<kwd>large-scale inference</kwd>
<kwd>multi-view data integration</kwd>
</kwd-group>
</article-meta>
</front>
<back>
<ref-list id="j_jds1019_reflist_001">
<title>References</title>
<ref id="j_jds1019_ref_001">
<mixed-citation publication-type="journal"> <string-name><surname>Agarwal</surname> <given-names>V</given-names></string-name>, <string-name><surname>Bell</surname> <given-names>GW</given-names></string-name>, <string-name><surname>Nam</surname> <given-names>JW</given-names></string-name>, <string-name><surname>Bartel</surname> <given-names>DP</given-names></string-name> (<year>2015</year>). <article-title>Predicting effective microRNA target sites in mammalian mRNAs</article-title>. <source>eLife</source>, <volume>4</volume>: <elocation-id>e05005</elocation-id>.</mixed-citation>
</ref>
<ref id="j_jds1019_ref_002">
<mixed-citation publication-type="journal"> <string-name><surname>Backes</surname> <given-names>C</given-names></string-name>, <string-name><surname>Kehl</surname> <given-names>T</given-names></string-name>, <string-name><surname>Stöckel</surname> <given-names>D</given-names></string-name>, <string-name><surname>Fehlmann</surname> <given-names>T</given-names></string-name>, <string-name><surname>Schneider</surname> <given-names>L</given-names></string-name>, <string-name><surname>Meese</surname> <given-names>E</given-names></string-name>, <etal>et al.</etal> (<year>2016</year>). <article-title>miRPathDB: A new dictionary on microRNAs and target pathways</article-title>. <source>Nucleic Acids Research</source>, <volume>45</volume>(<issue>D1</issue>): <fpage>D90</fpage>–<lpage>D96</lpage>. <elocation-id>gkw926</elocation-id>.</mixed-citation>
</ref>
<ref id="j_jds1019_ref_003">
<mixed-citation publication-type="journal"> <string-name><surname>Benjamini</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Hochberg</surname> <given-names>Y</given-names></string-name> (<year>1995</year>). <article-title>Controlling the false discovery rate: A practical and powerful approach to multiple testing</article-title>. <source>Journal of the Royal Statistical Society, Series B, Methodological</source>, <volume>57</volume>(<issue>1</issue>): <fpage>289</fpage>–<lpage>300</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1019_ref_004">
<mixed-citation publication-type="journal"> <string-name><surname>Cai</surname> <given-names>T</given-names></string-name>, <string-name><surname>Cai</surname> <given-names>TT</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>A</given-names></string-name> (<year>2016</year>). <article-title>Structured matrix completion with applications to genomic data integration</article-title>. <source>Journal of the American Statistical Association</source>, <volume>111</volume>(<issue>514</issue>): <fpage>621</fpage>–<lpage>633</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1019_ref_005">
<mixed-citation publication-type="journal"> <string-name><surname>Dhawan</surname> <given-names>A</given-names></string-name>, <string-name><surname>Scott</surname> <given-names>JG</given-names></string-name>, <string-name><surname>Harris</surname> <given-names>AL</given-names></string-name>, <string-name><surname>Buffa</surname> <given-names>FM</given-names></string-name> (<year>2018</year>). <article-title>Pan-cancer characterisation of microRNA across cancer hallmarks reveals microRNA-mediated downregulation of tumour suppressors</article-title>. <source>Nature Communications</source>, <volume>9</volume>(<issue>1</issue>): <fpage>5228</fpage>.</mixed-citation>
</ref>
<ref id="j_jds1019_ref_006">
<mixed-citation publication-type="journal"> <string-name><surname>Falzone</surname> <given-names>L</given-names></string-name>, <string-name><surname>Scola</surname> <given-names>L</given-names></string-name>, <string-name><surname>Zanghì</surname> <given-names>A</given-names></string-name>, <string-name><surname>Biondi</surname> <given-names>A</given-names></string-name>, <string-name><surname>Di Cataldo</surname> <given-names>A</given-names></string-name>, <string-name><surname>Libra</surname> <given-names>M</given-names></string-name>, <etal>et al.</etal> (<year>2018</year>). <article-title>Integrated analysis of colorectal cancer microRNA datasets: Identification of microRNAs associated with tumor development</article-title>. <source>Aging (Albany NY)</source>, <volume>10</volume>(<issue>5</issue>): <fpage>1000</fpage>.</mixed-citation>
</ref>
<ref id="j_jds1019_ref_007">
<mixed-citation publication-type="journal"> <string-name><surname>Grossman</surname> <given-names>RL</given-names></string-name>, <string-name><surname>Heath</surname> <given-names>AP</given-names></string-name>, <string-name><surname>Ferretti</surname> <given-names>V</given-names></string-name>, <string-name><surname>Varmus</surname> <given-names>HE</given-names></string-name>, <string-name><surname>Lowy</surname> <given-names>DR</given-names></string-name>, <string-name><surname>Kibbe</surname> <given-names>WA</given-names></string-name>, <etal>et al.</etal> (<year>2016</year>). <article-title>Toward a shared vision for cancer genomic data</article-title>. <source>The New England Journal of Medicine</source>, <volume>375</volume>(<issue>12</issue>): <fpage>1109</fpage>–<lpage>1112</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1019_ref_008">
<mixed-citation publication-type="journal"> <string-name><surname>Gurbuz</surname> <given-names>N</given-names></string-name>, <string-name><surname>Ozpolat</surname> <given-names>B</given-names></string-name> (<year>2019</year>). <article-title>MicroRNA-based targeted therapeutics in pancreatic cancer</article-title>. <source>Anticancer Research</source>, <volume>39</volume>(<issue>2</issue>): <fpage>529</fpage>–<lpage>532</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1019_ref_009">
<mixed-citation publication-type="journal"> <string-name><surname>Hamilton</surname> <given-names>MP</given-names></string-name>, <string-name><surname>Rajapakshe</surname> <given-names>K</given-names></string-name>, <string-name><surname>Hartig</surname> <given-names>SM</given-names></string-name>, <string-name><surname>Reva</surname> <given-names>B</given-names></string-name>, <string-name><surname>McLellan</surname> <given-names>MD</given-names></string-name>, <string-name><surname>Kandoth</surname> <given-names>C</given-names></string-name>, <etal>et al.</etal> (<year>2013</year>). <article-title>Identification of a pan-cancer oncogenic microRNA superfamily anchored by a central core seed motif</article-title>. <source>Nature Communications</source>, <volume>4</volume>: <fpage>2730</fpage>.</mixed-citation>
</ref>
<ref id="j_jds1019_ref_010">
<mixed-citation publication-type="journal"> <string-name><surname>Kim</surname> <given-names>S</given-names></string-name> (<year>2015</year>). <article-title>ppcor: An R package for a fast calculation to semi-partial correlation coefficients</article-title>. <source>Communications for Statistical Applications and Methods</source>, <volume>22</volume>(<issue>6</issue>): <fpage>665</fpage>.</mixed-citation>
</ref>
<ref id="j_jds1019_ref_011">
<mixed-citation publication-type="journal"> <string-name><surname>Kim</surname> <given-names>YH</given-names></string-name>, <string-name><surname>Goh</surname> <given-names>TS</given-names></string-name>, <string-name><surname>Lee</surname> <given-names>CS</given-names></string-name>, <string-name><surname>Oh</surname> <given-names>SO</given-names></string-name>, <string-name><surname>Kim</surname> <given-names>JI</given-names></string-name>, <string-name><surname>Jeung</surname> <given-names>SH</given-names></string-name>, <etal>et al.</etal> (<year>2017</year>). <article-title>Prognostic value of microRNAs in osteosarcoma: A meta-analysis</article-title>. <source>Oncotarget</source>, <volume>8</volume>(<issue>5</issue>): <fpage>8726</fpage>.</mixed-citation>
</ref>
<ref id="j_jds1019_ref_012">
<mixed-citation publication-type="journal"> <string-name><surname>Krämer</surname> <given-names>N</given-names></string-name>, <string-name><surname>Schäfer</surname> <given-names>J</given-names></string-name>, <string-name><surname>Boulesteix</surname> <given-names>AL</given-names></string-name> (<year>2009</year>). <article-title>Regularized estimation of large-scale gene association networks using graphical Gaussian models</article-title>. <source>BMC Bioinformatics</source>, <volume>10</volume>(<issue>1</issue>): <fpage>384</fpage>.</mixed-citation>
</ref>
<ref id="j_jds1019_ref_013">
<mixed-citation publication-type="journal"> <string-name><surname>Kwon</surname> <given-names>H</given-names></string-name>, <string-name><surname>Song</surname> <given-names>K</given-names></string-name>, <string-name><surname>Han</surname> <given-names>C</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>J</given-names></string-name>, <string-name><surname>Lu</surname> <given-names>L</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>W</given-names></string-name>, <etal>et al.</etal> (<year>2017</year>). <article-title>Epigenetic silencing of miRNA-34a in human cholangiocarcinoma via EZH2 and DNA methylation: Impact on regulation of Notch pathway</article-title>. <source>The American Journal of Pathology</source>, <volume>187</volume>(<issue>10</issue>): <fpage>2288</fpage>–<lpage>2299</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1019_ref_014">
<mixed-citation publication-type="journal"> <string-name><surname>Lewis</surname> <given-names>BP</given-names></string-name>, <string-name><surname>Burge</surname> <given-names>CB</given-names></string-name>, <string-name><surname>Bartel</surname> <given-names>DP</given-names></string-name> (<year>2005</year>). <article-title>Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets</article-title>. <source>Cell</source>, <volume>120</volume>(<issue>1</issue>): <fpage>15</fpage>–<lpage>20</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1019_ref_015">
<mixed-citation publication-type="journal"> <string-name><surname>Linder</surname> <given-names>H</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>Y</given-names></string-name> (<year>2019</year>). <article-title>Iterative integrated imputation for missing data and pathway models with applications to breast cancer subtypes</article-title>. <source>Communications for Statistical Applications and Methods</source>, <volume>26</volume>(<issue>4</issue>): <fpage>411</fpage>–<lpage>430</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1019_ref_016">
<mixed-citation publication-type="chapter"> <string-name><surname>Loh</surname> <given-names>PL</given-names></string-name>, <string-name><surname>Wainwright</surname> <given-names>MJ</given-names></string-name> (<year>2012</year>). <chapter-title>Structure estimation for discrete graphical models: Generalized covariance matrices and their inverses</chapter-title>. In: <source>Advances in Neural Information Processing Systems</source>, <fpage>2087</fpage>–<lpage>2095</lpage>. <publisher-name>Curran Associates, Inc.</publisher-name></mixed-citation>
</ref>
<ref id="j_jds1019_ref_017">
<mixed-citation publication-type="journal"> <string-name><surname>Schaefer</surname> <given-names>CF</given-names></string-name>, <string-name><surname>Anthony</surname> <given-names>K</given-names></string-name>, <string-name><surname>Krupa</surname> <given-names>S</given-names></string-name>, <string-name><surname>Buchoff</surname> <given-names>J</given-names></string-name>, <string-name><surname>Day</surname> <given-names>M</given-names></string-name>, <string-name><surname>Hannay</surname> <given-names>T</given-names></string-name>, <etal>et al.</etal> (<year>2008</year>). <article-title>PID: The pathway interaction database</article-title>. <source>Nucleic Acids Research</source>, <volume>37</volume>(<issue>suppl_1</issue>): <fpage>D674</fpage>–<lpage>D679</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1019_ref_018">
<mixed-citation publication-type="journal"> <string-name><surname>Shojaie</surname> <given-names>A</given-names></string-name>, <string-name><surname>Michailidis</surname> <given-names>G</given-names></string-name> (<year>2009</year>). <article-title>Analysis of gene sets based on the underlying regulatory network</article-title>. <source>Journal of Computational Biology</source>, <volume>16</volume>(<issue>3</issue>): <fpage>407</fpage>–<lpage>426</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1019_ref_019">
<mixed-citation publication-type="journal"> <string-name><surname>Shojaie</surname> <given-names>A</given-names></string-name>, <string-name><surname>Michailidis</surname> <given-names>G</given-names></string-name> (<year>2010</year>). <article-title>Network enrichment analysis in complex experiments</article-title>. <source>Statistical Applications in Genetics and Molecular Biology</source>, <volume>9</volume>(<issue>1</issue>): Article <elocation-id>22</elocation-id>, <comment>34 pages</comment>.</mixed-citation>
</ref>
<ref id="j_jds1019_ref_020">
<mixed-citation publication-type="journal"> <string-name><surname>Stokowy</surname> <given-names>T</given-names></string-name>, <string-name><surname>Eszlinger</surname> <given-names>M</given-names></string-name>, <string-name><surname>Świerniak</surname> <given-names>M</given-names></string-name>, <string-name><surname>Fujarewicz</surname> <given-names>K</given-names></string-name>, <string-name><surname>Jarząb</surname> <given-names>B</given-names></string-name>, <string-name><surname>Paschke</surname> <given-names>R</given-names></string-name>, <etal>et al.</etal> (<year>2014</year>). <article-title>Analysis options for high-throughput sequencing in miRNA expression profiling</article-title>. <source>BMC Research Notes</source>, <volume>7</volume>(<issue>1</issue>): <fpage>144</fpage>.</mixed-citation>
</ref>
<ref id="j_jds1019_ref_021">
<mixed-citation publication-type="journal"> <string-name><surname>Sun</surname> <given-names>M</given-names></string-name>, <string-name><surname>Song</surname> <given-names>H</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>S</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>C</given-names></string-name>, <string-name><surname>Zheng</surname> <given-names>L</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>F</given-names></string-name>, <etal>et al.</etal> (<year>2017</year>). <article-title>Integrated analysis identifies microRNA-195 as a suppressor of Hippo-YAP pathway in colorectal cancer</article-title>. <source>Journal of Hematology &amp; Oncology</source>, <volume>10</volume>(<issue>1</issue>): <fpage>79</fpage>.</mixed-citation>
</ref>
<ref id="j_jds1019_ref_022">
<mixed-citation publication-type="journal"> <string-name><surname>Tokar</surname> <given-names>T</given-names></string-name>, <string-name><surname>Pastrello</surname> <given-names>C</given-names></string-name>, <string-name><surname>Rossos</surname> <given-names>AE</given-names></string-name>, <string-name><surname>Abovsky</surname> <given-names>M</given-names></string-name>, <string-name><surname>Hauschild</surname> <given-names>AC</given-names></string-name>, <string-name><surname>Tsay</surname> <given-names>M</given-names></string-name>, <etal>et al.</etal> (<year>2017</year>). <article-title>mirDIP 4.1-integrative database of human microRNA target predictions</article-title>. <source>Nucleic Acids Research</source>, <volume>46</volume>(<issue>D1</issue>): <fpage>D360</fpage>–<lpage>D370</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1019_ref_023">
<mixed-citation publication-type="journal"> <string-name><surname>Tomczak</surname> <given-names>K</given-names></string-name>, <string-name><surname>Czerwińska</surname> <given-names>P</given-names></string-name>, <string-name><surname>Wiznerowicz</surname> <given-names>M</given-names></string-name> (<year>2015</year>). <article-title>The Cancer Genome Atlas (TCGA): An immeasurable source of knowledge</article-title>. <source>Contemporary Oncology</source>, <volume>19</volume>(<issue>1A</issue>): <fpage>A68</fpage>.</mixed-citation>
</ref>
<ref id="j_jds1019_ref_024">
<mixed-citation publication-type="journal"> <string-name><surname>Wei</surname> <given-names>L</given-names></string-name>, <string-name><surname>Jin</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Yang</surname> <given-names>S</given-names></string-name>, <string-name><surname>Xu</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Zhu</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Ji</surname> <given-names>Y</given-names></string-name> (<year>2017</year>). <article-title>TCGA-assembler 2: Software pipeline for retrieval and processing of TCGA/CPTAC data</article-title>. <source>Bioinformatics</source>, <volume>34</volume>(<issue>9</issue>): <fpage>1615</fpage>–<lpage>1617</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1019_ref_025">
<mixed-citation publication-type="journal"> <string-name><surname>Wong</surname> <given-names>NW</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>S</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>X</given-names></string-name> (<year>2017</year>). <article-title>OncomiR: An online resource for exploring pan-cancer microRNA dysregulation</article-title>. <source>Bioinformatics</source>, <volume>34</volume>(<issue>4</issue>): <fpage>713</fpage>–<lpage>715</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1019_ref_026">
<mixed-citation publication-type="journal"> <string-name><surname>Zhang</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Linder</surname> <given-names>MH</given-names></string-name>, <string-name><surname>Shojaie</surname> <given-names>A</given-names></string-name>, <string-name><surname>Ouyang</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Shen</surname> <given-names>R</given-names></string-name>, <string-name><surname>Baggerly</surname> <given-names>KA</given-names></string-name>, <etal>et al.</etal> (<year>2018</year>). <article-title>Dissecting pathway disturbances using network topology and multi-platform genomics data</article-title>. <source>Statistics in Biosciences</source>, <volume>10</volume>: <fpage>86</fpage>–<lpage>106</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1019_ref_027">
<mixed-citation publication-type="journal"> <string-name><surname>Zhu</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Qiu</surname> <given-names>P</given-names></string-name>, <string-name><surname>Ji</surname> <given-names>Y</given-names></string-name> (<year>2014</year>). <article-title>TCGA-assembler: Open-source software for retrieving and processing TCGA data</article-title>. <source>Nature Methods</source>, <volume>11</volume>(<issue>6</issue>): <fpage>599</fpage>.</mixed-citation>
</ref>
</ref-list>
</back>
</article>
