Pan-cancer subtyping in a 2D-map shows substructures that are driven by specific combinations of molecular characteristics

Erdogan Taskesen; Sjoerd Huisman; Ahmed Mahfouz; Jesse Krijthe; Jeroen de Ridder; A. van de Stolpe; Erik van den Akker; Wim Verhaegh; Marcel Reinders

doi:10.1038/srep24949

Pattern Recognition and Bioinformatics

Research output: Contribution to journal › Article › Scientific › peer-review

18 Citations (Scopus)

70 Downloads (Pure)

Abstract

The use of genome-wide data in cancer research, for the identification of groups of patients with similar molecular characteristics, has become a standard approach for applications in therapy-response, prognosis-prediction, and drug-development. To progress in these applications, the trend is to move from single genome-wide measurements in a single cancer-type towards measuring several different molecular characteristics across multiple cancer-types. Although current approaches shed light on molecular characteristics of various cancer-types, detailed relationships between patients within cancer clusters are unclear. We propose a novel multi-omic integration approach that exploits the joint behavior of the different molecular characteristics, supports visual exploration of the data by a two-dimensional landscape, and inspection of the contribution of the different genome-wide data-types. We integrated 4,434 samples across 19 cancer-types, derived from TCGA, containing gene expression, DNA-methylation, copy-number variation and microRNA expression data. Cluster analysis revealed 18 clusters, where three clusters showed a complex collection of cancer-types, squamous-cell-carcinoma, colorectal cancers, and a novel grouping of kidney-cancers. Sixty-four samples were identified outside their tissue-of-origin cluster. Known and novel patient subgroups were detected for Acute Myeloid Leukemia’s, and breast cancers. Quantification of the contributions of the different molecular types showed that substructures are driven by specific (combinations of) molecular characteristics.

Original language	English
Article number	24949
Number of pages	13
Journal	Scientific Reports
Volume	6
DOIs	https://doi.org/10.1038/srep24949
Publication status	Published - 25 Apr 2016

Bibliographical note

Concerning DOI 10.1038/s41598-018-35518-w: This Article contains a typographical error in the spelling of the author Wim Verhaegh, which is incorrectly given as Wim Verheagh. The correct version has been uploaded.

Keywords

Cancer
Data integration
Data mining
Functional clustering
OA-Fund TU Delft

Access to Document

10.1038/srep24949

s41598-018-35518-w: Final published version, 653 KB, Licence: CC BY

Cite this

@article{51e6cd5ee77c4bc7abad4b0c9d290c08,

title = "Pan-cancer subtyping in a 2D-map shows substructures that are driven by specific combinations of molecular characteristics",

abstract = "The use of genome-wide data in cancer research, for the identification of groups of patients with similar molecular characteristics, has become a standard approach for applications in therapy-response, prognosis-prediction, and drug-development. To progress in these applications, the trend is to move from single genome-wide measurements in a single cancer-type towards measuring several different molecular characteristics across multiple cancer-types. Although current approaches shed light on molecular characteristics of various cancer-types, detailed relationships between patients within cancer clusters are unclear. We propose a novel multi-omic integration approach that exploits the joint behavior of the different molecular characteristics, supports visual exploration of the data by a two-dimensional landscape, and inspection of the contribution of the different genome-wide data-types. We integrated 4,434 samples across 19 cancer-types, derived from TCGA, containing gene expression, DNA-methylation, copy-number variation and microRNA expression data. Cluster analysis revealed 18 clusters, where three clusters showed a complex collection of cancer-types, squamous-cell-carcinoma, colorectal cancers, and a novel grouping of kidney-cancers. Sixty-four samples were identified outside their tissue-of-origin cluster. Known and novel patient subgroups were detected for Acute Myeloid Leukemia{\textquoteright}s, and breast cancers. Quantification of the contributions of the different molecular types showed that substructures are driven by specific (combinations of) molecular characteristics.",

keywords = "Cancer, Data integration, Data mining, Functional clustering, OA-Fund TU Delft",

author = "Erdogan Taskesen and Sjoerd Huisman and Ahmed Mahfouz and Jesse Krijthe and {de Ridder}, Jeroen and {van de Stolpe}, A. and {van den Akker}, Erik and Wim Verhaegh and Marcel Reinders",

note = "Concerning DOI 10.1038/s41598-018-35518-w: This Article contains a typographical error in the spelling of the author Wim Verhaegh, which is incorrectly given as Wim Verheagh. The correct version has been uploaded.",

year = "2016",

month = apr,

day = "25",

doi = "10.1038/srep24949",

language = "English",

volume = "6",

journal = "Scientific Reports",

issn = "2045-2322",

publisher = "Nature",

}

TY - JOUR

T1 - Pan-cancer subtyping in a 2D-map shows substructures that are driven by specific combinations of molecular characteristics

AU - Taskesen, Erdogan

AU - Huisman, Sjoerd

AU - Mahfouz, Ahmed

AU - Krijthe, Jesse

AU - de Ridder, Jeroen

AU - van de Stolpe, A.

AU - van den Akker, Erik

AU - Verhaegh, Wim

AU - Reinders, Marcel

N1 - Concerning DOI 10.1038/s41598-018-35518-w: This Article contains a typographical error in the spelling of the author Wim Verhaegh, which is incorrectly given as Wim Verheagh. The correct version has been uploaded.

PY - 2016/4/25

Y1 - 2016/4/25

N2 - The use of genome-wide data in cancer research, for the identification of groups of patients with similar molecular characteristics, has become a standard approach for applications in therapy-response, prognosis-prediction, and drug-development. To progress in these applications, the trend is to move from single genome-wide measurements in a single cancer-type towards measuring several different molecular characteristics across multiple cancer-types. Although current approaches shed light on molecular characteristics of various cancer-types, detailed relationships between patients within cancer clusters are unclear. We propose a novel multi-omic integration approach that exploits the joint behavior of the different molecular characteristics, supports visual exploration of the data by a two-dimensional landscape, and inspection of the contribution of the different genome-wide data-types. We integrated 4,434 samples across 19 cancer-types, derived from TCGA, containing gene expression, DNA-methylation, copy-number variation and microRNA expression data. Cluster analysis revealed 18 clusters, where three clusters showed a complex collection of cancer-types, squamous-cell-carcinoma, colorectal cancers, and a novel grouping of kidney-cancers. Sixty-four samples were identified outside their tissue-of-origin cluster. Known and novel patient subgroups were detected for Acute Myeloid Leukemia’s, and breast cancers. Quantification of the contributions of the different molecular types showed that substructures are driven by specific (combinations of) molecular characteristics.

AB - The use of genome-wide data in cancer research, for the identification of groups of patients with similar molecular characteristics, has become a standard approach for applications in therapy-response, prognosis-prediction, and drug-development. To progress in these applications, the trend is to move from single genome-wide measurements in a single cancer-type towards measuring several different molecular characteristics across multiple cancer-types. Although current approaches shed light on molecular characteristics of various cancer-types, detailed relationships between patients within cancer clusters are unclear. We propose a novel multi-omic integration approach that exploits the joint behavior of the different molecular characteristics, supports visual exploration of the data by a two-dimensional landscape, and inspection of the contribution of the different genome-wide data-types. We integrated 4,434 samples across 19 cancer-types, derived from TCGA, containing gene expression, DNA-methylation, copy-number variation and microRNA expression data. Cluster analysis revealed 18 clusters, where three clusters showed a complex collection of cancer-types, squamous-cell-carcinoma, colorectal cancers, and a novel grouping of kidney-cancers. Sixty-four samples were identified outside their tissue-of-origin cluster. Known and novel patient subgroups were detected for Acute Myeloid Leukemia’s, and breast cancers. Quantification of the contributions of the different molecular types showed that substructures are driven by specific (combinations of) molecular characteristics.

KW - Cancer

KW - Data integration

KW - Data mining

KW - Functional clustering

KW - OA-Fund TU Delft

UR - http://resolver.tudelft.nl/uuid:51e6cd5e-e77c-4bc7-abad-4b0c9d290c08

U2 - 10.1038/srep24949

DO - 10.1038/srep24949

M3 - Article

SN - 2045-2322

VL - 6

JO - Scientific Reports

JF - Scientific Reports

M1 - 24949

ER -
