Meta-analysis of gene-expression profiles in breast cancer: towards a unified understanding of breast cancer sub-typing and prognosis signatures

C. Sotiriou, P. Wirapati, S. Kunkel, P. Farmer, S. Pradervand, B. Haibe-Kains, C. Desmedt, T. Sengstag, F. Schütz, D. R. Goldstein, M. Delorenzi, M. Piccart

Background: Breast cancer sub-typing and prognosis have been extensively studied by gene expression profiling, resulting in disparate signatures with little overlap in their constituent genes. The biological roles of individual genes in a signature, the equivalence of several signatures and their relation to conventional prognostic factors are still unclear.

Methods: Here we undertook a comprehensive meta-analysis of publicly available gene-expression and clinical data from 18 studies totaling 2833 breast tumor samples. The concept of co-expression modules (comprehensive lists of genes with highly correlated expression) was used extensively to reveal the common thread connecting molecular sub-typing and several prognostic signatures, as well as conventional clinico-pathological prognostic factors.
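To make the notion of a co-expression module concrete, the following is a minimal sketch and not the procedure used in this study. It assumes that a module is built around a single prototype gene using a Pearson correlation cutoff, and that a per-sample module score is the mean of z-scored expression over the module genes. The function names, the cutoff of 0.8, the placeholder genes GENE_A to GENE_C and the toy expression values are all hypothetical.

```python
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant gene carries no co-expression information
    return cov / (var_x * var_y) ** 0.5


def build_module(expr, prototype, min_r=0.8):
    """Genes whose expression correlates with the prototype at or above min_r.

    Assumption: a positive min_r, so constant genes are always excluded.
    """
    ref = expr[prototype]
    return sorted(g for g, values in expr.items() if pearson(values, ref) >= min_r)


def module_scores(expr, module):
    """Per-sample score: mean z-scored expression across the module genes."""
    z_rows = []
    for gene in module:
        values = expr[gene]
        m, s = mean(values), pstdev(values)
        z_rows.append([(v - m) / s for v in values])
    n_samples = len(z_rows[0])
    return [mean(row[i] for row in z_rows) for i in range(n_samples)]


# Toy data invented for illustration: gene -> expression across four tumors.
expr = {
    "ESR1": [1.0, 2.0, 8.0, 9.0],
    "GENE_A": [1.5, 2.5, 7.0, 9.5],   # tracks ESR1 closely
    "GENE_B": [9.0, 8.0, 2.0, 1.0],   # anti-correlated with ESR1
    "GENE_C": [5.0, 4.0, 5.0, 4.0],   # essentially unrelated to ESR1
}

er_module = build_module(expr, "ESR1")
er_scores = module_scores(expr, er_module)
```

In this toy example only the gene that tracks the prototype joins the module, and the resulting score orders the four tumors from low to high module activity. A continuous score of this kind is what allows a tumor to be described as intermediate for a module rather than simply positive or negative.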

Results: Breast tumors were consistently grouped into three main subtypes corresponding roughly to ER-/ERBB2- (basal), ERBB2+ and ER+ (luminal) tumors. ERBB2+ tumors showed an intermediate estrogen receptor module score, which is not obvious from the traditional combination of ER and ERBB2 marker status. Both the ER-/ERBB2- and ERBB2+ subtypes were characterized by high proliferation, whereas the ER+ subtype appeared to be more heterogeneous. Using our meta-analytical approach, we identified 524 genes that were significantly associated with survival. Of these 524 prognostic genes, 65% were strongly co-expressed with proliferation, 14% with ER, 0.6% with ERBB2, 2.7% with tumor invasion, 1.5% with immune response and 16% with none of our co-expression modules. All previously reported prognostic signatures examined in this meta-analysis (N=9), despite the disparity in their gene lists, carried similar prognostic information, with proliferation genes being the common driving force. They were all very useful for determining the risk of recurrence in the ER+ subgroup and much less informative for ER- and ERBB2+ disease. Combining the signatures did not improve their performance. Finally, in multivariate analysis, nodal status and tumor size still retained independent prognostic information.

Conclusions: This meta-analysis unifies various results of previous gene-expression studies in breast cancer. It reveals connections between traditional prognostic factors, expression-based sub-typing and prognostic signatures, highlighting the important role of proliferation in breast cancer prognosis.