Oral Presentation 31st Lorne Cancer Conference 2019

Chromatin interactome mapping identifies target genes at breast cancer risk signals (#12)

Stacey Edwards 1
  1. Queensland Institute of Medical Research, Herston, QLD, Australia

Genome-wide association studies (GWAS) for breast cancer have identified 196 independent signals associated with increased risk. The majority of risk-associated variants within these signals fall in regulatory sequences, such as enhancers, that control gene expression. Here, we perform in situ Capture Hi-C using a high-resolution breast cancer susceptibility Variant Capture array (VCHi-C), which includes probes to cover all credible risk variants. We apply VCHi-C and Promoter Capture Hi-C (PCHi-C) to link risk variants to their target genes in six human mammary epithelial and breast cancer cell lines. We use the CHiCAGO pipeline to assign confidence scores to interactions and identify 10-27,000 high-confidence interactions per cell type. Hierarchical clustering of CHiCAGO interaction scores stratifies cell lines by estrogen receptor status, suggesting cell-type specificity of the interactomes. Global analysis of promoter-interacting regions (PIRs) shows strong enrichment for cell-type-specific accessible chromatin (ATAC-seq, DNase-seq), histone marks for active enhancers (e.g. H3K27ac, H3K4me1) and transcription factor binding motifs (e.g. GATA3, FOXA1), supporting the regulatory potential of many PIRs. Similarly, analysis of variant-interacting regions (VIRs) shows enrichment of expressed genes in the relevant cell type. In total, validated CHiCAGO-identified interactions yield 651 target genes at 139 breast cancer risk signals. To further prioritise the CHi-C-derived chromatin interactions, we use a recently developed Bayesian framework to fine-map the direct contacts. Importantly, the combined PCHi-C and VCHi-C fine-mapping enables us to prioritise 839 out of 4208 highly correlated risk variants, including 33 signals that are potentially reduced to fewer than five risk variants, and lowers the total number of target genes to 181.
Gene ontology analyses reveal that the prioritised target genes are enriched for known cancer drivers and transcription factors as well as genes in the developmental, immune-system and DNA-integrity checkpoint pathways. Our results demonstrate the power of combining genetics, computational genomics and molecular studies to rationalise the identification of key variants and target genes at GWAS-identified risk regions.