Poster Presentation 31st Lorne Cancer Conference 2019

Using protein structure and function information to predict and understand variant pathogenicity (#120)

David B Ascher 1
  1. University of Melbourne, Parkville, VICTORIA, Australia

One of the challenges in integrating genomic information into clinical oncology is the characterisation of novel variants, many of which remain variants of uncertain significance. This is further complicated by the multitude of effects a mutation may have. We have developed a suite of programs that uses protein structural information to predict the molecular consequences of coding variants on protein structure and function.

In the InSiGHT database for hereditary colorectal cancer and related diseases, over 30% of the collated variants are of uncertain significance. By mapping characterised variants onto the protein structures, we showed that likely-pathogenic variants were situated in regions with lower tolerance to missense mutation, had larger destabilising effects on protein structure, and were located closer to ligand binding sites. Using a Random Forest algorithm, we trained a predictive model that accurately distinguished likely-pathogenic variants from likely-benign variants (ROC AUC = 0.94) and from population variants (ROC AUC = 0.99).
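
The ROC AUC values quoted above can be read as the probability that a randomly chosen likely-pathogenic variant receives a higher model score than a randomly chosen variant from the comparison group. The Python sketch below illustrates that calculation in its pairwise (Mann-Whitney) form. The function name `roc_auc` and the example labels and scores are assumptions made for illustration; they are not the authors' code or data.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC via the pairwise (Mann-Whitney U) formulation.

    labels: 1 for the positive class (e.g. likely-pathogenic), 0 otherwise.
    scores: classifier output, where higher means more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    # Count positive/negative pairs that are ranked correctly;
    # tied scores contribute half a correctly ranked pair.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical scores for illustration only, not data from the study.
example_labels = [1, 1, 1, 0, 0, 0]
example_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
example_auc = roc_auc(example_labels, example_scores)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two groups. The pairwise loop is quadratic in the number of variants, which is acceptable for a sketch; a rank-based implementation would be preferable for large datasets.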

We used this approach to accurately identify the risk of renal carcinoma in patients carrying mutations in the genes VHL and CDKN2B. This revealed that the effects of variants in these genes on protein stability and on interactions with specific molecular partners could be used to accurately classify an individual’s disease risk, even for novel variants with no previous clinical data. In a large prospective trial (n = 3620), no patients classified as low risk developed renal carcinoma, whilst 92% of patients classified as high risk went on to develop it. Similarly, the molecular consequences of succinate dehydrogenase variants were highly predictive of both the chance of developing a malignant paraganglioma (p = 0.032) and patient life expectancy (p = 0.002).

We have demonstrated that protein structural and functional features are highly discriminatory between, and highly predictive of, pathogenicity classes. This information can provide a powerful and scalable approach to interpreting genomic data, understanding how they relate to clinical outcomes, and guiding future drug development.