Increasingly, the molecular characterization of tumors is playing a crucial role in the diagnosis and treatment of colorectal cancer (CRC). The genomic landscape of a tumor can define its tissue of origin, the key driver genes leading to the onset and progression of the tumor, the cancer pathways that are activated or suppressed, and the prognosis for individual patients. The cost of generating the DNA and RNA sequencing data that enable this detailed portrait of a tumor has decreased at a super-Moore’s-law rate, making it possible to generate this information routinely for all cancer patients. Whole-genome and transcriptome sequencing casts a wide net over the landscape of genomic alterations that cause cancer and that define the broad spectrum of tumor expressivity leading to a diversity of outcomes. In this review, I will focus on state-of-the-art clinical genomic testing of CRCs and the implications of this type of testing for developing individualized treatment strategies.
Eric E. Schadt, Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY
Colorectal cancer (CRC) and other types of cancer represent diseases of the genome, where small nucleotide variations, larger insertions and deletions, large structural variations, and epigenetic changes accumulate in ways that activate and/or suppress critical molecular pathways that in turn alter cellular function, ultimately resulting in unconstrained cellular proliferation. Every cancer patient harbors a unique constellation of germline and somatic DNA variants and changes in their epigenome that at least partially define their risk of developing cancer, the severity and rate of progression of their tumor(s), and ultimately the ability of their tumor, and their system more generally, to respond to specific therapies. Given the unique constellation of mutations underlying any individual tumor and the array of molecular pathways that can be affected as a result, personalizing cancer therapy has not only become a well-established concept but also continues to make its way into the standard of care for a great diversity of cancer types.1
Today we are at the beginning stages of developing personalized therapy strategies that aim to ensure an optimal outcome for individual cancer patients. These developments are enabled by the rapid progress in basic cancer research at the molecular level, the rapid advancement of cost-effective technologies that make it possible to comprehensively characterize tumors at the molecular level, and an expanding repertoire of targeted cancer therapeutics. An increasing number of FDA-approved targeted cancer drugs are associated with companion biomarkers that are predictive of drug response in certain types of cancers, in addition to germline variants associated with drug metabolism that may also impact treatment response.2 In cases where a patient tests positive for a specific biomarker that indicates an FDA-approved therapy for the given tumor type, developing a personalized therapeutic strategy is straightforward. For example, vemurafenib is indicated as a treatment for metastatic melanoma tumors harboring the BRAF p.V600E mutation. However, for the vast majority of tumor types and available therapeutics, a biomarker-therapeutic link is not so straightforward. In this review, I will focus on state-of-the-art clinical genomic testing of CRCs and the implications of this type of testing for developing individualized treatment strategies.
A Revolution in Genomics Transforms the Molecular Characterization of Tumors
Nucleic acid sequencing technologies have evolved at an astonishing pace, one of the few, if not the only, technologies to move at a super-Moore’s-law rate with respect to sequencing a human genome.3 The current generation of DNA sequencers, commonly referred to as next-generation sequencing (NGS) technologies, delivers a high-throughput, low-cost way of generating whole-genome sequence (WGS), whole-exome sequence (WES), and whole-transcriptome RNA sequence data. These technologies are enabling a new paradigm in precision medicine for oncology, driven by large-scale NGS studies carried out over the last several years, which have uncovered novel oncogenic drivers and started to depict genetic landscapes across a number of cancer types.4-6 This research has not only advanced our understanding of the underlying genetics of cancer, but it has also accelerated progress towards personalized cancer therapy.7 Retrospective analyses of archived tumor samples using targeted gene panels or WES have recently been reported.8-11 Across these different studies, mutations that had the potential to impact clinical decision making (that is, mutations that were actionable) were identified in 80-90% of the tumor samples profiled. A number of prospective cancer genetic sequencing studies have also demonstrated this same level of clinical utility.12-15 These studies are among the first to directly demonstrate that comprehensive genomic characterization of an individual’s tumor can have a direct impact on their treatment choices.
Strategies for Personalizing the Treatment of CRC
We are at the beginning stages of determining the best strategies for genomic characterization of tumors to maximize impact on clinical decision making relating to the treatment of cancer patients. Presently, a wide spectrum of alternatives exists to profile tumors at the genomic level in order to understand the pathways that have been activated or suppressed in individual cancer patients, knowledge that can directly impact the choice of therapy. Foundation Medicine offers sequencing of tumor DNA using a targeted panel of a few hundred genes that cover the most commonly mutated genes across a diversity of cancer types. While such targeted panels may be informative in a majority of cases, they fall short in a number of ways. First, targeted panels sequenced only on tumor DNA do not capture variants in the germline that may aid in the interpretation of somatic variants. Second, they do not capture all genes and regulatory regions that may harbor mutations that impact key driver genes. Third, sequencing of targeted panels does not involve RNAseq data generated on RNA derived from a patient’s tumor, and thus such strategies will miss potentially important functional information on the tumor that may directly indicate the activation or suppression status of cancer pathways, even in the absence of key driver mutations in those pathways. Finally, targeted panel strategies do not capture the thousands of other genes that may be relevant to therapy choices, such as genes involved in metabolizing drugs.
At the other end of the cancer molecular profiling spectrum are companies like NantOmics that generate orders of magnitude more data, and more diverse data, than strategies based on DNA sequencing of targeted gene panels. NantOmics’ full suite of assays includes WGS carried out on tumor and germline DNA, RNA sequencing of tumor-derived RNA, and metabolomics and proteomics assays derived from the tumor to comprehensively and functionally characterize individual tumors. Such comprehensive molecular profiles generated on cancer patients are the most extensive to date, although the cost of generating such extensive data is prohibitive for most, and whether that cost is justified given the actionability of the findings that are specific to those data remains to be shown. For example, whether WGS offers significant advantages over WES in personalizing cancer therapy for patients remains to be proven. Beyond the direct characterization of tumor samples are sequence-based assays run on circulating tumor cells and cell-free DNA, technologies that promise to facilitate early detection of cancer, comprehensive characterization of the heterogeneity of tumors in a given individual, and long-term monitoring of cancer progression and remission.16,17 The oncology field more generally is actively engaged in building up different lines of evidence to support what these different technologies bring to the table in terms of diagnosing and treating cancer patients. Reimbursement for the application of these different tests is also not at all routine, with health insurers and health management systems waiting for evidence to accumulate that generating this type of information can have a meaningful impact on outcomes.
However, given that waves of new therapies targeting specific cancer pathways are making their way through clinical trials, understanding the state of the targetable pathways in individual cancer patients will largely come to determine treatment choices in the near future.
To provide more insight into the type of testing that can be carried out today to impact personalized CRC therapy choices, consider the workflow we employ for comprehensive molecular characterization of tumors (Figure 1). Our workflow is representative of a current state-of-the-art process that provides for multi-lab, multi-assay molecular profiling of tumor specimens, generation of genomic findings and interpretations relevant to clinical decisions, and their subsequent delivery to cancer patients and their treating physicians. The first step of this workflow is the isolation of normal DNA from peripheral blood or uninvolved normal tissue, tumor DNA from formalin-fixed, paraffin-embedded (FFPE) or fresh frozen tumor samples, and total RNA from fresh frozen tumor and adjacent normal tissue when available. The DNA and RNA are then interrogated with a variety of genomic technologies. The choice of assays to run on any given case can depend on the quantity and quality of the available material, the purity of the available tumor specimen, the heterogeneity of the tumor type, and cost.
The primary goal of generating and processing these data is to take an integrative approach that utilizes multi-platform genomic profiling data to generate, at the cellular level, molecular portraits of the oncogenic signaling networks underlying individual cancer cases. Recommendations of appropriate targeted therapeutics can then follow through the execution of manual or automated review processes performed in a case-by-case manner. DNA alterations that have clinical implications are identified as actionable alterations, while clinical trial connections, inclusion/exclusion criteria, trial location and open/close date information can be assembled from ClinicalTrials.gov and used to direct patients to the most appropriate clinical trials.
For CRC in particular, the data generated on any particular case can be interpreted in the context of all that is known in the available literature, genomic repositories, and clinical reports to make actionable recommendations given the molecular makeup of a tumor (Figure 2). The most straightforward utility of genomic CRC data is for predicting insensitivity to anti-EGFR antibodies.18 Additionally, selective targeting of the ERK pathway through inhibition of either BRAF or MEK can be well informed by the genomic data. More generally, the data can be interpreted in the context of common cancer pathways; the state of one pathway may be inferred given the state of another. For example, we may infer PI3K activation given the absence of ERK pathway activation, even though activating mutations are not directly observed in PIK3CA (one of the primary ways in which PI3K is activated). This inference would allow for the consideration of additional targeted strategies utilizing AKT/mTOR inhibition. Of course, observing driver mutations in well-established CRC-associated genes such as TP53, KRAS, NRAS, BRAF, and PIK3CA provides a primary framework from which a tumor can be classified,19 including assessing the risk of developing (or already having) metastatic disease, given that genes such as TP53 are known to promote metastasis in multiple cancer types.20
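The pathway-level reasoning in the paragraph above can be illustrated with a toy rule set. This is only a sketch under stated assumptions: the function name `summarize_pathways`, the gene groupings, the use of RAS mutations as the anti-EGFR flag, and the simplification that any reported mutation in a gene counts as activating are all hypothetical choices made for illustration. In practice these calls come out of expert review of the full molecular profile.

```python
# Illustrative gene sets; these groupings are assumptions made for this sketch.
ERK_PATHWAY_GENES = {"KRAS", "NRAS", "BRAF"}
RAS_GENES = {"KRAS", "NRAS"}


def summarize_pathways(mutated_genes):
    """Return a toy pathway summary from a collection of mutated gene symbols.

    Simplification: any listed mutation is treated as activating.
    """
    mutated = set(mutated_genes)
    erk_activated = bool(mutated & ERK_PATHWAY_GENES)

    if "PIK3CA" in mutated:
        pi3k_status = "observed"      # activating lesion seen directly
    elif not erk_activated:
        pi3k_status = "inferred"      # the text's example: no ERK activation
    else:
        pi3k_status = "not inferred"

    return {
        "erk_activated": erk_activated,
        "pi3k_status": pi3k_status,
        # Assumption: RAS mutations serve as the anti-EGFR insensitivity flag.
        "possible_anti_egfr_insensitivity": bool(mutated & RAS_GENES),
        # AKT/mTOR inhibition is considered when PI3K is observed or inferred.
        "consider_akt_mtor_inhibition": pi3k_status != "not inferred",
    }


if __name__ == "__main__":
    print(summarize_pathways(["TP53"]))
    print(summarize_pathways(["KRAS", "TP53"]))
```

Keeping the rules in one small function makes each inference explicit, so a reviewer can see which conclusions rest on an observed mutation and which rest on an inference from another pathway.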
Beyond small nucleotide variants (SNVs) in cancer driver genes, somatic copy number alterations (CNAs) affect a greater portion of cancer genomes than SNVs and play a critical role in activating oncogenes or inactivating tumor suppressors.21 A major challenge in CNA analysis is to differentiate driver CNAs that contribute to oncogenesis and cancer progression from passenger CNAs that are acquired during cancer development but do not have functional consequences. Common criteria for driver CNA predictions include amplicon size and association of gene expression with copy number alterations. In order to determine potential oncogenic driver CNA events with high confidence, the relationship between CNAs and gene expression based on RNAseq data must be examined. Gene fusions represent another key oncogenic event in many cancer types.22 Given the driver role gene fusions can play, most cancer pipelines today are adopting comprehensive computational components that incorporate multiple programs for detecting gene fusions from RNAseq data. Finally, in cases where a tumor identified in a patient has an unknown origin, the genomic sequence data can complement pathologic assessments to deliver a more accurate diagnosis, given that certain types of mutations or other genomic alterations are known to be specific to certain types of cancer.
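One of the criteria mentioned above, association of gene expression with copy number, can be sketched as a per-gene correlation across samples. This is a minimal illustration, not a description of any particular pipeline: the function names, the input layout (one list of per-sample values per gene, in the same sample order), and the `min_r` threshold of 0.6 are assumptions chosen for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two sequences of equal length >= 3")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # no variation, so no measurable association
    return sxy / sqrt(sxx * syy)


def candidate_driver_cnas(copy_number, expression, min_r=0.6):
    """Return {gene: r} for genes whose expression tracks copy number.

    copy_number, expression: dict mapping gene -> list of per-sample values.
    min_r is an illustrative cutoff, not an established standard.
    """
    hits = {}
    for gene, cn_values in copy_number.items():
        if gene not in expression:
            continue  # no RNAseq measurement for this gene
        r = pearson(cn_values, expression[gene])
        if r >= min_r:
            hits[gene] = r
    return hits


if __name__ == "__main__":
    # Hypothetical gene labels and made-up values, for illustration only.
    cn = {"GENE_A": [2, 2, 4, 6, 8], "GENE_B": [2, 4, 2, 6, 3]}
    expr = {"GENE_A": [10, 11, 20, 31, 39], "GENE_B": [5, 5, 5, 5, 5]}
    print(candidate_driver_cnas(cn, expr))
```

A gene whose amplification is not accompanied by increased expression (here the hypothetical `GENE_B`) is filtered out, which matches the intuition that such a CNA is more likely a passenger event.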
Once the processed genomic data, patient medical history and available pathology reports are in hand, they are today manually reviewed by a team of bioinformaticians, cancer molecular biologists, a medical oncologist, and a genetic counselor to produce an electronic document that summarizes clinically relevant findings. The results of this testing include a list of relevant somatic mutations/alterations, drugs whose benefit may be altered given the patient’s somatic or germline variant makeup, a prognostic biomarker summary, clinical trial recommendations (with an emphasis on those where enrollment criteria include variants detected in the patient), and a cancer pathway perturbation summary (similar to what is depicted in Figure 2). The information in these reports can then be considered by the patient and his/her treating physician in determining treatment strategies going forward. The genomic reports are just one of many dimensions of information regarding a patient’s condition that the treating physician can consider in treating the patient.
Moving Towards the Future of Genomic Cancer Testing
Despite the rapid advancement of cancer genomic testing, a number of obstacles must still be overcome to make this testing routine. The cost of WES and RNAseq remains an issue given their substantially higher price compared to targeted panels and the fact that today there is no clear reimbursement mechanism for generating such data. Sample availability and quality also pose a barrier to performing genome-wide profiling, as obtaining high-purity tumor samples from FFPE specimens is still challenging. The tumor purity of the specimen delivered for profiling has a dramatic impact on the ability to identify somatic alterations with confidence. The extent to which targeting of sub-clonal alterations can achieve clinical benefit is also still under investigation, so balancing the costs and benefits of these different personalized genomics strategies is an evolving process.
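The effect of tumor purity on somatic variant detection can be made concrete with a short calculation. The sketch below assumes a clonal mutation, a normal copy number of 2 at the locus, binomial sampling of reads, and no sequencing error; the function names and the example depth and read-count threshold are hypothetical and chosen only for illustration.

```python
from math import comb


def expected_vaf(purity, mutant_copies=1, tumor_cn=2, normal_cn=2):
    """Expected variant allele fraction of a clonal somatic mutation.

    purity: fraction of cells in the specimen that are tumor cells (0..1).
    mutant_copies: copies of the mutant allele per tumor cell.
    tumor_cn / normal_cn: total copy number at the locus in each cell type.
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be between 0 and 1")
    total_alleles = purity * tumor_cn + (1.0 - purity) * normal_cn
    return purity * mutant_copies / total_alleles


def prob_at_least(min_alt_reads, depth, vaf):
    """P(X >= min_alt_reads) for X ~ Binomial(depth, vaf)."""
    below = sum(
        comb(depth, i) * vaf**i * (1.0 - vaf) ** (depth - i)
        for i in range(min_alt_reads)
    )
    return 1.0 - below


if __name__ == "__main__":
    # Heterozygous mutation in a diploid region at several purities,
    # sequenced to an illustrative depth of 60x with >= 5 supporting reads.
    for purity in (0.8, 0.4, 0.2):
        vaf = expected_vaf(purity)
        print(purity, round(vaf, 3), round(prob_at_least(5, 60, vaf), 3))
```

Under these assumptions a heterozygous mutation in a diploid region has an expected allele fraction of half the purity, so the chance of collecting enough supporting reads at a fixed depth falls as purity falls.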
Beyond these issues, however, lies perhaps the most important issue to address in order to maximize the informativeness and accuracy of cancer genomic testing: how to better leverage the digital universe of information to construct predictive models of cancer to more accurately diagnose and treat cancer. Most cancer genomic tests today focus more naively on individual somatic variants and their known impact on cancer driver genes. Advanced cognitive systems such as IBM’s Watson attempt to leverage the published literature and existing databases to automate the annotation of somatic alterations in ways that inform on treatment choices. However, such approaches still fail to recognize cancer as a complex disease, with genetic and environmental forces impacting highly interconnected molecular networks that in turn alter cellular behavior, tissue and organ functioning, and ultimately system-wide behavior, whether it is a tumor metastasizing from one organ to another, or the tumor evading the immune system to grow and metastasize in unconstrained ways.
In order to understand the changes we observe in the vast sea of genomic information we generate today in disease contexts, including cancer, we must employ advanced computational frameworks capable of simultaneously organizing and modeling these big data. The aim is to learn from the data how to accurately classify individuals along wellness and cancer disease spectra, and how to accurately identify the most appropriate interventions to change their trajectories in order to treat or provide protection against disease. The focus today in cancer genomic testing on identification of obvious mutations in protein coding sequence that activate oncogenes or inactivate tumor suppressor genes leaves unexplored the variations in DNA in the germline and somatic genomes that affect the regulation of genes with respect to their role in cancer. But even with knowledge of such variations we would not necessarily understand the genes that are impacted, the pathways those genes operate in, the larger networks in which the pathways operate, whether to activate or suppress such genes and pathways for treatment, and so on.
Changes in DNA do not directly lead to changes in disease-related phenotypes, but instead lead to changes in molecular phenotypes that in turn affect molecular and cellular processes that in turn lead to higher order changes in tissues, organs and entire systems that ultimately lead to changes in physiological states.23-26 While DNA information on its own may not reflect the dynamic, fluid nature of biological systems that are able to reconfigure themselves as conditions demand, integrating this information with transcriptional, metabolite, protein, and methylation data under different contexts can elucidate the regulatory networks that define cancer risk, onset, progression and response to treatments (Figure 3). Networks provide a convenient graphical framework for organizing vast amounts of data and representing the relationships among features in them. Modeling a biological system as a network provides a more holistic view of the system under study and a far richer context within which to understand single genes and pathways, thereby complementing reductionist methodologies. Once constructed and validated, system-wide network models can be systematically mined to identify key drivers of cancer, interpret the molecular impact of cancer-causing perturbagens (whether genetic or environmental), and identify the best targets for therapeutic intervention. However, unlike traditional biological research where modifications to genes are engineered experimentally to make such identifications, with a predictive computational model we can carry out such explorations on supercomputers in seconds.
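As a rough illustration of what mining a network model for key drivers can look like, the sketch below ranks nodes of a small directed regulatory network by how many genes of a disease signature lie downstream of them. It is a deliberately simplified stand-in for the approaches alluded to above: the function names, the adjacency-list representation, the scoring rule, and the node labels are all assumptions made for this example.

```python
from collections import deque


def downstream(graph, start):
    """Return all nodes reachable from start by following directed edges."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, ()):
            if child not in seen and child != start:
                seen.add(child)
                queue.append(child)
    return seen


def rank_key_drivers(graph, signature):
    """Rank nodes by the number of signature genes downstream of each.

    graph: dict mapping regulator -> list of direct targets.
    signature: set of genes associated with the phenotype of interest.
    Ties are broken alphabetically so the ranking is deterministic.
    """
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    scores = {n: len(downstream(graph, n) & signature) for n in nodes}
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Hypothetical network and signature, for illustration only.
    network = {"R1": ["A", "B"], "A": ["C"], "R2": ["D"]}
    signature = {"B", "C", "D"}
    for node, score in rank_key_drivers(network, signature):
        print(node, score)
```

Because the traversal is a plain breadth-first search, an in silico perturbation can be approximated by deleting a node from `graph` and re-running the ranking, which is the kind of exploration that is fast computationally but slow at the bench.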
Ultimately, our ability to construct the most predictive models of diseases such as CRC will depend on our ability to master the extremely large-scale, high-dimensional information being collected on systems relevant to disease. Without sophisticated mathematical algorithms capable of appropriately integrating the large-scale data, a true understanding of cancer and of the most accurate ways to diagnose and treat it will be difficult to achieve. Mature quantitative disciplines such as high energy physics, climatology, and quantitative finance have all evolved to depend on mathematical models as repositories of knowledge and understanding. The complexity of common human diseases such as CRC demands that we employ this same model-based approach as a matter of necessity, if we hope to realize the vision of precision medicine that ensures the right drug is delivered to the right person at the right time.