Whole exome sequencing overview
Exons contain the protein-coding regions of the genome, and whole-exome sequencing (WES) sequences all of them. An estimated 85% of known disease-causing mutations lie within exonic regions, which is why WES is so useful for studying genetic variation. Because exons make up only about 1.5%-2% of the genome, WES produces much smaller datasets than whole-genome sequencing while still offering broad coverage of coding regions, high depth, and cost-effectiveness. WES is used mainly in studies of genetic diseases, single nucleotide variants, and cancer.
Whole-exome sequencing introduction
What is whole-exome sequencing?
Exons and introns together make up the transcribed region of a gene. Exons are the sequences retained in mature RNA after the precursor RNA (pre-mRNA) is processed and spliced; introns are removed during splicing. The exome comprises all exons in the genome, about 1.5% to 2% of the total. Despite this small proportion, an estimated 85% of known disease-causing mutations are located in exons. Hence, WES holds particular significance in clinical research.
WES sequences all exonic regions of the genome using next-generation sequencing (NGS) technology. The approach consists of capture and enrichment of exon sequences, sequencing of the enriched library, and subsequent data analysis.
1) Whole exome sequencing workflow
Whole exome capture and enrichment
- Genomic DNA Extraction
- Genomic DNA is randomly fragmented into 200-300 bp fragments, followed by end repair and A-tailing (addition of a single adenine to the 3' ends). Adapter sequences are then ligated to both ends of the fragments, and PCR amplification is performed to generate the DNA library.
- The DNA fragments are hybridized with capture probes to enrich for exonic DNA fragments.
- The enriched exonic DNA fragments are subjected to PCR amplification. After quality assessment, the exome library is ready for sequencing.
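One common way to judge how well the capture step worked is the on-target fraction: the share of sequenced fragments that overlap a targeted exon. The sketch below is a minimal illustration, assuming fragments and targets are available as half-open, 0-based (chromosome, start, end) intervals; the coordinates and the function names `overlaps` and `on_target_fraction` are hypothetical and not part of any specific capture kit or tool.

```python
from typing import List, Tuple

# (chromosome, start, end), half-open and 0-based: an assumption of this sketch
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if two intervals on the same chromosome share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def on_target_fraction(fragments: List[Interval], targets: List[Interval]) -> float:
    """Fraction of fragments that overlap at least one target interval."""
    if not fragments:
        return 0.0
    hits = sum(1 for f in fragments if any(overlaps(f, t) for t in targets))
    return hits / len(fragments)


# Hypothetical exon targets and sequenced fragments (made-up coordinates).
targets = [("chr1", 1000, 1200), ("chr1", 5000, 5300)]
fragments = [
    ("chr1", 950, 1150),   # overlaps the first exon
    ("chr1", 1190, 1400),  # overlaps the first exon by 10 bases
    ("chr1", 3000, 3250),  # off-target (between the two exons)
    ("chr2", 1000, 1200),  # off-target (different chromosome)
]

fraction = on_target_fraction(fragments, targets)
print(f"on-target fraction: {fraction:.2f}")
```

The nested loop is fine for a handful of intervals; a real exome with hundreds of thousands of targets would call for sorted intervals or an interval tree.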
Next-Generation Sequencing
- The exome library is sequenced on a next-generation sequencing platform, such as Illumina.
Bioinformatics analysis
- Data quality control
- Sequence assembly and alignment
- Variant detection, including single nucleotide polymorphisms (SNPs), insertions and deletions (Indels), and copy number variations (CNVs)
- Annotation, filtering, validation, etc.
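The variant detection and filtering steps can be illustrated with a small sketch. It assumes variant calls arrive as tab-separated records in the column order of the VCF format (CHROM, POS, ID, REF, ALT, QUAL, ...). The records, the quality cutoff of 30, and the helper names `classify_variant` and `summarize` are made up for illustration; production pipelines use dedicated variant callers and annotation tools rather than hand-written parsing.

```python
from collections import Counter


def classify_variant(ref: str, alt: str) -> str:
    """Classify one REF/ALT pair by comparing allele lengths."""
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"
    if len(ref) < len(alt):
        return "insertion"
    if len(ref) > len(alt):
        return "deletion"
    return "other"  # e.g. a multi-nucleotide substitution


def summarize(vcf_lines, min_qual=30.0):
    """Count variant types among records with QUAL >= min_qual (hypothetical cutoff)."""
    counts = Counter()
    for line in vcf_lines:
        if line.startswith("#"):  # skip meta-information and header lines
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, _id, ref, alts, qual = fields[:6]
        if qual == "." or float(qual) < min_qual:
            continue  # filter out missing or low-quality calls
        for alt in alts.split(","):  # a record may list several ALT alleles
            counts[classify_variant(ref, alt)] += 1
    return counts


# Made-up records in VCF column order.
records = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t1050\t.\tA\tG\t60\tPASS\t.",
    "chr1\t1100\t.\tC\tCTT\t45\tPASS\t.",
    "chr1\t5100\t.\tGAT\tG\t50\tPASS\t.",
    "chr1\t5200\t.\tT\tC\t12\tPASS\t.",
]

counts = summarize(records)
print(dict(counts))
```

Copy number variants are not covered here because they are usually inferred from read-depth patterns across targets rather than from individual REF/ALT records.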
2) Advantages of whole exome sequencing
Exons contain the information needed for protein synthesis, and WES effectively detects variants in protein-coding regions, including single nucleotide variants, insertions, and deletions. Because the capture target covers only about 1.5%-2% of the genome, WES requires far less sequencing than whole-genome sequencing (WGS), which lowers both cost and turnaround time, even though exomes are usually sequenced at higher depth. The smaller data volume is also easier to analyze and manage. In addition, WES routinely achieves high depth, often exceeding 100x, which improves sensitivity for low-frequency variants such as somatic mutations in tumor samples.
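A rough back-of-the-envelope calculation shows how target size and depth combine. Because WES is usually run at higher depth than WGS, the ratio of raw data is somewhat larger than the ratio of target sizes alone. The sketch below assumes a human genome of about 3.1 Gb, an exome fraction of 2%, and depths of 100x (WES) and 30x (WGS); all four constants are illustrative assumptions, and the estimate ignores off-target reads and duplicates.

```python
GENOME_SIZE_BP = 3.1e9   # approximate human genome size (assumption)
EXOME_FRACTION = 0.02    # upper end of the 1.5%-2% range quoted in the text
WES_DEPTH = 100          # illustrative WES depth
WGS_DEPTH = 30           # illustrative WGS depth


def raw_bases(target_bp: float, depth: float) -> float:
    """Bases that must be sequenced to cover target_bp at the given mean depth."""
    return target_bp * depth


wes_bases = raw_bases(GENOME_SIZE_BP * EXOME_FRACTION, WES_DEPTH)
wgs_bases = raw_bases(GENOME_SIZE_BP, WGS_DEPTH)
ratio = wes_bases / wgs_bases

print(f"WES: {wes_bases / 1e9:.1f} Gb, WGS: {wgs_bases / 1e9:.1f} Gb")
print(f"WES needs about {ratio:.1%} of the WGS raw data")
```

Under these assumptions WES needs several gigabases of raw sequence per sample, compared with tens of gigabases for WGS.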
Differences among WES, WGS, and microarray
WGS, WES, and microarrays are key tools in large-scale genetic research. WGS and WES sequence DNA across the genome or exome to survey genetic variation, whereas microarray (chip) technology interrogates specific, known sequences by hybridization rather than by sequencing. Microarrays can be applied to both DNA and RNA: DNA arrays are used to genotype known variant sites, while expression arrays measure gene expression levels. The three methods differ in several respects.
Principles: WGS and WES utilize NGS technology to sequence target DNA segments. Prior to sequencing, WES requires the capture of exons from genomic DNA fragments. Microarray involves the immobilization of nucleic acid fragments on a chip surface as solid-phase probes. Subsequently, the target DNA or RNA segments hybridize with the probes, and information about the target segments is obtained based on the different signals generated post-hybridization.
Sequencing Size: WGS provides sequencing data for the entire genome, resulting in sequencing sizes ranging from several gigabases (Gb) to tens of Gb, depending on the species under study. In contrast, WES primarily sequences the exonic regions of the genome, so the sequencing size is much smaller. Since exons constitute only about 1.5%-2% of the entire genome, the region targeted by WES typically ranges from tens of megabases (Mb) to several hundred Mb, depending on the species, the capture method, and the probe design used. Microarrays do not sequence at all: they rely on probes fixed on the chip surface, so the amount of information obtained is determined by the number and design of the probes on the chip.
Sequencing Depth: Sequencing depth refers to the average number of times each base is sequenced. Common sequencing depths for WGS range from 30x to 50x. WES typically reaches higher depth than WGS because the target region is much smaller. Common sequencing depths for WES range from 100x to 200x, sometimes even higher. Microarrays have no sequencing depth in this sense because they do not generate reads; their sensitivity depends on probe performance and hybridization signal intensity, and weak signals limit how reliably a target can be detected.
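The definition of depth can be made concrete with a short sketch. It computes the mean depth over a target and the breadth of coverage, that is, the fraction of target bases covered at or above a chosen threshold. The per-base depth values and the function names `mean_depth` and `breadth_at` are hypothetical; in practice these numbers come from a coverage tool run on the aligned reads.

```python
from typing import Sequence


def mean_depth(per_base_depth: Sequence[int]) -> float:
    """Average number of reads covering each base of the target."""
    if not per_base_depth:
        raise ValueError("per_base_depth must not be empty")
    return sum(per_base_depth) / len(per_base_depth)


def breadth_at(per_base_depth: Sequence[int], threshold: int) -> float:
    """Fraction of target bases covered by at least `threshold` reads."""
    if not per_base_depth:
        raise ValueError("per_base_depth must not be empty")
    covered = sum(1 for d in per_base_depth if d >= threshold)
    return covered / len(per_base_depth)


# Made-up per-base depths for an 8-base toy target.
depths = [120, 95, 0, 150, 110, 30, 105, 90]

avg = mean_depth(depths)
breadth_20x = breadth_at(depths, 20)
breadth_100x = breadth_at(depths, 100)

print(f"mean depth: {avg:.1f}x")
print(f"bases at >=20x: {breadth_20x:.1%}, at >=100x: {breadth_100x:.1%}")
```

Reporting breadth alongside mean depth matters for WES because capture efficiency varies between targets, so a high average can hide poorly covered exons.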
Data Analysis: The data analysis workflows for both WGS and WES include sequence alignment, variant detection, and functional annotation. WGS yields very large datasets that include many repetitive sequences, so thorough analysis requires more sophisticated bioinformatics methods and more computing resources. WES typically produces smaller datasets, which streamlines the analysis.
Microarray data analysis, by contrast, revolves around two tasks: signal intensity analysis and differential expression analysis. Signal intensity analysis interprets the signal strengths observed on the chip, which reflect the relative expression levels of genes in the samples. Differential expression analysis compares gene expression patterns across conditions to identify genes linked to specific biological processes or pathological conditions.
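The core of a differential expression comparison can be sketched as a log2 fold change between the mean signal intensities of two conditions. This is a simplified illustration only: the gene names, intensity values, pseudocount, and the cutoff of one log2 unit are all hypothetical, and real microarray analyses also include background correction, normalization, and statistical testing across replicates.

```python
import math
from typing import Sequence


def log2_fold_change(treated: Sequence[float], control: Sequence[float],
                     pseudocount: float = 1.0) -> float:
    """log2 ratio of mean intensities; the pseudocount avoids division by zero."""
    mean_treated = sum(treated) / len(treated)
    mean_control = sum(control) / len(control)
    return math.log2((mean_treated + pseudocount) / (mean_control + pseudocount))


# Made-up signal intensities for three replicates per condition.
intensities = {
    "GENE_A": {"control": [98, 99, 100], "treated": [397, 399, 401]},
    "GENE_B": {"control": [198, 199, 200], "treated": [48, 49, 50]},
    "GENE_C": {"control": [126, 127, 128], "treated": [127, 127, 127]},
}

fold_changes = {
    gene: log2_fold_change(values["treated"], values["control"])
    for gene, values in intensities.items()
}

# Flag genes whose expression changes at least two-fold in either direction.
changed = [gene for gene, lfc in fold_changes.items() if abs(lfc) >= 1.0]

for gene, lfc in fold_changes.items():
    print(f"{gene}: log2FC = {lfc:+.2f}")
print("changed:", changed)
```

Working on the log2 scale makes up- and down-regulation symmetric: a four-fold increase and a four-fold decrease give values of equal magnitude and opposite sign.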
Application of whole exome sequencing
The exome, harboring crucial protein-coding sequences, represents a pivotal domain in genomic research focused on human disease. WES technology has been instrumental in identifying pathogenic and susceptibility genes associated with a myriad of complex diseases. These advancements hold significant implications for the development of effective preventive and therapeutic strategies aimed at combating diseases.
Genetic Disease Diagnosis: Genetic disorders present numerous challenges in clinical diagnosis and treatment, characterized by diverse clinical manifestations, significant genetic heterogeneity, and diagnostic complexities. Identifying the pathogenic gene mutations is crucial for the accurate diagnosis and effective management of these diseases. In recent years, with the rapid advancement of genetic testing technologies, WES has become widely used in clinical settings, offering a rapid and effective method for precise diagnosis of genetic disorders. In a study by Zeitz et al., WES revealed the association between LRIT3 gene mutations and autosomal-recessive complete congenital stationary night blindness. This finding provides an important clue for the accurate diagnosis of the disease and supports the development of improved treatment strategies for affected patients.
Tumor Diagnosis: Tumorigenesis is a multistep process driven by the accumulation of genetic mutations. WES helps scientists and clinicians uncover these variants, supporting precise tumor diagnosis and prognosis. For instance, Gajic et al. used WES data from immunotherapy patients to identify an association between KRAS gene mutations and resistance to immune-based treatments. They also introduced a classifier named CIRCLE, which was reported to predict responses to immunotherapy better than tumor mutational burden (TMB). This finding provides useful guidance for refining therapeutic choices for immunotherapy patients.
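For context on the TMB baseline mentioned above, tumor mutational burden is conventionally expressed as the number of somatic mutations per megabase of adequately covered sequence. The sketch below shows only that arithmetic, with a made-up mutation count and covered region size; it is not the CIRCLE classifier, and the function name `tumor_mutational_burden` is hypothetical.

```python
def tumor_mutational_burden(n_somatic_mutations: int, covered_bases: int) -> float:
    """Somatic mutations per megabase (Mb) of adequately covered target sequence."""
    if covered_bases <= 0:
        raise ValueError("covered_bases must be positive")
    return n_somatic_mutations / (covered_bases / 1e6)


# Hypothetical tumor sample: 350 somatic mutations over 35 Mb of covered exome.
tmb = tumor_mutational_burden(350, 35_000_000)
print(f"TMB: {tmb:.1f} mutations/Mb")
```

Which mutations are counted (for example, whether synonymous variants are included) differs between studies, so TMB values are only comparable when the same counting rules are used.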
Personalized Medicine: Personalized medicine is a central part of precision medicine, and WES is expected to find wide application in it. By analyzing a patient's genomic data, clinicians can tailor therapeutic strategies quickly and accurately and can anticipate drug efficacy and side effects, improving therapeutic outcomes. Kiran’s WES study of a familial glaucoma cohort identified a set of candidate genes, including some that may be pivotal in glaucoma pathogenesis. These genes may be involved in various pathological processes, such as extracellular matrix remodeling, inflammation, and apoptosis of retinal ganglion cells. Interventions targeting these candidate genes and pathways hold promise for giving glaucoma patients more effective therapeutic options.
CD Genomics offers flexible and cost-effective whole exome sequencing services, tailored to meet your research interests with customized analysis workflows. Each step of the process involves skilled professionals to ensure quality control and result accuracy.
FAQ
1. What does whole exome sequencing test for?
WES provides an in-depth analysis of the genetic blueprint’s protein-coding domains known as exons. These sections serve as command centers for the production of proteins, essential for a wide array of biological functions. By employing WES, researchers are able to meticulously investigate these exonic regions, revealing genetic modifications such as SNPs and small Indels. This comprehensive scrutiny of the exome serves as a robust methodology, allowing for the discovery of nuances within the intricate genetic landscape and shedding light on potential influencers of disease onset or specific phenotypic traits.
2. What is the principle of whole exome sequencing?
WES is a method that homes in on the protein-coding sections of the genome, known as exons. Unlike broader genomic sequencing approaches, WES selectively targets and captures these exonic regions using specialized probes or baits, focusing the analysis on the sequences responsible for protein synthesis. This targeted enrichment allows researchers to efficiently sequence the vast majority of genetic variation relevant to protein-coding sequence. High-throughput sequencing technologies are then used to read the nucleotide sequence of these exonic regions, generating extensive datasets. Finally, the sequence data are analyzed to detect genetic variations, such as SNPs and small Indels, that may be associated with the diseases or phenotypic traits under investigation.
3. What is the difference between whole exome sequencing and whole genome sequencing?
WES and WGS are essential pillars of genomic research, each with its own set of advantages. WES is meticulously crafted to delve into the protein-coding exonic regions of the genome, which hold the key to unraveling the genetic blueprints essential for protein synthesis. Conversely, WGS offers a panoramic view by scrutinizing both coding and non-coding sequences, paving the way for a comprehensive understanding of the entire genetic landscape.
Exons, though comprising a mere fraction, approximately 1.5-2%, of the genome, take center stage in WES, enabling an exhaustive exploration of genetic variations within coding regions. On the other hand, WGS casts a wider net, encompassing exons, introns, and other non-coding elements, thus enabling a diverse array of genomic insights.
While WES is focused, its scope is narrower than that of WGS. WES relies on selective capture to isolate exonic regions, and capture efficiency varies between targets, which can make sequencing depth uneven. In contrast, WGS involves no capture step and yields more uniform coverage across the whole genome, although usually at lower depth than WES. However, the broader scope of WGS necessitates more intricate data processing and analysis, adding layers of complexity to genomic interpretation.
References
- Zeitz C, Jacobson SG, Hamel CP, Bujakowska K, Neuillé M, Orhan E, Zanlonghi X, Lancelot ME, Michiels C, Schwartz SB, Bocquet B; Congenital Stationary Night Blindness Consortium; Antonio A, Audier C, Letexier M, Saraiva JP, Luu TD, Sennlaub F, Nguyen H, Poch O, Dollfus H, Lecompte O, Kohl S, Sahel JA, Bhattacharya SS, Audo I. Whole-exome sequencing identifies LRIT3 mutations as a cause of autosomal-recessive complete congenital stationary night blindness. Am J Hum Genet. 2013 Jan 10;92(1):67-75. doi: 10.1016/j.ajhg.2012.10.023. Epub 2012 Dec 13. PMID: 23246293; PMCID: PMC3542465.
- Gajic ZZ, Deshpande A, Legut M, Imieliński M, Sanjana NE. Recurrent somatic mutations as predictors of immunotherapy response. Nat Commun. 2022 Jul 8;13(1):3938. doi: 10.1038/s41467-022-31055-3. Erratum in: Nat Commun. 2022 Aug 5;13(1):4558. PMID: 35803911; PMCID: PMC9270330.