Central Dogma & Molecular Genetics

5 sections · 15 min

01

Genome Organization and the Genetic Code

The haploid human genome contains approximately 3.2 billion base pairs, carried on 23 chromosome pairs (46 chromosomes) in diploid cells, but only ~1.5% encodes protein. The remaining ~98.5% includes regulatory elements, non-coding RNA genes (20,000–25,000), introns, and repetitive sequences, much of which is increasingly recognized as functionally relevant. The genetic code is a degenerate triplet code: of the 64 codons, 61 specify the 20 amino acids and 3 are stop signals. Because most amino acids are encoded by several synonymous codons, degeneracy partially buffers the effect of nucleotide substitutions.
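
The degeneracy described here can be checked directly by enumerating the standard genetic code. The sketch below builds the codon table from the standard ordering (NCBI translation table 1) and counts how many codons map to each amino acid. The names `CODON_TABLE`, `degeneracy`, and `is_synonymous` are illustrative choices, and the ACG→ACA example is a hypothetical threonine codon change, not taken from a specific gene.

```python
from collections import Counter
from itertools import product

# Standard genetic code (NCBI translation table 1). Codons are enumerated
# with bases in the order T, C, A, G at each position; '*' marks a stop.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of codons assigned to each amino acid (and to stop).
degeneracy = Counter(CODON_TABLE.values())
sense_codons = sum(n for aa, n in degeneracy.items() if aa != "*")


def is_synonymous(ref_codon: str, alt_codon: str) -> bool:
    """Return True if both DNA codons encode the same amino acid."""
    return CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]


if __name__ == "__main__":
    print(f"{len(CODON_TABLE)} codons: {sense_codons} sense, {degeneracy['*']} stop")
    print("Codons per amino acid:", dict(sorted(degeneracy.items())))
    # Hypothetical third-position change in a threonine codon.
    print("ACG -> ACA synonymous?", is_synonymous("ACG", "ACA"))
```

A synonymous result from this check only says the amino acid is unchanged. It says nothing about splicing or mRNA stability, which is why such variants still need further evaluation.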

Key Points

  • Only ~1.5% of the genome is protein-coding (~20,000 genes); non-coding regulatory and RNA elements account for much of the remaining sequence and are increasingly linked to neurological disease
  • Degeneracy: synonymous codons partially buffer against nucleotide substitutions, but synonymous variants can still be pathogenic by disrupting splicing enhancers
  • GC-rich regions tend to be gene-dense and actively transcribed; CpG dinucleotides are mutation hotspots (~10× higher transition rate) due to spontaneous deamination of 5-methylcytosine

Check Your Understanding

A synonymous variant (c.300G>A, p.Thr100=) is identified in a patient with an unexplained genetic condition. Which of the following is the most accurate statement about this variant?

02

Replication Fidelity, De Novo Variants, and Repeat Expansions

A three-tier fidelity system (base selection, proofreading, mismatch repair) reduces the replication error rate to ~1 in 10⁹–10¹⁰ per base per division. Despite this, the germline accumulates ~60–70 de novo SNVs per generation (~1–2 per genome per cell division), providing the substrate for both evolution and de novo genetic disease. Trinucleotide repeat expansions — a major class of neurological disease — arise from replication slippage at tandem repeat sequences, with expansion size often increasing across generations (anticipation).
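
As a rough sanity check on these figures, the expected number of replication errors per division is simply the per-base error rate multiplied by the number of bases copied. The sketch below assumes the ~3.2 billion bp haploid genome size and uses the two ends of the quoted error-rate range. It is illustrative arithmetic only, and the function name `expected_errors` is hypothetical.

```python
# Assumed inputs: haploid genome size (~3.2 billion bp) and the per-base
# error-rate range quoted in the text. Illustrative arithmetic only.
GENOME_BP = 3.2e9
ERROR_RATES = (1e-9, 1e-10)  # errors per base per cell division


def expected_errors(genome_bp: float, rate_per_base: float) -> float:
    """Expected number of uncorrected errors in one round of replication."""
    return genome_bp * rate_per_base


for rate in ERROR_RATES:
    errors = expected_errors(GENOME_BP, rate)
    print(f"rate {rate:.0e} per base: ~{errors:.2f} errors per division")
```

Under these assumptions the result spans roughly 0.3 to 3 errors per haploid genome per division, which brackets the ~1–2 figure given above.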

Key Points

  • Mismatch repair (MMR) corrects post-replication errors; MMR deficiency causes microsatellite instability and Lynch syndrome
  • Trinucleotide repeat expansions: CAG in HTT (Huntington disease), CGG in FMR1 (fragile X syndrome), GAA in FXN (Friedreich ataxia), CTG in DMPK (myotonic dystrophy type 1). They arise from replication slippage; larger expansions generally correlate with greater severity and earlier age of onset
  • Germline de novo variant rate: ~60–70 SNVs per individual per generation; paternal age is the major contributor (~2 additional variants per year of paternal age), explaining the paternal age effect in de novo dominant conditions such as achondroplasia and some genetic epilepsies
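
Repeat sizing can be illustrated with a short sketch that counts the longest uninterrupted run of a repeat unit and assigns an FMR1 allele category. The category boundaries used here (normal below 45, intermediate 45–54, premutation 55–200, full mutation above 200 CGG repeats) are commonly cited thresholds and are an assumption added for illustration; they are not stated in the text above. The function names are hypothetical, and real repeat sizing relies on dedicated laboratory assays rather than string matching.

```python
import re


def longest_repeat_run(sequence: str, unit: str) -> int:
    """Return the largest number of uninterrupted tandem copies of `unit`."""
    pattern = re.compile(f"(?:{re.escape(unit.upper())})+")
    runs = [len(m.group(0)) // len(unit) for m in pattern.finditer(sequence.upper())]
    return max(runs, default=0)


def classify_fmr1(cgg_repeats: int) -> str:
    """Classify an FMR1 allele using commonly cited (assumed) CGG thresholds."""
    if cgg_repeats > 200:
        return "full mutation"
    if cgg_repeats >= 55:
        return "premutation"
    if cgg_repeats >= 45:
        return "intermediate"
    return "normal"


if __name__ == "__main__":
    # Synthetic sequence: 85 CGG repeats flanked by arbitrary bases.
    toy_allele = "AT" + "CGG" * 85 + "TT"
    n = longest_repeat_run(toy_allele, "CGG")
    print(n, classify_fmr1(n))
    print(650, classify_fmr1(650))
```

With these thresholds, an allele of 85 repeats falls in the premutation range and one of 650 repeats in the full mutation range.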

Check Your Understanding

A boy with intellectual disability and a CGG repeat expansion of 650 repeats in the 5' UTR of FMR1 has absent FMRP on immunocytochemistry. His carrier mother (CGG repeat: 85) has a normal FMRP level. Which mechanism best explains the difference between full mutation and premutation alleles?

03

Transcription and Pre-mRNA Splicing

Splicing — the removal of introns and ligation of exons — is clinically the most important step in mRNA processing. Splicing is directed by conserved consensus sequences at the 5' splice donor (GT) and 3' splice acceptor (AG) sites flanking each intron. Approximately 10–15% of disease-causing variants affect splicing, making splice prediction a critical skill in variant interpretation.

Key Points

  • Canonical splice site rule (GT-AG): variants at the intronic +1/+2 (donor) and −1/−2 (acceptor) positions almost always disrupt splicing and can support PVS1 in ACMG classification
  • Exonic splicing enhancers (ESEs) can be disrupted by exonic variants, including some synonymous changes, causing exon skipping; deep-intronic variants act differently, typically by activating cryptic splice sites or pseudoexons. For example, certain SCN1A synonymous variants cause Dravet syndrome through splicing disruption
  • Alternative splicing generates tissue-specific isoforms; brain-specific exons explain why variants in ubiquitously expressed genes (e.g., DYNC1H1, SCN1A) can cause purely neurological phenotypes
  • In silico splice predictors (SpliceAI, MaxEntScan) are essential tools for flagging cryptic splice variants; RNA studies (RT-PCR) provide definitive functional evidence
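
The GT-AG rule and the ±1/±2 positions lend themselves to a simple first-pass check. The sketch below parses an HGVS-style coding description (for example `c.123+1G>A`) to flag canonical splice site positions, and separately tests whether an intron sequence begins with GT and ends with AG. The function names and the example variants are hypothetical. It covers only simple intronic substitutions and is not a substitute for SpliceAI, MaxEntScan, or RNA studies.

```python
import re

# Matches intronic coding positions such as c.123+1G>A or c.456-2A>G.
# UTR positions (c.-12, c.*34) and exonic positions do not match.
INTRONIC = re.compile(r"^c\.\d+([+-])(\d+)")


def splice_region(hgvs_c: str) -> str:
    """Describe where an HGVS c. variant sits relative to canonical splice sites."""
    match = INTRONIC.match(hgvs_c)
    if match is None:
        return "not intronic (exonic, UTR, or unparsed)"
    sign, offset = match.group(1), int(match.group(2))
    site = "donor" if sign == "+" else "acceptor"
    if offset <= 2:
        return f"canonical {site} site"
    return f"non-canonical intronic ({site} side, offset {offset})"


def has_canonical_dinucleotides(intron_dna: str) -> bool:
    """True if the intron starts with GT and ends with AG (GT-AG rule)."""
    intron = intron_dna.upper()
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")


if __name__ == "__main__":
    for variant in ("c.123+1G>A", "c.456-2A>G", "c.123+45C>T", "c.300G>A"):
        print(variant, "->", splice_region(variant))
    print(has_canonical_dinucleotides("GTAAGTCCTTTTCTCTACAG"))
```

A variant flagged as non-canonical here may still affect splicing, so this kind of check is best treated as a triage step before in silico prediction.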

Check Your Understanding

Alternative splicing of a gene produces both a 'short' isoform (expressed in muscle) and a 'long' isoform (expressed in brain). A patient has a splice site variant that disrupts exon inclusion in the brain isoform only. This variant most likely causes:

04

Translation and Protein Function

Mature mRNA is exported to the cytoplasm, where ribosomes translate it codon by codon into a polypeptide chain. The AUG start codon is recognized by the 43S pre-initiation complex, which carries the initiator Met-tRNA. Elongation proceeds until a stop codon (UAA, UAG, or UGA) is encountered, triggering release of the completed polypeptide. Post-translational modifications such as phosphorylation, glycosylation, and ubiquitination help determine protein localization, activity, and stability.

Key Points

  • Ribosomes read mRNA in the 5'→3' direction, synthesizing protein N-terminus to C-terminus
  • Kozak sequence context around AUG affects translation efficiency; initiation codon variants (p.Met1?) abolish or reduce protein production
  • Signal peptides direct proteins to the endoplasmic reticulum for secretion or membrane targeting
  • Protein folding is assisted by chaperones (HSP70, HSP90); misfolded proteins are targeted for proteasomal degradation
  • Many neurological disorders result from loss-of-function (insufficient protein) or gain-of-function/dominant-negative protein mechanisms — the distinction critically determines therapeutic strategy
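
Codon-by-codon translation in the 5'→3' direction can be sketched in a few lines. The example below starts at the first AUG and stops at the first in-frame stop codon. This is a deliberate simplification that ignores Kozak context and ribosome scanning, and the toy mRNA sequences and the `translate` function are hypothetical.

```python
from itertools import product

# Standard genetic code in RNA form (bases ordered U, C, A, G); '*' = stop.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate 5'->3' from the first AUG to the first in-frame stop codon."""
    mrna = mrna.upper().replace("T", "U")
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon found
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break  # stop codon: release the polypeptide
        peptide.append(amino_acid)
    return "".join(peptide)  # written N-terminus to C-terminus


if __name__ == "__main__":
    reference = "GCCACCAUGGCUCGAUAAGG"   # toy mRNA: AUG GCU CGA UAA
    nonsense = "GCCACCAUGGCUUGAUAAGG"    # CGA -> UGA (Arg -> stop)
    print(translate(reference))
    print(translate(nonsense))
```

Comparing the two outputs shows how a single C>U change that converts an arginine codon into a stop codon shortens the predicted peptide.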

Check Your Understanding

A pathogenic variant is identified as c.247C>T (p.Arg83Ter) in exon 3 of a gene with 10 exons. Which statement best predicts the molecular consequence?

05

Variant Types and Their Molecular Consequences

Genetic variants are classified by their molecular nature and predicted effect on gene function. Understanding variant type is the first step in variant interpretation: it determines which ACMG/AMP evidence criteria apply, whether nonsense-mediated decay is expected, and whether the variant is likely to cause loss of function or a gain-of-function effect. Not all variants of the same class have the same functional impact — context is everything.

Key Points

  • Missense variant: single nucleotide substitution causing an amino acid change (e.g., p.Arg176Trp); effect ranges from benign to highly damaging depending on position and residue chemistry
  • Nonsense (stop-gain) variant: nucleotide change introducing a premature stop codon (e.g., p.Arg100Ter); typically triggers NMD when the premature stop codon lies more than 50–55 nt upstream of the final exon-exon junction
  • Frameshift variant: insertion or deletion of a number of nucleotides not divisible by 3, shifting the reading frame; almost always introduces a premature stop codon downstream, usually triggering NMD
  • Splice-site variant: disrupts canonical ±1/2 donor or acceptor splice sites → exon skipping, intron retention, or cryptic splice site activation
  • Synonymous (silent) variant: nucleotide change that does not alter the amino acid but may affect splicing, mRNA stability, or translation efficiency — not always benign
  • Nonsense-mediated decay (NMD): surveillance pathway that degrades mRNAs with premature termination codons >50–55 nt upstream of the last exon-exon junction, preventing production of truncated, potentially dominant-negative proteins
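
Two of the rules in this list reduce to simple arithmetic: a frameshift occurs when the net length change is not a multiple of 3, and NMD is predicted when a premature stop lies more than about 50–55 nt upstream of the last exon-exon junction. The sketch below encodes both under stated assumptions: positions are 1-based transcript coordinates, distance is measured from the first nucleotide of the premature stop codon to the last nucleotide of the penultimate exon, and the exon lengths in the example are invented. Function names are hypothetical, and known exceptions to the NMD rule are not modelled.

```python
def is_frameshift(ref_allele: str, alt_allele: str) -> bool:
    """True if the net insertion/deletion length is not a multiple of 3."""
    return (len(alt_allele) - len(ref_allele)) % 3 != 0


def predicts_nmd(exon_lengths, ptc_position, threshold_nt=50):
    """
    Apply the 50-55 nt rule.

    exon_lengths: exon lengths in nucleotides, in transcript order.
    ptc_position: 1-based transcript position of the first nucleotide of
                  the premature termination codon (assumed convention).
    """
    if len(exon_lengths) < 2:
        return False  # single-exon transcript: no exon-exon junction
    # Last nucleotide of the penultimate exon marks the final junction.
    last_junction = sum(exon_lengths[:-1])
    return (last_junction - ptc_position) > threshold_nt


if __name__ == "__main__":
    exons = [120] * 10  # hypothetical 10-exon transcript, 120 nt per exon
    print(predicts_nmd(exons, 247))    # early premature stop
    print(predicts_nmd(exons, 1050))   # within 50 nt of the last junction
    print(predicts_nmd(exons, 1100))   # inside the final exon
    print(is_frameshift("A", "AT"), is_frameshift("ATGC", "A"))
```

The `threshold_nt` parameter can be set to 55 to apply the stricter end of the quoted range.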

Check Your Understanding

Which of the following variant descriptions is consistent with a frameshift mutation?
