What Is DNA? The Code That Builds Every Living Thing

DNA is a molecule that carries the genetic instructions for building and maintaining every living organism. Learn how the double helix stores information, how genes become proteins, and why your DNA is 99.9% identical to every other human.

Explain It Simply Editorial Team

Published May 5, 2026

The Double Helix: DNA's Elegant Structure

DNA (deoxyribonucleic acid) is a long molecule shaped like a twisted ladder — the famous double helix discovered by James Watson and Francis Crick in 1953, building on critical X-ray crystallography work by Rosalind Franklin and Maurice Wilkins.

The 'rails' of the ladder are made of alternating sugar (deoxyribose) and phosphate molecules — the sugar-phosphate backbone. The 'rungs' are pairs of chemical bases: adenine (A) always pairs with thymine (T), and guanine (G) always pairs with cytosine (C). These base-pairing rules are absolute — A never bonds with G or C, only with T. This complementarity is the key to DNA's ability to copy itself.
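The pairing rule is simple enough to express in a few lines of Python. This is a minimal sketch: the names `PAIRS` and `complement` are illustrative (not from any library), and it ignores strand direction for simplicity.

```python
# Base-pairing rules from the text: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(strand: str) -> str:
    """Return the partner strand, base by base, in the same left-to-right order."""
    return "".join(PAIRS[base] for base in strand.upper())

print(complement("ATGCCG"))              # TACGGC
print(complement(complement("ATGCCG")))  # ATGCCG
```

Complementing a strand twice returns the original sequence, which is why either strand carries enough information to rebuild the other.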

A single copy of the human genome contains approximately 3.2 billion base pairs spread across 23 chromosomes. Most of your cells carry two copies, one from each parent, for 23 pairs (46 chromosomes total). Each chromosome is a single, enormously long DNA molecule wrapped around protein spools called histones. The largest human chromosome (chromosome 1) contains about 249 million base pairs; the smallest (chromosome 21) contains about 48 million.

The sequence of bases along the DNA strand encodes information, much like letters in a book. Just as the English language uses 26 letters to create every word and sentence, DNA uses 4 chemical 'letters' (A, T, G, C) to encode every instruction needed to build and maintain a human body. The 'words' of DNA are three-letter sequences called codons. Four letters taken three at a time give 64 possible codons: most specify one of 20 amino acids (the building blocks of proteins), and a few act as stop signals.
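A quick way to see how much room a three-letter code offers is to count every possible codon. The sketch below uses only Python's standard library; the variable names are illustrative.

```python
from itertools import product

BASES = "ATGC"

# Every ordered choice of three bases is one possible codon.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]

print(len(codons))   # 64, i.e. 4 * 4 * 4
print(codons[:4])    # ['AAA', 'AAT', 'AAG', 'AAC']
```

With 64 possible codons and only 20 amino acids to encode, several different codons can stand for the same amino acid.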

[Figure: DNA base pairing. A pairs with T and G pairs with C through hydrogen bonds, with a sugar-phosphate backbone on each side.]

DNA's four bases always pair the same way: Adenine with Thymine, Guanine with Cytosine. This strict pairing enables accurate replication.

From Gene to Protein: The Central Dogma

The central dogma of molecular biology describes how genetic information flows: DNA → RNA → Protein. This process is how genes actually DO things in your body.

Transcription is the first step. When a gene needs to be expressed, an enzyme called RNA polymerase unzips the DNA double helix at that gene's location and creates a single-stranded copy called messenger RNA (mRNA). The mRNA is complementary to the DNA template strand — wherever the DNA has a C, the mRNA has a G; wherever the DNA has an A, the mRNA has a U (uracil, which replaces thymine in RNA).
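Transcription follows the same pairing logic as DNA itself, with U standing in for T. Here is a minimal sketch, assuming a made-up six-base template; the function name `transcribe` is illustrative and the sketch ignores strand direction.

```python
# Template DNA base -> mRNA base. Uracil (U) takes the place of thymine.
DNA_TO_MRNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_strand: str) -> str:
    """Build the mRNA that is complementary to a DNA template strand."""
    return "".join(DNA_TO_MRNA[base] for base in template_strand.upper())

print(transcribe("TACGGA"))  # AUGCCU
```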

The mRNA molecule then travels from the nucleus to the cytoplasm, where it reaches a ribosome — the cell's protein-manufacturing machine. Translation begins: the ribosome reads the mRNA three bases at a time (each three-base sequence is a codon). Each codon specifies one of 20 amino acids. Transfer RNA (tRNA) molecules bring the matching amino acids, which are chained together in the exact sequence specified by the mRNA.
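The ribosome's job can be sketched as a lookup loop: read three bases, find the matching amino acid, repeat. The table below is only a small excerpt of the standard genetic code, and the mRNA sequence is a made-up example, so treat this as an illustration rather than a full translator.

```python
# A small excerpt of the standard genetic code (mRNA codon -> amino acid).
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "GGC": "Gly", "GCU": "Ala",
    "CCU": "Pro", "GAG": "Glu", "GUG": "Val", "AAA": "Lys",
    "UAA": "STOP", "UAG": "STOP", "UGA": "STOP",
}

def translate(mrna: str) -> list[str]:
    """Read the mRNA three bases at a time until a stop codon is reached."""
    protein = []
    for start in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[start:start + 3]]
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein

print(translate("AUGCCUGAGUAA"))  # ['Met', 'Pro', 'Glu']
```

A codon missing from this excerpt raises a `KeyError`; a complete table would list all 64 codons.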

The resulting chain of amino acids folds into a specific three-dimensional shape: a protein. This shape determines the protein's function. Hemoglobin (which carries oxygen in your blood) contains 574 amino acids across four chains, folded into a precise configuration that creates binding sites for oxygen molecules. Change just one amino acid in one type of chain (as happens in sickle cell disease), and the hemoglobin molecules clump into stiff fibers that distort the entire red blood cell.
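The sickle cell change comes down to a single DNA letter in the beta-globin gene: the codon GAG (glutamic acid) becomes GTG (valine). The sketch below shows that one-letter swap; the helper name `point_mutation` is illustrative, and the lookup holds only the two codons needed here.

```python
# Two entries from the standard genetic code, written as DNA codons.
CODON_TO_AMINO_ACID = {"GAG": "Glu", "GTG": "Val"}

def point_mutation(codon: str, position: int, new_base: str) -> str:
    """Return the codon with the base at `position` (0-based) replaced."""
    return codon[:position] + new_base + codon[position + 1:]

normal = "GAG"
sickle = point_mutation(normal, 1, "T")

print(normal, "->", CODON_TO_AMINO_ACID[normal])  # GAG -> Glu
print(sickle, "->", CODON_TO_AMINO_ACID[sickle])  # GTG -> Val
```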

Your body contains an estimated 20,000-25,000 genes, but produces over 100,000 different proteins through a process called alternative splicing — the same gene can produce different proteins by including or excluding different segments of the mRNA. This is one reason why humans are far more complex than our gene count alone would suggest.
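To see how including or excluding segments multiplies the possibilities, consider a toy gene with five segments, three of them optional. The segment names (E1 to E5) and the function `splice_variants` are hypothetical, and real cells regulate splicing tightly, so not every combination actually occurs.

```python
from itertools import combinations

def splice_variants(first: str, optional: list[str], last: str) -> list[list[str]]:
    """List every mRNA that keeps `first` and `last` plus any subset of `optional`."""
    variants = []
    for keep_count in range(len(optional) + 1):
        for kept in combinations(optional, keep_count):
            variants.append([first, *kept, last])
    return variants

variants = splice_variants("E1", ["E2", "E3", "E4"], "E5")
print(len(variants))  # 8
print(variants[0])    # ['E1', 'E5']
print(variants[-1])   # ['E1', 'E2', 'E3', 'E4', 'E5']
```

Three optional segments already allow up to 8 distinct messages from one gene, and each extra optional segment doubles that ceiling.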

DNA Replication: Copying 3 Billion Letters With Near-Perfect Accuracy

Every time a cell divides, it must copy its entire genome — all 3.2 billion base pairs. This process, called DNA replication, happens with stunning speed and accuracy.

The double helix unzips at multiple points simultaneously (called replication origins — there are roughly 30,000 in human cells), and an enzyme called DNA polymerase reads each strand and builds a complementary copy. Because A always pairs with T and G always pairs with C, each strand serves as a template for a new partner strand. The result: two identical copies of the original DNA.

DNA polymerase adds new bases at a rate of about 50 bases per second in human cells (and roughly 1,000 per second in bacteria). Copying from thousands of origins at once is what keeps the total time manageable. The error rate is remarkably low: about 1 mistake per billion base pairs, thanks to built-in proofreading and follow-up mismatch repair. DNA polymerase checks each base as it's added and corrects mismatches. Additional repair enzymes patrol the genome, fixing an estimated 10,000-100,000 DNA lesions per cell per day caused by UV radiation, chemicals, and normal metabolic activity.
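The figures above are easier to appreciate with a little arithmetic. This sketch uses only the numbers quoted in the text (3.2 billion base pairs, 1 error per billion, 10,000-100,000 lesions per day); the variable and function names are illustrative.

```python
GENOME_BASE_PAIRS = 3_200_000_000      # one copy of the human genome
ERRORS_PER_BASE_PAIR = 1e-9            # about 1 mistake per billion base pairs
SECONDS_PER_DAY = 24 * 60 * 60

# Expected copying mistakes each time one genome copy is replicated.
errors_per_copy = GENOME_BASE_PAIRS * ERRORS_PER_BASE_PAIR
print(round(errors_per_copy, 1))       # 3.2

def seconds_between_lesions(lesions_per_day: int) -> float:
    """Average gap between DNA lesions in a single cell."""
    return SECONDS_PER_DAY / lesions_per_day

print(round(seconds_between_lesions(10_000), 2))   # 8.64
print(round(seconds_between_lesions(100_000), 2))  # 0.86
```

In other words, each cell repairs a lesion every few seconds or faster, yet only a handful of copying errors survive each time the genome is duplicated.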

Despite these safeguards, some errors slip through. An average human baby is born with approximately 70 new mutations. Most are harmless (occurring in non-coding regions or producing silent changes in proteins), but occasionally a mutation affects a critical gene — potentially causing genetic disease or, over generations, providing raw material for evolution.

Cancer, at its core, is a disease of accumulated DNA mutations. When mutations build up in genes that control cell growth and division (oncogenes and tumor suppressor genes), cells can begin dividing uncontrollably. This is one reason cancer risk increases with age: more cell divisions mean more opportunities for replication errors, and more years mean more exposure to other sources of DNA damage.

What Your DNA Reveals — and What It Doesn't

The Human Genome Project, completed in 2003 after 13 years and $2.7 billion, sequenced the human genome for the first time. Today, companies like 23andMe and AncestryDNA can read hundreds of thousands of informative positions across your genome (a technique called genotyping, rather than full sequencing) for under $100.

All humans share 99.9% of their DNA. The 0.1% difference (about 3 million base pairs) accounts for all the visible and invisible variation between individuals: eye color, height predisposition, disease risks, drug metabolism differences, and more. By commonly quoted estimates, roughly 50% of your genes have counterparts in bananas and 60% in fruit flies, your protein-coding DNA is about 85% identical to that of mice, and your DNA as a whole is 98.7% identical to that of chimpanzees.
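The "about 3 million" figure follows directly from the genome size quoted earlier. A quick check, using illustrative variable names:

```python
GENOME_BASE_PAIRS = 3_200_000_000   # one copy of the human genome

# 0.1% is one part in a thousand.
differing_base_pairs = GENOME_BASE_PAIRS // 1000
print(differing_base_pairs)         # 3200000

# On average, that is one differing position per this many base pairs.
print(GENOME_BASE_PAIRS // differing_base_pairs)  # 1000
```

So two unrelated people differ, on average, at roughly one position in every thousand.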

Genetic testing can reveal predispositions to certain diseases. Variants in the BRCA1 and BRCA2 genes dramatically increase breast and ovarian cancer risk (up to 72% lifetime risk for breast cancer, compared to about 13% in the general population). The APOE ε4 variant increases Alzheimer's risk. Pharmacogenomic tests can predict how you'll respond to specific medications.

However, DNA is not destiny for most traits. Height is about 80% heritable — strongly influenced by genes — but hundreds of gene variants contribute, each with a tiny effect, and nutrition and environment account for the remaining 20%. Intelligence, personality, and most diseases are influenced by thousands of genes interacting with environmental factors in ways we're only beginning to understand.

Epigenetics (chemical modifications that change gene expression without altering the DNA sequence) adds another layer of complexity. Your diet, stress levels, exercise habits, and environmental exposures can alter epigenetic marks, effectively turning genes on or off. Some epigenetic changes may even be transmitted to offspring. The evidence is strongest in animal studies and still debated in humans, but it raises the possibility that your lifestyle choices could influence your grandchildren's biology.

Sources: Human Genome Project (genome.gov), Watson & Crick (Nature, 1953), ENCODE Project, National Human Genome Research Institute, 23andMe research publications.

💡 AHA Moment

Here's the staggering part: your body contains roughly 37 trillion cells, and every one that has a nucleus contains a complete copy of your DNA (mature red blood cells are the main exception). Each copy is about 2 meters (6.5 feet) long, coiled and folded to fit inside a nucleus just 6 micrometers wide. If you uncoiled all the DNA in your body and laid it end to end, it would stretch for billions of kilometers, on the scale of a trip to Pluto and back (roughly 12 billion kilometers).
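The scale is easy to check with the two figures above. The sketch below multiplies 37 trillion cells by 2 meters each, which gives an upper bound: red blood cells make up a large share of the cell count and carry no nuclear DNA, so the true total is lower. The `nucleated_fraction` parameter is a hypothetical knob for exploring that, not a measured value.

```python
CELLS_IN_BODY = 37 * 10**12          # roughly 37 trillion cells
DNA_METERS_PER_CELL = 2              # about 2 meters per complete copy
PLUTO_ROUND_TRIP_KM = 12 * 10**9     # roughly 12 billion kilometers

def total_dna_km(nucleated_fraction: float = 1.0) -> float:
    """Total DNA length in km if this fraction of cells carries a full copy."""
    meters = CELLS_IN_BODY * DNA_METERS_PER_CELL * nucleated_fraction
    return meters / 1000

upper_bound_km = total_dna_km()
print(upper_bound_km / 1e9)                            # 74.0 (billion km)
print(round(upper_bound_km / PLUTO_ROUND_TRIP_KM, 2))  # 6.17 round trips
```

Even if only a modest fraction of cells is counted, the total still runs to billions of kilometers.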

But here's what's truly mind-bending: only about 1.5% of your DNA actually codes for proteins. The other 98.5% was once dismissed as 'junk DNA,' but we now know much of it serves regulatory functions — acting like switches, dimmers, and timers that control WHEN, WHERE, and HOW MUCH of each gene is expressed. A liver cell and a brain cell contain identical DNA, but they look and function completely differently because different genes are switched on and off. You are not your genes — you are how your genes are played.
