What is a SNP? A plain-language guide to single nucleotide polymorphisms

A SNP, pronounced "snip", is a position in your genome where one letter of the DNA code differs between people. That's it. The entire concept fits in one sentence. If you came here asking what SNP means in DNA, that is the whole answer: single nucleotide polymorphism, explained in twenty words.

What makes SNPs interesting is that there are about 4 to 5 million of them in any given person, and a small but meaningful fraction of them shape how your body works. Some affect how you metabolize caffeine. Some affect your risk for specific diseases. Some affect how a particular medication works in your bloodstream. The vast majority do nothing observable. The NHGRI genetics glossary is a reasonable second-opinion source if you want a textbook framing alongside this one.

If you've ever had your DNA tested through 23andMe, AncestryDNA, MyHeritage, or a clinical lab, your raw data file is essentially a list of which versions (called alleles) you carry at each measured SNP position. Consumer tests typically genotype a preselected set of positions rather than sequencing the whole genome. The whole personal-genomics field is built around figuring out what those alleles mean.
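To make "a list of alleles per SNP position" concrete, here is a minimal sketch of reading such a file. It assumes a 23andMe-style layout: tab-separated columns for rsID, chromosome, position, and genotype, with comment lines starting with `#`. Other vendors use different columns (some split the genotype into two allele columns), so treat the function name `parse_raw_genotypes` and the column order as assumptions. The sample rows are invented for illustration.

```python
import csv
import io


def parse_raw_genotypes(handle):
    """Return {rsid: genotype} from a tab-separated raw data file.

    Assumes four columns per data row: rsid, chromosome, position, genotype.
    """
    genotypes = {}
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and '#' comment or header lines.
        if not row or row[0].startswith("#"):
            continue
        # Skip rows that do not have all four expected columns.
        if len(row) < 4:
            continue
        rsid, _chromosome, _position, genotype = row[:4]
        genotypes[rsid] = genotype
    return genotypes


# Invented sample data in the assumed layout, not a real person's file.
sample = io.StringIO(
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs429358\t19\t45411941\tTT\n"
    "rs1801133\t1\t11856378\tAG\n"
)
calls = parse_raw_genotypes(sample)
print(calls)
```

With a real file you would pass `open(path, newline="")` instead of the in-memory sample.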

SNP, rsID, allele, genotype: what each word actually means

The vocabulary trips people up because every word sounds technical. The ideas behind the words are not.

SNP (single nucleotide polymorphism): a specific position in the genome where humans differ. The fancy name just means "one letter that varies."

rsID: the unique identifier the international research community uses for each SNP. Looks like rs429358 or rs1801133. The rs stands for "Reference SNP." Every published study uses these IDs so anyone can look up exactly which position is being discussed in dbSNP, NCBI's canonical catalogue.

Allele: the specific letter (A, C, G, or T) you carry at a SNP position. Most SNPs have two common alleles. Some have three or four. You carry two copies of nearly every position, one inherited from each parent. A risk allele is the version of that letter that statistical analysis associates with higher odds of a trait or disease. The opposite, a protective allele, is associated with lower odds. Risk allele and protective allele are two sides of one coin, and the same SNP page on Expressive will name both.

Genotype: the pair of alleles you carry together at a SNP. If a SNP can be A or G, your genotype is one of AA, AG, or GG. Genotype matters because for many SNPs, carrying one risk allele has a different effect from carrying two. This is where the word heterozygous comes in: a heterozygous genotype carries one of each allele (AG), while a homozygous genotype carries two of the same (AA or GG). The homozygous versus heterozygous distinction is load-bearing for health risk, because plenty of variants only show a measurable effect when both copies are the risk allele.
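The genotype vocabulary maps onto a few lines of code. This is a small sketch, with the hypothetical helper names `zygosity` and `risk_allele_count`, that classifies a two-letter genotype and counts how many copies carry a given risk allele. Treating G as the risk allele here is purely an assumption for the example.

```python
def zygosity(genotype):
    """Classify a two-letter genotype such as 'AG'."""
    if len(genotype) != 2:
        raise ValueError(f"expected two alleles, got {genotype!r}")
    first, second = genotype
    return "homozygous" if first == second else "heterozygous"


def risk_allele_count(genotype, risk_allele):
    """Count how many of the two copies carry the risk allele (0, 1, or 2)."""
    return genotype.count(risk_allele)


# Walk through the three possible genotypes of an A/G SNP,
# assuming (for illustration only) that G is the risk allele.
for genotype in ("AA", "AG", "GG"):
    print(genotype, zygosity(genotype), risk_allele_count(genotype, "G"))
```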

How researchers figure out what a SNP does

The standard tool is a genome-wide association study (GWAS). The basic recipe: gather a large group of people, genotype them or sequence their genomes, measure some trait (like blood pressure, height, or a disease diagnosis), and check which SNPs show statistically different allele frequencies between people with and without the trait. For a continuous trait like height, the check is whether an allele tracks with higher or lower values. The published hits get curated into the NHGRI-EBI GWAS Catalog, which is the dataset most reputable genetic-health tools pull from.

When a SNP "associates" with a trait, it doesn't mean the SNP causes the trait. It means the SNP is statistically linked to it. The reasons vary. Sometimes the SNP directly changes how a protein works. Sometimes it's near another variant that does. Sometimes it tags a chunk of DNA that happens to be inherited together with the real driver.

The honest interpretation of a GWAS hit is: "people with this allele are slightly more likely to have this trait, in this population, by this much, with this much certainty."
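The "by this much, with this much certainty" part can be sketched numerically. The example below computes an odds ratio and an approximate 95% confidence interval from a simple carrier versus non-carrier table, using the standard log-odds (Woolf) approximation. It is a simplification: real GWAS analyses usually fit regression models with covariates, and the counts and the function name `odds_ratio_with_ci` are invented for illustration.

```python
import math


def odds_ratio_with_ci(case_carriers, case_noncarriers,
                       control_carriers, control_noncarriers, z=1.96):
    """Odds ratio and approximate 95% CI from a 2x2 table (log-odds method)."""
    odds_ratio = (case_carriers * control_noncarriers) / (
        case_noncarriers * control_carriers
    )
    # Standard error of log(OR): square root of the summed reciprocal counts.
    se_log_or = math.sqrt(
        1 / case_carriers + 1 / case_noncarriers
        + 1 / control_carriers + 1 / control_noncarriers
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Invented counts: 1,000 people with the trait and 1,000 without.
estimate, lower, upper = odds_ratio_with_ci(300, 700, 250, 750)
print(f"OR = {estimate:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

A narrow interval that stays clear of 1.0 is the "certainty" half of the sentence; the odds ratio itself is the "by this much" half.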

Three things matter when reading a GWAS result:

Most consumer genetic-health products do not surface these numbers. We do, because they're the difference between "this finding is robust" and "this finding might disappear in the next study."

Common vs rare variants: the distinction that matters

There's a critical difference between two kinds of variants that can lie in the same gene:

Common variants: SNPs that show up in 1% or more of the general population. These are what GWAS picks up. Their effects on disease risk are usually small. Each one shifts your odds by a little. Together they add up.

Rare variants: DNA changes that show up in a small fraction of people, often discovered through clinical sequencing of families with a particular disease. These are what ClinVar catalogues. Their effects are often large. ClinVar labels each entry on a spectrum from benign to pathogenic, with the in-between bucket called a VUS, variant of uncertain significance. A pathogenic rare variant in BRCA1 can shift breast cancer risk by 5x. Pathogenic rare variants in CFTR cause cystic fibrosis when both copies of the gene are affected. A VUS in either gene means the lab has seen it but the evidence isn't sufficient to call it disease-causing or harmless yet.

Both can occur in the same gene. The trap that ruins a lot of consumer-genetics interpretations is conflating them: "you have a variant in [gene] that's associated with [disease]." That statement is true at the gene level, but radically misleading if the variant is a common GWAS hit with an odds ratio (OR) of 1.1 and the disease was characterized through rare-variant studies showing an OR of 8.0.
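To see how far apart OR 1.1 and OR 8.0 really are, it helps to convert each into an absolute risk. The sketch below does that arithmetic under one loudly stated assumption: a baseline risk of 1%, chosen only for illustration and not taken from any study. The function name `risk_given_odds_ratio` is hypothetical.

```python
def risk_given_odds_ratio(baseline_risk, odds_ratio):
    """Convert a baseline risk and an odds ratio into an absolute risk."""
    baseline_odds = baseline_risk / (1 - baseline_risk)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1 + new_odds)


BASELINE = 0.01  # assumed 1% baseline risk, for illustration only

common_hit = risk_given_odds_ratio(BASELINE, 1.1)
rare_variant = risk_given_odds_ratio(BASELINE, 8.0)

print(f"OR 1.1: {BASELINE:.1%} -> {common_hit:.2%}")
print(f"OR 8.0: {BASELINE:.1%} -> {rare_variant:.2%}")
```

Under that assumed baseline, the common hit moves risk from 1% to roughly 1.1%, while the rare variant moves it to roughly 7.5%. Both are "associated with the disease", which is exactly why the label alone tells you so little.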

A good genetic-health platform tells you which kind of variant it's looking at. Expressive does. Most don't.

How to read what a SNP page actually says

When you land on a SNP page on Expressive (or PubMed, or SNPedia, or any reputable source), look for answers to these questions:

  1. Which gene is it in? This tells you what biological system might be affected.
  2. What trait or disease does it associate with? Usually multiple, with varying confidence.
  3. What's the effect size? Pay attention to odds ratios and beta coefficients. A "highly significant" association can still have a tiny clinical effect.
  4. How replicated is the finding? One study is preliminary. Multiple studies across independent populations is convincing.
  5. Is this a rare-variant story or a common-variant story? Mendelian disease findings rarely transfer to common GWAS hits.
  6. What about your specific genotype? Many variants only have an effect at homozygous risk; heterozygotes look like reference.
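The six questions above can also be treated as a checklist. Here is a minimal sketch that records what a page told you and reports which questions it left unanswered. The class `SnpPageSummary`, its field names, and the placeholder values are all hypothetical, not a description of how any real SNP page is structured.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class SnpPageSummary:
    """One field per question; None means the page did not answer it."""
    gene: Optional[str] = None
    associated_traits: Optional[List[str]] = None
    effect_size: Optional[float] = None          # e.g. an odds ratio or beta
    replication_count: Optional[int] = None      # independent studies
    variant_class: Optional[str] = None          # "common" or "rare"
    genotype_specific_effect: Optional[str] = None


def unanswered_questions(page):
    """List the checklist fields the page left blank."""
    return [f.name for f in fields(page) if getattr(page, f.name) is None]


# Placeholder values for a page that names a gene and a trait but gives
# no effect size, replication, or genotype-specific detail.
page = SnpPageSummary(
    gene="EXAMPLE_GENE",
    associated_traits=["example trait"],
    variant_class="common",
)
print(unanswered_questions(page))
```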

If a page tells you what to do without addressing those questions, treat it skeptically. The biology is not that simple. The honest answer for most SNPs is "interesting research signal, no clinical action warranted at this time." We don't prescribe, we describe. That posture, plus evidence quality always visible on every page, is the whole reason this platform exists.

What to do next

If you're new to all this, start with the variants index and look up a few SNPs you've seen mentioned, such as APOE's rs429358, MTHFR's rs1801133, or FTO's rs9939609. Read the citations. Notice how often the answer is "the evidence is mixed" or "this matters at homozygous risk only." That nuance is the point.

If you've got a raw DNA file from a consumer test, you can upload it to Expressive and we'll map your variants against the literature. No medical advice. Just the research, transparently.

