Sharing Large Segments With a Match Does Not Validate Small Segments Shared With That Match

OK, that could be one of the worst blog titles I’ve written, but it’s intentional. When people share this post, I want the title to clearly convey the lesson.

Small Segments are Poison

We know that many small segments are false, and thus that many distant matches are false positives. I have written about small segments and distant matches many times. For a few background articles, see the following:

The (most current as of September 2017) definitive article on the nature of false versus true small segments is “Reducing Pervasive False-Positive Identical-by-Descent Segments Detected by Large-Scale Pedigree Analysis.” The paper is available online for free (http://mbe.oxfordjournals.org/content/31/8/2212). In the paper, the researchers found that more than 67% of all reported segments shorter than 4 cM are false-positive segments. At least 60% of 4 cM segments were false-positive, and at least 33% of 5 cM segments were false-positive. The number of false-positives decreased fairly rapidly above 5 cM. See my analysis of this paper here. ... Click to read more!
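To make those percentages concrete, here is a minimal Python sketch that turns the quoted lower bounds into a minimum expected number of false segments in a list. The function names, the whole-cM bucketing, and the sample lengths are assumptions made for illustration and do not come from the paper. Because the paper gives these figures as "more than" and "at least," the result is a floor rather than an estimate.

```python
# Lower-bound false-positive rates quoted in the paragraph above.
# Bucketing by whole cM is an assumption made for this sketch.
MIN_FALSE_RATE = {
    "under_4": 0.67,  # "more than 67%" of segments shorter than 4 cM
    "4": 0.60,        # "at least 60%" of 4 cM segments
    "5": 0.33,        # "at least 33%" of 5 cM segments
}

def bucket(cm):
    """Return the rate bucket for a segment length, or None above 5 cM."""
    if cm < 4:
        return "under_4"
    if cm < 5:
        return "4"
    if cm < 6:
        return "5"
    return None  # no figure is quoted above for larger segments

def min_expected_false(segment_lengths_cm):
    """Sum the lower-bound false-positive rates over a list of segments."""
    total = 0.0
    for cm in segment_lengths_cm:
        b = bucket(cm)
        if b is not None:
            total += MIN_FALSE_RATE[b]
    return total

# Hypothetical segment lengths in cM, not real data.
example = [3.2, 3.9, 4.5, 5.1, 5.8, 12.0]
print(round(min_expected_false(example), 2))
```

Segments above 5 cM contribute nothing here only because no rate is quoted for them, not because they are guaranteed to be valid.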

How Do DNA Segments Get Smaller?

Many genetic genealogists, myself included, often talk about DNA segments getting “broken up” or “broken down” as they are passed from one generation to the next. But this language can be misleading, since DNA isn’t really “broken up” into pieces when it is passed down; instead, a few pieces are traded between nonsister chromosomes in a process called RECOMBINATION.

Genetic recombination is a process of crossover between chromosomes during MEIOSIS (meiosis = a very specialized cell division that creates eggs and sperm for reproduction). Very early in meiosis, the cells duplicate the chromosomes. Normally, every cell has 23 pairs of chromosomes, for a total of 46 chromosomes. However, in the first step of meiosis, the chromosomes are duplicated to result in a total of 92 chromosomes. There are 4 copies of chromosome 1 (2 copies of the chromosome you got from your mother, and 2 copies of the chromosome you got from your father). There are 4 copies of chromosome 2, and so on. ... Click to read more!

Small Matching Segments – Examining Hypotheses

Last week I published “Small Matching Segments – Friend or Foe?” to join in the community’s conversation about the use of “small” segments of DNA, referring to segments 5 cM and smaller (although keep in mind that the term “small,” without a more specific definition, will mean different things to different people).

The question that the community has been struggling with is whether small segments of DNA can be used as genealogical evidence, and if so, how they can be used.

As I wrote in my post, a significant percentage of small segments are false positives, with the rate at least 33% and likely much higher. In my examination and in the Durand paper I discuss, a false positive is defined as a small segment that a child shares with a match but that neither of the child’s parents shares with that match. ... Click to read more!

Small Matching Segments – Friend or Foe?

There has been a great deal of conversation in the genetic genealogy community over the past couple of weeks about the use of “small” segments of matching DNA. Typically, the term “small” refers to segments of 5 cM and smaller, although some people include segments of 7 cM or even 10 cM and smaller in the definition.

The question, essentially, is whether small segments of DNA can be used as genealogical evidence, and if so, how they can be used.

While it may seem at first that all shared segments of DNA could constitute genealogical evidence, unfortunately some small segments are IBS, creating “false positive” matches for reasons other than recent ancestry. These segments sometimes match because of lack of phasing, phasing errors, or a variety of other reasons. One thing, however, is clear: there is no debate in the genetic genealogy community that many small segments are false positive matches. There IS debate, however, regarding the rate of false positive matches, and what that means for the use of small segments as genealogical evidence. ... Click to read more!

A Small Segment Round-Up

If you aren’t already a member of the coolest Facebook group ever, Genetic Genealogy Tips & Techniques, you really should be! We have a friendly and engaging environment, and everyone learns something new every day!

This post is meant to answer a question or issue that is raised almost daily in the group, and that is the issue of small shared DNA segments. Although these small segments are alluring, they are the mythological sirens of the genealogical world!

Small Segments Executive Summary

Here’s a bite-sized summary of the content below:

  • Many to most small segments (7 cM and smaller, at least) are FALSE, meaning they are NOT actually shared by the two matches, and therefore do NOT indicate shared ancestry;
  • This is supported by a 2014 paper by 23andMe scientists showing that at least 33% of 5 cM phased DNA segments are false-positive (and it’s much worse for unphased segments or segments smaller than 5 cM);
  • This is further supported by evidence that anywhere from 20-35% of distant matches at a testing company are not shared with either tested parent;
  • This is further supported by evidence that phasing your DNA with two tested parents significantly reduces the number of matches below 10 cM (with proportionally more matches reduced as the segment size gets smaller);
  • There is currently no evidence that triangulating segments or finding a paper trail provides a mechanism for distinguishing between false segments and valid segments;
  • Since we can’t tell the difference between false small segments and valid small segments, we must avoid these small segments to avoid poisoning our genealogical conclusions with false data; and
  • Beware any research or conclusion that uses these small segments without specifically addressing the issues that are known – based on all the scientific research and evidence gathered to date – to surround small segments.
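The practical upshot of the summary above can be sketched in a few lines of Python: set small segments aside before drawing any conclusions. The 7 cM default follows the definition used in this post. The tuple layout (match name, segment length in cM) and the sample data are hypothetical.

```python
SMALL_SEGMENT_CM = 7.0  # threshold used in this post; others prefer 5 or 10

def drop_small_segments(segments, threshold_cm=SMALL_SEGMENT_CM):
    """Keep only segments strictly larger than the threshold."""
    return [(name, cm) for name, cm in segments if cm > threshold_cm]

# Hypothetical (match name, cM) pairs, not real data.
segments = [("Match A", 23.4), ("Match B", 6.8), ("Match C", 7.0), ("Match D", 10.2)]
kept = drop_small_segments(segments)
print(kept)
```

Passing threshold_cm=10.0 gives the more conservative cutoff, and the comparison is strict so that a segment of exactly 7 cM still counts as small.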

If you’re interested in learning more, keep reading!

Small Segments In Detail

One of the most common questions in the group has to do with small segments. There’s no exact definition of “small” when it comes to small segments, but many of us define them as being a single segment of DNA of 7 cM or smaller. Others use 5 cM or smaller, while others use 10 cM or smaller. Personally, I consider segments of 7 cM or less to be “small,” although when I’m being very conservative I use a definition of 10 cM or smaller. ... Click to read more!

How Many Segments Do You Share?

I have told people in the past that we share a single segment of meaningful IBD DNA with the vast majority of our genetic matches (where IBD means Identity-by-Descent, or a valid matching segment of DNA from a recent genealogical relationship). I usually say that we share a single segment of DNA with 99% of our matches, but that’s been an off-the-cuff estimate. I wanted to have better data to cite, so I took a closer look at this issue.

At FTDNA, you can download a list of all of your matches:

I downloaded my list and removed all of my targeted test-takers (anyone that I tested or I asked to test). These close test-takers would skew the data.

After removing them from my match list, I have a total of 2,491 matches at Family Tree DNA.

Family Tree DNA also allows you to download a list of all the segments you share with your matches: ... Click to read more!
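For readers who want to try this on their own data, the counting step might look something like the Python sketch below. It is only an illustration under stated assumptions: the column name "Match Name" and the inline sample rows are made up, so check the header row of your own segment download and adjust accordingly.

```python
import csv
import io
from collections import Counter

def share_single_segment(rows, name_column="Match Name"):
    """Return (matches with exactly one segment, total matches)."""
    per_match = Counter(row[name_column] for row in rows)
    single = sum(1 for n in per_match.values() if n == 1)
    return single, len(per_match)

# Hypothetical stand-in for a downloaded segment file.
sample = io.StringIO(
    "Match Name,Chromosome,Centimorgans\n"
    "Alice,1,12.5\n"
    "Bob,3,9.1\n"
    "Bob,7,8.4\n"
    "Carol,11,15.0\n"
)
rows = list(csv.DictReader(sample))
single, total = share_single_segment(rows)
print(f"{single} of {total} matches share a single segment")
```

This counts every segment in the file. To count only segments likely to be valid, small segments would have to be filtered out first, and targeted test-takers removed as described above.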

TGG’s Top Posts in 2017

I started The Genetic Genealogist on February 12, 2007 with my first post, “New estimates for the arrival of the earliest Native Americans.” There were few educational resources for genetic genealogy back then, and all testing was Y-DNA and mtDNA. Although 23andMe would launch the first large-scale atDNA test a few months later in November of 2007 (see “23andMe Launches Their Personal Genome Service” announcing the $1,000 test), it would be a couple of years until they used the results for cousin matching. Today, almost 11 years later, there are 617 posts with more than 310,000 words.

Here’s a screenshot from the blog in December 2007:

This year I posted about 30 times about a wide variety of topics. Here are the most popular posts in 2017: ... Click to read more!

The Effect of Phasing on Reducing False Distant Matches (Or, Phasing a Parent Using GEDmatch)

Genealogical autosomal DNA evidence relies on segments of DNA shared between two or more individuals. When they are true matching segments, they provide information about shared ancestry. One problem that genealogists are currently facing is the inability to distinguish between “real” or “true” matching segments and “false” segments.

I won’t get too much into all the different terminology of “real” versus “false” here, because it isn’t important and takes away from the more important discussion. Genealogists, like patent attorneys, can be their own lexicographer, so long as they provide a good definition that the reader can understand. So here are my definitions for this post (and I typically use these elsewhere): ... Click to read more!

The Danger of Distant Matches

We know that small segments shared between two individuals can be problematic (see Small Matching Segments – Friend or Foe?), whether the two individuals are closely related or distantly related (or not related at all, as we’ll see). I call small segments (which I usually classify as 5 cM or less) POISON because it is currently impossible to tell which are real segments and which are not.

In the following analysis, I use the wonderful new Match-O-Match tool at DNAGedcom to compare my and my parents’ match lists from AncestryDNA. The Match-O-Match tool is a powerful spreadsheet analysis tool developed by Don Worth. It is available to DNAGedcom subscribers as part of the DNAGedcom Client. For more, see page 10 of the PDF HERE. Thank you Don for this great new tool! ... Click to read more!
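The underlying comparison boils down to set arithmetic, as the Python sketch below shows. It does not reproduce how Match-O-Match works internally and only illustrates the idea of finding a child's matches that appear on neither parent's list. The function name and match identifiers are made up.

```python
def matches_missing_from_parents(child, mother, father):
    """Return the child's matches that appear on neither parent's list."""
    return set(child) - set(mother) - set(father)

# Hypothetical match identifiers, not real data.
child = {"m1", "m2", "m3", "m4", "m5"}
mother = {"m1", "m2", "m9"}
father = {"m3", "m8"}

missing = matches_missing_from_parents(child, mother, father)
rate = len(missing) / len(child)
print(sorted(missing), f"{rate:.0%}")
```

A match that the child has but neither parent has cannot reflect DNA inherited from those parents, which is why such matches are treated as false positives.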

The DNA Era of Genealogy

When does DNA prove a relationship? When is a triangulation group large enough to prove descent from an ancestral couple? When is a shared DNA segment large enough to prove someone is your first (or second/third/fourth, etc.) cousin? At what point does the DNA prove that I am descended from Samuel Snell? When does the DNA prove that you’ve found your great-grandmother’s biological parents?

NEVER.

And this is, perhaps, one of the greatest misconceptions in the post-DNA era of genealogy.

What is Proof?

Genealogy is the study of lives and relationships. Accordingly, genealogists spend much of their time identifying, hypothesizing, supporting, and sometimes rejecting, relationships.

Unless you have direct knowledge of a relationship (and even sometimes when you do), you identify relationships using evidence that you’ve gathered from multiple different sources (including DNA, census, land, tax, vital, and many other types of records). ... Click to read more!