Accuracy of Large-Scale Genome Scanning Services

Although the genome scanning services offered by companies such as 23andMe, deCODEme, and SeqWright have been front and center in the press over the last few weeks, I’m sure that the following information will not be included in any of those reports.

Comparisons

Two different sources have concluded that the scanning services offered by 23andMe and deCODEme, which use different types of Illumina SNP chips, are highly reproducible. In January 2008, Ann Turner compared the results of testing at deCODEme and 23andMe, and found that of the 560,163 SNPs that overlapped and had a “call” (meaning there was a measurable result), the two services agreed on 560,128 and disagreed on 35. Turner wrote:

In all of [the disagreed calls], one company would make a homozygous call while the other company made a heterozygous call – there were no cases where they made a completely discordant call. All in all, I’d say that is pretty impressive.

The second analysis comes from Antonio C B Oliveira at Longa Vista, a new blog that appears to have been created to present these results and related information. Oliveira obtained results from both 23andMe and deCODEme and compared them; the comparison is available here. He concluded that of the 560,299 SNPs that overlapped and had a call, the two scans agreed on 560,276 and disagreed on 23. The 23 discordant SNPs are listed by chromosome. Oliveira writes:

This error rate seems to me to be quite acceptable and I wonder if this is the rate expected in scientific studies using the same technology.
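To put the two comparisons on a common scale, the counts quoted above can be converted into rates. The following is a minimal Python sketch that uses only the figures from Turner and Oliveira; the function and variable names are my own.

```python
# Counts quoted in this post: (overlapping SNPs with a call, discordant calls)
comparisons = {
    "Turner": (560_163, 35),
    "Oliveira": (560_299, 23),
}

def discordance_rate(total_calls, discordant):
    """Fraction of overlapping, called SNPs on which the two services disagree."""
    return discordant / total_calls

for name, (total, discordant) in comparisons.items():
    rate = discordance_rate(total, discordant)
    print(f"{name}: {total - discordant:,} concordant, "
          f"{rate:.4%} discordant (about 1 in {round(total / discordant):,})")
```

Both rates come out on the order of a few thousandths of a percent (roughly 0.006% for Turner and 0.004% for Oliveira). Note that this measures disagreement between the two services, not the error rate of either one against a known reference.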

Program to Compare Your Results

Interestingly, Oliveira created a computer program to analyze the results for him, and he has graciously made that program available “as a Windows executable and the source code is provided under the GNU General Public License.”
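For readers curious about what such a comparison involves, here is a rough Python sketch. To be clear, this is not Oliveira’s program, and I have not modeled either company’s actual export format. It assumes the genotypes have already been loaded into dictionaries keyed by SNP identifier, that a no-call is written as “--”, and that both services report alleles on the same strand. All names and the toy data are made up for illustration.

```python
NO_CALL = "--"  # assumed placeholder for a no-call; real exports may differ

def normalize(genotype):
    """Sort the two alleles so that 'GA' and 'AG' compare equal."""
    return "".join(sorted(genotype))

def is_homozygous(genotype):
    """True when both alleles of a two-letter genotype are the same."""
    return len(genotype) == 2 and genotype[0] == genotype[1]

def compare_calls(calls_a, calls_b):
    """Compare two {snp_id: genotype} dicts over SNPs called by both services."""
    summary = {"agree": 0, "hom_vs_het": 0, "other_mismatch": 0}
    mismatches = []
    # Only SNPs present in both data sets can be compared.
    for snp_id in calls_a.keys() & calls_b.keys():
        a, b = calls_a[snp_id], calls_b[snp_id]
        if NO_CALL in (a, b):
            continue  # skip if either service made no call
        a, b = normalize(a), normalize(b)
        if a == b:
            summary["agree"] += 1
        elif is_homozygous(a) != is_homozygous(b):
            # One service called homozygous, the other heterozygous.
            summary["hom_vs_het"] += 1
            mismatches.append(snp_id)
        else:
            summary["other_mismatch"] += 1
            mismatches.append(snp_id)
    return summary, sorted(mismatches)

# Made-up toy data, for illustration only
service_a = {"rs1": "AG", "rs2": "CC", "rs3": "TT", "rs4": "--", "rs5": "AA"}
service_b = {"rs1": "GA", "rs2": "CT", "rs3": "TT", "rs4": "GG", "rs6": "CC"}

summary, mismatches = compare_calls(service_a, service_b)
print(summary)
print(mismatches)
```

The “hom_vs_het” bucket corresponds to the kind of disagreement Turner describes above, where one company makes a homozygous call and the other a heterozygous call. A real comparison would also need to handle strand differences between the two chips, which this sketch ignores.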

Conclusions/Thoughts

Note that Oliveira’s comparison contained 136 more overlapping SNPs with a call than Turner’s (560,299 versus 560,163), presumably because of fewer no-calls in the data. Is Illumina able to produce more calls as they gain experience with the process, or is this an expected amount of variation from person to person? I would be interested to see more results and comparisons to determine the answer to this question.

HT: Genetic Future. If you’re interested in genome sequencing or personalized genomics, you should be reading Genetic Future. I highly recommend adding the feed to your reader. Genetic Future gave a hat tip about this information to Kevin Kelly at The Quantified Self. There, Kelly points out that none of the SNPs in Oliveira’s analysis are currently associated with any physical phenotype or disease. I hope Kelly plans to do a comparative analysis of his results, as that would be an interesting addition to the information provided by Turner and Oliveira.

14 Responses

  1. Pingback: Synthesis
  2. JHealey 5 May 2008 / 8:32 pm

    In conflict with the title, “Accuracy of Large-Scale Genome Scanning Services”, I don’t see evidence that 560,000+ calls were indeed correct. Instead, I acknowleging that the results correlate well. This may seem like a small point, but as both methods are similar, concluding they are indeed accurate because they give common results is faulty and misleading.

  3. JHealey 5 May 2008 / 8:34 pm

    Sorry, “acknowledging” should have been “acknowledge” [Note to self: must read full text before hitting “Post Comment” button]

  4. Blaine Bettinger 5 May 2008 / 9:31 pm

    JHealey, you are absolutely right that there is a very big difference between the concepts of accuracy and reproducibility. As you’ll note, however, the article itself clearly discusses reproducibility and does not make any assumption or conclusion about accuracy (i.e. whether or not these results reflect the individual’s actual genetic code).

    Regarding the title, I would argue that reproducibility is one of the factors involved in ultimately determining accuracy, and that this data begins to shed some light on the topic.

  5. JHealey 6 May 2008 / 4:23 pm

    Thanks Blaine, no need to argue

    I would only add that the title should reflect the primary content or purpose of the text. The title clearly points to accuracy and the text, as you noted, “does not make any assumption or conclusion about accuracy”. If I am incorrect, please allow me to send you a copy of the Bible retitled as “The Evolution of the Species”! 😀
