The Genographic Project Public Participation Mitochondrial DNA Database

The Genographic Project is probably the largest genetic genealogy project in the world. For $99, the project will sequence segments of either your mtDNA or your Y chromosome for addition to their publicly available database. The goal of the project, with ten research centers around the world, is to “map humanity’s genetic journey through the ages,” and to “address anthropological questions on a global scale using genetics as a tool.” There has been a huge response to this project, and they just released their first research paper using the results they have collected to date:

“Family Tree DNA is proud to announce that the first paper resulting from data collected through the Genographic Project has been published today at the PLOS GENETICS. “The Genographic Project Public Participation Mitochondrial DNA Database” can be found at http://genetics.plosjournals.org and it will be uploaded to the Family Tree DNA public library as well.

The paper resulted from the collaboration of the Genographic Project Scientific Team, Family Tree DNA Genomics Research Center, and the IBM Data Analytics Research Group.”


This paper is all about the mtDNA sequences they have obtained through the project. In the first 18 months of the project, they have collected an amazing 78,590 mtDNA genotypes! In the paper, they describe their genotyping parameters (i.e., how they go about sequencing the mtDNA), the frequency of each haplogroup in the database (for instance, 38.2% of the database is Haplogroup H!), and their attempt to identify any potential Neanderthal contribution to the database (there isn’t any).
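To put that percentage in perspective, here is a quick back-of-the-envelope sketch in Python. It uses only the two figures quoted above (78,590 genotypes and 38.2% Haplogroup H); the function names are my own, and because the percentage is rounded, the resulting count is an approximation rather than the exact number reported in the paper.

```python
# Rough arithmetic on the two figures quoted in the post.
# The 38.2% share is rounded, so the count below is only approximate.
TOTAL_GENOTYPES = 78_590      # mtDNA genotypes collected in the first 18 months
HAPLOGROUP_H_SHARE = 0.382    # share of the database assigned to Haplogroup H


def approximate_count(total, share):
    """Estimate how many samples a given share of the database represents."""
    return round(total * share)


def frequency_percent(count, total):
    """Convert a sample count back into a percentage of the database."""
    return 100 * count / total


approx_h = approximate_count(TOTAL_GENOTYPES, HAPLOGROUP_H_SHARE)
print(f"Roughly {approx_h:,} of {TOTAL_GENOTYPES:,} genotypes are Haplogroup H")
print(f"Check: {frequency_percent(approx_h, TOTAL_GENOTYPES):.1f}%")
```

In other words, on the order of thirty thousand participants fall into Haplogroup H alone.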

The researchers also list a few goals for the future of the project and the scientific community as a whole:

“First, as sequencing procedures have become more efficient and stretches of 600 bp can easily be obtained, we suggest standardizing the reported ‘‘HVS-I’’ range to include positions 16024–16569 as presented herein.”

“Second, it would be worthwhile to create a standard list of coding-region SNPs used by the scientific community for Hg assignment.”

Third, the project should actively recruit samples from people in non-Western populations “to properly survey the genetic variation in non-Western Eurasian lineages.”
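For the curious, the HVS-I range proposed in the first suggestion above is easy to sanity-check. The short Python sketch below only counts positions in the quoted range 16024–16569 and tests whether a given position falls inside it; the names are hypothetical and chosen for illustration, not taken from the paper.

```python
# Boundaries of the proposed HVS-I reporting range, as quoted from the paper.
HVS1_START = 16024
HVS1_END = 16569


def span_length(start, end):
    """Number of positions in an inclusive range of mtDNA coordinates."""
    return end - start + 1


def in_hvs1(position):
    """True if a position lies inside the proposed HVS-I range."""
    return HVS1_START <= position <= HVS1_END


print(span_length(HVS1_START, HVS1_END), "positions in the proposed range")
print(in_hvs1(16126), in_hvs1(73))
```

The range covers 546 positions, which fits comfortably within the roughly 600 bp stretches the authors say can now be sequenced easily.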

So what is the take-home message from this new paper? That the Genographic Database is a valuable, standardized database for geneticists, genealogists, anthropologists, and other -ists. The last paragraph of the study states: “In summary, we report both data and new classification methods developed using by far the largest standardized mtDNA database yet created, and detail the logistic, scientific, and public considerations unique to the Genographic Project. Most importantly, we return to the public a database made possible by their enthusiastic participation in the Genographic Project.”

Here’s Figure 4 from the paper, a phylogenetic tree of mtDNA haplogroups, with the number of each haplogroup represented in the database:


(Note that PLoS uses the Creative Commons Attribution License for all their papers, meaning that the public is free to, among other things, “copy, distribute, display, and perform the work”, as well as “make derivative works,” as long as the user gives the original author and source credit. Thus, the above figure comes from:

Behar DM, Rosset S, Blue-Smith J, Balanovsky O, Tzur S, et al. The Genographic Project Public Participation Mitochondrial DNA Database. PLoS Genetics Vol. 3, No. 6, e104. doi:10.1371/journal.pgen.0030104

This is, of course, another great reason to love and support open-access journals such as PLoS.)

Blaine Bettinger

Intellectual property attorney, genealogist, and author of The Genetic Genealogist since 2007


  2. I would like to take the test – How do I?
    Where do I go OR How do I find the DNA test???
