The Genetic Genealogist

Adding DNA to the Genealogist's Toolbox

Archive for April, 2012


Genetic Genealogy and Personal Genomics in the Classroom – Part I

Today’s post is the first in a series of articles about the use of genetic genealogy and personal genomics in the classroom, from the high school to the college level.

Many scientists and health care experts believe that genetics will be a vital component of several facets of our lives in the future, especially in the field of medicine.  Indeed, some consider the study of genetics to be one of the most promising solutions to many of the health dilemmas facing society today, including advancing our understanding of interactions between genetics and the environment.  Accordingly, today’s students should have at least a basic grasp of genetics, and science educators must find innovative ways to share those concepts with their students.

A Need for Genetics Education

Unfortunately, some studies suggest that many of today’s students lack comprehension of some of the most basic concepts in genetics.  See, e.g., Wood-Robinson, C., Lewis, J. and Leach, J. 2000, Young people’s understanding of the nature of genetic information in the cells of an organism, J Bio Educ 35(1):29-36; and Quinn, F., Pegg, J., and Panizzon, D. 2009, First-Year Biology Students’ Understandings of Meiosis: An Investigation Using a Structural Theoretical Framework, Int’l J. Sci Educ 31(10):1279-1305.

While students certainly can learn about genetics through lectures and textbooks, there is little doubt that hands-on experiences help reinforce concepts and may even reach some students who are less likely to learn from passive methods.

Over the next few weeks, we’ll examine several instances of genetic genealogy and/or personal genomics being used in the classroom to examine and reinforce concepts of genetics, race, and ethics, including the following:

  • The Genographic Project in High Schools (Chicago Public Schools, Soldan International High School, Edward Bleeker Middle School, and Olympic High School) (2007-)
  • The Cornell Genetic Ancestry Project (2011)
  • 23andMe Testing for Freshmen at Berkeley (2010)
  • Medical School Testing (SUNY Upstate Medical University and Stanford) (2010-)
  • Anthropology and Genetics at Penn State University (2012)
  • Personal Genetics Education Project (www.pged.com)

Without further ado, let’s begin with the use of genetic genealogy in schools.

The Genographic Project in Middle and High Schools

Each of the school projects below was conducted in conjunction with The Genographic Project, which has done a tremendous job of working with public schools to educate students about their genetic ancestry.

A.  Chicago Public High Schools in 2007

The earliest reference I can find to commercial genetic genealogy being used in the classroom is from 2007, when The Genographic Project (National Geographic, IBM, and Family Tree DNA) donated 150 testing kits to each of five Chicago Public Schools and 50 kits to each of their international partner schools in England, Jordan, France, South Africa, and China (a total of 1,000 kits, priced at approximately $100 each).

According to several reports, the teachers at these schools expected the testing to provide the students with valuable information and experiences:

Parents ‘hear DNA, all they think about is “CSI.” It’s not like that at all,’ said Brian McKay, who teaches European history at the Charles A. Prosser Career Academy and scraped his own cheeks for cells on Tuesday. ‘Our kids are going to get a lot out of this. (Students) are very positive, they’re very excited.’  Source.

Prosser Principal Ken Hunter stated that:

“We are more than excited to help our students learn about our world’s common threads. At Prosser we tell our students to “extend the world”—this project presents them with a wonderful opportunity to make those words come alive in real world application. My teachers are thrilled to be taking part in such a thoughtful learning activity that brings the idea of common ancestry and shared humanity to our students in such a powerful and compelling way. This ‘learning tool’ has really helped make the education experience here at Prosser ‘the stuff that dreams are made of.” Source.

Interestingly, the schools involved were chosen based in part on the diversity of the student population:  “‘Chicago is a melting pot, a multicultural melting pot, it’s a great place to illustrate how interrelated we are,’ [Spencer] Wells said.”

The launch of the project was covered by The Genographic Project itself, and by the media:

Unfortunately I was unable to find any reports of the outcome of the testing, so it’s unclear what lessons the students derived from the experience.

B.  Soldan International High School in St. Louis in 2007

In 2007, forty advanced placement science students at Soldan International High School in St. Louis, Missouri submitted their DNA for testing with the Genographic Project (see “High school students uncover their past through their DNA”).  Several articles also appeared in the St. Louis Post-Dispatch, but are now found only in the newspaper’s archives.

At Discovering Biology in a Digital World, blogger Sandra Porter wrote the following about the Soldan project:

Most science instructors steer clear of these sorts of activities because there is a real possibility that children might learn some things in class that their parents would prefer remain secret. Any science instructor who’s had to find a really creative way to explain why a student has the “wrong blood type” based on their parentage, will appreciate that analyzing Y chromosomes has potential for trouble.  I wonder how the teachers at Soldan will answer those questions.

I actually wrote about this project here on TGG back in 2007, the early days of the blog, partially to address the concerns that were raised (see “Genetic Genealogy in the Classroom”).  As Sandra’s blog post suggested, some were concerned that testing in the classroom had the potential to reveal non-parental events.  To address this issue, I posited the following:

“although there is surely a chance of there BEING a non-parental event in a large group of students, the chance of CONCLUDING that there was a non-parental event is quite small. Most genetic genealogy companies return a list of allele numbers (12 alleles for the Genographic Project) for Y-DNA or a list of mutations for mtDNA along with a probable haplogroup designation. Armed with that knowledge, how is a student going to determine that there was a non-parental event?”

There are certainly some ethical concerns with genetic genealogy testing in the classroom, but non-parental events are unlikely to be of serious concern.

C.  Other Schools

Following the apparent success of the Chicago school experiment, the Genographic Project worked with several other schools in the following years:

  • Newark High School in western New York state (2007) – approximately 10 students contributed DNA to the Genographic Project for testing (see “Connecting Some Dots With Our DNA”);
  • Edward Bleeker Middle School in Flushing, New York (2011) – several sixth graders participated in testing, part of a program in which roughly 400 students at four different New York City public schools traced their ancestry with the Genographic Project (see “Am I Related to Justin Bieber?”);
  • Harlem Children’s Zone in Harlem, NY (2011) – In last Sunday’s episode of “Finding Your Roots with Henry Louis Gates, Jr.,” Gates tested a group of six or so African American students at Harlem Children’s Zone.  The testing appears to have analyzed only their ethnicity, which varied considerably.  What was most interesting, however, was that Gates discussed with them the implications of their testing, and asked for their thoughts after receiving their test results.

A Useful Exercise: Estimating Admixture Before Testing

On Finding Your Roots, Gates also had the students estimate their admixture before they received their results, which is a great way to introduce the scientific and historical concepts associated with admixture testing.  This is a tool that Gates has already used at least once in the series, and I’m sure we’ll see it again.

Another useful component of this exercise might be to have the kids do some preliminary research on their own family tree before estimating their admixture, including research as simple as asking parents and grandparents.  With this information, they could make a more educated estimate of their admixture.

Conclusions

Although using genetic genealogy in the classroom is not new, it hasn’t been used as extensively as it could be.  What suggestions do you have for the successful use of testing in the classroom?

A Review of AncestryDNA – Ancestry.com’s New Autosomal DNA Test

In the past, I’ve reviewed new autosomal DNA testing options offered by 23andMe and Family Tree DNA:

Today, I’m reviewing the new autosomal DNA test from Ancestry.com called “AncestryDNA.” I’ve already written at length about AncestryDNA, so I won’t cover too many of the basics here.  I have an in-depth introduction to the product located at “Ancestry.com’s AncestryDNA Product,” which you might want to check out before or after reading this review in order to gather more information.

AncestryDNA: An Introduction

The introduction page, which appears after clicking on “View Results” on the front page, consists of my Genetic Ethnicity Summary and the Member DNA Matches (which is further broken into close cousins and distant cousins, as discussed in detail below).  Please note that for purposes of this review I’ve removed the identifying information for my genetic matches.

Genetic Ethnicity Summary:

My genetic ethnicity results, which suggest 90% European and 10% Uncertain, are very interesting.  In a recent webinar with the AncestryDNA team, they reported that the genetic ethnicity analysis is still very early in the beta phase, and will continue to be updated and refined as new reference populations are added.  Indeed, I’m predicting that over time as new information is added and the algorithm is refined, some or all of my 10% Uncertain will be categorized (perhaps to reflect my maternal Asian and African contributions, which I’ve written about before), and that some of my 90% European may very well change.

Under a heading “About Your Ethnicity” is a pop-up file with more information about Ancestry.com’s ethnicity estimation algorithm.  In that file, under “Is It Accurate,” for example, Ancestry.com provides the following:

When determining your genetic ethnicity, we hold our process and results to an extremely high standard of accuracy.  Our lab’s analysis uses some of the most advanced equipment and techniques to measure approximately 700,000 points in your genome (with at least a 98% rate of accuracy).  We compare that to one of the most comprehensive and unique collections of genetic signatures from around the world.  And as this collection improves over time, it can only get better.

I’m not sure whether the AncestryDNA test examines only these 700,000 SNPs, or whether it tests more SNPs but is currently using a subset of 700,000 for its analysis.  I’ll try to find this information.

I thought it might be interesting to compare my genetic ethnicity results from the three companies (Ancestry.com, 23andMe, and FTDNA):

Ancestry.com’s AncestryDNA:

  • 78% Scandinavian
  • 12% Central European
  • 10% Uncertain

23andMe’s Ancestry Painting:

  • 98% European
  • 2% Asian
  • <1% African

Family Tree DNA’s Population Finder:

  • 68% European (Northeast European) – Finnish
  • 32% Middle East (Jewish) – Jewish

After reviewing the results one thing is certain: all three companies estimate a strong European contribution to my genome (ranging from 68% to 98%), and two of the three point specifically to Northern Europe (78% Scandinavian from AncestryDNA and 68% Northeast European from FTDNA).  It’s ironic, however, that I have yet to identify a single Northern European ancestor!  I certainly won’t be surprised when one pops up someday.

Clicking on “See Full Results” takes me to a more detailed analysis of my ethnicity results, but not before I click through the following pop-up:

Please keep in mind…Our prediction of your genetic ethnicity is not yet finalized. As we gather more DNA samples and continue our research we expect your ethnicity results to become more accurate and perhaps more detailed.

As I stated above, the ethnicity results are likely to change over time, so be forewarned.

The Full Results page – reproduced below – includes historical and anthropological information about each of the identified regions from your ethnicity profile (Scandinavian and Central European, for me).  It also shows a list of genetic matches who share the relevant region (it’s a long list along the right lower side of the page, but it’s not shown below for privacy reasons).  You can also zoom into the map where ancestors from a tree you’ve linked to your account are displayed.  For example, I have 8 listed in Ireland and 2 in Central Europe.

In summary, Ancestry.com’s AncestryDNA test provides a genetic ethnicity/region calculation based on about 700,000 SNPs and a large collection of both public and proprietary reference databases.  The product can currently categorize DNA into at least 22 different ethnicities/regions, with more to come.  So be prepared for changes to your estimation as their algorithm and databases grow.

Member DNA Matches

Also on the introductory page is a listing of genetic matches.  These are individuals with whom, based on shared segments of DNA, you are predicted to share a common ancestor.  An interesting aspect of the DNA matches list is a sliding scale for the relationship confidence level, which ranges from 99% to 10%:

  • 99% Confidence – Immediate Family
  • 99% Confidence – 1st Cousins
  • 99% Confidence – 2nd Cousins
  • 98% Confidence – 3rd Cousins
  • 96% Confidence – 4th Cousins
  • 50% Confidence – Distant Cousins
  • 20% Confidence – Distant Cousins
  • 10% Confidence – Distant Cousins

Accordingly, the introductory page can be customized to only display cousins of a certain confidence level.  If I reduce the confidence level to 96%, for example, I only have two matches (my two predicted fourth cousins shown in the picture above).
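
The slider's behavior can be sketched in a few lines of Python.  This is only an illustration of the idea of a confidence threshold, not AncestryDNA's actual code: the Match class, its field names, and the sample data are all made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    """One genetic match; all field names here are hypothetical."""
    name: str
    predicted_relationship: str
    confidence: int  # percent, as shown on the sliding scale


def filter_by_confidence(matches, minimum):
    """Keep only matches at or above the chosen confidence level."""
    return [m for m in matches if m.confidence >= minimum]


# Made-up sample data, not real AncestryDNA output.
matches = [
    Match("Match A", "4th Cousin", 96),
    Match("Match B", "4th Cousin", 96),
    Match("Match C", "Distant Cousin", 50),
    Match("Match D", "Distant Cousin", 20),
    Match("Match E", "Distant Cousin", 10),
]

# Sliding the scale down reveals more (but less certain) matches.
for level in (96, 50, 10):
    visible = filter_by_confidence(matches, level)
    print(f"{level}% and above: {len(visible)} match(es)")
```

With this sample data, a 96% setting shows only the two predicted fourth cousins, and each lower setting adds the less confident matches to the list.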

Clicking on the “What Does This Mean” link next to the  possible relationship range on the “Review Matches” page for each genetic cousin (see the figure below) causes the following information to be displayed, along with some nice inheritance charts:

Predicted Relationship Info: FOURTH COUSIN

It’s interesting to note that (at this degree of separation) we are accurately able to predict only about 85% of the possible relatives that are out there—in other words there is a 15% chance that our DNA analysis does NOT recognize an actual relative of yours. One way to be more certain that the DNA testing captures as many relatives as possible is to have multiple members of your immediate family tested.

It is also interesting to note that at this degree of separation we are sometimes wrong in our prediction of a real relationship. We’ve found that for this relationship about 15% of the time we predict a relationship that cannot be found in any family tree.

This provides some interesting insight into AncestryDNA’s matching algorithm and, accordingly, the algorithm’s results.  For example, it’s important to keep in mind that, at the fourth cousin level, there is a roughly 15% chance that an actual relative is not recognized as a match, and that roughly 15% of predicted relationships cannot be found in any family tree.
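
To get a feel for why testing multiple family members helps, here is a rough back-of-the-envelope sketch.  It takes the 85% detection figure quoted above and assumes, purely for illustration, that each tested family member detects a given fourth cousin independently and at the same rate.  That independence is my simplifying assumption and not something Ancestry.com has stated, so treat the numbers as a loose guide at best.

```python
def chance_at_least_one_detects(per_person_rate, testers):
    """Probability that at least one of `testers` people matches a true relative.

    Assumes every tester detects the relative independently and at the
    same rate, which is a simplification made only for this sketch.
    """
    if testers < 1:
        raise ValueError("testers must be at least 1")
    miss_rate = 1.0 - per_person_rate
    # The relative is missed only if every single tester misses them.
    return 1.0 - miss_rate ** testers


# 0.85 is the fourth cousin detection figure quoted by Ancestry.com.
for n in (1, 2, 3):
    print(n, round(chance_at_least_one_detects(0.85, n), 4))
```

Under that assumption, two testers would catch a given fourth cousin nearly 98% of the time, which shows how quickly the odds improve with each additional test.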

As the user slides the scale from 99% down to 10%, more results typically appear.  For example, I currently have two 4th cousins listed as matches, 9 matches with 50% confidence, 14 matches with 20% confidence, and 38 matches with 10% confidence.  I expect these numbers to increase considerably once more test results become available.  I don’t know how big the AncestryDNA database currently is, but I’m guessing that only a few hundred to a few thousand people, at the very most, have undergone testing so far.

Comparing Family Trees

The true power of the AncestryDNA test lies in the ability to automatically compare your uploaded family tree with the uploaded family tree(s) of genetic matches.  For example, one of my predicted fourth cousin matches has a public tree with 408 people.  Clicking on “Review Match” takes me to the next page with more information (see the next screenshot) including each of the following:

  • A predicted relationship and predicted relationship range;
  • Our ethnicity comparison (a very cool and potentially very useful feature);
  • My genetic cousin’s entire tree out to 7 generations (and a link to see more);
  • A possible shared ancestor (a “shaky leaf” hint) if one is identified;
  • Surnames that we share; and
  • My genetic cousin’s surnames through 10 generations.

I especially like the Genetic Ethnicity Bar (I just made that up, but I guess it fits) comparison, which shows your ethnicity prediction next to your match’s ethnicity prediction.  For example, my fourth cousin displayed in the image below is 93% British Isles and 7% Uncertain.  Since I have no reported British Isles genetic contribution, my Genetic Ethnicity Bar is gray:

On the other hand, if there is some matching ethnicity contribution, the Genetic Ethnicity Bar comparison will look like this:

This genetic match and I, predicted to be distant cousins, both have contributions from Central Europe and Scandinavia.  My match also has British Isles and Middle Eastern, which I am estimated not to have.
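
The logic behind the ethnicity comparison can be sketched as a simple overlap of region labels.  This is my own guess at the idea rather than Ancestry.com's method: the shared_regions function is hypothetical, and leaving “Uncertain” out of the overlap is an assumption based on my bar showing gray even though my fourth cousin and I both have an Uncertain portion.  The distant cousin's percentages are invented, since I only know which regions that match has.

```python
def shared_regions(mine, theirs, ignore=("Uncertain",)):
    """Return regions present in both estimates, skipping ignored labels."""
    return sorted(
        region
        for region in mine
        if region in theirs and region not in ignore
    )


# Percentages reported in this review.
me = {"Scandinavian": 78, "Central European": 12, "Uncertain": 10}
fourth_cousin = {"British Isles": 93, "Uncertain": 7}

# Hypothetical percentages: the regions are those named in this review,
# but the numbers are invented for illustration.
distant_cousin = {
    "British Isles": 40,
    "Scandinavian": 30,
    "Central European": 20,
    "Middle Eastern": 10,
}

print(shared_regions(me, fourth_cousin))   # no overlap, so a gray bar
print(shared_regions(me, distant_cousin))  # overlapping regions
```

The first comparison comes back empty, while the second returns Central European and Scandinavian, mirroring the two screenshots described above.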

Also on the “Review Match” page is a link to send a message to the match (very important for genealogists).  I also like the “Last signed in” information, which lets people know just how active a genetic match might be (and why they aren’t answering your email!).

Common Ancestor and Shared Surnames

As can be seen from the last two screenshots, the list of shared surnames (if there are any) is prominently displayed near the top of the page.  If there were an individual in common between our trees, he or she would also be displayed there.  Unfortunately, when I review the match with each of my possible genetic cousins, I typically have one or more shared surnames, but none have a single identified common ancestor.  I was hoping for such a match, but I’ll have to be a bit more patient.  While I currently have about 55 matches, only some of those have public trees, and even fewer have substantial family trees (larger trees increase the likelihood of identifying a possible shared ancestor, of course).

Conclusion

This post included just a few initial thoughts about my testing experience and results.  I may add more information, or create a new post, as I continue to review my results.  If you have any questions about the testing process or ancestry results that I didn’t address, please feel free to leave a comment.  I’m sure many other people have the same question, so don’t hesitate to ask.  I’ll also try to get the AncestryDNA team to answer any questions I can’t answer.

While there is currently no information about pricing or about when AncestryDNA will be available, I’m sure that this information will be released soon.

I’m looking forward to your comments, ideas, and questions.

(Disclosure:  I received my AncestryDNA test without charge from Ancestry.com for review purposes and beta testing.  Regardless, I have attempted to review this product as honestly and as objectively as possible in order to provide valuable information about AncestryDNA to my readers.)

Ancestry.com’s AncestryDNA Product

I’ve written before about Ancestry.com’s new AncestryDNA autosomal test.  See, for example:

Webinar with Ancestry.com

Last week, I participated in a webinar with Ancestry.com regarding the AncestryDNA test (although, unfortunately, I had to leave a bit early due to a previous engagement).  The participants were a great group of about 10 well-known genealogy bloggers, each one of whom is someone I’ve been reading or following for years.  It was an honor to be included among them.

One of the participants was CeCe Moore of Your Genetic Genealogist.  CeCe has a nice summary of the webinar and the important points about the autosomal test and the user interface at “New Information on Ancestry.com’s AncestryDNA Product.”  If you’re interested in autosomal DNA testing, or in Ancestry.com, I highly recommend reading her post.

The Power of DNA

The highlight of the webinar – and of the AncestryDNA product – was the combination of DNA and family trees.  I’ve said before that the ability to combine DNA and the paper trail is the future of genetic genealogy, and the true power of DNA.

The AncestryDNA test automatically compares your family tree (if you have one hosted at Ancestry.com) to the family tree of your genetic matches (if they have one hosted at Ancestry.com, and if it’s public).  The user interface then suggests overlapping individuals that might be the source of the shared DNA!  The user interface presents this information as a “Potential Common Ancestor,” and provides it as a “shaky leaf” hint.  Thus, as with all shaky leaf hints, it should be subjected to further research and not blindly accepted.
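
For the curious, the basic idea of comparing two trees can be sketched as follows.  This is a deliberately naive illustration and not Ancestry.com's algorithm, which I have no inside knowledge of: the Person class, both functions, and the sample trees are hypothetical, and a real comparison would need to cope with spelling variants, missing dates, and conflicting records.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    """A tree entry; real trees carry far more detail than this."""
    given_name: str
    surname: str
    birth_year: int


def shared_surnames(tree_a, tree_b):
    """Surnames that appear in both trees, compared case-insensitively."""
    a = {p.surname.lower() for p in tree_a}
    b = {p.surname.lower() for p in tree_b}
    return sorted(a & b)


def potential_common_ancestors(tree_a, tree_b):
    """People who appear, with identical details, in both trees."""
    overlap = set(tree_a) & set(tree_b)
    return sorted(overlap, key=lambda p: (p.surname, p.given_name))


# Made-up trees for illustration only.
my_tree = [
    Person("John", "Smith", 1790),
    Person("Mary", "White", 1795),
    Person("Anna", "Larsen", 1820),
]
match_tree = [
    Person("John", "Smith", 1790),
    Person("Peter", "smith", 1850),
    Person("Ruth", "Baker", 1801),
]

print(shared_surnames(my_tree, match_tree))
print(potential_common_ancestors(my_tree, match_tree))
```

Exact matching like this is the simplest possible design, and it is also why any overlap it finds is only a hint: two trees can contain the same name and birth year for entirely different people.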

You can also see the first 7 generations of each genetic match in your user interface (again, if their tree is public), another great benefit.

While there are of course MANY caveats to this matching algorithm, it eliminates a time-consuming step in sharing information with genetic matches, as many of us know from [many hours of] experience.  (I didn’t get a chance to ask if the matching algorithm takes into account the predicted relationship range of the genetic cousins being matched, but I’ll try to get that information for you.)

If you think about it for a moment, the power of this approach is mind-boggling.  Over time it will create a mesh of DNA and genealogies, with individual data points that can be confirmed or rejected based on the results of numerous test-takers.  In other words, there will be an enormous DNA family tree.  Not only that, but that enormous DNA family tree can then be used to test genealogical hypotheses (was John Smith’s mother a White?  was John Smith Jr. adopted? etc…).  While a long way down the road, the possibilities are endless.

Concerns About Combining DNA and Family Trees

I know there is a lot of criticism and concern about the quality of third-party genealogies on Ancestry.com.  It’s impossible to know just how subjective or objective the data in any given tree is.  It’s true that there will always be concerns about third-party genealogies, and that there will be many, many errors as genealogists begin to tie DNA to specific ancestors.

But these concerns are equally true for paper records.  Any time you tie a paper record to a certain individual in your family tree, there’s a serious possibility of error, and this error can be propagated throughout numerous genealogies.  Every genealogist has seen this before, probably many times. But the fact that we’ve recognized the error likely means that the error has been corrected through careful research.

There is nothing different or exceptional about tying DNA to ancestors.  Any time you tie a piece of DNA to a certain individual in your family tree, there’s a serious possibility of error.  Over time, however, careful and methodical research – likely contributed by many different test-takers – will allow genealogists to make the most reasoned and knowledgeable judgment.

There’s enormous power in numbers.

A Roundup of AncestryDNA Posts

Here’s a complete roundup of posts around the genealogy blogosphere about Ancestry.com’s new Autosomal DNA product (AncestryDNA):

Did I miss any?  Feel free to mention them below.

Disclosure: I received a free beta test from Ancestry.com, although I have not yet received my results (I will receive them this week, I believe).  However, I have tried to review this product objectively.