The Genetic Genealogist

Adding DNA to the Genealogist's Toolbox

Archive for 2011


23andMe Announces 80x Exome Sequencing for $999

Yesterday, at Health 2.0 in San Francisco, 23andMe announced that it will be offering sequencing of exomes with 80x coverage for $999.  At Exome 80x, 23andMe discusses their test:

Your exome is the 50 million DNA bases of your genome containing the information necessary to encode all your proteins. Informally, you can think of the exome as the DNA sequence of your genes.

Your entire genome is made up of your exome plus other DNA, consisting of three billion bases with repetitive sequences, sequences of unknown function, and DNA that does not code for proteins.

Note that the Exome 80x test is only available to current customers, and is determined on a “first come, first served” basis.  Further, test-takers will initially only receive their raw data of 50 million DNA bases at 80x coverage, but 23andMe plans to develop new tools to take advantage of exome sequencing.

The Exome?

Many non-geneticists will no doubt be wondering what the “exome” really is.  The exome is the protein-coding portion of your genome, and comprises roughly 1.5% of the total genome.
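As a rough check on these figures, the arithmetic can be sketched in a few lines of Python. The sizes are the round numbers quoted by 23andMe above (50 million exome bases, three billion genome bases), and the variable names are only for illustration.

```python
# Round figures quoted by 23andMe above; actual sizes depend on how the
# exome is defined and which regions a given test targets.
EXOME_BASES = 50_000_000
GENOME_BASES = 3_000_000_000
COVERAGE = 80  # "80x" means each base is read about 80 times on average

exome_fraction = EXOME_BASES / GENOME_BASES
sequenced_bases = EXOME_BASES * COVERAGE

print(f"Exome share of the genome: {exome_fraction:.1%}")
print(f"Total bases read at 80x:   {sequenced_bases:,}")
```

With these round numbers the exome comes out closer to 1.7% than 1.5%; the gap most likely reflects differences in what is counted as "exome" rather than an error in either figure.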

For insight into what type of information might be gleaned from exome data, Daniel MacArthur has an article entitled “Venter’s exome, and the challenge of rare variants for personal genomics” from August, 2008.  In the article, he discusses some of the findings from the analysis of J. Craig Venter’s exome.

The Genealogist’s Exome

As a genetic genealogist, I was of course interested in the ramifications of exome testing on testing for genetic ancestry purposes.  23andMe states the following on their Exome 80x page:

Exome data are less suitable for ancestry or genealogical research, since they will not provide mitochrondrial sequence or much information on the Y chromosome.

This is a strange sentence, and one I believe wasn’t properly screened.  In my experience few genealogists decide to pursue 23andMe testing for the mtDNA or Y-DNA results.  It’s autosomal DNA testing and tools like Ancestry Painting and Relative Finder for which most genealogists use 23andMe testing, and it’s far too early to tell whether genealogists will be able to make use of exome sequencing (of course we will!).

I hope this sentiment does not discourage genetic genealogists from pursuing the Exome 80x product.  Genealogists have been – and continue to be – among the very first adopters of new DTC DNA testing (including 23andMe’s very first product back in the 2007 to 2009 time frame).  Indeed, genealogists have been driving the DTC genetic testing market since 2000 with the launch of Family Tree DNA!

The Possibilities

One of the most exciting uses of the Exome 80x product might be in self-directed discovery of rare variants in genetic disorders.  Numerous rare genetic diseases, many of which likely result from unidentified rare variants, have not been exhaustively studied.  At least one group has estimated that 85% of disease-causing mutations are found in the exome.

I can envision 23andMe Community Projects for rare genetic disorders, similar to its Parkinson’s Community but much smaller in size, where several members of a family purchase the Exome 80x sequencing in an attempt to identify variants that might be involved in the disease.  These projects may be sponsored and supported by 23andMe, or might simply be a family attempting to analyze their genomes themselves.

Other Viewpoints:

Will you be signing up for 23andMe’s Exome 80x product?

NFL Players Xavier Omon and Ogemdi Nwagbuo Confirmed as Half-Brothers

Direct-to-consumer DNA testing has led to the re-joining of yet another family.

Y-DNA and autosomal testing by Family Tree DNA has revealed that two NFL players, Xavier Omon (San Francisco 49ers) and Ogemdi Nwagbuo (San Diego Chargers), are half-brothers.  ESPN has a long write-up of the story at “A brothers’ tale for Omon, Nwagbuo.”

Meeting for the First Time

The brothers had planned to meet face-to-face yesterday, September 1, 2011, as their teams met on the field.  Turns out Omon’s team, the 49ers, were victorious, meaning that if he’s anything like my brothers, he gave Nwagbuo a hard time about it!  The Mercury News has a story about the brothers’ first meeting at “Omon meets half-brother (a Charger) for first time,” and the SF Gate has a story at “49ers’ Xavier Omon meets half-brother.”

Family Tree DNA’s Press Release:

Houston, TX – August 31, 2011 – Family Tree DNA, the pioneer and largest DNA testing company for genealogy purposes, through its Family Finder test, provided the conclusive proof that two NFL players are half-siblings.

Until a few months ago, Xavier Omon, from the San Francisco 49ers and Ogemdi Nwagbuo from the San Diego Chargers did not have a clue that they were related. Early August, at the request of ESPN, Family Tree DNA performed the Family Finder test on both, and the result was unequivocal: definitely half-siblings. More of the story can be found at the ESPN website, under the “Brother’s Tale” story.

The Family Finder test allows connecting with family members across all ancestral lines. While the Y-DNA matches men with a specific paternal line and the mtDNA finds potential relatives only along the maternal line, Family Finder can look for close relationships along all ancestral lines. Anyone, regardless of their gender, may confidently match to male and female cousins from any of their family lines in the past five generations. The science is based on linked blocks of DNA across the 22 autosomal chromosomes that are matched between two people. Based on this concept, Family Tree DNA bioinformatics team has worked extensively to develop the calculations that would yield the closeness of the relationship. The possibilities to find matches abound: grandparents, aunts and uncles; half siblings; first, second, third and fourth cousins; and, more tentatively, fifth cousins.

About Family Tree DNA

Founded in April 2000, Family Tree DNA was the first company to develop the commercial application of DNA testing for genealogical purposes, something that had previously been available only for academic and scientific research. Almost a decade later, the Houston-based company has a database with over 345,000 individual records – the largest DNA database in genetic genealogy, and a number that makes Family Tree DNA the prime source for anyone researching recent and distant family ties. In 2006 Family Tree DNA established a state of the art Genomics Research Center at its headquarters in Houston, Texas, where it currently performs R&D and processes over 200 advanced types of DNA tests for its customers.

Media contact

Sharon Weisz, tel: 323-934-2700; e-mail: Sharon@familytreedna.com

“My Beautiful Genome” by Lone Frank

Lone Frank, a journalist and author with a Ph.D. in neurobiology, has just published her fourth book, entitled “My Beautiful Genome: Exposing Our Genetic Future, One Quirk at a Time” (available for pre-order at Amazon).  A chapter of the book is available here (pdf).

Frank describes her book thusly: “This book is my very personal take on personal genomics. It chronicles my meetings and interviews with leading scientists and lays out the – somtimes [sic] disquieting – discoveries I make in my own genome.”

The book is described as follows at Amazon:

“Internationally acclaimed science writer Lone Frank swabs up her DNA to provide the first truly intimate account of the new science of consumer-led genomics. She challenges the scientists and business mavericks intent on mapping every baby’s genome, ponders the consequences of biological fortune-telling, and prods the psychologists who hope to uncover just how important our environment really is – a quest made all the more gripping as Frank considers her family’s and her own struggles with depression.”

I haven’t read the book myself, although I will soon be receiving a review copy.  Once I’ve finished it, I’ll write more about the book here at the blog. There is a recent write-up of Frank’s experiences at the Daily Mail entitled “If the blues genes fit…”

I’m most interested to see what Frank finds in her genome, and how she interprets and uses her data beyond the interpretation provided by the testing companies.

Family Tree DNA’s 7th International Conference on Genetic Genealogy Announced

Family Tree DNA has announced the 7th Genetic Genealogy Conference for Family Tree DNA Group Administrators, to be held in Houston, Texas on November 5th and 6th, 2011.

Featured speakers at the meeting include the following:

Another interesting speaker at the meeting will be Jessica L. Roberts, J.D., an Assistant Professor of Law at the University of Houston Law Center (recent C.V. here (pdf)).  Although it’s not clear what Roberts will be speaking about, her recent publications (pdf) focus on genetics and the law, including the Genetic Information Nondiscrimination Act.  Kudos to Family Tree DNA for again bringing together a wide array of viewpoints and opinions at the conference.

——————————————————-

Unfortunately I will be unable to attend the conference this year, although I made it last year and hope to make it to the next conference.  I look forward to live-tweeting of the conference, which is the next best thing to being there.  Are you attending the conference?

Interpretome: New Analysis Software for Autosomal Testing Results

Daniel MacArthur tweeted this morning about “Interpretome,” which is browser-based software that can be used to examine autosomal testing results from 23andMe and Lumigenix.  There is also an interesting blog post about the software at the blog of Konrad J. Karczewski, one of the co-creators of the software, and one by Daniel at Genomes Unzipped.

Users load their raw data files, and then can use that information to explore their genome.  There are a number of different exercises that a user can run through with their data, including health issues (diabetes, warfarin sensitivity, many other diseases, etc.), ancestry analyses, and determination of “Neanderthal SNPs,” which are SNPs that have been suggested to derive from Neanderthal ancestry (note that this science is still VERY early stage and subject to change OFTEN!).

There are two features that I find very interesting.  First, there is an “Advanced Settings” tab where users can make several important adjustments to their analysis.  Second, the site allows for “imputation” when looking up a SNP, which means that “If the SNP is not found in your file, the utility attempts to impute the SNP using Hapmap data.”  I haven’t tried this yet, but it will be interesting to see how well it works.
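For readers curious what a SNP lookup against a raw data file involves, here is a minimal Python sketch. It assumes the tab-separated layout of 23andMe raw data exports (comment lines beginning with `#`, then rsid, chromosome, position and genotype columns). It is not Interpretome's code and makes no attempt at imputation; the function names and the rsids in the sample are made up for illustration.

```python
def load_genotypes(lines):
    """Parse raw-data lines into {rsid: (chromosome, position, genotype)}."""
    genotypes = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        rsid, chromosome, position, genotype = line.split("\t")
        genotypes[rsid] = (chromosome, int(position), genotype)
    return genotypes


def lookup(genotypes, rsid):
    """Return the genotype call for an rsid, or None if it was not tested."""
    entry = genotypes.get(rsid)
    return entry[2] if entry else None


# Hypothetical sample data; these rsids are placeholders, not real SNPs.
sample = """# rsid\tchromosome\tposition\tgenotype
rs0000001\t1\t1000\tAG
rs0000002\t2\t2000\tCC
""".splitlines()

snps = load_genotypes(sample)
print(lookup(snps, "rs0000001"))  # a SNP present in the file
print(lookup(snps, "rs9999999"))  # a SNP absent from the file
```

A SNP that comes back as missing is exactly the case where a tool would have to fall back on imputation from reference data.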

Ancestry Information

Interpretome allows users to create, among other things, an “Ancestry Painting” using either HapMap2 or HapMap3 data.  For example, using the HapMap2 data, my Interpretome ancestry painting is very similar to my 23andMe ancestry painting.  For those who aren’t familiar, here are the HapMap2 populations (HapMap3 can be found here):

YRI (Ibadan, Nigeria)

CEU (Northern/western Europe)

CHB+JPT (Beijing, China and Tokyo, Japan)

Medically-Relevant Information and Privacy Issues

The creators of Interpretome do address several issues, including the medical information controversy:

No information should be considered diagnostic and as with any genetic testing service, the interpretation is not regulated by the FDA.

And the important privacy issue:

Your genome will not be sent to any server, it remains on your computer. This website will make requests to a database that only contain “rsid” (without genotypes) and “population” (self-reported in the top-right) information. At no point will any genotypes be sent across the wires (all computation will be done in the browser).

However, the creators do go on to note that some exercises have the option of submitting personal information, which “will be anonymously stored on a secure server.”  So be cautious if you’re worried about privacy, as with any testing or analysis service.  As my genome is public domain, I’m not concerned.

Family Tree DNA Results?

For fun, I also tried my Family Tree DNA results.  Since FTDNA raw data results do not contain most, if any, medically-relevant SNPs, most of the “exercises” were fruitless.  I did have some luck with the ancestry sections, although I have yet to compare my 23andMe analysis with my FTDNA analysis to determine if there is consistency.

My Genome Online – A Challenge To You

As you may have heard, I recently made my 23andMe and Family Tree DNA autosomal testing results available for download online at “mygenotype,” and dedicated the information to the public domain (if dedicating DNA sequence to the public domain is even possible – I’m currently doing some research in this area and expect to write more in the future).

At “mygenotype” you can download the following:

My Family Tree DNA Results:

  1. Affymetrix Autosomal DNA Results (2010)
  2. Affymetrix X-Chromosome DNA Results (2010)
  3. Illumina Autosomal DNA Results (2011)
  4. Illumina X-Chromosome DNA Results (2011)

My 23andMe Results:

  1. V2 Results (2008)
  2. V3 Results (2010)
  3. Y-DNA Results (2010)
  4. mtDNA Results (2010)

You can also find my SNPedia Promethease reports:

In addition to my genome, Razib Khan of Gene Expression has a spreadsheet of approximately 48 other genomes that are available for download online.

A Challenge To YOU

Now that the information is out there, available to anyone who might be interested, it remains to be seen who might be interested in the information.

Indeed, as evidenced by Razib’s spreadsheet, while dedicating a genome to the public domain has only been done by a small handful of people worldwide, it isn’t as novel as it was just a few months ago.

So, I’m challenging everyone who reads this to download my data and analyze it to find the most interesting or surprising results.  For example, you could use my most recent 23andMe V3 data.

I’ve already done a fair amount of analysis myself, including the Promethease reports above (and see here), and a recent blog post about my vastly increased Type 2 Diabetes risk.  However, perhaps there’s a recent but relatively obscure study that applies, or perhaps there’s a story you can weave with a handful of SNPs. Or, even better, what can you tell me about my ancestry other than mtDNA and Y-DNA haplogroups? Don’t worry about the strength of the study, reproducibility, etc. – I’m aware of the uncertainties associated with this type of research, and my goal here is to make people aware of possibilities.

Please post your findings in the comments below, and in two weeks I’ll pick the most surprising or interesting findings and make them the focus of a new blog post.

Can you surprise me with my own genome?

Using Autosomal DNA Testing to Identify An Adoptee’s Roots

The Mystery

Helen Marley Johnson, my great-grandmother, was born to unidentified parents on March 3, 1889, in Oswego County, New York.  Although I didn’t really know Marley, I remember meeting her when I was very, very young, just before she died in 1983.

Copyright Blaine T. Bettinger

Marley lived in Oswego and Jefferson counties for all her long life.  She was married twice, had two children, and today has numerous descendants located throughout the United States and the world.  However, by the time Marley was 13 years old, she had been adopted by at least three different families, eventually marrying into the last family that adopted her.

Since I began my genealogical research more than 20 years ago, I’ve worked to find the parents of Marley Johnson, without much success.  I have a plethora of data about the entire remainder of her life, but almost nothing about her ancestry.  For example, although I’ve found her birth certificate, it lists her mother as Minerva Johnson (a name that may or may not be real, and which I’ve found nothing on) and lists her father as “unknown.”

Autosomal DNA

Autosomal DNA testing presents the most promising new avenue of research into Marley’s ancestry.  Unfortunately, both of Marley’s children have been dead for more than 30 years.  However, Marley has several living grandchildren, including my father and a first cousin named Edgar (name changed for privacy reasons).  By comparing the autosomal results of my father with those of his first cousin, it is possible to identify stretches of their DNA that they inherited from Marley and her husband Frank Bettinger.  Here’s why:

Both my father and Edgar are grandchildren of Marley and Frank: my father is the son of Marley and Frank’s son, and Edgar is the son of Marley and Frank’s daughter.  Approximately 25% of my father’s DNA comes from Marley, and approximately 25% of Edgar’s DNA comes from Marley.  Although it is not the same 25% in both cousins (because the children inherited random pieces of Marley’s DNA and then passed on random pieces of that DNA to their children), it is statistically nearly certain that they will share some of Marley’s DNA.  Indeed, first cousins are predicted to share 12.5% of their DNA, about half of it through each shared grandparent (roughly 6.25% of their DNA shared through Marley, and 6.25% shared through Frank).  Both will have much more DNA from these ancestors, but it won’t be shared between them.
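The percentages above follow from halving at each generation, and can be sketched in Python. This is a simplified expectation that ignores the randomness of recombination; the function name and parameters are just illustrative.

```python
def expected_shared_fraction(generations_a, generations_b, common_ancestors=2):
    """Expected fraction of autosomal DNA shared by two relatives.

    generations_a, generations_b: number of generations from each person
        up to the shared ancestor(s).
    common_ancestors: 2 for a shared couple (such as Marley and Frank),
        1 for half relationships.
    """
    # Each generation passes on half of a parent's DNA, so the chance that
    # both relatives carry the same piece from one ancestor halves per step.
    per_ancestor = 0.5 ** (generations_a + generations_b)
    return common_ancestors * per_ancestor


# First cousins: each is two generations below the shared grandparents.
print(expected_shared_fraction(2, 2))                      # 0.125
# Share through a single grandparent (e.g. Marley alone).
print(expected_shared_fraction(2, 2, common_ancestors=1))  # 0.0625
# First cousins once removed: two generations on one side, three on the other.
print(expected_shared_fraction(2, 3))                      # 0.0625
```

These are averages only; any real pair of cousins will land somewhat above or below them.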

By comparing the autosomal DNA testing results of my father with Edgar, it will be possible to identify the DNA that they have in common.  Because they only share Marley and her husband Frank as ancestors (an important assumption here), then any DNA they have in common must be DNA that they inherited from Frank and Marley.

Of course, this is dependent upon Edgar and me not sharing any DNA from other ancestors, for example on my maternal side.  If we shared other ancestors, it would be much more difficult (but not impossible) to identify which DNA came from which ancestors.  However, given Edgar’s paternal ancestry – the side which does not involve Frank and Marley – this is exceedingly unlikely (but will be kept in mind during future analysis).

Results

I now have autosomal DNA results for Edgar and myself using Family Tree DNA’s Family Finder, and more specifically using their new Illumina OmniExpress chip.  The figure below highlights the regions of our genomes where we share at least 3 cM stretches of DNA.

Note that I’ve used my DNA for this test, rather than my father’s, simply because I have yet to test my father.  The numbers change slightly, as I’m predicted to share 6.25% of my DNA with Edgar, my first cousin once removed.  We share about 333 cM (268 million base pairs), which I’ve calculated to be about 4.4% of our genomes (please chime in if you think this estimate is incorrect, as I haven’t had sufficient time to explore it).
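One way to sanity-check that estimate is to divide the shared base pairs by the size of a full (two-copy) genome. The Python sketch below assumes the round figure of three billion bases per copy quoted earlier on this page, and it may not be the exact calculation behind the 4.4% figure.

```python
HAPLOID_GENOME_BP = 3_000_000_000  # round figure; an assumption


def shared_percent_by_bp(shared_bp, haploid_genome_bp=HAPLOID_GENOME_BP):
    """Percent of DNA shared, given the total length of matching segments.

    A matching segment sits on one of the two copies of a chromosome, so
    the denominator is twice the haploid genome size. Under this convention
    a parent and child, who match along one full copy, come out at 50%.
    """
    return 100 * shared_bp / (2 * haploid_genome_bp)


observed = shared_percent_by_bp(268_000_000)
expected = 6.25  # predicted for first cousins once removed

print(f"Observed: {observed:.2f}%  Expected: {expected:.2f}%")
```

With a three-billion-base genome this gives a little under 4.5%; a slightly larger genome size pulls it toward 4.4%. Either way the observed value is somewhat below the 6.25% prediction, which is within the normal spread for this relationship.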

With this map and the data that comes from it, I’ve identified portions of my genome (and Edgar’s) that come from Marley and Frank.  Although I don’t know which portions came from whom, I have a wealth of information I can now use to explore our shared ancestry.

Now What?

So now what?  Now, I wait for matches shared by Edgar and me, people who share one or more of these stretches of DNA.  Currently, we do not share any matches.  If another individual shares a piece of the identified DNA, it is likely that they are related through Frank and Marley.  As I have a great deal of information about Frank’s ancestry, I can try to narrow down the matches to Marley’s ancestry.  This, of course, presents one of the biggest challenges of this approach.
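Checking whether a new match falls on one of the identified stretches is a simple interval comparison, sketched below in Python. The segment coordinates are invented for illustration and the `Segment` type and function name are hypothetical; real segment data would come from a testing company's chromosome browser export.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    chromosome: str
    start_bp: int
    end_bp: int


def overlapping_regions(known: List[Segment], candidate: List[Segment],
                        min_overlap_bp: int = 1) -> List[Segment]:
    """Return the regions where a candidate's segments overlap known ones."""
    hits = []
    for k in known:
        for c in candidate:
            if k.chromosome != c.chromosome:
                continue
            start = max(k.start_bp, c.start_bp)
            end = min(k.end_bp, c.end_bp)
            if end - start >= min_overlap_bp:
                hits.append(Segment(k.chromosome, start, end))
    return hits


# Made-up example: segments shared with Edgar, and a new match's segments.
shared_with_edgar = [Segment("1", 10_000_000, 25_000_000),
                     Segment("7", 50_000_000, 62_000_000)]
new_match = [Segment("1", 20_000_000, 30_000_000),
             Segment("9", 1_000_000, 5_000_000)]

print(overlapping_regions(shared_with_edgar, new_match))
```

An overlap by position alone is not proof of a shared ancestor, since the match could sit on the other parental copy of the chromosome; the new person should also show up as a match to both Edgar and me on that stretch.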

Further, identifying relatives is only the very first – and the easiest – step.  Once I have identified someone who might be Marley’s biological relative, I have to obtain as much of their genealogical tree as they are willing to share in order to mine it for information.  I will be looking for families that lived in or migrated through the Upstate New York area in the early 1880s.  Of course, I must consider all the descendants of any potential relatives as well.

Yes, it’s a great deal of work, and there is no guarantee that I will ever identify a link.  For example, what if John Doe, Marley’s father, took an undocumented vacation in Upstate New York to visit his best friend and had a fling with Marley’s mother?  I may not be able to uncover that connection either in paper records or in DNA, at least for now.

My best bet is to accumulate as much information as possible – paper records, DNA, gedcoms, family trees, etc. – and slowly create a web of paper and DNA.  This web will undoubtedly slowly reveal overlapping information that hints at Marley’s ancestry.  For example, there may only be one potential male individual who possesses DNA from family X, DNA from family Y, and DNA from family Z, all of which Marley inherited and which Edgar and I share.  A needle in a haystack, but an exciting possibility nonetheless.

The Future

In the future, I can attempt to mine existing genomes for more data, for example by comparing the DNA of my father’s siblings with Edgar’s.  Statistically, they will share different portions of their genome with Edgar, allowing me to more completely identify the DNA in Edgar’s genome that came from Frank and Marley.  Since Edgar is the extent of the other line, and Marley’s children are dead, this is the best I can currently do (until I can sequence Marley’s DNA directly from the stamps and letters she licked and I’ve saved).

Conclusion

Essentially, using autosomal DNA testing and the approach described above, I have re-created portions of my great-grandparents’ genomes by identifying bits and pieces of their DNA in living individuals. What an exciting time to be a genealogist.

Now let me know, do you have any tips or suggestions for me as I continue my hunt for Marley’s parents?  If so, please share them below.

DNA Heritage Ceases Operations and Transfers Database to Family Tree DNA

DNA Heritage, a popular genetic genealogy company founded in 2002, has ceased operations (although pending orders will be fulfilled).  The company’s website announced today that it is in the process of transferring its database and domains to Family Tree DNA.

Family Tree DNA, meanwhile, has announced that records in the DNA Heritage database will only be placed into FTDNA’s database if the owner agrees to opt-in.  FTDNA has a series of FAQs related to the transfer available here.

The full text of the announcement is below:

As of April 19 2011, DNA Heritage has ceased its operations and is in the process of transferring the domains DNAHeritage.com and Ybase.org to Family Tree DNA.

All the tests in progress will be processed by our current lab and the results will be delivered to our customers.

In order to ensure the continuity of the existing surname projects Family Tree DNA will study the best options to integrate our customers’ results into their database. Once Family Tree DNA decides on the option(s), our customers will be given the opportunity to opt-in to their database.

If you have questions about the transition or need to place an order please check: http://www.familytreedna.com/landing/dna-heritage.aspx

New Report for the Department of Defense Recommends Genomic Sequencing of Troops

An independent group of scientists has recommended that the Department of Defense (“DoD”) obtain and sequence the genomes of members of the military.

JASON, a group of between 30 and 60 scientists that was created in 1960 and advises the U.S. government on scientific and technological issues, authored the report entitled “The $100 Genome: Implications for the DoD” (pdf), which was released on January 13, 2011.

In the report, the scientists provided the following recommendation:

“The DoD should establish policies that result in the collection of genotype and phenotype data, the application of bioinformatics tools to support the health and effectiveness of military personnel, and the resolution of ethical and social issues that arise from these activities. The DoD and the VA should affiliate with or stand up a genotype/phenotype analysis program that addresses their respective needs. Waiting even two years to initiate this process may place them unrecoverably behind in the race for personal genomics information and applications.”

It’s good to see acknowledgment in the report of potential ethical issues, but there was no substantive discussion of them.  Deciding to collect DNA and sequence genomes of troops is, quite frankly, a no-brainer, and the report came to all the obvious conclusions.  What the military really requires is a report on how to discover, analyze, and address the myriad ethical issues associated with the obvious decision to sequence genomes.

A news article published yesterday in nextgov (“Report urges Defense to collect genome data on all troops”) discusses a few of the potential ethical issues, and includes a few quotes from me:

“According to Blaine Bettinger, a Syracuse, N.Y.-based intellectual property lawyer who has a doctorate in biochemistry with a concentration in genetics and writes the Genetic Genealogist blog, a mass collection of genome data at Defense could eventually help improve the health of military members and their families. Collecting basic genomic information on such a large population could also “benefit all of humanity,” Bettinger said.

But Bettinger warned that collection of such data also could be used against individuals if, for example, they had conditions the military could cite as a reason to limit their careers.”

I had a few major concerns about the potential ethical issues with this project, including the following:

1) privacy concerns (since anonymity of genomic data, if it’s made public or leaked, is nearly impossible to maintain);

2) sequencing without informed consent of the members of the military (will it be fine print, or explicitly explained?);

3) use as a screening method (either for denying entrance into the military, or used to steer people toward certain careers within the military);

4) and lastly, the unique problems that arise when several generations of a family enlist.  For example, John Doe Jr. enlists and reports that his father is General John Doe Sr.  An army doctor casually glances at the Does’ genome reports on his iPad and says “no he’s not,” since they don’t share any appreciable amount of DNA.

Are there any potential ethical

There is a great potential for good here, and a great potential for harm.  How the military decides to proceed will determine which prevails.