I will attempt to summarize here the news and interesting tidbits from the conference using my notes. (I was writing fast, so please comment if something doesn't make sense or needs clarification.) I have added Tim Janzen's comments below. [TJ]
• FTDNA has the largest collection of full mitochondrial sequences in the world.
• FTDNA and Archives.com have entered into a partnership to integrate resources on the FTDNA website to facilitate family tree uploads and research.
• FTDNA has tested 600,000 people to date.
• John Spottiswood told us that Archives.com, through a partnership with Family Search, will offer the 1940 census images April 2, 2012. They will have the index done by the end of 2012, shooting for October. 6.5 million dollars have been invested in new records. They have 400,000 active subscribers. Archives.com has 18 of the top 20 collections at Ancestry. All conference attendees received a free 1 year+ membership to Archives.com. It is regularly $39.95 per year.
Spencer Wells and News from The Genographic Project:
• They are wrapping up Phase 1 of The Genographic Project by the end of next year and beginning Phase 2, which is leveraging the resources gathered from Phase 1.
• The Genographic Project has collected 75,000 samples from indigenous peoples in more than 130 countries from ~1000 populations.
• The majority of Canadian native tribes refused to DNA test. As a result, the project still does not have adequate sample coverage of the indigenous peoples of North America. South America is better.
• The National Genographic Project sold 10,000 kits the first day and 100,000 kits in the first 8 months with over 415,000 kits sold worldwide to date.
• The Project has raised 3 million dollars for its Legacy Fund and given away 1.5 million dollars of that so far, funding 52 grants currently with ten more being funded in the next couple of weeks.
• Two papers on Basque DNA are coming out this week. One is on full mtDNA sequences and one is on Y-DNA. "Tracks pre-Roman tribal culture".
• We are losing a language in the world every two weeks!
• Until recently there was no genetic evidence of Asian impact on Hungarian DNA. The Project is now seeing 2%-3% Asian haplogroups in both mtDNA and Y-DNA in the 2334 Hungarian samples from the public. (Originally only sampled 100 indigenous Hungarians.) [Tim Janzen notes that it was, more specifically, 3% of the Y-DNA and 2% of the mtDNA.]
• DNA evidence is showing that the Indian caste system is older than Indo-European influence. (paper)
• They are doing things with ancient DNA that 10 years ago were impossible, that they "wouldn't have dreamed of doing".
• A paper is coming next year on ancient DNA research "transecting time", including information on farmers replacing hunter-gatherers in Central Germany and mtDNA Haplogroup U5, which Spencer called "the hunter-gatherer haplogroup". They found different frequencies of haplogroups from samples at different layers. He says that the debate about the age of R1b has not yet been resolved. Spencer commented that outstanding issues about STR mutation rates are "not helping" and that we still "have to figure out Y-STR mutation rates."
• There is a correlation between linguistic dates and genealogical dates. Dr. Wells feels that the evolutionary rate overestimates these. He stated that the genealogical rates are more accurate over relatively short time spans and the evolutionary rate works better for deeper time spans.
• 1 in 17 men now living in the Mediterranean descend from Phoenician traders.
• 2000 Caucasus language speakers were sampled, finding a "remarkable concordance between genetic contrasts and language groups."
• The Project is starting to look at autosomal DNA. They are first looking at the X-Chromosome as a new genetic marker. They are using the pattern of recombination to infer history, an approach called the "Theory of Junctions." (paper)
• Over 100,000 of the Genographic participants have converted their results to FTDNA.
• In Central Asia Y-DNA Haplogroup R1a has a high frequency - 40%-60% in some areas. R1a's frequency is higher in the mountains than in the plains on the same longitude. There is a ring pattern where R1a virtually disappears in the middle. According to Dr. Wells, this "central hole" was probably created by a replacement of R1a by East Asian haplogroups entering through the Dzhungarian gap.
• East Asian expansion corresponds to rivers as borders instead of the mountains.
• East Indians have the most Eurasian diversity.
• The project would like to look at the Australian "song lines" to determine if there is an overlap between where the songs intersect with other tribes and the genes.
• 10 papers are going to journals in the next two weeks with about a dozen more in the pipeline.
• The goal is to get the genetic information collected out to us "citizen scientists" for public participation. Spencer was not able to give a timeline as to when the database will be available.
• A big announcement is due next year from The Genographic Project.
Spencer Wells' slide, photo courtesy Katherine Borges
Bruce Walsh covered DNA basics, including useful autosomal DNA information:
• Each human cell has 46 chromosomes and multiple copies of mtDNA
• 1 cM (centimorgan) = a 1% chance of recombination, which roughly corresponds to one million DNA base pairs.
• 1st-3rd generations can be estimated simply by the shared percent of DNA. Distant relatives are estimated by the largest block of shared DNA. There is a wide variation expected for the more distant relatives. Odds are that for relatives greater than 5 generations apart (10 total), all shared blocks have been lost.
• For Family Finder:
TMRCA   Average Size of Blocks
1       44.06 cM
2       19.15 cM
3       12.3 cM
4       9.07 cM
5       7.19 cM
6       5.95 cM
7       5.08 cM
• Some autosomal DNA is dominant and will be passed down for a greater number of generations than expected.
• Looking for a common ancestor at 5 cM - 7 cM shared blocks is "deep sea fishing". It is not strong evidence. At 7 cM there is about a 50/50 chance that the segment is identical by descent and there is a shared common ancestor in genealogical times. At 10 cM it is safe to assume that there is a common ancestor in genealogical times.
• The male X chromosome is phased since there is only one allele.
• Bruce Walsh discussed phasing and various other topics. FTDNA is exploring the option of phasing data where two parents and at least one child have done FTDNA's Family Finder test or the 23andMe v3 test and using the phased data to run comparisons against other people in the FF database. The use of phased data in Family Finder would significantly reduce the number of matches that are simply identical by state. [TJ]
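To make the block-size guidance above easier to apply, here is a minimal Python sketch. It only restates the table and the 5 cM / 7 cM / 10 cM rules of thumb from my notes. The function names are hypothetical, the wording for blocks between 7 and 10 cM is my own assumption, and finding the nearest table row is a simple lookup, not a statistical estimate of the relationship.

```python
# Average shared block size (cM) by TMRCA, copied from the Family Finder table above.
AVG_BLOCK_CM = {1: 44.06, 2: 19.15, 3: 12.3, 4: 9.07, 5: 7.19, 6: 5.95, 7: 5.08}


def describe_block(cm: float) -> str:
    """Hypothetical helper: rough reading of one shared block, per the notes above."""
    if cm >= 10:
        return "likely common ancestor in genealogical times"
    if cm >= 7:
        # The notes give "about 50/50" at 7 cM; treating 7-10 cM alike is an assumption.
        return "roughly 50/50 identical by descent"
    if cm >= 5:
        return "deep sea fishing: weak evidence"
    return "smaller than the sizes discussed"


def nearest_table_row(cm: float) -> int:
    """Return the TMRCA row whose average block size is closest to cm (lookup only)."""
    return min(AVG_BLOCK_CM, key=lambda row: abs(AVG_BLOCK_CM[row] - cm))


if __name__ == "__main__":
    for size in (5.5, 7.0, 12.0):
        print(size, describe_block(size), nearest_table_row(size))
```

Because wide variation is expected for distant relatives, the nearest row should be read as "this block looks like a typical block from that row", nothing more.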
Phasing and Analysis of Family Finder Data from David Pike:
• Phasing is separating the alleles to distinguish which are inherited from each parent.
• David gave us a hands-on demonstration on using a number of his tools for analyzing Family Finder raw data, available here. These include:
- Search for Runs of Homozygosity (ROHs)
- Search for Heterozygous Sequences
- Search for Shared DNA Strands in Two Raw Data Files
- Inspect a Shared DNA Strand in Two Raw Data Files
- Inspect Shared DNA Strands in a Trio of Raw Data Files
- Search for Discordant SNPs in Parent-Child Raw Data Files
- Search for Discordant SNPs when given data for child and both parents
- Search for Differently Reported SNPs
- Phase a Child when given data for child and both parents
- Phase Siblings with Data from One Parent
- Phase Siblings with Data from Both Parents
• A deceased parent's genomic data can be reconstructed from testing the other parent and the children.
• Cousins' data can also be utilized to help phase portions of relatives' genome.
• In particular, I enjoyed his discussion of microdeletions of autosomal DNA segments, which can generally be found by checking for discordant data. [TJ]
Very enthusiastic presentation!
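For readers who want to see what phasing a child with both parents' data involves, here is a minimal Python sketch of the general idea. It is not David Pike's code and does not reflect how his tools are implemented. The function name, the two-letter genotype strings, and the SNP labels are all hypothetical, and real raw data files need parsing first.

```python
def phase_snp(child: str, father: str, mother: str):
    """Phase one SNP for a child using both parents' genotypes.

    Genotypes are two-letter strings such as "AG" (assumed format).
    Returns a (paternal_allele, maternal_allele) tuple, or one of the
    strings "no-call", "ambiguous", or "discordant".
    """
    if "-" in child + father + mother:
        return "no-call"
    a, b = child
    options = set()
    # Try both ways of assigning the child's two alleles to the parents.
    for paternal, maternal in ((a, b), (b, a)):
        if paternal in father and maternal in mother:
            options.add((paternal, maternal))
    if not options:
        # No assignment fits: a discordant SNP (test error or, possibly, a deletion).
        return "discordant"
    if len(options) == 1:
        return options.pop()
    # All three people are heterozygous for the same alleles.
    return "ambiguous"


if __name__ == "__main__":
    # Made-up example genotypes: SNP label -> (child, father, mother).
    trio = {
        "snp1": ("AG", "AA", "GG"),
        "snp2": ("AG", "AG", "AG"),
        "snp3": ("AA", "GG", "AA"),
    }
    for name, (c, f, m) in trio.items():
        print(name, phase_snp(c, f, m))
```

Runs of discordant results in one region are the kind of pattern Tim mentions above as a possible sign of a microdeletion. A single discordant SNP is more likely just a reporting difference.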
Thomas Krahn and News from Walk Through The Y:
• 366 participants, 125.8 million basepairs sequenced, 180,000 bp average coverage per participant, 450 undocumented new Y-SNPs have been found, 167 participants did not find a new SNP.
• 90% of participants have chosen their results be public on the Finch2 platform.
• Very advanced customers have mined the data from the 1000 Genomes Project and extracted new SNP candidates (Z series), suggesting promising markers and helping to design the primers.
• Covered new features on the draft tree.
• Currently 350k-400k base pairs for Walk Through The Y, 20 times more data via the new Roche 454 - ran Saturday night.
• Thomas Krahn summarized the latest Roche 454 sequencer Y chromosome sequencing results. He is doing Y chromosome enrichment of the DNA prior to sequencing so that he can maximize the Y chromosome sequence data from each sequencing run. In his latest run he tested 8 samples, but only 2 came out reasonably well. He plans to reduce the number of beads he uses in the sequencer and he hopes that will improve the quality of his data. In the latest experiment he got about 19,000 reads from one sample, of which about 48% of the reads were from the Y chromosome after Y enrichment. The average read length was in the 400-600 base pair range. Thomas plans to put the latest sequencing results on his FTP server as a downloadable file of about 300 million megabytes of data for Y SNP hunters to review. Thomas plans to continue to work on Y sequencing until he can perfect the sequencing. Thomas said that there are about 20 million base pairs on the Y that are worth sequencing. The first 2 million base pairs on the p arm are pseudoautosomal and thus aren't helpful from a Y SNP search perspective. The palindromic regions also generally don't have many Y SNPs. The new 454 sequencer will allow about 20 times as many bases to be sequenced as can be done with the WTY project currently. Now the WTY results generally include about 400,000 base pairs. Thus Thomas anticipates at least 6-8 million base pairs of the Y chromosome can be sequenced with the new 454 sequencer in the short term and hopefully about 20 million base pairs can be sequenced in the long term. [TJ]
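The numbers in Tim's summary can be checked with some quick arithmetic. The Python sketch below uses only the figures quoted above. The function names are hypothetical, and the result is raw sequenced bases, not unique positions covered, since I am assuming that reads can overlap.

```python
def y_bases(reads: int, y_percent: int, read_length: int) -> int:
    """Raw bases from Y chromosome reads: reads x share from the Y x read length."""
    y_reads = reads * y_percent // 100
    return y_reads * read_length


def projected_bases(current_bases: int, scale: int) -> int:
    """Bases expected if the new sequencer yields `scale` times the current amount."""
    return current_bases * scale


if __name__ == "__main__":
    # About 19,000 reads, about 48% from the Y, reads of 400-600 base pairs.
    low = y_bases(19_000, 48, 400)
    high = y_bases(19_000, 48, 600)
    print(f"Y reads yield roughly {low:,} to {high:,} raw bases")

    # WTY currently covers about 400,000 base pairs; the 454 allows about 20x.
    print(f"20x the current WTY coverage: {projected_bases(400_000, 20):,} bases")
```

The 20x figure works out to 8 million base pairs, which matches the top of the 6-8 million range that Thomas anticipates in the short term.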
Thanks to Dr. Krahn, the full presentation is available here. The download is near the bottom and called "Walk Through The Y - Update 2011".
Peter Hrechdakian and Findings from the Armenian DNA Project:
• Tested over 600 Armenians since 2009 in the Armenian DNA Project.
• Armenians are very diverse with 14 major Y-DNA haplogroups, 80 distinct Y-DNA subclades and 13 major mtDNA Haplogroups with 67 subclades.
• Armenians and Assyrians have very similar YDNA and mtDNA distribution patterns.
• There have been 13 Armenian Walk Through The Y participants so far. New SNPs have been found in 10 of them.
• Please visit the Houshamadyan Project website. This project is attempting to reconstruct Ottoman Armenian town and village life.
Peter Hrechdakian's slide, photo courtesy Katherine Borges
Stephen Morse explained how to use his One Step Webpages:
• Dr. Stephen Morse discussed many of the "one step" tools that he has available on his web site at www.stevemorse.org. It had been some years since I had last visited his web site and I was pleased to learn that he has added some DNA tools to his web site, including a genetic distance calculator. Some other web sites he mentioned that can be helpful for finding living people include www.PrivateEye.com and www.ZabaSearch.com. [TJ]
Question and Answer Panel:
• Julie Hill walked us through the Archives.com site.
• Instead of importing Gedcoms to FTDNA, users will be able to link through to their live family tree on Archives.com.
• In November/December 2008, FTDNA started advertising on Facebook, offering 12 markers for $59. They sold ZERO tests from these ads.
• In October 2010, FTDNA started a Facebook page. They have had their biggest ever promotions there and have ~16,000 "likes". Facebook users are now regularly demanding promotions.
• Bennett - "23andMe raw data uploads will be coming in the next 4-6 weeks for about $50." (v3 only)
• Sometime next year all Genographic Project samples will be destroyed in line with their original terms if not transferred to FTDNA. It will take about a year to destroy them all. The Genographic Project has made the FTDNA logo larger and larger on the bottom of their page, but they cannot directly contact participants due to their anonymous collection method.
• FTDNA is considering extending sample storage from 25 to 50 years.
• FTDNA will have some presentations from the conference available for download.
• FTDNA has tested samples for Family Finder that were up to 8 years old. Some have worked, some have not. The Illumina chip is much more robust than the Affy chip was with a 99% success rate the first time a sample is run. Bennett "would be more on the liberal side about trying old samples than 6-8 months ago".
• FTDNA will allow uploads of Ancestry.com autosomal data, but will not provide customer support for it.
• Recommended reading: "The Great Human Diaspora" by Cavalli-Sforza.
• Archives.com does not yet have searchable family trees or the New York passenger lists. They are working on adding these.
You can find Day Two here.
[Disclosure - my company StudioINTV has an existing production agreement with FTDNA that has no bearing on the opinions I express. I also receive a small commission from FTDNA on non-sale orders through my affiliate link, which I use to fund DNA tests. I receive no other compensation in relation to any of the companies or products referenced in my blog.]