Thursday, March 31, 2011

Can mtDNA give clues to Sarah Stolebarger's identity?

I have long wondered about who my 3rd great grandmother Sarah Stolebarger really was. Sarah was born about 1802 and by 1823 she was married to John Stolebarger and living in Union, Huntingdon County, PA. Her maiden name, country of origin and date of birth are all unknown.  I wrote about the details of my genealogical research here.

My mother's male first cousin is a direct maternal line descendant of Sarah, so I decided to test his mtDNA to see if I might get lucky and find something meaningful. Since mtDNA is not often helpful for genealogy, I know that this is a long shot.

Yesterday, I received the HVR1 results:

Haplogroup I 

HVR1 differences from CRS:
16093C
16129A
16218T
16223T
16263C
16391A
16519C

I don't have any experience with mtDNA Haplogroup I, so I have been doing some research. According to Wikipedia, it is quite rare and found in low levels throughout Europe, the Middle East and South Asia. Reviewing the mtDNA I Haplogroup Project, I did not find any matches.

According to FTDNA, there is one exact match (HVR1). I contacted her and learned that she is also researching a Sarah (maiden name unknown) from the same area and time period as my Sarah. We both suspect that our Sarahs were of German ancestry, but have no documentation of their origins or parentage. Hopefully, this match will bear fruit in the future, but for now both genealogical brick walls stand.
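For readers curious what an "exact match" on HVR1 amounts to, here is a minimal sketch. It assumes that an exact match simply means two people have the identical list of differences from the CRS; FTDNA's actual matching rules are not described in this post. The function names and the second profile are made up for illustration.

```python
# HVR1 differences from the CRS, copied from the results listed above.
MY_HVR1 = {"16093C", "16129A", "16218T", "16223T", "16263C", "16391A", "16519C"}


def hvr1_differences(profile_a, profile_b):
    """Return the mutations that appear in only one of the two profiles."""
    return sorted(set(profile_a) ^ set(profile_b))


def is_exact_match(profile_a, profile_b):
    """Assumption: an exact match means no differing mutations at all."""
    return not hvr1_differences(profile_a, profile_b)


# Hypothetical second profile, invented purely for illustration:
# identical to mine except that it lacks 16093C.
other_profile = MY_HVR1 - {"16093C"}

print(is_exact_match(MY_HVR1, MY_HVR1))
print(hvr1_differences(MY_HVR1, other_profile))
```

With seven differences from the CRS, every one of them has to line up for a match to count as exact under this assumption.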

I am still waiting for the HVR2 results, but I don't expect to have any new matches at this time since there must be an unusual mutation in HVR1. It will be interesting to see if I can learn anything more specific about Sarah's possible origin when/if a subclade is assigned.

[Disclosure - My company StudioINTV has an existing production agreement with FTDNA that has no bearing on the opinions I express. I also receive a small commission from FTDNA on non-sale orders through my affiliate link, which I use to fund DNA tests. I receive no other compensation in relation to any of the companies or products referenced in my blog.]

Investigating Small Segments of Shared DNA with HIR Search

It can be tricky to determine whether small segments of shared DNA (~5.0 cMs) imply true relatedness or are simply "noise". It is possible that matches of this length are, in fact, Identical By State (IBS) and not Identical By Descent (IBD). In simpler terms, this means that the match could have happened by coincidence and does not mean that there is a common ancestor. Some experts have even predicted that up to one-third of these reported small matches are false positives.

However, in my opinion, sharing multiple small segments of DNA in common with another person does imply relatedness, because the odds that a coincidental ~5 cM match could occur more than once decrease dramatically. With this in mind, I often investigate my small 23andMe matches further to see if there are more segments in common that fall just below 23andMe's reporting threshold for Family Inheritance Advanced (5.0 cMs and 700 SNPs). If my "match" has submitted his/her data to independent projects such as Jim McMillan's DNA Cousins Project (must be logged into 23andMe to access) or HIR Search, then I can easily determine this.
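To put rough numbers on that intuition, here is a back-of-the-envelope sketch. It takes the "up to one-third" false positive figure quoted above as the chance that any single small segment is coincidental, and it assumes that each segment is an independent event. That independence is my simplification, not an established fact, so treat the result as illustrative only. The function name is made up.

```python
def chance_all_coincidental(p_false, n_segments):
    """Chance that every one of n shared segments is a false positive.

    Assumes each segment is an independent event, which is a simplification.
    """
    return p_false ** n_segments


# Using the "up to one-third" figure quoted above as p_false.
for n in range(1, 5):
    print(n, "segment(s):", round(chance_all_coincidental(1 / 3, n), 4))
```

Under these assumptions, the chance that two shared segments are both coincidental is one in nine, and it keeps shrinking with each additional segment.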

As an example, for one person of Finnish ancestry (FM), 23andMe's Family Inheritance Advanced reports matches to my mother on two small segments (5.0 cMs x 2; Chr 4, 5), to my mother's sister on one segment (5.0 cMs; Chr 7) and to my sister on one segment (5.5 cMs; Chr 9).

Chart from 23andMe's Family Inheritance Advanced

These matches all fall on different chromosomes, which on the surface doesn't make sense: my sister must have inherited her segment from my mother, because all of our Finnish matches are, undoubtedly, through her. Since the match is hovering on the border of 23andMe's cut-off, I am betting that my mother does in fact share the match on Chromosome 9 with my sister.

HIR reports 10 matching segments between my mother and FM (> 4 cMs and 500 SNPs). Most of these were not identified by 23andMe due to the SNP count falling below 700. As I suspected, one of these segments is on the same spot of Chromosome 9 that was reported at 23andMe for my sister. HIR calls it at 5.8 cMs for my mother, which is close to the 5.5 cMs that 23andMe reported for my sister, but my mother only shares 638 SNPs on HIR, while on 23andMe, 872 matching SNPs were reported for my sister. This may be an instance where my sister coincidentally gained a strand of matching SNPs from my father's DNA that made the match appear a bit longer (IBS).

23andMe and HIR agree on my mother's matches with FM on Chr 4 and Chr 5. HIR calls them a bit longer, but not significantly so. The match that my aunt shares with FM is not present in my mother's DNA. Interestingly, HIR reports a 7.7 cM match on Chromosome 22 for my mother and FM, but with only 529 SNPs. Notably, 23andMe did not report any matches between FM and myself, while HIR shows 3 matches (> 4 cMs and 500 SNPs). As would be expected, those matches are all in common with my mother, though not completely consistent in reported size. Somewhat surprisingly, two of the matches reported by HIR between FM and myself exceeded 23andMe's threshold by HIR's calculations: 1) 6.1 cMs and 1261 SNPs and 2) 5.2 cMs and 1781 SNPs. It is possible that HIR has a higher tolerance for mismatched SNPs, especially since, in both cases, HIR reported more matching cMs and SNPs with FM for me than for my mother.
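The threshold comparison above can be sketched in a few lines. The cut-offs are the ones quoted in this post (5.0 cMs and 700 SNPs for 23andMe's Family Inheritance Advanced; 4 cMs and 500 SNPs for HIR Search), and the segments are the HIR figures mentioned above. I am assuming each cut-off is inclusive and that both the cM and SNP conditions must be met; the class and function names are made up for illustration. Keep in mind that 23andMe's own calculation for the same segment can differ from HIR's, so this only shows whether HIR's numbers would clear each stated cut-off.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    label: str
    cm: float
    snps: int


# Cut-offs as quoted in this post: (minimum cMs, minimum SNPs).
THRESHOLD_23ANDME = (5.0, 700)
THRESHOLD_HIR = (4.0, 500)


def meets(segment, threshold):
    """Assumption: a segment must reach both the cM and the SNP minimum."""
    min_cm, min_snps = threshold
    return segment.cm >= min_cm and segment.snps >= min_snps


def classify(segment):
    """Say which of the two stated cut-offs the segment would clear."""
    if meets(segment, THRESHOLD_23ANDME):
        return "clears both cut-offs"
    if meets(segment, THRESHOLD_HIR):
        return "clears HIR's cut-off only"
    return "clears neither cut-off"


# HIR-reported segments with FM, as given in the text above.
hir_segments = [
    Segment("mother, Chr 9", 5.8, 638),
    Segment("mother, Chr 22", 7.7, 529),
    Segment("me, match 1", 6.1, 1261),
    Segment("me, match 2", 5.2, 1781),
]

for seg in hir_segments:
    print(f"{seg.label}: {seg.cm} cMs, {seg.snps} SNPs -> {classify(seg)}")
```

This shows why my mother's Chromosome 9 and 22 segments would be invisible at 23andMe: both are long enough in cMs but fall short of 700 SNPs.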

In this case, I can confidently conclude that this is an authentic match. The next step is to share family trees and attempt to find our common ancestor(s). It may be that we share multiple common ancestors that have each contributed one or two of these segments, in which case the connection may be too far back in time to identify. I'm looking forward to finding out.

Friday, March 11, 2011

23andMe Proposes Improvements to Relative Finder

23andMe is addressing many of our concerns with the functionality of Relative Finder and proposing some very welcome improvements. These changes to Relative Finder will enable the active participants to streamline their experience, while minimizing the frustration of seeing pages and pages of unanswered invitations when a user logs into their account.  These proposed improvements include:

1. Adding fields for genealogically relevant information such as historical family locations, family haplogroups and notes (no GEDCOMS yet).
2. Actively encouraging users to fill out their profile information on their first visit to Relative Finder.
3. Because only about 4% of users have made their profile public, they will keep the option to participate anonymously for now, but will prominently place the option to forego the anonymous invitation in the settings dialogue.
4. Those users who fill out their profile will be prominently displayed on the first pages of the List View results.  All close relatives (up to 3rd cousins) will still appear upfront. The rest (those without profiles filled out) will be pushed to the later pages.
5. Profile SmartSearch will search through these matches and automatically highlight any matching profile information.
6. Adding the ability to search and filter matches.
7. Increasing the speed of searches.

23andMe is hoping to improve the Relative Finder experience for those of us who are very active, without scaring off those who may still be on the fence about sharing their genomes. With this in mind, they are looking at ways to better educate their customers in regard to Relative Finder and the privacy concerns associated with it.

This is only meant to be a quick summary since I wanted to share this good news as soon as possible. I will have more specific thoughts on these proposed changes in a later post. I would like to hear your thoughts and suggestions as well. So, please feel free to comment.

Thursday, March 3, 2011

New Air Date for My Japanese TV Interview

The segment on DNA testing on Close-Up Today, NHK Japan Broadcasting Corporation has been preempted, according to the producers, due to the unrest in the Middle East, New Zealand's earthquake and the volcano eruption in Japan. The new air date is April 6th. Hopefully, international events will settle down so they can concentrate on personal genomics. :-)

[Update - I would like to report that all of the NHK production team are safe and the crew is currently in Japan covering the sad events of the last few days. Obviously, my tongue-in-cheek wish for international calm was in vain and I am sure it will be quite a while until this segment can air. My heart goes out to the Japanese people at this very difficult time.]