Thursday, March 31, 2011

Investigating Small Segments of Shared DNA with HIR Search

It can be tricky to determine whether small segments of shared DNA (~5.0 cMs) imply true relatedness or are simply "noise". It is possible that matches of this length are, in fact, Identical By State (IBS) and not Identical By Descent (IBD). In simpler terms, the match could have arisen by coincidence and need not point to a common ancestor. Some experts have even estimated that up to one-third of these reported small matches are false positives.

However, in my opinion, sharing multiple small segments (> 5 cM) with another person does imply relatedness, because the odds that a coincidental match of more than 5 cM occurs more than once decrease dramatically. With this in mind, I often investigate my small 23andMe matches further to see if there are more segments in common that fall just below 23andMe's reporting threshold for Family Inheritance Advanced. If my "match" has submitted his/her data to independent projects such as Jim McMillan's DNA Cousins Project (must be logged into 23andMe to access) or HIR Search, then I can easily determine this.
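To put rough numbers on that intuition, here is a minimal sketch. It assumes the "up to one-third" false-positive estimate mentioned above applies to each small segment, and it treats the segments as independent of one another, which real data may not satisfy. The function name and the independence assumption are illustrative only, not anything 23andMe or HIR Search computes.

```python
from fractions import Fraction

# Assumption: each small segment has, at worst, a 1-in-3 chance of being
# a false positive (the estimate quoted in the post), and segments are
# treated as independent events. Both are simplifications.
P_FALSE = Fraction(1, 3)

def prob_all_false(n_segments, p_false=P_FALSE):
    """Chance that every one of n small segments is a coincidence."""
    return p_false ** n_segments

if __name__ == "__main__":
    for n in range(1, 5):
        p = prob_all_false(n)
        print(f"{n} segment(s): {p} (about {float(p):.3f})")
```

Under these assumptions, the chance that all shared segments are coincidental drops from 1/3 for a single segment to 1/9 for two and 1/81 for four.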

As an example, for one person of Finnish ancestry (FM), 23andMe's Family Inheritance Advanced reports matches to my mother on two small segments (5.0 cMs x 2; Chr 4, 5), to my mother's sister on one (5.0 cMs; Chr 7) and to my sister on one (5.5 cMs; Chr 9).

Chart from 23andMe's Family Inheritance Advanced

These matches all fall on different chromosomes, which on the surface doesn't make sense: my sister must have inherited her segment from my mother, because all of our Finnish matches are, undoubtedly, through her. Since the match is hovering on the border of 23andMe's cut-off, I am betting that my mother does in fact share the match on Chromosome 9 with my sister.

HIR reports 10 matching segments between my mother and FM (> 5 cMs and 500 SNPs). Most of these were not identified by 23andMe because their SNP counts fall below 700. As I suspected, one of these segments is at the same location on Chromosome 9 that 23andMe reported for my sister. HIR calls it at 5.8 cMs for my mother, which is close to the 5.5 cMs that 23andMe reported for my sister, but my mother shares only 638 SNPs on HIR, while 23andMe reported 872 matching SNPs for my sister. This may be an instance where my sister coincidentally gained a run of matching SNPs from my father's DNA that made the match appear a bit longer (IBS).

23andMe and HIR agree on my mother's matches with FM on Chr 4 and Chr 5. HIR calls them a bit longer, but not significantly so. The match that my aunt shares with FM is not present in my mother's DNA. Interestingly, HIR reports a 7.7 cM match on Chromosome 22 for my mother and FM, but with only 529 SNPs. Notably, 23andMe did not report any matches between FM and myself, while HIR shows 3 matches (> 5 cMs and 500 SNPs). As would be expected, those matches are all in common with my mother, though not completely consistent in reported size. Somewhat surprisingly, two of the matches that HIR reports between FM and myself exceed 23andMe's threshold by HIR's calculations: 1) 6.1 cMs and 1261 SNPs and 2) 5.2 cMs and 1781 SNPs. It is possible that HIR has a higher tolerance for mismatched SNPs, especially since, in both cases, HIR reported more matching cMs and SNPs with FM for me than for my mother.
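To make the two sets of cut-offs concrete, here is a minimal sketch that checks the segment sizes quoted above against each one. The 500-SNP and 700-SNP minimums come from the post; the 5.0 cM minimum for 23andMe is inferred from the 5.0 cM matches it reported, and treating the cut-offs as "at least" rather than "strictly greater than" is an assumption. The `Segment` class, the labels, and the `classify` helper are hypothetical names for illustration, not part of either tool.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    label: str   # free-text description, illustrative only
    cm: float    # reported length in centimorgans
    snps: int    # reported number of matching SNPs

# Cut-offs as described in the post (inclusive comparison is an assumption).
HIR_MIN_CM, HIR_MIN_SNPS = 5.0, 500
FIA_MIN_CM, FIA_MIN_SNPS = 5.0, 700   # 23andMe Family Inheritance Advanced

def passes(seg, min_cm, min_snps):
    """True if the segment meets both the cM and the SNP minimum."""
    return seg.cm >= min_cm and seg.snps >= min_snps

def classify(seg):
    """Return (meets HIR cut-offs, meets 23andMe cut-offs)."""
    return (passes(seg, HIR_MIN_CM, HIR_MIN_SNPS),
            passes(seg, FIA_MIN_CM, FIA_MIN_SNPS))

# HIR-reported sizes quoted in the post.
segments = [
    Segment("mother vs FM, Chr 9", 5.8, 638),
    Segment("mother vs FM, Chr 22", 7.7, 529),
    Segment("author vs FM, match 1", 6.1, 1261),
    Segment("author vs FM, match 2", 5.2, 1781),
]

if __name__ == "__main__":
    for seg in segments:
        hir_ok, fia_ok = classify(seg)
        print(f"{seg.label}: HIR={hir_ok}, 23andMe={fia_ok}")
```

With these numbers, the Chromosome 9 and Chromosome 22 segments clear only the HIR cut-offs, while the 6.1 cM and 5.2 cM segments clear both. That is why their absence from 23andMe's report is surprising.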

The next step is to share family trees and attempt to find our common ancestor(s). It may be that we share multiple common ancestors that have each contributed one or two of these segments, in which case the connection may be too far back in time to identify. I'm looking forward to finding out.

4 comments:

  1. Hi CeCe,

    These little apparent matches can be very mind boggling. I have 19 overlapping matches on one segment of Chromosome 4 that are between 5.0 and 5.8 Centimorgans in length. These show up in the Ancestry Finder at 23andMe. Six of them begin and end at the same point. The others only vary slightly at the beginning point or the ending point.

    Adding to my frustration, only one of these 19 has chosen to publicly identify herself, and that person knows very little about her ancestry. Therefore, at present I have no way to contact them to get more information from my matches.

    The matches seem to be geographically diverse. However, the four who list all four grandparents born in the same country are from Macedonia, Greece, France and Denmark. I'd welcome comments about this apparent cluster.

  2. Hey Dr D,
    I don't know what to think of these clusters of small matches either without being able to determine if you share more DNA with each of these people.
    I have recommended to 23andMe that they look into a way for us to contact our Ancestry Finder matches. I know that they are thinking about it. I feel that if a person is interested enough to enter the information needed for AF, they might also be open to inquiries in this regard. The response rate should be higher since the person is already interacting with the site in an Ancestry capacity.
    We must be patient, I guess. ;-)
    CeCe

  3. Ola CeCe,

    I don't know if you qualify at 5 cMs with me, but according to the HIR search, you are my relative. It might just be a coincidence because we have only one match on chromosome #1 at 402 SNPs. It is quite interesting to read how small segments can indicate old ancestry that is transmitted as an uninterrupted group. All of this is very new to me, so my understanding is at the surface level at most!! Anyway I will keep my eyes peeled for any new revelations on how small chromosome segments show consanguinity!!

  4. Interesting post. CeCe, is there a cM rule of thumb to identify when it's worthwhile to pursue a potential match? For example, if we have a 7 cM match solely on 1 segment, is that long enough to warrant more research? What about a 10 cM match? 15 cM on a single segment? What if there is 1 match of >12 cM and a few little ones (6's & 7's)? Or what if there were 5 little ones (all 6's & 7's)? Should we disregard any of these or pursue them?
