Saturday, November 11, 2017

Discrepancies with Amount of Shared DNA for Close Family Matches at MyHeritage

I was previously aware that there are some issues with the more distant matches on MyHeritage DNA, so I have been advising caution about using those in genealogical research, but I was more confident about the close family matches. I uploaded data for both of my parents and for my aunt and uncle, and all matched me and each other as expected. However, in the last couple of days, I have become aware of some pretty serious issues with matches in the category that includes half-sibling relationships (~25% shared).

Case #1
For several months, I have been working with a woman who was abandoned as a baby. We had successfully zeroed in on her biological family through pedigree triangulation on AncestryDNA and were trying to determine which of two sisters was her biological mother. The daughter of one of the sisters had agreed to test at MyHeritage, with the expectation of a DNA share consistent with either first cousins or half-siblings. Her results came back with 17.9% (1,294.9 cM) DNA shared between them. This was unfortunate since it fell in a gray area where the ranges of shared DNA for the two possible relationships overlap, so it looked like we would have no definitive answer to the question of her parentage. We then uploaded her data to Gedmatch and were shocked to see that they actually shared ~25% (1,758.9 cM) of their DNA - a clear half-sibling match.

This is what the comparison on MyHeritage looked like:


This is what the comparison looked like on Gedmatch:

That is a 464 cM difference! This pushes the relationship solidly into the half-sibling category without any ambiguity. We expect small differences between the different companies and/or third party comparisons, but in all the years I have been involved in genetic genealogy, I have never seen a comparison vary so drastically. In fact, the comparisons have been so consistent in the eight years we have been working with autosomal DNA matching that it has given our community great confidence about the reliability of the matching algorithms that we work with at the three major DNA companies and Gedmatch.

This was very concerning to me so I followed up on some potentially similar situations I had heard about in my DNA Detectives Facebook group and immediately found two more examples like the one above.

Case #2
Here is the comparison between two half-siblings at MyHeritage:



Here they are at AncestryDNA:




And here they are at Gedmatch:




As you can see, this set of half-sisters was reported to share 1,142 cM at MyHeritage, 1,620 cM at AncestryDNA and 1,699.4 cM at Gedmatch. Again, this is highly problematic, with differences of 478 cM and 557 cM, respectively, between MyHeritage's estimate and those of the other two services.


Case #3
This is a comparison of a full uncle/nephew at MyHeritage:

and at Gedmatch:

Again, we see a large discrepancy between the comparison at MyHeritage and the one at Gedmatch: 937 cM at the former versus 1,409.2 cM at the latter, for a difference of 472.2 cM. Also note that the number of matching segments is doubled in the Gedmatch comparison as opposed to the MyHeritage one.
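
To make the arithmetic behind the three cases easy to re-check, here is a minimal Python sketch that recomputes the gaps from the totals quoted above. The dictionary layout and the function name `gaps_vs_myheritage` are illustrative choices only, not anything the testing companies provide.

```python
# Totals in cM exactly as quoted in the three cases above.
CASES = {
    "Case 1": {"MyHeritage": 1294.9, "Gedmatch": 1758.9},
    "Case 2": {"MyHeritage": 1142.0, "AncestryDNA": 1620.0, "Gedmatch": 1699.4},
    "Case 3": {"MyHeritage": 937.0, "Gedmatch": 1409.2},
}


def gaps_vs_myheritage(totals):
    """Return how many cM each other service reports above MyHeritage."""
    base = totals["MyHeritage"]
    return {
        service: round(cm - base, 1)
        for service, cm in totals.items()
        if service != "MyHeritage"
    }


for name, totals in CASES.items():
    for service, gap in gaps_vs_myheritage(totals).items():
        # Share of the other service's total that MyHeritage reported.
        share = 100 * totals["MyHeritage"] / totals[service]
        print(f"{name}: {service} is {gap} cM higher; MyHeritage shows {share:.0f}% of it")
```

Every gap lands between roughly 460 and 560 cM, which is far more than the small company-to-company variation we normally see.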

I would really like to see the MyHeritage comparisons on a chromosome browser to determine exactly what is going on here. Hopefully, they will soon add that feature.

Don't get me wrong, I welcome new companies that offer services to our community and am very supportive of their efforts; however, accuracy is absolutely essential when using DNA to draw genealogical conclusions and determine the relationship between two people. These very significant discrepancies definitely can cause, and perhaps already have caused, MyHeritage customers to reach inaccurate conclusions about their relationships to each other. This can be very damaging to the reputation of our industry and, especially, to the work I do assisting people of unknown parentage to identify and connect with their biological families. If we cannot count on reliability in the reported amount of shared DNA, this undermines our efforts to convince newly-found family members that the proposed relationship is authentic. It is my hope that MyHeritage will move quickly to correct this very serious issue. In the meantime, I recommend always double checking your comparisons by uploading to Gedmatch and running the one-to-one comparison there.
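
As a rough illustration of the double check recommended above, the Python sketch below flags a pair of totals that disagree by more than a chosen tolerance. The function name `needs_second_look` and the 10% tolerance are assumptions made for illustration only; no company publishes such a cutoff, so choose whatever margin suits your own work.

```python
def needs_second_look(reported_cm, reference_cm, tolerance=0.10):
    """Return True when two shared-cM totals differ by more than `tolerance`.

    `tolerance` is a fraction of the reference total. The 0.10 default is an
    arbitrary illustrative value, not an industry standard.
    """
    if reference_cm <= 0:
        raise ValueError("reference_cm must be positive")
    return abs(reported_cm - reference_cm) / reference_cm > tolerance


# Case 1 from this post: MyHeritage total vs. the Gedmatch one-to-one total.
print(needs_second_look(1294.9, 1758.9))

# Case 2 from this post: AncestryDNA total vs. the Gedmatch total.
print(needs_second_look(1620.0, 1699.4))
```

With this tolerance, the MyHeritage figure in Case 1 is flagged, while the AncestryDNA and Gedmatch figures in Case 2 agree closely enough that they are not.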

I was able to locate these examples very quickly, so I am confident there are many more out there. Please comment below if you have an example of your own.

[Edited to add - I am still recommending that people of unknown parentage get their DNA into the MyHeritage database due to the many success stories we are seeing there, but I strongly suggest checking any important/significant matches at Gedmatch, if at all possible, to confirm any newly-found relationships.]

39 comments:

  1. Doesn't FTDNA do MyHeritage's lab work? This sounds like either a faulty algorithm or the use of differing default thresholds (e.g. 7 cM vs 10 cM). I hope they explain this like yesterday. We must strive for consistency, not confusion.

    Jim Lannin

    Replies
    1. It's quite possible that they use the same or identical equipment to obtain the data, but perform the analysis separately. Bear in mind that the cM "distance" is itself an estimate based on the empirically-derived recombination rates along each chromosome. A difference in models, in thresholds, or in the mismatch run-length limit can affect the calculated cM values, given the same raw data.

      For consistency, the testing companies would have to agree on quite a few parameters. However, they consider their parameter settings proprietary--the "secret sauce" that differentiates the services they offer.

    2. I don't think any of us expect the companies to report the exact same numbers, but it should be close. If it isn't, it throws into question the basis of our entire field.

    3. MyHeritage does use FTDNA's lab to process the DNA; however, they use their own algorithms to report on matching.
      What is interesting for me in these examples is that MyHeritage reports less DNA than Gedmatch does. In the more distant relationships the issue is that they report too much shared DNA (assumed to be due to imputation).
      I would be very curious to see if the same results occur for kits transferred to MyHeritage (I am assuming that the comparisons so far are for people who tested at MyHeritage?)
      I don't know how to get this reply to show my name!!
      Caz Brymora

  2. My cousin and I both uploaded to My Heritage. We share 158cM on Gedmatch and 133cM on Ancestry. We have never shown up on each other's match list on My Heritage.

  3. My estimate with my full aunt is also considerably lower on My Heritage than Ancestry and Gedmatch. On My Heritage, they show we share 1328cM vs a 1709cM share on Gedmatch and 1641cM on Ancestry.

  4. My sibling: Ancestry 2453 cM over 60 segments, FTDNA 2401, GedMatch 2575, MyHeritage 2279 over 37 segments. All places other than MyHeritage believe this is my full sibling. MyHeritage believes this is either a half sibling, uncle or full sibling.

  5. I see similar results with my match to my 1/2 brother.
    At Ancestry, we match at 1681 cM. Upload to GEDMatch shows 1728.9 cM, and a Family Tree DNA transfer shows 1651 cM (ignoring segments <7 cM and shared X yields 1614.45 cM).

    My Heritage shows us matching at just 1110.8 cM.

    This isn't just a case of minor differences in matching algorithms, it's plain wrong.

  6. My father was adopted, and I am almost 100% sure I have found his BF family. On MyHeritage I found a potential half-sister, who is more accepting, and later a potential half-brother tested to verify. The half-brother is very skeptical, saying these tests are unreliable. He has talked to me some, but I am unsure how to find more information to prove the relationship between him and my father. His father fits age-wise too, and I have images of them that look like the same person. These are the MyHeritage numbers for the half-brother to my father: 22.4% (1,625.5 cM), 23 shared segments and 198.7 cM. The half-sister is 31.2% (2,265.8 cM), 25 shared segments and 195.6 cM. The GedMatch kit numbers are H402211, H833718 and A104247. The GedMatch numbers are 1709.4 cM and 192.2 cM for the half-brother and 2307.3 cM and 193.5 cM for the half sibling. MyHeritage lists these people as "Half brother, uncle or nephew" and "Aunt or niece, half sister, sister, grandmother or granddaughter." My father was adopted in Indiana, so I believe we cannot see his adoption records until this coming July, and that's only if they show any father listed. I am not sure what else to do to help this half-brother know this is real. I can only hope the half-sister is also still accepting, as 3 or 4 siblings believe this to be false. Only 2 have tested, and I have suggested the other two test with maybe FTDNA or Ancestry, since my father's DNA is there as well. Thanks much.

    Replies
    1. Congratulations, PhotoMom1978. There is no doubt that those comparisons are consistent with half-sibling relationships. (I ran them myself to be sure.) As for the discrepancies at MyHeritage, so far they have always been too low rather than too high, and therefore your father's results would not be in question. I hope his half-siblings come to accept him since there is no doubt of their close family relationship. Sending my best...

  7. Hi All, I am MyHeritage's Chief Science Officer. We are well aware of these issues that affect a minority of our close matches. My team is actively working on this and we are in the final steps of a major overhaul to our matching system that resolves many of these issues and better tunes our parameters for our fast growing database. It is a good time to remind that our database allowed multiple success stories that involved half-siblings and other close relatives and allowed them to find their birth families. We will be in touch with you shortly to check your findings on our new system with the same DNA kits and ensure they are no longer experienced there. We will give another update ASAP.

    Replies
    1. Thank you for taking the time to comment, Dr. Erlich. I am glad to see that this is being addressed, but unfortunately, I am not convinced that it is only affecting a minority of the close matches based on the reports I see pouring in from the many discussions going on at Facebook. Even if it is, it is still affecting a very large number of people and is, undoubtedly, a significant problem. Since you state that MH was well aware of it, I strongly believe that it was MyHeritage's responsibility to warn their customers about this known issue so as not to risk misleading them regarding the actual relationship to their close matches. Further, you mention it is a good time to remind my readers about success stories, but I had already done so at the end of the post in a note that was added shortly after publication. I will also remind everyone that I have been, by far, the most vocal among the genetic genealogy community about the success stories we are seeing come out of MyHeritage, so my concerns are definitely not stemming from any bias against MH. Instead, I very much support MH in their efforts. I look forward to the updates from MyHeritage regarding the correction of this problem and other similar ones we are currently seeing.

    2. Dear Dr. Erlich, How is it that you were "well aware" of these issues but failed to address this sooner? The fact that MyHeritage didn't care to notify the genetic genealogy community directly is unconscionable. You waited until someone got the goods on MyHeritage. This is not behavior that garners trust.

  8. I have a known maternal half-sister that I have been in contact with for 20+ years, found and confirmed by papers and confirmed by bio-mother. I manage her kits.
    Ancestry, 1655. FTDNA 1688. Gedmatch, 1783. I had never bothered to look at this when I uploaded us to My Heritage! Wow.... 1311....
    This is a huge problem! And it is a problem with My Heritage!

  9. I have a number of matches from other services turning up at MyHeritage. Most of them are known paper trail matches at Ancestry. The numbers all pretty much agree with each other. Nothing outlandish. But I wonder about some of the other matches who haven't tested elsewhere.

  10. Full sibling (brother): FTDNA 2398, largest segment 140 cM; Gedmatch 2,535, largest segment 143.3; MyHeritage 1657.4, largest segment 151.9. Note: MyHeritage says he is my half sibling or nephew. Something is not right here.

  11. Is it possibly because MyHeritage is using a minimum other than 7cM segment lengths? What happens if you adjust Gedmatch to a 10 cM minimum, or 20 cM minimum? Then does the Gedmatch result look more like MyHeritage?

    Replies
    1. Good thinking, but that is not the explanation. Close relationships like this typically don't have many, or any, short segments in common. I looked at the Gedmatch comparisons for two out of three of the cases above and the segments were all larger.

  12. Is GEDmatch secure, or is it a highly shared site?

    Replies
    1. Yes, GedMatch is the site where everyone uploads their raw autosomal DNA results, regardless of the company they tested with. It seems pretty secure.

    2. Your data will be visible to anyone who also uploads their data at GEDmatch, and you will have to give an email address - you can use a fake name, and you can set up an email address (they are free at Yahoo, Gmail, Juno, some other places) for that purpose only, but you are not going to be keeping things private at GEDmatch. I would not call it secure.

  13. If my results as far as where I'm from in the DNA results do not make sense when it comes to my family's research over the decades, what do I do? Will sending my info over to the other site show any difference or am I one of those who has bad results?
    My family is a big portion Irish and Menorcan yet it doesn't show. And yes I have a lot of partial matches on there saying like third cousin twice removed and so forth.
    By the end of this month my parents' results come in. Will this change my results? Or does this only show their results? And what if my mother's and father's results do not say they are Middle Eastern or show these crazy results?
    How do I get mine to match my parents' more accurately if they happen to get an accurate result by chance? I'm really confused by how my results showed up. And my German and Polish results were very low, and that is half of my family, my father's side. I want to get into the genetic or genealogy field. Yes, 36 is a little late, but I'm not incapable of understanding anything I'm taught. Plus I live right near St. Augustine and I want to help with the genealogy society in a big way here. Please feel free to write me at my email. It's d.b.gager@gmail.com
    I would love to chat more with you and make better sense of this.

  14. My dad's 2C1R shows as a match to me BUT NOT to my dad at all! On Gedmatch they share 175.3 cM.

    Replies
    1. That should certainly be reported as a match on any site. I have been hearing a lot of situations similar to yours.

  15. 1/2 sister: 1966.1/50 Gedmatch, 1885/62 Ancestry, 1851/52 FTDNA, 1637.5/34 MH
    Great uncle: 832.5/24 Gedmatch, 770/26 Ancestry, 764/28 FTDNA, 717.3/18 MH
    Uncle/Sibling: 3017.3/47 Gedmatch, 2847/79 Ancestry, 2821/47 FTDNA, 2702.9/31 MH

  16. Thank you CeCe for bringing this to our attention. I realised that there were problems with the more distant matches at MyHeritage but I hadn't appreciated that the problem extended to very close matches as well. When I checked my results back in July this year I found that 73% of my matches at MyHeritage did not match either of my parents:

    https://cruwys.blogspot.co.uk/2017/07/parent-and-child-comparisons-at_26.html

    Alex Coles had an even higher mismatch rate. 92% of her matches did not match either of her parents:

    http://wing-ops.blogspot.co.uk/2017/08/imprecise-science-part-2-myheritage.html

    Lorna Henderson has reported a second cousin match which did not show up at MyHeritage but was reported elsewhere:

    http://dnasurnames.blogspot.co.uk/2017/10/yet-another-myheritage-dna-mystery.html

    I'm hoping that the updated matching algorithms will fix these problems.

    It appears that MyHeritage are making the same mistake as FTDNA and using small segments under 5 cMs to calculate the matches. If I do share a match with one of my parents at MyHeritage I've found that in some cases I share many more segments and many more cMs than my parent, which is clearly impossible. Here is one example:

    Debbie's match with a predicted 2nd to 4th cousin
    Shared DNA 0.9% (62.6 cM)
    Shared segments 7
    Largest segment 17.9 cM

    Debbie's Dad's match with the same person
    Predicted 1st cousin twice removed - 5th cousin
    0.6% (39.8 cM)
    Shared segments 4
    Largest segment 18.4 cM

    It's worth noting that anyone who has tested at MyHeritage can benefit from a free transfer to FTDNA. MyHeritage transfers do not incur the $19 fee to access the chromosome browser and the MyOrigins report. Matches can also be checked out at GEDmatch.

  17. My cousin's grandson, my first cousin 2x removed, someone I actually know, showed up on MH long ago, when he tested, I guess, and it shows: Shared DNA 2.5% (178.8 cM); shared segments 9; largest segment 44.5 cM. When I check their 'DNA matches' area he shows up at the top. Under him is a guy who shows up as a 2.2% match at 23andMe (as high a match as I have there, except my kids). We don't know how we are related, but apparently someone in my North Carolina family made his grandma pregnant when she was age 16 in about 1920 in North Carolina (and did it again; he has a sister who matches me similarly at Ancestry). He does not know who his grandpa is, but would like to know; I just have never been able to figure out who it is. - b in California

  18. CeCe, we have DNA at FTDNA and 23andMe. I have also uploaded to Gedmatch. We have multiple matches to an ancestor shared 8 and 9 generations back who are estimated as 2nd-5th cousins. Some match on more than one chromosome. What am I missing? Do we have common lines closer than 8-9 generations? Thank you!

  19. I have found only one known relationship through myHeritage and it is a bit bizarre in that I match the cousin higher than my mother.
    He is a second cousin 2x removed to my mom and matches 0.9% (67.9 cM) 5 shared segments and 30.2 cM largest segment.
    For me (as a third cousin 1x removed) he matches 1.0% (70.0 cM) 6 shared segments and 31.7 cM largest match.

    So far, I keep an eye on matches but I don't really trust the numbers.
    Not only that, the ethnicity estimate is even crazier. My dad is exclusively Irish/English/Scottish and shows 12.4% Italian, while I have 0% Italian.

  20. I've uploaded everywhere now and have relatives who have done the same, with widely differing cM matches coming back from all the testing companies. Being an adoptee, I'm finding it muddies the waters (one example being a cousin who matches 105 cM over 19 segments on FTDNA, 89 cM over 5 segments on Gedmatch, 53 cM across 3 segments on Ancestry and 55.8 cM over 4 on MH).
    What is really perplexing me is my highest match on MH (someone who hasn't tested elsewhere or uploaded to Gedmatch), whom I match at 2.1% (153 cM) over 6 segments. Does anyone have any pointers as to what my match with her might be, given the vagaries of MH, please?

  21. A cousin matches me at Family Tree DNA and Gedmatch.

    But on MyHeritage my name shows up on her list of names, while she is not on my list of names.

  22. This may explain why my maternal half-brother and his 3C do not show up as matches, but my brother's daughter does show up as a match, and they are 3C1R.

  23. I have never trusted GENI or MyHeritage. Not only are the numbers way off, but it has failed to show matches that I know exist because I transferred them. I transferred my Ancestry results there over a year ago. Months later I realized that I also had authority to transfer two family members' DNA. I made the transfer on May 18, 2017. Neither of them has shown up as a match to me as of November 2017. These aren't low level matches. On Ancestry one matches me at 856 cM and the other at 294 cM.

    GENI has a slew of its own separate issues. Examples from my match list: the relationship of a 32.7 cM match is "your third cousin once removed's wife's first cousin 6 times removed's husband's aunt's husband's 7th great niece" (really!!) and a 24.7 cM match is "your first cousin twice removed's husband's first cousin thrice removed's wife's brother's wife's fifth cousin five times removed's husband" WTF????? I treat trees and other data on MyHeritage and Geni as simple clues that cannot be accepted without further substantiation. And I also hate, hate, hate that they show females using a married name rather than the maiden name.

  24. A bit late to the party but thought I would throw this out. My Paternal aunt. Gedmatch, 1752.9 cMs, MyHeritage, 1339.8 cMs. She has never had more than 13 matches. I have 81 with many that match her on Ancestry and Gedmatch but not MyHeritage. That alone tells me MyHeritage is not ready for prime time.

  25. I have another example. A match on FTDNA showed 984 shared cMs, while My Heritage listed it at 687. That seems like a big difference to me!

  26. I agree it's a pain searching for women.

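
On the minimum-segment question raised in comment 11 and its reply above, here is a minimal Python sketch of how raising the threshold changes a total when nearly all of the shared segments are long. The segment lengths are entirely hypothetical, made up for illustration and not taken from any real comparison, and the function name `total_shared_cm` is likewise an assumption.

```python
def total_shared_cm(segment_lengths_cm, min_segment_cm=7.0):
    """Sum only the segments at or above the minimum segment length."""
    kept = [length for length in segment_lengths_cm if length >= min_segment_cm]
    return round(sum(kept), 1)


# Hypothetical segment lengths in cM for a close-relative comparison.
# These are invented numbers, not real data from any of the cases above.
hypothetical_segments = [150.0, 110.0, 90.0, 70.0, 50.0, 35.0, 25.0, 12.0, 8.0]

for threshold in (7.0, 10.0, 20.0):
    total = total_shared_cm(hypothetical_segments, threshold)
    print(f"minimum {threshold} cM -> total {total} cM")
```

With a segment list like this one, moving the minimum from 7 cM to 20 cM trims only a few percent from the total, which is the reasoning behind the reply that thresholds alone cannot account for gaps of several hundred cM.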