Showing posts with label Chromosome Browser. Show all posts

Saturday, May 10, 2014

AncestryDNA at the National Genealogical Society Conference - A Report from Angie Bush

My colleague and friend, Angie Bush, is attending the National Genealogical Society's conference in Richmond, Virginia this week. She has kindly agreed to fill my readers in on any interesting DNA news from NGS. Her report on the AncestryDNA presentation given by Kenny Freestone follows.

I attended the AncestryDNA presentation by Senior Product Manager, Kenny Freestone, in hopes of learning what new and exciting features Ancestry has on the horizon for genetic genealogists. There was not much new information presented, but there were a few things that I thought might be worth mentioning:


1. In response to questions about AncestryDNA's plans for adding a chromosome browser or segment data, Kenny repeated that Ancestry is working on something that would give their customers access to that type of data, but that it would be something different than what current chromosome browsers offer. No launch date was provided. He did admit that at this point the tools that Ancestry has for triangulating data are quite lacking. This tells me that they recognize that there is a need for these features. I can only hope that when these new tools are finally released, they really are as good as they are claimed to be. I found it very interesting that he used a slide showing how he inherited DNA from a set of third great-grandparents, complete with illustrated chromosomes, even though Ancestry provides their customers no way to view this type of information.


2. In a somewhat related slide, Kenny showed several of his lines that had been "confirmed" by DNA shaky leaf hints. He said that this was "independent" evidence that his tree was correct. As readers of this blog know, unfortunately you cannot always say that is the case. As a serious genealogist and scientist, I continue to find the lack of segment data to be a problem. In both disciplines it is imperative that data be able to be reviewed. On the genealogy side of Ancestry's site, they do provide the actual images or data in many instances. When viewing any educational video by Ancestry, there is always encouragement to look at the actual image, as it contains so much more information than the transcription. I just cannot understand why this same level of access to the underlying data is kept hidden on the DNA side of their site. 


3. Kenny was asked a question by someone who has Jewish background regarding why there are so many matches at a high level and yet no common ancestor is discovered. Genetic genealogists who have worked with endogamous populations know this can be a difficult problem. Kenny did say that they are actively working on this issue, but have not yet come up with a solution.


4. I have long wanted to understand the cut-off levels for how AncestryDNA is predicting matches. For example, if AncestryDNA predicts that you are a 1st - 2nd cousin to a match, then how much total DNA do you share with that person and how many segments do you share? 23andMe and FTDNA have always provided this information. Kenny flashed the following slide, which may be helpful in determining the parameters they are using for predictions:


200 megabases for 2nd cousins
150 megabases for 3rd cousins
100 megabases for 4th cousins
30 megabases for 5th cousins
20 megabases for 6th cousins
10 megabases for those further out
 

This slide raised a question as to whether or not AncestryDNA is using centimorgans or megabases in their matching algorithms. Kenny clarified that they are using a combination. They switched to using centimorgans in November - December 2013. If you tested recently, then your matches are in centimorgans. If it was prior to that date, then your matches are in megabases.
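As a rough illustration of how figures like these might be applied, here is a minimal Python sketch. It assumes that each number on the slide is a minimum amount of shared DNA for its category, which the slide did not actually state. The name `guess_category` is made up for illustration, and this is not AncestryDNA's algorithm.

```python
# Figures from the slide, in megabases, largest first.
# ASSUMPTION: each figure is treated as a minimum for its category.
SLIDE_THRESHOLDS_MB = [
    (200, "2nd cousin"),
    (150, "3rd cousin"),
    (100, "4th cousin"),
    (30, "5th cousin"),
    (20, "6th cousin"),
    (10, "more distant"),
]


def guess_category(shared_mb):
    """Return the first slide category whose figure is met (hypothetical helper)."""
    for minimum_mb, label in SLIDE_THRESHOLDS_MB:
        if shared_mb >= minimum_mb:
            return label
    return "below the lowest figure on the slide"


if __name__ == "__main__":
    for amount in (220, 160, 45, 5):
        print(amount, "Mb ->", guess_category(amount))
```

Since matches computed after the switch are measured in centimorgans, a lookup like this would only make sense for the older, megabase-based matches.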

5. Kenny told us that the communication/contact rate between DNA customers was twice the communication rate between regular customers.
 

6. An audience member asked if Ancestry stored the sample for future/other tests. Kenny didn't directly answer this and said that as the science improves that they will just apply those improvements to the current test. He did say that the only thing better than their test was a full genome sequence, and for that a new sample would need to be submitted.

7. I appreciated the fact that Kenny emphasized that the ethnicity information is an ESTIMATE. It is important that we all remember that the science that each company uses to give us our admixture is still in its infancy and that each company uses different reference populations to do so. It behooves all of us to take this information with a grain of salt no matter which company we test with.


8. Kenny did a great job answering several questions from audience members regarding Y-DNA and mtDNA testing that were completely unrelated to the product that AncestryDNA offers. Attendees even had specific questions about surname and haplogroup projects. This highlighted the need for those of us in the genetic genealogy community to reach out to the genealogists and help them to understand the power of DNA. Things we take for granted such as the three types of tests and the companies that offer these tests can be confusing. If DNA is to be effectively used as a genealogical research tool or record, there is a significant amount of education that will need to be done.
 

9. Lastly, after the presentation, Kenny showed me that Ancestry has a new "spit kit." The return box and packaging are much more compact now and the kit itself is a bit different. I asked if there were plans to offer some type of assisted collection kit or "cheek swab" as the spit kit can be difficult for older individuals. He said that they recognized this was a need and that Ancestry probably would do something to address it, but that he couldn't confirm anything.

The new AncestryDNA Kit - Packaging
AncestryDNA Kit Contents
AncestryDNA Compact Return Mailer

Thanks to Angie for this AncestryDNA/NGS update!

Thursday, June 13, 2013

A Sneak Peek at the UPDATED AncestryDNA Search Filter

AncestryDNA has been hard at work perfecting the new search filter these last few weeks. Not surprisingly, it has undergone some changes since I last shared it with my readers.

Stephen Baloglu, Ancestry.com's Director of Product Marketing, described what appears to be the version that will go live in the next few weeks:

The search includes looking within your DNA matches for surnames and birth locations in their family tree that they have linked to their DNA results. It does not currently include searching for username, but we may update that in the future. The top request was surnames/birth locations, so we started there. Also, we expect people to more likely know surnames they're looking for much more often than ancestry username.

He also shared these screen shots and accompanying descriptions with me on June 6th. Click to enlarge them for a closer look.

In a YouTube video on Ancestry.com's account (dated June 3, 2013), this slide is shown describing the AncestryDNA search filters.

Notice that the username search was, apparently, still expected at that time. Ancestry.com employees Anna Swayne and Crista Cowan discuss it in the video from about 15:30 to 16:00. Crista clarifies that the username filter will be "an ability to search by the name of somebody who is a DNA match" and Anna says that it "will be coming soon". However, when I asked Ken Chahine, General Manager of AncestryDNA, about the username filter at the recent DNA conference on June 6th, he explained that Ancestry.com does not currently have the option to search by username built into their system, which is limiting AncestryDNA's ability to provide it. [Update: I have asked for clarification on this point.]

Although this additional filter would be convenient, in my opinion, it isn't essential. For example,  I would like to search my mother's account to determine which of my matches with a surname of interest also appear in her account, but I can do so almost as easily by searching on that surname. Further, to search your account to see if your research partners appear as matches, you can also filter by your shared surname(s). I would really like to see an "In Common With" filter to determine which of my matches are from my mother's side, but, fortunately, I can already use Jeff Snavely's tool for that purpose.

With these upcoming filters, Jeff's terrific tool and Family Tree DNA's recent announcement that they are accepting raw data uploads from AncestryDNA (for only $49!), this test is becoming much more useful for genetic genealogists. (Now, if only we could get that chromosome browser onsite!)

Stephen tells me that there still isn't a confirmed date for the new search filter's arrival, but I expect it to appear sometime this summer.

Tuesday, August 21, 2012

AncestryDNA: Confusing Relationship Predictions and Adoptees

As my readers are aware, I have been advocating for AncestryDNA to release the genetic data behind their matching predictions since the launch of their autosomal DNA test. You may also know that I am a passionate advocate for adoptees and their right to discover their heritage. This week, the two issues have collided into what I feel is a very important issue.

An increasing number of adoptees have been discovering their roots and, in some cases, their birth families through autosomal DNA testing at 23andMe and Family Tree DNA. I have been very encouraged by this and, as a result, have been suggesting that adoptees who are able to afford it, test at all three of the companies currently offering atDNA testing in order to "fish in different ponds" for close relatives. AncestryDNA has been last on this list of three companies due to the fact that their test does not include the raw genetic data for download, the specific matching segment information or the total DNA shared between matches. However, they were still on the list because I believed that if an adoptee were to get a very close match there, finding their birth family would be very clear-cut even without the genetic data. Well, I was wrong.

Initially, I was very excited to learn that an adoptee had received a parent/child prediction for one of their matches at AncestryDNA this week. What has happened since really illuminates the problem of not allowing customers access to the genetic data behind the predictions. The adoptee, a couple of adoption search angels and I have all been researching and have come to the conclusion that there is absolutely no way this match is being accurately predicted.

Let me explain further. For the purposes of this story and to protect the identities of those involved, I will use non-gender specific names and call the adoptee "Chris" and the match "Pat". I also cannot share some of the specific details for privacy reasons but, believe me, I am very confident about what I am writing.

A parent and a child share 50% of their autosomal DNA. Since Chris and Pat cannot possibly share that relationship due to the fact that they are much too close in age, we looked at the most obvious alternate theory, which is that they are full siblings. Full siblings also share approximately 50% of their DNA on average. Since Pat's parents are both too young to have conceived Chris, that was also determined to be impossible. This also rules out half-siblings, who share approximately 25% of their DNA on average. The next most likely scenario is that Chris and Pat are aunt/uncle and niece/nephew. This doesn't seem probable based on the family structures, and double first cousins are also out based on Pat's family tree. The next closest relationship genetically would be first cousins, who share an average of about 12.5% of their DNA. That is getting pretty far away for a parent/child prediction AND guess what?! None of Pat's aunts and uncles were old enough to reasonably have had children when Chris was born either. Further complicating the situation is that Chris' non-ID (non-identifying information given to an adoptee about their birth families) is pretty detailed and specific, listing the birth parents' ages as in their twenties (so not exceedingly young), their family heritage and information about the maternal grandparents. None of this matches Pat's tree at all, even at more distant levels.
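The process of elimination described above can be sketched in a few lines of Python. The percentages are the average figures quoted in this post, plus roughly 25% for double first cousins, which is the standard expectation rather than a number given here. The helper name `remaining` is hypothetical, and the sketch only mirrors the reasoning; it does not analyze any real data.

```python
# Average expected autosomal sharing, closest relationships first.
# Double first cousins at ~25% is the standard expectation, not a figure from the post.
EXPECTED_SHARE = [
    ("parent/child", 0.50),
    ("full siblings", 0.50),
    ("half-siblings", 0.25),
    ("aunt/uncle and niece/nephew", 0.25),
    ("double first cousins", 0.25),
    ("first cousins", 0.125),
]


def remaining(ruled_out):
    """List the relationships not yet excluded, closest first (hypothetical helper)."""
    return [(name, share) for name, share in EXPECTED_SHARE if name not in ruled_out]


if __name__ == "__main__":
    # Relationships excluded by ages and family trees in Chris and Pat's case.
    excluded = {
        "parent/child",
        "full siblings",
        "half-siblings",
        "aunt/uncle and niece/nephew",
        "double first cousins",
    }
    for name, share in remaining(excluded):
        print(f"{name}: about {share:.1%} shared on average")
```

Run on the exclusions from this case, the only relationship left on the list is first cousins at about 12.5%, a long way from the roughly 50% that a parent/child prediction implies.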

This has been a mind-bending, frustrating situation for all involved, especially the adoptee. Try to imagine the elation of receiving this match after being blocked in every other avenue of discovery, to then have it turn out like this: so close and yet still so far. The really unfortunate thing is that if this match was at either 23andMe or Family Tree DNA, there would be no question what the actual relationship is. This is because both of those companies give the total amount of matching DNA and allow their customers to see the actual pattern of inheritance, which in most cases, will point to the exact relationship. In the few remaining cases, 23andMe can dispel all doubt for parent/child/sibling/aunt/uncle/niece/nephew and often even first cousin matches because, in addition, they include with their results fully identical segments, haplogroups and X-DNA inheritance. The fully identical segments will only appear in full siblings and/or double first cousins, haplogroups will help narrow down on which side of the family the relationship lies and the pattern of X-DNA inheritance will usually discriminate between aunt/uncle/niece/nephew and half siblings, as my colleagues and I recently realized while working on another very successful adoption DNA case.

Let me give you an example of just how clear-cut this really is.

This is what half-siblings look like in 23andMe's Family Inheritance feature (not Family Inheritance Advanced):

Half-siblings DNA sharing, click to enlarge

Versus full siblings:

Full siblings DNA sharing, click to enlarge

Notice the dark blue in the full siblings' comparison. That color is illustrating the areas where the siblings share "fully identical regions" versus the light blue which illustrates the "half-identical regions". Full siblings are the only relationship (except occasionally double cousins) that share fully identical regions, while half-identical regions are what we find for all other atDNA matches. This is because full siblings get DNA from the same mother AND father, so on some parts of a chromosome, they match on both copies. For example, in the illustration above, the paternal Chromosome #1 and maternal Chromosome #1 have four fully identical regions, six half-identical regions and one non-identical region. Remember we all get one of each chromosome 1-22 from mom and one from dad. This means that in some areas, we will inherit the same DNA as our full siblings on both copies of a chromosome, while in some places we will inherit the same DNA on only one copy and in some regions we will not inherit the same DNA on either copy. (This in-depth analysis would rarely be needed since it is usually obvious from the percentage of DNA shared if two people are full or half-siblings. The exception is when two people share an amount of DNA that falls somewhere in the middle of what would be expected, for example 37.5%.)
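The mix of fully identical, half-identical and non-identical regions in full siblings can be worked out by simple counting. The Python sketch below uses a deliberately simplified model: at a single position, each sibling is assumed to receive one of the mother's two copies and one of the father's two copies independently and with equal chance. It ignores the fact that real DNA is inherited in long blocks, so it gives average proportions only, not the pattern on any particular chromosome.

```python
from fractions import Fraction
from itertools import product

# Label each parent's two chromosome copies 0 and 1. At one position a child
# receives one maternal copy and one paternal copy: (maternal, paternal).
child_options = list(product([0, 1], repeat=2))

counts = {"fully identical": 0, "half-identical": 0, "not identical": 0}
labels = ["not identical", "half-identical", "fully identical"]

# Enumerate every equally likely combination for two full siblings.
for sib_a, sib_b in product(child_options, repeat=2):
    copies_shared = (sib_a[0] == sib_b[0]) + (sib_a[1] == sib_b[1])
    counts[labels[copies_shared]] += 1

total = sum(counts.values())
expected = {label: Fraction(n, total) for label, n in counts.items()}

# Fully identical regions count on both copies, half-identical on one of two.
overall_share = expected["fully identical"] + expected["half-identical"] / 2

if __name__ == "__main__":
    for label, fraction in expected.items():
        print(f"{label}: {fraction}")
    print(f"overall DNA shared: {overall_share}")
```

Under this model, a quarter of the genome is fully identical, half is half-identical and a quarter is not shared at all, which adds up to the familiar 50% average for full siblings.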

Although a parent/child pair and full siblings both share approximately 50% of their DNA, there is no confusing these two relationships when you see the pattern of DNA inheritance. Take a look at these graphs from 23andMe's Family Inheritance ADVANCED:

Parent/child DNA inheritance, sharing 50%

Full siblings DNA inheritance, sharing ~50%

As you can see, when the match is between a parent and a child, it is very obvious. This is because a parent and a child (top) will share the entire length of each chromosome 1-22, while other relationships, such as siblings (bottom), will have interrupted, randomly interspersed blocks of sharing.

Here is what the same relationships look like using Family Tree DNA's Family Finder Chromosome Browser:

Parent/child DNA inheritance at FTDNA's Family Finder


Full sibling DNA inheritance at FTDNA's Family Finder

At AncestryDNA, all you get is this:


With this explanation:


It reads, "Our analysis of your DNA predicts that this person you match with is either your parent or your child. While there may be some statistical variation in our prediction, it is very likely to be a parent/child relationship. There is a very small possibility that the relationship may be up to two degrees of separation like a brother or a grandchild."

This explanation is very confusing to me for a couple of reasons. First, there does not need to be any level of "statistical variation" or uncertainty between parent/child versus sibling relationships. Doesn't AncestryDNA take into account the two testers' ages? Don't they look at the pattern of inheritance as illustrated above? If they had done either in the case outlined in this post, they would have easily realized that their prediction with 99% confidence was wholly inaccurate. Second, it is a bit odd to me that they discuss degree of relationship instead of expected percentage of shared DNA for immediate family relationships, which is much more relevant here. Their explanation groups brother and grandchild together, separate from parent and child, rather than explaining that parent/child/sibling relationships all share around 50% of their DNA, while grandparent/grandchild only share about 25% of their DNA. Aunt/uncle/niece/nephew/half-sibling relationships also share about 25% on average. The ages of the matches will usually distinguish between these relationships, but when they don't, the pattern of inheritance almost always does.

This is not the only case where an adoptee has been confused by their AncestryDNA close relationship predictions this week. Another adoptee was elated to receive a first cousin prediction, but doesn't know if it is indeed a first cousin because there is no way of determining what criteria AncestryDNA used for the prediction. Search angels have been helping the adoptee research this one too and all have strong doubts as to the accuracy of the prediction based on the match's family tree.

I realize that Ancestry.com has said that they wish to keep the interface simple for the layman, but look what this adoptee wrote to me today, "They need to change something. It is much too confusing to predict what it actually means, especially for those of us who are doing our searches from home with no training." It sounds like, at least for adoptees, the end result of not including the specific underlying genetics is the exact opposite of what AncestryDNA was intending to accomplish.

I am involved in and aware of a quickly increasing number of successes involving adoptees using 23andMe and Family Tree DNA to discover their roots. By most accounts, there are at least six million adoptees in the United States, many of whom wish to learn about their genetic roots. (This number does not include donor-conceived individuals.) When these adoption DNA success stories get out in a big way, AncestryDNA is going to miss out on a very large market. I really hope they rethink their offerings, so we can ALL benefit from their service.

When contacted about the confusion with Chris and Pat's match, AncestryDNA's customer service was quick to remind them that the test is still in beta. With a database of over 50,000 autosomal DNA customers and growing fast, that seems a weak excuse. If they were unsure of their algorithms (and as I have demonstrated, there should be no reason for uncertainty in predicting close relationships), then they should have limited the beta to the original first 12,000 participants until they had tested it further. When a customer sees a 99% confidence prediction, this does not imply uncertainty, even in beta. In this case, the AncestryDNA representative told Chris that he thought the prediction might be in error. He said that they believed that the match was real, but that the prediction may be too close. Strangely, Chris was told that they needed a new sample and it would take two weeks for the kit to arrive and 6-7 weeks more to receive the results after kit activation. Why would they need a new DNA sample? Can't they just rerun the comparison or, even easier yet, simply look at the DNA sharing and reach a conclusion? If AncestryDNA wants to send the matching data to me, I will guarantee to give them a very quick answer! ;-)

Just for those of you who are wondering...
We considered the possibility that Pat is also adopted or donor-conceived, but this is highly unlikely due to several factors that I will not disclose here. The only other possibility would be a switched-baby-scenario at the hospital. Obviously, the odds of this are extremely small.

Regardless of the real situation, should Chris or Pat have to wait another 9-10 weeks to find out? Even if it turns out that somehow they are, indeed, closer relatives than our research implies, all of this confusion and heartache could have been avoided with the matching DNA information provided by the other two companies offering these tests. Don't the adoptees in our communities deserve better? Haven't they been forced to jump through enough hoops in an attempt to discover the information that the rest of us possess as our birthright?

As I'm sure my readers will agree, I am always fair to the companies involved in genetic genealogy and no one is a bigger cheerleader when a company gets it right, but this situation is simply inexcusable to me. I am interested in hearing how you feel about it too, so please share your thoughts. I would like to close with the words of one of the adoptees involved in this regrettable situation (words in parentheses were added for clarity):

It's bad enough some of us already don't know who we are and are refused access to our own identity and medical information, but to turn around and pay money for something we think may bring us a glimmer of hope into the secrets of who we are, and then end up with more questions than answers, it is frustrating. It's almost like dangling the carrot in front of the horse, where they can see it but just can't quite reach it.

I still feel that I am closer than I was, but without a secret decoder ring I feel like I wasted $100...
I really don't have any way to know if I have the right information or how far off this test is. I have nothing concrete to compare it to and I could be doing all this work off of information that may not even been valid...at first I was really excited because I thought I had found some major clue (and I still may have, and definitely have more than I did before) and then started realizing that this could just be a goose chase.

It's part of the search I guess, but this situation was a bit different, I knew it was a long shot, because someone else (closely related) has to have taken the test, but then when you immediately get a hit that seems that close its an amazingly surreal feeling, now I am just worried it was $100 lost that I could have used towards one of the other more expensive test on other sites... I feel they (AncestryDNA) did something wrong in the way they set this up. 


**Update** - Immediately after reading this post, 23andMe generously offered both testers a free kit through their Personal Genome Service. When they receive the results, we will be able to determine their exact relationship (if they are indeed close family).

***Update 8/24 - AncestryDNA has stated that this was a lab error that is being rectified. Update post here.

****Update 9/15 - 23andMe finds no match between "Chris and Pat", details here.

Friday, July 20, 2012

Known Relative Studies at FTDNA: Third Cousin Comparison and More Random atDNA Inheritance

I don't write about Family Finder very often for my known relative series since most of my close relatives have tested at 23andMe. Fortunately, my Travis third cousin recently decided to take the Family Finder test at Family Tree DNA. As a result, I have a new third cousin comparison to report.

Our only (known) common ancestors are our great great grandparents Abraham and Ruth (Stolebarger) Travis, so any matching DNA that we share is inherited from the Travises. Abraham's father Asa and Ruth's parents John and Sarah are two of my major genealogical brick walls, so it is really interesting to be able to isolate DNA that came from those lines.

My Travis third cousin and I share 45.96 total cM of DNA with a longest block of 14.84 cM. This is on the low end for third cousins and the most likely relationship suggested by FTDNA is actually fourth cousins. Only about 25 cM comes from segments longer than 5 cM, so just including those in my calculations (to keep it consistent with my 23andMe comparisons), that means we only share about .37% of our total DNA. Since third cousins are expected to share about .781% of DNA, this is a bit low, but it is in line with my other third cousin comparisons so far (averaging .39%). That's random atDNA inheritance for you!
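For readers who want to check the arithmetic, here is a small Python sketch. The expected figure for third cousins follows from halving at each generation, which gives (1/2)^(2n+1) for full nth cousins. The 6,770 cM total used to turn centimorgans into a percentage is an assumption: it is not stated in this post and was chosen only because it reproduces the .37% figure above. Both function names are hypothetical.

```python
# ASSUMPTION: total genome length in cM used as the denominator. This value is
# not given in the post; it was picked because 25 cM / 6770 cM is about .37%.
ASSUMED_TOTAL_CM = 6770


def expected_cousin_share(degree):
    """Expected fraction of autosomal DNA shared by full nth cousins."""
    return 0.5 ** (2 * degree + 1)


def percent_shared(shared_cm, total_cm=ASSUMED_TOTAL_CM):
    """Convert shared centimorgans into a percentage of the assumed total."""
    return 100 * shared_cm / total_cm


if __name__ == "__main__":
    print(f"expected for third cousins: {expected_cousin_share(3):.3%}")
    print(f"observed from 25 cM: {percent_shared(25):.2f}%")
```

With these inputs the expected value comes out at about .781% and the observed value at about .37%, roughly half of what the averages predict.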

Family Tree DNA offers a unique perspective on these comparisons, so I will share how this match looks on the different settings that are possible on their Chromosome Browser tool. In the chart below, the blue bars represent my twenty-two autosomal chromosome pairs. The orange bars are the sections where my Travis cousin and I have stretches of matching DNA. This chart is displaying the lowest setting in order to show all matching segments over 1 cM. Many of these are probably false positives, but it is still interesting to be able to see them.

My third cousin comparison showing all matching segments over 1 cM

The next image is set to show only matching segments over 3 cM. As you should be able to see, our main matches are on Chromosomes 2 and 14. The only other match that didn't drop off is the one on Chromosome 6. This match falls under the threshold of what you would see at 23andMe, so over there we wouldn't have known about it at all. Remember, this could be a "false positive" since a pretty large percentage of segments this size prove to be, but I will reserve judgment until I am able to compare it to my chromosome map (when it is more fully developed) to confirm if this segment falls in an area that I can positively attribute to my Travis ancestral line. That should help determine whether this is an authentic matching segment or not.

My third cousin comparison showing all matching segments over 3 cM

The final chart shows only the two largest matching segments. You can see them signified by the orange on Chromosomes 2 and 14. When you scroll over these spots, a window will open (as above) describing the exact starting and ending points of the matching segment(s).

Third cousin comparison showing only matching segments over 5 cM

To summarize, I compared a known third cousin to myself to identify the portions of our DNA that match each other. Due to the sizes, we can be confident that two of the matching segments are authentically inherited from our common ancestors. What this means is that I can now attribute those larger segments as originating with Abe and Ruth Travis.

Abe and Ruth Travis

If my Travis cousin were to upload his data to Gedmatch, I could compare him to my mother, sisters and some of my other cousins who have tested at 23andMe. It would be very interesting to see the variety of inheritance patterns. Hopefully, I will be able to do so in the future.

On another note, I actually have four matches on Family Finder who are predicted to be more closely related to me than my Travis third cousin. I have not found a common ancestor with any of these people mainly because, with the exception of one, they do not respond to my emails. It does make you wonder what would happen if everyone responded with great family trees, ready for comparison.

This should give some hope to those of you who are struggling to confirm common ancestors with your matches. We tend to focus on the larger matches, of course, but some of these seemingly lesser matches could still be quite significant. As I have often emphasized, autosomal DNA inheritance starts getting pretty unpredictable at about the third cousin level. This comparison is a good example of that because we wouldn't usually expect a .37% match to have such a (relatively) recent connection. So, take a closer look at your match lists and give it another go. You just might be surprised by what you find!


*Another third cousin comparison: I found my third cousin today at 23andMe!*