How accurate is the DNA information from Ancestry and 23andMe?

Both are accurate and both are scams. Or at least, scam-ish.

I can’t find any published verification of their specific accuracy rates, but error rates for modern sequencing and microarray methods are vanishingly low, well under 1 in 1,000. Assuming they are using technology from this decade, their error rates are trivial.

But of course, they don’t send you raw sequences. Both Ancestry and 23andMe provide family history reports detailing where your ancestors came from. This information is not nearly so exact, and it can’t be. There are a couple of reasons for this.

Ancestry determinations are based on sequence variants that are characteristic of particular geographic or ethnic groups. How do you know which sequence corresponds to which group? You have to have some reference group, one that you know is Scandinavian or Malay or Jewish. And how do you know this? It’s only by their present location or their professed identity.

That would work great if people never migrated or intermarried or changed group identities. But they do. There is no such thing as a pure ethnic or geographical group. Unless you are East African, your ancestors came from somewhere else. Most people don’t even know who their fairly recent ancestors (e.g., great-grandparents) were and have no idea where they came from. So the reference groups are bound to be inaccurate, both omitting sequences that could define groups and including sequences that don’t. And not everyone within a group, however defined, will have all the markers that are characteristic of the group.

From What is a Haplogroup?

Sequence-based identification works great at the group level, where these errors average out, but it is bound to produce some whoppers at the individual level. Sequence association with group identity is a probability, not a determination.
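To make the "probability, not a determination" point concrete, here is a minimal sketch using Bayes' rule. Everything in it is an assumption for illustration: the function name and the three numbers are made up, and this is not how Ancestry or 23andMe actually compute their reports (they combine many markers, not one). It only shows why a marker that is common in a group can still be weak evidence about one person.

```python
def posterior_membership(prior, p_marker_in_group, p_marker_outside):
    """Bayes' rule: probability of belonging to a group, given that
    a marker characteristic of that group was observed."""
    evidence = prior * p_marker_in_group + (1 - prior) * p_marker_outside
    return prior * p_marker_in_group / evidence


# Hypothetical numbers, chosen only for illustration:
prior = 0.05   # 5% of the customer base belongs to the group
p_in = 0.60    # 60% of group members carry the marker
p_out = 0.10   # 10% of everyone else carries it too

post = posterior_membership(prior, p_in, p_out)
print(f"P(group | marker) = {post:.2f}")  # prints "P(group | marker) = 0.24"
```

With these made-up inputs, a marker six times more common inside the group than outside it still leaves about a three-in-four chance that the carrier is not a member. Across thousands of people that statistical signal is clear; for any single customer it is a hedged guess.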

The other type of info provided, disease risk, is even sketchier. There are a few one-gene, one-disease associations, but they are rare. And even more sophisticated analyses of multiple genes aren’t very helpful. This isn’t a technology problem. It’s because our disease risk is overwhelmingly determined by environmental exposures, not genes. I’ve written about this here and here and here.

From Distribution of breast cancer according to genetic risk

Population-level genetic information is immensely helpful in understanding human history and biology. As a guide to who you are, not so much. I’ve never felt the least temptation to pay for information that is likely flawed and of no use.

No use to me anyway. I’m sure Ancestry and 23andMe have plans to monetize our information for the benefit of their investors. If they want mine, they’ll have to pay me for it, not the other way around.
