The popularity of personal DNA testing has grown rapidly over the past few years. For $99 or less, anyone can obtain heritage estimates, lists of genetic relatives, and a DNA raw data file from companies like 23andMe, AncestryDNA, and FamilyTreeDNA. Approximately 1 in 25 Americans has used one of these tests.1
I think it’s pretty amazing that the technology exists to use DNA as a map to locate unknown relatives, to estimate where our distant ancestors came from, and to learn about potential disease risks — all with just a little spit, and for less than a hundred bucks. However, if you take a look at the Facebook page for any of the testing companies, you’ll notice more than a few unsatisfied customers.
One of the biggest complaints has to do with the perceived accuracy of ethnicity percentage estimates. Some customers claim testing companies “messed up” their reports. The angriest gripes seem to be from the people with the least diverse genetic makeup — participants whose family trees only span a small portion of the globe.
Disgruntled consumers have been dominating Ancestry’s public page since the company made some huge changes to users’ heritage estimates last week, though similar comments have been made about Ancestry’s competitors as well. I’ve included just a few of the recent comments.
Customer T.B. doesn’t seem very pleased with her results, and apparently her father is now missing…
K.R.’s complaint is so common, I wrote two articles to explain the disappearing German phenomenon. To understand why Germans aren’t always German, it helps to understand the history of Europe’s changing borders and the impact of migration on the ethnic mixture of different regions.
I.O. is not alone in his belief that DNA testing should correctly identify distinct genetic markers to differentiate between inhabitants of Britain, Scotland, Ireland, and other nearby countries. This grievance is echoed by countless others of primarily European descent.
The problem with estimating ethnicity percentages for those of relatively unmixed heritage is that it’s very difficult to pin down exactly what British vs. Scottish vs. Norwegian DNA looks like. Let’s take a closer look at the regions mentioned in I.O.’s comment. As an example, the distance between London and Oslo is shown on the map below. It’s about 718 miles.
For comparison, that’s about the same distance as New York City to Chicago. Sure, there’s a bunch of water between London and Oslo, but scientists believe even the Neanderthals traveled by boat.2
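If you'd like to check distances like these yourself, a great-circle calculation is enough. The sketch below uses the haversine formula with approximate city-center coordinates. The coordinates, the function name `haversine_miles`, and the mean Earth radius are my own assumptions rather than values taken from the map, so expect results within a few miles of the figures quoted above.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an approximation

# Approximate city-center coordinates (latitude, longitude) in degrees.
# These are illustrative assumptions, not values from the original maps.
CITIES = {
    "London": (51.5074, -0.1278),
    "Oslo": (59.9139, 10.7522),
    "New York City": (40.7128, -74.0060),
    "Chicago": (41.8781, -87.6298),
}


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) points."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(h))


for start, end in [("London", "Oslo"), ("New York City", "Chicago")]:
    miles = haversine_miles(CITIES[start], CITIES[end])
    print(f"{start} to {end}: about {miles:.0f} miles")
```

Both pairs should come out at a little over 700 miles, which is the point of the comparison: the two trips are roughly the same length.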
Irish, Scottish, and British DNA is very similar, so it’s understandable, given the proximity of these places, that DNA across this entire region is somewhat homogeneous. Instead of blaming Ancestry for their estimates, it would make more sense to blame the Vikings. There’s an interactive tool here that may help explain why so many people have bits of mystery Scandinavian DNA.
The Vikings went everywhere, and most Europeans will have a few Nordic genetic markers. 1-3% of unexpected Scandinavian DNA is pretty common, even for those with no paper-trail history to ancestors from this region. Amounts nearing 10% or higher should indicate at least one 2nd or 3rd great-grandparent of Scandinavian descent.
In my previous post about the old versus new Ancestry ethnicity estimates, I described how my 9% Scandinavian disappeared and was replaced by Estonian/Latvian/Lithuanian, which makes sense. These regions are so close to each other that it’s understandably difficult to determine the origin of a bit of genetic code.
The distance from Stockholm, Sweden to Riga, Latvia, is only 275 miles. Below is another map, bringing southern Finland into the mix, to further illustrate the challenge of defining genetic nationality within geographic boundaries.
Tallinn, Estonia, is only about 50 miles from Helsinki, Finland. These are two different countries, separated by water, with a lot of similar DNA. The bottom line with these ethnicity estimates is that most companies will be able to offer a pretty close estimate of your overall genetic heritage.
And Europeans will be disappointed.
Those who may find the heritage breakdowns enlightening are people whose forebears were from different parts of the world. Testing companies have access to population samples from across the globe, but most of their customers are predominately European. Many of these people put a lot of faith into old family lore and become upset to learn they are not actually related to Sitting Bull.
Native American DNA has very specific genetic markers. It’s possible J.A.’s Native American ancestor is too distant to show up on a DNA test, but more likely, she’s just one of many people who have fallen for family lore. I’ve even seen comments from customers claiming that all test results were the same, or that results were somehow rigged to only show certain ethnicities.
My AncestryDNA heritage mix map is shown below. It sort of looks like a partly-filled-in coloring page.
Below is my friend Colleen’s map. She has Native American ancestors.
Colleen’s mother is German-Irish, with centuries of history in America (shown by the orange circles in the center of the U.S. on the map). Her father is half Italian and half Mexican. We’ve been trying to locate her paternal grandfather’s place of birth in Mexico, but we haven’t figured it out yet.
With this paper-trail info, we can guess about 50% of Colleen’s heritage mixture should be a combination of German, Irish, and English. About 25% should be Italian, or nearby regions, and the other 25% should be Spanish or Native American, or a mixture of both.
Zooming in, we can see her full percentage breakdown.
Adding up Germanic Europe, Ireland & Scotland, and England, we get 61%, about 11 percentage points higher than expected. This suggests Colleen’s father’s side also had some Germanic and/or Northern European ancestors.
Combining the percentages for Italy, Greece, Portugal, and Sardinia gives us 22% — very close to the 25% we’d predicted with one Italian grandparent.
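The comparison in the last few paragraphs is simple arithmetic, and it can be written down as a short sketch. The group labels and the `compare` helper are hypothetical names of mine. The numbers are the paper-trail expectations and the AncestryDNA totals quoted above.

```python
# Paper-trail expectations versus reported totals, in percent.
# Group labels are shorthand for the regions summed in the text.
expected = {"German/Irish/English": 50, "Italian and nearby": 25}
reported = {"German/Irish/English": 61, "Italian and nearby": 22}


def compare(expected, reported):
    """Return reported minus expected, in percentage points, per group."""
    return {group: reported[group] - expected[group] for group in expected}


for group, diff in compare(expected, reported).items():
    print(f"{group}: {diff:+d} percentage points versus the paper trail")
```

A positive difference means the test reported more of that group than the family tree predicted, and a negative one means less.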
Every person has eight great-grandparents (unless your parents were cousins, but that’s another article) and each great-grandparent contributes about 12.5% to someone’s total DNA. This percentage can vary quite a bit, but it gives us an idea of what sort of baseline percentages to look for when trying to identify an unknown ethnicity.
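The 12.5% figure comes from halving at each generation: a parent contributes about half, a grandparent about a quarter, and so on. Here is a minimal sketch of that rule of thumb. `expected_share` is a hypothetical helper, and it returns the average expectation only, since the amount actually inherited from any one ancestor varies.

```python
def expected_share(generations_back):
    """Average fraction of DNA expected from one ancestor.

    generations_back: 1 = parent, 2 = grandparent, 3 = great-grandparent.
    """
    if generations_back < 1:
        raise ValueError("generations_back must be at least 1")
    return 0.5 ** generations_back


LABELS = {
    1: "parent",
    2: "grandparent",
    3: "great-grandparent",
    4: "great-great-grandparent",
    5: "3rd great-grandparent",
}

for generations, label in LABELS.items():
    # Assumes no cousin marriages, so the ancestor count doubles each time.
    ancestors = 2 ** generations
    share = expected_share(generations)
    print(f"{label}: {ancestors} ancestors, about {share:.3%} each")
```

Dividing an unexplained percentage by one of these shares gives a rough idea of how many ancestors at that generation it could represent.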
I also want to note that we have a paper-trail confirming Colleen’s maternal lineage, as well as dozens of DNA matches to confirmed 3rd-5th cousins who share those maternal-side ancestors. We also have documentation going back to Italian birth records to verify Colleen’s father’s mother’s side. DNA should be used in conjunction with historical evidence to draw the most accurate conclusions.
According to AncestryDNA, Colleen has 15% Native American DNA (including the Andean 1%), which suggests one or more of her great-great-grandparents was Native American. It’s even possible she has Incan ancestry, based on the parts of South America on her genetic map. Of course, without much more data, this is pure (but fun) speculation.
The small percentages of African and Asian may never be traceable to a particular great-great-great-grandparent, but those numbers indicate Colleen shares some distinctive genetic markers with people from those regions. But could these be AncestryDNA errors, or are they truly indicative of ancient ancestry?
Colleen has also tested with 23andMe. Their estimate map is similar to Ancestry’s map, except 23andMe has broken down the African and Asian into smaller sub-categories.
According to the 23andMe map, Colleen’s Germanic & British and Northern European mixture is about 49%, her Italian and Southern European mixture is about 34%, and Native American is about 11%. This percentage breakdown, though more detailed than the previous map, still leads to the same tentative conclusions as before.
Hopefully, we’ll figure out who Colleen’s paternal great-grandparents were, but in the meantime, DNA strongly suggests some of her ancestors were in North and South America for hundreds, if not thousands, of years. A sample from Colleen’s father, brother, or any direct paternal-line relative could be used for a Y-DNA test. These results would likely provide more information about her father’s father’s genetic origins.
Ethnic percentage estimates may not be terribly useful for those with a very homogeneous genetic lineage, but for anyone else, they can be a useful tool. These estimates keep improving as companies add larger population samples, more ethnic groups, and better algorithms, but they are still far from perfect.