In my previous Blog, I looked at AutoClustering my AncestryDNA and FTDNA matches. In this Blog, I’ll look at 23andMe. I have to confess that I have never had a good feel for working with the DNA matches at 23andMe. I was hoping that AutoCluster would give me a boost in figuring out what I have there.
Here is my AutoCluster at 23andMe:
Now I am up to 45 Clusters. I used a slightly lower threshold than I used at FTDNA (20 cM at 23andMe vs. 25 cM at FTDNA) and got different results. At FTDNA, Cluster 1 had 108 members and Cluster 2 had 10 members. At 23andMe, the first two Clusters are a bit more even at 66 and 65 members. I also note that the members of the green Cluster 2 are quite closely related to one another: all 65 members match each other.
Identifying the 23andMe Clusters
My first thought is to figure out what these clusters represent. Which line is which? I do have a few known cousins at 23andMe.
Cluster 8: The Lentz/Nicholson Line
My mom has a cousin Judith who is on the Lentz Line. She is on Cluster 8.
Judith also descends from the Nicholson family as does at least one other person in Cluster 8.
My Cousin Jennifer: Hartley Side
Another point of reference is Jennifer who is my 2nd cousin, once removed.
This corresponds with my Hartleys at AncestryDNA:
Steve with Clarke Ancestry
I’ve blogged about Steve who is a 23andMe match. He has Clarke ancestry and is in Cluster 19:
Cluster 19 is quite a ways down on the list.
Cluster 2 and Chromosome 20
I have written a few Blogs on my Chromosome 20. I have many matches there on my Frazer grandmother’s Irish side. These Chromosome 20 matches appear to correspond with my Cluster 2. Here is one Blog I wrote on my Chromosome 20 about 2-1/2 years ago. In that Blog, I reasoned that the matches may be on my McMaster side:
In my previous request for an AutoCluster at FTDNA, I had set the lower threshold at 25 cM and that had filtered out a lot of the Frazer side matches. At 23andMe, I lowered the threshold to 20 cM which would explain the larger cluster.
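To make the effect of the lower threshold concrete, here is a minimal Python sketch. The match names and cM totals are made up for illustration, and `filter_matches` is a hypothetical helper of mine, not part of AutoCluster. It only shows how dropping the minimum shared cM from 25 to 20 lets more matches into the clustering.

```python
# Hypothetical match list: (name, total shared cM). Values are illustrative only.
matches = [
    ("Match A", 31.0),
    ("Match B", 24.5),
    ("Match C", 21.2),
    ("Match D", 19.0),
]


def filter_matches(matches, min_cm):
    """Keep only the names of matches whose total shared cM is at least min_cm."""
    return [name for name, total_cm in matches if total_cm >= min_cm]


at_25 = filter_matches(matches, 25.0)  # the lower threshold used at FTDNA
at_20 = filter_matches(matches, 20.0)  # the lower threshold used at 23andMe
print(len(at_25), len(at_20))
```

Any match that falls between 20 and 25 cM is dropped at the higher setting and kept at the lower one, which is the kind of difference that could make a cluster grow.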
Deciphering FTDNA Cluster 1
If FTDNA is like Ancestry and 23andMe, then the yellow Cluster should be a Hartley Cluster. First I checked the top match. It turns out that FTDNA over-reports these matches:
Roger shows a match of 67.3 cM with me, but his top segment is 12.3. Here is what the FTDNA Browser shows:
The browser shows one small match at Chromosome 20. This is where I have a lot of Frazer matches as described above. Theresa is also in FTDNA Cluster 1:
Theresa also has a relatively small match corresponding with her 13.1 cM largest segment on Chromosome 20. That means that even though I tried to avoid my Chromosome 20 overmatching problem by raising the threshold to 25 cM, FTDNA managed to add in tiny segments and raise the totals for these matches.
It is unfortunate that FTDNA has small matches that come out as large. I don’t know if this is as big a problem for others as it is for me. Basically I have a large group of distant relatives that I can’t connect with in Cluster 1.
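One way to see how small segments can inflate a total is to recompute it with a minimum segment size. The sketch below is only an illustration: apart from a 12.3 cM largest segment like Roger's, the segment lengths are invented, the 7 cM cutoff is my own assumption rather than anything FTDNA or AutoCluster uses, and `total_cm` is a hypothetical helper.

```python
# Illustrative segment lengths in cM for one match: one sizable segment
# plus a string of tiny ones. The tiny values are invented for this example.
segments = [12.3, 3.1, 2.8, 2.2, 1.9, 1.5]


def total_cm(segments, min_segment_cm=0.0):
    """Sum segment lengths, ignoring segments shorter than min_segment_cm."""
    return round(sum(s for s in segments if s >= min_segment_cm), 1)


reported = total_cm(segments)       # every segment counted, however small
filtered = total_cm(segments, 7.0)  # assumed 7 cM cutoff drops the tiny ones
print(reported, filtered)
```

In this made-up case the tiny segments nearly double the total, even though only one segment is large enough to show up in a chromosome browser.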
A Comparative View: Three Companies
Here is a comparison of the three AutoCluster runs I have done with three companies. A better comparison would be for me to rerun the Ancestry results with a lower threshold:
- I changed the Ancestry Cluster 1 name from Hartley to Snell. That is because the cluster goes back to Snell and beyond my Hartley ancestors for some of the matches.
- In the three analyses Clarke went from Cluster 2 to 6 to 19.
- I noted a special Chromosome 20 issue that I had. This didn’t come up at Ancestry as the minimum cM threshold there was set higher. I may be able to identify this group later at Ancestry when I am able to run an AutoCluster at a lower cM threshold.
- The Ancestry AutoCluster analysis only went up to 5 Clusters based on the standard AutoCluster thresholds.
FTDNA Cluster 2
The above summary points out that I have not yet figured out FTDNA Cluster 2. So far, I don’t have a definitive answer for this Cluster. The people tend to match me on my Chromosome 10. I have tended to associate their ancestors with Colonial Massachusetts.
FTDNA Cluster 3
This Cluster appears to match on Chromosome 22. I think that they are Irish in background. My Chromosome 22 (Joel) is all Irish Frazer on the paternal side:
At least one of my matches from Cluster 3 is also listed at Gedmatch. I have a paternally phased kit which she matches. That is how I can tell that the match must be on my Irish Frazer side.
Back to 23andMe: Cluster 4
Cluster 4 has 17 people in it (or items according to AutoCluster).
Two of these “items” are listed as unknown. Next I need to identify one or more of these people in the list. John listed 8 surnames, but none of them sounded familiar. So far, these matches are matching me on Chromosome 3. Here is the match with Kris at the top of the Cluster 4 list:
From visual phasing, I know that this segment has to be either Hartley or Rathfelder DNA (at the level of my grandparents).
I recognize some Hartley names in that area of the match and they aren’t in Cluster 4. That means that this has to be a Rathfelder side match.
I’m not getting very specific with these Clusters. Part of the reason is that 23andMe does not emphasize ancestral trees. So if I ever meet these cousins, I can introduce them as my Rathfelder Line Chromosome 3 cousins. From one of my other maternal Chromosome 3 matches, I see that I have traced one of these families to a German Colony in Saratov, Russia. I have not yet made the connection between them and my ancestors who lived in a German Colony in Latvia.
So, Where Are We?
Here is a summary of some of the clusters:
I had the best luck with AncestryDNA. This is partly because I have been working with them more. It is also partly because the higher cM threshold there gave me only five clusters, and those were the more obvious ones. Ancestry also has the most matches and best genealogical trees.
FTDNA came in next as they do have some genealogical trees. This is where I tested first, so I have some familiarity with how they work. Their matching algorithm causes a perfect storm for my Irish Chromosome 20 matches, showing them as matching much more closely than they should. I expect that this is true to a lesser degree with some of my other matches.
23andMe was the most difficult as they focus the least on genealogical trees. It would take a bit of time to contact some of the critical matches there. I believe that 23andMe have more test results than FTDNA, so they have that going for them.
Summary and Conclusions
- So far, it has been easiest to interpret the AncestryDNA clusters. I would like to take the cM levels down once some of the bugs have been worked out.
- I got many more clusters at FTDNA and 23andMe, but some of the cluster descriptions are more vague than I would like.
- I would like to look more into the Hartley/Snell clusters. I am interested in Hartleys that don’t match Snells as my genealogical brick wall goes back on my Hartley line – pre-Snell.
- It would seem that I should be able to cross-reference the clusters. Even though the matches are different at the different companies, the common ancestors are the same.
- This utility is new, so people are still experimenting with it. For example, is there a sweet spot for the number of clusters that isn’t too high or too low? I have 32 third great-grandparents, which is the level where fourth cousins connect. This may be a good number of clusters to shoot for. Some ancestors at the 3rd great-grandparent level may be too obscure to have clusters. However, this could be offset by 4th great-grandparents with a lot of descendants that would make good clusters.
- A lot of the clusters have two people in them. Is it worthwhile looking at such small clusters?
- The AutoCluster utility has given me a fresh look at my DNA matches. I have also been entering some of the larger matches into my match spreadsheet.
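The figure of 32 third great-grandparents mentioned above comes from simple doubling: each generation back has twice as many ancestor slots as the one before. Here is a small Python sketch of that arithmetic. The function names are my own, and it assumes no pedigree collapse (no ancestor appearing in more than one slot).

```python
def ancestors_at_generation(generations_back):
    """Number of ancestor slots a given number of generations back (parents = 1)."""
    return 2 ** generations_back


def nth_great_grandparents(n):
    """Count of nth great-grandparents; great-grandparents (n = 1) are 3 generations back."""
    return ancestors_at_generation(n + 2)


def shared_ancestor_level_for_cousins(degree):
    """Full kth cousins share (k - 1)th great-grandparents (0 means grandparents)."""
    return degree - 1


level = shared_ancestor_level_for_cousins(4)  # fourth cousins -> 3rd great-grandparents
print(level, nth_great_grandparents(level))   # prints: 3 32
```

The same doubling gives 64 fourth great-grandparents, so the number of possible clusters grows quickly with each generation.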