An Analysis of My Mom’s Shared Clusters

I’ve been playing around lately with Jonathan Brecher’s Shared Cluster Program. I wrote a series of three Blogs looking at my own Shared Clusters. The last Blog is here.

Downloading My Mom’s AncestryDNA Matches

First I signed into my account using the Shared Clustering program. I chose my mom’s kit and downloaded it.

I used the middle button which says ‘Slow and complete’. It was supposed to take several hours but only took 32 minutes. My mom has 26947 total matches and 438 fourth cousins. That is about half the matches that I have.

First Try: 50 cM Gets Two Clusters

My first try was conservative at 50 cM. I only got two clusters. I was sure to name the output file indicating that it was for 50 cM:

This gave me two named clusters. I say named, because I see that there are probably actually four clusters altogether. My guess is that these represent her four grandparents. The top cluster represents my mom’s mom who was a Lentz.

By the way, my analysis looked like this:

I have a green box around my mom’s side. I will be expanding that like this:

For now, I’ll keep it simple with a paternal and maternal side for my Mom:

3 Clusters at 40 cM

These clusters are in no hurry to separate. That is partly due to my mother, her niece and two grand-nieces being in Cluster 3:

One person I know in Cluster 2, Otis, descends from a Schwechheimer:

Also Gangnus, but more distantly.

At 35 cM, Mom Has 6 Clusters

Now my mother’s maternal grandparents, Lentz and Nicholson are starting to separate. Nicholson is Cluster 2 and Lentz is Cluster 4:

Here is a simple comparison of Clusters between the 40 cM and 35 cM cutoffs:

This shows that the Lentz 1900 went to Lentz and Nicholson as I have them. That means I need to figure out where Clusters 1 and 6 go. These are new clusters with no precedent. All three in Cluster 1 match my mother’s Rathfelder cousin, so Cluster 1 could be Rathfelder or Gangnus:

Cluster 6 also appears to be on my mother’s father’s side which could be Rathfelder or Gangnus. I’ll put them under Gangnus as I have nothing else there so far.

My Mom’s 10 Clusters Down at the 30 cM Match Level

My MS Access query gives me this:

At the 30 cM level, this shows that Clusters 3, 6 and 10 are new. The previously new Clusters 1 and 6 at the 35 cM level map to new Clusters 9 and 1. Cluster 7 is interesting: it has no precedent at the 35 cM level, but one of those non-precedents was previously in Cluster 2 (Schwechheimer) at the 40 cM level. So that may be a clue. Here is how these 35 cM clusters map at the 30 cM level:

Three maps to 8 and 7. 5 maps to 2. Technically 5 did not map to 7 but someone who was in the precursor to 5 mapped to 7. This probably gets into intermarriage in the German colony of Hirschenhof, Latvia where some of my ancestors lived. I suppose I could have omitted the second 7. Here is Cluster 7:

It looks to have two parts to it.

That leaves Clusters 3, 6 and 10 to map. Because I compared the clusters above in Access, I didn’t have to know anything about who was in the clusters when I mapped Clusters 1, 2, 4, 5, 7 and 8. However, for Clusters 3, 6 and 10, I have to know something about these people to put them in the right places.
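For anyone who would rather script the blind comparison than run it in Access, here is a minimal Python sketch of the same idea. The test IDs and cluster numbers are made up for illustration and are not my mom’s real data. For each cluster at the lower cutoff it collects the clusters its members came from at the higher cutoff. A cluster with no sources is a new cluster.

```python
# Hypothetical data: test ID -> cluster number at each cutoff.
# A match missing from the 35 cM run was below 35 cM or not clustered there.
clusters_35 = {"A": 2, "B": 2, "C": 4, "D": 4, "E": 5}
clusters_30 = {"A": 4, "B": 4, "C": 5, "D": 5, "E": 2,
               "F": 3, "G": 3, "H": 3}


def precedents(earlier, later):
    """Map each later cluster to the set of earlier clusters its members came from."""
    result = {}
    for test_id, later_cluster in later.items():
        # setdefault makes sure every later cluster is listed, even a new one
        sources = result.setdefault(later_cluster, set())
        if test_id in earlier:
            sources.add(earlier[test_id])
    return result


mapping = precedents(clusters_35, clusters_30)
new_clusters = sorted(c for c, sources in mapping.items() if not sources)

for cluster in sorted(mapping):
    print(cluster, "came from", sorted(mapping[cluster]))
print("New clusters:", new_clusters)  # New clusters: [3]
```

A new cluster found this way still needs the manual look described above, since the script only knows cluster numbers and nothing about the people.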

Finding a Home for Cluster 3

Cluster 3 at the top left is part of a larger Cluster 3-4-5 complex:

There is a large area below the second green box which can be ignored as that would be my mother’s close relatives. The second green box is Lentz. Below the close relative area is Nicholson. One person in Cluster 3 matches 2 people in the Lentz Cluster. However, 3 people in Cluster 3 match people in the lower right Nicholson Cluster. So I’ll say this is a Nicholson Cluster. The Nicholsons came from Sheffield, England. One person in Cluster 3 has a tree. This tree shows both of his parents being from England.

Cluster 5 – Consider the Source

Here I have both Lentz and Nicholson mapping to Cluster 5. How does this make sense? The ones that map from Lentz (4) to 5 are my mom’s close relatives, so naturally, they would match both sides. I’ll take the Cluster 5 that comes from Cluster 4 out. That is one of the disadvantages of mapping the numbers without considering the source.
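Here is a small Python sketch of what considering the source could look like if it were automated. The rows are invented, and the 900 cM cutoff for a close relative is an assumption (it is the figure I use further down when filtering my mom’s close relatives). The idea is to drop the close relatives before listing which cluster maps to which.

```python
# Hypothetical rows: (name, shared cM with Mom, cluster at 35 cM, cluster at 30 cM)
rows = [
    ("niece",       1700, 4, 5),
    ("grand-niece",  950, 4, 5),
    ("match 1",       62, 4, 8),
    ("match 2",       48, 2, 5),
    ("match 3",       41, 2, 5),
]

CLOSE_RELATIVE_CM = 900  # assumed cutoff for a close relative


def cluster_pairs(rows, max_cm=None):
    """Distinct (35 cM cluster, 30 cM cluster) pairs, optionally
    ignoring matches who share more than max_cm."""
    return sorted({(c35, c30) for _, cm, c35, c30 in rows
                   if max_cm is None or cm <= max_cm})


all_pairs = cluster_pairs(rows)
filtered_pairs = cluster_pairs(rows, CLOSE_RELATIVE_CM)

print(all_pairs)       # [(2, 5), (4, 5), (4, 8)]
print(filtered_pairs)  # [(2, 5), (4, 8)]
```

With the close relatives left out, the misleading 4 to 5 mapping disappears and only the Nicholson (2) to 5 mapping remains.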

Cluster 6 – Rathfelder Side

Here are my notes on this small Cluster:

My Rathfelder grandfather was a German from Latvia. So by geographical phasing, that puts Cluster 6 somewhere on his side.

Cluster 10 – Mystery Cluster

Cluster 10 has strong internal matches indicated by the dark color of the Cluster. However, I don’t see any other matches with other groups or clusters:

Two of the matches have large trees of over 25,000 people. Comparing my mom’s tree and this large tree brings up the name Clayton:

However, JK also shares these areas of ancestry with my mom:

Here I have banished Cluster 10 to the Unknown Realm:

So now I am ready to dive into new depths of my mother’s clustering ancestry.

Mom Is Up To 18 Numbered Clusters at 25 cM

25 cM is the next-to-last stop for our analysis. This is an important level, because from this vantage point we can look back at the more recent 30 cM clusters and then look ahead to the 20 cM clusters, which reach further into the past. The progression for the numbers of clusters has gone 2, 3, 6, 10, and now 18.

This is the overall shape of the 18 clusters

My mom’s close relatives have split into two clusters. The previous Cluster 10 of unknown origin is getting bigger.

Putting Access to Work Again on the 25 cM Clusters

I imported Columns A-L to Access and added it to my ongoing query:

When I view this query, I get this:


I’ll sort by new clusters to see which ones are new:

Next, I’ll map the non-highlighted new clusters. Here is what I get with my blind comparison:

Clusters 7 and 8 are on my mom’s maternal and paternal side. I suspect that is a result of her close relatives matching both sides. I’ll search Access for Clusters 7 and 8 and add names:

Here I see Cluster 5 going to Cluster 8 only with two close relatives, so I’ll remove that from my analysis.

However, we are still left with two Clusters 7. One is on my mother’s maternal side and one on her paternal side. That means that Cluster 7 is a compound cluster due to close relatives being there and matching both sides:

That means my mom actually has at least 19 Clusters at this level:

Ancestry uses 20 cM for a fourth cousin cutoff. The fourth cousin level represents 32 ancestors – in theory.

Next, I just need to go down the Clusters and fill in the blanks:

Clusters 1 and 2 – Sheffield Area

One person in Cluster 7a matches all four people in Cluster 1. Cluster 2 goes back to a known Nicholson ancestor born in 1765. That would bring Nicholson down to 1798 on my mother’s line in the next generation. Because we know that ancestor, it is possible that Cluster 1 could be on Nicholson’s wife’s Clayton line:

Cluster 3 Affinity to Cluster 7b

Cluster 3 also has an affinity to Clusters 4 and 14. That brings up an error in my Chart:

At 35 cM, I had both Clusters 3 and 5 mapping to Cluster 4 at 30 cM. A review of my Access query shows one to be wrong:

Cluster 3 only mapped to Clusters 7 and 8. Here is the corrected version:

This is bound to be a little messed up because my mother’s father had Gangnus and Schwechheimer on his paternal and maternal sides:

Cluster 5

Fortunately, I have already done some work with Robert in this Cluster:

Here is a confusing tree. The confusing part is that the common ancestor is with Gangnus. Gangnus is also the wife of my mother’s Rathfelder grandfather. That means that this Gangnus goes in on my mother’s non-Gangnus side!

Actually, it appears that Robert is equally related on both sides:

In fact, as Otis is not in Robert’s shared group, I need to put Robert back on the Gangnus side. Good thing I’m flexible. Patrick doesn’t figure in as his results are on MyHeritage.

I needed someone on the Gangnus side that I could identify.

Summary of the 25 cM Clusters

Here I dumped more new clusters into the Unknown bucket. I don’t have much under Lentz, so some may belong there.

The Final Run of 20 cM for 36 Clusters

I run this at 6 cM which is the minimum, but the minimum shared matching at AncestryDNA is set at 20 cM. Some of the extra matches between 6 and 20 cM could give hints from the Correlated Cluster Column.

Probably the easiest part of the comparison is using Access. Here is what I can fit out of the 83 rows of results:

Here is what the mapping looks like except for Clusters 7a and 7b:

Both Clusters 10 and 5 mapped to Cluster 24. That probably has to do with the interweaving of Schwechheimer and Gangnus in my mom’s Colony of Hirschenhof ancestry. In order to map Clusters 7a and 7b, I bring in the 30 cM clusters to see which 7 I am looking at:

Here, Cluster 2 maps to 7b and Cluster 5 to 7a. This still gives some confusing results:

However, when I filter out my Mom’s close relatives of over 900 cM, I get this:

Here is how that looks in my summary chart:

I couldn’t decide whether to put Cluster 12 in under previous Cluster 7b, so I put it in parentheses. It looks like I mapped 19 clusters not counting those mapped more than once. That should leave about 17 more to map – or drop into the unknown bin.

Mapping the New 20 cM Clusters

Here are some new clusters:

All I have to do is to look at each of these clusters. My best hope is if they match another cluster or have a common ancestor or if I have some notes on them in Ancestry. Occasionally, looking at a tree or building out a tree will help, but I rarely have luck that way. One thing I find difficult is that my mom had German ancestry on both sides. One side came to the US before the American Revolution. Another side went to a German Colony in Latvia and then came to the US in the 1900’s. However, both sides were originally in Germany.

Clusters 28 and 29

Someone in Cluster 29 had a Nicholson ancestor, but it may be coincidence:

Sarah was from Bolton in the above tree. My mom’s Nicholsons were from Sheffield, but perhaps not always from there:

The Large Cluster 30

It looks like one of mom’s unknown ancestors had a lot of descendants.

And the Answer Is…

It seems like I’ve increased my knowledge of the unknown (poor pun intended). I still have a shortage on the Lentz side. I’m not sure if there were just not many offspring or what happened. I have at least one early common ancestor at Ancestry, but that match is not in a cluster. Another question would be to identify Cluster 10 at 30 cM. I feel like I have a better idea of the weaving of DNA and families between Gangnus and Schwechheimer in my mother’s ancestry.

Summary and Conclusions

  • The combination of the Shared Clustering program and MS Access helps to make a fairly quick analysis of AncestryDNA shared matches
  • My mom’s genealogy is a bit tough. I know she has some more Latvian matches out there but tracing their trees back is difficult in the period when Latvia’s records were in Russian.
  • I don’t have an explanation why my mother’s common ancestors of Lentz and Baker in Philadelphia were not part of clusters.





Walking My Clusters Backward and Forward Using MS Access

In my previous Blog, I came up with a way to show my Clusters and how they changed (or didn’t change) between Shared Cluster runs at different DNA match levels. The Shared Cluster Program is from Jonathan Brecher and the encouragement to walk my clusters back was from Jim Bartlett. As a result of my previous Blog, I thought that it might be easier to walk my clusters forward instead of back. Usually in genealogy, work proceeds from the recent to the more distant past and from the known to the unknown. However, after looking at the clusters, it seems that in some ways the older clusters are more specific.

Using MS Access to Walk My Clusters Forward

I used to have some knowledge of MS Access. I wonder if I can resurrect that. I want to take my 6 cM run from my previous Blog and compare it to my 25 cM run. I hope to map 50 clusters down to 27. So I am reducing the clusters by about two to one.

I’ll start with a new Access database:

This database had a blank table in it for me to add data to. I didn’t want that, so I got rid of it. I’ll go to External Data and import my 6 cM Shared Cluster Run. I feel like I should close out my open copy first.

I chose New Data Source and found my file. When I sorted by Date modified, my file rose to the top.

This leads me to an import wizard:

I probably won’t import all the information. I clicked the box saying I wanted my first row to be the Column Headings.

Actually, I see a better way to do this. I went back to my original file and copied columns A-L:

I put these into a new file, saved it and imported that into Access. I don’t need any of the 2500 or so columns after L. This brought me to the import wizard again. I pressed next:

Access wants to add a primary key or have me choose one. The Test ID is an ID, so I’ll use that. Otherwise, Access will assign the numbers 1 to about 2500 to identify the rows in my new table.
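The column trimming and the primary key choice above could also be scripted. This is only a sketch under some assumptions: it pretends the cluster sheet was saved as CSV, and the column names and rows are invented stand-ins. It keeps the first 12 columns (A-L) and checks that the ID column has no duplicates, which is what a primary key needs.

```python
import csv
import io

# Hypothetical stand-in for a cluster sheet saved as CSV: 12 leading
# columns (A-L) followed by a few columns of the wide match matrix.
header = [f"col{i}" for i in range(1, 13)] + ["m1", "m2", "m3"]
header[0] = "Test ID"
raw = io.StringIO()
writer = csv.writer(raw)
writer.writerow(header)
writer.writerow(["id-001"] + ["x"] * 11 + ["1", "0", "1"])
writer.writerow(["id-002"] + ["y"] * 11 + ["0", "1", "1"])
raw.seek(0)

KEEP = 12  # columns A through L


def trim_columns(source, keep=KEEP):
    """Read a CSV file object and cut every row down to the first `keep` columns."""
    return [row[:keep] for row in csv.reader(source)]


trimmed = trim_columns(raw)
ids = [row[0] for row in trimmed[1:]]
# A primary key has to be unique, so check the ID column before relying on it.
ids_are_unique = len(ids) == len(set(ids))

print(len(trimmed[0]), ids_are_unique)  # 12 True
```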

Next, I give my table a name:

This now shows up in my Access Database:

Next, I do the same for my 25 cM run as I want to compare the 6 cM run to the 25 cM run.

Querying Access

The next step is to use these two tables to see what the 6 cM Clusters look like at the 25 cM level.

I’ll choose Create > Query Design:

I’ll choose these two tables. That will put them onto the blue area, then I’ll connect them by the key ID:

I wasn’t sure what to choose from the tables above to put into the query below. I put in the name. This has to be the same in each table because the Test ID in each table corresponds to the same name. Most important is to map the 6 cM Cluster Number to the 25 cM Cluster Number. I have those columns plus the 6 cM Correlated Cluster Number, the 6 cM Common Ancestors and 6 cM notes. The notes should be the same for each table. I didn’t add the 25 cM Correlated Cluster Numbers. I can add those later if I find I need them.

Next, I choose view to get the results of the query:

I get 378 results. This makes sense as there were 378 results in the 25 cM Shared Cluster Table. For some reason, Access ordered the results by the number of the 25 cM Cluster Number. This isn’t so bad.
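The query above is an ordinary inner join on the Test ID. Here is a rough equivalent using Python’s built-in sqlite3 module as a stand-in for Access. The table names, column names and matches are hypothetical. Because every match in the 25 cM table also appears in the 6 cM table, the join returns exactly as many rows as the 25 cM table has, which is the same reason I got 378.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE run_6cm  (test_id TEXT PRIMARY KEY, name TEXT, cluster INTEGER);
    CREATE TABLE run_25cm (test_id TEXT PRIMARY KEY, name TEXT, cluster INTEGER);
""")
# Hypothetical matches; the 25 cM run holds a subset of the 6 cM run.
con.executemany("INSERT INTO run_6cm VALUES (?, ?, ?)", [
    ("t1", "Ann", 23), ("t2", "Bob", 23), ("t3", "Cal", 24),
    ("t4", "Dee", 16), ("t5", "Eve", 40),
])
con.executemany("INSERT INTO run_25cm VALUES (?, ?, ?)", [
    ("t1", "Ann", 1), ("t2", "Bob", 1), ("t3", "Cal", 1), ("t4", "Dee", 3),
])

# Only test IDs present in both tables survive an inner join.
joined = con.execute("""
    SELECT a.name, b.cluster AS cluster_25, a.cluster AS cluster_6
    FROM run_6cm AS a
    INNER JOIN run_25cm AS b ON a.test_id = b.test_id
    ORDER BY b.cluster, a.cluster, a.name
""").fetchall()

for row in joined:
    print(row)
print(len(joined), "rows")  # 4 rows
```

Adding an explicit ORDER BY also settles the sort order, rather than leaving it to whatever the database picks.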

Easy Results

Next, I’ll move the 25 cM Column in closer so I can compare the two sets of cluster numbers:

Let’s look at the 25 cM Clusters 1-3. Walking back, Cluster 1 went mostly to 23, but one went to 24. Cluster 2 went to 24 twice and 32 once. Cluster 3 only went to Cluster 16. My guess is that the mixup between Clusters 1 and 2 has to do with the fact that these Protestant Irish families intermarried.
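Reading these counts off the query by eye gets tedious, so here is a minimal Python sketch that tallies them. The 2 to 24 (twice) and 2 to 32 (once) counts follow what I saw above. The other counts are made up for illustration.

```python
from collections import Counter, defaultdict

# One (25 cM cluster, 6 cM cluster) pair per match.
pairs = [(1, 23), (1, 23), (1, 23), (1, 24),
         (2, 24), (2, 24), (2, 32),
         (3, 16), (3, 16)]


def tally(pairs):
    """Count, for each 25 cM cluster, how many members landed in each 6 cM cluster."""
    counts = defaultdict(Counter)
    for c25, c6 in pairs:
        counts[c25][c6] += 1
    return counts


counts = tally(pairs)
for c25 in sorted(counts):
    print(c25, counts[c25].most_common())
```

A cluster whose members all land in one place (like Cluster 3 here) is an easy mapping. A cluster that splits is worth a closer look.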

A Flaw in My Logic?

I’m just thinking that I need to do this exercise at least one step further back as I haven’t yet finished filling in my 25 cM Clusters. I’ll see what I can do with just the information I have sorted out from Access so far.

Without any sorting of my query, it appears that Cluster 40 comes out of nowhere. However, when I sort by the 6 cM clusters:

Here is a situation where Cluster 40 must be a compound cluster. Here is what I see:


The smaller square at the top left may be Lentz. The single dark red square in the bottom right is Nigel. I mentioned him in my last Blog. He represents an old Nicholson DNA match.

Nigel would not match the Lentz side which appears to be the top left side of the Cluster. There appear to be other sub-clusters within the Nicholson Cluster which may represent Ellis or Clayton highlighted above.

Because I’m in the area of my 25 cM Clusters that I didn’t finish last time, I want to do a new Access Query.

Comparing My 30 cM Clusters to My 25 cM Clusters in Access

Here is what my new query looks like:

I’ve connected the two tables by the Test ID. Then I compared the 25 cM Cluster Number to the 30 cM Cluster Number. Plus I added some more information that I thought would be helpful. Actually I had the Name column twice which was not needed.

This is my Nicholson section. This shows that the 30 cM Cluster 14 mapped to two clusters at 25 cM. I’m glad I did this, because Cluster 25 splits between Lentz and Rathfelder due to my mother being in that Cluster:

This brings up another issue. At 20 cM, my Rathfelder 2nd cousin match aligned with my maternal 1st cousin and her two daughters. This caused them to form a single Cluster that mapped from a previous two Clusters (25b and 11). It’s not a big deal, but I had to just enter 38 twice to account for it. To be consistent, I’ll call them 38a and 38b:


Here the green square represents 38a with my first and second Rathfelder cousins:

The matches to the lower right of the green box would be my more distant relatives with ancestry in Hirschenhof, Latvia.

Finding Cluster 22 at 25 cM

When I compare my 25 cM Cluster to my 30 cM clusters:

Cluster 22 at 25 cM has no precedent at the 30 cM level. That has to do with the match level of Cluster 22. Here is a comparison with its corresponding 6 cM Cluster:

All these matches to me were under 30 cM. So basically, Access makes it so you don’t have to keep switching back and forth between the different results.
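The reason a cluster can appear out of nowhere is just the cutoff. Here is a tiny Python sketch with invented cM values: members who share between 25 and 30 cM qualify for the 25 cM run but not for the 30 cM run.

```python
# Hypothetical shared-cM values for the members of one 25 cM cluster.
members = {"match A": 29.1, "match B": 27.4, "match C": 25.8}


def eligible(members, cutoff_cm):
    """Names of members whose shared cM meets the cutoff for a run."""
    return sorted(name for name, cm in members.items() if cm >= cutoff_cm)


at_25 = eligible(members, 25)
at_30 = eligible(members, 30)

print(at_25)  # ['match A', 'match B', 'match C']
print(at_30)  # []
```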

Solving the Puzzle: Filling in the 20 cM Clusters

The 20 cM and the 6 cM Clusters are the same. The Shared Cluster Program can look at how the matches between 6 and 20 cM are associated with Clusters, but Ancestry doesn’t report shared matches at less than 20 cM.

I have some filled in already, but need to get up to 50 clusters:

The first Cluster 19 mapping is an issue:

25 cM Cluster 19 maps to Clusters 6, 7, and 15 at the lower cutoff. Further, I can see at least three clusters within the new Cluster 7:

The two green squares represent points of interest for me. They appear to represent my Hartley English genealogy.

Backing Up a Step: 30 to 35 cM Clusters Compared with Access

Another way to use Access is by using an outer join:

The outer join is represented by the left-to-right arrow above. That says, show me all the cases where there is a value in the 30 cM cluster table that is equal to a value in the 35 cM cluster table. And it adds in all the remaining 30 cM clusters that aren’t included in the 35 cM Cluster Table. I think that that is what I want. I could have had the arrow go the other way, but would have gotten different results.

Just for fun, I’ll do it both ways. This is the way from above:

This shows that Cluster 2 at the 30 cM level had no corresponding cluster(s) at the 35 cM level. Cluster 4 only had one corresponding Cluster. And in that corresponding cluster, Cluster 4 had a correlated Cluster 7.

Here are the results with the arrow going the other way. Think of the previous results as walking the clusters back and this new one as walking the clusters forward:

Here Cluster 2 doesn’t even show up at the 30 cM level. They both give about the same information, but the walking forward comparison should provide more detailed information.
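The two arrow directions correspond to what SQL calls a LEFT JOIN, written both ways round. Here is a sketch using Python’s sqlite3 module as a stand-in for Access. The table names, test IDs and cluster numbers are hypothetical, chosen so that Cluster 2 at 30 cM has no counterpart at 35 cM.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE run_30cm (test_id TEXT PRIMARY KEY, cluster INTEGER);
    CREATE TABLE run_35cm (test_id TEXT PRIMARY KEY, cluster INTEGER);
""")
# Hypothetical data: t4 and t5 cluster at 30 cM but are absent from the 35 cM run.
con.executemany("INSERT INTO run_30cm VALUES (?, ?)",
                [("t1", 4), ("t2", 4), ("t3", 7), ("t4", 2), ("t5", 2)])
con.executemany("INSERT INTO run_35cm VALUES (?, ?)",
                [("t1", 3), ("t2", 3), ("t3", 6)])

# Walking back: keep every 30 cM row; the 35 cM column is NULL where nothing matches.
back = con.execute("""
    SELECT DISTINCT a.cluster, b.cluster
    FROM run_30cm AS a LEFT JOIN run_35cm AS b ON a.test_id = b.test_id
    ORDER BY a.cluster
""").fetchall()

# Walking forward: arrow the other way, keep every 35 cM row instead.
forward = con.execute("""
    SELECT DISTINCT b.cluster, a.cluster
    FROM run_35cm AS b LEFT JOIN run_30cm AS a ON a.test_id = b.test_id
    ORDER BY b.cluster
""").fetchall()

print(back)     # [(2, None), (4, 3), (7, 6)]
print(forward)  # [(3, 4), (6, 7)]
```

In the first result the 30 cM Cluster 2 shows up with nothing beside it. In the second it does not show up at all, just as in my two Access runs.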

Cluster 7 at 35 cM

I’d like to take another look at this as I had ignored this Cluster previously. Based on the above query, it turns out Cluster 7 is quite important. This is Cluster 7 at 35 cM:

The first two people I have as coming from Nantucket (my paternal grandfather’s side). The second two I had guessed as being from Ireland (my paternal grandmother’s side). I don’t think that they belong in the same Cluster, so I’ll treat this as two clusters.

Putting It All Together with Access

Here is a more complicated Access Query:


Above, I’m comparing the 6 cM table to the 25, 30 and 35 cM tables. I made the link between the 6 cM Table and the 25 cM Table a right-handed link, so I’ll see all the 6 cM (or 20 cM) Clusters. I then put the 35 cM Clusters first so they will be in the order of my Summary Chart:

At 35 cM I have 10 Clusters. Cluster 1 maps to Cluster 14 at 30 cM. Cluster 14 maps to Cluster 26 at 25 cM. Cluster 26 maps to Clusters 40 and 42 at 6 cM. However, note that other clusters are mapping to Cluster 40. That is because we have Lentz and Nicholson in Cluster 40 as well as those who only descend from Nicholson, if I understand it correctly.

Cluster 2 at 35 cM includes my mother. Her clusters go from 2 to 14 to 25 to 21. The correlation is one to one until we get down to the 20 cM level. At that point the cluster splits into three where Cluster 40 is a child of Cluster 1. That makes me think that Cluster 40 will be a compound Cluster.
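Here is a sketch of the four-table comparison using Python’s sqlite3 module in place of Access. The test IDs and table names are hypothetical. The 1 to 14 to 26 to 40/42 chain follows what I describe above, and the other rows are made up. Every 6 cM row is kept, and the higher cutoffs are filled in where the same test ID exists.

```python
import sqlite3

con = sqlite3.connect(":memory:")
for cutoff in (6, 25, 30, 35):
    con.execute(
        f"CREATE TABLE run_{cutoff} (test_id TEXT PRIMARY KEY, cluster INTEGER)")

con.executemany("INSERT INTO run_6 VALUES (?, ?)",
                [("t1", 40), ("t2", 42), ("t3", 40), ("t4", 35)])
con.executemany("INSERT INTO run_25 VALUES (?, ?)",
                [("t1", 26), ("t2", 26), ("t3", 27)])
con.executemany("INSERT INTO run_30 VALUES (?, ?)", [("t1", 14), ("t2", 14)])
con.executemany("INSERT INTO run_35 VALUES (?, ?)", [("t1", 1), ("t2", 1)])

# Start from the 6 cM table so that every 6 cM cluster is shown.
chain = con.execute("""
    SELECT DISTINCT r35.cluster, r30.cluster, r25.cluster, r6.cluster
    FROM run_6 AS r6
    LEFT JOIN run_25 AS r25 ON r25.test_id = r6.test_id
    LEFT JOIN run_30 AS r30 ON r30.test_id = r6.test_id
    LEFT JOIN run_35 AS r35 ON r35.test_id = r6.test_id
""").fetchall()

# Sort in Python with the 35 cM column first and empty cells last.
chain.sort(key=lambda row: tuple((v is None, v or 0) for v in row))

for row in chain:
    print(row)
```

In this toy data Cluster 40 at 6 cM is reached from both Cluster 26 and Cluster 27 at 25 cM, which is the kind of pattern that makes me suspect a compound cluster.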

I see at least three clusters for Cluster 40 at the 20 cM cutoff:

Another Query

This one is more like the manual comparison that I did previously:

This query says take everything in the 6 cM Table plus those things that match in the 25 cM Table. Then do that for each Table. That query appears to give me everything as it includes 2,452 rows:

Here is a stripped down version of this query:

Here I only include the Cluster Number columns:

This gives me 135 rows. This also points out that Cluster 6 and 10 at the 35 cM level both map to Cluster 7 at the 30 cM level. This makes sense when you look at Cluster 7:

This is a distinction that I had missed in my original mapping chart:

This corrected chart better reflects what Access is showing me:

Access then shows this:

The part of Cluster 7 that was from 10 goes to 18 and then to Cluster 5. This was not reflected in my previous summary chart:

Here is the correction:

I’ll also add in my Cluster 7 at the 35 cM level. I didn’t add it in previously as it had 2 people with Nantucket shared ancestors and 2 people with suspected Irish shared ancestors.

Mapping Cluster 4 at 30 cM

Here I have another situation like my Cluster 7 above:

This is another 4 person Cluster. And, like Cluster 7, the first two people seem to match on my Hartley side and the second two seem to match on my Frazer or Irish side. This is reflected in my Access query:

The split goes to Cluster 8 (Irish) and 14 (Hartley ancestors).

Mapping Cluster 14 at 30 cM

Here is the work I did previously in my Cluster Summary Chart:

I had split Cluster 14 in three. These are my major maternal lines. Because my mother was in this Cluster, she was related to both these sides. Lentz and Nicholson are well-tested, so I have some good separation there. This confusion is reflected in my Access query:

The Common Ancestor Column above shows three different sets of Common Ancestors. They represent the needed splits for Cluster 14. This is where the Access results come in handy. In Access, I have a 14 Cluster going to Cluster 26 which goes to 27. I had missed that in my previous analysis and only had 27 going to Cluster 40.

My Irish Cluster 15 at 30 cM

Here is what I had:

The Access query shows some additional subtleties:

This shows Cluster 2 at 25 cM going to Cluster 24 at 20 cM. However, that doesn’t mean my other Cluster 2 to 32 is wrong:

Two other people in Cluster 2 probably would have mapped back to Cluster 15 but the DNA match was not high enough to be in Cluster 15. Also note that Cluster 24 and 32 both have Common Ancestors:

The question is, if these three have the same common ancestors, then why are they in different clusters? The answer appears to be that one Cluster (24 or 32) would represent McMaster and the other Frazer due to the two common ancestors.

As a result, I split out these two clusters like this:

McMaster 1829 is Fanny in Cluster 2 above. I now have Clusters 24 and 32 as her parents. Consider it a theory.

The Good Enough Product

I have slimmed down my chart to show between 40 cM and 20 cM. I started with 5 clusters in the 40 cM Column which is enough to describe my four grandparents. In the 20 cM column, I didn’t feel a need to describe each of the 50 clusters. However, I was interested in describing each cluster in the 25 cM column and its corresponding cluster at 30 cM and 20 cM. The 25 cM clusters were at a good vantage point where I could check the clusters on either side using Access. The place where I may be more interested in detailed 20 cM clusters would be for my English Hartley side.

Summary and Conclusions

  • Jonathan Brecher’s Shared Cluster Program is good for sorting out your clusters
  • The use of MS Access makes it easier to see the nuances of how the clusters merge or separate between the different lower match thresholds
  • There is a question in my mind concerning the level of accuracy needed in this analysis. It’s good not to be so detail-oriented that you miss the big picture. The big picture for me is whether the cluster is in the right grandparent group for me.
  • This was my first shot at using Shared Clusters to sort out my Ancestry Clusters. I’d like to try using the Shared Cluster program on other DNA kits that I administer. Perhaps my mother’s results would be the next logical step.



Walking Back My Clusters: Part 2

Part 1 of Walking Back My Clusters was long and rambling. I learned a few things, looked at a few family trees and reached out to a few DNA matches at AncestryDNA.  While writing my previous Blog, I came up with a better way of presenting the results of walking back my clusters. I realize that this may sound obscure if you are not already into genetic genealogy and clusters, but hopefully the readers understand the basics of DNA and clustering.

The New Cluster Results Format

Here it is:

I have my four grandparents in four colors. The thought is that even if I am lost as to what a cluster represents, I should know under which grandparent the cluster belongs. At the top, I show the cM cutoff for the clusters. I have a small column for the cluster number that the program produces. This is a relative number and changes for each analysis. To the right of the cluster number, I have the name of the closest surname that cluster represents and the date that ancestor was born. If I don’t know this, I may give a geographical hint. As far as which ancestor to use, it is somewhat subjective. On the top row I have Hartley going to Pilling 1802 and Snell 1866. It may have made more sense to use an earlier Hartley instead of Pilling, but I suspected that one of the people in that particular cluster went back to Pilling. Under Lentz 1900, that went to the two parents who were Lentz 1866 and Nicholson 1865. This new representation, so far, keeps everything close together where I can keep track of where the clusters are going.

Cluster 13 on the 30 cM Limit

This is where I left off on the chart above. There are only three in this Cluster. I have a note on one of the matches that they have a Northern Ireland background. I’m going to peek forward to 25 cM to see if I get any more hints. This adds one more match and tree. This tree, in addition to the match with the largest tree, has Canadian ancestors. I’ll take a look at the largest tree in Cluster 13:

This particular match had Ontario ancestors. The parents had connections to Owen Sound, Ontario. That sounds familiar from one of my distant Frazer relatives. My hunch is that the connection is not on the McRae side as they are listed as being originally from Scotland and Presbyterian. I’m not aware of Presbyterian ancestry on my Frazer side.

Here is my best guess:

Jane or Jennie in the bottom right of the tree is from Inniskillen. I assume this to be a variant of Enniskillen, where a lot of my DNA match leads take me:

All that to make a guess at Cluster 13.

Cluster 16 on the 30 cM Limit

I’m a bit stuck on this one. I think it is on my maternal grandfather’s side. Here is what I have so far:

I know that at 20 cM, I have 50 Clusters, so I have a way to go.

25 cM Clusters

Here I have 27 clusters, but some may be compound clusters.

Clusters 1 and 2

These split out the previous Cluster 15 which I had assigned to Fanny McMaster born 1829. Let’s take a second look:

Whitney in Cluster 1 matches everyone in Cluster 2. That is because he is a closer relative to me than I am to others in that Cluster:

What is also confusing is that Margaret McMaster had two McMaster parents.  I had the previous Cluster 15 correct on Fanny McMaster. However, it would be easier for me to think of this now as having the old Cluster 15 on Margaret McMaster 1846. Then I could assign Fanny to BV and mt and James McMaster to Whitney. Here is how I’m related to Whitney:

I am Whitney’s third cousin once removed. Here is one case where I changed an earlier analysis based on a later one:

After going through some more clusters, I came up with this:

I mentioned that I had 27 clusters at 25 cM and 50 clusters at 20 cM, so I gave up doing this for now. I think I have an easier way to go about this which I will explore in my next Blog.

One interesting thing above is that the orange Rathfelder line jumps from 1921 to 1819 in the above cluster summary. My explanation is that there were not many DNA matches for that line at Ancestry. That line represents my mother’s father who was from Latvia. He jumped ship and came to the US in 1916. I have one Rathfelder 2nd cousin once removed who tested at Ancestry, but one person is not enough to form a cluster at that level.

Going All the Way to 6 cM with Shared Clusters

The creator of the Shared Cluster Program commented on my previous Blog and recommended I take the clusters down to 6 cM. Jonathan Brecher tells me I won’t get any more clusters but more matches associated with those clusters. At first I thought that I had to leave the “Lowest centimorgans in shared matches” as 20, but that gave me the same results as my last run using 20 for both values in that row. So now I have both values set to 6 cM:

This kicked up my spreadsheet from 912 rows to 2453 rows. I suspect that this is where the Shared Cluster Program really shines.

Filtering the 6 cM Results

Excel has a filter button. I would like to filter my results on Common Ancestors:

When I choose Filter, an arrow appears in each column’s heading. I click on the arrow under “Common Ancestors” and unclick the ‘Blanks’ option which is at the bottom of the list:

That will give me each row that has a common ancestor:

I couldn’t get all my results in one screen shot, so the top is cut off. Cluster 7 appears to have many of my 2nd cousins, so it shows other more distant clusters that they are related to. The 13 is highlighted in the Correlated Clusters column because it gives a clue to Cluster 13 with common ancestors Snell and Luther that I didn’t have before for Cluster 13. The same is true for associated Cluster 19 with common ancestors of James McMaster and Fanny McMaster. If I add up all the clusters plus associated clusters that have Common Ancestors, that adds up to about 20. Those will be good clues to identifying my 50 clusters.
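The same filter can be sketched in Python for anyone who has the rows outside of Excel. The “Common Ancestors” heading is the one from the spreadsheet. The other column names and the rows themselves are hypothetical.

```python
# Hypothetical rows standing in for the cluster spreadsheet.
rows = [
    {"Name": "match 1", "Cluster": 7,  "Common Ancestors": "Snell, Luther"},
    {"Name": "match 2", "Cluster": 7,  "Common Ancestors": ""},
    {"Name": "match 3", "Cluster": 19,
     "Common Ancestors": "James McMaster, Fanny McMaster"},
    {"Name": "match 4", "Cluster": 40, "Common Ancestors": "   "},
]


def with_common_ancestors(rows, column="Common Ancestors"):
    """Keep only rows whose common-ancestor cell is not blank."""
    return [row for row in rows if row.get(column, "").strip()]


kept = with_common_ancestors(rows)
clusters_with_hints = sorted({row["Cluster"] for row in kept})

print([row["Name"] for row in kept])  # ['match 1', 'match 3']
print(clusters_with_hints)            # [7, 19]
```

Stripping whitespace before testing means a cell holding only spaces is treated as blank too.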

I highlighted Nigel because he is an interesting case. He has a fairly high DNA match with me. He’s my 5th cousin, once removed:

I don’t recall Nigel being in a cluster before due to the distance of his relationship to me. So it is good to see him in Cluster 40 now.

On to the Next Blog

Part of the difficulty of comparing these Clusters is cross-checking between, say, a 25 cM analysis and a 20 cM analysis. For example, Charlie was in Cluster 35 at 20 cM. What Cluster was he in at 25 cM? I hope to figure out a way to make that a little easier in my next Blog using MS Access. There may be other ways. It also makes sense to me to walk the Clusters forward instead of back. That is because the older clusters have more people in them. As noted above, they also have about 20 identified Common Ancestors.



Walking My Clusters Back – Jim Bartlett Method

I recently read two interesting articles by Jim Bartlett on the use of Shared Clustering. Jim’s most recent article discussed walking the clusters back. Shared Clustering is a free program developed by Jonathan Brecher.

Shared Clustering

Last night while the New England Patriots were playing football, I downloaded Jonathan’s program and used that program to download my AncestryDNA matches and Shared Matches.


I used the first two radio buttons above. The first button downloads your matches up to the fourth cousin level. That is a match of 20 cM or more. As I recall, this was about 978 matches. I may be off, because I just checked AncestryDNA and I have 908 matches of 4th cousin or closer. The second button gets your matches and Shared Matches down to a level of 6 cM. It took overnight to get all these downloaded. However, once I have those, I don’t have to connect to AncestryDNA again – unless I need an update. The download is in the form of a text file and not overly useful in that form. It is sort of a dump of my AncestryDNA match data.


Next, I chose the recommended button for clustering under the cluster tab:

This outputs to an Excel spreadsheet file. If I shrink my spreadsheet to the minimum 10%, I can see half of the clusters:

This gets me to about Cluster 18 out of 50 clusters. So, though this is theoretically my 4th cousins, it must go out further than that. 4th cousins would represent my 3rd great-grandparents. I have 32 3rd great-grandparents and 50 clusters. 18 or more of those clusters must go beyond the level of the 3rd great-grandparents.
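The arithmetic behind that comparison is just doubling: 4 grandparents, 8 great-grandparents, 16 2nd great-grandparents and 32 3rd great-grandparents. Here is a small sketch of the count, assuming no pedigree collapse (which is not quite true in my tree, where some families intermarried).

```python
def ancestors_at(generations_back):
    """Number of ancestors a given number of generations back (parents = 1)."""
    return 2 ** generations_back


# Grandparents are 2 generations back; 3rd great-grandparents are 5 back.
third_great_grandparents = ancestors_at(5)
clusters = 50

# Clusters left over if every 3rd great-grandparent had exactly one cluster.
extra_clusters = clusters - third_great_grandparents
print(third_great_grandparents, extra_clusters)
```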

Here is the bottom half of most of my clusters down to Cluster 50 in the lower right of the screen:

Walking My Clusters Back

Jim Bartlett recommends walking back your clusters from your 4 grandparents further back a generation at a time. My first Blog on clustering was about a year ago using the Auto Cluster program. Here was my first Auto Cluster:

In this simple analysis, I had 5 clusters. However, as far as I could tell, none of these represented my maternal grandfather:

  1. Paternal grandfather – orange
  2. Paternal grandmother – green, purple and brown
  3. Maternal grandmother – red

My maternal grandfather was a German from Latvia who came to this country in the early 20th Century. So, not many relatives had tested. Not really a problem, but something to be aware of.

Shared Clustering 90 cM or Greater

Next, I tried the Shared Cluster 90 cM or Greater. It looks like this should give me 3rd cousins or greater. Somewhat surprisingly, this only gave me two clusters:

A few notes:

  • The Shared Cluster program does not appear to have an upper limit for matching. Because of that my immediate family is included. They show up as a horizontal bar in the middle of the image.
  • The first two people are in a cluster of sorts, but Shared Cluster only includes clusters of three or more by default. They fit in on my paternal grandmother’s side.
  • The third person (the first person in Cluster 1) is actually on my maternal grandfather’s side. This was a new person who tested since last year. She is in Cluster 1 because she matches with my mother, my maternal first cousin and her two daughters.
  • Cluster 2 is all my paternal side. The matches go back further than that but the Cluster is holding together due to my close family being included in the Cluster.
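To illustrate why close family can hold a cluster together, here is a simplified sketch. It is not how the Shared Clustering program works internally; it just groups people into connected components of a shared-match graph, with and without a close relative. All names and match pairs are hypothetical.

```python
from collections import defaultdict


def components(shared_matches, exclude=()):
    """Group people into connected components of the shared-match graph.

    Pairs involving anyone in `exclude` are ignored.
    """
    graph = defaultdict(set)
    for a, b in shared_matches:
        if a in exclude or b in exclude:
            continue
        graph[a].add(b)
        graph[b].add(a)

    seen, groups = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            person = stack.pop()
            if person in group:
                continue
            group.add(person)
            stack.extend(graph[person] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


# Hypothetical pairs: "Sibling" matches both lines and glues them together.
pairs = [("Sibling", "Hartley 1"), ("Sibling", "Frazer 1"),
         ("Hartley 1", "Hartley 2"), ("Frazer 1", "Frazer 2")]

with_family = components(pairs)
without_family = components(pairs, exclude={"Sibling"})
print(with_family)
print(without_family)
```

With the sibling included there is one group of five; with the sibling left out, the Hartley and Frazer matches fall into two separate groups.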

Tweaking the Shared Cluster Program

Under advanced options on the Cluster Tab, I don’t see any option for screening out close relatives:

So I’ll try to ratchet down the lowest centimorgans to cluster to try to break open these clusters. I’ll try 50 cM for the lowest:

Above, I picked up one more Cluster. Cluster 1 is now my paternal grandmother’s cluster. This was the one that wasn’t a cluster previously, but I picked up one more person to make it a cluster:

  1. Paternal grandmother
  2. Maternal
  3. Paternal grandfather

The first person in the previous Cluster 1, Donna, is now the last person in the new equivalent Cluster 2. So far, I have not split a cluster but added to a previous non-cluster. This is fun to play with.

I Need to Get to About 8 Clusters Next

Trying 40 cM still resulted in 3 Clusters, so I’ll try 30 cM. I know that the three represent four grandparents as they are, but I only have one person tested at Ancestry for my maternal grandfather’s side. I know that at 20 cM, I have 50 clusters, so I need a match number that will get me about eight clusters. I think I see an issue. On the advanced tab, there is a maximum shared match number. When I ran 50 cM, I had a maximum shared match of 90 cM. I need to change that to 50 cM:

This flipped the clusters around:

  1. Paternal grandfather
  2. Maternal
  3. Paternal grandmother – now up to a cluster of 6 people who match me and each other by DNA

I think I’m getting the hang of this.

A 40 cM Cluster Gives Me 6 Clusters

This may be about what I want. Again, I set my shared match limit to 40 cM:

There are two-person clusters where I have the arrows. There is also a one-person cluster at the lower right of the image above. The Clusters are:

  1. Lentz
  2. Nicholson – the first two clusters look like one. I believe that that is because Cluster 1 is Lentz/Nicholson and Cluster 2 is Nicholson without the Lentz.
  3. McMaster/Frazer (Ireland) – These families intermarried more than once in my ancestry
  4. Unidentified, but believed to be Spratt (Ireland)
  5. Most likely Clarke (Ireland)
  6. Hartley – Paternal grandfather, but not further split out

Here are those Clusters on my family tree:

  • I know least about the Clarke line, yet this seems split out to the two parents of Clarke and Spratt
  • Cluster 6 is stuck probably because Hartley and Snell had 13 children and I have a lot of 2nd cousin matches at AncestryDNA
  • Cluster 2 appears to be split between three great-grandparents on my maternal side. I’m not sure why. I have some other Rathfelder cousins, but they tested at MyHeritage and FTDNA.

Some Walk Back Analysis

This shows what happened between matches of 50 to 40 cM when my clusters went from three to six.

  • My mother’s Rathfelder Cluster split into her maternal grandparents of Lentz and Nicholson
  • My paternal grandfather’s Cluster got stuck and was not further divided
  • My paternal grandmother’s Cluster seemed to skip a generation and form two clusters further out.

As my Clarke and Spratt Lines are brick walls, I would like to look at them. I am quite sure of Cluster 5. My common ancestors with two of the people in this Cluster are Thomas Clarke and Jane Spratt. That being the case, I could have put the Cluster 5 up a generation at Ancestor #11.

The four matches in Cluster 4 are all just above 40 cM, so they didn’t appear in the 50 cM analysis.
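A quick way to see who enters the picture when the threshold drops is to list the matches that pass the new minimum but not the old one. This is only a sketch; the names and cM values are placeholders, not my actual Cluster 4 matches.

```python
def newly_included(matches, old_min_cm, new_min_cm):
    """Names of matches that pass new_min_cm but did not pass old_min_cm."""
    return sorted(name for name, cm in matches.items()
                  if new_min_cm <= cm < old_min_cm)


# Hypothetical shared cM values.
matches_cm = {"Match A": 41.2, "Match B": 43.0, "Match C": 55.5, "Match D": 38.9}

new_at_40 = newly_included(matches_cm, old_min_cm=50, new_min_cm=40)
print(new_at_40)
```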

Here are Clusters 4 and 5. There are a few connections between these two Clusters. I interpreted that to mean that Cluster 4 is the ancestor of Cluster 5. Here is my modified summary:

A 35 cM Threshold Results in 10 Clusters

It’s a free program, so I can play around with it:

10 is still pretty close to 8, so let’s see what we have for Clusters:

  1. Nicholson
  2. Lentz
  3. Frazer
  4. Clarke/Spratt
  5. Snell or Colonial MA?
  6. Snell/Bradford – this was a larger cluster in my previous run
  7. Parker Nantucket?
  8. McMaster Ireland?
  9. Hartley English?
  10. Snell or Colonial MA?

I’m not sure that this is any clearer than the previous Cluster of 6. Some of my matches that were previously in clusters fell out in this analysis.

35 cM Cluster Analysis

For the ten 35 cM Clusters, it would be nice if I were able to trace where they came from. I had a question on Cluster 5. However, it is still as good as it can be right now. There are only three in this cluster. They have no usable trees and they are shown matching Hartleys in my large 2nd cousin Cluster.

On Cluster #7, I don’t agree with the way the program drew up the Cluster, so I would rather ignore that Cluster. Half of the Cluster seems to match Cluster 6 (Massachusetts Colonial) and half seems to match Cluster 8 (Irish ancestors). Cluster 9 is difficult as there are only three in the Cluster. One tree has English ancestors, but not all are English.

A 30 cM Match Limit Gives Me 16 Clusters

So by accident, I have come upon 16 clusters. In a perfect World, this would represent my 16 2nd great-grandparents. I have already shown that theoretical perfect numbers are not showing up in my case, so I don’t see a lot of purpose in getting a perfect 4, 8 and 16 clusters.

Here I have pointed out my maternal side. They only match with the first two Clusters. That means that the following 14 Clusters appear to be paternal.  The largest Cluster is #6. That is the one with a lot of my second cousins.

Here are my guesses for these 16 Clusters:

I had this previously as possibly Hartley English due to someone with a Heaton in their ancestry. Heaton is a name that was in the area where my Hartley ancestors came from. I had that one of my Hartley ancestors possibly married a Heaton. However, I had this wife dying before they had children. Based on others in the group, I would go back to saying that this is probably a Colonial Massachusetts Cluster.

Cluster 2

I would be interested in knowing about Cluster 2. One of the matches in this Cluster was part of a New Ancestor Discovery at Ancestry that I never figured out. One match has a tree, so I could try building that out. My guess is that this Cluster is along the lines of my Irish ancestors.

I don’t have a lot of hope in figuring out this line, but I’ll give it a shot:

John McLean goes back to Ireland, so that is where I was trying to get. Going out further, I get this:

The trees are going back to Scotland on many lines. I tend to put some of these lines on the Clarke/Spratt as I don’t know much about those lines except that they were from Ireland.

Back to the guesses:

  1. Snell and before Massachusetts Colonial
  2. Clarke or Spratt Ireland
  3. English Hartley ancestors?
  4. One match correlates to Cluster 7 (Hartley 2nd cousins) but one match maps to Frazer by Visual Phasing, so say Frazer side
  5. Possibly Spratt
  6. Hartley side by shared matches
  7. Snell/Bradford based on one match with common ancestor
  8. Isaac Parker/Prudence Hatch (1778)
  9. Correlated with Cluster 11;

A Cluster 9 Tree

One of the Cluster 9 matches has a tree:

I have come up with many of these names before, but the name of Reed sounds familiar. Here is the detail on Alexander Reed:

Here is Hastings:

Here is the Reid I have:

Apparently William Wynn Fraser marries a Rachel Reid. My guess was that Reid was her married name. However, this family lived in Kenilsworth, Ontario:

I’m not sure if the Reid and Reed families are the same or whether there is any connection with my family. A search for Alexander Reid/Reed shows that there were many by that name living in Ontario.

Cluster 14

I joined the Shared Cluster Facebook Group. It looks like this Cluster is actually more than one Cluster.

Because my Mom, her niece and two grand-nieces are in this Cluster, it formed a super Cluster. I’ll call them 14a, 14b and 14c.

  • 14a Nicholson
  • 14b Rathfelder
  • 14c Lentz

Rather than look at each Cluster in detail, here is a summary:

I skipped a few Clusters. This exercise reinforces my thought that getting the exact 16 clusters for 16 2nd great-grandparents is not important. I had 16 Clusters but only 2 were maternal. That means that 14 were paternal, far in excess of the 8 paternal 2nd great-grandparents expected. Cluster 16 was maternal and most likely my maternal grandfather’s side. I haven’t placed this group yet. They seem to go back to a German Colony in Russia which was a long way from my grandfather’s family’s German Colony in Latvia. There was some connection between the two colonies, but I haven’t made the connection genealogically with my family.

25 cM Cutoff – 27 Clusters

This is 5 cM above the cutoff that Ancestry uses for 4th cousin, which is equivalent to a 3rd great-grandparent common ancestor. I expect that a 25 cM cutoff should be equivalent to 4th cousin.

Here is the general look of the clusters:

I am in a vertical and horizontal group that splits the chart about equally in two. My mother and her close relatives form a lop-sided plus sign in the lower right side of the chart.

Clusters 1 and 2

These two clusters hold a lot of potential. These were previously Cluster 15 and I had assigned them to my ancestor Fanny McMaster. Now that Cluster 15 has broken into two, it appears that each cluster could represent one of Fanny’s parents, who were William McMaster and Margaret Frazer. I have recently learned a lot about this family through researching their move to Ontario from Ireland. Two of the people in the new Cluster 2 share my common ancestors William McMaster and Margaret Frazer. If I could identify Cluster 1, it should help to identify Cluster 2. I know that one of the matches in Cluster 1 has an unidentified Jane Frazer or Frazier in her tree. That means that Cluster 1 could be Frazer and Cluster 2 McMaster. This is important as I have at least three Frazers in my ancestry and at least two McMasters.

To accommodate this, I have lengthened my ancestor chart down to the 4th great-grandparent level:

This would be a theory to follow up on based on the fact that a match in Cluster 1 has a Frazer ancestor but no known McMaster ancestry.

Cluster 3

There are only three people in Cluster 3. Based on correspondence from someone with a private tree, our common ancestors are Simon Hathaway born 1711 and Hannah Clifton. That is two generations back from the extension I made on my cluster summary chart, so I’ll just add Cluster three to my Hathaway 4th great-grandparent.

Cluster 4

Cluster 4 brings into question my previous Parker Cluster. I had a match with at least one person in this cluster with a common ancestor going back to our shared Parker ancestor in Nantucket. However, now there are two others in this cluster. One has an ancestor in County Roscommon where I had ancestors. Another person is from Australia. Now my match with the Parker ancestor also has an Irish ancestor. Perhaps this is the real match I should be looking at?

Cluster 5 – Spratt

In my 30 cM analysis Cluster 5 was also Spratt coincidentally. However, this new Cluster 5 goes back another generation and has split off the Clarke from the Spratt:

The new Cluster 5 at the 25 cM threshold has moved from my 2nd great-grandparent level (Jane Spratt) to my 3rd great-grandparent level. This is important as Spratt is my most severe brick wall.

Triangulating Spratt Trees in Cluster 5

My thought is that if I can find common ancestors in some of the trees represented by Cluster 5, I may find my common ancestors. First in order to not duplicate effort, I checked to see if I had an existing Spratt Tree. I did:

Unfortunately, I don’t remember who Ed, Deb and Helena are. I do note with interest a George Spratt who married a Jane McGuire. Could they be the parents of my Jane Spratt thought to be born about 1830? William and Christopher are also potential candidates.

My first match in Cluster 5 is Craig. I’ll add him to the tree:

Craig matches me with a healthy 33.9 cM of DNA. One question would be whether Christopher was married previous to marrying Margaret McKay.

Next in Cluster 5 is Deb. She is already on my chart and matches me with 34.1 cM of DNA. The last person in my Cluster 5 with a tree is Helena who again is already on my tree. She matches me at 25.2 cM.

This leads me to two theories:

  • I descend from Christopher Spratt and a first wife, or:
  • I descend from William Spratt born 1775 and then from one of his sons

I now see Ed and match him at 44.8 cM.

Here is another Cluster 5 Tree:

I’ll call this person Shar. She must be on the Margery Spratt Line:

The tree is now shaping up with DNA matches. Shar’s tree ended with Jane, but I assumed it was the same Jane Hayes that was in Helena’s tree. The good news is that I have the start of a good Spratt DNA project. The bad news is, I’m not much closer to knowing where Jane came from. It’s interesting how clearly this Cluster points to this genealogy, yet I don’t have the specifics. I’m slowly getting closer to the answer.

Clusters 6-9 – Irish, But Which Families?

I’ll start with Cluster 9 as Gladys is in that Cluster. I manage her DNA:

From what I can tell, James at the top married his cousin Violet Frazer. I could safely assign this Cluster to George W Frazer as Gladys has no known McMaster ancestry. I would like to go back at least another generation, but at this time, I can’t match up the genealogy of my other matches in this Cluster.

I don’t have a good guess for the other clusters other than possibly on the Clarke side.

Cluster 11 – Schwechheimer

Through hard work and diligence, I came up with a common ancestor for one of my three matches in Cluster 11:

However, this gets confusing. Rosine Schwechheimer, my ancestor, married a Gangnus. Also, Rosine’s mother was a Gangnus. Technically, the common ancestor would be further out, but it is safe to say that the line on my side went through Rosine Schwechheimer.

Cluster 13 – Clarke

I know that I have a Clarke/Spratt common ancestor with two matches in this Cluster. I see another match with a person in this cluster but Patricia has a private tree. She has uploaded to Gedmatch:

Cluster 14 – Snell?

There are only three in this Cluster. One match has a tree that goes to Hannah Snell. She is probably the granddaughter of my ancestor Samuel Snell born 1708. I’ll stick this Cluster with a later Snell ancestor because I don’t want to extend my list too far:

This Anthony is Samuel’s grandson, so technically, I should have gone back another generation.

Cluster 15 Hartley English Side

This is a side I am interested in if it is Hartley English. There are three in the Cluster. I have looked at one tree with no luck. Perhaps looking at a second tree will help. The matchup seemed like it should be on Mark’s maternal side:

Here is the tree from the other person in Cluster 15:

Cluster 16 has only three also. The one person in Cluster 15 without a tree had a connection to Cluster 16.

Clusters 17 and 18

Cluster 17 is picking up in size which may mean my Snell side which has the Massachusetts background. I can’t find many good trees in this Cluster. Cluster 18 is large. Despite the size, I couldn’t find common ancestors and Ancestry didn’t suggest any.

Cluster 19

This is the Cluster I am in as well as my siblings, close relatives and second cousins. Two matches in the group have the common ancestors Snell and Bradford. One match has Greenwood Hartley and Ann Emmet. That means that this Cluster should be two Clusters.

These show in the same Cluster due to all my close relatives in this Cluster. I would split Cluster 19 like this:

The grey horizontally highlighted row is the Greenwood Hartley match. This is an important distinction for me as one side represents my English Hartley side and the other side represents my Colonial Massachusetts Snell side.

Clusters 20-27

  • 20 – probably MA Colonial
  • 21 – probably Irish
  • 22 – probably maternal grandfather
  • 23 – maternal grandfather. Some match my maternal cousin but not my mother, so that seems odd.
  • 24 – more maternal grandfather
  • 25 – This is a compound cluster. 25a is Lentz. 25b is Rathfelder. This was previously 14a, b, and c, so the Nicholson cluster broke off into Cluster 26 below
  • 26 – Nicholson
  • 27 – probably Irish

Summary of the 25 cM Clusters

Some of the splitting out of known clusters is interesting, as it suggests descent from a specific older ancestor. This was the case with my ancestor Fanny McMaster, where I was able to split out matches between her parents William McMaster and Margaret Frazer. Where I didn’t know the previous cluster, it just split out into other clusters that I didn’t know.

The Parker Cluster was confusing. I had a common ancestor for two of the matches, but two other matches seemed to indicate that they didn’t have the same common matches. This could be the case where they match each other on a different line.

When I put the clusters into my summary chart, I am putting them in vertically. However, it is important to check horizontally also to make sure the clusters are being picked up. I also looked into some genealogy. I filled out a shared DNA Spratt tree. I don’t know where I fit in this tree, but I am all the more certain that I do fit into this particular tree, so that narrows down where I should be looking for genealogical clues.

It seems I need a better way of presenting the results of the clusters. Right now the results are very spread out due to the increasing numbers of ancestors. It would be possible to collapse these results to include only the ancestors with clusters, but that would omit all the ancestors that I don’t have clusters for.

20 cM – 50 Clusters

At the risk of making this a marathon Blog, I’ll look at my 50 Clusters down to 20 cM. This is the matching limit for AncestryDNA. Apparently this program can take the level lower, but the shared matching limit will still be at 20 cM. I expect some more of the same of what I found out above.

I see a problem already with Cluster 1. All the levels are below 25 cM. That makes it difficult to place this Cluster. One person in the Cluster has a tree of 5:

It may be possible to build this out, but it would be a low priority for me to do this right now. I don’t see this person on my mother’s match list, so I suspect this is a paternal match.

Cluster 2 has only four in it. Two are between 25 and 30 cM, but they did not form a Cluster under my 25 cM analysis.

Cluster 3 matches are all under 25 cM, but match my mother.

Clusters 6 and 7

The program split 6 and 7 strangely. Two of my sisters are in #6 and one in #7. My son is in Cluster 6 and my daughter in Cluster 7. What is more important is the splitting of Cluster 7:

This splitting is important to me as I am trying to find English Hartley ancestors who don’t have Snell ancestry. The larger part of Cluster 7 has Snell ancestry (outlined in green).

More Detail on Cluster 7b

There are 8 people in Cluster 7b. It also looks like 7b forms two clusters. My guess is that this represents Hartley and Emmet:

The first match in the Cluster is Kristen. I think we have been in touch, but I can’t find any Ancestry messages. Here is the connection:

The second on the list is Mark. I’ve been building out the part of his tree where I think there is a possibility we might match up. That is his maternal grandfather’s side:

Lucy Priestly died in Hull, but was born in Halifax which is a bit closer to where my ancestors lived.

Lucy’s mother Sarah Ann Wilson was the one born in the Halifax area. Here is Sarah Ann’s baptismal record from 1825:

My guess is that her mother could have been Susannah? Her father was a bookbinder. I didn’t make a genealogical connection between myself and Mark yet, but I will likely come back to his tree.

The next match is Arlene. She doesn’t have a tree, but I sent her a message.

The next match with Howard appears to be important:

Even though Howard doesn’t have a tree, it appears that he may descend from my Pilling ancestor:

I guess I hadn’t realized that two separate Wilkinson lines descended from Pilling. At any rate, my guess is that Howard descends from one of these two lines. I believe that on the right, next to Richard should be a Paul also. I don’t match Paul but some of my relatives do. As far as I know, the David Watson above isn’t closely related to William Wilkinson.

Another question I have for the above Cluster is whether Bessey should be included in the Cluster. I would guess not, because I have that Bessey’s ancestors are Snell and Bradford. Also Bessey is linked to Clusters 12 and 15.

A further point to consider is that Arlene and Howard appear to be in both sub-clusters above. Assuming that Howard is a Pilling match, that may mean that both sub-clusters are Pilling clusters. That could mean that one sub-cluster is more for Mary Pilling’s mother and the other for Mary’s father. However, that is just a guess. Mary’s parents were Greenwood Pilling and Nancy Shackleton:

Dave, Bruce, Mark and Michael

Dave and Michael have trees. I’ve been working on these trees, but haven’t found the connection yet. However, I see connections in the Greenwood surname. I haven’t found a Greenwood surname in my ancestry, but it may be there. Mary Pilling’s father was Greenwood Pilling. Mary’s son was Greenwood Pilling. Many of these genealogies seem to have West Riding connections but not to bordering Lancashire where my ancestors lived.

Summary and Conclusions

This could be a good place to stop. I want to continue this Blog as I have come up with a better way to present my results.

  • Walking the clusters back is a good way to look at your clusters.
  • This is a way of organizing your cluster, making sure you have contacted the important matches and making sure the clusters are placed in the right area of your genealogy.
  • I started my clusters with a 50 cM limit. From there I went to a 40 cM limit and went down by 5 cM increments until I got to 20 cM.
  • The clusters did a good job at identifying my most recent brick wall, Jane Spratt born about 1830 in Ireland. From there I was able to place Jane in the correct Spratt tree, though I could not tell for sure which branch she was from. This could further direct genealogical research.
  • I tried to connect other genealogies from other clusters with limited success.
  • I came to the realization through this analysis that I have DNA matches with two separate Wilkinson lines descending from my ancestor Mary Pilling.
  • As I walked these clusters back, some split cleanly into two parental clusters, some didn’t. Some unknown clusters split into further unknown clusters as might be expected.

To be continued….

More Sibling Clusters at MyHeritage

So far, I have looked at AutoClusters at MyHeritage (MH) for myself, my mom and two siblings. I have been a bit surprised at how different the clusters look. In addition, Genetic Affairs (GA) has used different parameters and gotten different results. At the end of this Blog, I will have looked at my mother’s results and her six children’s clusters.

My Brother Jon’s Clusters

Jon has 7 clusters. They look a lot like my sister Heidi’s 7 clusters:

Emily In Jon and Heidi’s Clusters

Emily appears in Jon’s Theory of Family Relativity like this:

Here is Emily in Jon’s red Cluster 1:

Note that she also matches Clusters 3 and 7. In fact, Emily matches every person in Cluster 7, including another 2nd cousin once removed who descends from the same common ancestors as shown above (Frazer and McMaster). That raises the question as to why was Emily not in Cluster 7?

Here is Emily in Heidi’s clusters:

Here Emily is on the last row in Cluster 7. She matches many people in what seems to be a red over-match Cluster 1. Emily matches orange Cluster 2 with people that have McMaster ancestors. Emily also matches Paul who is in yellow Cluster 3. I note the following:

  • Emily matches Heidi by Frazer and McMaster
  • Cluster 2 has people in it that match on the McMaster line but not the Frazer Line.
  • Cluster 3 has Paul, but as he is not in the orange McMaster Cluster 2, the yellow Cluster 3 may be a Frazer Cluster.

Jon’s Cluster Inputs and Outputs Compared

I gave Heidi and Jon the same highlight color as their results were so similar.

Let’s ID Jon’s Clusters

I have already started to do this thanks to Emily and others. After doing a few of these, I can pretty much look at the people in the clusters and ID them:

For some reason, Jon had a better mix than Heidi. Jon has all four of his grandparents represented, where Heidi only had two grandparents’ DNA represented.


I’m not so concerned about Cluster 1. Although it is large in size, in a way it is not as important due to the over-matching. In fact, some of my most important matches are not in clusters at all.

Moving On to My Sister Lori’s Clusters

Lori has a good number of Clusters:

She has what appears to be a Chromosome 20 super-over-match Cluster 1.

Here are some of the AutoCluster input/output numbers for Lori:

Lori’s MH Clusters are similar to Heidi’s and Jon’s.

Lori’s Chromosome 20 Super-Cluster

Here is the Chromosome mapping for my family on Chromosome 20:

This shows that Jim and Sharon (who I haven’t looked at yet) don’t have Frazer DNA in the area of the over-matching. Jim doesn’t have a super-cluster and I expect Sharon will not either. However, I also didn’t have a super-cluster. I did have a large Chromosome 20 cluster shown below that seems to be split in two:

Perhaps looking at the different sides of this ‘super-cluster’ will help explain what it is all about. That will be a future project.

Lori’s 11 Clusters Revealed

By looking at Lori’s matches’ names, I can get this far:

After that, I will have to look at cluster matches to see if they match my mother or not. Then I can check Lori’s chromosome mapping to get the right grandparent.

Lori’s Cluster 3 Example

Lori’s Cluster 3 matches have very German-sounding names. That makes me suspicious as I have no German on my paternal side – only on my maternal side. I pick a match with a good-sized largest match of 48.3 cM:

My guess is Rathfelder as that side is all German – though they lived in Latvia.

I was right:

Lori is the most likely of the six siblings to have a good Rathfelder side match in this location of Chromosome 3. Actually, I should have the same match as Lori. So she is most likely after me.

And the Answer Is…

Here is another bit of a surprise in that Lori has no clusters with Lentz grandmother DNA. Lori also has more Rathfelder than Frazer clusters, which is unusual.

The Last Sibling, Sharon’s Clusters

First, I’ll look at the input/output for Sharon’s Clusters:

I had mentioned previously the effect that Chromosome 20 had on these matches as that was where the super-clusters were for all but Jim and Sharon. Here, we see that Jim and Sharon should have similar results as predicted above.

ID’s for Sharon’s 22 Clusters

Out of Sharon’s 22 clusters, these are the ones that had match names that I recognized:

Parental Phasing and Chromosome Mapping

For the rest of Sharon’s clusters, I’ll see if the matches are on my mother’s side or not and where the matches show on Sharon’s Chromosome map. I’ll start with Bobbijo who matches Sharon from Cluster 1:

Here is Sharon’s Chromosome 10 Map:

This doesn’t line up perfectly, but it is mostly over Sharon’s Hartley DNA. The match is from position 32 to 61M on Chromosome 10. There are a lot of crossovers in the area between 57 and 61M on Sharon’s Chromosome Map. Bottom line is that Cluster 1 is Hartley.
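The "mostly over Hartley" judgment can be made a bit more concrete by adding up how much of the match falls on each mapped grandparent segment. The sketch below uses the 32 to 61M match range from above, but the map segments themselves are hypothetical stand-ins for Sharon's actual Chromosome 10 map.

```python
def overlap_by_label(match_start, match_end, mapped_segments):
    """Sum the overlap between a match and each labelled map segment.

    Positions and results are in the same units (here, Mb).
    """
    totals = {}
    for start, end, label in mapped_segments:
        shared = min(match_end, end) - max(match_start, start)
        if shared > 0:
            totals[label] = totals.get(label, 0) + shared
    return totals


# Hypothetical paternal map for Chromosome 10 with crossovers between 57 and 61.
paternal_map = [(0, 57, "Hartley"), (57, 59, "Frazer"),
                (59, 61, "Hartley"), (61, 135, "Frazer")]

totals = overlap_by_label(32, 61, paternal_map)
print(totals)
```

With these made-up segments, 27 of the 29 Mb fall on Hartley, which is the kind of result that would support calling the cluster Hartley.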

Sharon’s Cluster 2

Here is a match Sharon has with Anya at Cluster 2 on Chromosome 15:

This appears to be right before a pileup area:

I don’t know if that is significant. That is probably why the area before the match has the hatch marks. Cluster 2 match Anya is on Sharon’s Frazer side:

Cluster 3

Sharon’s Cluster 3 has matches on different chromosomes. I recognize Patrick as a German cousin. I match Patrick on Chromosomes 6, 12, and 13. Here is Cluster 3:

Sharon’s first match is Ursula. Note that she matches everyone in the Cluster. The other people with yellow squares going right across the Cluster are Silvia and the last match – Patrick. Ursula matches me on Chromosomes 1, 12, and 22. My assumption is that the common matches are on Chromosome 12.

Cluster 7 and Cluster 2 Revisited

Note that Clusters 2 and 7 both match at the beginning of Chromosome 15:

Here is Cluster 7:

Valerie, Sharon’s second match matches everyone else in the Cluster:

  • Sharon matches Valerie on Chromosomes 10 and 15
  • Sharon matches the first person in the Cluster on Chromosomes 10 and 15
  • Sharon’s third match has the same last name as Sharon’s first match – they both match on Chromosomes 10 and 15
  • Sharon’s last match is on Chromosomes 1, 15 and 22

That means the common match must be on Chromosome 15
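This is really just a set intersection. Here is a small sketch using the chromosomes listed above for Sharon's four Cluster 7 matches:

```python
# Chromosomes on which Sharon matches each person in Cluster 7.
match_chromosomes = [
    {10, 15},      # Valerie
    {10, 15},      # first person in the Cluster
    {10, 15},      # third match
    {1, 15, 22},   # last match
]

# The only chromosome shared by every match in the Cluster.
common = set.intersection(*match_chromosomes)
print(common)
```

This only shows which chromosome all four have in common; it does not confirm that the segments themselves overlap on that chromosome.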

From what I can tell, Cluster 2 also matches on Chromosome 15. This raises the question as to why they are not all in one group. Is this due to intermarriage? Or is this due to over-matching aka pile-ups?

Sharon’s Cluster 9

Sharon matches Lisa from Cluster 9 mostly on Chromosome 7. That matches up with Sharon’s maternal Lentz side. I haven’t gotten many Lentz matches, so I built out Lisa’s tree. Turns out Lisa has a Lentz ancestor.

However, Conrad is Lisa’s 8th great-grandfather. That is going back far in time. Lisa’s ancestors go from a Linz to a Lintz to a Lentz. Whether this is coincidence or not, I cannot tell. Even Conrad does not link up with my ancestors. You can’t say I didn’t try.

Confusing Clusters 17 and 18

Sharon has two Donnas and a Justin in Cluster 17. One Donna matches my mom, so would be on the Rathfelder side. The other Donna and Justin don’t match my mother and appear to be on the paternal Hartley side. I have a similar split on Cluster 18.
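The test I am applying here is simple enough to write down. This sketch assumes I have my mother's match list as a set of names; the "(1)" and "(2)" labels are placeholders I added to tell the two Donnas apart. A match missing from my mother's list is only presumed paternal, since a small maternal match could fall under the matching threshold.

```python
def side_of(match_name, mothers_matches):
    """Return 'maternal' if the match is on the mother's list, else 'paternal (presumed)'."""
    return "maternal" if match_name in mothers_matches else "paternal (presumed)"


mothers_matches = {"Donna (1)"}  # placeholder label for the Donna who matches my mom
cluster_17 = ["Donna (1)", "Donna (2)", "Justin"]

sides = {name: side_of(name, mothers_matches) for name in cluster_17}
print(sides)
```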

Sharon’s Cluster 22 and a Lancashire Tree

Cluster 22 had a match from England with no tree and a match from the US with an ancestor from England. As I am interested in my Hartley English roots, I thought I would look at Jill’s Hoyle tree and build it out a bit.

Jill’s grandfather was John Richard Hoyle. He married Isabella Hargreaves in Accrington:

This shows that Isabella was living in Derby at the time they married.

Here is John Richard Hoyle Sr. in the 1861 Census. He was elderly at the time, but with a young son – also John Hoyle.

Of interest to me is that this family lived at Higher Booths, Goodshaw, Lancashire. I have traced one of my Emmet ancestors to Goodshaw.

Here is Goodshaw in relation to Bacup where many of my Hartley ancestors ended up:

Coincidence? I’ll continue on with the Hoyle tree. Here is the marriage of John Hoyle to Mary Lord:

John Hoyle is a widower. That means he was married before:

I assume that this is the same John Tailor, son of a John Tailor. Now I need to find another marriage for John. Here is another:

However, this marriage is in Bury. Here is another Bury marriage to a John the Tailor:

However, note that this John is son of James, so I will propose a guess that he was the father of the other John the tailor.

Here is Edenfield – not far from Goodshaw:

Shuttleworth is just to the South of Edenfield.

Tracing the Hargreaves Family

We saw above that Isabella Hargreaves’ father John was a tailor. This is likely John Hargreaves in 1851, before Isabella was born:

John and his wife were said to be born in Ropendale – maybe Rossendale makes more sense. The children who are just initials were born in Rochdale.

The family was living on Oldham Road in Castleton. Castleton is to the SW of Rochdale. My guess is that John Hargreaves and Elizabeth married about 1841 based on the age of the eldest daughter of 9.

This is the suggested wife of John Hargreaves from Ancestry:

This John was a sexton, which is someone who takes care of a church. If this is the right person, it means that he must have changed his occupation. This appears to be Isabella’s death certificate giving her parents’ names.

Fortunately, I was able to find John Hargreaves in the 1841 Census. This shows that he was married to an Elizabeth at that time. They were living at the same place they were living in 1851 – Oldham Road, Castleton:

The census was taken on June 6, 1841, so that narrows the birth of Mary.

This tells me that Mary Ellen was about 4 months old when she was baptized. When I put these records together, it appears that John Hargreaves was married to an Elizabeth. They lived at Castleton and had a daughter Mary Ellen there. Elizabeth died and John became a sexton in Burnley where he married Elizabeth Dobson. The family moved back to Castleton and John regained his tailor business. He had at least three more children there with Elizabeth Dobson.

I may go back to this tree later.

Sharon’s Summary

  • I had problems in Clusters 17 and 18 due to matches on maternal and paternal sides.
  • I didn’t bother with Cluster 21 as the matches were small.
  • I built out a Lentz Cluster match’s tree and found a Lentz but the first name and places didn’t match up.
  • I built out a Cluster 22 match’s tree from England, and found some places where those ancestors lived that were similar to my ancestors, but didn’t match on the names. I got bogged down with the genealogy and may revisit the tree at some point.

Summary and Conclusions

I have now looked at all of my siblings’ and my own MH AutoClusters. I have also looked at my mother’s results.

  • I was surprised to find that one of my sisters’ AutoClusters only covers DNA from two of her grandparents’ sides.
  • I should be able to look at the results for my siblings and update the results for my mother and myself.
  • In the past, with AutoClusters from other companies’ DNA results, I have used MS Access to compare the results. I did not do that analysis with the MH Cluster results. That would be a good cross-check.
  • These AutoClusters have given me places to look for common ancestors and birth areas, but so far, I have not found any new discoveries.
  • It was interesting to see the clustering effects of Genetic Affairs using different input parameters on my families’ DNA results.





My First MyHeritage AutoCluster Analysis

This is a busy time for genetic genealogists. Companies seem to be competing with each other to get out new products. Genetic Affairs has been a leader in clustering analysis. They perform detailed clusterings for AncestryDNA, FTDNA and 23andMe. Their MyHeritage (MH) cluster analysis is a little different as it is done within MH. One set report is created and the detailed file of the segment information is not sent.

I got 100 people in the clusters. The thresholds were between 30 and 350 cM. The matching threshold for my matches matching each other was set at 20 cM. These clusters were done in the old way in that the largest clusters were first and the smallest last. There is no clustering of clusters. A first look at the AutoClustering shows that Clusters 2 and 3 match each other. There is also some affinity between Clusters 2 and 3 and Clusters 7, 9, 11 and 12. Colors now repeat every 12 clusters as opposed to the previous 10.

Identifying MY Clusters

I am familiar with many of the names already and have written Blogs about them. I’m going to first look at the low-hanging fruit and put them into a spreadsheet:

I only found two grandparent (GP) lines. I notice German and Russian names in Cluster 4 which is probably Rathfelder GP.

Splitting Apart Clusters 7 and 17

I have been working on Frazer DNA for a long time. As a result, I may be able to split Clusters 7 and 17. Note that I have the same common ancestor in Clusters 7 and 17. However, that is for Emily and Paul. Gladys has a different common ancestor.


This shows that Gladys has no known McMaster ancestor. This means that Cluster 17 should not include McMaster. I don’t think that I know that Cluster 7 is McMaster for sure, but it is more likely McMaster.

Clusters 6 and 16

Clusters 6 and 16 don’t separate as easily:

Ron, Stephen and my family share Clarke and Spratt ancestry. However, at a generation or so further back, we all share McMaster ancestry. It would take finding another common ancestor from someone in one of the clusters to further separate these out. Cluster 16 is interesting because the two other people who match Stephen have their ancestry in England. It is likely that Clarke and/or Spratt had English roots. McMaster was in Ireland for quite a time, so Cluster 16 is not likely a McMaster Cluster.

Red Cluster 1

I can identify Cluster 1 – at least at my GP level. One of the matches is JL. JL shows up when I do a ‘One to Many’ query on my paternally phased kit at Gedmatch:

This corresponds with my Hartley GP Line:

In the past, I have associated this match with my Massachusetts Colonial Heritage. This heritage is through my Snell side:

Here I added England in the notes for Cluster 16 from my previous section. I added Hartley in blue with information back to my Snell and Bradford 2nd great grandparents. Both these 2nd great-grandparents have Massachusetts Colonial ancestors.

I Need a Fourth Grandparent – Maternal Grandmother

Beth is the one I call my anchor DNA match. I have blogged about Beth here.

The other two matches in Cluster 15 have ancestors from England. It would take a bit of sleuthing and research to find the connections.

I now show all four grandparent clusters.

All I have to do is figure out the other clusters.

AutoClustering My Daughter’s DNA

In my previous post, I took an initial look at Heather’s DNA. In this Blog, I’d like to look at AutoClustering Heather’s DNA. AutoClustering puts Heather’s AncestryDNA matches into groups or clusters. Then those clusters are grouped together. This makes it easy to see which matches go where.

Heather’s AutoCluster

Based on Heather’s number of 4th cousin matches or closer at Ancestry, I chose a lower limit of 20 cM and an upper limit of 600 cM. 20 cM is the limit AncestryDNA uses for 4th cousin matches.  Currently, Heather has 297 in that category.

Here are Heather’s 41 Clusters:

Some of these Clusters will be on her maternal side and some on Heather’s paternal side. The clusters with gray dots between them mean that these groups of matches match each other.

Heather has 283 matches in these clusters minus 22 that didn’t fit into any cluster.

Let’s Identify Some of Heather’s Clusters: Her Highest Matches

AutoCluster puts Heather’s clusters in the order of the match level. So the highest match in a cluster shows first. I’ll create a spreadsheet for Heather:

This mimics the way AutoCluster lists its clusters with the highest match in the cluster listed first. I can recognize some of these names right away.

D.J. is a close relation to Heather on her mother’s side. I have not yet identified anyone on Heather’s maternal grandmother Cavanaugh side.

Next, I’ll sort by cluster, to get a skeleton for Heather’s clusters:

This brings me down to Cluster 21 or 22. I only identified one Jarek Cluster, but based on the gray dots between clusters, the Jarek or Polish relatives appear to go down as far as Cluster 14. I should have included Wozniak in those clusters.

The Bigger Picture

This shows that we have some gaps to fill in between clusters 23 and 39. I’m looking to locate some Cavanaugh ancestors. Clusters 23-39 would be one place to find them.

Cluster 32 – Cavanaugh Side?

I will be happy to find someone from Heather’s Cavanaugh grandmother side. Glenn from Cluster 32 has a tree:

I’d like to match Glenn’s tree above to Heather’s tree:

Ancestry puts Glenn and Heather at estimated 4th cousins. That means that if Heather and Glenn are in the same generation, then they will need to go back to Heather’s column starting with Jeremiah Warren and one row past where Glenn has gone.
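To make the "how far back do I need to go" step concrete, here is a minimal sketch of the usual cousin arithmetic. The function names are my own, and it assumes full (not half) relationships. Keep in mind that Ancestry’s "4th cousin" is an estimate from shared cM, so the real relationship could be a generation or two off.

```python
def ordinal(n):
    """Format 1 -> '1st', 2 -> '2nd', 3 -> '3rd', 4 -> '4th', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


def cousin_label(gens_a, gens_b):
    """Name a cousin relationship from each person's generations back to the shared couple.

    gens = 2 means the shared ancestors are that person's grandparents.
    """
    if min(gens_a, gens_b) < 2:
        raise ValueError("not a cousin relationship")
    degree = min(gens_a, gens_b) - 1   # 1st, 2nd, 3rd ... cousin
    removed = abs(gens_a - gens_b)     # generations of difference
    label = f"{ordinal(degree)} cousin"
    if removed == 1:
        label += " once removed"
    elif removed == 2:
        label += " twice removed"
    elif removed > 2:
        label += f" {removed} times removed"
    return label


def shared_ancestor_level(degree):
    """For n-th cousins in the same generation, describe the shared ancestors."""
    greats = degree - 1
    if greats == 0:
        return "grandparents"
    if greats == 1:
        return "great-grandparents"
    return f"{ordinal(greats)} great-grandparents"


print(shared_ancestor_level(4))  # 3rd great-grandparents
print(cousin_label(5, 5))        # 4th cousin
print(cousin_label(5, 6))        # 4th cousin once removed
```

So if Heather and Glenn really are 4th cousins in the same generation, the shared couple would be five generations back for each of them, at the 3rd great-grandparent level.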

I tried building out Glenn’s tree, but couldn’t find a connection to Heather’s tree. I think that I was on the right track as another person in the Cluster has a common ancestor with Julius Lafantasie and Emma Chamberland. So, no luck right now with Cluster 32.

Cluster 35

As I go down the clusters, I notice some match me or my siblings, so they are likely Hartley Clusters:

Donna has a 10 person tree in Cluster 35. Let’s see if that tree leads anywhere familiar.

Summary and Conclusions

  • Using Clusters, I was able to identify specific regions for three out of four of Heather’s grandparents
  • I had trouble pinning down Heather’s Cavanaugh grandmother side.
  • Once Ancestry has a chance to analyze Heather’s tree, it may make its own suggestions as to her Cavanaugh side.
  • One issue with the AutoClustering is that it requires you to build out a lot of trees to try to find connections. Then once the trees are built out it is very rare that a connection is made.




AutoClustering Aunt Esther’s Newfoundland DNA

In a previous Blog, I looked at the autoclustering of my mother-in-law Joan’s DNA. Esther is Joan’s half Aunt. That means that Joan and Esther have a connection on only one of Joan’s grandparents. All of Esther’s four grandparents were from Newfoundland. I am hoping that the AutoClustering process will make sense of Esther’s Newfoundland DNA.

Esther’s AutoCluster

This is the overall chart:

The 54 clusters are difficult to see because Esther has 612 matches. I set Esther’s autoclustering limits between 30 and 600 cM and was a little surprised at how many matches Esther had at that level.

Esther’s Family Tree

There are a few holes in Esther’s family tree:

The Peter Upshall born 1800 above is also a guess.  I’m not as familiar with the Shave and Kirby sides as my wife is not related on that side. The Clusters should identify some of them.

Here is a spreadsheet that I will need to fill in.

My wife is at the top of the list with the largest match in Cluster 1. In a way that is not good because my wife will be related to two of Aunt Esther’s grandparents: Henry Upshall and Catherine Dicks. Perhaps that is why Cluster 1 is so large. I will try another AutoCluster for Esther between 40 cM and 250 cM. That should be clearer. Also Marie’s niece Tina is the top match for Cluster 6. Tina will also share Upshall and Dicks matches. However, lowering the upper match limit to 250 cM will not solve all the problems. Even though Marie and Tina share both Upshall and Dicks, it is possible that many in the clusters will only have either Upshall or Dicks DNA. Or they will have more Upshall than Dicks or the other way around.

Esther’s Shared Ancestor Hints (SAHs)

At AncestryDNA, Esther has some Shared Ancestor Hints. Here is one:

Pat is a 2nd cousin once removed. Esther and Pat share the common ancestors of Shave and Burton. I was looking for easy answers but got thrown for a loop because Pat is in Cluster 1. She is in Cluster 1 with Marie who is not related on the Shave side. Interesting.

Here is some more of Pat’s paternal side lineage:

This tells me that perhaps Pat is in Cluster 1 because of her Upshall match and not her Shave/Burton match. That could mean that Margaret Upshall is a sister to Esther’s grandfather. If that is the case, then Esther and Pat may be 2nd cousins once removed on the Upshall side also. It’s a possibility.

A Kirby/Emberley SAH

Here Esther and M.B. are shown as 3rd cousins. AncestryDNA thinks they share enough DNA to be 2nd cousins, so something is going on. Not only that, M.B. is also in Cluster 1. Martha is the administrator for M.B. Look at Martha’s tree for M.B.

There is Upshall again. I have been in touch with Martha and we both agree that Peter is a pretty good potential ancestor. He was born to Sarah Upshall who was a single mother in Haselbury Bryan, Dorset, England.  So far, I’m thinking that there is more than meets the eye to these SAHs.

This Just In: Another AutoCluster for Esther

While I am thinking about the Upshalls in other SAHs, I’ll look at another AutoCluster for Esther. Things are still a bit muddy. I changed the lower limit to 40 and the upper limit to 250cM and got almost 300 fewer matches for Esther. However the picture is still muddy:

Esther is down to 33 clusters, but the grey dots between clusters represent crossover in ancestral lines. M.B. who was previously in Cluster 1 is now in Cluster 19. Changing the thresholds changes the delicate balance of the clusters and the relationship between the clusters apparently.

Which AutoCluster Version Should I Use?

It seems like Newfoundland genetic genealogy is already complicated enough. There are intermarriages of lines and missing lines. I have just put in for a third AutoCluster for Esther at the default thresholds of 50-250cM. I am hoping that those thresholds will simplify things.

Take 3 with Esther’s AutoCluster

You can’t say I’m not trying.

This looks more manageable with 20 clusters and 220 matches. I’m ready to rock this AutoCluster.
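The effect of the three threshold settings I tried (30-600, 40-250 and 50-250 cM) can be sketched as a simple filter on shared cM. This is only an illustration of the windowing step: the match names and cM values below are made up, the inclusive bounds are my assumption, and the real AutoCluster tool does much more than this after selecting the matches.

```python
def filter_matches(matches, lower_cm, upper_cm):
    """Keep matches whose shared cM falls inside the (assumed inclusive) window."""
    return {name: cm for name, cm in matches.items() if lower_cm <= cm <= upper_cm}


# Hypothetical shared-cM values, not Esther's real matches.
matches = {"A": 620.0, "B": 245.0, "C": 177.0, "D": 48.0, "E": 41.0, "F": 33.0}

for lower, upper in [(30, 600), (40, 250), (50, 250)]:
    kept = filter_matches(matches, lower, upper)
    print(f"{lower}-{upper} cM: {len(kept)} matches -> {sorted(kept)}")
```

Raising the lower limit drops the many small matches, and lowering the upper limit drops close relatives who tend to tie several ancestral lines together. Both changes shrink the set of matches that go into the clustering, which is why the number of clusters went from 54 to 33 to 20.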

Cluster 1: Dicks?

My notes for many in this Cluster indicate the Dicks family. D.M. in Cluster 1 has a good match and Dicks on her maternal side:

I was able to build out D.M.’s tree like this:

However, I have been proposing that Elizabeth Collier could be Elizabeth Crann. That is something to keep in mind. It looks like D.M. matches Esther on Kirby, Dicks, Dicks’ wife Elizabeth, Shave and Burton. That is quite a bit.

Cluster 14 – Kirby/Emberley

My notes for this Cluster say Kirby and Emberley. AutoCluster sorts the clusters by size of match and this cluster has the second largest match.

Cluster 8 – Upshall?

I’d like to make a guess that Cluster 8 could be an Upshall Cluster. There are a lot of high matches but not a lot of answers there:

I’ll make it a working theory. The first person on the list is Jane. I couldn’t see any connection to Esther in her tree. The second person James said that his grandmother was Laura Upshall.

Laura Upshall’s Tree

I found a Laura Upshall from England and a Laura from Newfoundland born in Harbour Buffett. So I chose the Laura from Harbour Buffett and built out a fast tree at Ancestry:

Assuming this tree is right, Esther and James are 2nd cousins twice removed with the common ancestors of Peter Upshall and Margaret Burton. While I’m at it, I’ll add Margaret Burton to Esther’s tree. The good thing about Laura’s tree is that I don’t see any Dicks in it. This could rule out Cluster 8 from being a Dicks Cluster. Here is what I have so far:

I still don’t see any Shave Clusters.

Another Cluster 8 Tree

Next down on the list of Esther’s matches on Cluster 8 is someone I call Hat. Here is what I think is his tree:

I think the person taking the test is the son of Ella Grace Upshall, but I’m not sure. Again, I don’t see Dicks in there which is good. One other thing is that these trees also have Shave. So that is a possibility.

Cluster 8: Shave Or Upshall?

One way to tell might be by comparing Esther to her half Niece Joan, my mother-in-law. Joan is related on Esther’s Upshall side but not her Shave side. The Jane that I couldn’t connect to Esther from Cluster 8 is in Joan’s Cluster 41. I had that listed as an Upshall Cluster for Joan. James is also in Joan’s Cluster 41. Finally Hat is in Joan’s Cluster 41, so that is three for three.
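This kind of cross-check between two people’s AutoCluster reports can be done by hand, as I did here, or with a small script. The sketch below uses only the three cluster assignments mentioned in the paragraph above; with full reports, the two dictionaries would hold every match.

```python
from collections import Counter

# Cluster assignments taken from the paragraph above.
esther_cluster = {"Jane": 8, "James": 8, "Hat": 8}
joan_cluster = {"Jane": 41, "James": 41, "Hat": 41}


def cross_check(a, b):
    """Count (cluster in a, cluster in b) pairings for matches that appear in both reports."""
    pairs = Counter()
    for name in a.keys() & b.keys():
        pairs[(a[name], b[name])] += 1
    return pairs


print(cross_check(esther_cluster, joan_cluster))  # Counter({(8, 41): 3})
```

A pairing with a high count, like Esther’s Cluster 8 with Joan’s Cluster 41, suggests that the two clusters represent the same ancestral line.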

A Tree for Eileen from Esther’s Cluster 8

Christina has a short tree, but her mother’s Reid name looks like a possible Newfoundland name. I assume that Christina’s mother Eileen is the one that took the test. I see from the 1940 Census that Eileen’s father was born in Newfoundland, so I guessed right:

Will Flint, Michigan lead back to Upshall?

The answer is no.

I wouldn’t be surprised if Sarah Ann Dicks was born in Harbour Buffett as I couldn’t find records for her birth and Harbour Buffett records are poor. I have that William Reid was born in Harbour Buffett in 1811.

Here is a tree for Lorna in Cluster 8:

I don’t see Upshall here. But Margaret Burton may have married Peter Upshall and she may be the daughter of Charles Burton. She did name what appears to be her second son Charles. It would have been customary to name the wife’s second son after her father. I know, a lot of if’s.

Christina From Cluster 8 and Her Tree

Christina’s tree looks hopeful.

Here is Madge and family in 1935 St. John’s West:

I can’t tell if Hattie is the same as Ethie. Unfortunately, I wasn’t able to get much further than Christina’s tree.

A Possible Upshall Tree

Now that I’ve reduced the possibility of Cluster 8 being Shave, it is more likely an Upshall Cluster. I’ll build a theoretical tree for Upshall with theoretical but possible common ancestors Peter Upshall and Margaret Burton:


I put this out there to see if it makes sense genealogically and with the DNA evidence.

Summary at Mid-Point

Here is my spreadsheet so far:

Subject to change.

An Upshall in Cluster 11

Here is Barbara’s paternal side of her tree:

Peter and Alice Upshall married in 1916:

Here is a marriage for Henry Upshall to an Elizabeth Smith:

Henry was said to be living at Little Harbour at the time of the marriage.

Madonna’s Cluster 11 Tree

Madonna shows her maternal grandparents at Ancestry:

I recognized the Collett name and built out Madonna’s tree with some help from other Ancestry Trees:

It’s not my greatest tree as I didn’t build out Susan Collett. I see a record showing a Peter Collett marrying a Susanna Hann in 1905:

That gives me a new line for my horizontal Upshall Tree:

B.A. in Cluster 11

B.A. appears to have an Upshall on his tree. I say appears because there are many trees posted by B.A.’s administrator. I picked the tree that most looked like B.A.’s initials and it had an Upshall in the line:

Solomon Upshall 1921

In 1921 Solomon was living among many Upshalls in Little Harbour:

I wasn’t able to build out past Henry Upshall. I did note one Ancestry Tree had this:

I suppose that is possible.

Cluster 10 and Phyllis’ Tree

Phyllis is missing her paternal side, but her maternal side has some familiar names:

A lot of these names are beginning to sound familiar after a while.

Building out Phyllis’ tree:

Dicks is a common ancestor, but there are other possibilities. With these clusters, I am looking for trends. The clusters are saying to me, in a particular cluster the DNA says that you are more related within this group than outside of this group. So in a sense, the clusters may be clearer than what the genealogy is showing.
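The idea that people in a cluster are "more related within this group than outside of this group" can be put into numbers. Here is a minimal sketch that compares how often pairs of matches share DNA with each other inside a cluster versus between two clusters. The names, pairs and cluster assignments are all hypothetical, and this is not how Genetic Affairs actually builds its clusters.

```python
from itertools import combinations

# Hypothetical shared-match pairs: each pair appears on each other's shared-match list.
shared = {frozenset(p) for p in [("A", "B"), ("A", "C"), ("B", "C"), ("D", "E"), ("C", "D")]}

cluster_1 = {"A", "B", "C"}
cluster_2 = {"D", "E"}


def link_density(names_a, names_b, shared_pairs):
    """Fraction of possible pairs (within one group or between two) that are shared matches."""
    if names_a == names_b:
        pairs = list(combinations(sorted(names_a), 2))
    else:
        pairs = [(a, b) for a in names_a for b in names_b]
    if not pairs:
        return 0.0
    hits = sum(frozenset(p) in shared_pairs for p in pairs)
    return hits / len(pairs)


print(link_density(cluster_1, cluster_1, shared))  # 1.0
print(link_density(cluster_2, cluster_2, shared))  # 1.0
print(round(link_density(cluster_1, cluster_2, shared), 3))  # 0.167
```

In this toy example everyone within a cluster matches everyone else, while only one of the six possible pairs between the two clusters is a shared match. That one cross-link is the kind of thing that shows up as a gray dot between clusters on the chart.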

Another Cluster 10 Tree: Not All Trees Are Created Equal

This tree is better, in a way, than Phyllis’. The maternal side is England and Toronto. That leaves the paternal side:

I built out this tree and found some common ancestors:

This person goes by ‘it’ for short at Ancestry. It is 2nd cousin once removed to Esther. I prefer its tree because it is less ambiguous. Its one Shave/Burton line is the one that is in Harbour Buffett where Esther’s ancestors lived. Where was Shave on Phyllis’ tree? Shave may have been on her paternal side that Phyllis didn’t show.

Richard’s Cluster 10 Tree

I could use another tree to confirm, even though I am pretty sure of Shave/Burton already. Richard has a small, but high-grade tree:

The reason I like his tree is that maternal side and paternal side are shown. Also it narrows down to a name I know instead of expanding out to many ambiguous matches. I sort of cut off Lucy Shave. Sorry, Lucy. Richard’s Tree shows two lines of connections:

However, the closer Shave/Burton connection puts Richard also at 2nd cousin once removed to Esther. Cluster 10 represents Esther’s fourth grandparent Line of Shave:

A Shave/Burton Tree


Here is Esther’s Cluster 10 Shave/Burton Tree:

Cluster 4

Cluster 4 is next on the GeneticAffairs Report. Daisy is Esther’s first match with 177 cM. Her tree says that she shares the Dicks ancestral name with Esther.

Daisy has a good tree:

Daisy has Joyce and Dicks at her 2nd great-grandparent level above. Here are two more generations on Daisy’s Tree:

This shows Christopher Dicks and his wife twice. Daisy descends from Rachel and Robert Dicks. I’m sure there is a Crann connection also, but this should be overshadowed by the Dicks connections.

That means that Esther and Daisy are 4th cousins once removed twice on the Dicks Line.

Match #2 on Cluster 4 – Julie

Julie shows her two parents on her Ancestry Tree. My first attempt to build out Julie’s tree was a disaster. I think that Julie attached her DNA results to her mother’s side. I was able to fix this by going into Julie’s tree and going down one level from her mother. This worked better and I came up with a Newfoundland Tree for Julie’s paternal side:

None of the names sound familiar, but at least I’m in Newfoundland instead of Ireland. I built out Julie’s tree a bit but didn’t find a connection to Esther.

I was able to build out Julie’s tree a little more:

The tree has William Henry Dicks from England. That means that the match could go back to England or that a descendant of Christopher Dicks moved back to England and then back to Newfoundland.

I’m ready for a new cluster.

Cluster 12 – Bridget and bam

I’ll start with bam because he has Newfoundland ancestors in his tree. Here is my build-out based on some Ancestry suggestions:


There are a few interesting things about this tree. First, it is possible that this Charles Burton could be an Uncle or father of Esther’s ancestor Margaret Burton born 1825. Also, the Frances Dicks could be the Frances Dicks I have as daughter of Christopher Dicks. I have this tree, roughly based on DNA testing:

However, I see that the first George in the tree must be wrong. He should be in a later generation. Also there is a discrepancy on the birth date of Frances Dicks. I have her here as born 1811, but 1805 may make sense also.

That still leaves the question as to whether this is a Burton or Dicks Cluster (or something else!). I think I may be able to figure out the answer to that question, but not today.

Cluster 20

This could be the last Cluster for now. The top match with a tree is G.K. Here is a clue from AncestryDNA:

G.K. and Esther both have a Joseph Dicks in their tree. I had added in Joseph on Esther’s maternal line. She had a Jane Dicks there that I couldn’t place. The Dicks on Esther’s paternal side were easier to place.

My Theory on Joseph Dicks

I think that the Joseph Dicks in G.K.’s tree and the one in Esther’s tree could be the same person. In G.K.’s tree Joseph is born in 1818 in Oderin and has son Michael in 1869 with Mary Murphy. She could have been a second wife. In Esther’s tree, Joseph is born in 1810 in Famish Gut and has Jane Ann Dicks with Mary Griffith in 1841. If I’m right, that would make Esther and G.K. half third cousins. I had that Esther’s Joseph descended from Christopher Dicks. However, the tree that I made for G.K. has Joseph’s parents as John Dicks and Mary Corbett. That may make more sense.

One point is that the tree I made for G.K. has Joseph Bulley Dicks born in 1818:

However, G.K. has Joseph born in 1849.

Jerome’s Cluster 20 Joseph Dicks Tree

I notice that Jerome follows G.K. with a later birth date for Joseph Dicks:

It appears that Jerome is 2nd cousin to G.K. and they both descend from different daughters of Michael Dicks.

Beth in Cluster 20

Beth in Cluster 20 also has a Joseph Dicks tree but with the earlier Joseph Dicks birth date:

Esther’s Cluster Summary

This is a start:

I’m sure that the more I work on this, the more it will come together:

In general the matches between clusters seem fewer as you go down and to the right. That would mean that if I am right with Joseph Dicks, then that is one of the more unique lines. Cluster 20 represents a Roman Catholic Line also, and I believe that most or all of the other lines are Church of England. I see that I already had a 14 and 15 Cluster label, so my newer label for Cluster 15 should refer to the lower right of the green box.

Summary and Conclusions

  • Looking at Esther’s 20 Cluster Report was helpful. It was also a lot of work to build out and analyze trees.
  • I forgot to mention the Crann connection in New Zealand. This is the small Cluster 2. I believe that the younger Christopher Dicks married Elizabeth Crann, so it may be fitting that the small Crann Cluster was next to the large Dicks Cluster 1.
  • The clusters help to focus on where to look when comparing trees. The clusters at least suggest that the ancestors should be along the same line as each other.
  • Clusters are a good place to try out theories on ancestors. The theory I had on Joseph Dicks seemed to play out well. From my previous Dicks DNA project, I had tried to connect Esther’s Joseph Dicks line and was unsuccessful. This would explain the fact that the Joseph Line appears to be different from the Christopher Dicks Lines.
  • I hope to continue looking at Esther’s DNA clusters at some point and comparing them with her half-niece Joan’s. For example, I would not expect that Joan would be matching Esther’s Cluster 20 as that is Esther’s maternal side and Joan matches Esther on Esther’s paternal side.
  • A lot of the progress is from reviewing the matches’ trees, but the AutoClustering helps focus and direct the analysis of trees.





Sorting My Mom’s DNA with AutoCluster

I already sorted my mom’s DNA with AutoCluster last week. However, since that time, Genetic Affairs has changed the look of their AutoCluster Chart. They now cluster the clusters, which makes it easier to tell which ancestral groups go with which.

My Mom’s Ancestry

My mom Gladys’ father was German, but his German ancestors lived for quite a while in a German colony in Latvia. His parents were Rathfelder and Gangnus. My mom’s maternal grandfather Lentz was also German but his ancestors had been in Philadelphia since the American Revolution. Gladys’ maternal grandmother was Nicholson. Her family moved to Philadelphia from Sheffield, England.

The First AutoCluster

My first AncestryDNA AutoCluster for my mom looked like this:

  • Thresholds: 20-600 cM
  • Matches: 323
  • Matches not used in clusters: 29
  • Clusters: 48

I started writing a Blog on the results, but didn’t finish. Here is a spreadsheet for the above chart:

These clusters were sorted by the size of the cluster and I didn’t identify the first three clusters.

Mom’s New AutoCluster Results

I expect the new results to be more organized and show where the groups of matches belong compared to the other groups of matches:

  • Thresholds: 20-600 cM
  • Matches: 330
  • Matches not used in clusters: 28
  • Clusters: 49

I used the same thresholds in the new AutoCluster run. The results were similar but now the clusters are organized. Here is the new spreadsheet:


I note that Elise and Rowena are in twice. I don’t know if that messes up the results. I didn’t show all the clusters as they go off the page.

Elise shows as being in Clusters 5 and 6 which doesn’t make sense. She doesn’t show in Cluster 5 but shows as a dark gray row to the left and above Cluster 6. Rowena shows as being in her own Cluster 15 which I don’t show above.

Unraveling the Mystery of Mom’s DNA

The unraveling the mystery of mom’s DNA involves trying to figure out which parts of her DNA go with which common ancestors. The common ancestors are the common ancestors of her common matches. Her common matches are grouped together and those groups are grouped together, so let’s get started.

Here are mom’s four grandparent lines:

Shown here are the great-grandparent and 2nd great-grandparent levels. By location, the top two grandparent lines are Latvia and the bottom two grandparent lines are Philadelphia and Sheffield, England.

Cluster 1: Nicholson/Ellis

Cluster 1 is easy. It is headed up by mom’s 2nd cousin Carolyn on the Nicholson/Ellis Line:

Cluster 38 – Rathfelder

Next, I’ll go all the way down to Cluster 38. I believe that this is a Rathfelder Cluster:

I may only have one Rathfelder Cluster with the two sisters, Astrid and Ingrid.

Mom’s Maternal and Paternal Clusters

The above two Clusters may have set the edge for Mom’s Clusters, but I’ll check in more detail later. Here is my assumption so far:

Again, this is a guess based on two clusters. I will need to check this out. I also will want to try to identify Lentz and Gangnus matches, if possible.

Finding Lentz

Lentz matches have been difficult to find. Here is the Lentz tree with some of the descendants who have had their DNA tested:

The left branch has the closer matches, but they are also half Nicholson. Here is Radelle’s mom at Ancestry:

This is a little confusing because Radelle took the test and her mom, Delores shows in the tree. I became suspicious when I saw that Delores died in 2011. Radelle is in Cluster 32:


I now have three of my mom’s grandparents. However, does that mean that Nicholson has 31 Clusters?

More Nicholson

I can fill in one Cluster with Nigel. He has a large match with my mom going back to 1765 in Sheffield, England.

I should have John Nicholson’s wife as my mom could just as easily be sharing her DNA. Here she is:

I’m getting stuck on my mom’s maternal side, so time to switch to paternal:

Otis and Cluster 39

Here is Otis:

Here Otis is 3rd cousin once removed and 4th cousin once removed on my mother’s Rathfelder side. This Chart describes Otis’ relationship to my mom as 5th cousin, once removed on the Gangnus side:

That means the Rathfelder side wins out (I think).

Otis and the Colony Effect

The Colony effect is this. You put a bunch of Germans in a Colony in Latvia and they want to marry other Germans:

Here is Otis’ Cluster 39 in blue highlighted. Astrid is in the cluster above and to the left of Cluster 39. Otis is the top left match of the blue cluster. He also has shared matches with mom in other clusters below and to the right.

Doing Some Latvian Genealogy

I did a search for Latvia at my mother’s AncestryDNA match page:

Robert shows his maternal grandparents coming from Latvia. That means I could try to do some genealogy on Robert’s tree if I want. Robert is also in Mom’s Cluster 45.

The All-Latvia Database

I was able to find the Resch family at:

This is a good web site for Latvian research.

The Latvians like to Latvianize names. So I don’t know if Retsch is a German name changed to Recs or if Recs changed to Retsch. I also found Zamuels’ birthplace and birth date. The last column is place of origin. This shows as Riga for father and son. I usually look for Irsu Pag., which is Hirschenhof. That would link with my ancestors.

Robert has that Alma was born in Dresden, Germany, so I’ll look to Mazur and Rosenbach. I couldn’t find Rosenbach in the list. I did find some Martin’s in the Latvia Inhabitant list:

The closest Martin has his dad as Jēkabs.

A Latvian Secret Weapon

I was ready to give up but remembered I had a book on the Gangnus family. If Robert is related to me through that family, perhaps I could make a connection there.

I left out the bottom where it says Darmstadt 2003.

I looked up Retsch in this book and found one reference:

This reference says that Samuel was born March 22, 1872, which is close to the April 3, 1872 I had above. Now all I have to do is make the connections. I have a feeling that the connections go back a way. What the above says is that Samuel married Charlotte Alma, who was born 2 March 1867. Her parents were Johann Georg Gangnus and Marie Jacobine Schilling.
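The two birth dates are exactly 12 days apart, which is the gap between the Julian and Gregorian calendars in the 1800s. Latvia was part of the Russian Empire in 1872, and the Empire still used the Julian calendar. So one possible explanation, which is an assumption and not something either source states, is that the book gives the old-style date and the database gives the new-style date. Here is a minimal sketch of the conversion. The function name is made up for illustration:

```python
from datetime import date

def julian_to_gregorian(year, month, day):
    """Convert a Julian-calendar date to a Gregorian datetime.date.

    Uses the standard Julian Day Number (JDN) formula for Julian-calendar dates.
    """
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    jdn = day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083
    # Python ordinals count proleptic Gregorian days; the JDN is offset by 1721425.
    return date.fromordinal(jdn - 1721425)

# Assumption: the book's March 22, 1872 is an old-style (Julian) date.
converted = julian_to_gregorian(1872, 3, 22)
print(converted.isoformat())
```

If the assumption holds, the converted date lands on April 3, 1872, and the two records describe the same day.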

I see what happened. Robert had Charlotte Alma Gangnus as Alma Magnus. That makes sense. When I first saw my mother’s grandmother’s name written, I think it was written as Youganis.

Gangnus Production Update

Now I have two Gangnus/Gagnus families:

The good news is that I was about to give up on the Robert tree and then I remembered my Gangnus book. The bad news is that I’m getting lost in all these Gangnus families. However, I am starting to see our trees coming together in a confusing and interesting way.

If I understand this correctly, Robert and I are double 5th cousins. Robert and my mother are double 4th cousins, once removed. The other thing is that Robert is related on both my mother’s paternal grandfather’s and paternal grandmother’s sides.
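The step from 5th cousins to 4th cousins, once removed, can be checked by counting generations up to the shared ancestors. The sketch below assumes that Robert and I are each six generations below the common couple, which is what 5th cousins implies. My mother is then five generations below. The function names are only for illustration:

```python
def cousin_relationship(gens_a, gens_b):
    """Return (degree, removed) for two descendants of a shared ancestor.

    gens_a and gens_b count generations from each person up to the shared
    ancestor (2 = grandchild), so two grandchildren are 1st cousins.
    """
    if gens_a < 2 or gens_b < 2:
        raise ValueError("both people must be at least grandchildren")
    degree = min(gens_a, gens_b) - 1
    removed = abs(gens_a - gens_b)
    return degree, removed

def describe(degree, removed):
    """Format a relationship; the suffix lookup is only meant for degrees below 11."""
    suffix = {1: "st", 2: "nd", 3: "rd"}.get(degree, "th")
    label = f"{degree}{suffix} cousin"
    if removed:
        label += f", {removed}x removed"
    return label

# Assumed generation counts: Robert 6, me 6, my mother 5.
print(describe(*cousin_relationship(6, 6)))
print(describe(*cousin_relationship(5, 6)))
```

The "double" part is not computed here. It just means the same relationship holds through two separate ancestral couples.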

In order to display this on my spreadsheet, I added another row for Cluster 45:

Summary and Conclusions

  • The new autoclustering look helped show where the clusters grouped with each other. I wasn’t able to identify many more clusters specifically, but now I know in what area they should belong.
  • I was able to make a guess where my mother’s shared matches went from maternal to paternal.
  • I looked at some paternal clusters. However, intermarriage in Hirschenhof, Latvia made it difficult to nail down DNA to a specific grandfather in at least one case.
  • I was able to build out Robert’s tree. Robert was in my mother’s Latvian Cluster 45. I used the All Latvia on-line Directory and a book I had on the Gangnus family in Latvia. However, after all that work, Robert appears to be equally related to my mom on both her paternal grandfather’s and paternal grandmother’s sides.






AutoClustering My Mother-In-Law Joan’s AncestryDNA

I’m excited about looking at my mother-in-law’s DNA. I tried autoclustering her FTDNA results but had a difficult time identifying many of her clusters.

Making Joan’s DNA Fun Again

When I first started looking at Joan’s DNA several years ago, it seemed like a lot of her matches resulted in common ancestors. Then later, I saw that there was a lot of inter-marriage going on in Prince Edward Island (PEI) where Joan’s two paternal grandparents came from. Let’s take a look at the Geneticaffairs AutoCluster for Joan:

That’s not very clear, is it? My previous autocluster reports were in the range of three or four hundred matches. This report is quite large, with about 650 matches. Large is good, but it makes the chart difficult to view. To get the Chart above, I used thresholds between 25 and 600cM.
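The thresholds work as a simple filter on the match list before any clustering happens. Here is a rough sketch of that step. The match names and cM values are invented, and treating both limits as inclusive is my assumption, not something taken from the Genetic Affairs documentation:

```python
LOW_CM, HIGH_CM = 25, 600  # the thresholds used for Joan's chart

def within_thresholds(matches, low=LOW_CM, high=HIGH_CM):
    """Keep (name, shared_cm) pairs whose shared cM falls inside the limits.

    Assumption: both limits are inclusive.
    """
    return [(name, cm) for name, cm in matches if low <= cm <= high]

# Hypothetical matches, not Joan's real list.
sample = [
    ("Match A", 1700),  # too close: a parent, child or sibling level match
    ("Match B", 600),
    ("Match C", 212),
    ("Match D", 25),
    ("Match E", 19),    # too distant for this run
]

kept = within_thresholds(sample)
print(kept)
```

Lowering the upper limit drops the closest relatives, who tend to match people on several lines at once and pull clusters together.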

Joan’s Ancestry

Joan’s ancestry is one-half PEI, 1/4 Newfoundland and 1/4 Nova Scotia. The records are poor for Newfoundland and the Nova Scotia relatives are a bit obscure.

The first column has Joan’s great-grandparents. Ellis through Hopgood are PEI. Upshall and Dicks are Newfoundland. Daley and Rhynold are Nova Scotia. Here is a guess on how Joan’s autocluster will look:

It would be nice to sort the Ellis from the Rayner in the top square. However, there are some crossovers in the families as you go back in time. I’m also curious to look into Joan’s Newfoundland and more obscure Nova Scotia ancestry.

Let’s Get to the Clusters

First I start with the Identifying Spreadsheet. This is to identify Joan’s 66 clusters – or to at least get a start on them.

This goes down to Cluster 42, because the results went off my screen. However, Brian at Cluster 41 is important.

Brian’s Upshall Match

Here is an Upshall Tree. I think I have it right:

Brian is Joan’s 1st cousin once removed. However, they are only related on the Upshall side because Fred Upshall’s first wife died and he remarried and had Gertrude and Esther. I drew my big green box starting with Brian in Cluster 41 in my initial guess.

Joan’s Ellis Side

Joan’s Cluster Chart is headed up by E.E. Here is E.E.’s Shared Ancestor Hint (SAH) with Joan at AncestryDNA:

E.E. is in Joan’s Cluster 1 and is a second cousin to Joan. E.E. is the top left square in this cluster.

The higher matches are on the top left and the lower matches are on the lower right. The Shared Matches fade out a bit from the top left to the lower right. Most of Joan’s matches with Newfoundland ancestry can be found in this cluster. That should include more of the Dicks relatives than Upshalls.

Now I have two out of 66 clusters:

These might not be the best names for these clusters, but that is what I am calling them right now. Cluster 1 has 105 members and Cluster 41 has 101 members, so those two matches represent clusters that total to over 200 matches.

Joan’s AncestryDNA Circles and Her Clusters

Joan has 22 Circles at AncestryDNA. These Circles point to common ancestors and should help to identify Joan’s clusters. One of the more obscure clusters leads me to Gordon with Rhynold ancestry:

Gordon and many others are in Cluster 61. This probably represents the start of Joan’s Daley maternal grandmother’s side:

Cluster 61 has been bolded and the Upshall Cluster is shown in the upper right of the image above. There may be other Newfoundland Clusters between Upshall and Daley/Rhynold.

Daley represents 1/4 of Joan’s DNA but a smaller percentage of her actual matches. I have now defined the three main areas of Joan’s ancestry on the clusters. They are: PEI, Newfoundland and Nova Scotia.
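The one-quarter figure comes straight from counting great-grandparents by region: four from PEI, two from Newfoundland and two from Nova Scotia. Here is a small sketch of that arithmetic. It gives expected shares only. The DNA Joan actually inherited from each great-grandparent will vary, and the number of matches per region depends on who has tested:

```python
from collections import Counter
from fractions import Fraction

# Regions of Joan's eight great-grandparents, as described above.
great_grandparent_regions = (
    ["PEI"] * 4 + ["Newfoundland"] * 2 + ["Nova Scotia"] * 2
)

def expected_shares(regions):
    """Expected share of ancestry per region, one equal part per ancestor."""
    counts = Counter(regions)
    total = len(regions)
    return {region: Fraction(count, total) for region, count in counts.items()}

shares = expected_shares(great_grandparent_regions)
for region, share in shares.items():
    print(region, share)
```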

Separating Ellis from Rayner

I have distinguished three areas of Joan’s ancestry. I have Joan’s Ellis, Upshall and Daley ancestry. Now I would like to separate the Ellis DNA from the Rayner DNA. This is a little difficult due to crisscrossing of Rayner and Ellis ancestry. Here is some of Joan’s paternal ancestry:

Back to the Circles

Here is a Rayner Circle from Ancestry:

There are 21 in this circle. Hazel is a match with strong confidence. Yet, she appears in Joan’s Cluster 1:

I do see that while Hazel has two Rayner Lines, she also has an Ellis ancestor:

It looks like Joan may be matching on this Ellis Line rather than the Rayner side. Confusing, isn’t it?

The Mary Watson Circle

Mary Watson was the wife of Edward John Rayner. If Edward was in Cluster 1, shouldn’t Mary be also? Or can AncestryDNA somehow separate the two?

Joan’s first non-close family relative in the Mary Watson Circle is Esther. Turns out Esther is in Cluster 13. Hence, my question above.


Looking at Esther’s tree, I don’t see Mary Watson:

Perhaps it is more obvious through other trees.

One of the next matches to Joan in the Mary Watson tree is Mary-Ann. Mary-Ann is in Cluster 12. Mary-Ann has one non-private person in her tree who is not a Rayner and not a Watson. At this point, I can choose to trust Ancestry’s Circles or not trust them. I’ll assume that there is something to the Circles and add Cluster 12 as a Mary Watson Cluster.

Here is Joan’s green Cluster 12 highlighted:

Let’s Try a Mary Yeo Circle

Here is Mary Yeo.

Mary is Joan’s third great-grandmother on her paternal Rayner side.

Wanda is a top match in the Mary Yeo Circle, but she is in Cluster 1. Wanda also has at least one Ellis ancestor. I am beginning to question some of these Ancestry Circles. However, to be fair, I have had trouble separating out Ellis and Rayner by hand, so I’m sure a computer program would have the same problems.

One More Rayner Side Circle: Amelia Watson

Ronald is a top match in the Amelia Watson Circle. He has Gorrill, Hopgood and Watson ancestors. He is also in Cluster 7. Hmm…

An Additional Ellis Cluster

Kath is in Cluster 4:

However, Kath is in the Pring Circle. The Circles are confusing me right now, so I’ll have to ignore them. Note that Kath has two Shared Ancestor Hints (SAHs). Here is the second:

I suppose that is how Kath got into the Pring Circle. Fortunately, both these ancestors are on the Ellis side. From the above, it appears that Richard Gorrill married two Newcombe sisters. I’ll record this in my spreadsheet like this:

This shows that I have five PEI Clusters identified out of what appears to be a total of 40 PEI Clusters.

One More Cluster – #19

There is always one more Cluster to Identify. My next strategy is to look down the list of clusters from my AutoCluster Report:

I have a few notes for Heather and L.M. that indicate that they should be on the Rayner side.

More on Newfoundland DNA

I have written many Blogs about Dicks and other Newfoundland DNA. I will look into those matches now.

Crann DNA

Joan matches others with Crann DNA. Heather is from New Zealand and Joan and Heather’s common ancestors are likely Henry Crann, born 1757 in Netherbury, Dorset, England, and his wife Elizabeth Collens. This is a case where the DNA gets ahead of the genealogy. Heather is in Joan’s Cluster 46.

Building Out Terrence’s Tree

Terrence is also in Cluster 46 and has a tree with four people. I am curious about his tree as his mother is a Crann. I have avoided building out any trees in this Blog, so I will build one out now:

This is Terrence’s mother’s grandfather’s line going right back to Henry Crann and Elizabeth Collens. One interesting thing about this tree is that I have Richard Crann being born in Harbour Buffett where Joan’s Newfoundland ancestors lived.

Tyler Also from Cluster 46

In addition to Terrence is Tyler. I don’t have to build out his tree. His tree also goes back to John Crann. When I put Heather, Terrence, and Tyler in a tree, I get this Cluster 46 Crann Tree:

R.N. From Cluster 46

R.N. is Joan’s last match at Cluster 46 (at the threshold that I set). Turns out R.N. also has a tree on the New Zealand Branch:

Now Joan has symmetry in her Cluster 46 between Newfoundland on the left and New Zealand on the right.

Where is Joan in Cluster 46?

That is the problem. I don’t have good records for the match. I had proposed that John Crann had a daughter named Elizabeth who married Christopher Dicks.

The problem with this theory is that I don’t have any paper evidence. I already had Tyler in this tree, but I am missing Terrence. He needs to be added in. I note that at Ancestry all the trees that have a name for Christopher Dicks’ wife have Elizabeth Collier. There is one researcher who has Christopher’s wife as Elizabeth Crann but has no parents for her.

Summary and Conclusions

  • With the AutoClustering technique, I was able to break down Joan’s DNA into her three ancestral regions.
  • I had some difficulty in splitting Joan’s PEI Ellis and Rayner grandparent clusters. This may be partly due to a fairly high 600cM top limit for the clusters.
  • I wonder whether lowering the top number would give me more clusters. There were a lot of people in the two main Ellis and Upshall Clusters.
  • I focused on one small Crann cluster with small matches but good trees. This cluster added to my previous work where I propose that Elizabeth Crann is the wife of the Christopher Dicks born about 1812.