Comparing My AncestryDNA AutoClustering To Two of My Siblings’ Clusters

I looked at my sister Heidi’s AncestryDNA clusters here and my brother Jon’s here. I used the bottom level of 25 cM and top-level of 600 cM for their matches. This resulted in 20 clusters for Jon and 23 clusters for Heidi. This is what my clusters look like at those same levels:

I have 37 clusters. The number of clusters is proportional to the number of 4th cousins or closer that we each have at AncestryDNA.

In my previous tries at looking at my clusters, I chose match levels that resulted in first 5 clusters and then 76 clusters. 37 clusters seems like a good number, plus it will give me a good comparison to my two siblings.

Starting to Identify My Clusters

The thought behind clusters is that a group of clusters probably indicates a group of ancestors that are all along a particular line. I have sorted my clusters by DNA match. I have my top match in the cluster followed by the match amount in cM. Then I have the cluster number, my grandparent line and then common ancestors or other notes:

I continued the cluster numbers down in the order of their match levels. Cluster 17 is out-of-order in a sense. There are only 4 people in Cluster 17 and one is my maternal 1st cousin’s daughter. So those matches could be on either side of my maternal grandparents.

Cluster 10 and Gladys

Here is how I am related to Gladys:

That’s assuming that I have this tree right. Gladys is my third cousin, once removed. We have a double Frazer ancestry. That makes me wonder how I am related to the other people in Cluster 10:

Ancestry shows me at the same relationship to Gladys as with the rest of the group. However, I’m not so sure about that.

I tried building out Debra’s tree and found some Irish ancestors:

Little and Burns were both from Ireland if I drew Debra’s tree correctly.

Here is John’s tree:

John appears to have two Burns Lines in his tree. So that is something to keep in mind in case this is more than a coincidence.

It looks like my top 5-10 clusters are all on the Frazer side.

Some Relatives from Russia?

My Rathfelder ancestors lived in a German Colony in Latvia. In Cluster 11, I have some relatives that were from a German Colony in Russia about 1,000 miles away from Latvia. I have read of at least one connection between the two colonies. These are my top 20 clusters. I seem to favor the Frazer side as more than half of the clusters are Frazer clusters.

Diving Further Into the Unknown Clusters

The next Cluster is 30. One of the two people in that Cluster, Howard, has a tree with this person:

This looks like the same Hannah that I have in my tree:

Hannah’s grandparents would be the common ancestors that Howard and I share: Samuel Snell and Mary Head.

Cluster 16 and a New Ancestor Discovery

I can see this from this table from AutoCluster:

A note that I had put under my AncestryDNA match Bobby turned out to be helpful.

I have no idea who Seymore is and have no known ancestors in this area of the country. I suspect that we may have common ancestors in England or Ireland. From what I can tell, this match is on my Frazer grandparent side.

My Last Seven Clusters

Here is a summary of all my clusters:

My Clusters Compared to My Brother’s and Sister’s Clusters

I would like to see how my clusters compare to my brother Jon’s and my sister Heidi’s. It looks like my matches tend to the Frazer side. The process was a bit annoying, so I took the data files into MS Access and compared them there. I came out with this comparison:

This shows where my clusters are equivalent to Jon’s or Heidi’s.
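The same kind of cross-reference can be sketched outside of Access. This is only a minimal Python sketch, not the query I actually ran: it assumes each AutoCluster data file has been boiled down to a list of (match name, cluster number) pairs, and the names and cluster numbers below are made up.

```python
from collections import defaultdict

def cross_reference(mine, sibling):
    """Map each of my cluster numbers to the set of sibling cluster numbers
    that contain at least one of the same match names."""
    sibling_cluster_of = dict(sibling)  # match name -> sibling's cluster number
    xref = defaultdict(set)
    for name, my_cluster in mine:
        xref[my_cluster]  # make sure every one of my clusters appears
        if name in sibling_cluster_of:
            xref[my_cluster].add(sibling_cluster_of[name])
    return dict(xref)

# Hypothetical data, for illustration only
my_matches = [("Ann", 1), ("Bob", 1), ("Cy", 2), ("Dee", 3)]
jon_matches = [("Ann", 1), ("Bob", 1), ("Cy", 1), ("Eve", 4)]

xref = cross_reference(my_matches, jon_matches)
for cluster in sorted(xref):
    print(cluster, sorted(xref[cluster]) or "no equivalent cluster")
```

A cluster of mine with an empty set is one where none of the members show up in the sibling’s clusters at all.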

Here is the same chart by match size:


I match Cheryl at 69.1 cM but her Cluster has no match with Jon or Heidi.

This shows that I split Heidi and Jon’s Cluster 1 into two. They are now my Clusters 2 and 1. Likewise, I split Jon’s Cluster 20 and Heidi’s Cluster 15 in two. They are my Clusters 21 and 15. One theory is that I am related on both common ancestral lines and Heidi and Jon are related on only one. My assumption is that my Cluster 15 is related to my Cluster 21. I split Jon’s Cluster 6 into my Clusters 10 and 28. However, I don’t match Heidi on my Cluster 28. I had already determined that my Clusters 10 and 28 involved the same Frazer couple.
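Spotting a split like this amounts to checking whether the members of one sibling cluster land in more than one of my clusters. Here is a minimal Python sketch under the same assumption of (match name, cluster number) pairs. The match names are placeholders; only the cluster numbers follow the Jon’s Cluster 6 example.

```python
from collections import defaultdict

def find_splits(mine, sibling):
    """Return {sibling cluster: sorted list of my clusters} for every sibling
    cluster whose members fall into more than one of my clusters."""
    my_cluster_of = dict(mine)  # match name -> my cluster number
    landed = defaultdict(set)
    for name, sibling_cluster in sibling:
        if name in my_cluster_of:
            landed[sibling_cluster].add(my_cluster_of[name])
    return {c: sorted(s) for c, s in landed.items() if len(s) > 1}

# Placeholder names; cluster numbers taken from the Cluster 6 example
mine = [("match_a", 10), ("match_b", 28), ("match_c", 17)]
jon = [("match_a", 6), ("match_b", 6), ("match_c", 8)]

splits = find_splits(mine, jon)
print(splits)
```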

My maternal Cluster 17 is described by Jon’s cluster 8 and Heidi’s Cluster 14. These are not helpful because they are on my mother’s maternal and paternal sides. I need to lower the thresholds so Taylor does not show in the matches. She is a 1st cousin once removed, so she is related on both of my maternal grandparent sides.
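Lowering the top threshold is just a filter on match size. A minimal Python sketch, assuming the limits are inclusive (I have not checked how AutoCluster treats a match exactly at the limit) and using made-up cM values rather than Taylor’s actual match:

```python
def within_limits(matches, lower_cm=25, upper_cm=600):
    """Keep (name, cM) pairs whose match size is between the two limits."""
    return [(name, cm) for name, cm in matches if lower_cm <= cm <= upper_cm]

# Hypothetical matches, for illustration only
matches = [
    ("close cousin", 450.0),
    ("second cousin", 210.0),
    ("distant cousin", 30.0),
    ("too distant", 12.0),
]

print(within_limits(matches))                # 600 cM top limit keeps the close cousin
print(within_limits(matches, upper_cm=300))  # 300 cM top limit drops the close cousin
```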

Summary and Conclusions

  • I have 37 clusters. I matched Jon on 14 of my clusters and my sister Heidi on 19 of my clusters. This sounds about right as we should match half of our siblings’ DNA and I have more clusters than they do.
  • I liked running AutoCluster with a top cutoff of 600 cM to get an idea of how to sort the clusters. However, once those clusters are sorted, it is good to lower the top cutoff. I lowered my top cutoff for my next run with my sister Sharon to 300 cM and got good results.
  • I like sorting the clusters by match size. This should put the more recent matches that are easier to identify at the top of the list.
  • I like to compare my clusters to my siblings’ clusters to see where I match and where I don’t. I was also able to see where my clusters split my siblings’ clusters in two in some cases.

AutoClustering My Sister’s AncestryDNA

It seems like AncestryDNA is best suited for AutoClustering. Which is good, because many people have tested at AncestryDNA. In my previous Blog, I autoclustered my brother Jon. I was able to cross-reference his clusters to ones I had found for myself. In some cases there was no cross-reference. In some cases, my brother’s clusters helped identify my own clusters. In this Blog, I’ll look at my sister Heidi’s clusters at Ancestry.

Heidi’s Clusters look like this:

I have left out the names on the top and left for privacy. I like using 600 cM for a top limit and 25 cM for a bottom limit. For Heidi, this gives her 23 clusters. Heidi has 403 4th cousins or closer. My brother Jon has 381 4th cousins or closer at AncestryDNA and he had 20 clusters using the same upper and lower match limits that I used for Heidi.

Nigel – a Non-Clustered Match

First, I’ll mention Nigel. He is the first one on the AutoCluster Report who is mentioned as not being clustered. I think that this is significant. Nigel matches Heidi at 66 cM. This is a very high match for a 5th cousin once removed. Here is the Shared Ancestry Hint between Nigel and Heidi:

The match is high for our family, but not with other descendants of this couple. As a result, Nigel and Heidi are not in a cluster.
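The idea that a match with no shared matches ends up outside every cluster can be shown with a toy grouping. This is not how Genetic Affairs actually builds its clusters. It is only a simplified Python sketch that groups people who are linked through shared matches, with placeholder cousin names.

```python
def simple_clusters(shared):
    """shared: dict of name -> set of names that person shares matches with.
    Returns (clusters, unclustered). People linked through shared matches are
    grouped together; anyone with no links is left unclustered."""
    seen, clusters, unclustered = set(), [], []
    for start in shared:
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:
            name = stack.pop()
            if name in group:
                continue
            group.add(name)
            stack.extend(shared.get(name, ()))
        seen |= group
        if len(group) > 1:
            clusters.append(sorted(group))
        else:
            unclustered.append(start)
    return clusters, unclustered

# Placeholder shared-match lists; Nigel shares no matches in this range
shared = {
    "Nigel": set(),
    "Cousin A": {"Cousin B"},
    "Cousin B": {"Cousin A", "Cousin C"},
    "Cousin C": {"Cousin B"},
}
clusters, unclustered = simple_clusters(shared)
print(clusters, unclustered)
```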

Clusters By the Numbers

By this, I mean that I like to look at the highest matched clusters first. These are easiest to identify. Cluster 1 has the most people in it and the closest matches. This is because I have a lot of second cousins from my prolific Hartley/Snell great grandparents.

Heidi’s Clusters 1, 14 and 7

Here Heidi’s results are below and my brother Jon’s are above. What is interesting is that the top matches in Heidi’s and Jon’s first clusters are the same. However, for the Taylor match, the clusters point to different grandparent lines. This could partially be because Taylor is the daughter of our first cousin. Taylor matches us on both maternal grandparent lines.

Here is a tree with Nigel who I mentioned above:

Taylor is Cindy’s daughter. I find it interesting that there is a Cluster 14 and 7. Cluster 7 is Nicholson, but not Lentz. Cluster 14 is Nicholson and Lentz, but as Cluster 7 is already Nicholson, does this mean that Cluster 14 favors the Lentz side?

Heidi’s Clusters 10, 5 and 2

Heidi already has more maternal clusters than my brother Jon. Gladys is an interesting match. The common ancestors between Gladys and me were both Frazers. From what I can tell, two first cousin Frazers married each other.

Heidi’s Next Three Clusters – More Obscure?

One would expect the clusters to represent more obscure common ancestors as the match levels go down.

Here are the common ancestors for one of the people in Cluster 15 (William McMaster and Margaret Frazer):

This goes back to about 1790, so back to my 4th great-grandparents.

Here are my Parker/Hatch 4th great-grandparents:

They lived in Nantucket and Isaac had a whaling boat repair business there.

Cluster 9 goes into a black hole where I am stuck. This is likely on my Clarke or Spratt Line. Cluster 9 is also Heidi’s 9th cluster by size and already I am getting stuck identifying the ancestors.

That makes sense, though, because Jane Spratt above is my 2nd great-grandmother and I don’t know who her parents were. Two more generations out from Spratt would result in 3 new surnames that I don’t know about (or could only make guesses at).

Heidi’s Clusters 16, 17 and 18

These next three clusters came in order:

Anthony Snell is interesting as he fought in the US Revolutionary War. I don’t have specific common ancestors for Clusters 17 and 18. This brings us past the halfway point for Heidi’s clusters.

More Clusters for Heidi – The Brick Wall Zone

The bottom clusters for Heidi should be in the area where I am stuck on the genealogical paper trail side.

The question marks show that I am not sure who the common ancestors are for the above clusters. I have done some work on Heidi’s Cluster 21 matches. Here is my best shot at finding common ancestors at Cluster 21:


Here are the rest of the clusters:

In my brother Jon’s clusters, I only saw two maternal clusters out of his 20. Here Heidi has 7 maternal clusters out of her 23.
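As a quick check on the arithmetic, here are those counts as rounded percentages in a short Python sketch (the function name is just for illustration):

```python
def percent(part, total):
    """Share of the total as a whole-number percentage."""
    return round(100 * part / total)

heidi_maternal = percent(7, 23)  # Heidi: 7 maternal clusters out of 23
jon_maternal = percent(2, 20)    # Jon: 2 maternal clusters out of 20
print(heidi_maternal, jon_maternal)
```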

Here is how Heidi’s clusters compare with my brother Jon’s:

10 out of Heidi’s 23 Clusters had no corresponding cluster with her brother Jon. Two other of Heidi’s clusters (14 and 11) were not a perfect match with one of Jon’s clusters.

Summary and Conclusions

  • Heidi had about 30% maternal clusters compared to her brother Jon’s 10% maternal clusters
  • It was interesting to look at the specific ancestors that were in the clusters (when I was able to identify them). I was able to identify 10 ancestral couples
  • Many of Heidi’s clusters were not equivalent to her brother Jon’s clusters. This means that it is helpful to look at the different results for the different siblings.
  • Heidi’s clusters offer another piece of the puzzle in breaking down some of my family’s genealogical brick walls.




Back to the 1700’s With Joyce’s DNA

I was looking at Joyce’s Shared Ancestor Hints today at AncestryDNA. Joyce is my father’s first cousin. Here is an interesting match that Joyce had with Skylar:

This shows that 7 generations ago, Joyce and Skylar had the common ancestors of Samuel Snell and Mary Head. Samuel was born in 1708 and lived most of his life in colonial times. Samuel’s father was also a Samuel. He ran a tavern in Newport and is mentioned in many court cases. Here is a court case where the younger Samuel is also mentioned:

Samuel Snell of Newport, vintner, vs. Thomas Huxham of Newport, butcher, in the custody of the sheriff, for £17:2:9 due by book for money paid, wood, and drink sold and delivered and work done by the plaintiff’s son and servant Sam at sundry times beginning 16 October 1725 and ending 14 September 1726. Writ dated 16 February 1726[/7]. Accounting dated 8 March 1726/7 included a cord of walnut wood at 12s, money “paid John Platts on your account … my son Samuel helping you,” etc. Credit: mutton, veal, beef, etc. Several bills in the file.

The son and servant Samuel would have been 17 or 18 at the time of incidents mentioned above.

Colonial DNA

Along with the genealogical match there is also a DNA match to Skylar. I found Skylar also posted at Gedmatch. He and Joyce have this match:

Painting Joyce

This DNA can be painted to Joyce with a web tool called DNA Painter.

This is what Joyce’s colonial DNA looks like. This DNA is from Joyce’s Maternal side, so it is painted on the pink part of her Chromosomes 7 and 12. This is less than 1% of Joyce’s DNA. The further back in time the matches are, the smaller the matches are.

Some of Joyce’s English DNA To Go With the Colonial

Here is another of Joyce’s matches. I am more interested in her maternal side as that is where I am related.

This is a closer relationship. James Howorth was born about 1768 and lived in Bacup, Lancashire County, England. Anne is from New South Wales and is a 4th cousin once removed to Joyce.

This gets Joyce up to 1% painted:

Joyce’s Paternal Side

This is the side I’m not related on:

Here is Joyce’s 4th cousin. They appear to be related three different ways, but I’ll just pick the closest relationship. This appears to be Sumner at Gedmatch. Here is the DNA that Joyce and Sumner share at

Now Joyce is all the way up to 2% painted:

The light green didn’t show up well, but it is on Joyce’s paternal side. It is possible that some of these segments could go to Joyce’s other common ancestors with Sumner, but that would have to be sorted out later.

Sumner and Joyce Have an X Chromosome Match

Sumner and Joyce also match here:

The X Chromosome is interesting as it can only be inherited certain ways. Here is Sumner’s maternal side:

I have circled the likely path of X Chromosome inheritance for Sumner. The X could not be from Philip Winslow as the father does not pass down an X Chromosome to the son. Therefore, it is likely that this match comes from Lucy Chase.
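The X inheritance rule can be written out as a small recursive check: a male gets his X only from his mother, while a female gets one X from each parent. Here is a minimal Python sketch using a made-up pedigree with generic labels. It is not Sumner’s actual tree, and the dictionary layout is just an assumption for illustration.

```python
def x_ancestors(person):
    """Return names of ancestors who could have passed X DNA down to person.
    person: dict with 'name', 'sex' ('M' or 'F'), optional 'mother', 'father'."""
    possible = []
    mother = person.get("mother")
    father = person.get("father")
    if mother:  # everyone gets an X from their mother
        possible.append(mother["name"])
        possible.extend(x_ancestors(mother))
    if person["sex"] == "F" and father:  # only daughters get an X from their father
        possible.append(father["name"])
        possible.extend(x_ancestors(father))
    return possible

# Hypothetical pedigree with generic labels
great_grandfather = {"name": "great-grandfather", "sex": "M"}
great_grandmother = {"name": "great-grandmother", "sex": "F"}
grandfather = {"name": "maternal grandfather", "sex": "M",
               "father": great_grandfather, "mother": great_grandmother}
mother = {"name": "mother", "sex": "F", "father": grandfather}
tester = {"name": "tester", "sex": "M", "mother": mother,
          "father": {"name": "father", "sex": "M"}}

x_line = x_ancestors(tester)
print(x_line)
```

In this made-up tree the great-grandfather drops out because he could not pass an X to his son, which is the same reasoning that rules out the father in a father and son pair.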

This gets Joyce up to 3% painted. However, I have made a mistake as Lucy is on Joyce’s paternal side.

Summary and Conclusions

  • I started painting Joyce’s DNA
  • I painted two maternal matches and one paternal match
  • The paternal match (Sumner) also had an X Chromosome match with Joyce. This made it possible to trace that match to one likely ancestor instead of an ancestral couple.
  • All this DNA is from people who were born in the 1700’s.

AutoClustering My Brother at Ancestry

AutoClustering fans are happy that Genetic Affairs has the AncestryDNA autoclustering working again. I ran a report this morning for my brother Jon. I used an upper limit of 600 cM and lower limit of 25 cM. This gave me a manageable 20 Clusters.

I had been trying to get a similar autocluster for myself, but had trouble getting it to work for me. First, I notice that there appears to be a connection between Clusters 1 and 2 based on the grey squares.

Clustering By Size

I like to cluster by match size. That means that I sort my cluster list by largest match:

I push the cM arrow twice. This should put the arrow pointing down which will put the larger matches on the top. The highest match in this case is also Cluster 1 with the most people in it. Many of these people are my Hartley/Snell relatives who have tested at AncestryDNA.
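The same ordering can be reproduced from the match list itself by finding the largest match in each cluster and sorting on that. A minimal Python sketch, assuming the list is available as (name, cM, cluster) rows. The names and values below are invented.

```python
def clusters_by_top_match(matches):
    """matches: list of (name, cM, cluster number).
    Returns (cluster, top cM, top match name) rows, largest top match first."""
    top = {}
    for name, cm, cluster in matches:
        if cluster not in top or cm > top[cluster][0]:
            top[cluster] = (cm, name)
    rows = [(cluster, cm, name) for cluster, (cm, name) in top.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)

# Hypothetical matches, for illustration only
matches = [
    ("Ann", 310.0, 1),
    ("Bob", 95.5, 1),
    ("Cy", 240.0, 8),
    ("Dee", 180.2, 4),
    ("Eve", 60.0, 4),
]
ranked = clusters_by_top_match(matches)
for row in ranked:
    print(row)
```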

After that, I see my Clusters 8 and 4.

Clusters 1, 8 and 4

Cluster 1 is easy. This has many of my Hartley 2nd cousins. They descend from Hartley and Snell. I know one of the more distant relatives in this group descends from the Snell side only. The Snell side gets back to Colonial Massachusetts. My second great grandfather Isaiah Hatch Snell was born in 1837.

The top match in Cluster 8 is my 1st cousin’s daughter:

That would normally only identify this Cluster as maternal. However, in this case, I know that I am related to Otis on the Schwechheimer and Gangnus Lines. These two families lived in a German Colony in Latvia, where some of the families intermingled. Our common Schwechheimer ancestor was born in 1772.

Cluster 4: Nicholson/Ellis

This Cluster is led by Carolyn. I have been in touch with Carolyn and Joan and know that they both descend from Nicholson and Ellis. They were both from Sheffield, England on my mother’s side. William Nicholson was born in 1836.

Here is a summary so far:

This is good news. Out of the top three clusters, I have three out of my four grandparents represented. I know common ancestors.

The Next Three Clusters: 9, 6 and 18

Cluster 9 gives me my fourth grandparent side. The match is with Ron. Our common ancestors are Clarke and Spratt on my Frazer grandparent side. Our common ancestor Thomas Clarke was born about 1823.

Cluster 6 is on my Frazer/Frazer side. Clarke/Spratt is from the mother of my Frazer grandmother’s side. Frazer is from her paternal side. This line goes back a ways, but it has been well researched.

Cluster 18 has only two people in it, but it is a great cluster as it represents my Pilgrim ancestry. The first match in the Cluster and I descend from Harvey Bradford, who is a descendant of William Bradford from the Mayflower. Harvey Bradford was born in 1809.

Here is a summary of Jon’s top six Clusters:

The pink represents maternal and blue is paternal. Frazer/Frazer means that I had two Frazer ancestors who married each other.

Clusters 5, 10 and 14

At some point these Clusters will be more difficult to nail down.

Cluster 5 appears to center in on my Parker ancestors who lived on Cape Cod and Nantucket.

Cluster 10 has some Spratt names. This name is my biggest brick wall. My Spratt ancestor died young in County Sligo, Ireland and I can’t find much information about her.

Cluster 14 is not obvious to me. YK and John have a shared match with Gladys from Cluster 6. The third person has a Frazer tree. I would say that Cluster 14 is another flavor of my intermarried Frazer Lines.

So while Cluster 14 was not obvious at first, I was able to figure it out through Shared Matches.

Clusters 7, 11, and 2

I am now getting deeper into the less obvious clusters.

Some people in Cluster 7 match Ron. Ron and I share Clarke, Spratt and McMaster heritage back in Ireland.

I have been in touch with Patricia from Cluster 11. She has uploaded to Gedmatch. The match is definitely on my Frazer side and that should hark back to Ireland. My guess is the Clarke/Spratt Lines.

Cluster 2

Cluster 2 is a large one with connection to my Hartley 2nd cousins in Cluster #1 based on the gray squares. Just because there are many in a cluster does not mean that the cluster is easy to identify. This is the 12th cluster by size of match. There are 18 members in the Cluster. Peter has the highest match to Jon. Peter also has 62 Shared Matches at AncestryDNA.

Next, I’ll look at some of the trees from Cluster 2 Members. Candy has this ancestor in her tree:

This is her only listed ancestor in the area where my colonial Massachusetts ancestors lived. Looking at another Ancestry Tree, I find these parents for Betsey:

I see only one Swift in my genealogical list, but many Wings. So that is a possibility.

Another Cluster 2 person has Wing in his ancestry and other surnames from the area around SE Massachusetts where my ancestors lived.

Cross-referencing Jon’s Cluster 2

Next, I’ll look at my Clusters to see where Jon’s Cluster 2 people are. Peter is Jon’s top match in Cluster 2. Peter is in my Cluster 1. In my previous Blog, I identified my Cluster 1 as my Colonial Massachusetts matches. In fact, the first 12 in Jon’s Cluster 2 are in my Cluster 1.

William is in my Cluster 1, but falls below the 25 cM level for Jon. William also has a Wareham ancestor:

There are other possibilities.

Here is my 8th cousin Linda from my Cluster 1:

According to Ancestry, Linda and I match at 23.8 cM and we are 8th cousins with common ancestors in the 1660’s. Right now, this couple is as good a guess as any other.  However, this couple is out nine generations from Linda and me. At that level, I would have 32 couples that would be possibilities. These 32 are just my Massachusetts Colonial ancestors who lived around that time.  All I have to do is disprove the other 31 couples or link my Cluster 1 members or Jon’s Cluster 2 to Finney and Warren.
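For scale, the 32 couples are only a slice of the full tree at that depth. A quick Python sketch of the arithmetic, assuming no pedigree collapse (no cousin marriages reducing the count):

```python
def ancestor_slots(generations_back):
    """Number of ancestor positions in the tree at a given generation."""
    return 2 ** generations_back

generations = 9                      # 8th cousins share 7th great-grandparents
slots = ancestor_slots(generations)  # individual ancestors at that level
couples = slots // 2                 # ancestral couples at that level
print(slots, couples)
```

So the 32 Colonial Massachusetts couples are a subset of the couples at that level, which is why narrowing down to one of them is still a lot of work.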

Here is a summary of my top 12 Clusters:

At this point, I could give up or forge on into the unknown.

Forging On Into the Unknown with Clusters 3, 19 and 12

I’m at a loss for Cluster 3. For one thing, this is my brother Jon’s Cluster and I don’t have many notes on his matches. Perhaps a cross-reference to my clusters would help. Unfortunately, none of the people in Jon’s Cluster 3 are in any of my clusters. It’s a mystery. I suppose autoclustering more siblings may help.

Kitty from Jon’s Cluster 19 is in my Cluster 24.

Bonnie is in Jon’s Cluster 12. Again I don’t see any of Jon’s Cluster 12 members in any of my clusters. Bonnie has a Hulme ancestor from Manchester, England that might be worth pursuing.

Jon’s Last Five Clusters

I recognize Jon’s Cluster 20. One member has a McMaster ancestor that I believe is related on McMaster and Frazer sides. If I am right, our common ancestor William McMaster was born about 1790.

Cluster 13

None of Jon’s Cluster 13 members match my clusters. Fortunately Catriona who has a private tree is on Gedmatch and I can tell she is related on my Frazer grandparent side.

Cluster 15

Jon has a Shared Ancestor Hint here, so that makes things easier:

This match is also part of a Snell and a Luther Circle at AncestryDNA. This is another of Jon’s Clusters where I have no members in my clusters.

Cluster 16

I don’t see anyone in Jon’s Cluster 16 that is in any of my clusters.

Jon’s 20 Cluster Summary

By Cluster:

Comparing Jon’s Clusters To Mine

I was able to cross-reference Jon’s clusters to mine in most cases. However, 30% of the time, Jon’s clusters were not found among my clusters. Also some of Jon’s clusters that I was able to decipher more or less, I had not figured out on my clusters. Finally, Jon has a match with someone who goes back to our most recent male Bradford. This is a match that I don’t have, but the cluster is one that has been identified.

Summary and Conclusions

  • I autoclustered my brother Jon’s matches at a lower level of 25 cM and upper level of 600 cM. That was a good level for Jon and resulted in 20 Clusters
  • I looked at Jon’s clusters starting with the largest matches. The higher match clusters were easy to figure out. At about halfway down the list, the common ancestors began to get more difficult to figure out.
  • I was able to find many common ancestors. I tried finding common ancestors for one of my Colonial Massachusetts clusters, but that was difficult.
  • Many of Jon’s clusters with matches near the last half of Jon’s list had no corresponding cluster for my matches. I found this to be interesting. This would lead me to look at more of my sibling clusters.
  • 18 of 20 (90%) of Jon’s clusters were on his paternal side.
  • Finally, I cross-referenced Jon’s clusters to many of my own clusters. This showed where Jon’s clusters did or did not match mine. In some cases, Jon’s clusters identified some of my own clusters that I had not figured out yet.



Ancestry AutoClustering Back in Action

I noticed that the Genetic Affairs Facebook site had a recent post. They said that as a Christmas present Ancestry AutoClustering was back in operation with some new controls to limit problems with the autoclustering. Ancestry AutoClustering has been popular. That is because AncestryDNA has the largest database of DNA-tested people but they are lacking in analytical tools.

My AncestryDNA AutoCluster

When AutoCluster first came out, I tried it at the low default settings. I wrote a Blog about those results here. Here are my annotated results:


I was impressed with the results and even though my clusters were small based on the default parameters, I liked the simplicity of the five clusters.

Here is my latest try at autoclustering. Now I used limits of 600 cM on the high end and 9 cM on the low end:

Now I have gone from five clusters to 76.

My Genealogy and Deciphering Some of the 76 Clusters

This tree goes to 16 branches. I suspect that 76 branches could go back at least two or three more generations than above. I have a lot of Hartley relatives as my great-grandparents had 13 children. My great-grandmother Snell had colonial Massachusetts ancestry. That means that I have a lot of 2nd cousins.

My Hartley 2nd Cousins

My Hartley 2nd cousins are not found in Cluster 1, but in Cluster 4:

These are my top 13 clusters. In my previous analysis, the present Cluster 4 was #1. By expanding the matches out to more distant matches, the new Clusters 1-3 beat out my former #1 Hartley 2nd cousin Cluster. Along with my 2nd cousins in Cluster 4 above are a few more distant cousins.

Massachusetts Colonial Ancestors – Cluster 1

Cluster 1 appears to include many of my more distant Colonial Massachusetts ancestors going back past my Hartley side. My closest match in Cluster 4 is my father’s cousin Joyce. My closest match on Cluster 1 is Jonathan – a relative of Joyce. Previously, Jonathan was in my old Cluster 1 also. Now he is a ringleader for my Colonial matches.

Other than Jonathan, I cannot pinpoint exact common ancestors for matches in Cluster 1 at this point.

My Largest Matching Clusters

Next, I am going to change my strategy. I will now sort by match on my Cluster List:

I clicked on the cM button until the arrow was pointing down. This gives me the clusters with the largest matches. Hence, the matches that I am likely to know about. The highest matching cluster is #4. #12 is the 2nd highest match. That is because it includes my 1st cousin’s daughter (on my mother’s side). That means that Cluster #12 could be either on my mother’s mother’s side or my mother’s father’s side.

The next Cluster by size is #1 with Jonathan.

Cluster 16 – Nicholson

The next cluster by size goes off my present image, so I will need to ratchet down the image. Cluster 16 has only nine people in it, but I have been in touch with many of them. The known people in this group descend from William Nicholson and Martha Ellis:

Cluster 27 – Clarke

Cluster 27 is important to me. Clarke is my largest brick wall. I will have to go down yet another level for Cluster 27.

I’m starting to use the Key for these higher number clusters.

My Top 23 Clusters by DNA Match Level

Here are my top clusters by match level in a spreadsheet:

This shows that the highest matches are on the paternal sides and on that paternal side, most of the matches are on my Frazer grandparent side.

I can also sort by cluster:

This shows that I am missing Cluster 3 even after looking at my top 23 clusters.

Cluster 3 – Mom’s Side

That makes me curious about Cluster 3. From the match list, I see that the top match is at 27.1 cM. This person has a large private tree, but hasn’t logged in to Ancestry for over a year. This group of matches is a bit of a mystery. I know that this cluster is maternal and probably the Lentz rather than the Rathfelder side as the Rathfelder matches are on the rare side.

Old Cluster and New Cluster

My original AutoCluster was done at conservative default levels and resulted in five clusters.

The old Cluster 1 is found in new Clusters 1 and 4. Old Cluster 2 is now Clusters 6 and 27. Old Cluster 3 is now Cluster 11. Old Cluster 4 is now Clusters 17 and 71. Old Cluster 5 is now Cluster 19.

AncestryDNA Circles

It occurs to me that it would be helpful to compare clusters to the AncestryDNA circles. Here are my circles:

Nicholson, Ellis and Lentz are maternal and the rest are paternal. Nicholson and Ellis are both in Cluster 16. This points out an error I made on my spreadsheet:

I previously had my Nicholson Cluster as 11 and it should have been 16. My mother’s Lentz circle was emerging and the few matches were either not matching me or too low to be in a cluster.

The Mary Pilling Circle is interesting as this goes back to England. However, those in the circle who are not my 2nd cousins are a match to the circle and not to me.

Descendants of Anthony Snell Circle

I have a similar problem here. There are two people who are not second cousins to me that match me by DNA, but they match at levels below 20 cM. If I check the shared matches of one of these matches, I see that he matches Fred from Cluster 30. That is perhaps a hint as to where I may find a common ancestor with Fred. Shared matches with another person in this circle also lead me to three people who are in Cluster 30.

I believe that the Betsey Luther circle is somewhat redundant. She was the wife of Anthony Snell. Finally, the Churchill Circle. I match second cousins and others in the circle match those second cousins or closer matches. If I run clusters for others in my family, these relationships may be helpful.

This shows that three of my circles are associated with my second cousins in Cluster 4. Shared matches from the Snell circle brought me down to Cluster 30. The two circles for my mother’s side were the husband and wife Nicholson and Ellis. My mother had another Lentz circle but the matches were too low for me. When I look at my mother’s matches, I may find closer matches.

NADs and AutoCluster

NADs are New Ancestor Discoveries. Here are my NADs:

I have no idea who these people are.

The Long NAD

Here are some of the people in the Long NAD:

The orange indicates a match to me. So these are like circles or clusters also. The only difference is that these NADs are pointing to ancestors that I don’t know about. I may not know about them because they may not be my ancestors or the ancestors may be further back than the ones Ancestry is pointing to.

Brenda is in Cluster 7. I didn’t try to identify Cluster 7 above as it wasn’t in the top 23 clusters. This means that I can associate Cluster 7 with my Long NAD. I associate the Long name with Ireland. However, this family was from North and South Carolina. Angela is also in the NAD and in Cluster 7. She also matches Ron who is on my biggest brick wall – the Clarke/Spratt Line. Ron is in Cluster 27. Perhaps that indicates a relationship between these two Clusters. I did find one person who is in Cluster 7 who is not in the Long NAD. I’m not sure why. There are 21 in Cluster 7 and 31 in the Long NAD.

The Weems NAD

John Weems was from Tennessee. I see his connection even less than with Seymore Long. My matches to people in this group, when they do match, are below 20 cM. That means that I don’t have an analogous Cluster to this NAD.

Summary and Conclusions

  • I’m not done playing with AutoCluster yet. There is still more to explore.
  • My original AutoCluster looked at matches between 50 and 250 cM. In this AutoCluster run, I chose limits between 9 cM and 600 cM. The spreadsheet showed matches as low as 9 cM, but the html cluster chart showed matches only down to about 20 cM.
  • As I had so many clusters, I found it useful to look at the clusters with the highest DNA matches. These are the clusters that were, for the most part, easy to identify.
  • I compared the 76 cluster analysis with the 5 cluster analysis I did.
  • AutoCluster does a great job of condensing huge numbers of AncestryDNA matches and putting those matches into categories.
  • AutoCluster gave me a sense of how many matches I had that were maternal or paternal and from which grandparent side those matches came from.
  • Next, I would like to look at a lower threshold of 25 cM to narrow down the number of clusters that I get.
  • I looked at how AncestryDNA circles related to Clusters.
  • Next I looked at my two NADs. One NAD had an analogous Cluster. The second NAD had matches that were too small and didn’t have an analogous cluster.




Raw DNA Phasing Six Siblings with One Parent – Part 2 Heterozygosity

Homozygosity – Zero to Eighty in One Day

In my previous post, I discussed Homozygosity. In that post, I got my brother Jim from zero phased to 80% phased in one day. Although raw data phasing is considered advanced, the principle of homozygosity is very simple. It just means that you have two alleles at a location that are the same. If your parent has two alleles the same, then you got that allele from that parent’s side. If you have two alleles the same, then you got that allele from both your mother and your father.
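Those two rules can be written as a small function. This is a minimal Python sketch rather than the Access queries I actually used. It assumes genotypes are two-letter strings such as AG and that only Mom has been tested.

```python
def phase_by_homozygosity(child, mom):
    """child, mom: two-letter genotypes at one location, e.g. 'AG'.
    Returns (allele from Mom, allele from Dad), with None where unknown."""
    from_mom = from_dad = None
    if child[0] == child[1]:
        # Child is homozygous: the same allele came from both parents.
        from_mom = from_dad = child[0]
    elif mom[0] == mom[1] and mom[0] in child:
        # Mom is homozygous: she could only have passed that allele.
        from_mom = mom[0]
    return from_mom, from_dad

print(phase_by_homozygosity("CC", "CT"))  # child homozygous
print(phase_by_homozygosity("AG", "AA"))  # Mom homozygous
print(phase_by_homozygosity("AG", "AG"))  # neither rule applies
```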

Heterozygosity – Two Different Alleles at a Location

Heterozygosity is a little more complicated. It means that you have two different alleles at the same location. Genetics tends to be binary, which to me is very simple. Binary is yes or no. You either have XX alleles at a location or XY at a location. A heterozygous result is XY.

Whit Athey and Heterozygosity

Whit Athey has a paper on Raw DNA Phasing here. This is his third principle:

Principle 3 — A final phasing principle is almost trivial, but it is normally not useful because there is usually no way to satisfy its conditions: If a child is heterozygous at a particular SNP, and if it is possible to determine which parent contributed one of the bases, then the other parent necessarily contributed the other (or alternate) base. This principle will be very useful in the present approach.

Where is Jim Heterozygous?

I need to look at Jim’s Raw Data File. I’ll ask Access to find Jim’s alleles that are different:

Jim is heterozygous at a little under 200,000 locations:

Where am I going with this? In the last line above, Jim is AG. If I know mom is A, then Jim has a G from Dad at that location.
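For anyone who would rather see this rule written out than click through Access, here is a minimal Python sketch of it. The function names and sample genotypes are hypothetical and are not taken from the Access tables.

```python
def is_heterozygous(allele1, allele2):
    """True when both alleles are called and they differ."""
    return bool(allele1) and bool(allele2) and allele1 != allele2


def allele_from_other_parent(allele1, allele2, known_allele):
    """Given a heterozygous genotype and the allele known to come from one
    parent, return the allele that must come from the other parent.
    Returns None when the rule does not apply."""
    if not is_heterozygous(allele1, allele2):
        return None
    if known_allele == allele1:
        return allele2
    if known_allele == allele2:
        return allele1
    return None  # the known allele matches neither call, so leave it unphased


# Jim is AG and Mom gave him an A, so the G is from Dad.
print(allele_from_other_parent("A", "G", "A"))  # G
```

The function returns None rather than guessing when the known allele does not match either call, since that usually points to a no-call or a data problem.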

Getting Dad Alleles From Mom

In this Query, I am taking all of Jim’s Mom alleles that have no corresponding Dad alleles. These are the alleles that he got from his Mom being homozygous. Then I linked those results to Jim’s heterozygous results. That ends up looking like this:

There are over 96,000 locations where we can fill in a Dad allele for Jim. In the first line above, Jim has a C from his Mom. Jim’s results are C and T, so the T has to be from Jim’s Dad.

Putting It All Together – Adding Jim to My Other 5 Siblings

I could figure out how to get the T into the JimFromDad Column above. But I really need to get Jim into the Table I already have with his 5 other siblings and Mom. It would be nice to add Mom’s FTDNA results to that table also. Right now that Table has 26 columns and I want to add more.

Here is the structure of the existing 5 sibling table:

I wasn’t too consistent on my capitalization. The sibling Dad alleles are grouped together as are the sibling Mom alleles. This is for comparison. These sets of Mom and Dad alleles will form a pattern that will determine the crossovers. The above table is called tbl5SibsHeteroMomtoDad, so it is at about the same stage that I am with Jim.

I’ll try this query to add in Jim’s Alleles 1 and 2:

Here I made an unequal join, but I don’t think that will work. I want everything from Jim’s list and everything I already had in the 5 sibling list. This will probably call for an Append Query.

In order to perform an Append Query, I need to have the same column headers. I copied the 5 sibling table and pasted it as a six sibling hetero mom to dad table. Then I added some columns for Jim:

I’ll also add some Mom and Dad allele columns for Jim. Next I open up Jim’s original download table into a Query:

I select Append at the top and choose the Table I want the data to be appended to:

I choose View to see what I have and it shows 720,449 records which sounds right. Then I choose Run.

This didn’t get me what I wanted. It added an extra row for Jim. When I sort by RSID, it looks like this:

It is giving Jim an extra row for his results, which I don’t want. Abort this mission.
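The difference between appending and merging by RSID can be shown with a tiny Python sketch. The two miniature tables and their column names are made up for illustration. This is not how Access does it internally, just the same idea in a few lines.

```python
# Hypothetical miniature versions of the two tables, keyed by RSID.
five_sibs = [
    {"rsid": "rs1", "Lori1": "A", "Lori2": "G"},
    {"rsid": "rs2", "Lori1": "C", "Lori2": "C"},
]
jim = [
    {"rsid": "rs1", "Jim1": "A", "Jim2": "A"},
    {"rsid": "rs3", "Jim1": "T", "Jim2": "C"},
]

# An append just stacks Jim's rows under the existing ones,
# so rs1 ends up on two separate rows.
appended = five_sibs + jim

# A keyed merge keeps one row per RSID and fills in Jim's columns.
merged = {}
for row in five_sibs + jim:
    merged.setdefault(row["rsid"], {"rsid": row["rsid"]}).update(row)

print(len(appended))  # 4
print(len(merged))    # 3
print(merged["rs1"])
```

The merged version has one row per RSID, which is the shape needed for comparing siblings side by side.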

A Right Join and a Left Join?

I can go back to my original thought. However, it will take two steps.

First I want to note that there are 942,647 rows in the 5 Sibling Table. There are 720,449 in Jim’s raw data table. I don’t want to lose any data along the way. I put an ‘is not null’ into Jim’s allele 1 column and got 720,449 rows of data, so one query was enough. I like this so much, I’ll make a Table out of it:

This didn’t work so I tried repairing and compacting the Access database again. That seemed to solve the problem.

Now I have a new six sibling table. For five siblings, I have the Mom and Dad alleles of the first three steps. For Jim, I just have his raw data included so far.

Mom Vs Mom – Ancestry and FTDNA Results

Now I am wondering if I need to add Mom’s FTDNA raw DNA data to my table. Mom has 701,478 rows or positions at AncestryDNA. Mom has 711,398 rows at FTDNA. That is almost 10,000 rows difference, so I guess it is worth it. It could make the Table more complicated for comparisons. If I can combine Mom’s alleles into two columns instead of four, that would be better.

Here is my comparison of FTDNA vs. AncestryDNA for my mom:

This query will return all the RSID’s that are in FTDNA but not at AncestryDNA:

That is over 23,000 results. I will need these, if I am to recreate Jim’s results.
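The same "in one kit but not the other" question can be sketched in Python as a set difference. The function name and the sample RSIDs are hypothetical.

```python
def only_in_first(first_rsids, second_rsids):
    """Return the RSIDs present in the first kit but missing from the second."""
    return set(first_rsids) - set(second_rsids)


ftdna = ["rs1", "rs2", "rs3", "rs4"]
ancestry = ["rs2", "rs4", "rs5"]

print(sorted(only_in_first(ftdna, ancestry)))  # ['rs1', 'rs3']
print(sorted(only_in_first(ancestry, ftdna)))  # ['rs5']
```

This also shows why the count can be larger than the difference in row totals. The two files differ by only 9,920 rows, but over 23,000 RSIDs are unique to FTDNA because AncestryDNA has its own RSIDs that FTDNA does not test.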

Getting Mom’s Extra FTDNA Results into My Six Sibling Table

First, I created a Query to find out how many RSIDs mom had from FTDNA that were not already in the six sibling table:

This tells me Mom has 17,935 positions tested that are not in the Six Sibling Table. However, if those are positions that Jim has, I will want to add those also. I checked and Jim has 17,835 positions tested out of mom’s 17,935. I was curious, so I checked Jim’s positions that weren’t in the Six Sibling Table and got 20,622. These are the details that bog me down.

Appending to the Six Sibling Table

I want a good Six Sibling Table, so I’ll append the 20,000 positions that Jim has that are not in the table.

Here is my Append Query:

The Query says to add only Jim’s raw DNA data to the Six Sibling Table that isn’t already there. When I view what is to be appended, I get the right amount of rows.

When I hit the Run button, I get this error:

I was wondering if that would be a problem. I don’t want the extra rsid column. I need to save the underlying Query first. I did that and had the same problem, so I made a table out of the Query.

That looks better. I think this will work:

This worked. So now my master table has 963,269 rows. Bigger is better as long as it is good data.

My plan was to add my mom’s alleles, but so far I have only added Jim’s. When I now check the positions that Mom has that are not in the Six Sibling Master Table, I only get 100. There are actually only 99 extra rows as one was a header that I deleted:

I’ll follow the same procedure. I’ll make a small table for Mom and then append it. I’m not sure of the significance, as Mom may have no siblings corresponding to these alleles at this time.

Here is the new Master Table with Mom and Jim appended:

One More Master Table Adjustment

On the Six Sibling Master Table I added a place for Jim Dad and Mom alleles:

I probably should have done this before I phased Jim. However, the advantage is that I have Jim’s results separate from this table that I can check on. I can now re-do the processes to get Jim’s phased alleles or try to copy what I had into this master table. [Note I try to copy Jim’s results below, but the results are not good, so I end up recreating his results in the Master File that has the results of all six siblings in it. See section called Plan B below.]

I’ll try to use an Update Query to get Jim’s phased alleles into this master table.  Here is my Google search for Update Query:

Actually I thought of an easier way:

Here I took the whole Six Sibling Table and replaced Jim’s phased alleles where he had none. I only get one shot at this, so before I do this, I’ll add Jim’s heterozygous phased alleles to his two homozygous alleles.

An Append Query for All of Jim’s Phased Alleles

I appended Jim’s Heterozygous phased alleles to his homozygous phased alleles.

Here is the point at which Jim’s phased alleles are based on what he got from his mom to what he got from his dad. There are only two problems:

  1. The name of the Table is now wrong, so I need to change it;
  2. I never added in the alleles that Jim got from Dad. That is OK as I have the information to do it.

Adding Jim’s Heterozygous Dad Alleles Based on Mom’s Results

Now I am back to where I was before I took a detour of incorporating Jim and my Mom’s FTDNA results into my existing five sibling table.

Here I’m going to cheat a little and look to see what I did in the past:

Here’s my sister Lori. Back when I knew more what I was doing, I had an Update Query which said, ‘When Lori’s allele 1 was not the same as her allele 2 [heterozygous] and Lori had allele 1 from mom, put Lori’s allele 2 in her Dad spot’. Seems like that should work for Jim.

When I pressed View, I didn’t get any results. I have a guess as to the reason. This may have to do with the situations where Jim got his Mom allele and he had no results for himself. I tried this different ways and could not get this to work unless I took out the expression: <>[Jimallele1].

This makes me think that something is wrong with the Table. I checked for duplicates in the table and got 96,222.

So this is good to know. At rs1000002 there are two results. One has Jimallele 1 and 2 results and one does not. However, at rs10000300 there are no Jim alleles and there is only one result.
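A duplicate check like this one is easy to sketch in Python. The rows below are invented to mirror the two cases just described, and the allele values in them are placeholders.

```python
from collections import Counter


def duplicated_rsids(rows):
    """Return a dict of RSID -> row count for every RSID that appears more than once."""
    counts = Counter(row["rsid"] for row in rows)
    return {rsid: n for rsid, n in counts.items() if n > 1}


rows = [
    {"rsid": "rs1000002", "Jimallele1": "C", "Jimallele2": "T"},
    {"rsid": "rs1000002", "Jimallele1": None, "Jimallele2": None},
    {"rsid": "rs10000300", "Jimallele1": None, "Jimallele2": None},
]

print(duplicated_rsids(rows))  # {'rs1000002': 2}
```

Running a check like this right after every append would catch the problem before any update queries are built on top of the table.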

Plan B – Work with the Six Sibling Master Chart

I checked the Six Sibling Master Table for duplicates and didn’t find any. I think I’ll just work with that Table.

Here is Step one from Whit Athey:

Principle 1 — If a person is “homozygous” at a location—that is, having the same base on each of the two chromosomes of a pair, then obviously at that location it is possible to know with certainty that both chromosomes of the pair have that base at that location, but this is an almost trivial form of phasing. 

I had a little practice trying to get the Update Query to work. Now I’ll try it on Jim’s results in the Master File. Unfortunately, I am still getting no results. I decided to go ahead and run the Update Query even though I saw no results in the View mode. This was after making a backup of the Six Sibling Master File. It looks like the update worked.

The Update Query was quite simple. It said if Jim’s allele 1 and 2 were the same, then give that allele to his Dad and Mom side.
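Written out in Python, that update might look something like the sketch below. The column names (allele1, allele2, from_dad, from_mom) are stand-ins for illustration and are not the actual field names in the Master Table.

```python
def phase_homozygous(row):
    """If both of the child's alleles are called and equal, copy that allele
    into both the Dad and Mom columns. Modifies the row in place and
    returns True when an update was made."""
    a1, a2 = row.get("allele1"), row.get("allele2")
    if a1 and a1 == a2:
        row["from_dad"] = a1
        row["from_mom"] = a1
        return True
    return False


table = [
    {"rsid": "rs1", "allele1": "A", "allele2": "A"},    # homozygous: phased
    {"rsid": "rs2", "allele1": "A", "allele2": "G"},    # heterozygous: skipped
    {"rsid": "rs3", "allele1": None, "allele2": None},  # no-call: skipped
]

updated = sum(phase_homozygous(row) for row in table)
print(updated)   # 1
print(table[0])
```

The check for a called allele matters, because two blanks are also "the same" and would otherwise be copied over as if they were phased results.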

Updating Mom’s Homozygous Alleles to Jim

The next Update Query will be similar:

This says if Momallele 1 and 2 are the same, give Jim one of those on his maternal side. Here is the warning:

Here are some of the results.

I hope to catch the blue line in the next query.

Updating Jim’s Heterozygous Alleles Where Jim Has a Mom Allele

This Update Query says when Jim is heterozygous and he already has his allele 1 in the mom spot, put allele 2 in the Dad spot.

I am down to a mere 34,000 rows on this Update.


Next, I want to switch the alleles:

When Jim’s allele 2 is in the Mom place put Jim’s allele 1 in the Dad place. That should fill in these blanks:

Here is a summary of what I have for phased alleles for me and my siblings:

One interesting thing is that Jon has 751,171 maternally phased alleles. Jon only tested at 668,942 positions. The additional results must be where Mom had results at positions that Jon didn’t test at. That is assuming that I didn’t mess up somewhere.

One More Query for Fun

This is looking for childrens’ missing alleles from Mom when Mom has two alleles that are the same. I found a few:

These are likely positions that were not tested by my siblings. I made a quick Update Query to add those Mom alleles in for my siblings.
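Here is a rough Python version of that kind of fill-in update. The column names such as mom1, mom2, JonFromMom and JimFromMom are hypothetical, chosen only to make the example readable.

```python
def fill_from_homozygous_mom(row, child_mom_columns):
    """Where Mom is homozygous, give her allele to every child whose
    maternal column is still empty. Returns how many cells were filled."""
    mom1, mom2 = row.get("mom1"), row.get("mom2")
    if not mom1 or mom1 != mom2:
        return 0  # Mom is a no-call or heterozygous, so nothing can be inferred
    filled = 0
    for column in child_mom_columns:
        if not row.get(column):
            row[column] = mom1
            filled += 1
    return filled


row = {"rsid": "rs1", "mom1": "G", "mom2": "G", "JonFromMom": None, "JimFromMom": "G"}
print(fill_from_homozygous_mom(row, ["JonFromMom", "JimFromMom"]))  # 1
print(row["JonFromMom"])  # G
```

Existing values are left alone, so running it twice does no harm. It also illustrates how a sibling can end up with more maternally phased positions than he actually tested: the allele is inferred from Mom alone.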

Summary and Conclusions

  • I started out looking at my brother Jim’s heterozygosity. I found out where he could have an allele assigned to his paternal side in the case where we knew his maternal allele.
  • I worked on getting Jim’s results into the master file I have for his other 5 siblings. I also added some more of my mom’s alleles that were from FTDNA and not included in her previous AncestryDNA results.
  • I tried to get Jim’s alleles phased before I brought them into the five sibling file, but I ended up with duplicate results.
  • I decided to work with the Six sibling file which had no duplicates and recreate Jim’s phased alleles based on the principles of homozygosity and heterozygosity. I was able to do this quickly with Access Update Queries.
  • I now have a large master file with 30 columns. These columns have the raw data for my mom and her six children as well as their alleles that have been phased so far. I will be working with the last 12 columns in the upcoming Blogs. These are the paternally and maternally phased alleles. They will form patterns that will tell me where the crossovers are.



















Raw DNA Phasing Six Siblings with One Parent – Part 1 Homozygosity

I have written many Blogs on raw DNA phasing with my siblings and my mother. I have done this phasing using a Whit Athey paper and MS Access. I had my last sibling tested this past Summer, so thought I would see how his phasing would work using this method. The goal of this phasing would be to get four files of data representing the DNA from my four grandparents. I have four such files already, but they were created by M MacNeill a while ago and he didn’t have all my siblings’ data at the time. I would like to learn how to upload these files on my own.

Jim’s Raw DNA

Jim was my last brother to be tested. He was tested at FTDNA as that was the kit I had at the time. The first step is to find Jim’s DNA download from FTDNA and extract it. However, before I do that, I need to know what build to download. As I look at my old blogs, it appears that I was working in Build 37. Gedmatch has historically used Build 36. However, Gedmatch is being migrated to Genesis. Here is a comment I found on Facebook:

All Genesis tools natively work in B37 (meaning that all matching is done based on B37), but we decided to map all of the B37 positions to B36 and B38 when printing out segment start/end positions, with the choice given to the user which to display.

We will begin to migrate this to other tools as soon as we can. I hope you find it useful.

Build 37 DNA

All this to say that I want Build 37. I assume that I used Build 36 for Gedmatch, so I’ll do a new download for Jim:

I chose Build 37 Raw Data Concatenated. Unfortunately, my computer wants me to find an app to extract this file.

In the past, I have used Notepad, so I’ll try that. The gz file is about 6.4 MB. I can see Notepad was the wrong thing to open this with:

So I guess I need Winzip. I downloaded that and then opened Jim’s file. It opened as a csv file, but I saved it as an Excel File as that is what I will be using in Access. Jim has double A DNA:

Actually the DNA at the first position of his tested Chromosome was AA. He got an A from his dad and an A from his mom. Jim has a lot of DNA:

This shows that Jim has 720,450 tested DNA positions. That is pretty good. However, there are some positions that don’t have results indicated by –. Between my mother, me and my five siblings there are about 4 million autosomal results to look at.
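As an aside, a .gz download can also be read directly with Python's standard library, without Winzip. The sketch below counts the rows and the no-calls. It assumes the file is a gzipped CSV with a header row and a column named RESULT where no-calls appear as dashes. That layout is an assumption here, and the file name in the comment is made up.

```python
import csv
import gzip


def count_calls(gz_path):
    """Count data rows and no-call rows in a gzipped raw data CSV.
    Assumes a header row and a RESULT column where no-calls are dashes."""
    total = 0
    no_calls = 0
    with gzip.open(gz_path, "rt", newline="") as handle:
        for row in csv.DictReader(handle):
            total += 1
            # A result made up only of dashes (or an empty one) is a no-call.
            if row["RESULT"].strip("-") == "":
                no_calls += 1
    return total, no_calls


# Example call with a hypothetical file name:
# total, no_calls = count_calls("jim_build37_concatenated.csv.gz")
```

Note that this simple version would count any header line repeated in the middle of the file as a data row, so the total could be slightly high.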

One thing that I notice that is different from this AncestryDNA file:

AncestryDNA has a separate column for allele 1 and allele 2. That would be better for me as I am trying to separate these alleles out.

This looks better. However, when I try to import this into Access, I get this error:

My guess is that Access does not like the dashes where there are no results. So I’ll take out every dash in Jim’s DNA results. That was close to 60,000 dashes. I tried that, but I still got the error. One on-line suggestion was to compact and repair the database. That seemed to work, but there was this problem:

I didn’t realize that there was a new header at line 702542. That imported like this:

Also for the AncestryDNA files, the X Chromosome shows as Chromosome 23, which should work better. My import to Access took out the ‘X’. After I removed the internal header and changed the X Chromosome to 23, I imported Jim’s raw DNA with no problems.
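The cleanup steps described above (drop the extra header, remove the dashes, change X to 23, and split the result into two allele columns) could be scripted along these lines. This is only a sketch: the column order rsid, chromosome, position, result is an assumption about the download, and the sample rows are invented.

```python
def clean_rows(rows):
    """Clean raw (rsid, chromosome, position, result) rows for import.

    - skips header rows, including one repeated in the middle of the file
    - renames chromosome X to 23
    - removes no-call dashes
    - splits the two-letter result into allele1 and allele2
    """
    cleaned = []
    for rsid, chrom, pos, result in rows:
        if rsid.upper() == "RSID":
            continue
        chrom = "23" if chrom.upper() == "X" else chrom
        result = result.replace("-", "")
        allele1 = result[0] if len(result) > 0 else ""
        allele2 = result[1] if len(result) > 1 else ""
        cleaned.append((rsid, chrom, int(pos), allele1, allele2))
    return cleaned


raw = [
    ("RSID", "CHROMOSOME", "POSITION", "RESULT"),
    ("rs1", "1", "100", "AA"),
    ("rs2", "1", "200", "--"),
    ("RSID", "CHROMOSOME", "POSITION", "RESULT"),  # header repeated mid-file
    ("rs3", "X", "300", "AG"),
]

for row in clean_rows(raw):
    print(row)
```

No-call rows are kept with empty alleles rather than dropped, so the row count still matches the number of tested positions.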

Giving Jim some Maternal DNA

Now Jim’s DNA is in shape for doing something with it. The next step is pretty simple. Every time my mom has two alleles that are the same, we know that allele is maternal for Jim. I originally tested my mom at FTDNA, so it would make sense to download her DNA from there.

Importing Mom’s DNA to MS Access

I already learned a few things from downloading Jim’s DNA from FTDNA. I used the same steps for my mom, except that I didn’t delete the dashes to see if that would make a difference. That gave me an error, so I deleted the dashes. Now I am in business.

My mom has 711,398 locations tested at FTDNA. This is a bit less than Jim’s 720,450 tested locations.

Next, I want to see what happens if I compare Mom’s ID’s with Jim’s ID’s:

Here at Access I have an equal join between the RSID fields of both tables. That results in 709,632 positions that Jim and Mom have in common. When I compare the positions between the two, I get 712,452. That is more than my mom had, so that doesn’t make sense. Actually, I shouldn’t be comparing by position, because those are positions along each of the 23 chromosomes. There may be repeats. That is good to know.
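A small Python sketch shows why joining on position alone overcounts. The rows are made up, and the function simply counts how many row pairs an equal join on a given key would return.

```python
from collections import Counter


def count_joined(mom_rows, jim_rows, key):
    """Count the row pairs an equal join on the given key would produce."""
    mom_counts = Counter(key(row) for row in mom_rows)
    jim_counts = Counter(key(row) for row in jim_rows)
    return sum(n * jim_counts[k] for k, n in mom_counts.items())


# (rsid, chromosome, position): position 1000 occurs on two chromosomes.
mom = [("rs1", "1", 1000), ("rs2", "2", 1000), ("rs3", "3", 500)]
jim = [("rs1", "1", 1000), ("rs2", "2", 1000), ("rs4", "4", 700)]

print(count_joined(mom, jim, key=lambda r: r[0]))          # by RSID: 2
print(count_joined(mom, jim, key=lambda r: r[2]))          # by position only: 4
print(count_joined(mom, jim, key=lambda r: (r[1], r[2])))  # by chromosome + position: 2
```

Joining on position only pairs every row at 1000 with every other row at 1000, whichever chromosome it is on. Joining on RSID, or on chromosome and position together, avoids that.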

Where is Mom Homozygous?

If Mom’s Allele #1 is the same as Allele #2, that is called homozygous. I’ll perform this simple query on Mom:

I asked Access to show me where my Mom’s Allele 2 is the same as Allele 1:

Mom has that in 500,995 places. However, next, I need to get rid of the blanks:

I added Is Not Null to the criteria on the Result Column:

That gets me down to 496,136 homozygous positions. That means that more than 2/3 or almost 70% of Mom’s results are homozygous. Those are the alleles that will be Jim’s maternal alleles.
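The two conditions (alleles equal, and not blank) can be sketched in Python as below. The sample rows and column names are hypothetical. The last line just redoes the percentage from the counts given above.

```python
def homozygous_rows(rows):
    """Keep rows where both alleles are present and identical."""
    return [r for r in rows if r["allele1"] and r["allele1"] == r["allele2"]]


mom = [
    {"rsid": "rs1", "allele1": "A", "allele2": "A"},
    {"rsid": "rs2", "allele1": "A", "allele2": "G"},
    {"rsid": "rs3", "allele1": "", "allele2": ""},  # no-call: equal, but blank
    {"rsid": "rs4", "allele1": "C", "allele2": "C"},
]

print(len(homozygous_rows(mom)))  # 2

# Share of Mom's tested positions that are homozygous, from the counts above.
print(round(496136 / 711398 * 100, 1))  # 69.7
```

Without the blank check, the no-call row would be counted as homozygous, which matches the drop from 500,995 to 496,136 once the blanks were filtered out.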

Where is Jim Homozygous?

Where Mom is homozygous, we’ll add a Mom allele to Jim. But first, where Jim is homozygous, we will add a Mom and Dad allele. I created a simple query in Access:

I’m creating two new columns for Jim. One will give me the alleles that Jim got from his Dad and the other will give me the alleles that Jim got from his Mom. In the criteria row I have that Jim’s allele 1 must equal Jim’s allele 2. When that happens, put in Jim’s allele 1 into the column. That gives me this:


That gives me over 500,000 rows of paternal and maternal alleles for Jim. However, I do have blanks. When I filter the blanks out, I get 491,759 rows. That is a fast way to get almost 1 million alleles for Jim. Next, I’ll make a table of this query in Access. When I do this, I notice that Access has changed my query:

Access liked this better as it was simpler. I would think that JImallele2 does not have to be there twice, so I took one out and got the same result:

Access is trying to teach me to make better queries.

Adding Maternal Alleles from Mom

Here is a summary of where we are for Jim:


Just by assigning Jim’s own homozygous alleles to his paternal and maternal sides, he is now 71% phased. I also see that mom had 496,136 homozygous alleles. These need to be added to Jim’s homozygous results. However, I want to be careful:

  • When I add Mom’s alleles, I don’t want to erase the ones I already gave to Jim
  • There may be homozygous alleles that mom had that Jim didn’t even test for. These could be added to Jim as bonus alleles.
  • In adding mom’s homozygous alleles to Jim’s list, we also have to add in where the position of those alleles are on the Chromosome and the RSID.

First, I note that mom has 496,136 homozygous alleles. This is more than Jim’s homozygous alleles.

First, I’ll create a query for Mom’s homozygous alleles.

Here I want there to be a non-blank result and I want Mom’s allele 1 to be the same as allele2.

Next, I’ll check to see how many of Jim’s homozygous alleles are the same as mom’s homozygous alleles.

I’ll do this by an equal join on the RSID which is a unique identifier. Here is what I get from this query:

However, there are still blanks there. I had trouble getting rid of the blanks, but I can temporarily get rid of them by filtering the results.

This gets rid of about 17,000 blanks.

This tells me that Mom has 496,136 homozygous alleles, but 381,721 of those Jim already has. That means we need to add 114,415 maternal alleles to Jim’s list. That would get his AllelesFromMom up to 606,174.

Next, I want to get a list of all of Mom’s homozygous alleles that Jim doesn’t have, so we can add them to Jim’s list. There is a little trick to getting this in Access. First I create an unequal join:


On the left of the query above are all of Mom’s homozygous alleles. On the right are Jim’s homozygous alleles that match Mom’s homozygous alleles. The #2 radio box is checked. That means I want everything on Mom’s side and everything where the RSIDs are equal. However, in the criteria, I’ll put an ‘is null’ on Jim’s side:

This adds 97,451 of Mom’s homozygous alleles to Jim. This is less than the 114,415 that I was looking for. One guess is that these are positions that Mom had tested that Jim did not. Somewhere I lost about 17,000 of Mom’s homozygous alleles. Or this may have to do with the blanks in some of the tables. I was able to get rid of the blanks in Jim’s table and the new number came out right:
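For readers who think in SQL, the "everything from Mom, then is null on Jim's side" trick is a left join filtered on the missing side. Here is a minimal sketch using Python's built-in sqlite3 module rather than Access. The table names, column names, and rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mom_homo (rsid TEXT PRIMARY KEY, allele TEXT);
    CREATE TABLE jim_homo (rsid TEXT PRIMARY KEY, allele TEXT);
    INSERT INTO mom_homo VALUES ('rs1','A'), ('rs2','C'), ('rs3','G'), ('rs4','T');
    INSERT INTO jim_homo VALUES ('rs1','A'), ('rs3','G');
""")

# Keep every row from Mom's side, then keep only those with no partner on Jim's side.
missing = conn.execute("""
    SELECT m.rsid, m.allele
    FROM mom_homo AS m
    LEFT JOIN jim_homo AS j ON m.rsid = j.rsid
    WHERE j.rsid IS NULL
    ORDER BY m.rsid
""").fetchall()

print(missing)  # [('rs2', 'C'), ('rs4', 'T')]
```

A handy sanity check is that Mom's total should equal the matched rows plus the missing rows. Here that is 4 = 2 + 2, the same bookkeeping as 496,136 = 381,721 + 114,415.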

Adding 114,000 Maternal Alleles to Jim

Now that I’ve found 114,000 maternal alleles for Jim, I’d like to add them to his table. There are probably a few ways to do this in Access. One way is called Append Table. I’ll try that as I will need that later on in the process. If only I remembered how to do that. I could put Jim’s table into Excel and just add Mom’s table. However, I’m not sure Excel will appreciate the large files.

The directions that I found for Append Query said to use the data you want to copy first. That was in this Query:

What I want to add is from a Query called Mom Homo Jim Missing. These were Mom’s extra alleles. I chose to append these to a table called Jim Homozygous. But on second thought, I want it going to a new table, so I’ll copy Jim Homozygous and call it Jim Plus Mom Homozygous. First I want to review the results using the view button. I guess it looks right. It only shows the records to be added. Then I push Run and I get a warning saying that this cannot be undone.

Here is what the Appended Table looks like:

This is the point at which the appending took place. What I wasn’t expecting was that Access added the ID. This is the ID that Access originally assigned to the raw data. So now I have Jim’s ID’s and Mom’s ID’s in the same Table.

Phased Allele Update Alert

These two operations based on homozygosity alone put Jim’s phased alleles at over one million. Bing, bing, bing. Jim is already almost 80% phased. Maternally, he is close to 88% phased.

Other Phasing – Visual

I’m not the most experienced raw data phaser in the world, but I have worked on three, four, five, and now six sibling raw data phasing. I have also done a lot of work with three, four, five and six sibling Visual Phasing. Here is Chromosome 1 using the Steven Fox Spreadsheet:

I can use the raw data phasing to confirm the Visual Phasing. I can also use the Visual Phasing to know where to look for crossovers. For example, I already see a problem with the map above in the bottom right corner. I will need to change the crossover designations there.

The other reason stated at the top of the Blog is that I should be able to create a file to upload to Gedmatch for each of these four grandparents. That could make searching for DNA matches easier.

Summary and Conclusions

  • I started phasing the sixth of six siblings based on homozygosity.
  • Using homozygosity alone, I got my brother Jim up to 80% phased.
  • Raw data phasing is considered an advanced topic, but the basics are quite simple. If you have two alleles that are the same, one must be from your father and one from your mother. If you are a parent and you have two alleles that are the same, you had to have passed down that same allele to your child.
  • I also used MS Access which is best suited for large databases.
  • My goal is to get four grandparent files to upload to Gedmatch (or Genesis). In the past, I have run out of steam on these projects.
  • I will be able to use my past work on visual phasing as a roadmap to finding crossovers and assigning grandparents.
  • I should be able to use my past raw data phasing experience to streamline the process.
  • With six siblings, I am expecting good results. However, as in the Visual Phasing process, the more siblings you have, the more combinations of sibling comparisons you have to look at.
  • Next up, I expect to look at heterozygosity.
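
To put a number on the sibling comparisons mentioned in the list above, the count of sibling pairs grows as "n choose 2". A quick Python check (the function name is just for illustration):

```python
from math import comb


def sibling_pairs(n):
    """Number of distinct sibling-to-sibling comparisons among n siblings."""
    return comb(n, 2)


for n in (3, 4, 5, 6):
    print(n, "siblings:", sibling_pairs(n), "pairs")
```

Going from five siblings to six raises the number of pairwise comparisons from 10 to 15.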




















Fun With an AncestryDNA Lentz Circle

My Lentz Line has been difficult to nail down. The genealogy has been difficult and it has been difficult to assign a lot of DNA to Lentz ancestors.

My Lentz Circle at AncestryDNA

Ancestry has been helpful in the Lentz area. Here are my AncestryDNA Circles:

Lentz is one of my smallest circles with 9 members:

Six of those 9 members are from my family. That leaves two other groups with a total of three people in them. In the Deborah Family group, there are two Deborahs. They appear to be mother and daughter. I built out the tree of the mother and found a common ancestor in John Lentz. Then I found the tree of the daughter Deborah and she had already built out her tree as seen here:

John Lentz is on the younger Debbie’s mother’s father’s father’s line or her great-grandfather Davenport’s Line. This matches up well with my Lentz Web Page:

I was unclear as to whether John had one or two wives. Debbie has identified the wife as Elisabeth Riehl. I didn’t follow the line down of William Andrew. However, I have more information on my Ancestry Tree, which puts my Web Page out of date:


Lentz DNA

One interesting thing is that I do not match either Deborah at AncestryDNA. They do, however, match my mother and some of my siblings. Here is my mom’s match with the elder Deborah:

What is more interesting is that the younger Debbie uploaded her DNA results to Gedmatch. This is what the match looks like between Debbie and my Mom:

By DNA, my mom, Gladys and the younger Debbie could be fourth cousins. However, Debbie and her mom match my mom at about the same amount of DNA. That means Debbie’s mom passed down all the Lentz DNA that matches my mom to her daughter. This DNA match is on the shortest Chromosome.

Visual Phasing for My Siblings – Chromosome 22

I performed visual phasing on my DNA. Here is what I had for Chromosome 22:

This matches up with what Gedmatch shows as Debbie’s matches with my family:

In this case the reportable matches start at about 15M, so that is where Jim, Heidi and Lori have Lentz DNA shown in green on the left hand side of my Chromosome 22 map above.

A Lentz DNA Tree

I have drawn a tree of the Lentz descendants who have had their DNA tested. I had missed Debbie, so she is not there yet:

I am on the left side of the tree. I also descend from the Nicholsons and get a lot of matches with that family. The right side of the tree is more specific as I have no Nicholson relatives there, but the relationships are further out. I am already tracking two people from the William Andrew Line there.

Here are the two Deborahs added in:

This shows that my mom is a fourth cousin to the elder Deborah and I am a 5th cousin to the younger Deborah.

Here is how Debbie matches Radelle, Al and Stephen on Chromosome 12:

This suggests triangulation between these four people which would indicate a common ancestor:

My mom matches Radelle and Deborah, but on different Chromosomes. Hence, the Ancestry Circle.

Painting Debbie’s Match to My Mom

This is what I had previously for my mom’s John Lentz DNA based on her match with Radelle. That match is in dark green.

I need to add Mom’s Lentz DNA to Chromosome 22:

This doesn’t look like much, but it doubles what my mom had on Chromosome 22 previously.

Summary and Conclusions

  • Reviewing my AncestryDNA Circles led me to a Lentz descendant who I had overlooked.
  • One of the people in the Circle had uploaded her DNA to Gedmatch. I had seen her match before, but didn’t know exactly how we connected on my mother’s line.
  • Because Debbie uploaded her DNA to Gedmatch, I was able to tell exactly where she matches different Lentz descendants.


My Children’s Maternal Genealogy – Part 5: Gately

In my previous Blog, I showed that John Edward Cavanaugh’s mother was Louisa Gately.

Louisa Gately is my children’s maternal 2nd great-grandmother. I find it interesting that many records I’ve seen for Louisa show that she was born in England and that her dad was born in the West Indies or Jamaica and her mom was born in Ireland.

Here is Louisa Gately in 1860 Lowell:

Even though I mentioned that Louisa was said to have parents from the West Indies and Ireland, this census seems to indicate that both were born in England. Louisa was part of a good-sized family. There appear to be 24 years between the oldest and youngest child. This means that Mary married very young, or William remarried.

Five years earlier in 1855, the family was living in the house of Thomas Freeman in Lowell:

In Louisa’s marriage record, she gives her mother’s name as Catherine. This is perhaps a different person than the Mary listed above.

The last Census Louisa appeared in was in 1920:

Here Louisa is with her Daughter Ellen and niece Ellen A Ryden or Byden. This Ellen may have been the daughter of Ellen Gately who was Louisa’s sister or half sister.

Ellen A Ryden

The older Ellen A Ryden died on March 1, 1901. Her parents were listed on that record:

This gives us a mother for Louisa.

Tracing the Gatelys Across the Ocean to England

The next step is to see where the Gatelys lived in England. This must be the family on Regent Road in Salford:

Here is current day Regent Road to the West of Manchester, England:

This record gives a further refinement on Louisa’s mother’s name:

It appears that Catherine Etherington died in Lowell 15 years after she married in Manchester, England:

William Gatley/Gately Born About 1815 in the West Indies

It appears that William Gately (or Gatley) married three times and died in Lowell on July 25, 1895. Here are his parents listed on his death record:

I see them as Joseph and Jane Savage. They were both born in England, so they may be possible to trace. I’ll check William’s other two Lowell marriages. William’s third marriage was in Lowell in 1874. He married:

Elizabeth’s last name is transcribed as Kate. Interestingly her mother was a Hartley. William’s parents are just given as Joseph and Jane.

Here is William’s 2nd marriage:

This is the Mary we see in the Lowell Censuses. Again, William’s mother is Jane. Int means publishment of intention of marriage. Perhaps William’s mother’s name was given as Frances in that publication. I also see what looks like an ‘I.’. Perhaps this means Ireland. If that is the case, then William was from Ireland, but in the intentions of marriage record, he is from the West Indies. I suppose that both could be true.

Here is part of William’s Oath of Allegiance:

It looks like William signed his name more as Geatley than Gately. Here is the family in 1850:

William’s Parents: Joseph Gatley and Jane Savage

In the 1841 Census for Salford, England, William was listed as a Gatley, so I’ll go with that. A logical place to look for Joseph and Jane is in a marriage record. Here is one possibility:

Here a Joseph Gatliffe married Jane Savage on June 5, 1808. The timing seems right and Gatliffe sounds close to Gatley.

Here is Leigh – 9.5 miles West of Manchester:

I searched for births to Joseph and Jane Gatley in Lancashire County and came up with one:

Perhaps the family moved to the West Indies, had William and moved back.

Warrington is between Liverpool and Manchester.

An Ancestry Clue

Here is an Ancestry Tree Hint for Joseph:

I have two choices here. I can accept the hint, or I can not accept it. If I don’t accept it, then I’ll have to do my own research. I think I’ll accept the hint. It seems reasonable. The names are right and I have already come across the places of Salford and Warrington. I can only assume that James had children and some of his descendants either looked up his ancestry or kept track of family history.

Once I entered James Gatley in the tree, I got this further hint:

It seems like James was a fustian cutter, so this occupation must have run in the family. I found a question online from Andy, who was wondering what his fustian-cutting ancestors did, and he got this answer:

Hi Andy

Fustian Cutter / Weaver 
A person who lifted and cut the threads in the making of Fustian, formerly a kind of coarse cloth made of cotton and flax. Now a thick, twilled cotton cloth with a short pile or nap, a kind of cotton velvet. A long thin knife was inserted into the loops and the threads cut as it was pulled through, stretched between rollers. The cloth was then brushed to raise the pile. Fustian is the old name for corduroy / A weaver of Fustian 

best wishes & happy hunting 🙂

A Summary for Agnes Cavanaugh

In this Blog, I looked at Agnes’ father’s mother’s line which was Gately or Gatley in England. Possibly even Gatliffe.

I had shown previously that John E Cavanaugh’s mother was a widow when he was born.

The Warren Family

My top guess for John’s father is John J Warren. I don’t like seeing the Potential Father above as it gives a bad hint, so I’ll add John Warren in:

Here is some more on John Warren:

John died in an accidental drowning two years after Louisa’s son John was born. The death was recorded in Amesbury, though John Warren lived in Lowell. The death record gives John’s parents as Jeremiah and Mary Warren. They were both from Ireland.

James had an older brother Jeremiah. Here is the family in 1855:

There were no women in this house at the time of the State Census.

This also fills in all eight maternal second great-grandparents for my children, Heather and JJ:




  • My children have roots in Lowell
  • The Gatleys or Gatelys were fustian cutters in the area of Manchester, England before coming to the US
  • I haven’t found records tracing Louisa Gatley’s father to the West Indies or records of her mother from Ireland.
  • William Gatley lived quite a long life. A bit of a sketch could be written up about him.
  • I’m starting to look into the Warren family. They appear to also have Irish roots.





Leeds Color Analysis at Gedmatch

I have created Leeds Color Analyses at AncestryDNA, FTDNA and MyHeritage. I thought that I would try a Color Analysis at Gedmatch. Gedmatch has DNA results from 23andMe, AncestryDNA, FTDNA and MyHeritage, so it will be interesting to compare the results.

Adding Color to Gedmatch

I’ll start by going down my One to Many Match List at Gedmatch:


The people above the green box are too closely related to work for the Leeds Method. The people in the green box share great-grandparents with me on my Hartley side.

Leeds Method for the Hartleys

I’ll put my Gedmatch number in the first spot and my father’s cousin Joyce’s Gedmatch number in the second section:

Choosing ‘Display Results’ gives me this:

There are perhaps 100 or so of these results. The amount each person shares with me is in the first ‘Shared’ column. The amount they share with Joyce is in the second ‘Shared’ column. I would like to go down to about 15 cM with my matches. The problem with this list is that there are no names. I do, however, have Gedmatch numbers and emails. I copied my shared matches with Joyce that matched me down to 15 cM. That was 151 matches.

Working with MS Access

It seems that I need to work with MS Access to make this easier. Unfortunately, I’m a little rusty at Access. First I set up a new database in Access. Then I imported my 151 matches with Joyce into Access. Then I copied my ‘One to Many’ match list at Gedmatch into Excel and took out the columns I didn’t need. Then I imported that spreadsheet into Access also. It sounds like a lot of work, but it saves time in the long run.

My pared-down Gedmatch Spreadsheet looks like this:

It’s too difficult to get rid of the buttons, check boxes, and arrows, so I just leave them there.

Here is what my two tables look like in Access:

I just need to connect these two tables by the Gedmatch ID#. That will create a new table with the Gedmatch ID# and name.

Here is the design of my query:

The ID is the Gedmatch # from the People Who Match Both Kits (me and Joyce). One thing that was important was that I added a ‘Y’ in the Hartley column. That was in lieu of a color.

When I view the results, I get this:

I now have the Gedmatch ID, the name, the match amount to me, and a ‘Y’ showing that each person is in the Hartley group. Access tells me I have 151 people in this Query. This saves looking up 151 Gedmatch ID#s and copying and pasting the names into a table.
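
For anyone without Access, the same kind of join can be sketched in Python using the built-in sqlite3 module. This is only an illustration of the idea, not the query I built in Access. The kit numbers, names, cM values, table names and the build_group function are all made up for the example.

```python
import sqlite3

# Hypothetical sample rows. Real kit numbers, names and cM values would come
# from the two Gedmatch lists described above.
shared_with_joyce = [          # (kit, cM shared with me)
    ("A000001", 85.2),
    ("T000002", 41.7),
    ("M000003", 15.3),
]
one_to_many = [                # (kit, name)
    ("A000001", "Match One"),
    ("T000002", "Match Two"),
    ("M000003", "Match Three"),
    ("H000004", "Match Four"),  # on my list, but not shared with Joyce
]


def build_group(shared, names):
    """Join the two lists on kit number and flag every joined row with 'Y'."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE shared (kit TEXT PRIMARY KEY, cm REAL)")
    con.execute("CREATE TABLE one_to_many (kit TEXT PRIMARY KEY, name TEXT)")
    con.executemany("INSERT INTO shared VALUES (?, ?)", shared)
    con.executemany("INSERT INTO one_to_many VALUES (?, ?)", names)
    rows = con.execute(
        "SELECT s.kit, o.name, s.cm, 'Y' "
        "FROM shared AS s INNER JOIN one_to_many AS o ON s.kit = o.kit "
        "ORDER BY s.cm DESC"
    ).fetchall()
    con.close()
    return rows


hartley = build_group(shared_with_joyce, one_to_many)
for row in hartley:
    print(row)
```

Each row that comes back has the kit number, the name, the match amount and the ‘Y’ flag, which is the same shape as the Access query result.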

Carolyn and the Nicholson Clan

The next non-Hartley on my ‘One to Many’ list is Carolyn. I followed the same procedure for Nicholson, but this time I added in whether the match had a tree at Gedmatch:

Anita and Rathfelder Matches

I did the same for Anita. I chose down to 10 cM on the people that matched both Anita and myself but got this as a result in my Access Query:

The query showed only the results above 15 cM. This is because my One to Many List at Gedmatch only includes 2,000 matches. Currently, my smallest match on the One to Many list is 13.4 cM. There are a few ways around this. One is to use the Tier 1 list of matches. Another would be to use a list of my maternal matches. However, I will just keep this small list for now. So far, the only problem I see using this method is that I don’t include the original person that I was comparing everyone to. So I need to go back into my list and add in Anita, Carolyn and Joyce.
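
Here is a small Python sketch of why the smaller matches dropped out, and of adding the comparison person back in. It assumes the join only keeps kits that appear in the One to Many list. All of the kit numbers, the names other than Anita, and the cM values are invented for the example, and split_by_lookup and with_anchor are hypothetical helper names.

```python
# Hypothetical One to Many list: kit -> (name, cM shared with me).
# The real list stops at 2,000 matches, so small matches are absent.
one_to_many = {
    "A100001": ("Anita", 120.0),
    "A100002": ("Match A", 22.0),
    "A100003": ("Match B", 16.1),
}

# Hypothetical shared matches with Anita, taken down to 10 cM.
shared_with_anita = [
    ("A100002", 22.0),
    ("A100003", 16.1),
    ("A100004", 12.5),   # too small to be on the One to Many list
    ("A100005", 10.4),   # too small to be on the One to Many list
]


def split_by_lookup(shared, lookup):
    """Separate shared matches that can be named from those that cannot."""
    named = [(kit, lookup[kit][0], cm) for kit, cm in shared if kit in lookup]
    unnamed = [(kit, cm) for kit, cm in shared if kit not in lookup]
    return named, unnamed


def with_anchor(named, anchor_kit, lookup):
    """Add the comparison person to the group if they are not already in it."""
    if any(kit == anchor_kit for kit, _, _ in named):
        return named
    name, cm = lookup[anchor_kit]
    return sorted(named + [(anchor_kit, name, cm)], key=lambda r: r[2], reverse=True)


named, unnamed = split_by_lookup(shared_with_anita, one_to_many)
rathfelder = with_anchor(named, "A100001", one_to_many)
print(rathfelder)   # named matches plus Anita herself
print(unnamed)      # kits that would need a longer match list to name
```

The unnamed list shows which kits were lost to the 2,000-match cutoff, so they could be looked up later from a longer list.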

Emily – Frazer and McMaster

Emily and I share Frazer and McMaster ancestry. I am able to find 443 matches shared between Emily and myself. These matches correspond with my FTDNA AutoCluster Analysis:

The Frazer cluster above is the first orange one. It corresponds to many matches on Chromosome 20. When I add all these matches, this is what I get:

  • One surprise is that Judy who is the lead person for Lentz/Nicholson also shows up in the large Frazer/McMaster group. When I run my paternally phased kit, I don’t see Judy on my match list, so there must be some glitch there.
  • I am somewhat skeptical of all the green matches.
  • The column with the GED/Wiki information should come in handy.
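
A quick way to spot people like Judy who land in more than one group is to combine the groups into one grid and count the columns each kit falls in. The sketch below is one possible way to do that in Python. The group names follow the ones used above, but the kit numbers are invented, and color_grid and in_multiple_groups are hypothetical helper names.

```python
from collections import defaultdict

# Hypothetical kit numbers for each group. "K4" is placed in two groups on
# purpose to stand in for a match that shows up in more than one column.
groups = {
    "Hartley": {"K1", "K2"},
    "Lentz/Nicholson": {"K3", "K4"},
    "Rathfelder": {"K5"},
    "Frazer/McMaster": {"K4", "K6"},
}


def color_grid(groups):
    """Build kit -> {group: 'Y'} so each kit has one row and one column per group."""
    grid = defaultdict(dict)
    for group, kits in groups.items():
        for kit in kits:
            grid[kit][group] = "Y"
    return dict(grid)


def in_multiple_groups(grid):
    """Return the kits that carry a 'Y' in more than one group column."""
    return sorted(kit for kit, cols in grid.items() if len(cols) > 1)


grid = color_grid(groups)
overlaps = in_multiple_groups(grid)
print(overlaps)
```

Kits on the overlap list are worth a second look, since they may point to a shared line on more than one side or to a glitch in the match list.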

Summary and Conclusions

  • I was able to satisfy my curiosity as to what a Leeds Color Analysis would look like for my Gedmatch matches.
  • I have made sure that some of my most important matches are posted at Gedmatch.
  • This is a good baseline analysis. It may be possible to improve on this analysis by use of paternally and maternally phased results.
  • After seeing the results, it turns out that my Rathfelder cousin Catherine had a slightly higher match with me than Anita, so I could have used Catherine’s results to come up with the Color Analysis.
  • Using MS Access sped up the process in creating this Gedmatch Color Analysis.
  • It would probably help to have an extra column to indicate which matches have a common ancestor with me. Or these people could be highlighted in some way.

I took my advice from the last bullet:


One other anomaly was that the highlighted Lentz/Nicholson common ancestor for Joshua came out as a blue Hartley shared match. Perhaps there was some glitch with Gedmatch. Below a match level of 30 cM, it is difficult to find common ancestors, with a few exceptions.