After writing over 50 Blogs on genetic genealogy, I realized that I hadn’t written a blog on my Hartley YDNA. I have written on Frazer YDNA and my wife’s family YDNA (Butler), but not one just on Hartley YDNA. This will not be on all Hartley YDNA as I know the most about mine. There are other Hartley Lines that aren’t closely related to mine.
Many reading this blog will already know that YDNA is often used in surname studies. This is because YDNA is passed mostly unchanged from father to son. I say mostly because there are slow changes that occur. These slow changes are what create the differences in STRs and SNPs, the 2 major types of YDNA markers of importance in genetic genealogy. In this Blog, I’ll write about my own STRs and SNPs and how they relate to each other. I’ll also look at a few ways of analyzing YDNA results. There is a lot to cover here.
SNPs – Single Nucleotide Polymorphisms
SNPs are formed due to genetic mutations and are very specific and unambiguous. They can be used to trace one’s line back to a genetic Adam and place one into a specific group of people. Here is the broad difference between SNPs. They are listed between the letters A and T below.
My Hartley Line is broadly R1b. My Frazer line is R1a. The R1a line split off at some point and appears to have taken a more northerly route through Europe. R1b is the most common YDNA in Western Europe. Further, there are 2 branches that are common within R1b. These 2 branches are listed by their test names: R-U106 and R-P312. In England, R-U106 represents the Anglo Saxons, who came from the areas around Germany. It turns out that I am R-P312 and further L21. See the bottom left of the tree below.
L21 is known for the Irish and Scots. But there are also English L21. Actually, I would like to think of myself as British. The British represent the older stock in England whereas the Anglo (hence English) Saxons are the latecomers. More of the U106 are found in Southeast England where the Anglo Saxons entered. The L21’s are found more in the North and West of England and in Ireland.
For some reason, I was relieved to find out that I was R-L21. I guess I liked the idea of being associated with the old timers vs. the invaders. Also, even though the Celts are not a genetic group per se, they have been associated with R-L21. Here is a map of England in 600 A.D. showing the British/Anglo Saxon split.
More on L21
It took me quite a while testing my YDNA to find out that I was L21. There are many levels of subdivisions below L21. Here is an L21 Tree that is almost 2 years out of date. On it, I tried to place some of the Hartleys that had tested up to that point. Some that I wasn’t sure of I put in the upper left of the chart.
At that time, I had put my ancestor, Robert Hartley, in the L513 Group (dark yellow) and one step under that at S5668. Due in large part to people taking the Big Y test, many new SNPs have been discovered and placed in the tree. Now R-L513 has its own Tree.
Finally, I have tested positive for Z17911 and Z17912. These are equivalent SNPs. The people listed on the main tree are ones that have taken the Big Y or equivalent tests. Once I get my results, my name will show above with Merrick and Thomas – or perhaps in my own group.
As far as I know, Z17911 is the end of the line or what has been referred to as a terminal SNP. However, Big Y testing may reveal more. There are also SNPs which are called private or family SNPs. One or more of these may be found in my Big Y results for the Hartley family.
STRs – Short Tandem Repeats
The STR was the first type of YDNA to be used for genetic genealogy. I think of these as a stutter in the DNA. These are extra copies that happen in specific areas of the YDNA that are noted and used for comparison purposes. Standard tests range from 12 STRs to 111 STRs or more. The more you test, the more you pay. Each of these STR locations has its own rate of change. There are the fast changing STRs and the slower ones.
My Hartley STRs
Here are some of my Hartley STRs. First I’ll explain the headings below. Dark blue is the first panel of 12 STRs. Maroon represents the faster changing STRs. The next set of lighter blue is up to 25 STRs. The next lighter blue is up to the 37 level. The lightest blue on the right is STR 38 to 67. I didn’t include all my 67 in the image below.
This image is small, and it is taken from the Z17911 group. These people have tested positive for Z17911 and are listed in the FTDNA R1b-L513 Project. The rows of numbers are the STR values (or numbers of repeats). The rows are:
- Minimum value (in this case of those that have tested positive for Z17911)
- Maximum value
- Mode – this has also been used to approximate an ancestral value for the group
- Hartley (me)
Genetic distance (GD)
There are a few ways STRs are used. One is GD or Genetic Distance. When I compare my STR test to another Hartley, for example, it counts the number of differences between the two tests. Some of the numbers in the rows above are highlighted in either purple or pink. The purple values for the 4th line (Hartley) are less than the mode. The pink values are more than the mode. So in the first 37 STRs for my results there are 6 highlighted values. That would be a GD of either 5 or 6. There are 2 ways of counting. For the 5th maroon named marker there are 4 values. There is a method called Infinite Alleles Model which would only count any changes within that named maroon region as one.
Note that of these 6 differences or GD’s in my results, 4 are in the slower moving areas and 2 are in the faster moving areas. I note that at Family Tree DNA (FTDNA) I am not shown as related to any within my Z17911 Group. However, that is OK. For 37 STRs my highest GD is 4. I don’t think FTDNA shows higher than that. For 67 STRs, FTDNA’s highest GD is 7. This is because, when more STRs are compared, more GDs are allowed to make a match. I further note that at 37 STRs, I match 3 Hartleys, one believed to be descended from a Hartley and 2 non-Hartleys. At the 67 YDNA match level at FTDNA, I have the same person believed to be descended from a Hartley and 3 others with the Hartley surname. So it seems like the FTDNA system is working. However, to get the matches that are further away, one must look at a SNP project or surname project.
Where is the common ancestor for STR matches?
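The two ways of counting described above can be put into a short Python sketch. This is only my simplified illustration of the counting as I have described it, not FTDNA's actual calculation. The marker values below are made up for the example, and the function name and the `collapse_multicopy` flag are my own.

```python
def genetic_distance(a, b, collapse_multicopy=False):
    """Count STR differences between two sets of results.

    a, b: dicts mapping a marker name to a tuple of repeat counts
    (one value for ordinary markers, several for multi-copy markers).
    """
    distance = 0
    for marker in a.keys() & b.keys():       # only markers tested by both
        diffs = sum(1 for x, y in zip(a[marker], b[marker]) if x != y)
        if collapse_multicopy:
            distance += 1 if diffs else 0    # any change in a marker counts once
        else:
            distance += diffs                # every differing value counts
    return distance


# Hypothetical values, NOT my real results
mode = {"DYS393": (13,), "DYS390": (24,), "DYS464": (15, 15, 17, 17)}
me = {"DYS393": (13,), "DYS390": (25,), "DYS464": (15, 16, 16, 17)}

print(genetic_distance(me, mode))                           # 3
print(genetic_distance(me, mode, collapse_multicopy=True))  # 2
```

In this made-up case the multi-copy marker differs in 2 of its 4 values, so the count drops by one when that marker is treated as a single change.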
FTDNA uses a TIP Report to guess how closely related I am to my YDNA matches. My closest match at the 67 STR level is at a GD of 4. That isn’t very close. However, close is relative.
The first one on my YDNA match list is Sanchez – believed to be of Hartley descent. The TIP Report tells me this:
The second on my 67 STR YDNA match list has a Hartley surname. We also have a GD of 4 and the TIP Report looks like this:
Notice that the TIP Report shows a better likelihood that I’m related to Hartley than Sanchez. This is because the TIP Report considers the speed of change of the markers. The markers that are different between Hartley and myself are faster moving ones than the ones that are different between Sanchez and myself. As there are only averages of how often these markers change, this is not an exact science. The tables just show likelihood of when we may have had a common ancestor.
STRs used to predict the R-L513 SNP
Here I should mention the difference between a haplogroup and a haplotype. I mention it partially, because I forget which is which. A haplogroup has to do with a SNP. Examples of a haplogroup are R1b, L21, etc. Sometimes the smaller groups are called subclades or subgroups. According to Wikipedia, “Subclades are defined by a terminal SNP…”. So my Z17911 would be a subclade.
Apparently there is more than one definition of haplotype. The one I am thinking of refers to a specific grouping of STRs that stands out. One such grouping of STRs (haplotype) defines the R-L513 Haplogroup. Before the L513 SNP was discovered, people analyzed the STRs and noticed certain patterns. Based on those patterns, the STR results were put into different groups. One such pattern was (and is) DYS406s1>=11 and DYS617=13. When people testing their STRs found these 2 values, they were almost always L513 as confirmed by their SNP testing. So for the longest time, the group was called the 11-13 Combo group rather than the L513 group. Let’s look at the top of the L513 YDNA results page to see if this pattern is true:
Notice that there are a few here that are different, but these may represent rare mutations. In my Z17911, we all meet the criteria.
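The 11-13 Combo test is simple enough to write down as a small Python check. This is just a sketch of the rule as stated above. The function name and the dictionary layout are my own assumptions, and the sample values are invented.

```python
def fits_11_13_combo(strs):
    """Check the 11-13 Combo pattern: DYS406s1 >= 11 and DYS617 == 13.

    strs: dict mapping marker names to repeat counts.
    Returns None when either marker has not been tested.
    """
    dys406s1 = strs.get("DYS406s1")
    dys617 = strs.get("DYS617")
    if dys406s1 is None or dys617 is None:
        return None
    return dys406s1 >= 11 and dys617 == 13


# Invented examples
print(fits_11_13_combo({"DYS406s1": 11, "DYS617": 13}))  # True
print(fits_11_13_combo({"DYS406s1": 10, "DYS617": 13}))  # False
print(fits_11_13_combo({"DYS617": 13}))                  # None
```

As noted above, a pattern like this only predicts L513. The SNP test is what confirms it.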
STRs predicting Z17911 SNPs
I noticed in the L513 Yahoo Mail Group that I belong to, there were some predictions based on STRs that there could be more Z17911’s. Here is part of a post from March 2016 on the Yahoo L513 Group from the administrator,
“Below is a list of the people I’ve added in the last three weeks, the project I found them in and their predicted variety. This is sorted by variety label.
293533 William Hartley b. 1745 d. after 1807 Hartley 513-5668-16357-16343-17911-JM
372104 Sanchez, b. Spain L513 513-5668-16357-16343-17911-JM”
Sanchez believes he has a Hartley ancestor. So it is interesting that I will likely have more company at the Z17911 SNP. Here is another interesting post from the administrator of the L513 Yahoo Mail Group in October 2015 to Jared who felt he was mis-grouped:
Hi Jared, I mis-grouped you. I will fix. I intended to put you in the “J” STR variety/cluster. I’m not positive you are in “J” and could be in “H” or a little different yet. It’s hard to make judgements on this, particularly at only 37 STRs.
Here are all the people that I’m aware [of] that [have] off modal values for STRs 390=25 389i=14 458>=18 449>=31 464c=16 and high CDY numbers. You might actually fit in better with the Phillips and Vaughan side of “J” than the Merrick or Thomas.
We think this group is all Z17911+ but I’m not sure. I would say you are Z16343+ at the very least. Z16343 also marks the “H” variety people (Hayes/Pillsbury). No guarantees.
f307773 Smith R1b-L21>DF13>L513
fN56253 Gilroy R1b-L21>DF13>L513
fN114296 Gilroy R1b-L21>DF13>L513
f275990 Hartley R1b-L21>DF13>L513>S5668>Z16343>Z17911
f280251 Hartley zzL21suspect
f117349 Hartley zzL21suspect
f200669 Head zzL21suspect
f160646 Phillips zzL21suspect
f271571 Phillips zzL21suspect
f158089 Phillips zzL21suspect
f160637 Phillips zzL21suspect
f113390 Phillips zzL21suspect
f306961 Phillips zzL21suspect
f116935 Vaughan zzL21suspect
f160729 Vaughan zzL21suspect
f271772 Vaughn zzL21suspect
f105064 zzzUnk(Phillips) zzL21suspect
I am the first Hartley mentioned above. Then there are 2 others that may be Z17911. Counting the 2 people from the March 2016 post as well, that means that rather than me being all alone at Z17911, there may be 4 other Hartleys joining me. That is progress. Based on the L513 Administrator’s (Mike’s) STR analysis those 4 would be Z17911. Here are my STR values highlighted in blue with Mike’s Z17911 signature STRs.
I meet all the Z17911 signature STRs which makes sense as I have tested positive for Z17911. These predictions can save a lot of money for people testing SNPs. Rather than testing a series of 4 or 5 SNPs to see where they are on the SNP Tree, they can just test for Z17911 to see if they are positive for that.
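Mike's signature from the post quoted above (390=25, 389i=14, 458>=18, 449>=31, 464c=16) can be turned into a quick screening check. The Python below is only a sketch under my own assumptions. I left out the "high CDY" part because the post gives no cutoff for it, and the marker names, function names, and sample kit are made up for illustration.

```python
import operator

# Signature values taken from the quoted post; "high CDY" is omitted
# because no threshold was given.
SIGNATURE = [
    ("DYS390", operator.eq, 25),
    ("DYS389i", operator.eq, 14),
    ("DYS458", operator.ge, 18),
    ("DYS449", operator.ge, 31),
    ("DYS464c", operator.eq, 16),
]


def signature_report(strs):
    """Return {marker: True/False} for each signature marker."""
    return {m: (m in strs and op(strs[m], v)) for m, op, v in SIGNATURE}


def matches_signature(strs):
    """True only when every signature marker is met."""
    return all(signature_report(strs).values())


# Invented kit, not a real tester
kit = {"DYS390": 25, "DYS389i": 14, "DYS458": 18, "DYS449": 32, "DYS464c": 16}
print(matches_signature(kit))  # True
```

A kit that passes this kind of screen would still need the Z17911 SNP test to be sure.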
Using STRs to Create New SNPs
ISOGG is the International Society of Genetic Genealogy. They have guidelines for naming new SNPs:
The objective of the ISOGG Tree at this time is to include all SNPs that arose prior to about the year 1500 C.E. This guideline may be measured through STR diversity or alternative evidence.
Where a new terminal subgroup is being added, STR marker results or other evidence described below for two men with the new SNP are needed.
To be accepted the SNP must be observed in at least two individuals and must meet the STR diversity requirement. A SNP that does not meet this requirement will be classified as a Private SNP (see definition above).
The STR diversity requirement is met if the following conditions are satisfied:
- If the SNP is a Non-Terminal Branch SNP, no further proof of diversity is required.
- Genetic distance is calculated using the Infinite Alleles Model (IAM). A marker for which there is a null value in one sample must be discarded from the calculations. Otherwise, most laboratories use the IAM.
- All markers tested by both individuals must be compared.
- If 74 markers (or fewer) are compared, the minimum genetic distance to meet the diversity requirement is 5.
- If 75 (or more) markers are compared, the diversity requirement is a minimum of 7%, computed by dividing the genetic distance by the number of markers compared, and rounding to the nearest integer value.
This is what happened when my Terminal SNP was accepted. Usually, one would be looking for a low GD for a match, say. Here, for the addition of new SNPs a higher GD is needed to show that the SNP is not a private SNP. Here is another message written June 2015 by a fellow Z17911 from the Yahoo L513 Mail Group that I’m in:
I tried to figure the Infinite allele GD for the three current SNP-tested members of Z17911 (if I understood DYS464 and CDY correctly):
Hartley/Merrick = GD 14
Hartley/Thomas = GD 12
Merrick/Thomas = GD 10
I hope this is helpful.
Charles Thomas 8633
Mike followed up with:
Yes, Charles. It looks like Z17911 and Z16855 are clearly public making upstream Z16343 public too.
And the rest is history – at least for my little branch of the YDNA tree.
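The ISOGG diversity requirement quoted above is really just a bit of arithmetic, so here is a Python sketch of how I read it. The function name is mine. The number of markers compared for each pair was not given in the messages, so the 67 and 111 below are assumptions for illustration only.

```python
import math


def meets_str_diversity(gd, markers_compared):
    """My reading of the ISOGG STR diversity rule for a terminal SNP.

    74 markers or fewer: the genetic distance must be at least 5.
    75 markers or more: GD / markers, as a percentage rounded to the
    nearest integer, must be at least 7.
    """
    if markers_compared <= 74:
        return gd >= 5
    percent = math.floor(100 * gd / markers_compared + 0.5)  # round half up
    return percent >= 7


# GDs from Charles' message; marker counts are ASSUMED
pairs = {"Hartley/Merrick": 14, "Hartley/Thomas": 12, "Merrick/Thomas": 10}
for name, gd in pairs.items():
    print(name, meets_str_diversity(gd, 67), meets_str_diversity(gd, 111))
```

Under either assumed marker count, all three pairs clear the bar, which fits with Mike's conclusion that the SNPs are public.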
Analysis of STRs Using the RCC Method
The RCC method may be somewhat obscure to some, but I find it very interesting. This method uses STRs to create trees of descent, like the SNP trees I showed above. As it uses STRs and not SNPs, it is helpful as a check to the validity of the SNP trees. The RCC method was developed by Bill Howard. In November 2014, Bill came up with the tree below based on 67 STR results. I was at the top of the list in that study of a relatively small group of people.
Note how this method mirrors today’s SNP tree:
The RCC method shows that Z16855 branched from Z17911 out of Z16343 at over 60 RCCs. For this 67-marker analysis, 1 RCC = 38.05 yrs. So that would be over 2300 years ago. The present year is considered as 1945-1950. Hartley shows as splitting from Merrick and Thomas at about 30 RCCs. That is over 1140 years from 1945 or around the year 800 A.D. As there were no surnames at that point, this would explain why Hartley, Thomas and Merrick could be in the same grouping. The closest RCC to Hartley at the time of this study was Gilroy. An RCC of 18 translates to 685 years. This brings us up to about the year 1265 A.D. Surnames in England were being sorted out around the 1400’s.
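The date arithmetic above can be reproduced with a few lines of Python. This is only a sketch of the conversion I used: the 38.05 years per RCC comes from the 67-marker study, while the function name and the choice of 1945 as the default "present" are my own.

```python
def rcc_to_year(rcc, years_per_rcc=38.05, present=1945):
    """Convert an RCC value to (years before present, calendar year).

    A negative calendar year means B.C. (ignoring the missing year zero).
    """
    years_ago = rcc * years_per_rcc
    return years_ago, present - years_ago


for label, rcc in [("Z17911/Z16855 split", 60),
                   ("Hartley vs. Merrick/Thomas", 30),
                   ("Hartley vs. Gilroy", 18)]:
    years_ago, year = rcc_to_year(rcc)
    print(f"{label}: {years_ago:.1f} years ago, calendar year {year:.1f}")
```

Changing `present` to 1950 moves every date 5 years later, which is why the Gilroy figure lands between about 1260 and 1265.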
Here is my interpretation of the RCC 67 STR Tree with SNPs and dates added:
Assuming that the vertical line at RCC 30 represents Z17911, it appears that there is room for at least one other SNP on the Hartley Branch that includes Gilroy, Phillips, Vaugh[a]n and Griffin.
Comparing two RCC studies (67 vs. 111 STRs)
More recently, at the end of March 2016, Bill Howard ran the data for 555 L513 testees that had 111 STR markers or more. I have only tested for 67 markers, so I was not included, but there was one Hartley in that group. He does not show up on my match list as I count that I have a GD of 10 with him at the 67 STR level. This is beyond the match limit of 7 for FTDNA.
Here is the small section of the 555 that included the Hartley I mentioned above.
Now the vertical dashed lines happen every 20 RCCs. For this study, 1 RCC = 44.8 years. Mike Walsh, the Administrator of the L513 Project, looked at this and felt, based on his experience with SNPs, that 44.8 may be a bit high. He mentioned a factor of 34.65 years that he thought may work better.
Here is my interpretation of the 111 STR RCC Tree with dates and SNPs. One RCC = 44.8 years.
First, because there are fewer results at 111 STRs, this spreads out the branching. I don’t know who Pitt is. In the previous study Z17911 and Z16855 branched at about 490 B.C. Here, it appears to be in a similar location, I guess about 440 B.C. In the 111 Tree ZS849 branches off in the 1400’s vs. the 1600’s in the 67 STR Tree. I would assume that the previous study could be slightly more accurate due to more available results at the 67 STR level. However, the results are quite close to each other.
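The dates in the 111 STR tree depend heavily on which years-per-RCC factor is used. As a rough illustration, the Python sketch below takes a date read off the tree at 44.8 years per RCC and recomputes it at Mike's suggested 34.65. The function name is mine, 1945 is assumed as the present, and the 440 B.C. input is only my eyeball estimate from the tree.

```python
def redate(year, old_factor, new_factor, present=1945):
    """Re-express a calendar year under a different years-per-RCC factor.

    The year is first turned back into an RCC value using old_factor,
    then converted to a calendar year again using new_factor.
    Negative years mean B.C.
    """
    rcc = (present - year) / old_factor
    return present - rcc * new_factor


# My estimated 440 B.C. branching, recalculated with the smaller factor
print(round(redate(-440, 44.8, 34.65)))
```

With these assumptions the same branch point moves from about 440 B.C. to roughly 100 A.D., so the choice of factor matters more as you go further back in the tree.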
Historical 37 STR RCC Tree from September 2014
All these RCC Studies reminded me of a study done in the old days – back in 2014. At the time, I was amazed at how close Bill Howard got to the SNP tree using just 37 STRs. At the time, I had recommended that the results of 21 L21’s be included in the study, but Charles was too quick in sending 14 L513 results to Bill Howard and Bill gave us this tree:
Charles said that 1 RCC should equal 43 years. I’ll put what we know now onto the 2014 RCC tree.
The main difference in the older study is that the Z17911/Z16855 branching is shown at a later date (A.D. Vs. the newer studies’ B.C. dates). Also there is an Evans in my group here. I’m not sure who he is.
So Which is Better, SNPs or STRs?
Most people tend to like SNPs over STRs. SNPs may be considered UEPs or Unique Event Polymorphisms. It is the unique part that makes them better. I like the way my L513 Administrator, Mike Walsh says it,
Some people have used the words that SNPs trump STRs. That’s probably the correct general perspective. Assuming the specific SNPs considered are actually very stable Unique Event Polymorphisms (UEP), any SNPs that differentiate are most important and therefore provide fencing for which [to] do additional evaluation using surnames, genealogy, geographies, etc. AND STRs.
STRs may back mutate, which is a hidden weakness in a way. Say that you have a perfect match with someone based on STRs. One of those STRs may have mutated and back mutated. This would mean that you are not a perfect match, but a GD of 2. There is not an easy way to know if that has happened or not. So that introduces some uncertainty. However, that is not to say that STRs are not important. I feel that they are underrated by many and should still be considered for the reasons I mention in this Blog and in the section below.
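To make the back mutation point concrete, here is a tiny Python illustration. It is not a model of real mutation rates. The starting value of 14 and the function name are invented, and the steps are chosen by hand to show one marker going up and then back down.

```python
def apply_mutations(start, steps):
    """Apply a list of +1/-1 repeat changes to a starting STR value.

    Returns (final value, number of mutation events).
    """
    value = start
    for step in steps:
        value += step
    return value, len(steps)


ancestor = 14                                       # invented ancestral value
line_a, events_a = apply_mutations(ancestor, [])    # no changes on this line
line_b, events_b = apply_mutations(ancestor, [+1, -1])  # mutation, then back

print(abs(line_a - line_b))    # observed difference on this marker: 0
print(events_a + events_b)     # actual mutation events: 2
```

The two lines look identical on this marker today even though 2 changes happened along the way, which is the hidden uncertainty I mean.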
Summary, Conclusions and Comments
- I’m looking forward to my BigY results to see what they may include
- I am currently classified as Z17911 – a relatively recently discovered terminal SNP
- By STR signatures, there appear to be 4 other Hartleys who would test positive for Z17911. These Hartleys should be encouraged to take the Z17911 SNP test.
- I have used a similar method to analyze STRs and predict my own SNPs before I tested positive for them.
- STRs are useful for determining relatedness to other STR matches using GD and FTDNA’s TIP Report
- The TIP Report also gives an estimate of the time to the Most Recent Common Ancestor for YDNA matches.
- STRs are also useful in determining whether a new SNP is private or public using ISOGG guidelines
- The RCC analysis is useful in creating STR trees and for confirming SNP trees
- The RCC analysis can also give a time period for the branching of different SNPs and families.
- STRs and SNPs complement each other