Chasing Down Some Massachusetts Colonial DNA

Recently I was contacted by someone I knew in high school who said, “Who knew we were related?” Skot had tested his DNA at Ancestry and had found me as a Shared Ancestor Hint. Ancestry compares your trees, and if there is a match in ancestors and a match in DNA, you are put on a list.

Shared Hathaway Ancestors

Skot’s and my genealogy research both lead to Simon Hathaway and Hannah Clifton.

I have the above chart to my grandfather and Skot’s grandmother. The chart says that Skot and I are seventh cousins. Simon and Hannah were born in the early 1700’s and married in Rochester, Massachusetts. This is interesting as Skot and I both grew up in Rochester.

Do Skot’s and My Shared DNA Point to Hathaway and Clifton?

AncestryDNA doesn’t show that the DNA you share is the same DNA of your shared ancestor. It sort of implies that but doesn’t prove that. To prove that, we need to use triangulation and have a chromosome browser. I asked Skot to upload his DNA results to Gedmatch where we could compare the DNA results. Here is what my match with Skot looks like at Gedmatch.com:

This shows that we match on Chromosome 10. I have a paternal phased kit at Gedmatch, and Skot also matched me there. That match shows that we match on my father’s side who had the Hathaway ancestors, so that is good.

Further, I have mapped my Chromosome 10 and it shows we match in an area where I got my DNA from my Hartley grandparent and not my Frazer grandparent whose parents were from Ireland. That is also a good sign:

This map shows me as J on the fourth bar. The Hartley is in orange and for me it goes from position 32M to 114M. According to Gedmatch, I match Skot from 68M to 77M, so that is well within my orange Hartley grandfather DNA area.

Triangulation of DNA

Triangulation of DNA is when A matches B, B matches C and A matches C. This is fairly easy to do. Once this triangulation occurs, it indicates a common ancestor. It is more difficult to find the common ancestor of that triangulation for various reasons. The next thing I look at is my sister Lori’s spreadsheet of matches. These matches have tested at various places and uploaded their results to Gedmatch.com. I’m looking at Lori’s matches because she matches Skot also, and because her test is more recent, so I have more matches for her.

Lori’s biggest match is 54, but that is with me. Lori matches Skot from about 68 to 77M, so these all start before that point. A few end before then. Lori has other matches in this region. Lori’s matches tested at AncestryDNA, 23andme and FTDNA. I tend to prefer AncestryDNA matches as the family trees are easier for me to read.

Lori’s first match of 22 cM is with Cheryl. Skot and Cheryl match at about the same spot and about the same cM as Lori and Skot match. That means the three triangulate.
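The triangulation check described above can be sketched in a few lines of Python. This is only an illustration, not how Gedmatch does it: the `Segment` type and both function names are made up for this example, and apart from the roughly 68-77M Lori/Skot match, the positions are placeholders. It also assumes all three matches are already known to be on the same parental side.

```python
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    """A matching segment on one chromosome, positions in megabases (M)."""
    chromosome: int
    start: float
    end: float


def overlap(a: Segment, b: Segment) -> Optional[Segment]:
    """Return the region shared by two segments, or None if they do not overlap."""
    if a.chromosome != b.chromosome:
        return None
    start, end = max(a.start, b.start), min(a.end, b.end)
    return Segment(a.chromosome, start, end) if start < end else None


def triangulated_region(ab: Segment, bc: Segment, ac: Segment) -> Optional[Segment]:
    """Return the region where all three pairwise matches overlap, if any."""
    first = overlap(ab, bc)
    return overlap(first, ac) if first else None


# Lori-Skot is about 68-77M per the text. The other two ranges are
# hypothetical placeholders, not actual Gedmatch values.
lori_skot = Segment(10, 68, 77)
lori_cheryl = Segment(10, 60, 80)   # hypothetical
skot_cheryl = Segment(10, 68, 77)   # hypothetical

print(triangulated_region(lori_skot, lori_cheryl, skot_cheryl))
```

If any one of the three pairwise matches is missing or falls outside the shared region, the function returns None and the group does not triangulate.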

Now the Hard Part – Finding the Common Ancestor

Cheryl has over 25,000 people in her tree. Does she have Hathaways or Cliftons? At Ancestry, Cheryl and Lori are not Shared Ancestor Hints to each other. According to AncestryDNA, the common surnames between Lori and Cheryl are:

However, Baker and Schmidt appear to me on my mom’s side, so I won’t look at those. Phillips and Warren didn’t show anything obviously helpful. When I click on Cheryl’s White, I get this:

This is interesting as I have ancestors in Dighton on my Snell Line and also White and Hathaway ancestors. With a little trial and error, I see that Elizabeth Hathaway’s mother is Elizabeth Talbot. That is one of my ancestral names also. Elizabeth’s parents according to Cheryl’s tree were Jared Talbot and Sarah Andrews. I have a match in that couple. Here is my tree:

This is what I meant when I said that finding common ancestors among triangulated matches was not easy. I’m not happy that Lori and Cheryl’s common ancestor is from the 1600’s, but at least we found a match. Perhaps we will come back to Cheryl. Right now, a tie-breaker would help. Hathaway/Clifton or Talbot/Andrews?

Skot’s Genealogy

Here is the spot of Skot’s genealogy where Ancestry has us matching:

Note that Ancestry simplified the situation a bit. We are matching on Simon Hathaway and Hannah Clifton. However, we also match on Arthur Hathaway. It is even more confusing than that because Arthur Hathaway was also the father of Simon Hathaway by his first wife Maria Luce. Wow. Then Skot has more than one Clifton in there.

Shamus Match

One of my good matches at Chromosome 10 in this area of interest is Shamus. He matches me closely at 43.8 cM by FTDNA and 39.4 by Gedmatch.com. According to FTDNA, we share the following surnames:

Barstow Cook Swift Samson Talbot Taylor Townsend White Wing Ward

I looked through these names, but saw no obvious connection before the 1700’s.

Sarah Match

Sarah matches Lori at 18 cM. She is at FTDNA. Her surnames that match are:

Clark Hatch Jewett Johnson Lutzelburger Lutzelberger Lombard Richmond Spooner Smith White Wing

At least between Shamus and Sarah are the common White and Wing names. By the way, Sarah has a different last name at Gedmatch and FTDNA, but I assume that she is the same person. Actually there is a way to prove it, because FTDNA has a chromosome browser. Here is how Sarah matches me using FTDNA’s chromosome browser:

Again, the DNA part is easy. It is the genealogy that is a bear.

Here is Sarah’s White and Wing connection:

Here is how I connect:

Again it is not a very satisfying connection. We connect only on Daniel Wing at the top. Our ancestors appear to be from two different mothers and Daniel who was born in 1617. I wasn’t able to place Sarah’s Hannah White.

I didn’t find out much about Joanne or Joanna Hatch. I did read an account of a family tradition that said that Joanna and Bachelor Wing were cousins.

At this point, I’m ready to call it quits.

Summary of Genealogy Linked to DNA

So far I match:

  • Skot on Hathaway/Clifton – early 1700’s Rochester, MA
  • Cheryl – Talbot/Andrews 1640’s Dighton, MA
  • Shamus and Sarah – Wing 1617 Sandwich, MA

I’m sure there are other connections.

Continuing to Work Down My Sister Lori’s Match List

There are some 23andme matches, but I have no idea how to find their ancestry without contacting them. Next I see Michelle. I am able to find her using a Chrome add-on to AncestryDNA which I think is called DNA Helper. She matches at 22 cM at Gedmatch. Oddly, she matches at 27.6 cM at AncestryDNA where the matches are usually less than at Gedmatch. Unfortunately, her tree is private. I have been in touch with her by email and she says she is related to the Hatch family somehow. The next match is Sean at FTDNA, but he has no family tree.

Summary and Conclusion

  • The DNA shows that there is a common ancestor between the paternal matches that I have on a particular segment of Chromosome 10
  • Finding the one common ancestor of a triangulated group is difficult
  • It is likely that there are holes in the ancestry trees of these Chromosome 10 matches. If all those holes were filled in, then the common ancestor may become apparent.
  • While I was doing this exercise I filled in some missing ancestors on my Jewett line. One ancestor was a Reverend up in Rowley which I found interesting. So this exercise wasn’t a total waste of time.
  • Skot and I still likely match on Hathaway and Clifton. However, the DNA tests we both took don’t necessarily point to those two ancestors.
  • At this point, the only triangulated ancestor I found in this Chromosome 10 group was Daniel Wing from Sandwich, b. 1617.
  • In summary, the DNA is saying that there is some kind of colonial Massachusetts ancestry passed down. However, whether that ancestry is from Dighton, Rochester or Sandwich, MA or even somewhere else is not clear.

 

 

 

 

Part 7 – Raw DNA From 5 Siblings and a Mother – DNA From Mom

I’ve spent my last 6 Blogs on this topic finding out which alleles came from my dad. In this Blog, I would like to work on finding my siblings’ and my alleles that come from mom.

The Ironic Step of Phasing – Mom Alleles from Dad Alleles

I call this step ironic in that it was my mom that was tested for DNA. Based on her results we found out a lot of the alleles that her children got from our dad who passed away quite a while ago. Now, we use those alleles we got from dad to figure out which alleles we got from mom. From the Whit Athey Paper referenced at the ISOGG Web Page on Phasing:

If a child is heterozygous at a particular SNP, and if it is possible to determine which parent contributed one of the bases, then the other parent necessarily contributed the other (or alternate) base.

 

First I copy my FillinOne Table to a MomfromDadOne Table. Then I’ll do a query on that.

This says where I am heterozygous, and I have an allele from dad, I want to see where I’m missing one from mom.

I have over 50,000 of these which will be easy to update. I will want to put Joelallele2 in the blank where JoelfromDad = Joelallele1. Then I will want Joelallele1 in the JoelfromMom space when my allele from Dad is Joelallele2.

I ran this query twice for each sibling, so 10 times. This updated 50-60,000 alleles per sibling, so about a quarter of a million alleles altogether.
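The update rule above can be sketched in Python instead of as a database query. This is a minimal illustration, not the actual query I ran: the field names (Joelallele1, Joelallele2, JoelfromDad, JoelfromMom) follow the text, while the function name and the sample rows are made up for the example.

```python
from typing import Optional


def allele_from_mom(allele1: str, allele2: str, from_dad: Optional[str]) -> Optional[str]:
    """Infer the maternal allele at one SNP from the child's genotype and the known paternal allele."""
    if from_dad is None or allele1 == allele2:
        # This sketch only covers the heterozygous case with a known dad allele.
        return None
    if from_dad == allele1:
        return allele2
    if from_dad == allele2:
        return allele1
    return None  # dad allele does not fit the genotype, so leave the space blank


# Hypothetical sample rows, not real results.
rows = [
    {"Joelallele1": "A", "Joelallele2": "G", "JoelfromDad": "A", "JoelfromMom": None},
    {"Joelallele1": "C", "Joelallele2": "T", "JoelfromDad": "T", "JoelfromMom": None},
    {"Joelallele1": "G", "Joelallele2": "G", "JoelfromDad": "G", "JoelfromMom": None},
]

for row in rows:
    if row["JoelfromMom"] is None:
        row["JoelfromMom"] = allele_from_mom(
            row["Joelallele1"], row["Joelallele2"], row["JoelfromDad"]
        )

print([row["JoelfromMom"] for row in rows])
```

The two branches of the function correspond to the two query runs per sibling: one for when the dad allele equals allele 1 and one for when it equals allele 2.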

Finding Mom Patterns

Now that I have filled in more alleles from Mom, it should be easier to find Mom Patterns. Here is a Query to find Min and Max for the AAAAB Pattern:

Results in:

This saves a lot of time and gives me the start and stop positions of all the AAAAB Mom Patterns. In my previous look which I now see as premature, I only found 2 AAAAB Patterns. Now thanks to my MomfromDad update above, I have at least 17 AAAAB Patterns. The only drawback is that if there is more than one AAAAB Pattern within a Chromosome, it will not show that. However, if I run all the Mom Patterns, and find overlapping Patterns, that can be reconciled later. In fact, I see an overlap already:

The first AAAAB Pattern I found was 162-233M which I did see as large. I already had found an AAABA Pattern from 192-249M. This could mean that AAAAB goes from 162-192 and that the 233M AAAAB pattern was just an outlying singleton.
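The Min and Max idea, and its drawback, can be shown with a small Python sketch. The function name and row layout (chromosome, position in M, pattern) are assumptions for illustration. The 162M and 233M positions come from the text above, and the remaining rows are made up.

```python
def pattern_ranges(rows, pattern):
    """Return {chromosome: (min_position, max_position)} for rows with the given pattern."""
    ranges = {}
    for chromosome, position, row_pattern in rows:
        if row_pattern != pattern:
            continue
        low, high = ranges.get(chromosome, (position, position))
        ranges[chromosome] = (min(low, position), max(high, position))
    return ranges


# (chromosome, position in M, Mom Pattern); mostly hypothetical rows.
rows = [
    (1, 162, "AAAAB"),
    (1, 180, "AAAAB"),
    (1, 200, "AAABA"),
    (1, 233, "AAAAB"),  # a lone outlier stretches the range
    (2, 50, "AAAAB"),
]

print(pattern_ranges(rows, "AAAAB"))
```

Because only one minimum and one maximum are kept per chromosome, the single AAAAB row at 233M makes the range look like it runs straight through the AAABA row at 200M. That is the kind of overlap that has to be reconciled later.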

I also recall that I want ID’s, so I’ll add that to my query:

Because I have so much new information, I’ll put this into a new spreadsheet:

AAABA Mom Pattern

I just have to change the Query slightly to get the AAABA Mom Pattern:

The results of this Query go into the new spreadsheet. This spreadsheet will be sorted by Chromosome later.

I added a column for IDEnd minus IDStart:

Where this is zero, it would indicate a single Pattern.

I went through all the Mom Patterns and got a spreadsheet of 194 rows that need to be reconciled. Here are Chromosomes 1 and 2 sorted:

Reconciling Chromosome 1

I have added in a column for possible assignment of a crossover to a sibling. Note that up to about 20M everything looks OK. There are discrete Patterns. ABBBA to AABBA is a change in the second position which belongs to Sharon. The change from AABBA to AABBB goes to Lori. Then the AABBB is the same as BBAAA which goes to ABAAA. That would be my crossover [Joel].
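Assigning a crossover to a sibling comes down to finding which position changes between two neighboring patterns, remembering that a pattern and its mirror image (AABBB and BBAAA) mean the same thing. Here is a rough Python sketch of that reasoning. The function names are made up, and the sibling order is only what the text above implies for positions 1, 2 and 5, so positions 3 and 4 are left as placeholders.

```python
from typing import List


def complement(pattern: str) -> str:
    """Swap the A and B labels, e.g. AABBB -> BBAAA (the same pattern relabeled)."""
    return pattern.translate(str.maketrans("AB", "BA"))


def changed_positions(before: str, after: str) -> List[int]:
    """Return the 1-based positions that differ, using whichever labeling of `after` differs least."""
    def diff(p: str, q: str) -> List[int]:
        return [i + 1 for i, (x, y) in enumerate(zip(p, q)) if x != y]

    direct = diff(before, after)
    flipped = diff(before, complement(after))
    return direct if len(direct) <= len(flipped) else flipped


# Positions 1, 2 and 5 follow the text; 3 and 4 are placeholders.
SIBLINGS = {1: "Joel", 2: "Sharon", 3: "sibling 3", 4: "sibling 4", 5: "Lori"}

for before, after in [("ABBBA", "AABBA"), ("AABBA", "AABBB"), ("AABBB", "ABAAA")]:
    who = [SIBLINGS[p] for p in changed_positions(before, after)]
    print(before, "->", after, who)
```

When exactly one position changes, the crossover goes to that sibling. If more than one position changes under both labelings, something else is going on and the patterns need a closer look.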

I did a Query showing where all the alleles were filled in for the Mom Patterns:

This shows where my Crossover is at ID # 8984. I have added a few more columns to my Mom Pattern Spreadsheet to add the more refined cut points:

Next I’ll look at 77M.

As best I can tell, there are two single AABAB’s in the middle of an AABBB Pattern. Next I will want to find the start of that AABBB Pattern. To find that I do a query to look for the AABBB Pattern in Chromosome 1. That Query results in more AABBB Patterns.

A Problem

I have a problem in that it appears that the Mom Patterns of AABBB and AABAB appear to overlap each other on Chromosome 1. I assume that means that I did something wrong.

Refilling the Dad Patterns

That means that I should go back and fill the Dad Pattern back in:

First I recreate a Fill-in Table using the old Three Principles Table. Then I do update queries on that. Hopefully these numbers will work:

Back to Mom Patterns From Dad Patterns

Just so I’m not going backwards, I’ll redo this step. I copied my revised fill-in Table to a revised Mom from Dad Table. This time I’ll keep track of the alleles for fun:

So in retrospect, I don’t know if I made a mistake with the Dad fill-in’s or in the Mom fill-in from the Dad Pattern. Hopefully, there were no mistakes this time.

 

Hartley YDNA and STR Tree: New Results

This Blog follows on my previous Blog on the subject. In that Blog, I drew a two person 111 STR Hartley Z17911 Tree. Hartleys that are fairly close to me are assumed to be positive for the SNP Z17911 which was my terminal SNP.

When I look at the new Hartley results, I get the following Hartley Z17911 111 STR signature:

A few points from this new signature:

  • Previously, I was not able to have a 111 STR Hartley Mode. Now with three testers, that is possible. I fudged the mode for 576 as there were three different results: 17, 18 and 19.
  • The first Hartley on the list above is what I was calling the Bradford, West Yorkshire Hartley in the previous Blog
  • The second is the new tester with ancestor William Shepherd Hartley from Manchester, England.
  • The third on the list is me.
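For anyone who wants to work out a mode like this themselves, here is a minimal Python sketch. The function name and data layout are assumptions. Only the 576 values of 17, 18 and 19 come from the text. Which tester has which value, and the second marker, are made up for illustration.

```python
from statistics import multimode


def str_modes(results):
    """Map each marker to the sorted list of its most common values across testers."""
    markers = sorted({marker for tester in results.values() for marker in tester})
    return {
        marker: sorted(multimode(t[marker] for t in results.values() if marker in t))
        for marker in markers
    }


# Hypothetical assignment of values to testers.
results = {
    "tester 1": {"576": 17, "marker X": 12},
    "tester 2": {"576": 18, "marker X": 12},
    "tester 3": {"576": 19, "marker X": 13},
}

for marker, values in str_modes(results).items():
    note = "tie, needs a judgment call" if len(values) > 1 else "clear mode"
    print(marker, values, note)
```

With three testers and three different values, every value is tied for most common, which is why the mode for 576 had to be fudged.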

A New Hartley SNP

Previously, my terminal SNP was Z17911. Now there is a new shared Hartley SNP called A11132. Here is the SNP tree from the R-Z16357 Project web site:

Thanks to testing by another Hartley with Quaker Pennsylvania and NE Lancashire roots, I have moved down the tree past A11138 to A11132. I am guessing that other Hartleys that I am related to by STRs will share this SNP. That means that the Hartley STR Mode I mention above will also likely be the A11132 Mode.

Some Genealogy For the Newly Tested Hartley

This is part of what I was given for the ancestors of the New Hartley:

William was born in about 1851 in England (1), Lancashire to be exact (2). His parents were Thomas Hartley and Hannah Shepherd (2).
I was able to find William’s Birth:
From there I found the 1851 Census:
This was a big deal as it shows that the father Thomas was born in the little village of Wray, Lancashire in the Northwest of Lancashire. Thomas’ wedding record was helpful in giving a middle name.
Name: Thomas Townson Hartley
Gender: Male
Marriage Date: 27 Mar 1826
Marriage Place: Manchester, Lancashire, England
Spouse: Hannah Shepherd
FHL Film Number: 1545585
Reference ID: pg152 ln456
From there I searched using the Lancashire Online Parish Search:
Baptism: 26 Aug 1804 St Margaret, Hornby, Lancashire, England
Thos. Townson Heartley – Son of Christopr. Smith Heartley & Mary
Born: 11 Aug
Abode: Wray
Occupation: Hatter
Register: Baptisms 1790 – 1805, Page 47, Entry 5
Source: LDS Film 1526204
Further searching led to another Christopher Hartley ancestor:
Baptism: 3 Jul 1774 St Wilfrid, Melling, Lancashire, England
Christopher Smith Hartley – Son of Christopher Hartley & Alice
Abode: Wray
Performed at: Hornby Chapel
Register: Baptisms 1752 – 1781, Page 49, Entry 5
Source: LDS Film 1849660
Here is Wray and Hornby in NW Lancashire:

Here is where the Smith name comes in:

Marriage: 16 Dec 1752 St Wilfrid, Melling in Lonsdale, Lancashire, England
Christopher Hartley – Wray in this Parish
Ann Smith – Hornby in this Parish
Notes: X [in left margin]
Register: Marriages 1752 – 1754, Page 1, Entry 6
Source: LDS Film 1849660

Actually, the Bishop’s Transcripts show that Ann may have been Alice:

This is where my easy searching stopped. I did get further than I did on my own Hartley line. We now have a Christopher Hartley for our new YDNA tester probably born around 1725 who lived in Wray, Lancashire in 1752.

The reason I go through all the genealogy is that it is interesting to match up the historic Hartley homelands with the DNA. Here is a map with our three testers:

To the upper left of the map shows a circle around Hornby for our new tester. My ancestors were just south of Colne and the other 111 Hartley STR tester had ancestors in Thornton, near Bradford. The distance between Thornton and Wray is probably no more than 35 miles as the crow flies.

Back To the DNA

With my new Hartley 111 STR Signature, I get this tree:

  • Again, it seems obvious to split the two groups by the 455 STR. 455 mutates 0.16 times every one thousand generations. I don’t know about you, but to me that seems like a pretty rare thing. My thinking is that this just happened once.
  • The next three slowest STRs are 540, 1B07 and 445. I had all those mutations, so that puts me by myself. Those three STRs are in the 111 panel, so I won’t be able to check those against other Hartleys until more Z17911 Hartleys test to 111 markers.
  • This groups Thornton and Wray together even though they are further away from each other geographically.
  • How could this tree be dated? If we take the Hartley Mode date to be the beginning of surnames, this could be around 1300 or 1400. A wild guess would be that the Wray/Thornton ancestor could be about 100 years after that.
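To get a feel for how rare a 455 mutation is, the rate quoted above (0.16 mutations per one thousand generations) can be turned into a chance of at least one mutation over a given number of generations. This is a rough sketch that treats each generation as an independent event. The function name is made up, and the 30 years per generation is my assumption, not a figure from the testing company.

```python
def chance_of_mutation(rate_per_generation: float, generations: int) -> float:
    """Chance of at least one mutation, treating generations as independent."""
    return 1 - (1 - rate_per_generation) ** generations


RATE_455 = 0.16 / 1000        # 0.16 mutations per 1,000 generations, from the text
YEARS_PER_GENERATION = 30     # assumption for illustration

for years in (700, 3000, 30000):
    generations = round(years / YEARS_PER_GENERATION)
    chance = chance_of_mutation(RATE_455, generations)
    print(f"{years} years (~{generations} generations): {chance:.1%}")
```

Under these assumptions, a single mutation within the surname era is already unlikely, and a mutation up followed by a mutation back down would be much less likely still.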

A New 67 STR Z17911 Hartley Tree

I say Z17911 Hartley Tree, because there are other Hartleys in other SNP groups that would not be closely enough related to be in a STR tree. First, we need a new 67 STR signature. This signature should be more accurate than the STR signature up to 67 STRs that was done for the 111 STR Tree. This is because there are more Hartleys that have tested 67 STRs.

  • I kept the Hartley mode for 455 as 11 even though it is technically 12. This is because at the low mutation rate, I didn’t think that it could have mutated up and down again in the time frame we are looking at. If I am interpreting the mutation rate correctly, 0.16 mutations per 1,000 generations means there would be only about a 1.6% chance of this STR mutating in about 3,000 years (roughly 100 generations at 30 years each).
  • In the previous analysis, I was the furthest away from the Hartley 111 STR mode. Here, I am the closest. This is because a lot of my differences were in the 111 STR Panel.
  • My inclination is still to separate the two groups of Hartleys by the slow moving 455 STR.

Here is the new 67 STR Hartley Tree:

  • What I was calling the Lancashire and West Yorkshire Hartleys, I’m now calling the Hartley 1 Line and the Hartley 2 Line.
  • I had already grouped Bradford, West Yorkshire and Hartley #3 by 449 and 576. Now I’m grouping our new tester with the West Yorkshire, William based on 389b and CDYb.
  • The Wray, Lancs Hartley and the W Yorks Hartley would be quite a ways apart from each other geographically. Yet they seem to be related by YDNA. Perhaps the Wray, Lancs Hartleys had their roots in West Yorkshire.
  • Joel and Quaker Hartley are the two that have taken the big Y tests. They are both also identified by the A11132 SNP.

Summary and Conclusions

  • Hartley YDNA has been in its infancy but is starting to grow. This is thanks to those Hartleys that have had Big Y tests and STR tests.
  • It would be interesting to see if all the Hartleys in this study have the A11132 SNP. It is possible that this could be the Hartley SNP. However, this is based on only two Hartleys testing positive for it so far.
  • The 455 STR marker seems to be important in splitting the two Hartley branches. It will be interesting to see if that marker also corresponds to a specific SNP.

 

Cousin Holly’s Hartley DNA Results

I have many 2nd cousins. Over 100 I’m sure. My Hartley great grandparents had 13 children. All their descendants in my generation are 2nd cousins. Holly is one of those 2nd cousins. My first recollection of Holly is that she was creating a bit of commotion at our Town’s ball field. I was probably about 5 years old at the time. I had an impression that she may have been a relative but I wasn’t sure. Holly was challenging the local boys in a foot race and beating them. I was thinking that she was one cool girl.

So far on my Hartley side, those in gold below have tested and uploaded to Gedmatch.com:

Note that Patricia and Beth are also first cousins to each other.

Here’s Holly’s grandmother Grace Hartley. I borrowed the photo from Holly’s Ancestry Tree:

Does she look like Holly? I think so. Except I don’t picture Holly as looking as serious.

All the Hartley cousins in the chart above have James Hartley and Annis Louisa Snell in common. But we won’t easily know which of the two a given piece of shared DNA came from. Another point is that everyone has eight great grandparents. So all the second cousins get 2/8 or 1/4 of their DNA from these two great grandparents. That is, on average. Here are the numbers of how Holly matches the tested Hartleys:

The Gen is how far it seems that the common ancestors are away based on the DNA match. James, my dad’s 1st cousin seems 2.5 away. That is just right for a 1st cousin once removed. Holly should match her 2nd cousins on average at a level of three. That is because our great grandparents are 3 generations away from us. Because of the random way we get our DNA, however, Holly is more closely matching Joel, Beth and Patricia and is further away matching on my four siblings.

The X Chromosome Rule

There is a rule that the X Chromosome does not pass down from father to son.

That means that no X Chromosome from Greenwood Hartley got passed down to any of us. That also means that no Hartley X Chromosome got passed down to anyone in my family. That is why Holly matches James, Beth and Patricia on the X Chromosome and only incidentally matches Lori and Heidi from my family.

Here is how Holly matches James, Beth, Patricia and incidentally my 2 sisters.

Holly and Jim have a longer match as they are more closely related (1st cousin, once removed). As a rule, the more closely you are related, the longer the segments.

Shared Autosomal DNA

Holly and I share this much DNA:

By comparison, here is my overall Chromosome map before I add in my DNA matches with Holly:

On my map, the James Hartley/Annie Snell part is shown in darker blue. It looks like Holly’s DNA could add quite a bit to my map. Ideally, if I could test enough relatives, the dark blue would fill up 1/2 of my paternal chromosome. The other half should be from my paternal grandmother who was a Frazer.

Here is Holly’s DNA added in. I also added a maternal first cousin who contributed to my first substantial X Chromosome match:

Remember I get no X Chromosome from my dad (top part of each line). So that has to be blank on the X Chromosome.

Next I’ll add in 1st cousin once removed Jim to Holly’s map:

Jim’s contribution to our great grandparents is in blue. Notice that now the X Chromosome is kicking in.

Adding Beth’s DNA to Joel and Jim

Here is the addition of Beth’s DNA:

Note that Holly has a lot of matches on Chromosomes 5 and 9. That must mean that Holly got most or all of her paternal DNA on those Chromosomes from her Hartley grandmother, Grace May.

Kicking it up a notch

Next I’d like to add my siblings’ results to the other matches on Holly’s Chromosome map. My siblings’ results plus mine should be similar in size to Holly’s matches with Jim, my dad’s first cousin. It takes 5 siblings to get about the same DNA as you would have for one parent. While I’m at it, I’ll add Patricia.

This is all Holly’s DNA that she got from James Hartley and Annie Snell, her great grandparents based on the matches that we’ve looked at so far. I probably should have lumped Beth and Patricia together as they have the same Hartley grandmother [Mary], but I didn’t.

Separating the Hartley and Snell DNA

One thing I would like to do would be to separate the Snell DNA from the Hartley DNA. If I could do this I could find matches that were just Snell or just Hartley. The DNA matching is about narrowing down the possibilities. The best way to do this would be to have a match that is known to be a Snell but not Hartley or a Hartley but not Snell. Unfortunately, I don’t know of any such people. The next best thing to do is to guess. One way to guess is called phasing by location. So, say I have a match with a lot of ancestors from colonial New England, but not Lancashire. And I would need to know that I match this person on my Hartley side (not my mother’s side). I would say that this would likely indicate DNA from the Snell Line. That is because the Snell ancestors go back to Colonial New England and the Hartleys came later from Lancashire, England.

My Chromosome 16

Here is a section of the first part of my Chromosome 16 matches (without the matches’ names) in spreadsheet form:

Each line represents a different match with someone. About half way down this list I have a match with Ned at 39.93 cM. I don’t know who our common ancestor is, but Ned has a lot of colonial New England ancestors, including the Warren Pilgrim family. I also am descended from the Pilgrim Warrens, but it is generally thought that a DNA match that large would be unlikely to last that long.

Triangulating with Ned

Triangulation shows what common ancestors unknown DNA matches may have. Triangulation is when you match someone’s DNA, they match another person’s and you and the other person all match. Successful triangulation shows that all the DNA came from the same ancestor.

Here is my match with Ned:

Here is Holly’s match with Ned:

To close the loop, I have to match Holly in the same area of Chromosome 16:

No problem. This shows that Holly, Ned and I share an ancestor. By Ned’s Ancestry Tree, we think this is a New England Colonial ancestor, but we aren’t sure which New England Colonial ancestor it is. However, as Annie Snell has New England Colonial ancestors and James Hartley doesn’t, I am pretty sure I can assign this segment to Annie instead of James.

This means I can update my Chromosome map with my first New England Colonial piece of DNA represented by Annie Louisa Snell on Chromosome 16. This is shown in light blue:

The other interesting thing about this piece of DNA, is that it not only is from Annie Louisa Snell, it is also from some New England Colonial person – the one I haven’t figured out yet that we have in common with Ned.

Other New England Colonial Connections Between Holly and Me

AncestryDNA recently came out with a new feature called Genetic Community. That feature lumps you into a group with a bunch of other people based on your DNA testing. One of those groups is called Settlers of Colonial New England. Here are my Genetic Communities (or GCs).

Notice I get a Likely rating for those Colonial Settlers. Holly, on the other hand, has one Genetic Community:

She gets a Very Likely. That means she is super Colonial New England. Holly has a Connection Link under her Settlers of Colonial New England. Under that link is another link that leads to “…a list of all 238 of your DNA matches who also belong to this Genetic Community.” Under my similar link I have 110 DNA matches. However, Ned that I mentioned above matches me under Settlers of Colonial New England. He doesn’t match Holly in her list for some reason – even though I showed that we triangulate. In addition, Holly and I match each other on our lists of DNA matches under Settlers of Colonial New England.

Summary

There’s plenty more I could have written about, but I’m a gonna wrap it up:

  • Holly is more Colonial than I. I expect her other non-Snell ancestors contributed more in this area
  • I looked at a way to separate out ancestral DNA when other reference matches are missing
  • We are getting a good group of Hartley/Snell descendants that have had their DNA tested and have uploaded to Gedmatch.com for comparison
  • I never knew Holly looked so much like her Hartley/Snell grandmother.

An Updated Z17911 Hartley STR Tree

In my last Blog on the subject, I wrote about a Hartley Z17911 STR Tree. Since that time, I created a broader Z17911 STR Tree. However, that broader tree was not the best idea. Soon after creating that tree, I found out that at least one person in that tree was actually in a new SNP group further downstream from Z17911. This was based on Big Y and SNP testing. Not long after I created my tree, the SNP tree as created by Jared Smith went from this:

to this:

The link to Jared’s Website is here.

So, while Goff appeared previously to be in my SNP group, in fact, he was not. He was as far as 4 SNPs away. That means that any closeness in STRs could have been coincidental. When comparing SNPs and STRs, the rule is that SNPs take precedence.

A STR Tree for Hartleys Only

At this point, it seems to make sense to create a Hartley only STR tree. There is still no guarantee that Hartleys that are related to me by STRs will have the same SNP results as me. However, I think that it is more likely than not that they will.

Since my previous Blog, there have been two new Hartley STR testers. I have the results for one of those that tested at 67 STRs and one I don’t have results for yet who tested at 111 STRs. Previously, there was one other Hartley testing at 111 STRs. I have had my STRs tested indirectly through the BigY test. YFull analyzed 500 of my STRs – although some of the results were inconclusive. That means that there are three Hartleys with about 111 STRs tested, but I only have the results for two. I should be able to create a very simple tree from that.

The First Ever Hartley 111 STR Tree

At least I think it is the first. Those in the group are one I’ll call West Yorkshire Hartley, and me. My ancestors are from Lancashire, so I’ll be Lancashire Hartley. I think that this will be interesting as I feel that the Lancashire Hartleys predated the Hartleys for West Yorkshire. However, I get the impression that my Hartley YDNA administrator favors an earlier date for the West Yorkshire Hartleys. Here are the differences in 111 STRs between a West Yorkshire Hartley and a Lancashire Hartley:

There are a few interesting things from the numbers above:

  • The Z16357 Mode is for the SNP above Z17911, so it would be older.
  • STR 449 could be a back mutation. It goes from 32 to 31 and back to 32 for West Yorkshire Hartley.
  • The 455 STR has an orange number above it. That refers to the slowest STR mutation rate. As that is the slowest STR rate and my result is the same as the 455 modes, I infer that my STR test represents the older Hartley version. However, a sample of 2 is not much.
  • I am a GD of 14 from the West Yorkshire Hartley.
  • Both the West Yorkshire and the Lancashire Hartley are a GD of 7 from the Z17911 mode. That would have given us a tie for the oldest STR profile if we hadn’t considered the effect of mutation rates.

The Simple 111 STR Hartley Tree

This Tree is a bit on the conceptual side. However, it does point out some things:

  • These two Hartleys likely descend from a common Hartley. However, at this stage, we don’t have the 111 STR Mode for that common Hartley.
  • The STR mutations are therefore shown to Z17911 rather than to a common Hartley.
  • As mentioned above, I favor the theory that the West Yorkshire Hartley Line originated in Lancashire. This is partly based on something called the founder effect. That means that due to the large number of Hartleys in the Colne/Trawden area, it is possible that the area was a founding area for the Hartleys. However, the distance between the Lancashire and West Yorkshire Hartleys is not far.
  • I did not include all the STRs for simplicity. The slowest marker is shown in orange.
  • The three last slower moving STRs (540, 445 and 1B07) are in the 111 panel, so will not show up in the 67 STR analysis.
  • I have the year of 1075 (125 years per STR mutation) shown above. This is supposed to represent a difference of 7 GD. However, I don’t know if that date should represent the Hartley Mode or the Z17911 Mode. If the date were to represent the Hartley mode, then that would likely be at the beginning of when Surnames were beginning to come into use.
  • As the overall GD difference between the two Hartleys is 14, I don’t see how the difference to a common Hartley ancestor could be less than 7.
  • There is also the possibility that these two Hartleys had a common ancestor just before the implementation of surnames and that due to this relationship, common area of origin or by coincidence they both took on the Hartley surname
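The GD and dating arithmetic in the points above can be sketched in Python. This is a simplified illustration: it counts GD as the sum of step differences on shared markers, whereas testing companies may count some markers differently. The function names and sample marker values are made up, and the reference year of 1950 is my assumption, chosen because it reproduces the 1075 figure for a GD of 7 at 125 years per mutation.

```python
def genetic_distance(a, b):
    """Sum of absolute step differences over markers present in both results."""
    return sum(abs(a[m] - b[m]) for m in a.keys() & b.keys())


def estimated_year(gd_to_mode, years_per_mutation=125, reference_year=1950):
    """Rough year of a mode, counting back a fixed number of years per mutation."""
    return reference_year - gd_to_mode * years_per_mutation


# Hypothetical marker values, not real results.
tester_a = {"m1": 12, "m2": 30, "m3": 17}
tester_b = {"m1": 13, "m2": 28, "m3": 17}

print(genetic_distance(tester_a, tester_b))
print(estimated_year(7))
```

A fixed number of years per mutation is a very coarse yardstick, so a date worked out this way is best read as a ballpark rather than a firm estimate.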

Back to 67 STRs

Let’s keep the above tree in mind as we get down to the six Hartleys with 67 STRs tested. Checking the tree I made in a previous Blog, I see that Lancashire Hartley (me) and West Yorkshire Hartley were at opposite sides of that Tree:

In the above tree, Hartley #2 is the same as West Yorkshire Hartley.

The New 67 STR Hartley Tree

The Hartley we want to add is believed to have Quaker roots in Lancashire in the 1600’s. He also is taking a Big Y test which is exciting. The results for that exploratory YDNA test will likely show us the first Hartley family SNP. I currently have many private SNPs. However, once the Quaker Hartley tests, his SNPs that are in common with my now private SNPs should become the new Hartley family SNPs. Here are the new Hartley 67 STR results:

  • Because there are now 6 Hartley results, there is a tie in some of the modes. In these cases, shown with a 3 in the bottom row, I used the older values. These also turned out to be the lower values.
  • I chose to make a split on STR 455. This STR has the lowest mutation rate of those in the table. I didn’t think it likely that these last three results would have mutated independently.
  • This split also separates the two Lancashire Hartleys from the two West Yorkshire Hartleys.
  • Again, the Lancashire Hartleys tend to be the older group as they are closer to the Hartley mode by one GD (STR difference).
  • For these markers the Z17911 Mode is identical with the Hartley Mode. This suggests that Hartley is an old Surname. This result agrees with the 111 STR analysis above.
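
Finding a mode with ties can be sketched in a few lines. This is only an illustration. The tie-break rule here (take the lower value) is a simplification based on the note above that the older values happened to also be the lower ones; it is not a general rule. The repeat counts are made up and the function name is hypothetical.

```python
from collections import Counter


def str_mode(values):
    """Most common repeat count for one marker; ties go to the lower value."""
    counts = Counter(values)
    top = max(counts.values())
    return min(value for value, count in counts.items() if count == top)


# Hypothetical repeat counts for one marker across six testers (a 3-3 tie).
print(str_mode([11, 11, 11, 12, 12, 12]))  # 11
```

With six testers a 3-3 tie is possible on any marker, which is why a tie-break rule of some kind is needed.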

A New 67 STR Hartley Tree

Here is my interpretation of the above data in a tree form:

  • The Hartley Mode results are shown in 2 boxes at the top of the Tree. This is meant to represent a common Hartley signature or the signature of a common Hartley ancestor in the distant past.
  • I split the two branches at the top based on the slow moving STR 455. These two branches appear to represent a Lancashire Hartley Branch and West Yorkshire Hartley Branch.
  • On the Lancashire side, Sanchez and Joel are together due to their STR similarities.
  • Similarly, Hartley #3 and Bradford West Yorkshire Hartley are together due to their similarities.
  • It appears that the Quaker Hartley’s mutations happened between the Quaker Ancestor and our Hartley tester. However, these mutations would be spread out up to the common Hartley Lancashire ancestor. The same would be true for the Hartley tester with the West Yorkshire ancestor William Hartley. However, his mutations would be spread out up to a common West Yorkshire ancestor under the above scenario.
  • Based on the above point, the Quaker Anc. and Wm. Anc. boxes in the Tree above are not really needed.
  • An early split between these two branches could explain the parallel mutations. For example, Sanchez and W Yorkshire William both have double mutations at location 398b. However, they are shown in different branches and not grouped together. Under my scenario, these two double mutations would have happened independently over a long period of time.
  • Unique mutations are in bold italics.
  • Adding the mutations up the tree gives the GD to the Hartley mode. The double mutations must be counted as two.
  • A rough guess for dating the tree would have the Hartley mode at 1100. The split between Lancashire and West Yorkshire at 1300. The further divisions around 1500. These dates are give or take 100 years or so. The bottom line represents tested Hartleys living today.

Here is the streamlined version of the new Hartley Z17911 Tree with some rough guesses on timeframes:

Summary and Conclusions

  • There would be other ways to draw the 67 STR Hartley Tree. This one seemed most logical to me.
  • The addition of a new Hartley 67 STR test helped to define a Hartley ancestral mode. It appears to have defined a Lancashire and West Yorkshire branch of Hartleys.
  • A pending BigY test should result in one or more Hartley Family SNPs.
  • It is possible that there are unique SNPs for the two Hartley branches shown as coming from Lancashire or West Yorkshire. However, it may take a BigY test from a Hartley from the West Yorkshire Branch to confirm this.

A Z17911 STR Tree

Previously, I wrote a Blog on a STR Tree for Hartleys that were likely Z17911’s. In this Blog, I would like to look at others that have tested to be Z17911 or are likely Z17911 due to STR patterns. Since my last Blog, a lot has been going on in the little area of Z17911.

Z17911 in the L513 Tree

Z17911 is a small group under the L513 Group. L513 is a group under L21 which is a part of R1b. The L513 Tree is presently bursting at the seams:

One of the larger branches of L513 is S5668. That takes up about 2/3 of the lower left of the tree above. Here is a blowup of the Z16357 Branch of S5668.

At the time that I wrote the last blog, Merrick and Thomas were in the same location under an unnamed SNP. Now it has been named as BY11573. The placement of Merrick and Thomas below Z17911 was a result of my Big Y Test. Now Bennett has also taken a Big Y and was found to be BY11573.

Enter Jared Smith on the Z17911 Scene

Jared Smith has been a large contributor on the Z17911 scene of late. He tested positive for Z17911 recently and has ordered a Big Y test. He is not to be confused with the Z16357 Smith above. Jared has developed an excellent web page called The R-Z16357 DNA Project. Jared has also created a discussion list for Z16357. Here is Jared’s updated version of the Z16357 Tree:

The part that I am most interested in is Z17911 and BY11573.

My First Attempt at a Z17911 STR Tree

First I took the 15 people listed as having STR results at the FTDNA L513 project. There are 6 that have tested positive for Z17911. There are an additional 9 that the administrator has put into a JM STR Cluster. The administrator figures that based on the STRs, they should also be Z17911’s. According to the administrator, Mike Walsh:

“You can see the “J” people 390=25,26 458=18,19 449=31 446=14. I would call this the “J” STR signature.”

I looked at the significant STRs for the 15 known or suspected Z17911’s and got this:

This was just for the first 37 tested STRs. I have the STR names at the top. I have the mode for L513, S5668 and Z17911. I tried to group the YDNA testers by patterns in their STR values. The GD is the Genetic Distance. That means that the Phillips are closer to the Mode and Bullock and Bennett are furthest away. That would mean that Phillips should have the oldest pattern and Bennett the newest.
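
As a rough sketch of how a GD column like this can be worked out, the snippet below sums the stepwise differences between each tester and a mode and then sorts the testers from closest to furthest. The marker values and tester names are made up for illustration and are not real results. The plain stepwise count is my assumption; FTDNA’s own GD calculation may treat some markers differently.

```python
def genetic_distance(haplotype, mode):
    """Sum of stepwise repeat differences over the markers in the mode."""
    return sum(abs(haplotype[marker] - mode[marker]) for marker in mode)


# Hypothetical values for illustration only, not real test results.
mode = {"DYS390": 25, "DYS458": 18, "DYS449": 31, "DYS446": 14}
testers = {
    "Tester A": {"DYS390": 25, "DYS458": 18, "DYS449": 31, "DYS446": 14},
    "Tester B": {"DYS390": 26, "DYS458": 19, "DYS449": 31, "DYS446": 14},
    "Tester C": {"DYS390": 25, "DYS458": 20, "DYS449": 30, "DYS446": 14},
}

# Closest to the mode first, i.e. the candidates for the oldest pattern.
ranked = sorted(testers, key=lambda name: genetic_distance(testers[name], mode))
for name in ranked:
    print(name, genetic_distance(testers[name], mode))
```

Note that a two-step change on one marker (Tester C at DYS458) counts as two.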

Here is the tree I built based on the above:

My intention was to have the oldest STR groups branching at the top and the newest branching nearer the bottom. I note that when I built my STR Tree for the Hartleys, I did it the opposite way.

The Problem with my first Z17911 STR tree

The tree was OK based on the way I did it. However, it did not account for one very important thing:

The STRs should account for the fact that the BY11573 SNP derives from Z17911. SNPs are the anchor and STRs may vary. Maurice Gleeson has promoted this type of analysis. In the old days, there were not as many SNPs. Now, due to Big Y type testing, there has been a tsunami of SNPs and it is now possible to incorporate them into STR analysis. When I added the SNPs to my STR chart, I noticed something interesting:

It took a while to see it, but I saw that all the BY11573 men had 13 or more for DYS439. All those who were Z17911 and not positive for BY11573 had a DYS439 of 12. Then I decided to sort my chart by DYS439:

Next I changed the DYS439 Mode for Z17911 from 13 to 12. This created a new oldest line of Gilroy. If DYS439 is the break between Z17911 and BY11573, then Phillips is now in the older, more signature BY11573. The results of a pending Phillips Big Y test will tell us for sure soon whether Phillips is BY11573 positive or not.
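
The DYS439 observation can be written down as a simple rule of thumb. This is only a sketch of the pattern I noticed in this small group, with a hypothetical function name. It is a prediction to be checked by SNP testing, not a substitute for it.

```python
def predict_branch(dys439):
    """Guess the SNP branch of a Z17911 tester from the DYS439 value.

    Based only on the pattern seen in this small sample: 13 or more
    went with BY11573, and 12 went with Z17911 but not BY11573.
    """
    if dys439 >= 13:
        return "BY11573 (predicted)"
    if dys439 == 12:
        return "Z17911, BY11573 negative (predicted)"
    return "unclear"


for value in (12, 13, 14):
    print(value, predict_branch(value))
```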

More SNP Structure

Jared Smith built a more detailed SNP tree here based on recent testing information:

Here is the Z17911 part I’m interested in:

I would expect that the STR tree would follow the SNP tree. Here is a simple SNP/STR Tree with a few signature STRs that I have added in on the left top and bottom:

What if DYS439 = 12 is Z17911 and DYS439 = 13 is BY11573?

The Z17911’s I’m talking about are negative for BY11573, the SNP below it. Until more testing comes in, that is the out on a tree limb assumption I’m making. Based on that and some other Hartleys that have had the YDNA tested, here is a spreadsheet for Z17911 positive and BY11573 negative people.

This Chart does not show DYS439, as all of the above have a value of 12. In the Chart above, I note a Gilroy/Goff/Smith signature of DYS391 = 11 and DYS576 = 16. That leaves the Hartley signature as DYS391 = 10 and DYS576 = 17, 18. I went back to the older S5668 Mode to get a feel for the overall direction of the STR mutations.

Z17911 STR Tree

Here is the tree I drew from the above STRs.

I tried to learn how to make these trees using two different methods, so it gets a bit confusing. In this method, only two lines are allowed to come out of each box. I like that method, but it required me to put in a Hartley Ancestor box under the West Yorkshire Hartley Ancestor box. On the bottom line, Gilroy probably has the oldest Z17911 signature. The Hartleys on the right have the newest signatures. Actually Wm. Hartley going up has the most STR changes (7), so I suppose he would have the most recent STR signature. Jared Smith has noted that I am positive for the SNP A11130, so it will be interesting to see if this is a defining Hartley Family SNP or not. Above I made a guess on the West Yorkshire and Lancashire Hartley split based on the knowledge that one of the Hartleys has West Yorkshire ancestors and that I on the bottom right have Lancashire Hartley ancestors.

Some BY11573 Patterns

I’m not ready to build a BY11573 Tree yet. However, I did note some BY11573 patterns.

Interestingly, most of the places where I found patterns were on the BY11573 positive people shown in darker blue above. If I were to draw a 37 STR BY11573 Tree at this time, it would just include those above highlighted in blue. The actual list of names was taken from Jared’s website and includes other names.

Next Steps

Next we wait for pending tests to come in and others who may decide to test. We are also awaiting analysis of the Bennett Big Y test from Alex Williamson at the L513 Page of the Big Tree.

Beth’s Hartley DNA

In this Blog, I will be looking at Beth’s autosomal DNA. That is the DNA that she got from both her parents. However, I am more interested in Beth’s father’s mother’s DNA as she was a Hartley and the DNA that we share would be Hartley DNA.

Hartley Tree of DNA Testers

Here are those closer relatives that have had their DNA tested and uploaded to Gedmatch.com:

Here Hartley is shown as green and Snells are shown as yellow. The DNA testers are in gold. Any DNA that the four DNA testers have in common will belong to James Hartley and Annie Snell. However, it will be difficult to tell which. Any DNA that Patricia and Beth share could also belong to Charles Nute which Jim and my family will not share. Here is an example of that on Chromosome 1.

Here is a photo believed to be Mary Hartley with her sister Nellie:

Hartley and Nute DNA On Chromosome 1

This is a Chromosome browser from Gedmatch.com showing where Beth shares DNA with Heidi (1), Joel (2), Sharon (3), Jim (4) and her first cousin Patricia (5). Is the DNA that Beth and Patricia share Hartley DNA or Nute DNA? To find that out we can look at Patricia’s DNA browser. If she shares DNA in this same area with Heidi and Jim, then it will be Hartley DNA.

The above Browser shows Patricia matching Beth (1), Jim (2) and Joel (3). This means that the DNA that first cousins Beth and Patricia share in Chromosome 1 is Nute DNA. If I were to map Patricia’s maternal Chromosome 1, it would probably look like this:

This shows that Patricia got her green DNA (matching Jim and me) from her Hartley maternal grandmother and her pink DNA (matching Beth) from her Nute maternal grandfather.

First Cousins Vs. Second Cousins

First cousins share two grandparents as their most recent common ancestors. Second cousins share two great grandparents and get their shared DNA from one of them. The first cousin DNA matches will be larger in general. The second cousin matches will tend to be smaller.
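
To put rough numbers on “larger” and “smaller”, here is a short sketch of the expected average sharing for full cousins who descend from one ancestral couple. These are expected averages only. Actual sharing between any two cousins varies, and the function name is hypothetical.

```python
def expected_shared_fraction(cousin_degree):
    """Expected fraction of autosomal DNA shared by full n-th cousins.

    Each cousin is (n + 1) generations below the shared couple, so the
    share through one ancestor is (1/2) ** (2 * (n + 1)). Two shared
    ancestors double that, giving (1/2) ** (2 * n + 1).
    """
    return 0.5 ** (2 * cousin_degree + 1)


print(f"First cousins:  {expected_shared_fraction(1):.2%}")   # 12.50%
print(f"Second cousins: {expected_shared_fraction(2):.2%}")   # 3.12%
```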

First cousins

As shown above, first cousins will share the DNA from two of their grandparents. In the case of Patricia and Beth, those two grandparents will be maternal grandparents. The catch is, that when two first cousins match each other, they won’t know which grandparent they match on. They just know that it will be one or the other. In the example above, we did know which grandparent matched because of other second cousin matches.

Second cousins – Two common great grandparents

Second cousins have as their most recent common ancestors two of their great grandparents. But again they won’t know which great grandparent they are matching on.

The best way to identify which great grandparent the gold people match on would be to have a third cousin that is only related on the Hartley side OR the Snell side. I don’t know of anyone in this category right now, so I’m a bit stuck. I would like to figure out which DNA is which. The main reason is that I’m stuck on the Hartley genealogy. I know that Greenwood’s father was Robert, but before that, I’m not sure. If we could find another Hartley relative going back then it might break down the Hartley brick wall.

Any Other Way To Separate Hartley DNA From Snell DNA?

There is one main difference between James Hartley and Annie Snell above as it relates to their DNA. James was born in Bacup, Lancashire, England and Annie was born in Rochester, Massachusetts. All of James’s ancestors would also have been born in Lancashire. On the other hand, all of Annie’s ancestors that would produce matches go back to Colonial Southeastern New England. That means that if we find a match that is from England and has no ancestors in the United States, there would be a good chance that that DNA match was through the James Hartley side.

Beth’s X Chromosome

First, let’s look at my family. There is no Hartley X Chromosome sharing with this group because the X-DNA does not travel from father to son.

Second, look at Beth compared to Jim:

Beth got one of her X Chromosomes from her dad. This was the same X that he got from his mother Mary. Jim got an X Chromosome from his mother. She got it from James Hartley b. 1862 and Annie Snell. So Beth and Jim have James Hartley and Annie Snell in common.

These pieces of blue where Beth and Jim match represent DNA that they share from James Hartley and/or Annie Snell.

How do Patricia and Beth compare by X-DNA?

Next we will look at Patricia and Beth. They will share X-DNA with their grandmother Mary Hartley. Beth’s dad got no X-DNA from his Nute dad, so Beth and Patricia will only match on Mary Hartley.

Note here that Beth and Patricia share some X-DNA from their grandmother that isn’t shared between Jim and Beth on the left side. They also share a longer segment at the right hand side than Beth and Jim shared. However, Jim and Beth shared a segment from 123 to 138M that wasn’t shared between Patricia and Beth.

Let’s See How Patricia Compares With Jim

The only comparison left is between Patricia and Jim.

I compared the three comparisons and came up with a bit of an X Chromosome map. In the first match between Beth and Patricia, I have that match in red. On the very right there are three matches, so I have that as great grandparent 1. We don’t know which great grandparent it is – just that it is the same one. On Jim’s map, it is his grandparent 1. Going from right to left on Jim’s map, he changes from getting his X-DNA from grandparent 1 to grandparent 2. However, Patricia and Beth continue to match on great grandparent 1. In the middle there are no matches, so we can’t tell what is going on. Also the two reds and one blue on the left may actually be two blues and a red as we don’t know how they match with the segments on the right.

Beth’s Hartley (and Snell) Chromosome Map

If we look at all the matches Beth has with Jim, my siblings and me, we will have a map of her known Hartley (and Snell) DNA:

I didn’t use the DNA shared between Patricia and Beth as they are first cousins. As such, they will share Nute and Hartley DNA and it will not be as easy to tell which is which. So second cousins are good for these maps. The red is in the bottom part of each chromosome. That represents the paternal chromosome. We have not mapped any of Beth’s maternal chromosome. If Beth were to look for Hartley or Snell matches, it looks like her best bet would be on Chromosome 12.

For comparison, here is my Chromosome Map.

On my map, the blue corresponds to Beth’s red Hartley DNA. We seem to share a stretch of Hartley DNA on Chromosome 1. But where Beth has a long stretch of Hartley DNA on Chromosome 12, I have none.

 

Mapping My Chromosome 20 Using My Raw DNA Results

In a past blog, I mentioned My Big Fat Chromosome 20. That blog is also referenced on the ISOGG Chromosome Mapping Page. This particular Chromosome had puzzled me for a while due to the preponderance of matches I was getting there. I used visual phasing and determined that the overload of matches was on my paternal grandmother’s Frazer side rather than the Hartley side. I had previously supposed that the Hartley side held the key to all my matches as that side had colonial Massachusetts roots. Since that time, I had my brother’s DNA tested. He is shown as F in the bottom row below. I thought that his results might add some clarity to Chromosome 20.

Rather than clarifying things, Jon’s results (F) just gave me a shorter version of what I already had for myself (J) and my two sisters. The problem is the phenomenon of close crossovers at the beginning and end of each chromosome. Jon also has quite a few matches in Chromosome 20 (unlike my sister Sharon who had Hartley DNA in most of her paternal Chromosome 20). He has almost 30% of his phased matches there according to his match spreadsheet based on Gedmatch.

Going to the Source – Raw Data Phasing

I have been learning how to phase my raw data based on a Whit Athey article, MS Access and the work that M Macneill has done. The Whit Athey Paper describes how to manipulate the raw DNA data of one parent and four siblings to get Dad Patterns and Mom Patterns. I have found these patterns to be useful.

Dad Patterns

Even though my dad never had his DNA tested, based on certain principles, I have come up with a spreadsheet that shows, for various sections of the chromosomes, the matching patterns that I have with my other three siblings. I use A’s and B’s to give a generalized pattern. The patterns will be in the order of Joel, Sharon, Heidi and Jon. Here is my Dad Pattern spreadsheet showing Chromosome 20:

I find my gap to next column handy. The first thing that I notice is that there are not many large gaps. If there were very large gaps, that might indicate an AAAA pattern where all the siblings match (in this case a paternal grandparent). One thing that I added today is a Start and Stop. This is the first and last tested position of the Chromosome. This is good to know in case a pattern is hiding at the beginning or end of the chromosome. Let’s just look at the second line of the spreadsheet. This shows that there is a pattern of ABAB from position 0 to 10M. This means that the first and third people (Joel and Heidi) match the same paternal grandparent and the 2nd and 4th siblings (Sharon and Jon) match the other paternal grandparent.

In the third row of the spreadsheet, a new paternal pattern starts (at 10M). This is ABAA. Now sibling 1, 3, and 4 (Joel, Heidi, and Jon) match each other. The difference between ABAB and ABAA is in the last position where I have Jon. He switched from a B to an A and now no longer matches Sharon, but he does match his other three siblings on the paternal side. As Jon is the one that changed, he gets the paternal crossover at this position.

A few other notes:

  • These patterns are gradual. That means that there can be only one change at a time.
  • If it looks like there are two or more changes, then either something was done wrong or you have to invert the A’s and B’s.
  • For example, above in row 4, I have an AABA pattern that goes to an ABAB. At face value, it looks like three changes. However, AABA is the same as BBAB. So it is actually the first B changing to an A. The first position is mine, so I have a crossover around 54M on the paternal copy of my Chromosome 20.
  • These areas of patterns are also used to fill in bases received from Dad or Mom in the particular areas that the patterns occur in each chromosome.
  • If there are only three siblings tested, these patterns are not as informative.
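
The rule for reading these pattern changes can be sketched in a few lines. This is just an illustration of the logic described in the notes above, with hypothetical function names. It tries the second pattern both as written and with the A’s and B’s inverted, and reports the one sibling whose letter changed.

```python
def invert(pattern):
    """Swap the A and B labels; AABA and BBAB describe the same grouping."""
    return pattern.translate(str.maketrans("AB", "BA"))


def crossover_owner(before, after, siblings):
    """Return the sibling whose letter changes between two adjacent patterns.

    Tries the second pattern as written and inverted. Returns None if
    neither reading differs from the first pattern in exactly one position.
    """
    for candidate in (after, invert(after)):
        changed = [i for i, (x, y) in enumerate(zip(before, candidate)) if x != y]
        if len(changed) == 1:
            return siblings[changed[0]]
    return None


siblings = ["Joel", "Sharon", "Heidi", "Jon"]
print(crossover_owner("ABAB", "ABAA", siblings))  # Jon
print(crossover_owner("AABA", "ABAB", siblings))  # Joel
print(crossover_owner("ABAB", "AAAA", siblings))  # None
```

A result of None means two siblings appear to change at once, which usually points to a missed pattern in between.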

Mom Pattern spreadsheet

I would not want to leave mom out. Here is the pattern of her 4 children matching on the maternal side:

Like the Dad Pattern Spreadsheet, everything looks well behaved as there are no large gaps between patterns. Also there are no gaps at the beginning or end of Chromosome 20. So there you have it. That is the phased DNA for myself and my other three siblings. But it doesn’t jump out at you and I don’t have a map yet. That is where I bring in the MacNeill <prairielad_genealogy@hotmail.com> Spreadsheet.

MacNeill’s Excel Spreadsheet

I adjusted MacNeill’s Chromosome 1 spreadsheet by replacing default numbers for Chromosome 20. Then I added in the locations I had in the spreadsheet above. Those are the Start36 and Stop36 columns. The 36 refers to Build 36 locations which Gedmatch uses. After that I colored in the bars to be consistent with the visual phasing I had done previously.

Actually, I now see that I colored Sharon’s paternal bar backwards. She should have mostly Hartley (blue). This transposition also carried through to the next image, but I corrected it in the final image. I like having labels, so I copied this into PowerPoint and added some:

Next I add any appropriate cousin matches for Chromosome 20. I also made the sibling names on the left a little bigger. My mistake above on Sharon’s paternal bar is corrected and verified by her large paternal Hartley cousin match with Jim below.

I had to bring this back into PowerPoint to re-add the surnames. The places where the cousin matches start or stop may be crossovers for me and my siblings. From comparing the top part of the chart to the bottom, it should be obvious which crossovers are for me and my siblings and which are for the cousins. The good news is that the raw DNA phasing confirms my initial visual phasing done in January, 2016. The raw DNA phasing just filled in what I was unable to. The other good news was that there were significant cousin matches on both the paternal and maternal side of Chromosome 20 to make sure that all the grandparents were identified correctly. Since I did the original visual phasing last January 2016, I have gotten the DNA results of 2 more cousins. Also one additional cousin who previously had her match to only me at 23andme uploaded her results to Gedmatch.

Notes/Summary

  • The hard work in Raw DNA phasing is assigning all the bases of the siblings to the correct parent. Then patterns are discerned and noted.
  • The fun part is mapping out the results.
  • Raw DNA phasing and mapping is more accurate and complete than visual phasing. However, it takes a lot of work and works best when there is at least one tested parent.
  • The comparison of the raw DNA mapping to the actual cousin matches points out the fuzzy boundaries noted by others. This may be seen in Sharon’s short Lentz segment. Her cousin Judy match (who has Lentz ancestry) appears to exceed the length of Sharon’s Lentz segment.
  • Out of the four siblings, Sharon is the one who didn’t get the huge dose of Frazer ancestor matches. That means that she would be the best for looking for smaller matches at Gedmatch.com. Her smallest match is 9.3 cM (5.9 Gen) and my smallest match at Gedmatch is 10.7 cM (5.2 Gen).
  • At a glance, one can see who is the best person for finding matches with each of the four sides of the family. For example, I received a full dose of Lentz DNA on Chromosome 20. Here is my Lentz grandmother (b. 1900) in her younger days. Her DNA is represented in yellow in the charts above.

Using M MacNeill’s Raw DNA Phasing Spreadsheet and My Problem Chromosome 10

I have written many blogs about phasing my own raw DNA. One of the things that was bothering me while going through the process was the presentation of the results. It is possible to phase millions of bases using the raw DNA results from one parent and at least 3 siblings. But once the DNA is phased, how can those results be best portrayed? In my previous Blog on the subject, I was able to figure out a fairly simple way to show my results, but the outcome was not totally satisfactory.

I liked how I was able to get the grandparents’ surnames at least in the first 2 bars. I also liked how I had a simple scale at the bottom. However, one of my bars went too far. Also, my simple chart started at zero and Chromosomes start at different positions. I was able to fix the bar going too far today. Excel makes these bars based on distance rather than positions, so one of my equations was wrong.
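
The distance versus position issue is easy to trip over, so here is a small sketch of the conversion. It assumes the segments for one bar are contiguous and given as start and stop positions. The positions, labels and function name below are made up for illustration.

```python
def bar_lengths(segments):
    """Convert (start, stop, label) segments into stacked-bar lengths.

    Returns (offset, [(length, label), ...]) where offset is the blank
    space to leave before the first segment, so the bar does not have
    to start at zero.
    """
    ordered = sorted(segments)
    offset = ordered[0][0]
    return offset, [(stop - start, label) for start, stop, label in ordered]


# Hypothetical positions in millions of bases.
offset, lengths = bar_lengths([(2, 54, "Hartley"), (54, 62, "Frazer")])
print(offset)   # 2
print(lengths)  # [(52, 'Hartley'), (8, 'Frazer')]
```

Plotting the stop positions directly instead of the lengths is the kind of mistake that makes a bar run too far.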

I told M MacNeill <prairielad_genealogy@hotmail.com> of my concerns and he sent me his spreadsheet. One feature I really liked about the MacNeill Spreadsheet is that it had a place for cousin matches at the bottom. Below is the first Chromosome where I used my phased raw data from my mom and 3 other siblings to create a MacNeill Chart.

Sharon’s maternal first little segment didn’t work out perfectly, but that didn’t bother me. I know that the beginning and ends of Chromosomes can have small problematic segments. Note at the bottom that my match to Carolyn in yellow shows where my maternal crossover is in the upper part of the chart where I go from red to orange.

My Chromosome 10

I am looking at my Chromosome 10 because, for one thing, I have had trouble trying to visually phase this Chromosome in the past. Here is my attempt at visual phasing from early in 2016:

Here is another try including additional cousins that tested:

Note how different the maternal (lower) side is. I switched most of the maternal grandparents around.

Here is the MacNeill spreadsheet showing just the cousin matching part:

I have some good matches here. Blue is Hartley, green is Frazer, yellow is Lentz. Red is Rathfelder. This makes it clear that my chromosome is mapped wrong. I need more Hartley and Lentz. The above chart includes my brother who I had tested not too long ago.

Here is another try with my brother’s DNA results included:

My sister Sharon (S) has a better look now on her maternal side. I got rid of the small purple segment.

Looking At the Raw DNA Phasing – Paternal Side

I have two spreadsheets summarizing the results of the many hours of work it took to phase my family’s DNA from the raw data. One spreadsheet is for the paternal side phased DNA and the other is maternal. I have patterns for both sides. They are based on the order of my siblings: me (Joel), Sharon, Heidi and Jonathan. So an ABBB pattern would mean that Sharon, Heidi, and Jonathan all get their DNA from one grandparent, and I get mine from the other. Here is the paternal spreadsheet:

These patterns go logically one to the other. The first pattern goes from AABA to AAAA at position 2,605,158. The B changed to an A in Heidi’s position, so the crossover goes to her at that position. I have a column called GaptoNext. This is based on the number of tested SNPs between patterns. When this number is large, I suspect an AAAA pattern. That was the case above highlighted in yellow. Except there is a problem. To go from ABAB to AAAA means 2 changes, and there should only be one change (or crossover) at a time. This caused me to look at the bases.
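
This kind of check can be run over a whole column of patterns. The sketch below flags any step where the number of changes is not exactly one, allowing for the A’s and B’s being inverted. The middle of the example sequence is made up; only the AABA to AAAA step and the ABAB to AAAA step come from my spreadsheet. The function names are hypothetical.

```python
def changes_between(before, after):
    """Fewest sibling changes between two patterns, allowing A/B inversion."""
    diff = sum(x != y for x, y in zip(before, after))
    return min(diff, len(before) - diff)


def find_suspect_steps(patterns):
    """Return (index, before, after) for each step that is not a single change."""
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(patterns, patterns[1:]))
        if changes_between(a, b) != 1
    ]


# Partly hypothetical sequence ending in the problem step ABAB -> AAAA.
print(find_suspect_steps(["AABA", "AAAA", "AAAB", "ABAB", "AAAA"]))
# [(3, 'ABAB', 'AAAA')]

# With an AABA pattern in between, every step is a single change.
print(find_suspect_steps(["AABA", "AAAA", "AAAB", "ABAB", "AABA", "AAAA"]))
# []
```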

A Paternal pattern missed

Here is what I found.

I had missed an AABA pattern at Build 36 Position 30,683,878. I took another look by setting my MS Access query so that Sharon and Heidi would have a different base from Dad:

This shows that there is a change from ABAB to AABA even sooner than I thought, between ID 400008 and 400045. This is an ID I created that sequentially numbers the tested SNPs. You can see another way I missed this pattern, because I didn’t fill in the missing bases. TTC? should be TTCT. CCT? should be CCTC.
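
Filling in a missing base from a known pattern can be sketched like this. I am assuming that the four characters are the bases received from Dad for Joel, Sharon, Heidi and Jon in that order, and that the pattern for the area is already known to be AABA. The function name is hypothetical.

```python
def fill_missing(bases, pattern, missing="?"):
    """Fill missing bases using siblings that share the same pattern letter."""
    known = {}
    for base, letter in zip(bases, pattern):
        if base != missing:
            known.setdefault(letter, base)
    return "".join(
        known.get(letter, missing) if base == missing else base
        for base, letter in zip(bases, pattern)
    )


print(fill_missing("TTC?", "AABA"))  # TTCT
print(fill_missing("CCT?", "AABA"))  # CCTC
```

In an AABA area Jon has the same letter as Joel and Sharon, so his missing base is copied from theirs. If no sibling with the same letter has a known base, the position is left as missing.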

What does the missing pattern represent?

The pattern of ABAB to AABA is actually my crossover (Joel). It is a bit more difficult to see than the others. That is because the ABAB pattern is the same as BABA. The change of BABA to AABA is my change of the first B to the first A. Naturally, I put myself in the first position. In rough terms, that gives me a paternal crossover at about position 30.5M. This is a good location as it does not interfere with a large match that I have with an unknown paternal DNA relative named Shamus:

Here is my corrected Dad Pattern for Chromosome 10:

I have gone from 6 to 8 crossovers as the previous correction led to another one. I also took out one of Heidi’s crossovers that I had wrongly identified. So fixing one problem fixed a lot of others. It helps to describe the start and stop of each pattern and to describe each crossover. The important results are the person and the last Position column. These show who the crossover belongs to and where that crossover occurs on the chromosome. I then entered the paternal crossover results into the MacNeill Spreadsheet and got this:

I took out the large space between the siblings. The problem is that the space is now the same as between the maternal and paternal phased part for each sibling. Excel has no happy medium that I’ve found.

The blue is Hartley and green is Frazer. The raw phasing in the upper part of the chart matches with the cousin matches below. It is interesting that some of the cousin matches define the crossovers. For example, the Jim to Sharon match gives Sharon’s crossover. Also the Paul to Sharon match gives Sharon’s other crossover. The Paul to Jonathan match gives Jon’s first crossover.

The Maternal Side

Hopefully resolving the maternal phasing will be easier than the paternal side. My visual phasing only showed four crossovers. Here is my unfinished spreadsheet showing 5 crossovers (under the Person column):

Here, it looks like I already added an AAAA pattern to the end. That was because the AABA pattern ended at about 114M and the Chromosome itself ends at about 135M. My GapstoNext column showed that gap as almost 20,000 SNPs. My question now is: should I add an AAAA pattern to the beginning also? Perhaps. An AAAA pattern means that 4 siblings match and all got their DNA in that area from their maternal (in this case) grandmother. Those results were consistent with how I had the visual phasing done. In fact, the visual phasing indicated that the 4 siblings should all get their maternal DNA from the Lentz side up until about 60M. Let’s take a closer look. This gets at my first note above in the spreadsheet image. There were only 3 single SNPs showing the AAAB pattern and they were spaced a long way apart – over 10 Megabases each. In this case, I will disregard those 3 widely spaced patterns as some type of mistake and stay with the AAAA pattern. Once I made the change from the AAAA pattern to the AAAB pattern, that brings us up to about 60M for my (Joel’s) first crossover. That seems to fit well. That leaves us with 4 crossovers – one per sibling as opposed to the two per sibling on the paternal side.

First I’ll compact the Gedmatch browser results, then show the raw DNA Phasing results on the MacNeill Chart:

gedmatchcheckofrawphase

chr10phasemap

When I compare the results, I see a problem I had with the visual phasing. The next to the last crossover looked to belong to Sharon, but instead it belonged to Heidi. Also, Jon’s second paternal crossover should have been marked as an “F” above. That was just a typo. The third J for Joel crossover that I had above was not a crossover. In the middle, the 2 close crossovers of J and S should instead be S and J if I’m reading the MacNeill Chart correctly. It looks like all the FIRs and HIRs, etc. match. Once I did the raw DNA phasing, it was obvious how the Gedmatch browser results had to match the raw DNA phasing results. Before I did the raw DNA phasing, it was not so obvious.

I’m happy with the results. I get to pick whatever colors I want for the four grandparents. It still would be nice to have some sort of labels or color key. After a hard day of phasing DNA, it is rewarding to see the results displayed so nicely. Thank you Mr. MacNeill.

A few observations:

  • The 4 siblings did not inherit any Rathfelder DNA (brown) on the left side of Chromosome 10
  • Lentz DNA (yellow) is missing from the right side of the Chromosome for the same 4 siblings
  • As I have my mother’s DNA results, that would make up for the missing DNA from those 2 maternal grandparents
  • Short segments of Hartley DNA (blue) are missing near the beginning and near the end of the Chromosome (i.e. none of the four siblings inherited Hartley grandfather DNA in those areas).
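Observations like these can also be checked with a few lines of Python once the segments are in a table. The sketch below is hypothetical: the function name `uncovered`, the data layout, and the crossover positions (in megabases) are invented for illustration. Only the rough 135M length of Chromosome 10 comes from the discussion above.

```python
def uncovered(segments_by_sibling, grandparent, chrom_end):
    """Return (start, end) regions where no sibling has this grandparent's DNA.

    segments_by_sibling: dict of sibling name -> list of
                         (start, end, grandparent) tuples.
    """
    # Collect every interval that any sibling inherited from this grandparent.
    intervals = sorted(
        (start, end)
        for segs in segments_by_sibling.values()
        for start, end, gp in segs
        if gp == grandparent
    )
    gaps = []
    cursor = 0  # end of the region covered so far
    for start, end in intervals:
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < chrom_end:
        gaps.append((cursor, chrom_end))
    return gaps


# Illustrative maternal segments in Mb (not my actual crossover positions).
maternal = {
    "Joel":   [(0, 60, "Lentz"), (60, 135, "Rathfelder")],
    "Sharon": [(0, 80, "Lentz"), (80, 135, "Rathfelder")],
    "Heidi":  [(0, 100, "Lentz"), (100, 135, "Rathfelder")],
    "Jon":    [(0, 114, "Lentz"), (114, 135, "Rathfelder")],
}
rathfelder_gaps = uncovered(maternal, "Rathfelder", 135)
lentz_gaps = uncovered(maternal, "Lentz", 135)
print("No Rathfelder DNA in:", rathfelder_gaps)
print("No Lentz DNA in:", lentz_gaps)
```

Any region this reports is a stretch where a match on that grandparent's line could not show up for any of the four siblings.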

Summary

  • M MacNeill has the best display that I am aware of for mapping phased DNA.
  • The final mapping is like the final exam where previous mistakes are brought out, but there is a chance to correct them.
  • The phasing process is difficult, but there are built in checks and balances to find and correct mistakes or missed patterns.
  • The raw DNA phasing procedure (I use the Athey method) would generally be used if a parent has been tested and the visual one is used if a parent has not been tested. However, the visual phasing as developed by Kathy Johnston is important to use as a framework for the raw DNA phasing as well as a check for the end result.
  • The raw DNA phasing results appear to be better than what I was able to get using the visual phasing. Not because the visual phasing method is bad; more because I have not mastered it.
  • If you are using someone else’s spreadsheets, it is a good idea to know how they work in case anything goes wrong.
  • After writing many blogs on visual and raw data DNA phasing, it is nice to see everything come together using the MacNeill Spreadsheets and Charts.

DNA Phasing of Raw DNA When One Sibling is Missing: Part 10

In this Blog, I would like to portray my phasing results in an Excel Bar Chart if possible. This has been one of the most difficult parts of phasing my DNA for me.

I have looked at Stacked Bar Charts in Excel as they seem to be the closest to what I am looking for. Today I looked at a method for producing Gantt Charts at ablebits.com which seems to hold some promise of application for DNA mapping:

bar-chart-excel

I had my Maternal Patterns’ Starts and Stops from my last blog. I took those and converted them to Build 36 and put them in a spreadsheet:

momcrossoverstable

Start is the ID# I was using. Start36 is the Chromosome position of the Start of the pattern in Build 36. App ID is the approximate position of the Crossover. Then I have that same location in Build 37 and Build 36. Following the logic in the Ablebits.com tutorial, I have the first Maternal Crossovers for Chromosome 7 in my simplified Chart:

matfirstxover7

I got this by choosing the Build 36 column and choosing Insert Stacked Bar. I suppose a better Title would have been Chromosome 7 Maternal Crossover rather than Build 36. This was taken from my Column Header. The goal is to get a 2 color bar above. However, I already see a problem. The bar needs to be different colors for different people. Well, I have to start somewhere.

Next, I put in the next crossover location for each person. I took this position and subtracted from it the first Crossover to get a length.

step2crossexcel
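The subtraction step is the heart of the stacked bar approach: the chart wants lengths, not positions. Here is a minimal Python sketch of the same arithmetic. The function name, the crossover positions, and the chromosome end used here are assumptions for illustration, not values from my spreadsheet.

```python
def segment_lengths(crossovers, chrom_end):
    """Turn crossover positions into the lengths a stacked bar chart needs.

    crossovers: positions where the grandparent changes.
    chrom_end: last position on the chromosome.
    """
    # Edges run from the start of the chromosome, through each crossover,
    # to the end. Each length is one edge minus the edge before it.
    edges = [0] + sorted(crossovers) + [chrom_end]
    return [b - a for a, b in zip(edges, edges[1:])]


# Made-up crossover positions and an approximate chromosome end.
lengths = segment_lengths([30_000_000, 75_000_000, 110_000_000], 158_000_000)
print(lengths)
```

Because every length is a difference of neighboring edges, the lengths always add back up to the full chromosome, which makes a handy check against typos.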

You may note that the Bar Chart inverts the original order. It gives Sharon a 4 which is now on top. Here is my visual phasing of Chromosome 7 that I am trying to replicate:

chr7visphase

My Excel Bar Chart order is Sharon, Jon, Joel, Heidi. My visual phasing order is Sharon, Joel, Heidi, Jon. The 2 maternal colors I have above are green and orange representing Lentz and Rathfelder. If I keep orange as Rathfelder, that means I want to change bar 2 and 3 (Joel and Jon) on the Excel Bar Chart. One way to do this is to move over the first Crossovers for Joel and Jon in my spreadsheet:

modchart

However, that made the 2 male siblings’ first maternal grandparent match too long. I needed to move the start over 2 places in my spreadsheet:

mat7revised

Now the Chr7 Maternal Crossover column can be called Lentz and the 2length column can be called Rathfelder.

Next, I added another column for the next Lentz portion of DNA:

chr73rdxover

I was hoping that if I named the next column Lentz, Excel would give me the same blue as the first Lentz. It did not, but I was able to right click on the gray and change it to blue. I then added another Rathfelder segment. For this to work in Excel, a Rathfelder length is added rather than a start and stop location.

chr7xover3
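The column layout I ended up with, alternating Lentz and Rathfelder lengths with the two brothers shifted over so that they start in a Rathfelder column, can be sketched in Python as follows. The function names and all positions (in Mb) are hypothetical. The sketch assumes that a sibling who starts with the second grandparent simply gets a zero-length first column.

```python
def stacked_row(first_grandparent, crossovers, chrom_end,
                order=("Lentz", "Rathfelder")):
    """Build one sibling's row of alternating segment lengths.

    Columns alternate order[0], order[1], order[0], ... as in a stacked bar.
    A sibling whose chromosome starts with order[1] gets a zero-length
    first column so the colors stay lined up with the column headers.
    """
    if first_grandparent not in order:
        raise ValueError(f"unknown grandparent: {first_grandparent}")
    edges = [0] + sorted(crossovers) + [chrom_end]
    lengths = [b - a for a, b in zip(edges, edges[1:])]
    if first_grandparent == order[1]:
        lengths = [0] + lengths
    return lengths


def pad_rows(rows):
    """Pad shorter rows with zeros so every sibling has the same columns."""
    width = max(len(row) for row in rows.values())
    return {name: row + [0] * (width - len(row)) for name, row in rows.items()}


# Made-up crossovers in Mb. Joel and Jon start on the Rathfelder side.
rows = pad_rows({
    "Sharon": stacked_row("Lentz", [40, 100], 158),
    "Joel": stacked_row("Rathfelder", [55], 158),
    "Jon": stacked_row("Rathfelder", [20, 70, 90, 130], 158),
})
for name, row in rows.items():
    print(name, row)
```

Even-numbered columns (counting from zero) are always Lentz and odd-numbered columns are always Rathfelder, so each column only needs its color set once.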

Again, I had to reformat the Excel-chosen color to be consistent with what I had for Rathfelder. I chose the last position for Heidi and Sharon as the highest that I had as this was their last segment. After a bit of wrangling with Excel, I was able to get this:

chr7

So that is the presentation. However, I notice that on my visual phasing, I had 5 segments for Jon and only 4 here. I missed his last Rathfelder segment. I had ended Jon’s Chromosome too early. Here is the correction:

chr7corrected

It still looks like one of Jon’s crossovers in the middle of the Chromosome may be off, but I’ll have to figure that out later.

Paternal Bar Chart

Now that I have something that looks like a maternal Chromosome Map, I need the paternal side to go along with it. It looks like if I add 4 more rows to my spreadsheet, I may have it.

I did this and I added Hartley and Frazer (my paternal side grandparents) to the right of the maternal side grandparents. I had to make a new chart that came out like this:

chr7matpat

Here #4 is my Paternal DNA. I found it a bit disconcerting that my paternal side was longer than the maternal. Here I’ve added a bit of formatting and made the colors consistent (one color per grandparent):

chr7patmatmap

Well, I guess I’ll just leave this imperfect. It will give me something to work on later. I did change the scale from millions to M’s to be easier to read. The above shows that Jon and Heidi share their paternal grandfather’s Hartley DNA un-recombined on Chromosome 7.

Summary and Conclusions

  • Learning how to phase my raw DNA has been interesting and time-consuming
  • Delving into the A’s, G’s, T’s and C’s promotes understanding of one’s DNA
  • I owe a lot to M MacNeill and Whit Athey in learning how to do this phasing
  • Due to the data intensive nature of phasing, I would recommend the use of MS Access or some other database software.
  • An understanding of Excel or similar spreadsheet software is also important.
  • I had tested my brother Jon as an afterthought. It turned out that his test results were important in determining the phasing of the 4 siblings.
  • I have the overall skeleton of the phasing with crossovers. There is still a lot of work to complete the individual Chromosomes and troubleshoot problem areas.
  • Further, I have not worked on the X Chromosome due to the different nature of that Chromosome. My brother and I are already phased. My sisters are not.
  • Once these maps are done they will be a reference to all matches to my 3 siblings and myself.