AutoCluster Endogamy tool at GEDmatch.com (Part 3)

I am continuing with GEDmatch’s AutoCluster Endo tool as I play around with the settings.  I kept adjusting the settings for my mother’s kit and could not find the right ones.  Basically, I could not produce any clusters using the settings expected for non-endogamous populations.

What I ended up doing was adjusting the Min average segment cM lower and lower (from 15cM down to 9cM) each time until I got it right at that threshold where I knew it would produce the larger (endogamous) cluster.

I repeated the same thing under Shared Match Filter by selecting 9cM for Min average segment cM (the screenshot shows 10cM, which is what I used for myself).

So with my mother, it seems that she currently does not have any decent size matches to cluster.  I would have noticed that since I normally would sort her matches (as well as my own) by Largest Segment size.

Interestingly with my own matches, I noticed something unique and unexpected and I know it had to do with the Shared Match Filtering which I will go over in a bit.

But first I want to stress the importance of knowing what the Min average segment cM should be and why.  Again, for the past 11 years I have been trying to weed out all endogamous matches, and I have noticed that the largest segment size for many of the matches rarely exceeds 12cM.  I noticed this at FTDNA, 23andMe and at GEDmatch.

I was able to go through my Ancestry matches, as well as those of my mother and one of my cousins, and have seen what that average size looks like.  So let’s take a closer look at that, using Ancestry matches as an example.

Whether it is in the predicted 2nd cousin range, 3rd cousin, 4th cousin, or distant cousin, the average size you will see (based on the total shared cM divided by the number of segments) is around 8cM.  I identified my endogamous matches in the closer (2C range), but never completed the entire lengthy list going down to the 4C range except for the known relationships where it is highlighted in yellow.

I highlighted an actual 2C1R who falls in that predicted 4C range (remember that at the 3C level, you will not match about 10% of them) and whose average segment size is 8cM.  While those numbers (total shared divided by #segments) seem to be similar to the other matches in that range, the longest segment size is more than 20cM.  That is something that we rarely get, at least with this closely predicted range.  This relative does have a few other lines or branches that are of the same endogamous background, which explains why there are many segments.

Now take a look at one of my cousin’s average size segments even with their endogamous matches predicted to be in the 1st cousin range.

That average segment size is still 8cM due to the number of segments for that given total shared cM.  These are a cousin’s matches so I did not take the time to highlight and identify their known relatives, but it should be obvious which ones are the true, close relatives.

Okay, now that the average segment size is defined and we have seen why that number is what it is, you have an idea of what number to use in this tool (assuming that your average segment size is also below 12cM) and what amounts would remove all the endogamous matches.

These are the results for my matches utilizing the 10cM average size segment.  It’s not too small, although I could have easily made that slightly larger, like 15cM.

(Sidenote, you can easily zoom in & out of the cluster.  So having a lot of matches in a cluster and trying to reduce it is a plus!)

So it divided my grandfather’s matches (blue) from my grandmother’s matches (orange).  But if I took a careful look at just my grandfather’s side, this is what I noticed.

Focusing on the blue cluster first, I have a pair of 2nd cousins who are siblings (indicated in white) and another 2nd cousin and her 1st cousin 1x removed (my 2C1R) in the blue cluster which is my grandfather’s side.  The two white 2C’s grandmother, and the other 2C’s grandmother (the 2C1R’s great-grandmother) were sisters to my grandfather.

Then we have the 2C1R in red whose grandfather was a brother to my grandfather’s mother.  The (yellow) 2C2R was a 2C to my grandfather.  But here is a clearer picture of how each match is connected to me and each other.

[“k” is kāne or male, “w” is wahine or female]

So going back to that cluster (above), you can see the problem: the 2C2R on my grandfather’s father’s side is matching the 2C1R on my grandfather’s mother’s side.

Ideally, the clusters should be separated by pairs of ancestors.


Looking at the details of the two clusters, I can see why it was able to separate my grandmother’s matches from my grandfather’s.  Remember, I normally do not get any separation when it comes to my maternal side.  Even with my paternal side being a different population, I have paternal relatives whose other side belongs to the same endogamous (Kanaka Maoli) population and can generate gray marks indicating that they are part of more than a single cluster.  Some other endogamous populations will have a lot of these, and you might too with your matches.

Only 2 other matches make up the second cluster: my 2C and a 3C1R to both me and my 2C.  I share with the 2C a total of 241cM across 13 segments, with the largest segment being 40cM.  With my 3C1R I share a total of 148cM across 10 segments, with the largest segment being 40cM.  Comparing those two cousins with each other, they share a total of 108cM across 10 segments, with the largest segment being 26cM.
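
To make the average segment size concrete, here is a minimal Python sketch that divides the total shared cM by the number of segments for the three comparisons in this second cluster.  The function name and the labels are just for illustration and are not anything GEDmatch provides; the numbers are the ones quoted above.

```python
def average_segment_cm(total_cm: float, segments: int) -> float:
    """Average segment size: total shared cM divided by the number of segments."""
    if segments <= 0:
        raise ValueError("segments must be positive")
    return total_cm / segments


# (total shared cM, number of segments) for the three comparisons in the second cluster
comparisons = {
    "me vs 2C": (241, 13),
    "me vs 3C1R": (148, 10),
    "2C vs 3C1R": (108, 10),
}

# Round to one decimal place for readability
averages = {
    label: round(average_segment_cm(total, count), 1)
    for label, (total, count) in comparisons.items()
}

for label, avg in averages.items():
    print(f"{label}: {avg} cM")
```

All three averages come out above the 10cM Min average segment cM that I used for my own kit, and well above the roughly 8cM that is typical of the endogamous matches.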

Since I selected 20cM for the minimum largest segment, it pulled up these actual relatives of mine, all of whom have the largest segment size as small as 29cM, and as large as 41cM.  These high settings helped remove the endogamous matches.  Not only that, it was able to at least separate my grandmother’s side from my grandfather’s side.  What is important to know is that at GEDmatch I do not have any close enough relatives on my grandmother’s father’s side, only on her mother’s side.  Maybe if I had a few close relatives, we could have seen how my grandmother’s father’s side would mix with her mother’s side as well?  Who knows.  It is just amazing to me that this tool, given the opportunity to adjust these parameters could help break the matches into actual clusters.  I am speaking from an endogamous perspective and how we have to deal with the high amount of closely predicted shared matches.
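
The way these two settings work together can be sketched roughly in Python.  This is not GEDmatch’s actual code; it assumes a match has to meet both the minimum largest segment and the minimum average segment to be kept.  The `Match` class and `filter_primary` function are hypothetical names, and the third match is made up to stand in for a typical endogamous match.

```python
from dataclasses import dataclass


@dataclass
class Match:
    name: str
    total_cm: float
    segments: int
    largest_cm: float

    @property
    def average_cm(self) -> float:
        # Total shared cM divided by the number of segments
        return self.total_cm / self.segments


def filter_primary(matches, min_largest_cm=20.0, min_average_cm=10.0):
    """Keep matches that meet both thresholds (an assumption about how the filters combine)."""
    return [
        m for m in matches
        if m.largest_cm >= min_largest_cm and m.average_cm >= min_average_cm
    ]


matches = [
    Match("2C", 241, 13, 40),    # numbers from the second cluster described above
    Match("3C1R", 148, 10, 40),  # numbers from the second cluster described above
    Match("hypothetical endogamous match", 150, 19, 12),  # made-up example
]

kept = [m.name for m in filter_primary(matches)]
print(kept)
```

With these numbers the made-up endogamous match drops out on both counts (largest segment of 12cM and an average just under 8cM), while the two actual relatives stay in.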

AutoCluster Endogamy tool at GEDmatch.com (Part 2)

In my last blog entry, AutoCluster Endogamy tool at GEDmatch.com (Part 1), I briefly covered the settings that are adjustable when you are ready to produce clusters and what is suggested for Polynesians.  I also mentioned Leah Larkin referring to different levels or degrees of endogamy based on the average segment size and, given that size, what works best or how to approach your DNA matches.  I had to play around with it quite a bit in order to get decent clusters.

First, it helps to understand the high settings put in place when you select “Highly Endogamous.”

So the minimum largest segment is 30cM, which is what I’ve been promoting for nearly 11 years.  This was based on my own observation when I tested a second 1C1R and could compare that 1C1R with another 1C1R who had tested a few months prior; the two are 2C to each other.  We have a lot of predicted 2nd to 3rd cousin matches (100cM – 300cM) where the largest segment size rarely would exceed 20cM.  With these two 1C1R who are 2C to each other, I noticed that their largest segment was 41cM.

It wouldn’t be till about 2 or 3 years later that I heard others following that same reasoning and seeing the significance of the largest segment size as an indicator of a closer ancestral connection versus an endogamous one.  This was specific to those of Ashkenazi Jewish background, who seemed to be set on 20cM.  By this time, I had already determined that 30cM would be best.  Also, with other Polynesians who share a true 2nd to 3rd cousin relationship, their largest segment would be larger than 20cM.  And among all the endogamous matches, it rarely would exceed 20cM.

So that was the reason why I determined that 30cM was a good amount to be used in the Min largest segment cM.  And while this blog entry is specific for this new AutoClustering tool for endogamy at GEDmatch, I have noticed that at MyHeritage, even with the endogamous matches that the largest segment size could exceed 30cM. However, what would also be indicative of an endogamous match vs. a truly close 2nd to 3rd cousin match, is the number of segments.

Taking a closer look, right next to the minimum largest segment size is the number of largest segments.  I thought this was interesting and am not sure if it’s necessary or not.

 

I am assuming that the minimum largest segment and the number of largest segments work together: the first sets the smallest size that counts as a “largest” segment, and the second sets how many segments of at least that size a match must have.  In other words, if you select 100cM as the min largest segment size and 10 as the number of largest segments, it would require that you have at least 10 segments no smaller than 100cM.

You rarely would get a largest segment of the same size, or at least not that I have seen in both endogamous and non-endogamous matches.  After all, these direct to consumer DNA testing companies are showing you the size of the largest segment that you have among all of the matching segments that you share.  This is probably why I initially was not generating any matches/clusters simply because I had it set to 2.  So my suggestion is to change it to 1.
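
Here is a small Python sketch of how that setting might behave under my assumption above, namely that a match needs at least the selected number of segments at or above the minimum largest segment size.  This is only my reading of the setting, not confirmed behavior of the tool.  The function name and the list of segment sizes are hypothetical (only the 41cM comes from the example earlier in this post).

```python
def passes_largest_segment_rule(segment_cms, min_largest_cm, required_count):
    """True if at least `required_count` segments are `min_largest_cm` or larger.

    This mirrors an assumed interpretation of the two settings,
    not documented GEDmatch behavior.
    """
    qualifying = [cm for cm in segment_cms if cm >= min_largest_cm]
    return len(qualifying) >= required_count


# Hypothetical match: one large segment plus several small ones
segments = [41.0, 18.2, 12.5, 9.8, 8.1]

with_one = passes_largest_segment_rule(segments, min_largest_cm=30, required_count=1)
with_two = passes_largest_segment_rule(segments, min_largest_cm=30, required_count=2)

print(with_one)  # True: one segment is at least 30cM
print(with_two)  # False: there is no second segment of at least 30cM
```

If the tool works this way, it would explain why a setting of 2 produced nothing for me while a setting of 1 did.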

So all of those parameters are allowed under the Primary Match Filtering section.  Then you have the Shared Match Filtering section which is nearly identical to the Primary Match Filtering section except you also have a minimum shared cM between shared matches which is what you also see when you run an autocluster with MyHeritage or Genetic Affairs directly.

With this parameter, you can tell it how much your DNA matches must share with each other to be considered for a cluster.  What I did was set it as low as 100cM since I have hundreds of matches from 100cM up to 200cM.  My advice is that if you’re not admixed, or rather you have fewer foreign branches, definitely set that higher than 100cM.  It was easier for me to guess the numbers to use since I know how many matches I have, and out of these varying ranges of shared DNA, how many matches I would have for each.
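
As a rough illustration of what this parameter does, the Python sketch below keeps only the pairs of shared matches whose shared cM with each other meets the minimum.  The match names and cM values are made up, and the function is a hypothetical stand-in for the idea, not the tool’s own logic.

```python
def shared_match_links(pairwise_cm, min_shared_cm=100.0):
    """Return the pairs of matches that share at least `min_shared_cm` with each other."""
    return sorted(pair for pair, cm in pairwise_cm.items() if cm >= min_shared_cm)


# Hypothetical shared cM between pairs of my matches
pairwise_cm = {
    ("A", "B"): 108.0,
    ("A", "C"): 64.0,
    ("B", "C"): 131.0,
}

links = shared_match_links(pairwise_cm)
print(links)
```

Raising the minimum to something like 120cM would leave only the B and C pair, which is the kind of tightening I would suggest for someone with fewer foreign branches.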

For example, I have hundreds of matches predicted to be 2nd cousins (Ancestry).  That is, I have hundreds of matches sharing as low as 200cM and as high as 649cM.  In the predicted 3rd cousin range I have over a thousand of these types of matches, which range from 90cM to 199cM.  And for predicted 4th cousins, I have more than 24,000.  These range as low as 20cM and as high as 89cM.

In my previous post I showed an example of what my autocluster looked like from MyHeritage and that I sorted it (by total shared cM) from the lowest to the highest.  The lowest was 108cM, and from there it slowly went up.  I had 12 matches sharing 108cM.   I also had 12 matches sharing 109cM.  The number of matches sharing about the same amount can be a lot.  So understanding this will help you decide the best numbers or amounts to use when creating your clusters.

I am hoping that others from various endogamous groups start utilizing this new tool and am really curious how it will affect their research, expecting it to be for the better!  Since I am still trying to generate various clusters by constantly adding in varying numbers, I will not be posting any examples of what they look like.  Perhaps in a future blog post I will.

I also noticed that among the files you get when generating these autoclusters at GEDmatch are csv files to be used in Gephi.  I posted samples of that in my post from December 2022 called In-common-with, shared matches, and clustering.  I will have to take time to try using these actual clusters and see how Gephi renders them.

AutoCluster Endogamy tool at GEDmatch.com (Part 1)

Evert-Jan Blom of Genetic Affairs developed a new AutoCluster Endogamy tool on GEDmatch together with Jarrett Ross of GeneaVlogger.  Introduced as AutoCluster Endo (AutoCluster Endogamy when you see it on GEDmatch), it is a modified version of the AutoCluster clustering tool designed specifically for those dealing with endogamous matches.  It was created to analyze endogamous matches more efficiently by filtering for the most relevant (shared) matches.

Thanks to Jarrett Ross for bringing up specific features he mentions in his video.  It allows you to filter your primary matches by adjusting the average segment size, minimum largest segments, and number of largest segments.  It also allows you to filter by your shared matches using the same filters as for primary matches and in addition, the total amount of shared cM between shared matches.

When I used to run the AutoCluster tool at MyHeritage, I noticed people would post their examples mentioning how endogamous their matches were or how burdensome and problematic it was to deal with them.  I also noticed a marked difference between their clusters and my own.  For one, they had more than one cluster.  I initially only had a single cluster until I uploaded one of my 1C1R, with whom I do not share as much DNA (as expected, I guess, for someone of that relationship), and that was enough for this tool to pick up.  This cousin of mine appeared in my second cluster with other relatives on my paternal (non-Polynesian) side and he also produced gray squares matching several matches in the first/large cluster.

With my AutoCluster, I emphasized for others to take note that the minimum threshold implemented was not 20cM or 30cM like many others that I remember seeing.  Mine was significantly higher.

I also sorted my match list to show the lowest amount at the top, sharing 108.1cM, so the 26 matches I decided to show only range from 108cM to 110cM.  Of course there are 470 other matches that comprise that large cluster.

I kept pointing this out to others, how our minimum threshold will vary across different populations, depending on the amount of shared DNA we have with our matches, the number of matches, etc.  There is a bit more freedom with utilizing Genetic Affairs directly.

With this AutoCluster Endogamy tool at GEDmatch, you can do quite a bit.  This tool is offered to Tier 1 subscribers (Tier 1 pay-as-you-go membership $15 per month and Recurring monthly Tier 1 memberships $10 per month) only.

The first thing you will notice is that you have the option to select the level of your endogamy or how endogamous you are.

The default is set to “Not Endogamous.”  I only tried the “Endogamous” option to see the difference from “Highly Endogamous” (Polynesians should be using “Highly Endogamous”), and noticed that the parameters were set higher, to numbers that are very familiar to me.

Leah Larkin (The DNA Geek) has shown in her presentations charts of various endogamous populations and to what degree of endogamy each has to deal with.  This is where I first saw how she utilized the average size segment to quantify endogamy, how to gauge how much endogamy you are really dealing with.

She took the amount of shared DNA for Close Relatives (Ancestry), predicted First Cousins, Second Cousins, Third Cousins, Fourth Cousins and Distant Cousins, divided by the number of segments to come up with the average size segment.  What was presented were various sizes present in specific endogamous populations.  She had mild, moderate and strong endogamy.  These were average size segments present in specific (predicted) relationships, i.e. 1C, 2C, 3C, etc.

In her comparison, the population with the smallest average segment size was Polynesians.  She also separated them to show what Western Polynesians had compared to Eastern Polynesians.  She has confirmed (although many of us probably noticed this already) how endogamous, or extremely (“Highly” is the term used for this AutoCluster Endogamy tool at GEDmatch) endogamous Polynesians are.  This could not be done without the help of others submitting their samples to Leah for analysis.  I was able to submit one Samoan and two Kanaka Maoli samples to her to utilize.  And the results were worth it!

Having said all of that, do know that Polynesians should automatically select “Highly Endogamous.”  This seems to raise the Min average segment cM and other parameters.  This image below is an example of what it looks like when you do not select anything and keep it at the default “Not Endogamous.”

Even with “Not Endogamous” you can still adjust the settings to your liking.

So below are the settings that you would automatically see when selecting “Highly Endogamous.”

It is important to note, based on what I have seen others post with their own comparison and my 11 years of noticing the largest segment size among Polynesians and known relationships, that the Min largest segment cM selected for 30cM is a good minimum amount to use.  This is what you would expect around the 2nd Cousin level.

I have at Ancestry and MyHeritage (as do other relatives of mine) endogamous matches whose largest segment exceeds 30cM yet what helped distinguish it from a true close relative versus an endogamous one is how they still have a significantly high amount of segments.

Below is a table of all of my matches (Ancestry) and I have highlighted my known relatives.  The ones not highlighted are the endogamous matches.

You can clearly see how with my known (highlighted) 2C, 2C1R, and 3C1R relatives (Predicted as Second Cousin) the number of segments isn’t always as high.  The ones where it is have little to no non-Polynesian lines, which means more Hawaiian branches that are coming up as matches to me.  But the largest segment is coming from our most recent common ancestor.  Notice that for the New Zealand Maori and Kanaka Maoli matches the number of segments is really high.

For comparison, this (table below) is a cousin of mine.  Although I did not indicate the true close relatives, it should be obvious based on the high amount of segments plus the average segment size which ones are truly close relatives.

For the past 11 years, this is what I have been noticing: it was not common to see DNA matches among Polynesians (mainly Kanaka Maoli and NZ Maori) whose largest segment size exceeded 20cM.  Utilizing the average size (taking the total shared cM divided by the number of segments), we see 7cM and 8cM to be the norm both in my cousin’s predicted First Cousin matches and my predicted Second Cousin matches.  It is pretty common even when looking at the 3rd Cousin, 4th Cousin, and Distant cousin matches.

So this is why we get the type of results you would see with autoclustering, and why we need to be able to adjust these parameters in order to find the best matches (true close relatives) to be used in clustering.

So now we have an understanding of what to expect among Polynesian DNA matches as far as the average size segment, the number of segments (to help get the average size segment), and the largest segment size.  In my next blog entry, I will address the results of running this tool and how adjusting these may or may not be as useful.

One thing to note is that various companies will use the longest block (FTDNA), longest segment (Ancestry), and largest segment (MyHeritage & GEDmatch) for the same thing.  I may use these terms interchangeably, but for this particular GEDmatch tool, I’ll only refer to it as largest segment.

 

Ancestry updates their ethnicity yet again

As of November 13, 2019, everyone’s AncestryDNA results were updated.  Back in late October, only a few people, along with all new testees, had been getting the new update.  Now we are all on the same page.

They made several changes, which include increasing the number of genetic communities for various populations, increasing the size of their reference samples, renaming categories and adding a few new categories such as Guam, Samoa and Tonga.

We are going to concentrate on Samoa and Tonga, which they attempted to split off from the rest of Polynesia.

When AncestryDNA created the Polynesia category back in December 2013, it only consisted of 18 Polynesian samples which included at least one (or possibly more) of the samples that have distant European ancestry.  They updated their category and rolled out the new update to everyone back on September 12, 2018 with an additional 40 more samples increasing to a total of 58 for Polynesia.

In June and December 2018, I had the opportunity to speak to David Turissini, Ph.D, who is a population geneticist at AncestryDNA.  I expressed my concerns to him regarding more specific categories among Polynesians, basically splitting eastern from western Polynesia.  I also explained why I thought that would be much better for us, particularly for matching, as we all tend to match each other at a very closely predicted relationship.  And I thought the low number of reference samples could possibly affect the way we get our results.

He told me that I already understood how Polynesians lack genetic diversity so increasing the number of samples would not make any difference.  But then I pointed out how it was not that difficult for me to distinguish a western Polynesian (Samoan, Tongan, Tokelau, Tuvalu) versus an eastern Polynesian (Maori, Tahitian, Cook Island Maori, Hawaiian, Marquesan, Rapa Nui).

Despite all that was said, I was surprised to see how they increased the number of reference samples for Polynesia along with adding in Samoa and Tonga.

New categories & increase of samples for Polynesia

You can read more about it here:

https://www.ancestry.com/cs/dna-help/ethnicity/estimates

So their reference panel of 16,638 samples has increased by 23,379 to a total of 40,017.  Of that amount, they added 130 more samples to the Polynesia category and created Samoa with 73 samples and Tonga with 97.

While I have not noticed a lot of Tongan results yet, I have seen several Samoans.  Most of the ethnicity results I have seen are either Hawaiians or Maoris.  For the most part, eastern Polynesians are getting either Samoa and/or Tonga in the range of 1% – 4%.  For Samoans, I’ve seen about 60% – 70% Samoa and the rest Tonga.  A few Cook Island Maoris seem to have a higher percentage of Samoa compared to other eastern Polynesians but that may be due to the fact that they have ties to Aitutaki or its neighboring islands versus Rarotonga.  Or maybe Cook Island Maoris just have a higher percentage because of another group of people that settled earlier and/or it could be due to the original people who just so happened were genetically more like Samoans.

This whole classification, while it cannot be accurate as it is nothing but an estimate, really makes it interesting and gives us a bit more of an insight as to the settling of Polynesia.  Of course we can also see this as more people are getting Y-DNA tested and mtDNA and we slowly learn more about these different migration patterns which no surprise, confirms our oral histories.

My results have changed throughout time since I tested with AncestryDNA back in January 2014.  The biggest breakthrough came last year as they actually created the Philippines category which correctly allocated my Filipino side from Polynesia, therefore decreasing my amount.

But what does my tree look like compared to my current DNA results?

 

With the latest update it made my color scheme more difficult to accomplish but in the tree I do point out the foreigners.  While my father was born in Lahaina, Maui, Hawai’i, both of his parents were from the Visayas region in the Philippines.  For my maternal grandmother’s mother – Rose Holbron, her paternal grandfather was from Hull, England while her maternal grandfather was from Queens, New York, U.S.A.  And for my maternal grandmother’s father – Frank Kanae, he had distant American ties.  His great-grandfather Isaac Lewis Kanae was the son of Captain Isaiah Lewis.  I still have not pinpointed his origin yet.  And Isaiah Lewis’ father-in-law Oliver Holmes arrived in the Hawaiian Islands in 1793 from Plymouth, Massachusetts.  At the time Oliver Holmes left Plymouth, there were only 15 states in the U.S.A.

So what I did was place their ethnicities under a continental level and compare it to my DNA results, which all adds up fairly nicely, taking random inheritance into account.  My mother gets 17% European compared to her sister who gets exactly 15%, which is consistent with the genealogy.  And in turn my mother gave not one but both of my brothers about half of her European (8% and 9% for them), while I ended up with the higher percentage of 11%, which appears as about 11% – 12% at different testing companies.

And while I show 2% Samoa, my mother ended up with 1% of both Samoa and Tonga.

 

For my cousin who is not admixed, it was interesting to see, despite the erroneous genetic communities that would come up, how hers changed.  Because we match other Polynesians at a very closely predicted relationship, and the fact that my cousin is not admixed, she matches a lot of part Polynesian people who fall into a specific genetic community among others of whom she also matches.  So she ends up with the same genetic community.

 

 

With this latest update, they finally got rid of the Native American category for both my cousin and my mother.  But now with Samoa and Tonga, it is no surprise that they would give us a small percentage of that.  And having gone through several of these 1% – 2% categories of Samoa and Tonga, they all seem to range the same – 1% – 4%.  Interestingly for my mother, her range for Tonga was 1% – 3% while her Samoa was 1% – 4%.  But the way it ended up was both 1%.

I have also been witnessing those who previously had small amounts of Polynesia now being reclassified as Samoa, Tonga or Guam.  Usually, these are people with either Melanesian or some other Southeast Asian ancestry from various parts of Indonesia.  I would be really interested in seeing more results from people who have ties to that area.

So while I was told the increase in the number of samples would not do anything, it obviously did quite a bit.  If only they would have renamed the Polynesia category by specifying Eastern Polynesia.  They should also do the same in renaming their genetic community.  It would make more sense, as we know that both Samoa and Tonga are part of Polynesia and of course, their map for Polynesia would include Samoa and Tonga within that area.  I would have expected western Polynesia versus eastern Polynesia, as I mentioned to them, but they really got very specific.  And in the end result, Samoans will see that they are about 30% Tongan, and it is probably the same for Tongans, where they will see a smaller percentage of Samoa.  These people do get about 0% – 1% Polynesia in their results.

We will just have to wait to see what the future updates would bring.

Previous entries about AncestryDNA’s Polynesia category:

https://hawaiiandna.wordpress.com/2014/12/15/polynesia-category-ancestrydna-com/

https://hawaiiandna.wordpress.com/2015/06/30/polynesia-category-ancestry-com-part-2/

Recent Founder’s Effect, bottlenecking and 6 Tahitian women on Pitcairn island

I finally got the autosomal results of a Pitcairn resident who has been a member of the Polynesian project for a year now.  Previously I had another member who is a Norfolk island descendant and whose ancestors moved to Norfolk but were originally from Pitcairn.  Another Norfolk descendant tested at another company, but his raw data were uploaded to GEDmatch.com in order to be compared.  Now that this particular Pitcairn resident has tested, I can make a comparison for these 3 people since they all have ties to Pitcairn.

 

HISTORY OF PITCAIRN ISLAND

Pitcairn was settled in 1790 by mutineers of the HMS Bounty and Tahitians1.  The initial population of 27 consisted of 9 mutineers, 6 Tahitian men and 11 Tahitian women along with an infant girl.  Only 6 of the mutineers and 6 Tahitian women would produce descendants.

Mutineers:
1) Fletcher Christian
2) Edward Ned Young
3) John Mills
4) William McCoy
5) Matthew Quintal
6) John Adams

Tahitian women:
1) Mauatua Maimiti
2) Teraura
3) Teio
4) Tevarua2
5) Vahineatua
6) Toofaiti

 

POPULATION GROWTH, DECREASE & RE-POPULATION

The population started with 27 people but only 12 of them would produce descendants.  By 1840 the population exceeded 100, and by the mid-1850s the community was outgrowing the island3.

On May 3, 1856 the entire community left on a 5 week trip and settled on the island of Norfolk on June 8.  Nearly 3 years later 16 of them returned to Pitcairn.


 

EFFECTS WITH AUTOSOMAL DNA

I have mentioned in previous blog entries that eastern Polynesians are genetically less diverse than western Polynesians.  So it should be no surprise that Hawaiians and Maoris as well as Tahitians will come up as closer matches to each other despite sharing common ancestors 8 centuries ago.

Now we are looking at two things.  Firstly, a founding population where only 12 people produced offspring, and half of the 12 being Tahitian women, or eastern Polynesians.  And these 12 were not paired off equally.


They married multiple times, some of them never produced descendants with their other spouses.

Secondly, there was a population bottleneck in 1859.


In 1856 the population expanded to 193, then the entire population left.  That population was already interrelated just 66 years after the initial 12 founding people started the population.  They all left, but 16 of them returned.  Eventually, a few more returned but the remaining population continued life on Norfolk island while the rest of the Pitcairns were starting the population again. It would take only 23 years to repopulate the island increasing the population to 250.

 

ANALYZING A PITCAIRN RESIDENT’S AUTOSOMAL DNA

The Pitcairn resident descends from all of the 12 founding people.  That is no surprise given that small number, and that it was just 225 years and 7 generations ago for this particular person.

Although I cannot show with a family tree how many times they descend from the 12 founding people due to size and the complexity of the tree, I decided to list the number of times they descend from each of the 12.

[Image: number of times the resident descends from each of the 12 founding people]

This resident’s paternal grandparents are 2nd cousins one way, and 3rd cousins another way, while their maternal grandparents were 2nd cousins two ways.  There are more ways that they are related going further back as well, but my genealogy software cannot pick up the multiple relationships.  It seems to select the closest relationship, but it selected 2nd cousin once removed, so I am not sure which line it was picking up.  This person’s maternal grandfather was born on Pitcairn but there is no known genealogy for him.  For their other grandparents, here is who they descend from.  (Founding people in bold)

Paternal grandfather – Christopher Warren, son of George Warren whose mother was Agnes Christian, and Alice Butler whose mother was Alice McCoy.
Paternal grandmother – Mary Christian, daughter of Sidney Christian & Ethel Young.
Maternal grandmother – Ivy Young, daughter of William Young & Mercy Young.

Agnes Christian and Alice McCoy were 2nd cousins, great-granddaughters of Fletcher Christian and Mauatua.  Ivy Young’s parents William and Mercy Young were 2nd cousins two ways to each other.  They were great-grandchildren of Edward N. Young and Toofaiti and of Fletcher Christian and Mauatua.

As confusing as it seems, you can imagine how this would show up in the DNA.  After uploading the raw data to GEDmatch.com for further analysis, I immediately ran the “Are Your Parents Related” tool.

Screen Shot 2015-12-21 at 10.07.52 AM

It predicted 3.3 generations to the most recent common ancestor (MRCA).  I am still not sure how to interpret GEDmatch’s MRCA estimation, but in reality the most recent common ancestors would be their 2nd great-grandparents, Thursday October Christian II and Mary Polly Young.  And there were other Youngs, as I previously mentioned, and Christians as well.

When I ran my mother’s kit through that same tool, her largest segment was 13.9cM, and there were a total of 5 segments that would total 51.5cM.

Largest segment = 13.9 cM
Total of segments > 7 cM = 51.5 cM
Estimated number of generations to MRCA = 4.1

By comparison, the Pitcairn resident’s largest segment was 24.7cM, with 11 segments.  My mother’s parents were from different islands, and as far back as I was able to trace their ancestries they did not intersect, nor did their ancestors come remotely near each other, given that they were from 3 different islands.

I would love to get more Pitcairn residents to test, to see if there is any noticeable pattern using this tool, or David Pike’s ROH.  If there is, we definitely could use it in helping to determine a true close genetic match versus an endogamous one.

 

COMPARING TO NORFOLK DESCENDANTS

There are 2 particular matches to many of the Polynesian DNA project’s members and both of these 2 people are descendants of Norfolk residents.  I will refer to them as Norfolk #1 and Norfolk #2.

Norfolk #1’s maternal grandmother was from Norfolk and she was the daughter of Francis Nobbs and Ruth Christian.  Norfolk #2’s maternal grandfather was from there, and his parents were William Adams and Sarah Christian.

A further breakdown where I bold the founding people.

NORFOLK #1
Francis Nobbs’ ancestry, son of Alfred Nobbs & Mary Christian:
Paternal grandfather – George Nobbs
Paternal grandmother – Sarah Christian, daughter of Charles Christian & Tevarua
Maternal grandfather – Benjamin Christian, son of John Buffett & Mary Christian
Maternal grandmother – Eliza Quintal, daughter of John Quintal & Maria Christian

Sarah and Maria Christian were daughters of Charles Christian & Tevarua, while Mary Christian was their 1st cousin.

Ruth Christian’s ancestry, daughter of Isaac Christian & Miriam Young:
Paternal grandfather – Charles Christian, son of Fletcher Christian & Mauatua
Paternal grandmother – Tevarua, daughter of Teio
Maternal grandfather – William Young, son of Edward N. Young & Toofaiti
Maternal grandmother – Elizabeth Mills, daughter of John Mills & Vahineatua

NORFOLK #2
William Adams’ ancestry, son of John Adams & Caroline Quintal:
Paternal grandfather – George Adams, son of John Adams & Teio
Paternal grandmother – Polly Young, daughter of Edward N. Young & Toofaiti
Maternal grandfather – Arthur Quintal, son of Matthew Quintal & Tevarua
Maternal grandmother – Catherine McCoy, daughter of William McCoy & Teio

When comparing the two Norfolk descendants to the Pitcairn resident, I was surprised to see no overlapping segments.

Screen Shot 2015-12-21 at 1.36.43 PM

Screen Shot 2015-12-22 at 12.58.16 PM

It is interesting to see that for Norfolk #1 the largest segment is 40.85cM, with a total of 134.5cM.  The largest segment is significant, and although Pitcairn & Norfolk #1 are related multiple ways, the closest known relationship makes them 4th cousins once removed.

Comparing Pitcairn to Norfolk #2, the largest segment is 27.3cM, which for Polynesians in general could be pretty distant.  Total shared is 95.1cM.  And just as with Norfolk #1, Norfolk #2 and Pitcairn are related multiple ways, but the closest relationship makes them 4th cousins.

At the moment I cannot compare Norfolk #1 and Norfolk #2, but I am trying to get that taken care of so that Norfolk #1’s raw data can be uploaded to GEDmatch for further analysis.

I was expecting to see the overlap at least when comparing to the Pitcairn resident, given that their ancestors have been on the island since the beginning, but it goes to show how unpredictable and random DNA can be.

A list of all 3 and how many times they each descend from the following founding population.

Screen Shot 2015-12-21 at 1.46.23 PM

And while various Polynesians can be compared to all three of these people and may show overlapping segments, there is really no way to map these segments.  These 3 testees would match other project members based on segments inherited from one or more of the 6 Tahitian women that settled on Pitcairn.  And we all would have shared common ancestor(s) from at least 8 centuries ago.

Below I compare the Pitcairn resident to a Hawaiian, a Maori and a Cook Island Maori as well as my Hawaiian mother.  Incidentally, there is a project member whose father was from Tahiti, yet that person does not come up as a match.

(default setting)

Screen Shot 2015-12-21 at 3.40.11 PM

(1+cM setting)

Screen Shot 2015-12-21 at 3.48.43 PM

 

Below is Norfolk #1 compared with the same people, except that Norfolk #1 is not a match to the Cook Island Maori.

(default setting)

Screen Shot 2015-12-21 at 3.41.18 PM

(1+cM setting)

Screen Shot 2015-12-21 at 3.51.14 PM

Norfolk #2 did not test at FTDNA but at 23andMe, and although their raw data was uploaded to GEDmatch.com, none of the others being compared had been uploaded there except for my mother.

For additional information about the DNA study of the descendants of the Mutiny on the Bounty, see ‘Mutiny on the Bounty’: the genetic history of Norfolk Island reveals extreme gender-biased admixture.

Footnotes

1. History of the Pitcairn Islands.
2. Pitcairn Settlers lists an additional Tahitian woman known as Sully, as the wife of Matthew Quintal and the mother of Matthew Jr., John, Arthur, Sarah and Jane Quintal. Another source, as well as the Pitcairn resident who got DNA tested, claims that there were only 6 Tahitian women from whom they descend.  There was no mention of Sully, although Tevarua is listed as being married to Matthew Quintal and the two are listed as the parents of Matthew Jr., John, Arthur, Sarah, and Jane Quintal.
3. Historical Population of Pitcairn.

Comparing Western and Eastern Polynesians

In my last blog entry “Tiny Segments from the Same Common Ancestors“, I began comparing Western Polynesians (Samoans & Tongans), and Eastern Polynesians (Maori and Hawaiians), and compared them to each other in order to show how the tiny segments appeared like missing teeth on the chromosome browser.  Now I will show how people compare to each other based on total centimorgans and their longest block (FTDNA).

First I compare Tongans and Samoans to each other.  Both Samoans and Tongans are Western Polynesians and are the most diverse.   Polynesian settlement began in the west in the Tonga/Samoa/Fiji area.  I mentioned this in a previous entry “Loss of heterozygosity – from Western Polynesia to Eastern Polynesia.”

T = Tongan
S = Samoan
– = no match

I colored it to make it easier to compare Tongans to Tongans in light green, and Samoans to Samoans in light blue.  The ones not colored are comparing Samoans to Tongans.  The top number is the total shared in centimorgans, while the bottom number is the longest block (largest segment).  The average totals seem to be between the upper 200s to mid-300s. The lower numbers (in the hundreds) are due to the fact that the person is admixed.  In other words, they are not pure Samoan/Tongan, and usually have some European ancestry.

WestPoly

Comparing Tongans to themselves:
TOTAL
lowest –  117cM (part Tongan)
highest – 340cM
average – 258cM

LONGEST BLOCK
lowest – 5.79cM
highest – 10.51cM
average – 8.54cM

Comparing Samoans to themselves:
TOTAL
lowest – 165cM (part Samoan)
highest – 366cM
average – 271cM

LONGEST BLOCK
lowest – 5.66cM
highest – 16.54cM
average – 9.20cM

Comparing Tongans to Samoans:
TOTAL
lowest –  143cM
highest – 321cM
average – 248cM

LONGEST BLOCK
lowest – 5.79cM
highest – 11.07cM
average – 7.81cM

This is what it looks like when I compare those same Tongans and Samoans to Hawaiians and Maoris who are Eastern Polynesians.

H = Hawaiian
M = Maori
T = Tongan
S = Samoan
? = unable to determine if a match
– = no match

In this graph, I again colored it for easy comparison.  Hawaiian vs. Tongans in light brown, Hawaiians vs. Samoans in golden yellow, Maoris vs. Tongans in pink, and Maoris vs. Samoans in light green.

West-EastPoly

Most of the Eastern Polynesians are admixed except for two Hawaiians and one Maori.  But those who are admixed are still more than 75% Polynesian, which keeps the totals fairly high; as you can see, they stay above one hundred, with the exception of one admixed Hawaiian compared to the admixed Tongan.  In fact, that admixed Tongan only shares with one Hawaiian and one Maori, both less than 100cM.  Yet their longest block still falls within the range.

Comparing Hawaiians to Tongans:
TOTAL
lowest – 72cM
highest – 341cM
average – 199cM

LONGEST BLOCK
lowest – 5.34cM
highest – 12.12cM
average – 7.94cM

Comparing Hawaiians to Samoans:
TOTAL
lowest –  135cM
highest – 314cM
average – 213cM

LONGEST BLOCK
lowest – 5.09cM
highest – 11.50cM
average – 7.57cM

Comparing Maoris to Tongans:
TOTAL
lowest –  68cM
highest – 240cM
average – 202cM

LONGEST BLOCK
lowest – 5.31cM
highest – 10.94cM
average – 7.87cM

Comparing Maoris to Samoans:
TOTAL
lowest –  147cM
highest – 278cM
average – 229cM

LONGEST BLOCK
lowest – 5.28cM
highest – 10.94cM
average – 7.81cM

When looking at the averages, they seem to be consistent when comparing Eastern Polynesians to any Western Polynesian.  However, that changes drastically when comparing Eastern Polynesians to themselves.

H = Hawaiian
M = Maori
? = unable to determine if a match
– = no match

I colored Hawaiians in light blue and Maoris in light green when comparing to themselves.  The non-colored portion is where one group is compared to the other.

EastPoly

Comparing Hawaiians to Hawaiians:
TOTAL
lowest –  225cM
highest – 780cM
average – 463cM

LONGEST BLOCK
lowest – 8.45cM
highest – 23.58cM
average – 14.90cM

Comparing Maoris to Maoris:
TOTAL
lowest –  581cM
highest – 694cM
average – 641cM

LONGEST BLOCK
lowest – 12.51cM
highest – 19.98cM
average – 16.66cM

Comparing Maoris to Hawaiians:
TOTAL
lowest –  291cM
highest – 773cM
average – 514cM

LONGEST BLOCK
lowest – 8.98cM
highest – 29.68cM
average – 16.20cM

So to recap, showing just the average total shared and the average longest block size:

Screen Shot 2015-04-20 at 4.45.54 PM

Although I used only 3 Maoris compared to 8 Hawaiians, it was based on the top matches to my mother.  There were a few more Maoris, but I did not have access to their data, and including them would have put more “?” in the charts.  But as we can see, the Western Polynesians tend to have lower totals since they are more diverse than the Eastern Polynesians.  More admixed Polynesians will result in lower totals, but the longest block is not that much different from those not admixed.
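To make the recap easier to check, here is a minimal Python sketch that groups the averages listed above by region pairing.  The numbers come straight from the lists in this post; the group labels and the function name are my own, and because the number of pairs behind each average is not given, this is only an unweighted mean of the averages, not a true pooled average.

```python
from statistics import mean

# (group, comparison, average total cM, average longest block cM),
# copied from the lists above. Group labels are illustrative.
averages = [
    ("West-West", "Tongan-Tongan", 258.0, 8.54),
    ("West-West", "Samoan-Samoan", 271.0, 9.20),
    ("West-West", "Tongan-Samoan", 248.0, 7.81),
    ("East-West", "Hawaiian-Tongan", 199.0, 7.94),
    ("East-West", "Hawaiian-Samoan", 213.0, 7.57),
    ("East-West", "Maori-Tongan", 202.0, 7.87),
    ("East-West", "Maori-Samoan", 229.0, 7.81),
    ("East-East", "Hawaiian-Hawaiian", 463.0, 14.90),
    ("East-East", "Maori-Maori", 641.0, 16.66),
    ("East-East", "Maori-Hawaiian", 514.0, 16.20),
]


def group_means(rows):
    """Unweighted mean of the listed averages for each region pairing."""
    grouped = {}
    for group, _comparison, total, block in rows:
        grouped.setdefault(group, []).append((total, block))
    return {
        group: (mean(t for t, _ in pairs), mean(b for _, b in pairs))
        for group, pairs in grouped.items()
    }


summary = group_means(averages)
for group, (total, block) in summary.items():
    print(f"{group}: total ~{total:.0f} cM, longest block ~{block:.2f} cM")
```

Read this way, the Eastern-to-Eastern comparisons come out at roughly twice the total and roughly twice the longest block of the other two groupings.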

In the future I will probably attempt to look at admixed Polynesians and compare them to show the average longest block sizes compared to those not admixed.

Tiny segments from the same common ancestors

Disclaimer: This post demonstrates the use of 1+cM when comparing specific groups of people in order to see patterns of multiple descent from a few ancestors.  It should not be used to validate connections with matches, particularly in this example where connections are beyond a genealogical time frame reaching at least up to 500 years.

Recently I have been comparing both western Polynesian (Tongan and Samoan) and eastern Polynesian (Hawaiian and Maori) matches.  I compared western Polynesians among themselves, and  did the same thing with eastern Polynesians comparing them among themselves.  Then I compared the two groups to each other.

To those who are not familiar with Polynesian origins and/or are new to reading my blog, I will recap.  The ancestors of Polynesians originated from the Melanesia area and thrived there for thousands of years. Thousands of years later a group of “Austronesians” originating from Southeast Asia moved into the area, intermingled briefly and continued to move into western Polynesia where Polynesian culture was born.  At least a couple of thousand years would pass before they would continue to expand further eastward.  As Polynesians moved from west to east, their genome became less diverse due to repeated founder effects and bottlenecks.

oceania

I analyzed my mother’s results and compared her to a Hawaiian (orange), and a Maori (blue) below.  The Hawaiian is her top match, sharing a total of 693.60cM, longest block 15.52cM, consisting of 158 segments.  The Maori is her 4th top match sharing a total of 517.90cM, longest block 18.08cM, consisting of 119 segments.  FTDNA counts all the tiny segments as low as 1cM once the criteria of a match is met, which is why the number of segments is high.
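For a rough sense of how small these segments are, the totals can be divided by the segment counts.  Here is a minimal Python sketch using only the figures quoted in this paragraph; the function name is just for illustration.

```python
def average_segment_cm(total_cm, segment_count):
    """Average segment size: total shared cM divided by number of segments."""
    if segment_count <= 0:
        raise ValueError("segment_count must be positive")
    return total_cm / segment_count


# (total shared cM, number of segments) as reported by FTDNA above.
matches = {
    "Hawaiian (top match)": (693.60, 158),
    "Maori (4th top match)": (517.90, 119),
}

segment_averages = {
    name: average_segment_cm(total, count)
    for name, (total, count) in matches.items()
}

for name, avg in segment_averages.items():
    print(f"{name}: {avg:.2f} cM per segment on average")
```

Both matches work out to a little over 4cM per segment, which is what you would expect when most of the counted segments are tiny ones.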

tinyseg-mom

With the default at 5+cM I did not see anything unusual other than ordinary small segment matches.  But when I reduced the setting down to 1+cM (above), you can see a lot of tiny segments resembling a comb.  The slightly bigger gaps are just the missing teeth of a comb.  Some of these patterns begin to appear at 3+cM, although most do not appear until you reduce it down to 1+cM.  In my mother’s example above I show only chromosomes 1 – 20 since there were no segments that looked like a comb on the other chromosomes.
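The effect of lowering the threshold can be sketched in a few lines of Python.  The segment values below are made up purely for illustration (they are not from any real kit), and the names are my own; the point is only that the same list of segments looks very different at 5+cM, 3+cM and 1+cM.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    chromosome: int
    start: int
    end: int
    cm: float


# Hypothetical segments for illustration only, not real match data.
segments = [
    Segment(1, 10_000_000, 12_500_000, 1.4),
    Segment(1, 13_000_000, 15_200_000, 1.1),
    Segment(1, 16_100_000, 19_900_000, 2.2),
    Segment(1, 21_000_000, 27_500_000, 3.6),
    Segment(2, 5_000_000, 8_100_000, 1.8),
    Segment(2, 40_000_000, 52_000_000, 7.9),
    Segment(3, 70_000_000, 79_000_000, 5.2),
    Segment(3, 81_000_000, 83_000_000, 1.0),
]


def segments_at_threshold(all_segments, min_cm):
    """Keep only segments at or above the minimum cM setting."""
    return [seg for seg in all_segments if seg.cm >= min_cm]


for threshold in (5.0, 3.0, 1.0):
    shown = segments_at_threshold(segments, threshold)
    print(f"{threshold:.0f}+cM: {len(shown)} of {len(segments)} segments shown")
```

With these placeholder values, only 2 segments show at 5+cM, 3 at 3+cM, and all 8 at 1+cM, which is where a run of closely spaced tiny segments would start to look like a comb.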

Then I looked at a Maori woman’s results (below) and compared hers to other Maoris and one Hawaiian.  She also shows the missing teeth at 1+cM, but only in a few areas.  Some areas have the comb pattern while other areas seem random.  The random segments could be IBS (Identical by State) or IBD (Identical by Descent).  Polynesians lack genetic diversity, particularly eastern Polynesians more than western Polynesians, so the random looking segments could be both IBS and IBD segments.

tinyseg-mary

Then I looked at two Tongan men and compared them to other Tongans and Samoans.  With Tongans & Samoans there seems to be more randomness.  A few of the tiniest segments may be close to each other, but nothing much resembles my mother’s results, which show a definite comb pattern.  Take the purple and green colors for example for this one Tongan man below.  Notice how on some chromosomes they seem to be closer together while on others it just looks random.  Again, these are only using the bare minimum 1+cM.

tinyseg-peni

The other Tongan example.

tinyseg-keni

As you can see, it is hard to look for patterns that resemble a comb, and instead you see random colors all over the chromosomes.  What was interesting to see was how little these Tongans matched on the X, unlike the Maoris and Hawaiians, many of whom shared multiple segments with each other.

But what does all of this mean?  These are very small island populations.  They have had repeated emigration from these small islands that resulted in a series of founder populations.  There were also bottlenecks that occurred a few times.  All of these combined would leave only a few closely related ancestors to populate and repopulate new areas every time.

So the multiple, very small segments that resemble a comb with missing teeth are the result of people descending from just a few ancestors who contributed that particular segment, which was then inherited through multiple lines going back to the same ancestor over and over again.

Below is an image where I compare my mother with two Samoans (yellow & green) and three Tongans (orange, blue & purple).  There seems to be more randomness, however, there are a few of those comb patterns.

tiny-mom&western

Notice how the X chromosome is much more full, unlike what we saw when comparing the western Polynesians (Tongans & Samoans) among themselves. The yellow color belongs to a Samoan woman. The fact that women have 2 X chromosomes may be the reason there is a long match here, whereas the examples above used two Tongan men, whose matches included two women.  But these are Polynesians, so you would expect more of a match on the X.  My observation of matches over the past 2 years was limited to my mother being compared to others, which means I have seen a lot of X matches for her, and the same for myself and my brother.

What I am noticing so far is that these patterns look like what is mentioned in research papers about the Polynesian genome and the loss of heterozygosity going from west to east.  The last place in Polynesia to be settled was in the east, ending at the extreme points of the Polynesian triangle, namely Rapa Nui (Easter Island) in the south east, Aotearoa (New Zealand) to the south west, and the Hawaiian islands in the north.  This explains why my mother and the Maori woman have less random looking tiny segments compared to the Tongans and Samoans.  And if we compare western and eastern Polynesians to each other, we may see some randomness but not as much as we would see with western Polynesians alone.  Other types of Polynesians getting DNA tested would help to reveal any additional patterns that I cannot currently see, since the majority of those getting tested are Hawaiians and Maoris.

Loss of heterozygosity – from Western Polynesia to Eastern Polynesia

Genetic research on Polynesians will frequently mention the loss of heterozygosity.  This is more noticeable when comparing eastern Polynesians to western Polynesians.

oceania

Map outlining migratory paths of Austronesian speaking populations, including estimated dates. Adapted from Bellwood et al., (2011) “Are ‘Cultures’ Inherited? Multidisciplinary Perspectives on the Origins and Migrations of Austronesian-Speaking Peoples Prior to 1000 BC.” [doi: 10.1371/journal.pone.0035026.g001]

Polynesian populations are relatively homogeneous both phenotypically and genetically. Over a span of 3,200 years they moved throughout the Pacific, and unlike populations in Europe and on other large continents, they did not mix with other populations due to isolation.  These small founder populations have experienced several bottleneck effects, which further caused this loss of heterozygosity ending with the settlement of eastern Polynesia.  Polynesians’ lack of genetic diversity is less evident in western Polynesia where initial settlement began.  Hawai’i, New Zealand and Easter Island are considered to be eastern Polynesia, and these places were the last places of Polynesia to be settled.

Recently I have been able to look at the autosomal matches among Samoans and Tongans of western Polynesia.  Previously, I have been only studying Hawaiian matches and noticed that top matches were both Hawaiians and Maori people.  Looking at Samoans and Tongans was very interesting as I now could compare the two different regions.

My mother is 80% Hawaiian, while I am 40%.  And as admixed as I am, I still get 1st – 3rd cousin predictions on Family Tree DNA (FTDNA), while on 23andMe I get 2nd cousin and 3rd to distant cousin predictions.  The centimorgan totals that I show with my matches reach as high as 369cM on FTDNA, and 161cM on 23andMe.  For my mother, it is 693cM on FTDNA and 376cM on 23andMe.  I see the same happening with Maoris, ranging between 300cM – 700cM (FTDNA) for the top 20 people.  And for a non-admixed Hawaiian, their top matches are in the 600 – 700cM range.  An admixed Polynesian would logically have lower totals. But even an admixed person can still have fairly high shared totals, as I do despite being less than half Hawaiian.

When comparing two Tongans, the highest that they shared was 335cM.   A Samoan compared to another Samoan was 366cM.  And both of these Tongans and Samoans had their remaining top matches in the range of 100cM to 200cM.  Many of their matches are the same Hawaiians and Maori that match each other at a much higher total.  It is amazing to see these autosomal matches and how diverse the western Polynesians are, or rather how Hawaiians and Maoris are not as diverse.  And even if it is an admixed Hawaiian or Maori, the matches to each other are still pretty high, and as high as what non-admixed western Polynesians would have to each other.

When comparing the longest block (largest segment) with Tongans and Samoans, they seem to rarely get close to 15cM, averaging around 10cM.  Anything more than that could indicate a possible closer relationship or perhaps a specific common geographic origin.  The Hawaiians and Maoris usually range between 10cM – 15cM for the largest segment, but can go as high as 28cM, which is usually in admixed Hawaiians and Maoris compared to each other.  In other words, Polynesians in general will have high totals exceeding 100cM, but a largest segment that rarely exceeds 10cM.
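As a rough illustration of that rule of thumb, here is a small Python sketch that flags a match whose longest block is above what I typically see.  The 15cM cutoff is an assumption taken from the observation above, not a validated threshold, and the function name is hypothetical.

```python
# Assumed cutoff, based on the observation that the longest block
# "rarely gets close to 15cM". Not a validated threshold.
TYPICAL_MAX_BLOCK_CM = 15.0


def exceeds_typical_block(longest_block_cm, typical_max_cm=TYPICAL_MAX_BLOCK_CM):
    """True if the longest block is larger than the typical endogamous range."""
    return longest_block_cm > typical_max_cm


# Values mentioned above: the ~10cM average and the 28cM upper end.
for block in (10.0, 28.0):
    label = "worth a closer look" if exceeds_typical_block(block) else "typical"
    print(f"longest block {block:.1f} cM: {label}")
```

A flag like this only says the match is worth a closer look; it does not confirm a close relationship on its own.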

I look forward to more western Polynesians getting tested so we can see if there is any pattern to specific islands in their own island group, something I have been trying to do with Hawaiians with the few haplogroups that there are for Polynesians.  What also needs to be analyzed are people from Tahiti and the Marquesas, being that they were key dispersal points for eastern Polynesians.  I managed to only see the results of one admixed Tahitian woman, and her match totals are identical to mine.  I am curious to find out what non-admixed Tahitians will show, whether it is more similar to eastern Polynesians or to western Polynesians.