AutoCluster Endogamy tool at GEDmatch.com (Part 3)

Continuing with GEDmatch’s AutoCluster Endo tool, I have been playing around with the settings.  I kept adjusting them for my mother’s kit and could not find the right combination.  Basically, I could not produce any clusters using the settings expected for non-endogamous populations.

What I ended up doing was adjusting the Min average segment cM lower and lower (from 15cM down to 9cM) each time until I got it right at that threshold where I knew it would produce the larger (endogamous) cluster.

I repeated the same thing under Shared Match Filter, selecting 9cM for Min average segment cM (the screenshot shows 10cM, which is what I used for myself).

So with my mother, it seems that she currently does not have any decent size matches to cluster.  I would have noticed that since I normally would sort her matches (as well as my own) by Largest Segment size.

Interestingly with my own matches, I noticed something unique and unexpected and I know it had to do with the Shared Match Filtering which I will go over in a bit.

But first I want to stress the importance of knowing what the Min average segment cM should be and why.  For the past 11 years I have been trying to weed out all endogamous matches, and I have noticed that the largest segment size for many of those matches rarely exceeds 12cM.  I noticed this at FTDNA, 23andMe and at GEDmatch.

I went through my Ancestry matches, as well as those of my mother and one of my cousins, and have seen what that average size looks like.  Here is a closer look, using Ancestry matches as an example.

Whether it is in the predicted 2nd cousin range, 3rd cousin, 4th cousin, or distant cousin, the average size you will see (based on the total shared cM divided by the number of segments) is around 8cM.  I identified my endogamous matches in the closer (2C range), but never completed the entire lengthy list going down to the 4C range except for the known relationships where it is highlighted in yellow.
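The arithmetic behind that average can be sketched in a few lines of Python.  This is only an illustration: the function name and the three example rows are made up for the sketch and are not taken from the match lists shown in this post.

```python
def average_segment_cm(total_shared_cm: float, num_segments: int) -> float:
    """Average segment size: total shared cM divided by the number of segments."""
    if num_segments <= 0:
        raise ValueError("num_segments must be positive")
    return total_shared_cm / num_segments


# Hypothetical matches (illustrative only): (label, total shared cM, segments)
example_matches = [
    ("predicted 2C", 250.0, 31),
    ("predicted 3C", 120.0, 15),
    ("predicted 4C", 48.0, 6),
]

for label, total_cm, segments in example_matches:
    print(f"{label}: {average_segment_cm(total_cm, segments):.1f} cM average")
```

Even though the totals differ widely, each of these made-up rows lands at roughly 8cM, which is the pattern described above.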

I highlighted an actual 2C1R who falls in that predicted 4C range (remember that at the 3C level, you will not match about 10% of them) and whose average segment size is 8cM.  While those numbers (total shared divided by #segments) seem to be similar to the other matches in that range, the longest segment size is more than 20cM.  That is something that we rarely get, at least with this closely predicted range.  This relative does have a few other lines or branches that are of the same endogamous background, which explains why there are many segments.

Now take a look at one of my cousin’s average size segments even with their endogamous matches predicted to be in the 1st cousin range.

That average segment size is still 8cM due to the number of segments for that given total shared cM.  These are a cousin’s matches so I did not take the time to highlight and identify their known relatives, but it should be obvious which ones are the true, close relatives.

Okay, now that the average segment size is defined and we have identified why that number is what we see, you have an idea of what number to use in this tool (assuming that your average segment size is also below 12cM) and what amounts would remove all the endogamous matches.

These are the results for my matches utilizing the 10cM average size segment.  It’s not too small, although I could have easily made that slightly larger, like 15cM.

(Sidenote, you can easily zoom in & out of the cluster.  So having a lot of matches in a cluster and trying to reduce it is a plus!)

So it divided my grandfather’s matches (blue) from my grandmother’s matches (orange).  But if I took a careful look at just my grandfather’s side, this is what I noticed.

Focusing on the blue cluster first, I have a pair of 2nd cousins who are siblings (indicated in white) and another 2nd cousin and her 1st cousin 1x removed (my 2C1R) in the blue cluster which is my grandfather’s side.  The two white 2C’s grandmother, and the other 2C’s grandmother (the 2C1R’s great-grandmother) were sisters to my grandfather.

Then we have the 2C1R in red whose grandfather was a brother to my grandfather’s mother.  The (yellow) 2C2R was a 2C to my grandfather.  But here is a clearer picture of how each match is connected to me and each other.

[“k” is kāne or male, “w” is wahine or female]

So going back to that cluster (above), you can see where the problem is: the 2C2R on my grandfather’s father’s side is matching the 2C1R on my grandfather’s mother’s side.

Ideally, the clusters should be separated by pairs of ancestors.


Looking at the details of the two clusters,  I can see why it was able to separate my grandmother’s matches from my grandfather’s.  Remember, I normally do not get any separation when it comes to my maternal side.  Even with my paternal side being a different population, I have paternal relatives whose other side belongs to the same endogamous (Kanaka Maoli) population and can generate gray marks indicating that they are part of more than a single cluster.  Some other endogamous populations will have a lot of these, maybe you might too with your matches.

Only 2 other matches make up the second cluster: my 2C and a 3C1R to both me and my 2C.  I share with the 2C a total of 241cM across 13 segments, with the largest segment being 40cM.  With my 3C1R I share a total of 148cM across 10 segments, with the largest segment being 40cM.  Comparing those two cousins with each other, they share a total of 108cM across 10 segments, with the largest segment being 26cM.

Since I selected 20cM for the minimum largest segment, it pulled up these actual relatives of mine, all of whom have the largest segment size as small as 29cM, and as large as 41cM.  These high settings helped remove the endogamous matches.  Not only that, it was able to at least separate my grandmother’s side from my grandfather’s side.  What is important to know is that at GEDmatch I do not have any close enough relatives on my grandmother’s father’s side, only on her mother’s side.  Maybe if I had a few close relatives, we could have seen how my grandmother’s father’s side would mix with her mother’s side as well?  Who knows.  It is just amazing to me that this tool, given the opportunity to adjust these parameters could help break the matches into actual clusters.  I am speaking from an endogamous perspective and how we have to deal with the high amount of closely predicted shared matches.
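As a rough sketch of how those two thresholds work together, the Python below applies a minimum average segment of 10cM and a minimum largest segment of 20cM to the two cousins described above.  This is not GEDmatch's code; the class and function names are invented for illustration, the comparison is assumed to be "greater than or equal to", and the third row is a hypothetical endogamous match added for contrast.

```python
from dataclasses import dataclass


@dataclass
class Match:
    name: str
    total_cm: float
    segments: int
    largest_cm: float

    @property
    def average_cm(self) -> float:
        # Average segment size = total shared cM / number of segments
        return self.total_cm / self.segments


def passes_primary_filter(m: Match, min_avg_cm: float, min_largest_cm: float) -> bool:
    """Assumed logic: a match must clear both thresholds to be kept."""
    return m.average_cm >= min_avg_cm and m.largest_cm >= min_largest_cm


matches = [
    Match("2C", 241, 13, 40),      # figures quoted in the post
    Match("3C1R", 148, 10, 40),    # figures quoted in the post
    Match("endogamous match (hypothetical)", 200, 25, 12),
]

kept = [m.name for m in matches
        if passes_primary_filter(m, min_avg_cm=10, min_largest_cm=20)]
print(kept)
```

The hypothetical endogamous match shares a respectable total, but its 8cM average and 12cM largest segment both fall under the thresholds, so only the two known cousins remain.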

AutoCluster Endogamy tool at GEDmatch.com (Part 2)

In my last blog entry, AutoCluster Endogamy tool at GEDmatch.com (Part 1), I briefly covered the settings that are adjustable when you are ready to produce clusters and what is suggested for Polynesians. I also mentioned Leah Larkin referring to different levels or degrees of endogamy based on the average size segment and, given that size, what works best or how to approach your DNA matches.  I had to play around with it quite a bit in order to get decent clusters.

First, understanding the high settings put in place when you select “Highly Endogamous.”

So the minimum largest segment is 30cM, which is what I’ve been promoting for nearly 11 years.  This was based on my own observation when I tested a second 1C1R and could compare that 1C1R with another 1C1R who had tested a few months prior; the two are 2C to each other.  We have a lot of predicted 2nd to 3rd cousin matches (100cM – 300cM) where the largest segment size rarely would exceed 20cM.  With these two 1C1R who are 2C to each other, I noticed that their largest segment was 41cM.

It wouldn’t be until about 2 or 3 years later that I heard others following that same reasoning and seeing the significance of a largest segment size indicating a closer ancestral connection versus an endogamous one.  This was specific to those with Ashkenazi Jewish background, who seem to be set on 20cM.  By this time, I had already determined that 30cM would be best.  Also, with other Polynesians who share a true 2nd to 3rd cousin relationship, their largest segment would be larger than 20cM.  And among all the endogamous matches, it rarely would exceed 20cM.

So that was the reason why I determined that 30cM was a good amount to be used in the Min largest segment cM.  And while this blog entry is specific for this new AutoClustering tool for endogamy at GEDmatch, I have noticed that at MyHeritage, even with the endogamous matches that the largest segment size could exceed 30cM. However, what would also be indicative of an endogamous match vs. a truly close 2nd to 3rd cousin match, is the number of segments.

Taking a closer look and right next to the minimum largest segment size is the number of largest segments.  Thought this was interesting and not sure if it’s necessary or not.

 

I am assuming that the minimum largest segment and the number of largest segments work together: the first sets the smallest size a segment can be and still count as a “largest” segment, and the second sets how many segments of at least that size a match must have.  In other words, if you select 100cM for the min largest segment size, it will require a segment no smaller than 100cM.  And if I select 10 for the number of largest segments, it would require that you have at least 10 segments no smaller than 100cM.
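That reading of the two settings can be written out as a small Python check.  To be clear, this only encodes the assumption stated above about how the settings interact; it is not confirmed to be how the tool behaves, and the function name and segment sizes are hypothetical.

```python
def meets_largest_segment_rule(segment_sizes_cm, min_largest_cm, num_largest):
    """True if at least `num_largest` segments are `min_largest_cm` or larger.

    This mirrors the assumed interpretation of the two settings,
    not GEDmatch's documented behavior.
    """
    qualifying = [s for s in segment_sizes_cm if s >= min_largest_cm]
    return len(qualifying) >= num_largest


# Hypothetical segment sizes (cM) shared with a single match
segments = [41.0, 18.5, 12.2, 9.7, 8.1]

one_required = meets_largest_segment_rule(segments, min_largest_cm=30, num_largest=1)
two_required = meets_largest_segment_rule(segments, min_largest_cm=30, num_largest=2)
print(one_required, two_required)
```

Under this interpretation, a match with a single 41cM segment passes when one largest segment is required but drops out when two are required, since no second segment reaches 30cM.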

You rarely would get a largest segment of the same size, or at least not that I have seen in both endogamous and non-endogamous matches.  After all, these direct to consumer DNA testing companies are showing you the size of the largest segment that you have among all of the matching segments that you share.  This is probably why I initially was not generating any matches/clusters simply because I had it set to 2.  So my suggestion is to change it to 1.

So all of those parameters are allowed under the Primary Match Filtering section.  Then you have the Shared Match Filtering section which is nearly identical to the Primary Match Filtering section except you also have a minimum shared cM between shared matches which is what you also see when you run an autocluster with MyHeritage or Genetic Affairs directly.

With this parameter, you can tell it how much your DNA matches must share with each other to be considered for a cluster.  What I did was set it as low as 100cM, since I have hundreds of matches from 100cM up to 200cM.  My advice is that if you’re not admixed, or rather you have fewer foreign branches, definitely increase that higher than 100cM.  It was easier for me to guess the numbers to use since I know how many matches I have, and out of these varying ranges of shared DNA, how many matches I would have for each.
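The effect of that minimum shared cM between shared matches can be sketched as a simple filter over pairs of matches.  The pair list and names below are hypothetical, and the function is only an illustration of the idea, not the tool's implementation.

```python
def filter_shared_pairs(pairs, min_shared_cm):
    """Keep only pairs of matches that share at least `min_shared_cm` with each other."""
    return [(a, b) for a, b, shared_cm in pairs if shared_cm >= min_shared_cm]


# Hypothetical shared-match pairs: (match, shared match, cM they share with each other)
pairs = [
    ("cousin A", "cousin B", 108.0),
    ("cousin A", "cousin C", 64.0),
    ("cousin B", "cousin C", 152.0),
]

edges = filter_shared_pairs(pairs, min_shared_cm=100)
print(edges)
```

Raising the threshold removes more of the weaker links between matches, which is why a person with fewer closely predicted matches may want a different number than someone with hundreds of them.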

For example, I have hundreds of matches predicted to be 2nd cousins (Ancestry).  That is, I have hundreds of matches sharing as low as 200cM and as high as 649cM.  In the predicted 3rd cousin range I have over a thousand of these types of matches, which range from 90cM to 199cM.  And predicted 4th cousins, more than 24,000.  These range from as low as 20cM to as high as 89cM.

In my previous post I showed an example of what my autocluster looked like from MyHeritage and that I sorted it (by total shared cM) from the lowest to the highest.  The lowest was 108cM, and from there it slowly went up.  I had 12 matches sharing 108cM.   I also had 12 matches sharing 109cM.  The number of matches sharing about the same amount can be a lot.  So understanding this will help you decide the best numbers or amounts to use when creating your clusters.

I am hoping that others from various endogamous groups start utilizing this new tool and am really curious how it will affect their research, expecting it to be for the better!  Since I am still trying to generate various clusters by constantly adding in varying numbers, I will not be posting any examples of what they look like.  Perhaps in a future blog post I will.

I also noticed that with a list of files when generating these autoclusters at GEDmatch, you also get csv files to be used in Gephi.  I posted samples of that in my post from December 2022 called In-common-with, shared matches, and clustering.  I will have to take time to also try to use these actual clusters and look to see how Gephi renders it.

AutoCluster Endogamy tool at GEDmatch.com (Part 1)

Evert-Jan Blom of Genetic Affairs developed a new AutoCluster Endogamy tool on GEDmatch together with Jarrett Ross of GeneaVlogger. Introduced as AutoCluster Endo (AutoCluster Endogamy when you see it on GEDmatch), it is a modified version of the AutoCluster clustering tool designed specifically for those dealing with endogamous matches. It was created to make analyzing endogamous matches more efficient by filtering for the most relevant (shared) matches.

Thanks to Jarrett Ross for bringing up specific features he mentions in his video.  It allows you to filter your primary matches by adjusting the average segment size, minimum largest segments, and number of largest segments.  It also allows you to filter by your shared matches using the same filters as for primary matches and in addition, the total amount of shared cM between shared matches.

When I used to run the AutoCluster tool at MyHeritage, I noticed people would post their examples mentioning how endogamous their matches were or how burdensome and problematic it was to deal with them.  I also noticed a marked difference between their clusters and my own.  For one, they had more than one cluster.  I initially only had a single cluster until I uploaded one of my 1C1R, with whom I do not share as much DNA (as expected, I guess, for someone of that relationship), and that was enough for this tool to pick up.  This cousin of mine appeared in my second cluster with other relatives on my paternal (non-Polynesian) side, and he also produced gray squares matching several matches in the first/large cluster.

When I shared my AutoCluster, I emphasized that others should take note that the minimum threshold implemented was not 20cM or 30cM like in many others that I remember seeing.  Mine was significantly higher.

I also sorted my match list with the lowest amount at the top, sharing 108.1cM, so the 26 matches I decided to show only range from 108cM to 110cM.  Of course there are 470 other matches that comprise that large cluster.

I kept pointing this out to others, how our minimum threshold will vary across different populations, depending on the amount of shared DNA we have with our matches, the number of matches, etc.  There is a bit more freedom with utilizing Genetic Affairs directly.

With this AutoCluster Endogamy tool at GEDmatch, you can do quite a bit.  This tool is offered to Tier 1 subscribers (Tier 1 pay-as-you-go membership $15 per month and Recurring monthly Tier 1 memberships $10 per month) only.

The first thing you will notice is that you have the option to select the level of your endogamy or how endogamous you are.

The default is set to “Not Endogamous.”  I only tried the “Endogamous” option to see the difference from “Highly Endogamous” (Polynesians should be using “Highly Endogamous”), and I noticed that the parameters were set higher, to numbers that are very familiar to me.

Leah Larkin (The DNA Geek) has shown in her presentations charts of various endogamous populations and to what degree of endogamy each has to deal with.  This is where I first saw how she utilized the average size segment to quantify endogamy, how to gauge how much endogamy you are really dealing with.

She took the amount of shared DNA for Close Relatives (Ancestry), predicted First Cousins, Second Cousins, Third Cousins, Fourth Cousins and Distant Cousins, divided by the number of segments to come up with the average size segment.  What was presented were various sizes present in specific endogamous populations.  She had mild, moderate and strong endogamy.  These were average size segments present in specific (predicted) relationships, i.e. 1C, 2C, 3C, etc.

In her comparison, the population with the smallest average size segment was Polynesians. She also separated them to show what Western Polynesians had compared to Eastern Polynesians.  She has confirmed (although many of us probably noticed this already) how endogamous, or extremely (“Highly” is the term used for this AutoCluster Endogamy tool at GEDmatch) endogamous, Polynesians are.  This could not be done without the help of others submitting their samples to Leah for analysis.  I was able to submit one Samoan and two Kanaka Maoli samples to her to utilize. And the results were worth it!

Having said all of that, do know that Polynesians should automatically select “Highly Endogamous.”  This seems to raise the Min average segment cM and other parameters.  This image below is an example of what it looks like when you do not select anything and keep it at the default “Not Endogamous.”

Even with “Not Endogamous” you can still adjust the settings to your liking.

So below are the settings that you would automatically see when selecting “Highly Endogamous.”

It is important to note, based on what I have seen others post with their own comparisons and my 11 years of noticing the largest segment size among Polynesians and known relationships, that setting the Min largest segment cM to 30cM is a good minimum amount to use.  This is what you would expect around the 2nd Cousin level.

I have at Ancestry and MyHeritage (as do other relatives of mine) endogamous matches whose largest segment exceeds 30cM, yet what helps distinguish an endogamous match from a true close relative is that the endogamous ones still have a significantly high number of segments.

Below is a table of all of my matches (Ancestry) and I have highlighted my known relatives.  The ones not highlighted are the endogamous matches.

You can clearly see how with my known (highlighted) 2C, 2C1R, and 3C1R relatives (Predicted as Second Cousin) the number of segments isn’t always as high. The ones where it is high have little to no non-Polynesian lines, which means more Hawaiian branches that are coming up as matches to me.  But the largest segment is coming from our most recent common ancestor.  Notice that for the New Zealand Maori and Kanaka Maoli matches the number of segments is really high.

For comparison, this (table below) is a cousin of mine.  Although I did not indicate the true close relatives, it should be obvious based on the high amount of segments plus the average segment size which ones are truly close relatives.

For the past 11 years, this is what I have been noticing: it was not common to see DNA matches among Polynesians (mainly Kanaka Maoli and NZ Maori) whose largest segment size exceeded 20cM.  Utilizing the average size (taking the total shared cM divided by the number of segments), we see 7cM and 8cM to be the norm both in my cousin’s predicted First Cousin matches and my predicted Second Cousin matches.  It is pretty common even when looking at the 3rd Cousin, 4th Cousin, and Distant cousin matches.

So this is why we get the type of results you would see with autoclustering, and why we need to be able to adjust these parameters in order to find the best matches (true close relatives) to be used in clustering.

So now we have an understanding of what to expect among Polynesian DNA matches as far as the average size segment, the number of segments (to help get the average size segment), and the largest segment size.  In my next blog entry, I will address the results of running this tool and how adjusting these may or may not be as useful.

One thing to note is that various companies will use the longest block (FTDNA), longest segment (Ancestry), and largest segment (MyHeritage & GEDmatch) for the same thing.  I may use these terms interchangeably, but for this particular GEDmatch tool, I’ll only refer to it as largest segment.

 

Ancestry’s 2023 Ethnicity Updates

This year’s ethnicity updates introduced a few new regions, but nothing new for the Polynesian categories.  This has made people’s results seem to fluctuate just a little where they find a small increase or decrease in percentage with various existing categories.  For some, there might be new categories (if they received a new region) and for others, the new category that they received is usually something very small, usually less than 5%.

For the ever-evolving categories relating to Polynesians, we have seen how the number of samples has significantly increased over the years, compared to other populations.

Looking through Ancestry’s white paper from 2013 and how since 2018 they have been updating yearly, I extracted the following numbers of reference samples for specific categories created for Polynesians.

The Polynesia category was first created in late 2013  and that category consisted of just 18 samples.  Apparently, they had the same amount for Melanesia (previously reported as 28 samples).1

In 2018, Ancestry made its first ethnicity update since implementing the Polynesia category. They increased their 18 samples to 58, and in 2019 made a significant increase up to 188 samples.  In that same year they also introduced a Samoa and Tonga category which would now allow the small 1% – 4% to show up for some Polynesians, mainly Hawaiians and New Zealand Maoris.   But Samoans were finding a large percentage of Tonga showing up in their DNA results, just as Tongans noticed the same with the Samoa category in their DNA results, and to a lesser extent they both might have 1% – 2% Polynesia.  This is expected given how we see overlap with other populations that are similar to each other.

In 2020 they renamed the Polynesia category to Eastern Polynesian & New Zealand Maori while continuing to increase the number of samples for all three categories.  An interesting choice for a category name that seemed to specify one Polynesian population and all others relegated to a region – east.

Two years later they decided to split that category into two: Hawaii, and New Zealand Maori.  This would leave both populations having some percentage of the other population in their results, just as with Samoans and Tongans and other non-Polynesian populations.

While it’s not accurate, it does allow us to quickly see what ethnicity or population the DNA match is from. At least with Hawaiians and New Zealand Maoris, there is about a 70% – 30% ratio, the dominant being the population that the person really is from.

Perhaps with the ever-increasing amount of Polynesians getting DNA tested and if they have good trees, we could have even more specific island populations appearing. We have seen how they tried that with the DNA communities although they have not recently updated those communities.  Given that the communities are based on DNA matches and how they connect/network with other matches, and how we already know that we could easily match other island populations despite having no recent genealogical ties to each other, I do not expect to see any changes with those.

Ancestry’s DNA Matches Split by Parent

A few months ago Ancestry attempted to split people’s matches by parent.  This helps people to figure out how their matches are related to them by splitting up your matches based on your DNA,  and not family trees. They utilize their SideView™ technology, where they group your matches according to the parent they’re related to.

So they split your matches by parent 1 and parent 2, or if you already labeled them, paternal vs. maternal. But for Polynesians and other endogamous populations, we can also have a lot of matches falling under “Both sides.”

Initially, I only had 29 matches falling under “Both sides.”  My mother ended up with 2,231 matches for Both sides.

It is understandable why my mother would have a lot given that both of her parents were Kanaka Maoli.  But in my case, since my father was Filipino I should not have any under Both sides, unless they really are related to me on Both sides.  I do have a few like that, but they never got DNA tested.

After analyzing all 29 matches, I did see that all of them were the same background as I am.  Filipino, but specifically having ties to Bisaya, the region where my father’s parents were from, and also Kanaka Maoli.  So they could be some distant match on my Filipino side, but we also witness somewhat of endogamous matching going on because not all of these matches will have ties specific to my grandparents’ hometown or island, and just so happen we match DNA because of our shared DNA Kanaka Maoli segments. 

Recently, they had an update.  So now we have more matches that have been assigned, including more matches for Both sides, but not as many as I thought I would get.  While my mother and a couple of others have thousands (up to 6,000) of matches on Both sides, since both of their parents are Kanaka Maoli, again with my own that should not be the case.

This time while going through the additional 20 new matches categorized under Both sides, I noticed that not all of them had Filipino ancestry, and a few of them were not Polynesian at all.

After carefully looking through these matches and their shared matches, I realize that the matches were either 100% Filipino or 100% Polynesian, or admixed Polynesian without any Filipino.  The amount of shared DNA segments that these matches have with each other is significant enough to put them into the Both sides category.

I saw that 100% Filipino matches who were connecting via my paternal side would share DNA segments with matches from my maternal side, because those matches on my maternal side are Filipino as well as Kanaka Maoli.  The same goes for those on my maternal side who do not have any Filipino ancestry but could share DNA segments with matches on my paternal side who happen to be part Kanaka Maoli like myself.  But I’ve also seen 100% New Zealand Maori falling under Both sides.

So while I understand endogamous populations like Ashkenazi Jewish people can have thousands and thousands of matches under Both sides, just as I am seeing with other Polynesians, I did not expect this phenomenon to happen to me, since my parents were of different populations.  At first I figured it was because these matches under Both sides had the same exact background as me.  But now with this latest update, that is not the case; it has a lot to do with the fact that Hawaiʻi consists of islands that received other island populations (Japan, Philippines, Portuguese, Puerto Rico), and those people mixed into the population of that time.

I have not looked at other accounts that I manage, like cousins who are of Portuguese and Kanaka Maoli ancestry.  The majority of those Portuguese immigrants were from Portuguese islands.  I am sure that is happening with these populations or admixed people as well.

Curious what future updates might yield, given that I see 100% Filipino and admixed Polynesians (not limited to Kanaka Maoli) can easily fall under Both sides for me.

In-common-with, shared matches, and clusterings

There are a few tools out there that these DNA testing companies provide to help distinguish our matches from each other.  They are known as in-common-with (icw) or shared matches.  The idea is that a group of DNA matches on your match list who match each other indicates a common ancestor.

Figuring out a paternal DNA match from a maternal match may or may not be as challenging for some, depending on how good a tree you have.  It might be difficult to know if a DNA match is on your paternal grandfather vs. paternal grandmother’s side, or from a maternal grandfather vs. a maternal grandmother’s side.  Or even going back further, figuring out that a DNA match is on your maternal grandmother’s father’s or mother’s side, or that grandparent’s maternal grandfather vs. their maternal grandmother’s side.  That would also depend on how well your tree is built out, and the same would apply for your DNA matches.

This is where the shared matches or in-common-with features could help.  For Polynesians, because we match each other to some extent due to endogamy (just as other endogamous populations will experience this), it can be confusing, misleading and really not useful.

Clustering

Visually, there are a few tools to help make it easier for you to distinguish.  Clustering (auto-clustering) is another tool, something that MyHeritage offers or you could use a third-party site such as GeneticAffairs.com to visually show you groups of matches.

Here I show a few of my 1st cousins who have DNA tested, both on my father’s and mother’s side.

My paternal 1st cousins are represented in the green.  My maternal 1st cousins are in red.  Then there are my 2nd cousins on my maternal grandfather’s side represented by the orange.  Going further back on my grandfather’s side, specifically to his mother’s side I have two 2nd cousins once removed who have tested, they’re in blue.  Then on my grandfather’s paternal side, other distant cousins, they are in lavender.

A closer look at this shows how on my father’s side (green) my 1st cousins will match each other, defined by a line.  Since we are all 1st cousins to each other, cousin 1 will match cousins 2, 3, 4, 5, 6 & 7, plus me of course as these are my DNA matches.  Cousin 2 will match 1 (as already mentioned) plus 3, 4, 5, 6 & 7.  The same for 3, 4, and so forth.  

For my mother’s side, I started off with the color red, my grandparents’ grandchildren. We all match each other.  Then going to my 2nd cousins (orange), they come from two different sisters of my grandfather Joseph.  So they all match each other, plus match me and my 1st cousins.  Then going back further on Joseph’s mother’s side (blue), they match each other plus my 2nd cousins plus my 1st cousins as we are descended from my grandfather’s mother Elena’s ancestors. Then finally my grandfather Joseph’s father’s side (lavender).  So while those cousins will match my 2nd cousins and my 1st cousins, they will not match my grandfather Joseph’s mother’s side.  That is the basic concept of how this will visually work.
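The matching pattern described above can be sketched in Python by tagging each color group with the ancestral lines it descends from and treating two groups as expected matches when they have a line in common.  The line labels and the function name are made up for this illustration, and it models the ideal, non-endogamous case only.

```python
from itertools import combinations

# Ancestral lines each color group shares with me (labels are illustrative)
lines_by_group = {
    "green": {"father's side"},                    # paternal 1st cousins
    "red": {"Joseph's father", "Joseph's mother",  # maternal 1st cousins
            "maternal grandmother"},
    "orange": {"Joseph's father", "Joseph's mother"},  # 2nd cousins via Joseph's sisters
    "blue": {"Joseph's mother"},                   # Joseph's mother's side
    "lavender": {"Joseph's father"},               # Joseph's father's side
}


def expected_to_match(group_a: str, group_b: str) -> bool:
    """Two groups should match each other only if they share an ancestral line."""
    return bool(lines_by_group[group_a] & lines_by_group[group_b])


linked = [(a, b) for a, b in combinations(lines_by_group, 2)
          if expected_to_match(a, b)]
for a, b in linked:
    print(f"{a} matches {b}")
```

In this ideal picture blue and lavender never connect, and green stays apart from everything on the maternal side.  With endogamy, those missing connections get filled in anyway.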

With endogamy, or with Polynesian matches, that same cluster would basically have all the dots connecting each other.  So imagine my grandfather’s father’s side (lavender) matching my grandfather’s mother’s side (blue).   See the grey lines connecting the two sides.

 

Example of every dot connecting to each other – what you would expect to see with endogamous matches.

In reality, that is what we will see because of how we all match each other.

Gephi

I finally took the time to try network analysis software called Gephi to demonstrate what this interconnected group of DNA matches could look like.  Previously I used the Triangulation tool at RootsFinder.com, which produced nearly identical results to Gephi.  But for now, I am just demonstrating what Gephi has to offer.

This diagram consists of 196 nodes (dots) and 9,494 edges (lines).  To get that, I had to import a csv (spreadsheet) file, the icw file which has 9,494 lines of names into Gephi.
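For anyone preparing their own file, here is a minimal Python sketch of writing shared-match pairs as an edge list.  Gephi's spreadsheet import recognizes Source and Target columns for an edges table, with an optional Weight column.  The layout of the actual icw file used here is not shown in this post, so the pair format, the names, and the choice of shared cM as the weight are all assumptions.

```python
import csv
import io


def write_gephi_edges(pairs, out):
    """Write (match, shared match, shared cM) rows as a Gephi-style edge table."""
    writer = csv.writer(out)
    writer.writerow(["Source", "Target", "Weight"])
    for match_a, match_b, shared_cm in pairs:
        writer.writerow([match_a, match_b, shared_cm])


# Hypothetical icw pairs; real files would have thousands of rows
pairs = [
    ("Match 1", "Match 2", 187.0),
    ("Match 1", "Match 3", 192.5),
]

buffer = io.StringIO()  # swap for open("edges.csv", "w", newline="") to write a file
write_gephi_edges(pairs, buffer)
print(buffer.getvalue())
```

Each row becomes one line (edge) in the diagram, and each distinct name becomes one dot (node).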

As I said earlier, while these clustering tools do not work for us due to the fact that we all connect to each other, usually at a very high amount of shared DNA, I was able to extract some information from it.  I probably could have extracted and gathered all of this data manually, but taking it directly from a spreadsheet is not as easy, as it is just data organized by columns, rows, and/or categories.  This is why these tools are available in order to provide a more visual way of interpreting your matches.

What I did gather from this and thought was interesting was that the longest segment size showed 12cM for 24% of my matches.  I noticed this years ago that the size of the longest segment, largest segment, or longest block (depending on the DNA testing company) for many of these predicted 2nd – 3rd cousins would be between 12cM – 14cM.  Rarely would it go over 20cM.  In my previous blog entries, I mentioned the importance of the longest segment size in determining a true 2nd – 3rd cousin.

Looking at that same data, we see that only a single DNA match has the longest segment size of 64cM.  That DNA match is actually my 2C2R (2nd cousin twice removed). 

This next image is the same data except now it’s showing the number of shared segments.  Before Ancestry provided us with the longest segment size, we could only go by the total amount of shared DNA and the number of segments.  So the top (28% of my matches) shows 28 segments.  They seem to range between 25 – 29 for the most part.
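Percentages like these can also be tallied straight from a column of numbers. Here is a minimal sketch, assuming you have a plain list of longest-segment sizes (or segment counts) copied from a spreadsheet. The values below are made up and the function name is only for illustration.

```python
from collections import Counter

def share_by_value(values):
    """Map each distinct value to the percentage of matches that have it,
    most common value first."""
    counts = Counter(values)
    total = len(values)
    return {value: 100 * n / total for value, n in counts.most_common()}

# Made-up longest-segment sizes in cM (rounded), not real matches
longest_segments = [12, 12, 13, 14, 12, 20, 13, 64]

for size, pct in share_by_value(longest_segments).items():
    print(f"{size}cM: {pct:.1f}% of matches")
```

The same function works for the number of shared segments, since it only counts how often each value appears.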

An important thing to notice about this particular data is that, unlike other people who can actually produce nice clusters, I had to limit the amount of shared cM (centimorgans) when I ran this icw file, which took about 4 hours.  The icw file for this particular diagram, which I finished running last night, covers matches ranging from 185cM – 199cM.  Yet I had 98 matches that fell into that range.

Prior to this particular icw file, I ran one back in May 2022 where I went as low as 90cM.  So it is 90cM – 190cM.  This was the result of that older icw file.

Looking at the data, 13 segments seems to be at the top, making up about 14% of these matches.  That particular file had 1,215 matches, for which the icw file produced 2,049 nodes and 1,046,502 edges.  That is a lot of dots and lines.

A few people had suggested using Gephi as I could tweak the data.  I have been tweaking it for about a week and, as I knew would happen, I have not been able to get anything unique from it.

The problem with this, something that any endogamous group would encounter, is running the icw file.  Imagine having only 10 DNA matches.  For an endogamous person, who could match nearly all the other people even without being closely related at all, the connections multiply easily.  So match #1 would match all of the other 9 matches on that list.  Match #2 would do about the same, matching all 9 other matches on that list.  And the same for match #3, match #4, etc.  So that icw file gets larger and larger.  Now complicate that issue: the less DNA you share, the more people you probably match, or the longer the list of icw people to add.  This is why, when I ran it again for the first time since last May, I went only from 199cM down to 185cM rather than 90cM – 190cM.  As I go lower, the number of matches, nodes and edges will greatly increase.
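To put rough numbers on how quickly this grows, here is a small sketch of the worst case, where every match on the list also matches every other match. The function name is only for illustration, and a real icw file will usually fall somewhere below this ceiling.

```python
def max_connections(n_matches):
    """Number of distinct pairs if every match also matches every other match."""
    return n_matches * (n_matches - 1) // 2

# The 10-match example, then some larger lists for comparison
for n in (10, 100, 1000):
    print(n, "matches ->", max_connections(n), "possible connections")
```

Ten matches give at most 45 pairs, but the ceiling grows with the square of the list size, which is why lowering the cM cutoff makes the file balloon.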

For non-endogamous populations, expect to see something that would be more clear.  Utilizing Gephi you could easily attach names and whatever data you would like to the nodes and distinguish each cluster from each other easily.

Auto-Cluster

As I mentioned, MyHeritage is one of the DNA testing sites that offers auto-clustering of your DNA matches.  If you have tested at MyHeritage, you can run an auto-cluster as often as you would like.  Unlike GeneticAffairs.com, where you can adjust the parameters, MyHeritage seems to set them automatically.  So depending on the number of matches that you have, or in my case the large number of icw matches, they (automatically) decide what would be best to produce a decent number of matches.

First, an example of what you would see with autoclusters:

What are autoclusters

Image from MyHeritage’s FAQ page.

What you would get are colored blocks assigned randomly.  The grey squares are DNA matches who happen to match someone in one cluster as well as in another cluster.  This could indicate that you have a DNA match who might not have enough shared DNA to match everyone in a particular cluster, something that you would see with a more distant relative, like a 2nd cousin of yours not matching a lot of your common 3rd cousins.

That is basically how clusters work.  They are to help you figure out how your DNA matches match each other.  Then of course it is up to you to figure out based on their trees how all of you connect.

This autocluster of mine I generated back in June.

I actually now have two clusters.  MyHeritage puts a limit on the maximum amount of shared DNA to be used in autoclusters: 400cM, since that is about the level you would share with 2nd cousins, not with 1st cousins, though maybe with a few 1C1R (1st cousins once removed).  My second cluster, which reflects my paternal (Filipino) side, actually does consist of two 1C1R, a 1/2 1C and a 2C (2nd cousin).  One of those 1C1R in my second cluster is also Kanaka Maoli like myself, so that cousin did produce a few grey squares with some of my other DNA matches in that larger cluster.

What I also did was extract the data which I put on the right-hand side.  I sorted it by the least amount of shared DNA and identified the person if I knew their ancestry. You can also see the size of the largest segment and the number of segments.

A reminder that with MyHeritage’s autoclusters they implement a maximum threshold of 400cM.  The minimum threshold will vary depending on the person’s DNA matches, how much they share with you as well as how much they share with each other.

In my case, there were 494 matches taken from my list who share less than 400cM with me but more than 95cM (actually 108.1cM was the lowest amount shared).  They also decided that in order to be considered a shared DNA match, my matches need to share at least 95cM with each other.
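The effect of those two thresholds can be sketched as a simple filter. This is only an illustration of the idea, not MyHeritage's actual method: the function name, the field names and the sample matches are all made up, and whether the real cutoffs include or exclude the boundary values is an assumption.

```python
def eligible_for_autocluster(matches, min_cm=95.0, max_cm=400.0):
    """Keep matches whose total shared cM falls inside the thresholds.
    Treating both ends as inclusive is an assumption."""
    return [m for m in matches if min_cm <= m["total_cm"] <= max_cm]

# Made-up matches for illustration
matches = [
    {"name": "Match A", "total_cm": 880.0},  # too close (above 400cM)
    {"name": "Match B", "total_cm": 108.1},
    {"name": "Match C", "total_cm": 250.3},
    {"name": "Match D", "total_cm": 62.0},   # too distant (below 95cM)
]

kept = eligible_for_autocluster(matches)
print([m["name"] for m in kept])
```

For a non-endogamous tester the minimum could sit much lower, which is why the minimum threshold varies from person to person.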

Conclusion

While these tools are great for separating your DNA matches and possibly helping you figure out how each one is connected to you and to each other, Polynesians will not benefit from these at all.  They actually could be misleading if one does not understand what one is looking at, which is a lot of closely predicted 2nd, 3rd and 4th cousin matches.

AncestryDNA’s Ethnicity Update, Ethnicity Inheritance, and Chromosome Painter

Ancestry updated their ethnicities again, although they made some adjustments earlier this year. They also released their Ethnicity inheritance back in May. You can read more about it here: SideView™ Technology.

They also have a Beta version of a Chromosome Painter, where they paint the chromosomes with the regions they’re associated with in your ethnicity estimate.

ETHNICITY INHERITANCE


The Ethnicity inheritance estimates which regions you inherited from each parent.  Once you know which side belongs to which parent, you can edit it to identify your paternal and maternal sides.

 

CHROMOSOME PAINTER

Currently, the Chromosome Painter is in Beta and not everyone has this feature yet.  It attempts to assign each ethnicity to a specific part of your chromosomes.

You can click on paternal, maternal, or see them all together.  You can also click on each ethnicity to see where specifically they are located on each chromosome.

 

NEW REGIONS

The former Eastern Polynesia & New Zealand Maori region is now in two separate categories – Hawaii and New Zealand Maori regions. These new regions are supposed to provide more precise results for people of both Hawaiian or Kanaka Maoli (aboriginal Hawaiian) and Maori heritage.

They state that people from places near or with deep historical and genetic ties to Hawaii and New Zealand, like French Polynesia and the Cook Islands, will most likely see their previous Eastern Polynesia & New Zealand Maori percentage split between the new Hawaii and New Zealand Maori regions.  This is similar to when they created the Samoa and Tonga categories: Samoans were getting some small percentages of the Tonga region and Tongans were getting small percentages of the Samoa region, plus anywhere from 0% to 2% of Eastern Polynesia & New Zealand Maori, just as we would get 0% – 2% Samoa or Tonga, and that would vary with every update.

They have made a few “updates” which I believe was an increase in the number of their reference samples.  

Currently, their reference panel has 68,714 DNA samples that divide the world into 84 overlapping regions and groups.  For the Polynesian groups:

Hawaii – 392
New Zealand Maori – 206
Samoa – 91
Tonga – 164

I tried to see if there was some type of pattern as far as how much of the Hawaii region would show up versus the New Zealand Maori region for Hawaiians and Maoris.  There almost seems to be some consistency in what a few Hawaiians and Maoris have been showing me.

Just comparing my own results, along with my mother’s and one of my cousins.

Focusing on the Polynesian percentages.

When looking at my cousin’s results, since she’s not admixed, I figured maybe it’s about  70% of the correct region versus the other 30% of the other region that we’re not part of.  Other people have shown me their results and it seems like they do fit that range.

When I calculate the percentages that both my mother and I have, this is what I came up with:

 

It seems pretty close to that 70% vs. 30%, but not consistently as I do see varying results from other people including some of my other relatives.
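For anyone wanting to run the same comparison on their own results, here is a minimal sketch of the calculation: each region's share of the combined Hawaii and New Zealand Maori total. The function name and the example percentages are made up for illustration and are not taken from anyone's actual results.

```python
def polynesian_split(hawaii_pct, maori_pct):
    """Return (Hawaii share, NZ Maori share) of the combined Polynesian
    total, each as a percentage."""
    total = hawaii_pct + maori_pct
    if total == 0:
        return (0.0, 0.0)
    return (100 * hawaii_pct / total, 100 * maori_pct / total)

# Made-up example: an admixed person showing 56% Hawaii and 24% New Zealand Maori
hawaii_share, maori_share = polynesian_split(56, 24)
print(f"Hawaii {hawaii_share:.0f}% vs. New Zealand Maori {maori_share:.0f}%")
```

Dividing by the combined total is what lets an admixed person's split be compared with that of someone who is not admixed.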

At first, I thought this was more prevalent among those who are less than 25% Polynesian, since those who were sharing these varied percentages had only a single grandparent or great-grandparent and beyond who was Polynesian.  After a closer look at my own relatives, I realized it’s still not consistent.

One of them does fall into that range I expected while the other two are similar to each other, more of an 80/20 split.  

And then there are some of my relatives I saw that still have that small percentage of Samoa or Tonga.

While it would be nice to see 100% matching to the appropriate region, the same applies to everyone else who isn’t Polynesian and who has mentioned that they do not show the correct amount of German or French or Spanish vs. Portuguese.  It is like my own results, where they removed my “England and Northwestern Europe” and replaced some of it with “Wales” while some of it went to my “Norway” region.

I have been keeping track of all of the updates I’ve been getting since I tested with Ancestry back in 2014.

So as they attempt to get specific with our region, after the science team told me to my face back in 2018 that they could not split the former Polynesia region, it becomes less accurate and seems to cause more confusion.

And while their communities are still there (they are below the regions and other features near the bottom of your results), I continue to witness many who still believe that they do have ties to that other region, the one that does not appear in their known genealogy.  Like a Hawaiian insisting that they have Maori heritage or a Maori insisting that they have Hawaiian heritage when the paper trail does not support it and/or they are known to just not have that heritage at all.

What confuses people more about the ethnicities is the fact that we can be a very close match to other Polynesians, particularly those of the same region.  So a New Zealand Maori and a Hawaiian for example can be predicted as first or second cousins.

Here I compare that same cousin that I mentioned who isn’t admixed, showing her top/highest/closest endogamous matches.  I identify who is NZ Maori vs. Hawaiian (Kanaka Maoli).  Then I added in my own top matches who are my known closest cousins. 

There is an easy way to distinguish a true 1st, 2nd or maybe 3rd cousin match from an endogamous match.  The longest segment, for example, is essential in determining a true 1st, 2nd or 3rd cousin match.  Endogamous matches also have a high number of segments, something that you will not see with your true 1st, 2nd or 3rd cousins.
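One quick way to see the difference is to put the three numbers side by side: number of segments, longest segment, and average segment size (total shared cM divided by the number of segments). The sketch below uses two made-up matches with similar totals. The figures are only in the general range discussed in this blog and are not real matches.

```python
def average_segment_cm(total_cm, n_segments):
    """Average segment size: total shared cM divided by number of segments."""
    if n_segments <= 0:
        raise ValueError("n_segments must be positive")
    return total_cm / n_segments

# Made-up matches with similar totals but very different segment patterns
matches = [
    {"label": "endogamous match", "total_cm": 196.0, "segments": 28, "longest_cm": 12.0},
    {"label": "known close cousin", "total_cm": 225.0, "segments": 9, "longest_cm": 64.0},
]

for m in matches:
    avg = average_segment_cm(m["total_cm"], m["segments"])
    print(f'{m["label"]}: {m["segments"]} segments, '
          f'longest {m["longest_cm"]}cM, average {avg:.1f}cM')
```

Many small segments with a short longest segment point toward endogamy, while a few large segments point toward a recent common ancestor.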

In a future blog post, I will blog more about these anomalies that exist for Polynesian matches, and the difference between an eastern Polynesian versus western Polynesian match.

But for now, I am hoping that with more updates the results become more precise and cause less confusion.

Making a connection: Utilizing the Longest Segment size

On December 29, 2021 while going through my new matches at AncestryDNA, I decided to go through my closest matches to be sure I didn’t miss anyone. Usually, these closer matches are endogamous, mostly New Zealand Maoris.

While going through my list of top matches, I noticed a match that I had not viewed who had a Japanese surname, which to me would indicate that the match has ties to Hawai’i. And I was right, but more surprisingly, he shared a fairly large segment with me – 41cM. The person had no tree, so I immediately contacted him identifying my maternal grandparents and their parents. Right after that I began to search everything I could and managed to only find the names of his parents, as all three of these people (including the match) were star athletes, so their names came up often in the newspaper. I couldn’t find anything else other than their accomplishments and winnings.

The usual process that I do to figure out how a DNA match is connected is to compare with other relatives and to see which one of these relatives shares a large segment, at least 30cM. This would indicate a non-endogamous match, or rather it is more likely to be a true 2nd – 3rd cousin match. Since I have access to these relatives’ accounts, I can see how large of a segment they share. Using the “Shared Matches” feature is not useful when dealing with Polynesian matches as we tend to match everyone.
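That comparison across relatives can be sketched in a few lines. This is just an illustration of the 30cM rule of thumb described above: the function name, the relatives and the segment sizes are all made up, and the threshold is a parameter because other endogamous groups may use a different cutoff.

```python
def relatives_sharing_large_segment(longest_by_relative, threshold_cm=30.0):
    """Return (relative, longest segment) pairs that meet the threshold,
    largest segment first."""
    hits = [(name, cm) for name, cm in longest_by_relative.items()
            if cm >= threshold_cm]
    return sorted(hits, key=lambda item: item[1], reverse=True)

# Made-up figures: the longest segment (cM) each relative shares with a new match
longest_by_relative = {
    "Relative A": 41.0,
    "Relative B": 12.5,
    "Relative C": 33.0,
    "Relative D": 9.8,
}

print(relatives_sharing_large_segment(longest_by_relative))
```

The relatives who clear the threshold suggest which branch to research first, but as the example with Jan below shows, a true 2nd cousin can still fall under 30cM.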

The cousins whose account I have access to are in bold. (These are not their real names)

So I go through my list of matches, and usually I can figure it out with the longest segment size to which branch the match belongs. But that may not be as effective once you get into the 3rd cousin range. In fact, while Jan, Steve, Lani and Lei are all my 2nd cousins, Jan and I share a longest segment size of 24cM while the others exceed 30cM. This is why I have been pushing the 30cM longest segment size, although sometimes, as the exception with me and Jan, it may not be as large. But with Jan compared to Lani and Lei (her 2nd cousins) and with her brother Steve compared to Lani and Lei, and when I compare myself to Lani and Lei, we do all share a longest segment size that is more than 30cM.

But this is what it looked like for all of us when I compared the longest segment size.

So while Steve had the next highest (after me & my mother) longest segment size, it wasn’t enough to convince me that the match was on my grandfather’s side. And while I usually ignore the total shared amount of DNA when trying to narrow down to which branch (my grandfather’s paternal vs. maternal side or my grandmother’s paternal vs. maternal side) that match belongs, you can see that compared to my 2nd cousins Lani, Lei, Jan and Steve, only Lani shares less total DNA with this new DNA match than I do. My other 2nd cousins, as well as my 1/2 1st cousin Angie, share more DNA with the match than I do.

Days later I looked into the match again. And not with my other DNA matches as I’ve exhausted all avenues on the DNA aspect and matching, but rather searching for my match’s ancestors.

Again, since the match and his parents were athletes, I kept finding mostly articles about their sports activity, accomplishments and winnings. I also found sports photos of them in yearbooks (on Ancestry), and looked through social media, trying to compare features of the match’s parents, although you really can’t compare with Polynesians. A lot of Polynesians, even admixed Polynesians look like some aunt, uncle or cousin of mine. But then it dawned on me, what are the chances of the match actually being related to me via his mother? Her full name was mentioned in the paper, yet I was focusing on his father and actually stopped searching because I could not find more information on his father. I even looked in my (genealogy) database of all the family members that I have, looking for this match’s surname just in case I had it. But I did not. So now I thought I should really focus on the mother.

I managed to find in the 1972 Directory of City & County of Honolulu the match’s mother living at the same address as the people I believed were her parents. And from dates in the yearbook photos of the mother and maternal grandfather, the mother was not much younger than me. That made me realize that it is more likely that the connection is a few generations back.

I searched the newspapers with the match’s maternal grandfather’s name and found a 1952 obituary that mentioned the grandfather as a brother to the deceased, who was a 9 year old boy. The obituary mentioned the parents of the deceased as Mr. and Mrs. H. Markham and several siblings. A couple of the siblings had that same surname (Markham) and a few others carried the same surname as the DNA match’s grandfather.

I immediately thought that maybe the connection is through Mrs. Markham, who was previously married to the maternal grandfather’s father. Near the end of the obituary, it mentioned the grandfather of the deceased – John KAHEAKU. I knew exactly who that was, as he was married to Violet HOLBRON, the sister to my great-grandmother Rose HOLBRON. Incidentally, when I searched my database for that couple, I did have their daughter Violet KAHEAKU listed as having two spouses. My database showed that with the first spouse, Violet had a 9 year old boy who died in 1952, and that she also married Harry MARKHAM.

So the DNA match turned out to be my 3rd cousin once removed.

While utilizing the longest segment is good practice, once you get beyond the 2nd cousin level, the segment size may not be as large, as in the case of my cousin Angie who is a 3C1R to the DNA match. Sometimes you can also tell by the high number of segments. In my example, I do share fewer segments, but so does my cousin Lani, who really isn’t (recently) connected to this new DNA match, my newfound cousin.

It’s still a good rule to go by when you have a lot of 2nd to 3rd cousins, or as in my mother’s case a lot of 1st to 2nd cousins.

Ancestry is finally showing Longest Segment size

I have been waiting for this, having asked Ancestry for it a few years ago.  Apparently, I was not the only person of an endogamous background who had asked for it.

I went through my list looking for the first known Maori, just to see how large the longest segment would be.  I have always advised Polynesians to look for anything at least 30cM for the largest segment (longest block at FTDNA) or as Ancestry is calling it, longest segment size in order to determine a true 2nd to 3rd cousin relationship.  I know with other endogamous groups they tend to look for something around 20cM.

At Ancestry, you will have to click on that match’s name in order to see the longest segment size.

Notice how the longest segment size is below 20cM, but based on the total amount shared the predicted relationship is anywhere between a 2nd – 3rd Cousin.  To show you what that looks like against known 2nd to 3rd cousin relationships, I am showing my match list about where the endogamous matches come in.  I indicate the Hawaiian ones versus the Maori ones and my known cousins.  I am inserting the longest segment size since you cannot initially see it on your list until you click on the match.

While I have a lot of 3rd cousin matches, my mother and one of my cousins have a lot of 2nd cousin matches.  My cousin had over 500 predicted 1st – 2nd Cousin relationships.  Just looking at her top matches, I indicated the known relationships versus any Maoris and Hawaiians that she matches.

This definitely will help with determining the endogamous matches.  But the longest segment size does get smaller the more distant the relationship becomes.  So by the 3rd to 4th cousin level, you may not really be able to tell, except for the fact that we tend to get a lot more segments.

I have a 2nd cousin with whom I do not share a lot of DNA.  While we still share in the range of what is expected for a second cousin, the longest segment size is just over 20cM.

 

At least sorting through these matches has become easier now that we have this additional feature.  Again, this works well with the closer predicted relationships.  This may not be as useful if you already have a lot of distant matches and your Polynesian matches fall within that range.  A lot of my western Polynesian (Samoans and Tongans for example) matches are in that range.

Below are some of my Samoan matches and while their total shared is not a whole lot, their longest segment size is significantly smaller compared to what we normally see with eastern Polynesians.  This is true with other DNA companies like FTDNA and 23andme.

My top FTDNA matches where the endogamous matches come in among my known relationships.

I am hoping that in the near future Ancestry will put the longest segment size immediately on the match list page so it will be easier to go through rather than click on each name to see if the match really is worth looking into.    For now, what we have is definitely an asset to help us sort through these matches.

More Genetic Communities at AncestryDNA

Finally, after I asked AncestryDNA to split their Genetic Communities into at least 2 main regions (eastern Polynesia vs. western Polynesia), they came up with a major update, not just for Polynesians but for other places in the Pacific Islands, Asia, and America.

The former “Hawaii, Tonga & Samoa” genetic community has been broken into 4 different communities.

The maps that go with these genetic communities are not the best given how small these islands and atolls are on the map.  Not to mention how distant one island nation is to the next, especially when you see how AncestryDNA decides to group them together.

The Polynesian Islands genetic community is basically the same map that they had for the former Hawaii, Tonga & Samoa genetic community.

The Cook Island & French Polynesia genetic community covers a vast area.  But the area covered is nothing like comparing the distance between New Zealand & Hawaii.

The Hawaii & New Zealand genetic community basically just highlights the remaining extreme points of the Polynesian triangle (minus Easter Island), with Hawaiʻi to the north and New Zealand to the southwest.

The broader Hawaii, Tonga, Samoa, Fiji & New Zealand genetic community’s map does not even zoom in.  You will have to zoom in to see a better view of the islands within this genetic community.

And finally, the Tonga, Samoa & Fiji genetic community would also include other western Polynesian islands like Niue, Tokelau, and Tuvalu.  What is interesting, although it is no surprise, is that they also included Fiji in this group.  Historically, Fiji played some role in the initial populating of Remote Oceania.  I have seen a few Fijian matches not just for me but also for my mother and cousins.

It would be nice if in the future they really can fine-tune these genetic communities a bit more.  While we know that eastern Polynesian people came from western Polynesia centuries ago, those of us from Hawaiʻi, Aotearoa (New Zealand) and Rapa Nui (Easter Island) know that we had ancestors coming from what is known as French Polynesia (Tahitian archipelago and the Marquesas) as recently as 800 years ago.

So while I did not get the French Polynesia (with Cook Islands) community, my mother did.  She does have more matches with western Polynesians (Samoans and Tongans) than I do.  She also has more matches with Fijians than I do, so I am not surprised she got the Tonga, Samoa & Fiji and also the Hawaii, Tonga, Samoa, Fiji & New Zealand genetic communities.

We will see as time goes by how my own genetic communities get updated.

Separating maternal matches from paternal matches

A problem that endogamy presents is when you have a match who matches you on both your paternal and maternal sides of the tree.  If you do not know how you are related, figuring out the connection is challenging.

Working out how matches for my mother are connected can be difficult.  Both of her parents were Kanaka Maoli.  So unless they have trees or I have the motivation to trace a match’s ancestors beyond what they already have, I usually would ignore the match. It takes a lot of work to distinguish if the match is related on my mother’s paternal or maternal side.

While it is only my mother who comes from an endogamous background, my father, on the other hand, was Filipino and I get very distant matches on that side.  And like my endogamous side, I pay no attention unless the match has a tree where I could figure out our connection.

Being from Hawai’i, I do encounter a lot of matches who are like me where they are part Filipino and part Kanaka Maoli.  I have seen a few matches whose trees indicated ties to the same island as my Filipino grandmother.  For their Hawaiian branches, they may or may not show the same geographic area where my Kanaka ancestors lived.  For the most part, we do tend to match on a DNA level because of the endogamous side since, as I mentioned earlier, the matches on my Filipino side are usually distant.

Here I demonstrate showing my closest cousins on my Filipino side, and how they can easily match up relatives on my Kanaka side.  Basically, my mother matches a few of my cousins on my father’s side.  It is because my Filipino cousins are also part Kanaka Maoli, and they are connecting to my mother via that side.  Of course, something like DNAPainter or Kitty Cooper’s Segment Mapper could be used to show which segments are from my father versus my mother.  But the point here is to just compare how my paternal cousins also match my maternal cousins.

Paternal cousins indicated in RED and maternal cousins in BLUE.

 

For the ones without names, I indicate how they are related to me, e.g.  1st cousin (1C), 1st cousin twice removed (1C2R).

For my maternal cousins in blue, I list how much they share with my paternal cousins.  But for my mother and myself, I show how much we share with both my paternal and maternal cousins.  In some cases, my cousins on my mother’s side have other endogamous (Kanaka) lines so they might share more DNA than expected compared to another closer relative of theirs, or even to my mother.  For an example, take a closer look at cousin #6 and to their parent cousin #5.  Another example is cousin #3 and #4 compared to my mother.

In the example above I only used my close paternal cousins, and know how we connect.  But when dealing with distant matches and no trees, it will be difficult to differentiate paternal versus maternal matches.

This does not include recent pedigree collapse where I do have on my Filipino side cousins who share the same common ancestors more than once, or where I have cousins who are related to each other in more than one way.  This can also affect the amount of DNA shared.

Ancestry updates their ethnicity yet again

As of November 13, 2019, everyone’s AncestryDNA results were updated.  Back in late October, only a few people and all new testees were getting the new update.  Now we are all on the same page.

They made several changes, which include increasing the number of genetic communities for various populations, increasing the size of their reference samples, renaming categories and adding a few new categories such as Guam, Samoa and Tonga.

We are going to concentrate on Samoa and Tonga, which they attempted to split off from the rest of Polynesia.

When AncestryDNA created the Polynesia category back in December 2013, it only consisted of 18 Polynesian samples, at least one (or possibly more) of which had distant European ancestry.  They updated their category and rolled out the new update to everyone back on September 12, 2018 with an additional 40 samples, increasing the total to 58 for Polynesia.

In June and December 2018, I had the opportunity to speak to David Turissini, Ph.D, who is a population geneticist at AncestryDNA.  I expressed my concerns to him regarding more specific categories among Polynesians, basically splitting eastern from western Polynesia.  I also explained why I thought that would be much better for us, particularly for matching, as we all tend to match each other at a very closely predicted relationship.  And that I thought the low number of reference samples could possibly affect the way we get our results.

He told me that I already understood how Polynesians lack genetic diversity so increasing the number of samples would not make any difference.  But then I pointed out how it was not that difficult for me to distinguish a western Polynesian (Samoan, Tongan, Tokelau, Tuvalu) versus an eastern Polynesian (Maori, Tahitian, Cook Island Maori, Hawaiian, Marquesan, Rapa Nui).

Despite all that was said, I was surprised to see how they increased the number of reference samples for Polynesia along with adding in Samoa and Tonga.

New categories & increase of samples for Polynesia

You can read more about it here:

https://www.ancestry.com/cs/dna-help/ethnicity/estimates

So their reference samples of 16,638 have increased by 23,379 samples to a total of 40,017.  Of that amount, they added 130 more samples to the Polynesia category and created Samoa with 73 and Tonga with 97 samples.

While I have not noticed a lot of Tongan results yet, I have seen several Samoans.  Most of the ethnicity results I have seen are either Hawaiians or Maoris.  For the most part, eastern Polynesians are getting either Samoa and/or Tonga in the range of 1% – 4%.  For Samoans, I’ve seen about 60% – 70% Samoa and the rest Tonga.  A few Cook Island Maoris seem to have a higher percentage of Samoa compared to other eastern Polynesians but that may be due to the fact that they have ties to Aitutaki or its neighboring islands versus Rarotonga.  Or maybe Cook Island Maoris just have a higher percentage because of another group of people that settled earlier and/or it could be due to the original people who just so happened to be genetically more like Samoans.

This whole classification, while it cannot be accurate as it is nothing but an estimate, really makes it interesting and gives us a bit more insight into the settling of Polynesia.  Of course we can also see this as more people are getting Y-DNA and mtDNA tested, and we slowly learn more about these different migration patterns which, no surprise, confirm our oral histories.

My results have changed throughout time since I tested with AncestryDNA back in January 2014.  The biggest breakthrough came last year as they actually created the Philippines category which correctly allocated my Filipino side from Polynesia, therefore decreasing my amount.

But what does my tree look like compared to my current DNA results?

 

With the latest update it made my color scheme more difficult to accomplish but in the tree I do point out the foreigners.  While my father was born in Lahaina, Maui, Hawai’i, both of his parents were from the Visayas region in the Philippines.  For my maternal grandmother’s mother – Rose Holbron, her paternal grandfather was from Hull, England while her maternal grandfather was from Queens, New York, U.S.A.  And for my maternal grandmother’s father – Frank Kanae, he had distant American ties.  His great-grandfather Isaac Lewis Kanae was the son of Captain Isaiah Lewis.  I still have not pinpointed his origin yet.  And Isaiah Lewis’ father-in-law Oliver Holmes arrived in the Hawaiian Islands in 1793 from Plymouth, Massachusetts.  At the time Oliver Holmes left Plymouth, there were only 15 states in the U.S.A.

So what I did was place their ethnicities under a continental level and compare it to my DNA results, which all adds up fairly nicely, taking random inheritance into account.  My mother gets 17% European compared to her sister who gets exactly 15%, which is consistent with the genealogy.  And in turn my mother gave not one but both of my brothers about half of her European – 8% and 9% for them – while I ended up with the higher percentage – 11%, which appears as about 11% – 12% at different testing companies.

And while I show 2% Samoa, my mother ended up with 1% of both Samoa and Tonga.

 

For my cousin who is not admixed, it was interesting to see, despite the erroneous genetic communities that would come up, how hers changed.  Because we match other Polynesians at a very closely predicted relationship, and the fact that my cousin is not admixed, she matches a lot of part Polynesian people who fall into a specific genetic community among others of whom she also matches.  So she ends up with the same genetic community.

 

 

With this latest update, they finally got rid of the Native American category for both my cousin and my mother.  But now with Samoa and Tonga, it is no surprise that they would give us a small percentage of that.  And having gone through several of these 1% – 2% categories of Samoa and Tonga, they all seem to range the same – 1% – 4%.  Interestingly for my mother, her range for Tonga was 1% – 3% while her Samoa was 1% – 4%.  But the way it ended up was both 1%.

I have also been witnessing those who previously had small amounts of Polynesia now being reclassified as Samoa, Tonga or Guam.  Usually, these are people with either Melanesia or some other Southeast Asian from various parts of Indonesia.  I would be really interested in seeing more results who have ties to that area.

So while I was told the increase in the number of samples would not do anything, it obviously did quite a bit.  If only they had renamed the Polynesia category by specifying Eastern Polynesia.  They should also do the same by renaming their genetic community.  It would make more sense as we know that both Samoa and Tonga are part of Polynesia and of course, their map for Polynesia would include Samoa and Tonga within that area.  I would have expected western Polynesia as I mentioned to them versus eastern Polynesia, but they really got very specific.  And in the end result, Samoans will see that they are about 30% Tongan and probably the same for Tongans where they will see a smaller percentage of Samoa.  These people do get about 0% – 1% Polynesia in their results.

We will just have to wait to see what the future updates would bring.

Previous entries about AncestryDNA’s Polynesia category:

https://hawaiiandna.wordpress.com/2014/12/15/polynesia-category-ancestrydna-com/

https://hawaiiandna.wordpress.com/2015/06/30/polynesia-category-ancestry-com-part-2/

Largest Segment – Is it the best way to gauge the closeness of relationship?

In my earlier blog posts I have mentioned how significant the largest segment size is when determining a true 2nd to 3rd cousin relationship.  Polynesians can have a total shared amount that can easily exceed 100cM.   These totals tend to over-estimate the predicted relationships.

From the ISOGG Wiki’s page, you can see that the average for 2nd cousins once removed (2C1R) is 106cM while 2nd cousins are averaging around 212.50cM.
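For anyone curious where those averages come from, here is a minimal Python sketch of the arithmetic.  It assumes a genome total of 6800cM, which is my assumption chosen because it reproduces the 212.50cM figure; testing companies use somewhat different totals.  The function name and parameters are made up for illustration.

```python
# Assumption: 6800 cM total (both copies of the genome). This value is chosen
# because it reproduces the 212.50 cM average quoted above.
TOTAL_CM = 6800.0

def expected_shared_cm(meioses: int, common_ancestors: int = 2) -> float:
    """Average cM shared by two relatives.

    meioses: number of births separating the two people through one
             common ancestor (6 for 2nd cousins, 7 for 2C1R).
    common_ancestors: 2 for full cousins (an ancestral couple),
                      1 for half relationships.
    """
    return TOTAL_CM * common_ancestors * 0.5 ** meioses

second_cousins = expected_shared_cm(6)      # 3 generations down on each side
second_cousins_1r = expected_shared_cm(7)   # one extra generation on one side

print(second_cousins)     # 212.5
print(second_cousins_1r)  # 106.25
```

These are averages for non-endogamous populations only.  They say nothing about how the total is split into segments, which is why the largest segment matters so much for Polynesians.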

So we tend to get a lot of these 2nd – 3rd cousin matches, depending on the company you tested with.  This is why the largest segment size has become important.  Blaine Bettinger has a post entitled The Shared cM Project – Longest Shared Segment where people had submitted their longest segment size based on their known relationships.  You can compare 2nd and 3rd cousins there and see what the average is for the longest segment size for specific relationships.

A quick look at the type of numbers just by looking at my own ONE TO MANY from GEDmatch.com.

My cousin Allen who is a 2C1R to me (his maternal grandmother & my mother are 1st cousins) has a large segment of 35.9cM.  You can see more comparisons of the largest segment for 2nd to 3rd cousins from Blaine’s Shared cM Project but I also have been keeping my own numbers from my known relatives.


Only one of those 2nd cousins shared a large segment of 21.8cM, pretty small, and then it gets even lower as you go more distant.  But normally 2nd cousins will share a rather large segment, which is why more than 20cM has always been advocated and also among the Ashkenazi Jewish community.  In fact, I thought they used 25cM, but I could be wrong.  I even mentioned 30cM would be good.

But is it a requirement?  Absolutely not.  However, if you cannot find a connection, or the same geographical origin, e.g. New Zealand or Hawaii, then that would be a strong indicator that you are not as closely related as it was predicted.

I have been noticing how I do have a few Hawaiians whose largest segment is more than 30cM but have not been able to find a connection.  I also notice that these matches will not have the same geographical origins as I do.  So could it be that these large segments remain in our population for many generations?

Here’s an example of how it actually has remained for centuries by comparing my Hawaiian mother and a Maori.

Taking my mother’s ONE TO MANY matches, I sorted them by the largest segment size.  I indicated the known relatives in blue and the unknown in red.  My mother has a Hawaiian match with 44.6cM for the largest segment.  I still have not been able to find a connection, although one of that match’s branches goes back to the area of a few of my ancestors.  But even for us, that was more than 3 generations ago from my mother.  Another is at 39.9cM, and I am not sure if that person is Hawaiian or Maori.  And there is a Maori match with the largest segment of 25.9cM.  At FTDNA, there is a Maori match whose largest segment is 23cM.

Here is the largest segment sized match with a couple of Hawaiians from MyHeritage.

37.2cM and 33.6cM.  They have pretty good trees but their ancestry goes back to totally different islands from my own ancestors.  And I saw in their trees that the origins on the different islands are further back, while the more recent ancestors were born in Honolulu where some of my more recent ancestors were born.  I did trace many of my ancestors’ descendants who remained in Honolulu but none of those connect to these matches.

Here is a Maori match from MyHeritage.

Notice that the largest segment is 34.2cM.  The highest I’ve seen with a Maori.  How can a large segment last that long after many centuries?

And while the focus here is utilizing the largest segment to more accurately find a true 2nd to 3rd cousin match, we know how in one generation a large segment can quickly be reduced.

Comparing with the largest segment that my mother shares with her 1/2 3C.  This is how they connect.  I outlined in yellow all testees in this particular comparison.

The largest segment that my mother & her 1/2 3C share is 49.6cM (FTDNA indicated 52cM) according to GEDmatch.  But that particular segment was not inherited entirely by my mother’s sister and seemed to have been broken up thanks to recombination and turned into a 10.7cM and a 25.1cM segment.

My mother’s deceased brother seemed to have received that same segment or maybe even slightly larger.  And while he is not alive to test, his son did, and he shares 50.2cM with this 1/2 3C of our parents, or our 1/2 3C1R.

This is what the comparisons look like.

My younger brother got nearly the entire segment as my mother got it but I got a very small portion of it, just 14.3cM.  That’s a huge difference from 49cM.  Had neither my mother nor my younger brother been tested, I would not have been able to find this good match and would have concentrated on matches with large segments more than 20cM or even 30cM.  My older brother got DNA tested; however, he does not share any of this same matching segment.  In fact, he shares 0cM on this particular chromosome.

This 1/2 3C was key in finding my mother’s biological parents.  At the time I did not know how we were related but I did concentrate on this match because of the large segment size.

So how do we really filter all of these matches?  By solely concentrating on the largest segment?  You should definitely not spend too much time on large segments that are less than 20cM and whose shared total is way over 200cM.  With those particular matches, if you compare trees and notice no common geographic area, that would be a big indicator that it is a distant match.

Remember that with a 2nd cousin you would share a pair of great-grandparents.  With a 3rd cousin you would share a pair of 2x great-grandparents.  By that generation or even a generation further back or two if you find that you do not share the same geographic location, then the match is a distant match.  The same applies for large segments greater than 30cM.  If no common geographic location, then it is probably a distant match.
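The rules of thumb above can be written down as a small triage function.  This is only a sketch of my own heuristic, not a tool that GEDmatch or any company provides; the class, field names and example matches are all made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class Match:
    name: str
    total_cm: float        # total shared cM
    largest_cm: float      # largest single segment in cM
    shares_location: bool  # do the two trees share a geographic area
                           # within the last few generations?

def triage(m: Match) -> str:
    """Apply the rules of thumb from the text (thresholds are heuristics)."""
    # No common geographic area: probably distant, even above 30 cM.
    if not m.shares_location:
        return "probably distant"
    # Small largest segment with an inflated total: typical endogamous match.
    if m.largest_cm < 20 and m.total_cm > 200:
        return "low priority"
    return "worth a closer look"

# Hypothetical matches, not taken from my real match lists.
examples = [
    Match("Match A", total_cm=250.0, largest_cm=34.0, shares_location=False),
    Match("Match B", total_cm=230.0, largest_cm=14.0, shares_location=True),
    Match("Match C", total_cm=90.0, largest_cm=41.0, shares_location=True),
]
for m in examples:
    print(m.name, "->", triage(m))
```

As the 14.3cM segment I share with my mother’s 1/2 3C shows, a "low priority" result does not rule out a real relationship.  It only suggests where to spend time first.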

myOrigins 2.0 update – FTDNA

Back in April, FamilyTreeDNA (FTDNA) finally updated their myOrigins.  This was initially set to roll out shortly after November 2015’s 11th Annual International Conference on Genetic Genealogy held in Houston, Texas.

FTDNA started off with Population Finder, which was replaced by myOrigins in May of 2014.  With Population Finder, they had an Oceania (Papuan, Melanesian) category.  When they switched to myOrigins, they removed the Oceania category.  Since Polynesians are about 75% Southeast Asian and 25% Melanesian (Oceania), Polynesians would show up as just Southeast Asian.

They increased their population clusters so now they have a total of 24.  I believe prior to this newer version there were about 18 of them.  While a lot of people have reported how “off” these results are, focusing on just the Polynesian genome, I notice that there is a consistency to have about 3% – 9% Northeast Asian along with the predominantly nearly 75% Southeast Asian.

Population Finder, myOrigins 1.0 and myOrigins 2.0

These are the different versions.  FTDNA seems to be ever increasing the amount of European that I have for whatever reason.  I usually range between 8% – 12% at various DNA companies.

Below is a breakdown of what other Polynesians have been getting with the new version of myOrigins.

 


Percentage breakdown by the various East Asian and Oceanian categories.

For now it seems that the new version of myOrigins is giving a lot of people many trace regions.  While I did not include them in the image above, I have been seeing this for eastern Polynesians so far.  Maybe in the future there will be an update that could refine these trace regions so that they appear less for everyone.

MyHeritage Ethnicities

As of May 30, 2017, MyHeritage finally released their Ethnicity Estimate (beta) to those who uploaded their raw data.  So far this service is still free.  Not sure if they will discontinue that service.  Currently their tests are at a reduced price of $79.

 

Not only does MyHeritage (MH) have an Oceanian category but they included Polynesian along with Melanesian and Papuan.

 

Last year and probably the year before that, they reached out to people who had a tree at MH whose 4 grandparents were listed in a given geographic area confirming ties to that particular place or country.  And while they seem to have obtained more than Ancestry’s 18 Polynesian samples, they did not take into consideration that these people may be admixed.

A lot of admixed Polynesians who did test with MH are reporting to have lost a lot of their European while simultaneously having an increased percentage of Polynesian.  There seems to be about 10% difference.

Here are my mother, my maternal aunt and my own results.


My mother and her sister are 85% Hawaiian while 15% is of European background.  My mother gets about 17% European at the varying DNA testing companies.

Several Polynesians have shared their Ancestry results with me.  Comparing it to MH it seems that the numerous samples that they used for the Polynesian category included some admixed Polynesians of European heritage.  I have been hearing the same situation for those with admixed Native American background reporting 20% to 30% more Native American while reducing the amount of European.

What is interesting about MH is that they did have other populations not covered by the other testing companies.  They separated the Melanesian and Papuan, commonly grouped together and labeled as “Oceanian” by other companies or at GEDmatch, and provided a separate Polynesian category.  They did something similar for the Asia group.

MH has specific groups within the Southeast Asia area, such as Filipino, Cambodian, Vietnamese, Thai, Malaysian and Indonesian.  The thinking is that if you come from a particular background which they tested, you should score perfectly with that group.  This may not apply for some either due to the limited number of samples and/or where they got their samples from, such as taking samples from one specific area.  I have only seen a few Filipinos’ results where they score 100% Filipino/Indonesian/Malaysian.  A couple of Chinese people received majority Chinese and Vietnamese and a smaller percentage of Filipino/Indonesian/Malaysian.

My guess is that this breakdown of the various Southeast Asian groups helps separate Polynesians (and Micronesians) who also have some Southeast Asian background.  At Ancestry, Chinese people were reporting about 10% Polynesia, Vietnamese as high as 15% and Filipinos around 32%.  Ancestry has no Southeast Asian category, so those of Southeast Asian background will get some of the East Asian, or what Ancestry has as “Asia East” along with a small percentage of Polynesia.

Aside from MH engulfing the European for admixed Polynesians, it seems fairly accurate at least for me being that I am half Filipino and 43% Hawaiian and about 12% European background.  I am going to assume that the West Asian below is part of my European background while the South Asian is part of my Filipino background.

New 5th Cousin connection helps map out chromosome!

HOW WE CONNECT

Now that I had figured out who my mother’s biological parents were it has become easier to find connections.  (You can read about it here: https://hawaiiandna.wordpress.com/2015/08/01/finding-a-dna-connection-despite-endogamy/)

While there is one branch where I find a lot of relatives on my great-grandmother Rose Holbron’s side, I am slowly finding distant connections on my great-grandfather Frank Kanae’s side.  Frank Kanae was Rose Holbron’s husband.

Earlier this week I received an email from a woman named Raychelle who saw me and my numerous kits of family members that I manage on GEDmatch.com as a match to her.  I began the normal response, almost ready to dismiss her since many of these matches appear to be close when in reality we are usually distant, and for others, much more distant.  And from what I could see, it wasn’t such a huge amount.  At GEDmatch, Raychelle and my mother share 62.9cM total, with a large segment of 10.7cM.  So at least a 4th cousin level.

After I told her that she could find me on Ancestry (since she uploaded to GEDmatch via Ancestry) and look at my HOLBRON family tree, she found out that we have the LEWIS connection.

She is a 5th cousin to my mother, and a 5th cousin once removed (5C1R) to me.  I come from Isaac Lewis who was known as Isaac Lewis Kanae or Isaac Kanae Lewis, and also known by the Hawaiianized version – Aikake Lui.  Raychelle comes from John George Lewis, whose Hawaiianized name was Keo Lui.  My assumption is that Keo was short for Keoki (George).  Keo could also be short for Keoni (John) and then there was the Catholic version – Ioane for John.

But what was interesting is that she had this genealogy and I had updated mine from this to reflect what a couple of people have been researching.

According to the information that has been circulating at various sites on the internet, Isaac’s father – Captain Isaiah Lewis was the son of Captain Ezra Lewis.  And John G. Lewis was the son of Captain John Lewis, who was Captain Ezra Lewis’ son but through a different wife.  I listed them as spouse #1 and spouse #2 because different sites and people will switch the spouses showing Isaac as the son of one spouse, and another will show Isaac as the son of the other spouse, and vice versa for John G. Lewis.

So the question is, were Isaac and John full brothers, or (maternal) half-brothers?  And if they were (maternal) half-brothers, were their fathers paternal half-brothers?

While all of this information going back that far is based solely on people creating these trees without further documentation, for now I am only going by what was documented.  The trees habitually say that Polly was known as Sarah Pauline “Polly” Holmes.  While I can understand that Polly could be a diminutive for Paula and Mary, I’m not so sure that these are the same person, especially since a lot of the information lists this Sarah Pauline “Polly” Holmes as having been born in Massachusetts and having died there, and her husband Captain Isaac Lewis as being from Massachusetts too.

What we know for a fact according to testimonies from people who lived during the time of Polly Holmes and her father Oliver Holmes.


I am still in the process of confirming and documenting all of these ancestors, so for now I am considering Raychelle and me 5C1R, and that her 3x great-grandfather John George Lewis (Keo Lui) and my 4x great-grandfather Isaac Lewis Kanae (Aikake Lui) were full-brothers.

 

SHARED DNA SEGMENTS & CHROMOSOME MAPPING

I compared Raychelle to all of the relatives to see which segments we all had in common.  Any common segments or segments that multiple relatives share would indicate that segment was inherited from a common ancestor.  In this case, Polly Holmes and her husband Isaiah Lewis.

And while autosomal DNA inherited from our common ancestor can remain in our genome for about 5 – 6 generations, there are some cases where it can span several generations and for some as we have seen, in larger segments. These larger segments tend to be passed on within generations entirely intact and having not recombined.

With endogamy, that may confuse things as it isn’t guaranteed that the shared segment came from that same common ancestor.  Especially for Polynesians where we share many small segments.  And these multiple segments may not be in common with other relatives, or rather these segments may not overlap as what I am about to demonstrate.  So when looking to map out these segments, and at the 4x great-grandparent level, if the segments are really small, that may be suspect to being segments randomly inherited.  It may or may not be from the common ancestor, or may come from the same common ancestor multiple times through their different descendants.

I first compared my brother Kaimi and Raychelle and looked for the chromosomes that should match my mom.  Kaimi and I have different fathers, so I decided to use his to compare because his father is also Hawaiian.

I used Kaimi’s unphased and phased data to be sure that if there are extra segments that do not match our mother, then the presumption is that those segments came from Kaimi’s father.  These were the results.


You can easily see how with the phased data the size of the segment is somewhat smaller if it doesn’t remain the same or disappear altogether.

The real work comes in when I compare Raychelle to my mom’s brother’s son Chris, her half-brother’s daughter Lena and her maternal half-sister Aunty Stella.  The detailed specification of their relationship is to help you understand how they are related and know what is to be expected as far as sharing DNA with different relationships go.

What I did first was compare Raychelle to all of those family members mentioned and then see which of those matching segments actually matches up with what my mother matches.  Here’s a diagram of how we are related and descend from Isaiah Lewis and Polly Holmes.


I’ll start first with Chris, the son of my mother’s brother Joseph.


While there were other segments that Raychelle shared with Chris, I am only comparing overlapping segments that are shared with my mom.  There are 3 chromosomes where they share overlapping segments.  Ch 6, 7 and 20.

With Aunty Stella, there were segments on different chromosomes, sometimes on the same chromosome but in different parts of the chromosome that did not overlap.


Only one overlapping segment which is on ch 7.

Then with Lena, the daughter of my mom’s half-brother George.

Lena also shared different segments and different chromosomes with Raychelle that my mom does not have, except for ch 7.

So what is consistent with all of them is that a segment on chromosome 7 is shared with Raychelle.
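The comparison I did by eye can be sketched in Python as an interval overlap check.  This is not how GEDmatch works internally; it is only an illustration of the idea.  The segment positions below are invented (in Mb) so that the outcome mirrors what I saw, and the function name is my own.

```python
from typing import Dict, List, Tuple

Segment = Tuple[int, float, float]  # (chromosome, start, end), hypothetical Mb positions

def common_overlap(reference: List[Segment],
                   others: Dict[str, List[Segment]]) -> List[Segment]:
    """Return the part of each reference segment that every relative also covers.

    Simplification: for each relative, only the first segment that overlaps
    the current window is considered.
    """
    result = []
    for chrom, start, end in reference:
        lo, hi = start, end
        for segs in others.values():
            hit = None
            for c, s, e in segs:
                if c == chrom and s < hi and e > lo:
                    hit = (max(lo, s), min(hi, e))
                    break
            if hit is None:
                lo = hi  # this relative has no overlap: window becomes empty
                break
            lo, hi = hit
        if lo < hi:
            result.append((chrom, lo, hi))
    return result

# Invented segments that each person shares with Raychelle.
mom = [(6, 30.0, 38.0), (7, 100.0, 110.0), (20, 5.0, 12.0)]
relatives = {
    "Chris":  [(6, 31.0, 37.0), (7, 101.0, 111.0), (20, 6.0, 11.0)],
    "Stella": [(7, 99.0, 109.0), (12, 40.0, 48.0)],
    "Lena":   [(3, 10.0, 18.0), (7, 102.0, 112.0)],
}

print(common_overlap(mom, relatives))  # [(7, 102.0, 109.0)]
```

With these made-up numbers, chromosomes 6 and 20 drop out because only Chris overlaps my mom there, and chromosome 7 is the only one left for everyone.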

The diagram above  shows how everyone matches each other, with the last one again showing my mom with Raychelle and that consistent block of segment.

So the fact that we all shared an overlapping segment in common with each other indicates that particular segment was inherited from our common ancestor.  In this case, both Isaiah LEWIS and Polly HOLMES.  But how do we figure out if that segment came from Isaiah vs. Polly?  Remember that there was a discrepancy that Polly’s two husbands – Isaiah LEWIS and John LEWIS were paternal half-brothers according to some other genealogy and that Isaac LEWIS KANAE was Isaiah’s son, while John George LEWIS was John LEWIS’ son.  Both Isaac and John had the same mother – Polly HOLMES.

The best way to distinguish that inherited segment being inherited from Isaiah LEWIS or Polly HOLMES is to test members of each of those families.  That would be distant relatives of whom we cannot find a connection to just yet.  Instead, I used another method.

Since my mother tested at 23andme, they have the ability to show the ancestry broken down by each chromosome. This is what my mother’s 7th chromosome looks like.

 


23andme identifies portions of the Hawaiian segments of the chromosome as a combination of East Asian & Native American, and Oceanian.  I simplified it by just indicating Hawaiian.  Both of my mother’s parents were Hawaiian, but her mother Rose KANAE also had European ancestry.  Which is why in that diagram one chromosome is labeled as the paternal chromosome, the other as the maternal.

My mother’s maternal grandmother was Rose HOLBRON.  Rose’s paternal grandfather John HALBORN was from Hull, England, and her maternal grandfather William LUDLUM was an American whaler from Jamaica, Queens, New York.  Rose HOLBRON’s grandmothers were Hawaiian (Kanaka).

But it is Rose KANAE’S father – Frank KANAE whose paternal grandmother Mary LEWIS KANAE’s father was Isaac LEWIS KANAE.  Isaac’s father was Captain Isaiah LEWIS.  Isaac’s mother Polly HOLMES was the daughter of Oliver Holmes of Kingston, Plymouth, Massachusetts and Mahi, daughter of the chief Kalanihooulumokuikekai of Ko’olau.  My assumption was that the European portion from Rose KANAE’s father is too far back.  In other words, the European portion of that chromosome that my mother inherited from her mother could have only come from John HALBORN or William LUDLUM, or a combination of both.

There are a few factors that could make a segment remain intact for several generations:
1) The length of the chromosome.
2) How many cross-over events there were for that particular chromosome.
3) Location on the chromosome (some areas are more SNP dense than others).
4) The possibility of having fewer cross-over events or none at all (we see this happening as well).
This segment seems to match nicely ranging from 7.1cM (my mom) to 9.1cM (Aunty Stella) with all the relatives.

So when I visually compare the section of chromosome 7 that matches up with the shared overlapping segment for all of us, this is where they line up.


If you have read my other posts, you would have read that multiple segments for Polynesians can remain for a while given that we come from a few common ancestors multiple times.  This particular segment had to have come via Polly HOLMES’ mother – Mahi who got it from her parents Kalanihooulumokuikekai and his wife.  And since Raychelle is also a descendant of Polly HOLMES and Isaiah LEWIS, this portion of chromosome 7 did not come from my HOLBRON side.

While my family members used for comparison descend from Isaac LEWIS KANAE’s daughter Mary LEWIS KANAE, there are other descendants through Mary’s sister Papanaha LEWIS KANAE who got DNA tested.  But only one of them was a match to Raychelle.


This cousin shares an overlapping segment of 8cM on chromosome #7.  But when I compared that relative to my mother, they did not share that particular overlapping segment, although all my other close relatives did share that overlapping segment with this cousin.  After looking into it further, I found out that my mother seemed to have inherited a smaller section of that overlapping segment compared to other family members, and her portion just did not qualify as a match according to GEDmatch.com where all of this analysis was done.  After all, she shares the least out of all the relatives, only 7.1cM of this segment, while Aunty Stella shares 9.1cM.  And while she gave me and my brother Kaimi this segment, my brother Travis did not inherit this segment.  Which means this portion of chromosome 7 for him was from our grandfather, not our grandmother Rose KANAE.

But that is what is complicated about mapping out segments for Polynesians. These segments could be from any of these lines going back to the same common ancestor multiple times. That means that Raychelle could just so happen to match all of us via my maternal grandmother Rose KANAE’s mother’s side, or my great-grandfather Frank KANAE’s mother’s side, or John KANAE’s father’s side, and so forth.  It could also be just by chance that we share the segment with any other of her Hawaiian ancestors.

Since many Polynesians share multiple small segments, some as small as 7cM, and have these segments line up very close to each other if not right next to each other, it makes chromosome mapping very difficult to do.  For example, I mentioned that one of Papanaha LEWIS KANAE’s descendants shares that same overlapping segment on chromosome 7 with the rest of us, while the other descendants share multiple non-overlapping segments.  I cannot easily assign them to our common ancestor – Isaac LEWIS KANAE, or presume that all of these multiple segments came from our common ancestor.

Since Polly HOLMES is 6 generations away from my mom and all of her descendants share this same overlapping segment, it is safe to presume that this segment came from Polly HOLMES’ mother – Mahi.  And now I can assign at least this small portion to Mahi.


Determining half-relationships with Polynesians – Part II

In my last entry I demonstrated the difficulties of determining the half-relationships after receiving the DNA results of my half-first cousin.   Within an endogamous group, that could be even more difficult as we see larger amounts of DNA shared.

While the ISOGG Wiki Autosomal DNA Statistic page can list the average amount of centimorgans shared,  Blaine Bettinger’s The Shared cM Project  demonstrated that the minimum and maximum amounts shared can vary.  This becomes more evident as the distance of relationship increases.

Within an endogamous group it makes sense that having more than one pair of common ancestors may increase that amount.  The same would apply if you descend from the same common ancestor multiple times.  Both would produce higher amounts shared.

A few months ago I got the results of my aunt believed to be a full-sister of my mother.  My aunt suspected that her father was not her biological father.  And she was right.  She was not the only one who knew of this, but the rest of the family, particularly those of my generation, believed that this aunt’s father was her biological father and did not suspect otherwise.

From my mother’s Family Finder (autosomal) match list at FTDNA:


The top is my mother’s sister while the one right below it belongs to my half-1st cousin whose father George was mentioned in the last entry – Determining half-relationships with Polynesians.

Initially I was confused by the total amount since I knew it was more than what I shared with two of my half-brothers.  This is how two of my half-brothers compare to me and to each other.


So my mother and her sister did share a bit on the high-end for half-siblings, but low end for full-siblings.  These are the predicted averages shared for siblings vs. half-siblings.


The next step was to take a look at the X chromosome.  For half-sisters who had the same father, they would share an entire X chromosome based on how the X is inherited.  To my surprise, it looked like someone took a razor blade and sliced out some pieces of the image.

 

5+cM setting


 

1+cM


For half-sisters they share a lot compared to what I saw when comparing my half-brothers to each other and to me.  Also, I decided to include both the default 5+cM setting and the 1+cM.  With my brothers, we hardly get anything when I lower it to 1+cM.  But with my mother and aunt, you can see a difference, although chromosomes 4 and 18 are more likely to be IBS.  Given the situation (endogamy, small communities, & isolation), it just may be IBD from a very long time ago.

So the X was not helping me one bit since I thought maybe they were areas on the chromosome that could not be read – no calls.

I immediately uploaded to GEDmatch for further analysis.  No surprise that when I looked at the X, it was the same exact thing.   Knowing that it wouldn’t be helpful, I turned to the other 22 pairs of chromosomes.


What you would be looking for in full-siblings are full-identical regions (FIR) which are the green sections on the bar graph.  Here is an example of my 1st cousins, a brother and sister.


About 25% will be fully identical.  You can read more about how much full versus half-identical regions siblings would share at ISOGG’s Wiki – Fully Identical Region page.

This is what my mother and aunt showed.


There are only small chunks of  FIR rather than long segments of it that you would see in full-siblings.  So this confirms a half-sibling relationship.
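To put a number on "small chunks versus long segments", here is a rough Python sketch that adds up the FIR segments and compares the total to the roughly 25% expected for full siblings.  The 3400cM genome length, both thresholds and the two example lists are my own assumptions for illustration; they are not GEDmatch values and not my relatives’ real data.

```python
from typing import List

GENOME_CM = 3400.0  # assumption: approximate length of one copy of the genome

def fir_fraction(fir_segments_cm: List[float]) -> float:
    """Fraction of the genome covered by fully identical regions."""
    return sum(fir_segments_cm) / GENOME_CM

def looks_like_full_siblings(fir_segments_cm: List[float],
                             min_fraction: float = 0.15,
                             min_longest_cm: float = 20.0) -> bool:
    """Heuristic only: full siblings should show about 25% FIR in long runs.

    Both thresholds are hypothetical cut-offs picked for this sketch.
    """
    if not fir_segments_cm:
        return False
    return (fir_fraction(fir_segments_cm) >= min_fraction
            and max(fir_segments_cm) >= min_longest_cm)

# Invented examples.
full_sib_like = [120.0, 95.5, 80.0, 150.0, 110.0, 70.0, 140.5, 84.0]  # sums to 850 cM
half_sib_like = [3.1, 2.4, 4.0, 1.8]  # scattered small chunks

print(fir_fraction(full_sib_like))              # 0.25
print(looks_like_full_siblings(full_sib_like))  # True
print(looks_like_full_siblings(half_sib_like))  # False
```

Requiring a long run as well as a minimum total matters for endogamous families, since many tiny FIR chunks can add up without indicating a full-sibling relationship.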

Determining half-relationships with Polynesians

I recently got my cousin’s results to compare to my mother and my brothers.  This cousin’s father was my mother’s half-brother George, so a half-first cousin relationship.

Prior to making contact with my mother’s relatives I was thinking of having these cousins tested as a means to figure out who my mother’s biological father really was.  But a couple of months ago when I did make contact with these long lost relatives it was revealed that my mother’s biological father was Joseph Kaapuiki Akana, the man whom I doubted was my mother’s father based on his name (Akana is of Chinese origin) and the fact that my mother remembers her father being pure Hawaiian and her DNA composition does not support Chinese ancestry.  I thought that maybe testing these half-cousins would determine if their grandfather was my mother’s biological father.  But it is more complicated than I realized.

Like my mother’s father Joseph Kaapuiki Akana, George’s father was also Hawaiian.  George and my mother shared the same Hawaiian mother.

This is what the ISOGG Wiki Autosomal DNA Statistics page says about how much should be shared between a half-aunt and also to half-cousins.


Combining with Blaine Bettinger’s Shared cM Project, the total shared for a half-aunt would range from 540cM to 1348cM, averaging 892cM.  The average is around the amount indicated by the ISOGG Wiki page.

For a half-first cousin, Blaine Bettinger’s Shared cM Project says it would range from 262cM to 1194cM, averaging 458cM.  Again, that average is what is indicated on the ISOGG Wiki page.

This is how GEDmatch.com compares my half-cousin to us.

It is obviously on the high end for a half-aunt, while for a half-first cousin it is not that extreme.  But we are talking of one example only.  There are more half-cousins that I could have tested and probably will in the future.  And all of these cousins have had a grandfather that was Hawaiian, so I would expect their amounts to be high.
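A quick way to check a total against the Shared cM Project numbers quoted above is a simple lookup, sketched here in Python.  The ranges are the ones cited in this post; the function and its labels are my own.  The 1412.8cM value is the total between my mother and her half-niece that is quoted further down in this post.

```python
# (minimum, average, maximum) in cM, as quoted above from the Shared cM Project.
SHARED_CM_RANGES = {
    "half-aunt": (540.0, 892.0, 1348.0),
    "half-first cousin": (262.0, 458.0, 1194.0),
}

def place_in_range(relationship: str, total_cm: float) -> str:
    """Describe where a shared total falls within the reported range."""
    low, avg, high = SHARED_CM_RANGES[relationship]
    if total_cm < low:
        return "below reported range"
    if total_cm > high:
        return "above reported range"
    return "above average" if total_cm > avg else "at or below average"

print(place_in_range("half-aunt", 1412.8))  # above reported range
```

So for this one example, the half-aunt total is not just on the high end but past the reported maximum, which is the kind of inflation I would expect when the other grandparent is also Hawaiian.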

Comparing to non-endogamous groups, I compare my paternal aunt to her nephews and nieces and a great-nephew and great-niece on GEDmatch.

Screen Shot 2015-12-29 at 7.45.46 PM

My cousin Terri may share the lowest total among the 1st cousins but it does not seem that significantly different from the average 1700cM.  It is interesting to see that her largest segment is 104.7cM.  When I look at my half-first cousin and how much she shares with her half-aunt (my mother), the total is 1412.8cM, and largest segment is 103.3cM.  That figure can be misleading.  I have more cousins on my father’s side that I have yet to test and there may be other cousins who share less or more with our aunt than the cousins that have already tested.
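To make the "high end" comparison concrete, here is a minimal Python sketch that places a total against the Shared cM Project ranges quoted earlier in this post.  The dictionary and function names are made up for illustration, and the ranges are only the ones cited above, not a full relationship table.

```python
# Ranges as quoted in this post from the Shared cM Project: (min, average, max) in cM.
# Only the two relationships discussed here are included.
SHARED_CM_RANGES = {
    "half-aunt": (540.0, 892.0, 1348.0),
    "half-first cousin": (262.0, 458.0, 1194.0),
}


def place_in_range(relationship, total_cm):
    """Describe where a total shared cM falls relative to the quoted range."""
    low, average, high = SHARED_CM_RANGES[relationship]
    if total_cm < low:
        return "below range"
    if total_cm > high:
        return "above range"
    if total_cm > average:
        return "above average"
    return "at or below average"


# My half-cousin versus her half-aunt (my mother), total from GEDmatch.
print(place_in_range("half-aunt", 1412.8))
```

With the 1412.8cM total, the half-aunt comparison lands above even the top of the quoted range, which fits what I would expect when the other parent is also Hawaiian.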

If I take my aunt out of the equation, this is how the cousins compare to each other.

Screen Shot 2015-12-29 at 7.50.04 PM

A couple of my paternal 1st cousins share much less with each other than my half-cousin does with me and my brothers.

It will be a while before I can get an ample number of Polynesians who have close relatives tested to fully make a comparison.  Initially, when I was not certain that Joseph Kaapuiki Akana was her biological father, I wanted to see if testing half-cousins would help determine if my mother’s siblings were half or full siblings.

It is clear now that any type of half-relationship is difficult to determine if the other parent is also Polynesian, and in our case Hawaiian.  My grandmother married 3 different Hawaiian men and so far from what I know, they have ties to geographically different places.

The endogamous nature just makes it hard to determine the relationship even if it is a close relationship.  It does not have to be a distant 3rd cousin and beyond to appear as a closer relationship.  Even with cousins (half or full) and half-siblings, they seem to appear on the higher end of the relationship, possibly giving a false prediction if the true relationship was not known.

Recent Founder’s Effect, bottlenecking and 6 Tahitian women on Pitcairn island

I finally got the autosomal results of a Pitcairn resident who has been a member of the Polynesian project for a year now.  Previously I had another member who is a Norfolk island descendant and whose ancestors moved to Norfolk but were originally from Pitcairn.  Another Norfolk descendant tested at another company, but his raw data were uploaded to GEDmatch.com in order to be compared.  Now that this particular Pitcairn resident has tested, I can make a comparison of these 3 people since they all have ties to Pitcairn.

 

HISTORY OF PITCAIRN ISLAND

Pitcairn was settled in 1790 by mutineers of the HMS Bounty and Tahitians [1].  The initial population of 27 consisted of 9 mutineers, 6 Tahitian men and 11 Tahitian women along with an infant girl.  Only 6 of the mutineers and 6 Tahitian women would produce descendants.

Mutineers:
1) Fletcher Christian
2) Edward Ned Young
3) John Mills
4) William McCoy
5) Matthew Quintal
6) John Adams

Tahitian women:
1) Mauatua Maimiti
2) Teraura
3) Teio
4) Tevarua [2]
5) Vahineatua
6) Toofaiti

 

POPULATION GROWTH, DECREASE & RE-POPULATION

The population started with 27 people but only 12 of them would produce descendants.  By 1840 the population exceeded 100, and by the mid-1850s the community was outgrowing the island [3].
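The founder numbers above are easy to tally.  This is just a small Python sketch of the arithmetic, with group labels of my own choosing, to show how small the share of the original settlers that left descendants really was.

```python
# Initial 1790 settlers as listed above (labels are illustrative).
initial_settlers = {
    "mutineers": 9,
    "Tahitian men": 6,
    "Tahitian women": 11,
    "infant girl": 1,
}

# Settlers who produced descendants: 6 mutineers and 6 Tahitian women.
founders_with_descendants = {"mutineers": 6, "Tahitian women": 6}

total_settlers = sum(initial_settlers.values())
total_founders = sum(founders_with_descendants.values())
founder_share = total_founders / total_settlers

print(f"{total_founders} of {total_settlers} settlers left descendants "
      f"({founder_share:.0%})")
```

Fewer than half of the original settlers contributed to the later population, and half of those who did were Tahitian women.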

On May 3, 1856 the entire community left on a 5-week voyage and settled on the island of Norfolk on June 8.  Nearly 3 years later 16 of them returned to Pitcairn.

Screen Shot 2015-12-21 at 9.03.27 AM

 

EFFECTS WITH AUTOSOMAL DNA

I have mentioned in previous blog entries that eastern Polynesians are genetically less diverse than western Polynesians.  So it should be no surprise that Hawaiians and Maoris as well as Tahitians will come up as closer matches to each other despite sharing common ancestors 8 centuries ago.

Now we are looking at two things.  Firstly, a founding population where only 12 people produced offspring, and half of the 12 being Tahitian women, or eastern Polynesians.  And these 12 were not paired off equally.

Screen Shot 2015-12-21 at 9.32.29 AM

They married multiple times, and some of them never produced descendants with their other spouses.

Secondly, there was a population bottleneck in 1859.

Screen Shot 2015-12-21 at 9.35.35 AM

In 1856 the population had expanded to 193, and then the entire population left.  That population was already interrelated just 66 years after the initial 12 founding people started it.  They all left, but 16 of them returned.  Eventually a few more returned, and while the remaining population continued life on Norfolk island, those back on Pitcairn were starting the population again.  It would take only 23 years to repopulate the island, increasing the population to 250.

 

ANALYZING A PITCAIRN RESIDENT’S AUTOSOMAL DNA

The Pitcairn resident descends from all of the 12 founding people.  That is no surprise given such a small number of founders, and given that the settlement was just 225 years ago, or 7 generations back for this particular person.

Although I cannot show with a family tree how many times they descend from the 12 founding people due to size and the complexity of the tree, I decided to list the number of times they descend from each of the 12.

Screen Shot 2015-12-21 at 9.50.22 AM

This resident’s paternal grandparents are 2nd cousins one way and 3rd cousins another way, while their maternal grandparents were 2nd cousins two ways.  There are more ways that they are related going further back as well, but my genealogy software cannot pick up multiple relationships.  It seems to select the closest relationship, yet it selected 2nd cousin once removed, so I am not sure which line it was picking up.  This person’s maternal grandfather was born on Pitcairn but there is no known genealogy for him.  For their other grandparents, here is who they descend from.  (Founding people in bold)

Paternal grandfather – Christopher Warren, son of George Warren whose mother was Agnes Christian, and Alice Butler whose mother was Alice McCoy.
Paternal grandmother – Mary Christian, daughter of Sidney Christian & Ethel Young.
Maternal grandmother – Ivy Young, daughter of William Young & Mercy Young.

Agnes Christian and Alice McCoy were 2nd cousins, great-granddaughters of Fletcher Christian and Mauatua.  Ivy Young’s parents William and Mercy Young were 2nd cousins two ways to each other, as great-grandchildren of Edward N. Young and Toofaiti and of Fletcher Christian and Mauatua.

As confusing as it seems, you can imagine how the DNA would show up.  After uploading the raw data to GEDmatch.com for further analysis, I immediately ran the “Are Your Parents Related” tool.

Screen Shot 2015-12-21 at 10.07.52 AM

It predicted 3.3 generations to the most recent common ancestor (MRCA).  I am still not sure how to interpret GEDmatch’s MRCA estimation, but in reality, the most recent common ancestors would be their 2nd great-grandparents – Thursday October Christian II and Mary Polly Young.  And there were other Youngs, as I previously mentioned, and Christians as well.

When I ran my mother’s kit through that same tool, her largest segment was 13.9cM, and there were a total of 5 segments that would total 51.5cM.

Largest segment = 13.9 cM
Total of segments > 7 cM = 51.5 cM
Estimated number of generations to MRCA = 4.1

By contrast, the Pitcairn resident’s largest segment was 24.7cM, with 11 segments.  My mother’s parents were from different islands, and as far back as I was able to trace their ancestries they did not intersect, nor did their ancestors come remotely near each other, given that they were from 3 different islands.
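For my mother’s result, the average segment size can be worked out the same way I do for matches, by dividing the total by the number of segments.  The Python sketch below is only that arithmetic; the function name is my own, and it is not something the GEDmatch tool reports.

```python
def average_segment_cm(total_cm, segment_count):
    """Average segment size: total shared cM divided by the number of segments."""
    if segment_count <= 0:
        raise ValueError("segment_count must be a positive number")
    return total_cm / segment_count


# My mother's "Are Your Parents Related" result: 5 segments totaling 51.5 cM.
mother_average = average_segment_cm(51.5, 5)
print(f"Average segment: {mother_average:.1f} cM")
```

That works out to about 10.3cM per segment for my mother.  I cannot do the same for the Pitcairn resident here because only the largest segment and the segment count are given above, not the total.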

I would love to get more Pitcairn residents to test, to see if there is any noticeable pattern using this tool, or David Pike’s ROH.  If there is, we definitely could use it in helping to determine a true close genetic match versus an endogamous one.

 

COMPARING TO NORFOLK DESCENDANTS

There are 2 particular matches to many of the Polynesian DNA project’s members and both of these 2 people are descendants of Norfolk residents.  I will refer to them as Norfolk #1 and Norfolk #2.

Norfolk #1’s maternal grandmother was from Norfolk and she was the daughter of Francis Nobbs and Ruth Christian.  Norfolk #2’s maternal grandfather was from there, and his parents were William Adams and Sarah Christian.

A further breakdown where I bold the founding people.

NORFOLK #1
Francis Nobbs’ ancestry, son of Alfred Nobbs & Mary Christian:
Paternal grandfather – George Nobbs
Paternal grandmother – Sarah Christian, daughter of Charles Christian & Tevarua
Maternal grandfather – Benjamin Christian, son of John Buffett & Mary Christian
Maternal grandmother – Eliza Quintal, daughter of John Quintal & Maria Christian

Sarah and Maria Christian were daughters of Charles Christian & Tevarua, while Mary Christian was their 1st cousin.

Ruth Christian’s ancestry, daughter of Isaac Christian & Miriam Young:
Paternal grandfather – Charles Christian, son of Fletcher Christian & Mauatua
Paternal grandmother – Tevarua, daughter of Teio
Maternal grandfather – William Young, son of Edward N. Young & Toofaiti
Maternal grandmother – Elizabeth Mills, daughter of John Mills & Vahineatua

NORFOLK #2
William Adams’ ancestry, son of John Adams & Caroline Quintal:
Paternal grandfather – George Adams, son of John Adams & Teio
Paternal grandmother – Polly Young, daughter of Edward N. Young & Toofaiti
Maternal grandfather – Arthur Quintal, son of Matthew Quintal & Tevarua
Maternal grandmother – Catherine McCoy, daughter of William McCoy & Teio

When comparing the two Norfolk descendants to the Pitcairn resident, I was surprised to see no overlapping segments.

Screen Shot 2015-12-21 at 1.36.43 PM

Screen Shot 2015-12-22 at 12.58.16 PM

It is interesting to see that for Norfolk #1, the largest segment is 40.85cM and the total is 134.5cM.  The largest segment is significant, and although Pitcairn & Norfolk #1 are related multiple ways, the closest known relationship makes them 4th cousins once removed.

Comparing Pitcairn to Norfolk #2, the largest segment is 27.3cM, which for Polynesians in general could be pretty distant.  Total shared is 95.1cM.  And just as with Norfolk #1, Norfolk #2 and Pitcairn are related multiple ways, but the closest relationship makes them 4th cousins.
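One rough way to look at these two comparisons side by side is to ask how much of the total sits in the single largest segment.  This is only an illustrative Python sketch using the figures above; the ratio is my own ad hoc measure, not a GEDmatch statistic, and the names in the code are made up.

```python
# Pitcairn resident compared to each Norfolk descendant (figures from above, in cM).
comparisons = {
    "Norfolk #1": {"largest_cm": 40.85, "total_cm": 134.5},
    "Norfolk #2": {"largest_cm": 27.3, "total_cm": 95.1},
}


def largest_segment_share(largest_cm, total_cm):
    """Fraction of the total shared cM carried by the largest segment."""
    return largest_cm / total_cm


shares = {
    name: largest_segment_share(c["largest_cm"], c["total_cm"])
    for name, c in comparisons.items()
}

for name, share in shares.items():
    print(f"{name}: largest segment is {share:.1%} of the total")
```

In both cases the largest segment makes up a bit under a third of the total, with the rest spread over smaller segments.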

At the moment I cannot compare Norfolk #1 and Norfolk #2, but I am trying to get that taken care of in order to upload Norfolk #1’s raw data to GEDmatch for further analysis.

I was expecting to see the overlap at least when comparing to the Pitcairn resident, given that their ancestors have been on the island since the beginning, but it goes to show how unpredictable and random DNA can be.

A list of all 3 and how many times they each descend from the following founding population.

Screen Shot 2015-12-21 at 1.46.23 PM

And while various Polynesians can be compared to all three of these people and may show overlapping segments, there is really no way to map these segments.  These 3 testees would match other project members based on segments inherited from one or more of the 6 Tahitian women that settled on Pitcairn.  And we all would have shared common ancestor(s) from at least 8 centuries ago.

Below I compare the Pitcairn resident to a Hawaiian, a Maori and a Cook Island Maori as well as my Hawaiian mother.  Incidentally, there is a project member whose father was from Tahiti, yet that person does not come up as a match.

(default setting)

Screen Shot 2015-12-21 at 3.40.11 PM

(1+cM setting)

Screen Shot 2015-12-21 at 3.48.43 PM

 

Comparing Norfolk #1 with the same people, except for the Cook Island Maori, who is not a match.

(default setting)

Screen Shot 2015-12-21 at 3.41.18 PM

(1+cM setting)

Screen Shot 2015-12-21 at 3.51.14 PM

Norfolk #2 did not test at FTDNA but at 23andMe, and although their raw data was uploaded to GEDmatch.com, none of the others being compared were uploaded there except for my mother.

For additional information about the DNA study of the descendants of the Mutiny on the Bounty, see ‘Mutiny on the Bounty’: the genetic history of Norfolk Island reveals extreme gender-biased admixture.

Footnotes

1. History of the Pitcairn Islands.
2. Pitcairn Settlers lists an additional Tahitian woman known as Sully as the wife of Matthew Quintal and the mother of Matthew Jr., John, Arthur, Sarah and Jane Quintal.  Another source, as well as the Pitcairn resident who got DNA tested, claims that there were only 6 Tahitian women from whom they descend.  There was no mention of Sully, although Tevarua is listed as being married to Matthew Quintal, and the two are listed as the parents of Matthew Jr., John, Arthur, Sarah, and Jane Quintal.
3. Historical Population of Pitcairn.

Confirming what could have been a NPE (non-paternal event) or misattributed parentage

Another use for DNA testing is to answer questions about paternity that either were brought up by a family member or arise because documentation does not support what is known.  This was one of the main reasons why I got DNA tested in the first place.

Quite a few people getting DNA tested are finding what is known as an NPE (non-paternal event) or misattributed parentage.  That is when the presumed or putative father was not the biological father.  This could have happened recently, a generation ago, or so far back that current living people may not be aware of it.

This is when people need to take the extra steps by testing other family members or also getting other specific tests, such as a Y-DNA test. Sometimes it can be a Y-DNA test that makes people realize that there was an NPE.

Back in July of 2015 I figured out who my mother’s biological mother was.  Her name was Rose Kanae, and Rose was married three times.  I found that one of her husbands, Joseph K. Akana, resided at the same address where my mother was born.  So the assumption was that he was probably my mother’s biological father.  The Akana surname is of Chinese origin, and it is what initially made me believe that he was not the biological father.  My mother, who met Joseph Akana once when she was 5 years old, was told that he was a pure Hawaiian man.

Last October a cousin confirmed that Joseph indeed was my mother’s biological father.  It was explained to me by a couple of relatives that Joseph took the surname – Akana from his Aunt who married a Chinese man surnamed Akana.  Joseph’s original name was Joseph Kaapuiki, and later he went by Joseph Kaapuiki Akana.

This same cousin who confirmed that Joseph was my mother’s biological father did question Joseph’s paternity, suggesting that Joseph’s mother Elena Kauhi was not so faithful.  This is how I was able to confirm that Joseph’s father – John Kaapuiki was his biological father.

Below are my mother’s top 5 matches.

Screen Shot 2015-12-17 at 5.18.58 PM

These all say “Possible range: 1st – 2nd cousins.”  Her first match is how I was able to figure out who her biological mother was.  This is how Frank is connected to my mother.

Screen Shot 2015-12-17 at 5.23.11 PM

Frank and my mother are actually 1st cousins once removed, making Frank and me second cousins.  With females there is less ambiguity, whereas with men there can always be that questionable paternity.

The second top match was “lkauhi” and this is how that person actually is related to my mother once I was able to get my grandfather’s genealogy.

Screen Shot 2015-12-17 at 5.20.32 PM

“lkauhi” is off to the right, and she matches my grandfather Joseph Kaapuiki (Akana) via his mother’s side, through Elena Kauhi.  This would confirm that Joseph is the biological father of my mother since “lkauhi’s” grandfather Johnathan and Joseph’s mother Elena were brother and sister.

One of my cousins gave me the names of our grandfather Joseph Kaapuiki Akana’s ancestors going back as far as his grandparents.  His father John Kaapuiki’s father was Kukahuna Kaapuiki.

Further research online revealed that the Akana-Kaapuiki family listed my ancestor Kukahuna and traced the line a few more generations back.  But I was not confident at first that any of the names beyond Kukahuna were my own ancestors.  This is the same family that I was told my grandfather Joseph took his surname from, and that they were related.  Given that they listed Kaili Kaapuiki, who married a Chinese man surnamed Akana, as the sister of my ancestor Kukahuna Kaapuiki, I knew that was probably the connection but could not confirm it through documentation.

I looked for the genealogy of my mother’s 3rd match “milt17th.”  I contacted him and he confirmed his genealogy, that he was the grandson of Kaili Kaapuiki and Akana.

This confirms that John Kaapuiki was the biological father of my grandfather Joseph Kaapuiki Akana.