Wednesday, May 4, 2011

Structure within Houston Gujaratis resolved?

April 29, 2011

About two and a half months ago I brought to your attention the fact that there is population substructure among the Gujaratis of Houston. That might sound strange, but here’s the back story. Over the past ~10 years there has been a project, known as the HapMap, attempting to catalog common human genetic variation. The HapMap began with East Asian, West African, and European groups, but over the years it has been expanding. The first South Asian population added to the database was a sample of people of Gujarati origin in Houston, Texas. As a result, the medical genetic literature contains a lot of talk about “Gujaratis from Houston,” as if that were a group of particular importance.

The ultimate pragmatic rationale for the catalog was to allow researchers to control for ancestry when trying to pin down genes implicated in disease. To illustrate: if Chinese have disease X at a greater frequency than Europeans, and you analyze a pooled sample of Chinese and Europeans, then every genetic variant that is more common among the Chinese might come up as causal, when it is actually just correlated with ancestry.
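The confounding described here can be shown with a toy calculation. This is only a sketch: the two populations, the carrier counts, and the disease rates below are invented for illustration and come from no real study. Within each population the variant has no relationship to the disease, yet pooling the two makes it look like a risk factor.

```python
# Toy illustration of confounding by ancestry (population stratification).
# All counts are hypothetical; they are NOT taken from HapMap or any study.

def odds_ratio(carrier_cases, carrier_controls,
               noncarrier_cases, noncarrier_controls):
    """Odds ratio of disease for carriers vs. non-carriers of a variant."""
    return (carrier_cases * noncarrier_controls) / (
        carrier_controls * noncarrier_cases
    )

# Each tuple: (carrier cases, carrier controls,
#              non-carrier cases, non-carrier controls)
# Population A: variant is common (80% carriers), disease is common (20%).
pop_a = (160, 640, 40, 160)
# Population B: variant is rare (20% carriers), disease is rare (5%).
pop_b = (10, 190, 40, 760)

# Pooling ignores ancestry and simply adds the two tables together.
pooled = tuple(a + b for a, b in zip(pop_a, pop_b))

print("Population A odds ratio:", odds_ratio(*pop_a))
print("Population B odds ratio:", odds_ratio(*pop_b))
print("Pooled odds ratio:", round(odds_ratio(*pooled), 2))
```

In this made-up example the odds ratio is exactly 1 inside each population (no association), while the pooled table gives an odds ratio above 2. The apparent signal comes entirely from the two groups differing in both variant frequency and disease frequency, which is why ancestry has to be controlled for.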

And this brings me to the Houston Gujaratis. One thing that jumps out at you in analyses of genetic variation in this sample is that it has substructure. That is, there are two populations within the data set. More precisely, there is one tight cluster, while the rest of the individuals vary a great deal in their genetic character. The image above is my own plotting of the variation of Chinese and the Houston Gujaratis onto a cubic space. You immediately see that there is a Chinese cluster and a Gujarati cluster, and a range of Gujaratis who fall outside of the main cluster.

Knowing what we know about the prevalence of endogamy among South Asians, the model which immediately jumped out at me was that the Houston Gujarati cluster was a specific subgroup which migrated to the United States. But which one? My immediate hunch was that they might be a group of Patels. Others of you suggested Bohras.

I can now report something substantive thanks to Zack Ajmal. He has some Gujarati Patels in the Harappa Ancestry Project, and they match closely with the Gujarati cluster in question. This does not exclude the possibility that the cluster consists of Bohras, and does not entail that it must be Patels. I don’t know the relationship between these various groups in Gujarat. But I think we’re at least getting closer to a resolution of this mystery.

Of course the Gujarati HapMap cluster is not unknown to scholars. Two years ago, in the supplements of the paper “Reconstructing Indian Population History,” the authors observed the peculiar pattern in the principal component plots, which visualize the largest independent dimensions of variation in a data set. Most of the Indian populations fell along a line which has groups like South Indian Dalits at one pole and Europeans at the other. But as the authors note, a section of the Gujaratis fell outside of the expected pattern. Why? Here is their hypothesis:

…Interestingly, one of the GIH subgroups fall outside the main gradient of Indian groups, suggesting that they harbor substantial ancestry that is not a simple mixture of ASI and ANI. A speculative hypothesis is that some Gujarati groups descend from the founders of the “Gurjara Pratihara” empire, which is thought to have been founded by Central Asian invaders in the 7th century A.D. and to have ruled parts of northwest India from the 7-12th centuries. I. Karve noted that endogamous groups with names like “Gurjar” are now distributed throughout the northwest of the subcontinent, and hypothesized that they likely trace their names to this invading group.

This is wrong. The reason that a subset of Gujaratis falls outside of the main cluster is that they are a very genetically homogeneous group. This is the same reason you exclude close relatives from these analyses: relatives shake out into their own clusters, which clutter up the results. All the Gujaratis who are not in the cluster run the gamut you would expect in terms of ancestry for individuals from Central West India. Those in the distinctive cluster have a particular pattern in common.

To the left is a bar plot I generated from a selection of individuals and populations from Zack’s K = 11 ADMIXTURE run. You can see the raw data in Google Docs. What K = 11 means is that Zack took all the individuals in his data set, which runs into the thousands, and allowed the program to apportion their ancestry among 11 populations. These are not necessarily real populations, but abstractions, so you shouldn’t take the labels too seriously. I’ve limited the plot to the population components of particular relevance for South Asians. The labels in all caps are individuals from public data sets. Those which are not in all caps are individuals from the Harappa Ancestry Project. I’ve chosen the individuals and populations to be informative for my overall point.

What is that point? The “Patel” Gujarati cluster is among the most “pure” of South Asian populations. The Bengali to the left is my mother, and you can see that her South Asian proportion drops mostly because of her elevated East Asian ancestry. Among the Jatts the European and Southwest Asian proportion is higher. The “Onge” component refers to an affinity with a tribe in the Andaman Islands. This, combined with the “S Asian” component, is probably a good shadow of patterns of variation which denote deep ancestral roots within the Indian subcontinent. Combining the two, you see that the Gujarati cluster and individuals affiliated with it top out in excess of 90%! I think this is the outcome of the ancient admixture event between “Ancestral North Indians” and “Ancestral South Indians” which defines South Asians as a distinctive genetic unit on a worldwide canvas. All those who came later, whether Austro-Asiatics, Aryans, or Scythians, are overlays upon this robust common substrate.
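To make the "combining the two" step concrete, here is a minimal sketch of how ADMIXTURE-style output can be read. Everything in it is an assumption for illustration: the two individuals are hypothetical, the fractions are invented, and only five components are shown rather than the eleven in Zack's run. The only real feature it relies on is that each individual's fractions sum to 1.

```python
import math

# Hypothetical ADMIXTURE-style fractions: one row per individual, one value
# per abstract ancestral component. These numbers are made up for
# illustration and are NOT from the Harappa Ancestry Project spreadsheet.
individuals = {
    "cluster_member": {
        "S Asian": 0.78, "Onge": 0.14, "European": 0.04,
        "SW Asian": 0.03, "E Asian": 0.01,
    },
    "other_individual": {
        "S Asian": 0.55, "Onge": 0.10, "European": 0.15,
        "SW Asian": 0.12, "E Asian": 0.08,
    },
}

def deep_south_asian_share(fractions):
    """Sum the two components treated in the post as deep subcontinental roots."""
    return fractions.get("S Asian", 0.0) + fractions.get("Onge", 0.0)

for name, fractions in individuals.items():
    # Each individual's ancestry fractions must add up to 1.
    assert math.isclose(sum(fractions.values()), 1.0, abs_tol=1e-9)
    share = deep_south_asian_share(fractions)
    print(f"{name}: S Asian + Onge = {share:.0%}")
```

With these invented values the first individual would count as "in excess of 90%" and the second would not. The real figures should be read off the Google Docs spreadsheet, not from this sketch.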

Ironically the geneticists who decided to select the Gujaratis of Houston stumbled onto a group which is archetypically representative of what it means to be South Asian in a biological sense.


POSTED BY Razib Khan @ 3:06 PM



18 Comments
(anonymous comment)
Razib Khan | April 30, 2011 1:33 AM

The Aryan Migrations or the Scythian Migrations?

former.
(anonymous comment)
Razib Khan | April 30, 2011 3:05 PM

By the way, detecting Scythian movement may be much more challenging. They were a very mixed people. Most spoke a Turkic language, and some, the ruling elite, may have spoken an Iranian language.

i don't know how the ethnographic terms were confused, but the ancient scythians were clearly an iranian people. if they had turkic and it had an impact it would be easily detectable in the genetics. from what i can tell the jatts do not have it. there is quite often some turkic in muslims from UP north and west. some of the pashtuns in the HGDP data set clearly have it. i do not usually see it in hindus, except for people from the himalayan fringe (nepal, himachal pradesh). we bengalis tend to have a more southeast asian element. the two are differentiable.

i say the indo-aryans had a bigger impact because the "european-like" component found at low fractions across south asians is WAY more widespread than the saka domains. in fact, it seems to be found in most indo-aryan people, even bengalis of non-brahmin origin like my parents, and south indian brahmins. so if you integrate over the distribution the indo-aryans clearly had much more of an impact. the saka influence was salient though obviously.
(anonymous comment)
Razib Khan | April 30, 2011 3:46 PM

sorry if this has been answered because I skimmed the article really quickly, but why did you leave out African ancestry? Or is it so negligible that it wasn't worth putting in? I always figured certain groups like pashtuns, balochis, sindhis, etc would have just as much African as East Asian heritage; although judging by the graph, I really underestimated just how much E.Asian ancestry non-Bengali S.Asians have

1) outside of muslims from the northwest part of the subcontinent, african is negligible

2) at the 0-5% interval a lot of the "east asian" is probably noise, though not all (especially on the northern fringe of the subcontinent). the tell is if you have a group with low levels of east asian admixture that is uniform, vs. a group where lots of people have none, and then one person has a lot. that is not uncommon among north indians (excluding bengalis, who tend to have low levels which vary across the 5-15% interval). among some south indian tribals you have 2-3% in everyone. that is either VERY OLD admixture, or it is noise that emerges from the inability to properly separate ancestral components in the algorithm.
John Jacobi | May 1, 2011 8:35 AM

Razib, my humble suggestion is that you summarize the point of your posts in the introduction or conclusion, in non-technical language if possible.
(anonymous comment)
Zachary Latif | May 1, 2011 2:51 PM

I imagine that the Indo-Aryans were more of a "settling" population than the Scythians, who were a tribal/ruling elite, hence the disproportionate genetic impact.

Impressive that N.W. South Asia (Indus) has seen an admixture of other "continental" influences like African & N.E. Asian, as has Bengal for that matter; they were really the "borderlands".

Apologies for the confusion but is there any resolution or clarity on Gujarati B?
Razib Khan | May 1, 2011 3:56 PM

Apologies for the confusion but is there any resolution or clarity on Gujarati B?

so zack calls "gujarati b" "gujarati a." so it is resolved. probably patels.
Razib Khan | May 1, 2011 4:08 PM

I am wondering if the S.Asian in this case is ANI (not ASI).

yes.

Also the (single) Rajasthani brahmin with E. Asian: is it a Scythian or Mongol contribution, or just migrations from within other parts of N. India? There have been a lot of those, as I've said before. Can you clarify what E. Asian is (something shared with peoples of SE Asia, the Han etc...)?

these results won't allow clarification. but it's not that hard. generally it is as you expect, ppl from nw south asia are more 'turk', those from east south asia more 'southeast asian.'
(anonymous comment)
Razib Khan | May 1, 2011 5:03 PM

India was peopled LONG ago: about 60,000 years ago. To put this into perspective, the following were colonized at

there's no guarantee that the first settlers made much of an impact and weren't replaced. i think this has happened in europe, and, in much of east africa.

This doesn't make sense, and it's like saying that apes have a lot of human genes (not the semantically more lucid "humans have a lot of ape genes."). The research's narrative should attempt to elucidate how many Proto-Indian genes remain in C Asia and S Asia and SE Asia.

there is a non-trivial probability that all of non-africa was settled from the fringes of south asia. but the ASI lineage which is oldest in india, even if it is descended from the first settlers, obviously has changed since the last common ancestor with other lineages.

fwiw, i bet ASI is basal to all east eurasian, oceanian, and amerindian lineages. it might be basal to west eurasian ones though. there is mounting evidence that anatomically modern humans left africa closer to ~100,000 years ago, but that there was a "pause" or interregnum somewhere in southwest asia before a second expansion event.
(anonymous comment)
Razib Khan | May 1, 2011 5:20 PM

Also, what is the centroidal European gene set? Is it someone from the geographic/population center of Europe in SE France/Lithuania? Or is it from the eskaru who may have been descendants of the original cromagnons? Or from someone just west of the Urals in Kazakhstan?

the "european" in south asians is clearly affiliated with eastern europeans. i'm not going to get into the details, but it is pretty obvious if you run ADMIXTURE yourself. it's a band from the baltic down to the caucasus, sweeping to northwest india and a little beyond.

We could have even framed the poles as a Tamil Dalit Node, a Tibetan Node, a Portuguese Node (to detect admixtures amongst Konkani Brahmins and Goan Christians), a Tajik/Sogdian/Pamiri Node to detect Scythians, or even a Persian Node to detect Greeks (since Alexander's troops were mostly Iranians when he came to Pakistan).

some populations really are more "types" than others. a tamil dalit and portuguese node makes sense. the others don't. those groups are simply always linear combinations of other groups in all the runs i've seen.
(anonymous comment)
Razib Khan | May 1, 2011 5:58 PM

blood groups are subject to natural selection, so i don't think we should rely on them much today. O seems ancestral, with A & B probably arising due to some selective pressure more recently.

here's a map of B if ppl are curious:

http://johndenugent.com/images/map-of-b-blood-in-the-world.gif
(anonymous comment)