August 7, 2020



Coptic woman (fragment of Coptic woollen tapestry, 3rd-4th c.)

In Part I, Part II and Part III of this series, I discussed how the Egyptian media misunderstood and misinterpreted the Genographic Project’s results on Egypt, explained why one needs to know about the project and the science behind it in order to understand its results, and gave the details of the Egyptian reference group results. In this Part IV, I present a critical analysis of the Genographic Project (GP) on Egypt.

The GP says that it is non-profit. However, this does not square with the fact that it has generated considerable income. IBM and the Waitt Family Foundation provided the project’s initial funding of $40m (£21m), but the project also generated an income of over $171 million from those who participated in it (kits were sold for $99 in Phase I and $199 in Phases II and III).[1]
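The income figure can be checked with a few lines of arithmetic. The sketch below uses only the participant counts and the dollar kit prices quoted in this article; the variable names are my own, and because the counts are given as "over" a certain number, the total is a lower bound rather than an exact figure.

```python
# Participant counts as given in footnote [1]; both are stated as "over" these values.
phase_1_participants = 479_000
phase_2_3_participants = 621_000

# Kit prices in US dollars as quoted in the text.
phase_1_price_usd = 99
phase_2_3_price_usd = 199

phase_1_income = phase_1_participants * phase_1_price_usd
phase_2_3_income = phase_2_3_participants * phase_2_3_price_usd
total_income = phase_1_income + phase_2_3_income

print(f"Phase I income:        ${phase_1_income:,}")
print(f"Phases II & III income: ${phase_2_3_income:,}")
print(f"Total kit income:      ${total_income:,}")
```

On these figures the total comes to $171,000,000, which is where the "over 171 million" estimate comes from.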

As you have seen in Part II, the GP defended its decision to fund the project privately rather than allow it to be funded by the Federal Government, which would have made it subject to public scrutiny. A genetic scientist, Kenneth Weiss, expressed his concerns that “Decisions [of the GP] will be made privately rather than in the kind of open forum that the HGDP[2] was going to for setting ethical standards, for making sure that the populations are not exploited, and that samples are not collected in a hit-and-run kind of way.”[3] Had he known the extent of the GP and the results that it came out with, particularly the reference population results, he would have said more.

We have seen in Part III that the GP[4] has come up with an Egyptian reference population, that is, the average proportions in the blood of modern Egyptians from six ancestry regions: Northern African (68% of the Egyptian genetic pool), Southwest Asia & Persian Gulf (17%), Jewish Diaspora (4%), Asia Minor (3%), East Africa (3%), and Southern Europe (3%).

The GP says that its results on the Egyptians are based on “native Egyptians”. The word ‘native’ entered the English vocabulary in the late 14th century as ‘natif’, meaning “natural, inborn, hereditary, connected with something in a natural way,” from the Old French ‘natif’ (native, born in; raw, unspoiled) and directly from Latin ‘nativus’ (innate, produced by birth), from ‘natus’.[5] If by it the GP means the inhabitants of Egypt who were born there, then that is understandable. There is no assurance, however, that the GP asked those who contributed samples from Egypt whether they were Egyptian by birth or not. If, on the other hand, it means the ‘aboriginal people’ of Egypt, in the sense in which we speak of the natives of America or Australia, then it poses a great problem, since only a small section of Egypt’s population, the Christian Copts and maybe some Muslims who were originally Copts but converted to Islam, deserves the label ‘native Egyptian’. In any case, the GP does not tell us what it means by this label.

It is clear that the GP has based its results on Egypt as a country: it evidently believes that Egypt is a uniform nation, and fails to recognise that Egypt is in fact a multi-ethnic and bi-religious country, and that its genetic diversity often runs along these lines. There are Arab settlers, Arab Bedouins, Turks, Nuba, Berber, Beja (descendants of the Blemmyes), and other ethnicities. There has been interbreeding between these, particularly among the Muslims of Egypt, but only up to a certain level. The GP had a great opportunity to find out the extent of the interbreeding between all these ethnicities, but it missed it. We have also seen in Part III how the GP addressed the same problem in the case of the British reference population: after producing one result for the British in Geno 2.0, it separated the Irish and Scottish ethnicities from the British (which must mean mainly English), and produced three separate results for the UK in Geno 2.0 Next Generation. These results are more accurate than those of Geno 2.0, and representative of the three nationalities in the UK (the Welsh were ignored by the GP). But even here, we do not know whether those the GP describes as Scottish, for example, were participants who lived in Scotland, were born in Scotland, or thought of themselves as being of Scottish descent. Addressing these variables would have produced different results. But the results for the UK are definitely much more sophisticated than the results for Egypt.

Despite the assertion from some that there is no difference between the Coptic Christians of Egypt and the Muslim Fellahin, there is no proof that this is the case. The theory is that the majority of the Copts converted to Islam and intermingled with the Arab invaders. Again, this is pure conjecture based on dubious literature. The GP had another great opportunity to test that hypothesis, but it squandered it. We contend that the Copts had minimal admixture with the Muslim Arabs and Berbers of Egypt, and that the Copts are genetically different from the Egyptian Arabs, Berber, Nuba and Beja of Egypt, or from those who claim descent from one or a combination of these ethnicities. In the case of Lebanon, we saw how the GP based its results on the country as a whole, taking the Lebanese as one uniform population, exactly as it did in the case of Egypt. Its results, which I presented in Part III, therefore fail the generalisability test. This was shown by the study undertaken by Pierre A. Zalloua et al., which found that male genetic variation within Lebanon was more strongly structured by religious affiliation than by geography.[6] It is incumbent upon any researcher or group of researchers to take the factors that I have stated above about the diversity of the Egyptian population into account in any genetic anthropology project that seeks to understand the Egyptian genome, or rather the genomes of the different ethnic constituents of Egypt. Understanding the history of the country and its historical migrations is essential.

A great defect in the GP is that it does not follow the normal procedure required in scientific research. Its Egyptian results were not published in a scientific journal, and its methodology and findings were not subjected to peer review, which would have given the project credibility. Its research methodology is also deficient or suspect. Research methodology is basically the specific procedures or techniques used to identify, select, process, and analyse information about a topic. Unless this is very clearly described, the reader is not able to critically evaluate the research and assess its validity and reliability.

We know that the target population of the GP’s Egyptian reference population is the total population of Egypt, which stood at 100,388,073 in 2019. The research the GP conducted on Egypt must have been quantitative research, or so it should have been, since it made generalisations about the totality of the Egyptian population based on the findings in its study sample of participants. It should have used a probability sampling technique, in which every Egyptian’s likelihood of being selected for the sample is known. It is important that the sample from which the DNA data were collected is representative, closely resembling the Egyptian population, because generalisability is a key goal of any study that relies on probability samples. Generalisability is ensured only when all elements in the researcher’s target population have an equal chance of being selected for inclusion in the study, and this can only be achieved through random selection. But the GP did not rely on a random selection of its Egyptian participants: selection was left to paying participants. This means that only wealthier Egyptians, who were able to pay the $199 for a DNA kit, were included. It also means that, in all probability, only Egyptians living in the metropolises of Cairo and Alexandria, or abroad in the Egyptian diaspora in the West, were included, since poorer Egyptians in smaller cities and towns, and in the villages of Egypt, could not afford to contribute their DNA. It is important to recognise here that the different ethnicities of Egypt are concentrated in specific localities and are not uniformly spread throughout Egypt; e.g. the Nubians are concentrated in the Aswan area, and people who trace themselves to Arab tribes are mainly concentrated in Upper Egypt, as are those who consider themselves to be Berber in origin. These groups were not randomly included in the GP’s sample.

We do not even know the size of the sample of Egyptian participants that the GP used to make generalisations about the whole Egyptian population. How big a sample from over a hundred million Egyptians is needed to make the research findings representative? The GP does not tell us. Nor does it tell us its confidence level, which tells us about the significance of its results, or its margin of error.
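To give a sense of how sample size, confidence level and margin of error relate, here is a minimal sketch using the standard textbook formula for estimating a single proportion (Cochran's formula with a finite population correction). It is not the GP's method, which is unknown. It assumes a simple random sample, which is exactly what the GP did not have, and a worst-case proportion of 0.5. The function name and parameters are my own.

```python
import math
from statistics import NormalDist


def required_sample_size(population, margin_of_error, confidence=0.95, proportion=0.5):
    """Sample size needed to estimate one proportion from a simple random sample.

    Illustrative only: assumes simple random sampling and the normal
    approximation; proportion=0.5 is the most conservative choice.
    """
    # Two-sided critical value for the chosen confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Cochran's formula for an effectively infinite population.
    n0 = z ** 2 * proportion * (1 - proportion) / margin_of_error ** 2
    # Finite population correction.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


egypt_population_2019 = 100_388_073  # figure quoted in the text

for moe in (0.05, 0.03, 0.01):
    n = required_sample_size(egypt_population_2019, moe)
    print(f"margin of error {moe:.0%}, confidence 95%: n = {n}")
```

With a 5% margin of error at 95% confidence, the formula gives a sample of 385. The required size depends far more on the margin of error than on the size of the population. No sample size, however, can compensate for a sample that was self-selected rather than randomly drawn, and a separate sample would be needed for each ethnic or religious group one wants to describe.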

In short, there are several questions about the way the GP selected, processed, and analysed the information it had on the Egyptian reference group. Its research methodology is not up to scratch. The DNA results obtained from its paying Egyptian participants cannot be generalised to tell us about the Egyptian population as a whole. Even if the GP had followed the right methodology and come up with a representative sample of the Egyptian population as a whole, its results would still be open to question, since it treated the Egyptians as a more or less uniform group. What is the average of an orange and a banana? An ‘orangenana’? Genetic anthropology research that does not take the ethnic, religious, and local diversity of Egypt, and of similar countries, into consideration ends up misrepresenting everyone: the resulting genetic blend of various ancestry components will not look like any group in reality. The published results of such defective research will underestimate or overestimate components, or indeed add a component that does not exist at all in some groups, making it a waste of effort and money.
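The ‘orangenana’ problem can be shown with a toy calculation. The numbers below are purely hypothetical, invented for illustration, and do not describe any real population or any GP result. The sketch pools two made-up groups with different ancestry profiles into one population-weighted average, the way a single country-wide reference population would.

```python
# Purely hypothetical groups and percentages, invented for illustration only.
groups = {
    "group_a": {
        "share": 0.70,  # fraction of the total population
        "profile": {"component_x": 0.80, "component_y": 0.20, "component_z": 0.00},
    },
    "group_b": {
        "share": 0.30,
        "profile": {"component_x": 0.30, "component_y": 0.10, "component_z": 0.60},
    },
}


def pooled_profile(groups):
    """Population-weighted average of each ancestry component across groups."""
    components = sorted({c for g in groups.values() for c in g["profile"]})
    return {
        c: sum(g["share"] * g["profile"].get(c, 0.0) for g in groups.values())
        for c in components
    }


pooled = pooled_profile(groups)
for component, value in pooled.items():
    print(f"{component}: pooled {value:.0%}, "
          f"group_a {groups['group_a']['profile'][component]:.0%}, "
          f"group_b {groups['group_b']['profile'][component]:.0%}")
```

In this made-up case the pooled profile assigns 18% of component_z to everyone, although group_a has none of it and group_b has 60%. The average matches neither group.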

We can fairly say that the GP’s results on Egypt are inaccurate. As for the Copts, the results do not represent them and cannot be applied to them. The Copts make up around 15% of the Egyptian population, and one can be confident that, if a random sample of good size were taken from them, the result would be significantly different from what the GP came up with, and indeed from any results based on the Egyptian Muslim Arabs, Berber, Nuba, or Beja.


[1] In Phase I there were over 479,000 participants who each bought the kit for $99; in Phases II & III, there were over 621,000 participants who each bought the kit for $199.

[2] A previous project, the Human Genome Diversity Project, from 1991, which sought federal funding to do basically the same kind of research.

[3] Graciela Flores, New genome project, new controversy in The Scientist (8 May, 2005).

[4] I am using here the Geno 2.0 Next Generation result.

[5] The Online Etymology Dictionary.

[6] Pierre A. Zalloua et al, Y-Chromosomal Diversity in Lebanon Is Structured by Recent Historical Events. The American Journal of Human Genetics 82, 873-882, April 2008.

