# William the Conqueror

But to the extent that ancestry is considered in genealogical rather than genetic terms, our findings suggest a remarkable proposition: no matter the languages we speak or the colour of our skin, we share ancestors who planted rice on the banks of the Yangtze, who first domesticated horses on the steppes of the Ukraine, who hunted giant sloths in the forests of North and South America, and who laboured to build the Great Pyramid of Khufu.

One of the things in Dan Brown’s book The Da Vinci Code that really ground my gears to a complete stop was the ‘revelation’ that Sophie Neveu and her little brother are descendants of Jesus and Mary Magdalene. At the time I thought the whole concept was ridiculous, because in every one of the 67 generations between the year zero and the present Jesus’ and Mary’s genes were halved, and nothing would be left, not even a base pair [3]. I never read another of his books.
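The halving argument is easy to check with a few lines of arithmetic. This is only a sketch that takes the halving-per-generation picture at face value, with the 67 generations from the text and the genome size quoted in footnote [3]; the variable names are mine.

```python
GENERATIONS = 67           # generations between Jesus and the present (from the text)
GENOME_BASE_PAIRS = 3e9    # genome size quoted in footnote [3]

# Fraction of one ancestor's genome left after halving it once per generation.
dilution = 0.5 ** GENERATIONS

# Expected number of that ancestor's base pairs still present today,
# under this naive halving picture.
expected_base_pairs = GENOME_BASE_PAIRS * dilution

print(f"dilution factor:     {dilution:.1e}")
print(f"expected base pairs: {expected_base_pairs:.1e}")
```

The dilution factor comes out at a few times $$10^{-21}$$, which is the order of magnitude quoted in footnote [3], and the expected number of surviving base pairs is far below one.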

So what’s the deal: did they have a Jesus gene that gave them the ability to turn water into wine? The book does not tell us what spectacular properties these kids have that made them so important. But I would like to know, because I almost certainly would have those capabilities too.

After reading the section ‘The Tasmanian’s Tale’ in Richard Dawkins’ highly recommended The Ancestor’s Tale I realized that if Jesus and Mary Magdalene had any living descendants, every one of us would be among them. In other words, they would be common ancestors (CA) to all of us.

I descend from William the Conqueror, as the picture on the left shows [4]. I also descend from Charlemagne, who lived some ten generations before William, but so far I have found no direct link between William and Charlemagne [5]. The tree on the left spans thirty-one generations (I included my son), so let’s think for a second about the thirty generations that have elapsed. I have two parents, four grandparents, eight great-grandparents and so on. By the time I get to William, I should have more than 1 billion (great)$$^{27}$$-grandparents, but the world population at that time is estimated at only 300 million [6]. In Charlemagne’s time I would have had $$10^{12}$$ (a thousand billion) ancestors, probably more people than have ever lived. Clearly this doesn’t work.

Footnote 4 already shows why not: the same people turn up in many family trees, and many times over in my own. I have a common ancestor with Willem Meier, and with at least ten other people I don’t know, on geneanet alone, who all have Klaasje van Winkel in their family tree (because that is what I searched for on that site). And there are many sites devoted to genealogy nowadays (what else can a pensioner, mainly men I get the impression, do with all his free time?), so there are likely quite a few more. But this is the crux of the matter: if we go back far enough, we must all have a common ancestor (CA). And of course if there is a CA there is also a most recent common ancestor (MRCA).
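The doubling argument can be spelled out in a few lines. This is just the naive count of ancestor "slots" in a pedigree, using the 30 and 40 generations and the 300 million population estimate from the text; the function name is mine.

```python
def ancestor_slots(generations_back: int) -> int:
    """Number of slots in a pedigree that many generations back (2 parents each)."""
    return 2 ** generations_back

WORLD_POPULATION_WILLIAM = 300_000_000  # estimate quoted in the text

slots_william = ancestor_slots(30)      # thirty generations back
slots_charlemagne = ancestor_slots(40)  # ten generations further

print(f"slots at William:     {slots_william:,}")
print(f"slots at Charlemagne: {slots_charlemagne:,}")
print(f"slots per person alive in William's time: "
      f"{slots_william / WORLD_POPULATION_WILLIAM:.1f}")
```

With more slots than people, many slots in the pedigree must be filled by the same person.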

And here is the funny thing: this MRCA is much closer in time than you think, or at least than I thought before reading Dawkins’ book [7] and Chang’s papers [2,8]. And although William may not be one, Charlemagne could well be a CA. Or they could both be.

As Dawkins explains ([7], p. 39):

Pick any two people and go backwards and, sooner or later, we hit a most recent common ancestor. You and me, the plumber and the queen. Any set of us must converge on a single concestor (or couple). But unless we pick close relatives, finding the concestor requires a vast family tree, and most of it will be unknown. This applies a fortiori to all humans alive today. Dating concestor 0, the most recent common ancestor of all living humans, is not a task that can be undertaken by a practising genealogist. It is a task in estimation: a task for a mathematician.

That mathematician is J.T. Chang. The mathematical model Chang presents [8] is simple and amenable to statistical analysis. But it is not too simple: later refinements [2,9] show that the main conclusions hold. The model is, in Chang’s own words ([8], p. 1003):

We assume the population size is constant at n. Generations are discrete and non overlapping. The genealogy is formed by this random process: in each generation, each individual chooses two parents at random from the previous generation. The choices are made just as in the standard Wright–Fisher model—randomly and equally likely over the n possibilities—the only difference being that here each individual chooses twice instead of once. All choices are made independently. Thus, for example, it is possible that when an individual chooses his two parents, he chooses the same individual twice, so that in fact he ends up with just one parent; this happens with probability 1/n.

That last probability becomes of course vanishingly small when the population is large (in the millions).
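One generation of this model takes only a few lines to write down. The sketch below is my own reading of the quoted description, not code from Chang’s paper: every individual draws two parents uniformly and independently from the previous generation, and the fraction of cases where both draws hit the same person is compared with 1/n.

```python
import random

def choose_parents(n, rng):
    """For each of n individuals, draw two parents (indices 0..n-1)
    uniformly and independently, so the same parent may be drawn twice."""
    return [(rng.randrange(n), rng.randrange(n)) for _ in range(n)]

def same_parent_fraction(n, trials, rng):
    """Estimate how often an individual draws the same parent twice."""
    hits = 0
    for _ in range(trials):
        if rng.randrange(n) == rng.randrange(n):
            hits += 1
    return hits / trials

rng = random.Random(1)           # fixed seed, only for reproducibility
parents = choose_parents(5, rng)  # one generation of a five-person population
fraction = same_parent_fraction(5, 100_000, rng)

print(parents)
print(f"same parent twice: {fraction:.3f} (model says 1/n = {1/5:.3f})")
```

For a population in the millions the same check would give a fraction of about one in a million, which is why the double choice does not matter in practice.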

On the basis of this model Chang proves two theorems. The first one tells us how many generations we have to go back to find the MRCA:

Theorem 1. Let $$\mathcal{T}_n$$ denote the number of generations, counting back from the present, to an MRCA of all present-day  individuals, in a population of size n. Then

$\frac{\mathcal{T}_n}{^2\log n} \to 1\quad \mbox{as} \quad n\to \infty$

What this means is that for large populations the number of generations $$\mathcal{T}_n$$ to the MRCA is approximately $$^2\log n$$, the base 2 logarithm of n. In other words, this is roughly how many generations we have to go back to find the MRCA. Now, the population of Europe in the year 1000 was about 60 million, and $$^2\log (6\times10^7) \approx 26$$, so there is a reasonable probability that the MRCA of all Europeans lived as few as 26 generations ago. It is a statistical result (the ratio approaches 1 in the infinite population limit), so it is not certain, but you can be pretty damn sure that Charlemagne, who lived forty generations ago, is in the bloodline of the vast majority of people in Europe, and absolutely certain that Jesus (67 generations), if he has any living descendants at all, is one of my ancestors. If Sophie Neveu is his progeny, so am I.
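The number 26 is quickly verified. The sketch below uses only figures from the text (60 million Europeans, forty generations to Charlemagne, 67 to Jesus) and treats Theorem 1 as if it applied exactly, which it only does in the large population limit.

```python
import math

def generations_to_mrca(n):
    """Large-population estimate from Theorem 1: about log2(n) generations."""
    return math.log2(n)

EUROPE_POPULATION_1000 = 6e7  # figure quoted in the text

estimate = generations_to_mrca(EUROPE_POPULATION_1000)
print(f"log2(6e7) = {estimate:.1f} generations")

# How far beyond that estimate the two people in the text lived.
for name, generations_ago in [("Charlemagne", 40), ("Jesus", 67)]:
    print(f"{name}: {generations_ago} generations ago, "
          f"{generations_ago / estimate:.1f} times the estimate")
```

Charlemagne lived roughly one and a half times as far back as the estimated MRCA, and Jesus more than two and a half times.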

Actually that does not mean very much, as Chang’s second theorem shows, because by that time a large proportion of the population is my ancestor. It is a little more complicated, so I won’t quote it in full, but merely give the numerical result. It is based on the obvious fact that all ancestors of a CA are themselves CAs, so pretty soon everybody living in the world a number of generations before the MRCA is my ancestor, and everybody else’s. Chang proves that this takes, on average and in the large population limit, $$1.77\,{}^2\log n$$ generations. It is a little dangerous to think in terms of genes in a genealogical context, but I have the feeling that it means that pretty much everybody living at the time of Jesus in Europe and the Middle East contributed to my genome. The world as a whole is more complicated, and you need to take into account groups of people living more or less isolated from the rest, but even so Rohde, Olson and Chang argue that the MRCA of all current humans lived only a few thousand years ago ([2], p. 565).
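Both theorems can be tried out on a small population. What follows is my own minimal simulation of the model as quoted above, not Chang’s code. Going back one generation at a time, it tracks for every individual which members of generation 0 descend from them (as a bitmask), and records the first generation containing a CA and the first generation in which everybody is either a CA or has no descendants in generation 0. The population size, the number of runs and the seed are arbitrary choices.

```python
import math
import random

def simulate(n, rng):
    """One realization of the two-parent model with n people per generation.
    Returns (generations back to the MRCA,
             generations back until everyone is a CA or has no descendants)."""
    full = (1 << n) - 1                 # bitmask: "all of generation 0"
    desc = [1 << i for i in range(n)]   # generation 0: everyone is their own descendant
    t_mrca = None
    generation = 0
    while True:
        generation += 1
        older = [0] * n
        for mask in desc:
            # Each individual picks two parents at random; both parents
            # inherit all of that individual's generation-0 descendants.
            for _ in range(2):
                older[rng.randrange(n)] |= mask
        desc = older
        if t_mrca is None and full in desc:
            t_mrca = generation
        if all(m in (0, full) for m in desc):
            return t_mrca, generation

N = 200
TRIALS = 20
rng = random.Random(2024)  # fixed seed, only for reproducibility
runs = [simulate(N, rng) for _ in range(TRIALS)]
mean_t = sum(t for t, _ in runs) / TRIALS
mean_u = sum(u for _, u in runs) / TRIALS

print(f"log2(n) = {math.log2(N):.1f}, simulated mean to MRCA: {mean_t:.1f}")
print(f"1.77 log2(n) = {1.77 * math.log2(N):.1f}, "
      f"simulated mean to all-or-nothing: {mean_u:.1f}")
```

For a population this small the simulated means should not be expected to match the limiting values exactly, since both theorems are statements about the large population limit.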

Of course not everybody contributes, since there is always a proportion of the population that leaves no offspring at all, and some, maybe even most, genealogical lines do come to an end. Consequently, a person living some four or five millennia ago either has no living descendants at all, or is everybody’s ancestor. Some of these considerations are nicely illustrated by a simple simulation of Chang’s model consisting of just five people per generation ([8], p. 1005).

The bottom line is generation 0, and every individual from that generation chooses two parents at random. Person 1 happens to choose 5 twice, but for larger populations that possibility becomes negligible, and it does not alter the results in any essential way. Person 2 has 2 and 5 in generation -1 as parents, 3 has 1 and 2, and so on. Thus, in generation -1 number 1 has 3 and 5 as children, 2 has 2 and 3, well, you get the idea. The next step illustrates what can also happen: number 5 is not chosen at all, and this ends the contribution of 5 at generation -2 to all future generations. This is indicated by the symbol ∅. Also, already in generation -2 person 4 becomes a CA, indicated by an S. You can follow lines originating from that person to all persons in generation 0. Since $$^2\log 5\approx 2.3$$, Theorem 1 suggests that this should happen between generations -2 and -3, but the numbers are so small that the theorem does not really apply to this situation [10]. In any case, the MRCA is identified. A generation before that, 1 and 2 become ancestors of 4, thus also becoming CAs of generation 0, and two generations before that a person is either a CA of generation 0, or has no descendants at all. You can also note that common ancestors in different generations need not be related. Number 4 in generation -2 (William the Conqueror) is a CA, as is number 2 in generation -4 (Charlemagne), but William does not have Charles as an ancestor.

Should I be surprised that William the Conqueror is in my ancestry? Not at all. In fact it would be very surprising if he were not. And even more so for Charlemagne. In fact there are probably many more paths leading to the same people. Does it mean anything? I don’t think so: the link between genetics and genealogy is complicated. I can have an ancestor yet carry none of his or her genetic material. Chang, in a response to one of his critics, states [11]:

The descendants of a common ancestor need not share any particular DNA from that ancestor, and it is even possible that none of the descendants has inherited any DNA from the ancestor. If you and I were investigating our common ancestry, we might conceive of an extreme case in which your mother’s father is the same as my mother’s father, but our common grandfather passed along no genes to either of us. Our ability to detect this common ancestor may be affected by these genetic circumstances, but the fact that we have a common grandfather would remain.

Does the idea of being a descendant of Jesus have any meaning? For me even less so than before. Not only is the likelihood of having the gene for holiness infinitesimally small, but almost everybody else has him as an ancestor as well. Could Dan Brown have known this? The Da Vinci Code was published in 2003, Chang’s paper in 1999, so yes! But in view of the historical accuracy of many of the other things in the book, it does not surprise me that he was not up to date on the statistical literature either.

Everybody will either become a common ancestor, or leave no trace at all. The probability of me becoming a common ancestor to all people in the year 4000 is currently zero. My parents have a marginally better chance.

[1] He is still alive in almost all of us. Picture taken from https://en.wikipedia.org/wiki/William_the_Conqueror

[2] D.L.T. Rohde, S. Olson, and J.T. Chang, Modelling the recent common ancestry of all living humans, Nature, 431, (2004), 562.

[3] Every generation the number of base pairs coming from Jesus (or Mary) is halved. In 67 generations this dilutes the original genome by a factor of $$10^{-20}$$. You only have $$3\times10^9$$ base pairs. It would already be very surprising if one of Jesus’s original base pairs were still in us. About the same probability as winning the lottery ten thousand times.

[4] Most of the work was done by a Willem Meier, whose family tree I found on https://gw.geneanet.org/ and with whom I apparently share a set of ancestors six generations ago (Hendrik van Winkel and Marijtje van Noort). There are a few other trees with which I could check some of the generations before that, and the first ten or so generations following William the Conqueror can be found on Wikipedia and in readily available documents of English heritage societies. Obviously all of this becomes rather shaky: some of the information is missing or unreliable, and it is extremely likely that the visiting milkman is not just a recent development, despite the risks (https://www.youtube.com/watch?v=s4nSt0q_Jfo).

[5] There is a website http://www.kareldegrote.nl/ where you can submit your bloodline to Karel de Grote (as Charlemagne is called in the Netherlands). Given the number of generations gone by, and the number of wives (6) and known concubines (4) he had children with, we can safely conclude that the majority of people in Europe could produce a link. And probably even more than that, since Charlemagne was also a guest of Harun al-Rashid, and considering his fondness for women, it would be very surprising if he had not grabbed some pussy in the Middle East as well.

[7] Richard Dawkins, The Ancestor’s Tale, A Pilgrimage to the Dawn of Life, Weidenfeld & Nicolson, (2004). Rendezvous 0, All Humankind. ISBN: 0-618-00583-8

[8] J.T. Chang, Recent common ancestors of all present-day individuals, Adv. Appl. Prob., 31, (1999), 1002.

[9] There are a number of critical papers and discussions following his publications. Ask me if you are interested, and can’t find them yourself.

[10] Theorem 1 is also a statistical result, and the simulation is just one realization of the model for a population of 5. Other realizations will give a different generation for the MRCA. The theorem only tells us what to expect over all of these, and only for large populations.

[11] J.T. Chang, Reply to discussants: recent common ancestors of all present-day individuals, Adv. Appl. Prob., 31, (1999), 1036.