The other day I found myself wondering what proportion of genes cousins would expect to share compared to biological siblings. This took more time to figure out than I would have expected, in part because I knew that siblings who share two parents have 1/2 their genes in common on average, so I thought cousins sharing two grandparents might have a quarter. They don’t, though – it’s half that. In reasoning it out, it turned out to be easiest to think of moving up the biological family tree to a common ancestor, which led to one general formula and a few specific cases:
The Generalization:
Given two people A and B, find their closest common ancestor C. If there are n generations from A to C and m generations from B to C, then the expected proportion of shared genes is (½)^(n+m). If there are two closest common ancestors (for example, both parents), then this number doubles.
In the case of a parent and child, for example, there is 1 generation from the child to the parent (the common ancestor) and 0 from the parent to itself, so the proportion of shared genes would be (½)^1, or just ½. Cousins would each be 2 generations from common grandparents, leading to (½)^4, or 1/16, for cousins with one grandparent in common (sometimes called half cousins) and twice that, or 1/8, for cousins with two grandparents in common (sometimes called full cousins). Double cousins, that is, people who are cousins on both sides of the family tree (for example, cousins whose mothers are sisters and whose fathers are brothers), would still have grandparents as the closest common ancestors, but now there could be up to four common grandparents instead of just one or two: the expected proportion of shared genes between cousins with four common grandparents would be 4·(½)^4, or just ¼. Likewise, an aunt and nephew with two common ancestors (the aunt's parents, who are the nephew's grandparents) would be 1 and 2 generations respectively from this pair, so the expected proportion of shared genes would be 2·(½)^3, also ¼.
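The rule above is easy to check with a few lines of Python. This is only a sketch: the function name expected_shared and its parameter names are my own inventions for illustration, and it uses exact fractions so the results match the numbers in the text.

```python
from fractions import Fraction

def expected_shared(n, m, common_ancestors=1):
    """Expected proportion of shared genes under the rule above.

    n, m: generations from each person up to the closest common ancestor(s).
    common_ancestors: how many closest common ancestors the two people share
    (a hypothetical parameter name, used here to scale the base formula).
    """
    return common_ancestors * Fraction(1, 2) ** (n + m)

print(expected_shared(1, 0))     # parent and child: 1/2
print(expected_shared(1, 1, 2))  # siblings with both parents in common: 1/2
print(expected_shared(2, 2, 1))  # half cousins: 1/16
print(expected_shared(2, 2, 2))  # full cousins: 1/8
print(expected_shared(2, 2, 4))  # double cousins: 1/4
print(expected_shared(1, 2, 2))  # aunt and nephew: 1/4
```

Multiplying by the number of common ancestors is the same as the "this number would double" step, just extended to the four-grandparent case of double cousins.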
Special Case 1: great-great-…-great grandparents
In this case the older relative is the common ancestor, so if “g” is the number of “great”s then the expected proportion of shared genes is (½)^(g+2). The additional 2 in the exponent is because the number of “great”s counts the generations after grandparents, who are already 2 generations away from their grandchildren. Note that only the parent-to-child proportion of ½ is exact. From grandparents on up it is still an expected proportion, because the half that a parent passes on could contain anywhere from none to all of what that parent received from any one of their own parents. The same goes for siblings, who could have anywhere from no overlap of genes to complete overlap of genes from each common parent.
Special Case 2: great-great-…-great aunts and uncles
In this case the older relative’s parent(s) are the common ancestor(s). With a great-uncle and great-niece, for example, the great-uncle’s parent(s) are the great-grandparent(s) of the great-niece. This means that there is 1 generation from the great-uncle to his parent(s), but 3 from the great-niece to that common ancestor, with each additional “great” adding another generation. If “g” is the number of “great”s, then the expected proportion of shared genes would be (½)^(g+3) if there is one parent in common, and (½)^(g+2) if there are two. (I personally find it interesting that you can expect to share the same proportion of genes with a sibling who shares both parents as you do with either of the individual parents, the same proportion with an aunt or uncle who shares both grandparents as you do with either of the individual grandparents, and the same proportion with a great-great-…-great aunt/uncle who shares both great-great-…-great grandparents as you do with either of those great-great-…-great grandparents themselves.)
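The coincidence in that parenthetical can be checked numerically. Here is a small sketch in Python. The function names great_grandparent and great_aunt_or_uncle and the shared_parents parameter are made up for this illustration, and the formulas are just the ones from Special Cases 1 and 2.

```python
from fractions import Fraction

def great_grandparent(g):
    """Special Case 1: g is the number of "great"s (g = 0 is a grandparent)."""
    return Fraction(1, 2) ** (g + 2)

def great_aunt_or_uncle(g, shared_parents=2):
    """Special Case 2: g is the number of "great"s (g = 0 is an aunt or uncle).

    shared_parents (hypothetical name) is 1 or 2: how many of the older
    relative's parents are also ancestors of the younger relative.
    """
    if shared_parents not in (1, 2):
        raise ValueError("shared_parents must be 1 or 2")
    # One shared parent gives (1/2)^(g+3); two shared parents double that.
    return shared_parents * Fraction(1, 2) ** (g + 3)

for g in range(4):
    aunt = great_aunt_or_uncle(g, shared_parents=2)
    ancestor = great_grandparent(g)
    print(g, aunt, ancestor, aunt == ancestor)
```

For every g the two values agree, which is the pattern noted above: an aunt or uncle who shares both of the relevant ancestors with you comes out the same as either of those ancestors individually.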
One clarification: great-aunt is the term I grew up with, but in looking around I just discovered that “grand-aunt” may be the technically correct term, since that person is in the same generation as a grandparent; likewise, the sister of a great-grandparent would be a great-grand-aunt. This appeals to me aesthetically. If you were to use these terms, then you’d have one fewer “great” in describing the relationship, and you’d need to add 1 to the exponent in the formulas above.
Special Case 3: second cousins once removed (and the like)
Cousins share at least one grandparent, second cousins share at least one great-grandparent, and xth cousins share at least one grandparent with (x-1) “great”s in front. This means that xth cousins are each (x+1) generations removed from the common ancestor(s), and would expect to share (½)^(2x+2) of their genes if there is one common ancestor and (½)^(2x+1) if there are two. Each removal refers to one of the people being one more generation removed from any common ancestors, and so increases the power of ½ by 1. This means that xth cousins who are y-times removed would expect to share (½)^(2x+y+2) of their genes if there is one common ancestor and (½)^(2x+y+1) if there are two. Second cousins once removed would share either (½)^7 or (½)^6 of their genes, while first cousins twice removed would share (½)^6 or (½)^5.
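For anyone who would rather not do the exponent arithmetic by hand, the cousin formulas can be sketched in Python. The function name cousin_shared and its parameters are my own labels, not standard terminology, and the sketch assumes either one or two closest common ancestors.

```python
from fractions import Fraction

def cousin_shared(x, y=0, common_ancestors=2):
    """Expected shared proportion for xth cousins, y times removed.

    x: cousin degree (1 = first cousins, 2 = second cousins, ...).
    y: number of removals (0 = same generation).
    common_ancestors: 1 or 2 closest common ancestors (hypothetical name).
    """
    if x < 1 or y < 0 or common_ancestors not in (1, 2):
        raise ValueError("need x >= 1, y >= 0, and 1 or 2 common ancestors")
    # One common ancestor gives (1/2)^(2x+y+2); two common ancestors double it,
    # which is the same as (1/2)^(2x+y+1).
    return common_ancestors * Fraction(1, 2) ** (2 * x + y + 2)

print(cousin_shared(1, 0, 1), cousin_shared(1, 0, 2))  # first cousins: 1/16 1/8
print(cousin_shared(2, 1, 1), cousin_shared(2, 1, 2))  # second cousins once removed: 1/128 1/64
print(cousin_shared(1, 2, 1), cousin_shared(1, 2, 2))  # first cousins twice removed: 1/64 1/32
```

The printed values are the same powers of ½ worked out in the paragraph above, written as ordinary fractions.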
For those who like the visual, there is a handy little chart below, which appears to be in the public domain on Wikipedia. It does make some assumptions, however: namely, that siblings, cousins, aunts and nieces, etc. have exactly two closest common ancestors (both parents, two grandparents, etc.).
