In Search Of The Paleo-Europeans

This is an introduction to the ur-European saga.

The saga is structured as a series of chronological essays tracing the path of the peoples who would become the earliest human Europeans, beginning with their exodus from Africa some 70ka (thousand years ago), continuing through tens of millennia down to the colonization of England by the Anglo-Saxon Federation in the early current era. The following are links to these essays (which will always be a work in progress):


Here will be told a story of ancient peoples and their movements through space and time. Though the exposition is largely informal, citations of the relevant underlying facts and hypotheses are annotated throughout.

As a synthesis, our story is structured along a physical/cultural timeline wrapped around select internet information, in effect a ‘curve-fitting’ exercise matching story line to trusted data. Such a synthesis is by nature speculative, a story woven around some best guesses. One such version of events is offered here. Other investigators will produce other hypotheticals.

When possible, human DNA clues are sought to inform our story. Accurate dating of genetic events, such as occurrence of a DNA mutation, can lead to discovery of the inter-dependencies between these events, and also to the relationship of these events to physical evidence that is dated by other means.

When paleogenomics garners more interest, research funds, and technology, there will be more testing of old bones and teeth for DNA, advancing by a leap the resolution of our current picture of European prehistory. With more DNA data points, more dated artifacts, and more scientific analysis, the field of possible versions of our tale will grow increasingly narrow.

While we are nowhere near to a canonical version of events yet, our vantage point is now on solid ground. An evolving state of the art likely will cause the story’s details to move around a bit. But the overarching structure of our story will prove invariant, read from the DNA.

Technical Background

The information here has a technical content that will seem foreign to the uninitiated. A brief synopsis of relevant topics can be found under the menu [Science:Genetics:] Our Genetic Clanship, and under [Science:Biosphere:] Climate – Big Picture.

It may be instructive to read a corresponding case study regarding using tested DNA for genealogy. With assistance from helpful Internet research groups, such testing can help us flesh out both historical (paper) genealogy and the deep ancestral groups such as are shown below.


The author’s paternal and maternal DNA discovery has led in each case to Europe in the paleolithic, stimulating a natural interest in the paleo-European population of 40ka.

Genetic Basis

There are two types of genetic data we can study: autosomal (recombinant) DNA, and non-recombinant DNA. Academic researchers typically analyze autosomal DNA, using DNA gathered from study of specific populations, to estimate their roots. This estimation involves advanced statistical modeling and knowledge of processes of population genetics.

For armchair genetic genealogists such as this author, ancient population inference derives principally from the current research databases containing paternal non-recombinant Y-chromosome (NRY) signatures for various current populations around the globe. Similar databases of maternal non-recombinant mitochondrial haplogroups are also consulted. The author’s M/F coordinates in this non-recombinant SNP genetic space are (I2-Z166*, U5b1c2). Both of these coordinates place the author’s ancestry entirely within Europe at 40ka. The ancestral haplogroups IJ and U5 of 40ka were likely among the ur-Europeans. (Note: Z166* signifies DNA that tests positive (derived) for Z166, but negative (ancestral) for all identified SNPs downstream from Z166.)
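The starred notation in the note above can be expressed as a simple rule. The following is a minimal sketch only: the function name, the "+"/"-" result encoding, and the sample results are hypothetical illustrations, not the output format of any testing service. The three downstream SNP names are those mentioned later in this essay as deriving from I2-Z166.

```python
def is_star_sample(results, snp, downstream):
    """Return True when a sample is `snp`*: derived for `snp` and
    ancestral for every listed downstream SNP.

    `results` maps an SNP name to "+" (derived) or "-" (ancestral).
    A downstream SNP with no recorded result leaves the question
    open, so the function returns False in that case.
    """
    if results.get(snp) != "+":
        return False
    return all(results.get(d) == "-" for d in downstream)


# Hypothetical test results, for illustration only.
sample = {"Z166": "+", "Y17535": "-", "S20905": "-", "Y6060": "-"}
downstream_of_z166 = ["Y17535", "S20905", "Y6060"]

print(is_star_sample(sample, "Z166", downstream_of_z166))
```

Note that the star is only as good as the list of known downstream SNPs: when a new downstream SNP is identified, a Z166* sample has to be retested against it before the star can be kept.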

The main focus here is on the male line, one tiny pathway in the author’s total genetic heritage. This paternal lineage came to be in the area of Frisia toward the end of this long journey. We hypothesize this because people living today with this paternal genetic signature frequently trace their deep ancestry to the German Bight and adjacent coastal territory.

Although the non-recombinant DNA is a very small component of the overall genome, we can also imagine that long ago, the populations we study here were somewhat homogeneous; for 95% of our study period, the non-recombinant DNA was characteristic for a significant proportion of a local population.

In the larger genetic picture, particularly in the most recent 5% of our time period, people with European deep ancestry are Euromutts, the intersection of a large number of genetic heritages that have at one time or another crossed paths in Europe. Since the Neolithic, all European cultures must be assumed to be a blend of several such paternal signatures. Our northern European mutt-iness has roots many millennia deep.

The table below lists the author’s succession of paternal NRY SNP markers, ending with the most downstream SNP currently tested. These markers define the limb-branching structure of the subject paternal kinship within our common human genetic tree. This clade hierarchy spans nearly 70ky, shown here with very approximate date of occurrence for the clade branching nodes.

Best-guess locations across time are noted for these paternal DNA sub-clades. But there are as yet few DNA validations for these guesses beyond contour (heat) maps of current population DNA densities, as observed in the research databases. Researchers then apply backward extrapolation on these current data to suggest the locus of various markers long ago.

In 2014, the earliest paleogenomic result to date pertaining to our HIJKLT branch of the tree was reported. A femur from the Ust’-Ishim male, found in Western Siberia, was analyzed to be mtDNA haplogroup R and Y-DNA haplogroup K(xLT). This confirms what had been predicted over the prior decade, that haplogroup K went to central Asia 55ka. The femur has been dated to 45ka. The study paper also noted some TMRCA dates that are consistent with other study dates below: haplogroup IJ 49ka, haplogroup C 53ka, and haplogroup KIJ 54ka.

The locations of maternal mtDNA clades in the table are left largely non-specific, since the U5 clade descendants had become diffuse throughout Eurasia. On the other hand, since mtDNA sourced from ancient bones has been much easier to decode than associated NRY DNA, there have been some scattered mtDNA mesolithic paleogenomic identifications, noted in the table below, which specify a western Mediterranean flavor for the known samples of this maternal clan during the earlier Holocene.

The SNP founding dates in the table below are statistical, mathematically modeled by researchers using average mutation rates for various DNA locations. The genetic dating confidence intervals vary across haplogroups. Even the dates whose assumptions are best understood have error bars of at least ±5ky in the earlier paleolithic.

Nothing definitive can yet be told regarding the real time of occurrence of any of the listed events, other than that an event’s date probably lies between the dates proposed for the preceding and following events. The dates shown below are typically within 5ky of dates published by other academic researchers. As research progresses, the dates get continual refinement, but at its current stage, this is an imprecise science.
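The bracketing idea in the paragraph above (an event's date probably lies between the dates proposed for its neighbors) can be sketched in a few lines. This is an illustration only: the function name is hypothetical, the rows are the first seven rows of the table in this essay, and the first age, given there as ">75", is represented by 75 as a stand-in.

```python
# (age in ka, NRY marker) pairs copied from the table in this essay,
# oldest first. The first row is listed as ">75"; 75 is a stand-in.
TIMELINE = [
    (75, "CDEF-M168"),
    (70, "CF-P143"),
    (65, "F-M89"),
    (55, "HIJKLT-M522"),
    (49, "IJ-M429"),
    (43, "I-M170"),
    (28, "I2-M438"),
]


def bracket(timeline, marker):
    """Return (older_bound, younger_bound) in ka for `marker`.

    A bound is None when the marker has no neighbor on that side.
    """
    names = [name for _, name in timeline]
    i = names.index(marker)
    older = timeline[i - 1][0] if i > 0 else None
    younger = timeline[i + 1][0] if i + 1 < len(timeline) else None
    return older, younger


print(bracket(TIMELINE, "IJ-M429"))  # (55, 43)
```

Read this way, the table's 49ka for IJ-M429 says little more than "somewhere between 55ka and 43ka", which is the level of precision the text cautions us to expect.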

In late 2014, a paper described a deep Y chromosome clade study (covering most of the period post-70ka). The paper assigns updated TMRCA dates to nodes (see its Table 2), but estimates no clade founding dates. Their statistical assumptions seem to cause their published TMRCAs to be too young. But multiplying each of these dates by the factor 1.47 gives a date at the midpoint of their TMRCA confidence ranges. These adjusted dates corresponded well to the hypothetical SNP founding dates of our table.
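The adjustment described above is a single multiplication. The sketch below shows the arithmetic only; the function name is hypothetical, and the input values are made-up placeholders, not figures from the paper's Table 2.

```python
SCALE = 1.47  # adjustment factor quoted in the text above


def adjust_tmrca(published_ka, scale=SCALE):
    """Scale a published TMRCA (in ka) by the adjustment factor,
    rounded to one decimal place."""
    return round(published_ka * scale, 1)


# Placeholder inputs for illustration only; these are NOT values
# taken from the paper.
for published in (10.0, 30.0):
    print(published, "->", adjust_tmrca(published))
```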

In early 2015, the YFull genetic sequence interpretation service published a Y-tree characterized by SNP founding and TMRCA dates for the Y-SNPs. Our table’s founding dates were then adjusted to become consistent with this new and systematic effort at characterizing dates of first SNP occurrence. YFull’s founding date estimates pushed back in time many earlier SNPs in our table. The more recent SNPs were pulled somewhat forward; they currently seem to bunch up around 3ka.

This bunching may be characteristic of a bottleneck, perhaps associated with the submergence of Doggerland. YFull shows no extant subclades of I2-L801 for 5400 years. Its founding date is ~9.3ka, but its TMRCA is only 3.9ka.
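The 5400-year figure above is simply the difference between the clade's founding date and its TMRCA. A minimal sketch of that arithmetic follows; the function name is a hypothetical label, and the two inputs are the I2-L801 dates quoted in the paragraph.

```python
def years_without_surviving_branches(founding_ka, tmrca_ka):
    """Span, in years, between a clade's founding date and the TMRCA
    of its living members. Branches that arose in this span have left
    no descendants among the people tested so far."""
    return round((founding_ka - tmrca_ka) * 1000)


# I2-L801: founded ~9.3ka, TMRCA ~3.9ka (dates quoted in the text)
print(years_without_surviving_branches(9.3, 3.9))  # 5400
```

A long span like this is what one would expect if only a single lineage passed through a bottleneck, though it could also reflect sparse testing of the relevant populations.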

A few leaf nodes identified by YFull now reach the current era and hence may be geo-located via historical means. However, the dating of the furthest downstream SNPs listed in our table does not as yet get us to the historical time period in Northern Europe for which we could have definitive information related to population movements.

With the current state of research, there remains a two-millennia gap between the reach of historical/autosomal genealogy research and the terminal SNPs of clades such as ours. Greater testing focus on populations in these portions of the tree will likely identify further leaf nodes that may allow most branches to reach the current era. Until then, we will continue to look to further paleogenomic research to shed light on these ‘dark ages’ for genetic genealogy.

As a final qualification regarding the table below, consider the geography of the various events to be restricted to the specific path followed by the subject ancestral line through Europe. Many of the earlier haplogroups likely had wide distribution, but we are focused here on central and north Europe where this line ultimately settled. A further caveat is that precise routes between hypothetical early locales are pure guesswork, with no supporting artifacts yet known. In some cases, the routes now likely are under water.

Age (ka) | Location [Cultural affinity in brackets] | NRY DNA Event | mtDNA Event
>75 | Horn of Africa | CDEF-M168 | N
70 | Southern Arabian Peninsula to NW India | CF-P143 | R
65 | Indus Valley | F-M89 | U
55 | North Iranian Plateau | HIJKLT-M522 |
49 | Anatolia/Balkans/Danube [Aurignacian] | IJ-M429 | U5
43 | Europe [Aurignacian-Gravettian] | I-M170 |
28 | Withdrawal to habitable refugia prior to LGM [Gravettian] | I2-M438 | U5b
22 | LGM – population in refugia: SE Europe [Epigravettian] | I2-L460 (I2a) I2-M436 (I2a2) | U5b1
17 | Post-LGM expansion, Danube/Rhine/Elbe/Oder/Vistula [Epigravettian->Hamburg] | I2-M223 (I2a2a) and many phylo-equivalent SNPs |
12 | Northwest Europe Oder->Thames [Ahrensburg] | I2-Y4450, CTS11271, CTS1535, CTS9139 |
11.5 | Greater Doggerland☑︎ [Ahrensburg] | I2-CTS616, CTS9183 | U5b1c (Italy☑︎☑︎), U5b1 (Switz☑︎☑︎)
11 | Greater Doggerland☑︎ [Maglemose] | I2-CTS10057, CTS10100 |
10.5 | Greater Doggerland☑︎ [Maglemose] | I2-Z161 |
9.3 | Greater Doggerland☑︎ [Maglemose] | I2-L801/S390, Z76, Z177, Z183, CTS2392, CTS4348, CTS6136, CTS7682, CTS7934 |
7.5 | Greater Frisia [Kongemose] | I2-L801 | U5b1c2 (Portugal☑︎☑︎)
6.2 | Greater Frisia [Ertebølle-Ellerbek (EBK)] | I2-L801 |
5 | Greater Frisia [Funnelbeaker (TRB) – Corded Ware] | I2-L801 | U5b1c2 (two samples Bavaria ~2700BCE ☑︎☑︎)
3.8 | Frisia [ELP] | I2-CTS6433, Z167, S2364, S2361, Y4442, Y4448 |
3.3 | Frisia [ELP] | I2-Z78, Z171 |
3.2 | Frisia [ELP] | I2-CTS8584, I2-Z180, Z185 ☑︎☑︎☑︎ |
3.0 | Frisia [ELP-Urnfield] | I2-L1198, Z166, Z187 ☑︎☑︎☑︎ |

SNP S2361, founded ~3.9ka, is the initial tree node where the value of STR H4 switched from 10 to 9 repeats. H4=9 was recognized as a marker for a Frisian population a decade before extensive SNP results became available.

Recently observed SNPs, Y17535, S20905, and Y6060, derive from I2-Z166. S20905 is ancestral for the long-available SNPs Z190 and Z79. As these SNPs and their further derived SNPs become independently testable, as more people test them, and as researchers provide date estimates and location hypotheses, our NRY I2-Z166+ genetic model may reach forward to historical times.

☑︎ Doggerland was somewhat larger in land area than the current island of Great Britain, consisting principally of the exposed southern end of the North Sea bed during the LGM. Initially tundra after the recession of the glaciers ending the LGM, it became a rolling plain of forest, rivers, wetlands, and shallow estuaries, a near-perfect environment for the hunter-gatherer population of the Mesolithic. Doggerland gradually was lost to rising sea levels, vanishing entirely before ~8ka.


From National Geographic: “Doggerland: The Europe That Was”

Perhaps the Z161 sister clade, M284, was stranded in England by this loss of the land bridge to the Continent. There were likely devastating tsunamis that ravaged Doggerland during its last few centuries, stressing the populations there. Perhaps extinctions of some sub-populations then explain the unusual discontinuities in the I1 and I2 subclades during this approximate period around 8ka. For the I2 ancestral population, there is a sudden jump in some STR markers during the transition from Z161->L801; this period also shows the largest number of unresolved phyloequivalent SNPs in the I2 lineage. Such phyloequivalence is possibly symptomatic of mass clade pruning due to a population bottleneck that left any remaining subclade fragments more susceptible to extinction. Further, such extinction events may explain the pruning of all but one of the former subclades from the I1 haplogroup tree, leaving only a monolithic ancestral I1 line over the preceding 15ky. To understand how statistically unusual this appears, consider that the similarly aged I2 tree has over 20 known subclades with origins before 6ka that are still represented in current populations. Both these I1 and I2 anomalies may be due to catastrophic sea rise in their jointly settled area, perhaps a small glimpse of our own future struggles with global climate change.

☑︎☑︎ If known, a U5 sub-clade location is shown in parentheses. The current earliest paleogenomic U5b1c2 sample comes from Portugal ~7.5ka.

☑︎☑︎☑︎ Note also that the ISOGG SNP authenticating body does not yet recognize newer SNPs, even after a year or more. Perhaps due to this lag, new-gen sequencing chip developers do not yet address these newly relevant SNPs, slowing new results. But whole chromosome testing has recently allowed us to begin leap-frogging ISOGG.

Proceed to Upper Paleolithic Europe.
