How do we know so much about the history of words? And what is this Proto-Indo-European that keeps coming back in the calendar entries? On this page you will find a brief introduction to our field of research, with an explanation of the most important terms.

Every word has a story. It may have been borrowed from another language, or it may have been formed by combining other words. It may have changed radically in meaning over the years, or be related to a word that looks very different nowadays. Or it can tell you something about historical events, or about the worldview of people who used the same word thousands of years ago. But how do linguists determine where a word came from?

Part of this information comes from old written sources: for example, we can see the context in which a word was used over the centuries, and use this to deduce how the meaning has changed. However, this method only works for the period from which we have written sources! In order to determine how a language developed in prehistoric times, linguists make use of a more complex technique, the so-called comparative method. This method forms the basis of our field of study, Comparative Indo-European Linguistics, or CIEL. For those who are interested in the theory behind the calendar, we gladly give a short introduction here!

Language families

In 1786, Sir William Jones, an English judge working in India, wrote that Greek, Latin and the ancient Indian language Sanskrit were so similar that nobody could study them “without believing them to have sprung from some common source, which, perhaps, no longer exists”. This observation is often seen as the starting point of comparative linguistics as a scientific discipline.

Now Jones was not the first to see the similarities between these languages, or the first to suggest that they might descend from the same ancestral language. People even knew already what such a process could look like, thanks to examples such as the Romance languages (French, Italian, Spanish, Romanian, etc.), which all come from Latin. What happens is as follows. Speakers of an original ancestral language spread out over a relatively large area, for example in search of new farmland. Because they are so widely dispersed, contact between them is limited and their languages change largely independently of each other. At first, different dialects arise, but if they grow apart for long enough, the speakers of the different dialects will eventually no longer be able to understand each other at all. Voilà, there are now a few different ‘sister languages’, all descended from the same mother.

Those daughter languages often also split up again, so that even more different languages arise and the language family becomes even bigger. Latin is a good example of this. As Jones suggested, this is itself the daughter language of an original ancestral language. However, Latin itself is also the ancestor of several daughter languages, and these are already growing further apart into even more different dialects that may become the next generation of daughter languages in the future. In this way an extensive family tree can be formed.

Comparative linguistics

All in all, it was not a new idea that Greek, Latin and Sanskrit were related languages. Other languages, such as the Germanic, Celtic, Balto-Slavic and Iranian languages, had also already been mentioned as probable members of this family. So why was Jones’ observation so important? He was the first to suggest that the common ancestor did not have to be a living language! Here, too, the Romance languages are a good example. French, Spanish and Italian are still very much alive, but Latin itself is no longer spoken as a native language. If we hadn’t had such an abundance of written sources, the Latin of antiquity could just as easily have fallen into oblivion.

But what if such an ancestral language was never written down? That was the problem that linguists now faced. The ancestor of all those European and Asian languages had probably already gone extinct before the speakers came into contact with writing systems – so how could people ever know what kind of language it was?

The solution they came up with was to compare the various related daughter languages in order to reconstruct what their common ancestor must have looked like. This hypothetical ancestor is called Proto-Indo-European (PIE). The element Proto indicates that the language has never been written down, and that we only know it by means of reconstruction. For the same reason we put an asterisk (*) before reconstructed words or sounds.

Reconstructing: a step-by-step plan

So how does it work, this comparative linguistics? To illustrate how to reconstruct a language that has never been written down, we give an example in four steps below:

  1. Find some cognates: words in different languages that are related to each other. The clearest are words that are similar in form and have a similar meaning. For example: English to sit, Latin sed-eō ‘sit’, Russian sid-étʹ ‘sit’ and Sanskrit sád-as ‘chair’. We see that the roots of these words, sit, sed, sid and sád, have a very similar shape. This means they probably descend from the same word, which has developed differently in the different branches.
  2. Find the differences. In this case, for example, we see that English has a t at the end of the root, while we find a d in Latin, Russian and Sanskrit. (You can see that, apart from the t and d, the vowels have also changed. You can ignore that for now. Vowels are complicated.)
  3. Which older PIE form could have developed into all four of these forms? In this case, we reconstruct PIE *d, and not *t, because the Germanic languages (including English) are the only ones that show a t, while all other languages have a d. The idea then is that the *d became a t in the development of Germanic, while all other languages kept the original sound.
  4. Voilà: the root we reconstruct is *sed-, and the meaning must have had something to do with ‘sit’.
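For readers who like to see the logic spelled out, the four steps above can be sketched in a few lines of Python. This is only a toy illustration, not how linguists actually work: it assumes the roots are written without accents, it gives each branch a single vote, and it simply picks the sound that most branches agree on. The function name reconstruct_position and the branch labels are our own inventions for this example. In real reconstruction, a majority is only one argument among several; what counts most is that the sound correspondences are regular.

```python
from collections import Counter

# Root forms from step 1 (the accent on Sanskrit "sád" is dropped for simplicity).
cognates = {
    "English": "sit",
    "Latin": "sed",
    "Russian": "sid",
    "Sanskrit": "sad",
}

# Assumed branch labels, so that each branch of the family gets one vote.
branch_of = {
    "English": "Germanic",
    "Latin": "Italic",
    "Russian": "Slavic",
    "Sanskrit": "Indo-Aryan",
}


def reconstruct_position(forms, branches, position):
    """Return the sound that most branches show at the given position.

    A crude majority vote: a stand-in for step 3, not a real method.
    """
    per_branch = {}
    for language, form in forms.items():
        # Keep only the first form we see for each branch.
        per_branch.setdefault(branches[language], form[position])
    counts = Counter(per_branch.values())
    return counts.most_common(1)[0][0]


# Step 2 and 3: compare the first and last consonant of each root.
# The vowel (position 1) is skipped on purpose: vowels are complicated.
initial = reconstruct_position(cognates, branch_of, 0)
final = reconstruct_position(cognates, branch_of, -1)

print(f"*{initial}_{final}-")
```

The sketch recovers *s and *d for the two consonants, which matches the reasoning in step 3: only Germanic shows a t, so the d is taken to be original.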

Sound laws

Of course, we wouldn’t be able to do much if we could only make guesses on the basis of words that exist in enough daughter languages. Fortunately, language behaves better than you might expect. Reconstructing becomes a lot easier thanks to the existence of sound laws.

A sound law is the regular change from one sound to another. In Dutch, for example, the sound *th always turned into a d, while English has kept the original th. For that reason we have dief next to thief, dat next to that and leder next to leather. Some sound laws apply only under specific, equally regular conditions. For example, the Dutch d disappeared only between vowels (weer from older weder, kleren from older klederen), which is why dief did not turn into ief.
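Because sound laws are regular, they behave a little like rewrite rules, and the two Dutch examples above can be imitated in Python. This is a simplified sketch under clear assumptions: it works on spelling rather than on actual sounds, it uses the English spellings thief and that as stand-ins for the older th-forms, and the function names th_to_d and drop_intervocalic_d are made up for this illustration.

```python
import re

# Assumption: these five letters stand in for "vowel" in this toy example.
VOWELS = "aeiou"


def th_to_d(word):
    """Unconditioned change: every th becomes d, wherever it stands."""
    return word.replace("th", "d")


def drop_intervocalic_d(word):
    """Conditioned change: d is lost only when it stands between two vowels."""
    pattern = rf"(?<=[{VOWELS}])d(?=[{VOWELS}])"
    return re.sub(pattern, "", word)


print(th_to_d("thief"))              # dief
print(th_to_d("that"))               # dat
print(drop_intervocalic_d("weder"))  # weer
print(drop_intervocalic_d("dief"))   # dief (the d is not between vowels)
```

The second rule shows why the condition matters: the d in weder sits between two vowels and is dropped, while the d in dief stands at the beginning of the word and stays.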

Sound laws are so regular because speakers are not aware of the (gradual) change. It is not about a change in a specific word, but about a gradual change in how a sound is pronounced. You can see the same thing happening in modern dialects. In a certain sense, the difference between the hard, northern g and the soft g in southern provinces is a kind of ‘sound change in the making’.

The ‘laryngeals’: a proven reconstruction

Thanks to our knowledge of sound laws, we can reconstruct a considerable part of the vocabulary and grammar of PIE. And although the language has never been written down, in rare cases we can even confirm our reconstructions. A famous case is that of the so-called laryngeals.

Based on certain exceptions to sound laws, linguists had deduced over a century ago that PIE must have had three ‘invisible’ consonants: these sounds had left traces in the daughter languages, but were themselves no longer visible. A vowel directly followed by such a consonant, for example, usually changed into a long vowel, while the mysterious consonant itself disappeared. Linguists called those reconstructed consonants laryngeals. The assumption that these sounds existed solved a lot of irregularities, but of course it was difficult to prove their existence… until Hittite was found!

Hittite is an exceptionally archaic Indo-European language that was spoken in Anatolia, in modern-day Turkey. The texts in this language are older than those of any other written daughter language of PIE: they date from the 17th to the 12th century BC. And in Hittite words, linguists found ‘extra’ letters – exactly where they had been reconstructing the laryngeals! The existence of these hypothetical consonants was thereby confirmed.

In the calendar, you will sometimes encounter h1, h2 or h3 in the reconstructed PIE forms. This is how we represent the three reconstructed laryngeals. We don’t know exactly what they sounded like, but presumably they were guttural sounds of some kind; if you want to read the reconstructions aloud, the convention is to pronounce the laryngeals as a deep g.

Want to read more?

This is, of course, only a fraction of what we can do with comparative linguistics. By studying the history of the Indo-European family and other language families, we can learn more about how language changes in general. And reconstructing PIE is not only a fun academic puzzle. It also teaches us a lot about the history of Europe and Asia, and, thanks to the reconstructed vocabulary, offers us some insight into the society, culture and ideology of people who lived millennia ago – and all this without written sources!

Would you like to learn more about historical linguistics, Indo-European or ancient languages? Here we have listed some of our favourite books for non-linguists.