Using a biological framework to resolve the early stages of linguistic divergence

The diversity observed among present-day languages has interested academics for centuries. It has been proposed that this diversity arises partly from dialects that develop in populations that become isolated, and which then evolve into different languages over time. Here, Terhi Honkola describes the work conducted by herself and colleagues, now published in BMC Evolutionary Biology, into whether isolation of speaker populations and the local environment really do contribute to language diversity.

At first glance, language and evolution seem to have very few things in common. Closer inspection shows, however, that the three parts that form the core of evolution (variation, heritability and change-causing forces such as selection) are also found in language.

Similarities between linguistic and biological evolution, already noticed by Darwin and elaborated on by a number of scholars since then, have made it plausible to use modern methods of evolutionary biology to analyze language data.

The majority of this work has focused on constructing phylogenies of the Indo-European, Austronesian and Uralic language families. These phylogenies in turn are used to study the taxonomy or the dispersal histories of these groups.

In these studies, languages are paralleled with species. Because both entities are hierarchical, their substructures (i.e. dialects and populations) can also be paralleled with each other, and population genetic methods can be applied to dialect data. The applicability of these methods to dialect data has been shown in earlier work by our research group.

In our current work, we have taken the combination of biology and linguistics one step further. Instead of only adopting methods from population genetics, we have also utilized certain elements of the biological microevolutionary framework.

We aimed to advance the present understanding of linguistic divergence, a process that has played an important role in the emergence of the more than 7000 languages that exist in the world today.

In biology, species divergence can be investigated at the early phases of the divergence process, which involves the separation of populations within a species. In our current study we examined linguistic divergence by focusing on the initial phase of the process: the divergence of dialects within a language.

We took hypotheses from both biological and linguistic literature. Firstly, we studied the role of geographical distance, as it contributes to the divergence of both populations and dialects. In general, the further away the groups are from each other, the less balancing dispersal there is between them, and the more different they turn out to be.

Secondly, we adopted the biological hypothesis of isolation by environment to investigate whether mere differences in environmental conditions can isolate groups of speakers and have a role in the divergence of dialects.

Finally, we considered whether linguistic differences can be explained by cultural differences. Culture, and specifically its cumulative nature, is a special feature of the human species.
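To make the first of these hypotheses concrete, here is a minimal Python sketch of how one could check whether linguistic difference grows with geographical distance. It is not the analysis used in the study: the group names, coordinates and linguistic distances are invented for illustration, and a plain Pearson correlation over pairs ignores the fact that pairwise distances are not independent (a permutation approach such as a Mantel test is the usual remedy).

```python
import math
from itertools import combinations


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(h))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical dialect groups with invented (latitude, longitude) centres.
coords = {"A": (60.0, 25.0), "B": (61.0, 25.0), "C": (63.0, 25.0)}

# Hypothetical linguistic distances between the groups (0 = identical).
linguistic = {("A", "B"): 0.2, ("A", "C"): 0.6, ("B", "C"): 0.4}

# Compare the two kinds of distance over every pair of groups.
pairs = list(combinations(sorted(coords), 2))
geo = [haversine_km(*coords[a], *coords[b]) for a, b in pairs]
lin = [linguistic[(a, b)] for a, b in pairs]
r = pearson(geo, lin)
```

In this toy data the linguistic distances were chosen to rise in step with geographical distance, so the correlation is close to 1. Real dialect data need not behave this way.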

100 years of linguistic variations in Finland

We studied the divergence of dialects of the Finnish language. We used a dataset describing the linguistic variation of Finnish approximately one hundred years ago (Fig 1). The data is from a time when the differences between dialects were still very clear, as urbanization took place in Finland relatively late.

Fig 1. An example page of the Dialect Atlas of Finnish (Kettunen 1940).

As we were interested in differences between dialect groups, we used a population genetic clustering method to infer those groups. To achieve this, we organised the linguistic data in a similar way to genetic data.

In biology, the individual is the unit of study, and within each individual, the alleles at certain loci are the topic of interest. In the dialect data, we were interested in the local variants of certain linguistic features within a local administrative unit (i.e. municipality). Thus, we paralleled individuals with municipalities, loci with linguistic features, and alleles with local variants of these linguistic features.
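The parallel described above can be sketched as a small data structure in Python. This only illustrates the layout of the data, not the clustering method itself; the municipality labels, feature names and variants are placeholders invented for the example, and the simple mismatch-based distance is an assumption chosen for clarity.

```python
# Hypothetical toy data: each municipality (the "individual") records one
# variant (the "allele") for each linguistic feature (the "locus").
municipalities = {
    "A": {"feature_1": "x", "feature_2": "p", "feature_3": "m"},
    "B": {"feature_1": "x", "feature_2": "q", "feature_3": "m"},
    "C": {"feature_1": "y", "feature_2": "q", "feature_3": "n"},
}


def linguistic_distance(a, b):
    """Share of shared features on which two municipalities use different variants."""
    shared = sorted(set(a) & set(b))
    if not shared:
        raise ValueError("no shared features to compare")
    differing = sum(1 for feature in shared if a[feature] != b[feature])
    return differing / len(shared)


def distance_matrix(data):
    """Pairwise distances between all municipalities, keyed by (name, name)."""
    names = sorted(data)
    return {(m, n): linguistic_distance(data[m], data[n]) for m in names for n in names}


matrix = distance_matrix(municipalities)
```

Here municipalities A and B differ on one of three features, while A and C differ on all three, so A is linguistically closer to B than to C in this invented example.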

For Finland, there also exist old records of the spatial distribution of various cultural and environmental features. Finland is a comparatively homogeneous country, both environmentally and culturally, and hence we were able to study whether even small differences in environmental and cultural conditions may be connected with linguistic divergence.

To our surprise, we found that geographical distance explained the least of the linguistic differences between the dialect groups. In other words, dialects spoken geographically close to each other may remain linguistically very different.

While cultural differences explained the majority of the linguistic differences, environmental differences also explained more than geographical distance did. The extent of the roles played by both environment and culture is a remarkable finding, as it suggests that human cultural adaptation may have had a role in the divergence of the Finnish dialects.

We formulated a hypothesis that the human capacity for cultural adaptation in response to different environmental conditions may contribute to the divergence of dialects. While more research on this subject is needed, our study introduces interesting new perspectives on how global linguistic diversity could have taken shape.
