What exactly is the problem with the language of the Voynich manuscript? Why the consternation?
The problem boils down to this:
It looks like a natural language
made up of highly unnatural words.
The structures within words are unusually predictable and rigid compared to natural languages BUT the distribution of these unnatural words throughout the text is within the normal range of natural languages.
So, the words are highly unnatural, but above the level of words, the text looks like a natural language.
That's the problem.
****
The words themselves have highly artificial structures. They are rigidly tripartite: each consists of three fields (prefix, midfix, suffix), with certain letters and combinations of letters restricted to particular fields. There are many strict constraints on which letters can go where and in what combinations, and some letters are found only in certain parts of words and in a small number of combinations.
This makes Voynich words far more predictable than words in natural languages. Words in natural languages are not structured like this. It is highly artificial.
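One common way to put a number on "predictable" is the conditional entropy of the next letter given the previous one, measured inside words. The lower the figure, the more rigid the word structure. What follows is only a minimal sketch: the function name and the toy word lists are invented for illustration and are not drawn from any Voynich transliteration.

```python
import math
from collections import Counter

def conditional_char_entropy(words):
    """Return H(next char | previous char) in bits, measured inside words."""
    pair_counts = Counter()
    prev_counts = Counter()
    for word in words:
        padded = "^" + word + "$"  # mark word start and word end
        for prev, nxt in zip(padded, padded[1:]):
            pair_counts[(prev, nxt)] += 1
            prev_counts[prev] += 1
    total = sum(pair_counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for (prev, nxt), n in pair_counts.items():
        p_pair = n / total                  # joint probability of the pair
        p_cond = n / prev_counts[prev]      # P(next | prev)
        entropy -= p_pair * math.log2(p_cond)
    return entropy

# Hypothetical toy data: a rigid vocabulary versus a looser one.
rigid_words = ["abc", "abc", "abc", "abc"]
loose_words = ["ab", "ac"]
print(conditional_char_entropy(rigid_words))
print(conditional_char_entropy(loose_words))
```

With a real text one would run the same function over the word list of a transliteration and over comparison texts in known languages, and compare the figures.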
But when we look at the text as a whole, these unnatural words behave like a natural language. For example:
According to Zipf’s Law, the frequency of a word in a natural-language text is roughly inversely proportional to its frequency rank: the most common word will be about twice as common as the second most common word, about three times as common as the third, and so on. This is a hallmark of all natural languages. (We are not sure why, but it is.)
And this is what we find in the Voynich language. Even though the words are highly artificial in themselves, the most common word is about twice as frequent as the second most, and so on, just as happens in natural languages. So, surprisingly, the Voynich language conforms to Zipf’s Law.
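The Zipf test described above can be sketched in a few lines: count the words, rank them by frequency, and fit a straight line to log(frequency) against log(rank). A slope near -1 is the classic Zipfian signature. This is a rough illustration only. The function names are made up here, the toy token list is contrived to give an exact answer, and a simple least-squares fit is just one of several ways to estimate the exponent.

```python
import math
from collections import Counter

def rank_frequency(tokens):
    """Return [(rank, frequency), ...] with rank 1 for the most common word."""
    freqs = sorted(Counter(tokens).values(), reverse=True)
    return list(enumerate(freqs, start=1))

def zipf_slope(tokens):
    """Least-squares slope of log(frequency) against log(rank)."""
    pairs = rank_frequency(tokens)
    if len(pairs) < 2:
        raise ValueError("need at least two distinct words")
    xs = [math.log(rank) for rank, _ in pairs]
    ys = [math.log(freq) for _, freq in pairs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Contrived toy text: frequencies 12, 6, 4, 3 follow 12 / rank exactly.
toy_tokens = ["a"] * 12 + ["b"] * 6 + ["c"] * 4 + ["d"] * 3
print(rank_frequency(toy_tokens))
print(zipf_slope(toy_tokens))  # very close to -1 for this toy input
```

Real texts never fit perfectly, so in practice one looks at whether the slope and the overall shape of the curve fall within the range seen in texts of known languages.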
Other tests place it within the range of natural languages too. The rate at which words are immediately repeated (reduplication) is about normal for a natural language, for example. Labels in the text that begin with the letter <o> seem to be nouns, and the proportion of such words in the text is indeed about the same as the proportion of nouns in natural languages. Word formation seems artificial; word distribution seems natural.
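The reduplication measure mentioned above is easy to make concrete. One simple definition, assumed here for illustration, is the share of adjacent word pairs in which the same word occurs twice in a row. The function name and the sample tokens are hypothetical.

```python
def reduplication_rate(tokens):
    """Fraction of adjacent token pairs in which both tokens are identical."""
    if len(tokens) < 2:
        return 0.0
    repeats = sum(1 for a, b in zip(tokens, tokens[1:]) if a == b)
    return repeats / (len(tokens) - 1)

# Hypothetical token sequence: 3 of the 5 adjacent pairs are repeats.
sample = ["a", "a", "b", "c", "c", "c"]
print(reduplication_rate(sample))  # 0.6
```

The figure for a text only means something when set beside the same figure computed for texts in known languages.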
Assuredly, there are odd things about the language at every level, but at the global whole-text level many indicators say it is a natural language. Some tell-tale patterns and distributions cannot be contrived artificially. It is when we look at the distribution and behavior of characters within words, however, that we find all is contrived and very different from anything found in natural tongues.
It follows: if the Voynich language is a cipher or is gibberish - as the weird structure of words might suggest - then the author has been able to imitate the behaviors of natural language such as word distribution in a highly sophisticated way. This would be exceedingly difficult if not impossible to achieve. How does one write gibberish, or execute a cipher, that preserves the distributions of Zipf's Law?
The problem, anyway, is the disjunction of artifice and naturalness. When we zoom into the word level everything seems constructed and contrived and we are sure it cannot be natural. But when we zoom out to the whole text level it seems to fit comfortably within the expectations of a natural language. How do we explain this? Any solution to the Voynich mystery must offer a cogent explanation for this confounding incongruity. So far, no convincing explanation has been forthcoming.
R. B.