This post compares complexity in spoken languages and programming languages.
There is a theory in linguistics that all human languages are equally complex. Languages may distribute their complexity in different ways, but the total complexity is roughly the same across all spoken languages. One language may be simpler in some aspect than another but more complicated in some other respect. For example, Chinese has simple grammar but a complex tonal system.
Even if all languages are equally complex, that doesn’t mean all languages are equally difficult to learn. An English speaker might find French easier to learn than Russian, not because French is simpler than Russian in some objective sense, but because French is more similar to English.
All spoken languages are supposed to be equally complex because languages reach an equilibrium between at least two forces. Skilled adult speakers tend to complicate languages by looking for ways to be more expressive. But children must be able to learn their language relatively quickly, and less skilled speakers need to be able to use the language as well.
I wonder what this says about programming languages. There are analogous dynamics. Programming languages can be relatively simpler in some way while being relatively complex in another way. And programming languages become more complex over time due to the demands of skilled users.
But there are several important differences. Programming languages are part of a complex system of language, standard libraries, idioms, tools, etc. It may make more sense to speak of a programming “system” to make better comparisons, taking into account the language and its environment.
I do not think that all programming systems are equally complex. Some are better designed than others. Some are more appropriate for a given task than others. Some programming systems achieve simplicity by sacrificing efficiency. Some abstractions leak less than others.
On the other hand, I imagine the levels of complexity are more similar when comparing programming systems rather than just comparing programming languages. Larry Wall said something to the effect that Perl is ugly so you can write beautiful programs in it. I think there’s some truth to that. A language can always be small and elegant by simply not providing much functionality, forcing the user to implement that functionality in application code.
See Larry Wall’s article Natural Language Principles in Perl for more comparisons of spoken languages and programming languages.
There are many “definitions” of complexity for programming systems/languages:
– The complexity you discussed in this post, which we can define as the difficulty of understanding the syntax, grammar, etc. For example, Perl is harder than Python, and C is about as hard as Java. You could also add the usability of the standard library, etc.
– Computational complexity, in the sense of what can be computed, which says that all Turing-complete programming languages (i.e. Chomsky type-0 languages) are equally complex.
– Kolmogorov complexity, which sits between the intuitive definition and the computational one. The idea is to define the complexity of a message (i.e. source code) as the length of the shortest program that produces it on a given Turing machine M. So the complexity of a message depends on the “environment”. However, it does not yield a total order.
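The Kolmogorov idea above can be made concrete with a rough sketch. True Kolmogorov complexity is not computable, so the Python snippet below uses the size of a zlib-compressed message as a crude stand-in, with the compressor playing the role of the machine M. The helper name compressed_size and the two sample messages are made up for illustration.

```python
import random
import zlib


def compressed_size(text: str) -> int:
    """Bytes needed for the zlib-compressed UTF-8 encoding of text.

    This is only a proxy for Kolmogorov complexity: a different
    compressor (a different "machine M") would give different numbers.
    """
    return len(zlib.compress(text.encode("utf-8"), 9))


# Two hypothetical messages of the same length (1000 characters each).
repetitive = "ab" * 500

rng = random.Random(0)  # fixed seed so the run is reproducible
scrambled = "".join(rng.choice("abcdefghijklmnopqrstuvwxyz") for _ in range(1000))

# The highly regular message compresses far better than the scrambled one.
print(len(repetitive), compressed_size(repetitive))
print(len(scrambled), compressed_size(scrambled))
```

Both messages have the same length, yet the repetitive one needs far fewer bytes once compressed, which is the sense in which it is "simpler" under this measure. Swapping zlib for another compressor would change the numbers, which mirrors the point that the measure depends on the environment.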
Programming languages are “simple”. I don’t trust intuition to decide whether one is more complex than another. I think there are two questions to ask:
– First, in the real world, it’s clear that ASM isn’t the best choice for programming a Rich Interface Application. Conversely, using ActionScript for numerical simulation in jet plane design is dangerous. So the complexity of a language depends on your goals.
– Second, the theoretical question: can I write anything in any language (that is, any Turing-complete, Chomsky type-0 language)? It has an answer: yes, you can.
As far as natural languages are concerned, what about essentially useless characteristics like gender of nouns? Perhaps linguists don’t consider them as contributing to complexity, but any language learner certainly does!
I wonder if there are analogous “features” in programming languages.
Nick, I think that grammatical gender contributes to complexity, but it has compensating benefits. For example, antecedents may sometimes be disambiguated by gender. Maybe languages with grammatical gender are simpler in some other way.
This may be an analog in programming languages. C++ has three operators (->, ::, and .) that all map to just . in C#. Most would say C# is simpler in this regard, but others would argue that C++ is more clear by, for example, making a symbolic distinction between namespace scope and object membership.
Don’t overlook the value of redundancy in natural languages. I believe many seemingly useless language features persist because they add redundancy. I believe this is also true of programming languages, just not in regards to their functional aspect.
I seriously doubt that all natural languages are equally complex, but it is clear that they do have differences in types of complexity. And what about “untranslatable” terms, usually inseparable from culture?
Obligatory programming reference, then a truly bizarre and controversial language:
A common question from C users learning Fortran used to be, “What’s the Fortran command to clear the screen and put the cursor in the top left-hand corner?” My favorite answer was, “Surely you mean to ask how do you punch the top left-hand corner of the punch card.”
The most bizarre natural language I have heard of is Pirahã, spoken by the Pirahã people.
John, I looked at the Pirahã link. That is really odd.
More odd than the language in some ways is the culture. Here are a couple of links to periodical articles about them:
http://www.newyorker.com/reporting/2007/04/16/070416fa_fact_colapinto
http://www.spiegel.de/international/spiegel/0,1518,414291,00.html
The idea that all natural languages are equally complex cannot really be called a theory, since linguists only make this claim in passing (usually in the context of making some point in support of a strongly nativist view of language), but if pressed could not say what they mean by “complexity”. John McWhorter received a lot of attention for saying that creoles were “simple”, by which he only meant that they systematically lack certain kinds of morphology. The only attempt I know of to really articulate what complexity would mean is a proceedings paper by Max Bane from 2007, “Quantifying and Measuring Morphological Complexity”, where he defines complexity in terms of an information theoretic measure, Kolmogorov complexity. And it is definitely not known whether languages lose or gain complexity over time, though it is generally assumed that language change is atelic.
I realize the main point of this post is to say something about programming languages, but a lot of the claims about human language are misleading, in the sense that much less is known about this question of complexity than certain comments here suggest.
See this post for pretty compelling evidence that not all languages are equally complex. Tolomako and Sakao, both spoken on the same island in Vanuatu, are pretty closely related (they were almost certainly the same a mere thousand years ago). Yet they could not be more different today, and Sakao is intensely complex and hard to learn for anybody except native speakers, definitely including people who speak Tolomako. More familiarly, French, like Sakao, has many more vowels and irregular verbs than Spanish, though they were the same language two thousand years ago.