Does “untyped” also mean “dynamically typed” in the academic CS world?
academic types use “untyped” to mean “no static types”. they are smart enough to see that values have types (duh!). context matters.
Do academically-focused computer science folks use “untyped” as a synonym of “dynamically typed” (and is this valid?) or is there something deeper to this that I am missing? I agree with Brendan that context is important but any citations of explanations would be great as my current “go to” books are not playing ball on this topic.
I want to nail this down so I can improve my understanding and because even Wikipedia doesn’t refer to this alternative usage (that I can find, anyway). I don’t want to mess up with either using the term or questioning the use of the term in future if I’m wrong 🙂
(I’ve also seen a top Smalltalker say Smalltalk is “untyped” too, so it’s not a one-off which is what set me off on this quest! :-))
Yes, this is standard practice in academic literature. To understand it, it helps to know that the notion of “type” was invented in the 1930s, in the context of lambda calculus (in fact, even earlier, in the context of set theory). Since then, a whole branch of computational logic has emerged that is known as “type theory”. Programming language theory is based on these foundations. And in all these mathematical contexts, “type” has a particular, well-established meaning.
The terminology “dynamic typing” was invented much later — and it is a contradiction in terms in the face of the common mathematical use of the word “type”.
For example, here is the definition of “type system” that Benjamin Pierce uses in his standard textbook Types and Programming Languages:
A type system is a tractable syntactic method for proving the absence
of certain program behaviors by classifying phrases according to the
kinds of values they compute.
He also remarks:
The word “static” is sometimes added explicitly–we speak of a
“statically typed programming language,” for example–to distinguish the
sorts of compile-time analyses we are considering here from the
dynamic or latent typing found in languages such as Scheme (Sussman
and Steele, 1975; Kelsey, Clinger, and Rees, 1998; Dybvig, 1996),
where run-time type tags are used to distinguish different kinds of
structures in the heap. Terms like “dynamically typed” are arguably
misnomers and should probably be replaced by “dynamically checked,”
but the usage is standard.
Most people working in the field seem to share this point of view.
Note that this does not mean that “untyped” and “dynamically typed” are synonyms. Rather, the latter is a (technically misleading) name for a particular case of the former.
I am an academic computer scientist specializing in programming languages, and yes, the word “untyped” is frequently (mis)-used in this way. It would be nice to reserve the word for use with languages that don’t carry dynamic type tags, such as Forth and assembly code, but these languages are rarely used and even more rarely studied, and it’s a lot easier to say “untyped” than “dynamically typed”.
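As a rough illustration of what "no dynamic type tags" means, the sketch below uses Python's struct module to read the same four bytes first as a float and then as an unsigned integer. Python itself tags its values, so this only simulates the raw-memory view that a language like Forth or assembly gives you; the variable names are made up for the example.

```python
import struct

# Four raw bytes: the IEEE 754 single-precision encoding of 1.0, little-endian.
raw = struct.pack("<f", 1.0)

# Nothing in the bytes says "I am a float"; the reader picks an interpretation.
as_float = struct.unpack("<f", raw)[0]
as_uint = struct.unpack("<I", raw)[0]

print(as_float)      # 1.0
print(hex(as_uint))  # 0x3f800000
```

With no tag stored alongside the data, neither reading is "wrong" as far as the machine is concerned.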
P.S. In pure lambda calculus, the only “values” are terms in normal form, and the only closed terms in normal form are functions. But most scientists who use the lambda calculus add base types and constants, and then you either include a static type system for lambda or you are right back to dynamic type tags.
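To make the "only functions" point concrete, here is a small sketch using Church numerals, with Python lambdas standing in for lambda terms. The helper to_int is an assumption added for readability: it steps outside the pure calculus by using a Python integer, which is exactly the kind of base type the P.S. says people tend to add.

```python
# Church numerals: the number n is "apply f n times", so every value is a function.
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))
add = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))

def to_int(n):
    # Helper outside the pure calculus: count applications using a Python int.
    return n(lambda k: k + 1)(0)

two = succ(succ(zero))
print(to_int(add(two)(two)))  # 4
```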
P.P.S. To original poster: when it comes to programming languages, and especially type systems, the information on Wikipedia is of poor quality. Don’t trust it.
I’ve looked into it, and found that the answer to your question is simply, and surprisingly, “yes”: academic CS types, or at least some of them, do use “untyped” to mean “dynamically typed”. For example, Programming Languages: Principles and Practice, Third Edition (by Kenneth C. Louden and Kenneth A. Lambert, published 2012) says this:
Languages without static type systems are usually called untyped languages (or dynamically typed languages). Such languages include Scheme and other dialects of Lisp, Smalltalk, and most scripting languages such as Perl, Python, and Ruby. Note, however, that an untyped language does not necessarily allow programs to corrupt data—this just means that all safety checking is performed at execution time. […]
[link] (note: bolding in original) and goes on to use “untyped” in just this way.
I find this surprising (for much the same reasons that afrischke and Adam Mihalcin give), but there you are. 🙂
Edited to add: You can find more examples by plugging "untyped languages" into Google Book Search. For example:
[…] This is the primary information-hiding mechanism is many untyped languages. For instance PLT Scheme  uses generative
— Jacob Matthews and Amal Ahmed, 2008 [link]
[…], we present a binding-time analysis for an untyped functional language […]. […] It has been implemented and is used in a partial evaluator for a side-effect free dialect of Scheme. The analysis is general enough, however, to be valid for non-strict typed functional languages such as Haskell. […]
— Charles Consel, 1990 [link]
By the way, my impression, after looking through these search results, is that if a researcher writes of an “untyped” functional language, (s)he very likely does consider it to be “untyped” in the same sense as the untyped lambda calculus that Adam Mihalcin mentions. At least, several researchers mention Scheme and the lambda calculus in the same breath.
What the search doesn’t say, of course, is whether there are researchers who reject this identification, and don’t consider these languages to be “untyped”. Well, I did find this:
I then realized that there is really no circularity, because dynamically typed languages are not untyped languages — it’s just that the types are not usually immediately obvious from the program text.
— someone (I can’t tell who), 1998 [link]
but obviously most people who reject this identification wouldn’t feel a need to explicitly say so.
Untyped and dynamically typed are absolutely not synonyms. The language that is most often called “untyped” is the Lambda Calculus, which is actually a unityped language – everything is a function, so we can statically prove that the type of everything is the function. A dynamically typed language has multiple types, but does not add a way for the compiler to statically check them, forcing the compiler to insert runtime checks on variable types.
x could be a number, or a function, or a string, or something else (and determining which one would require solving the Halting Problem or some hard mathematical problem), so you can apply x to an argument and the browser has to check at runtime that x is a function.
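The runtime check described above can be sketched as follows. This uses Python rather than a browser language, and apply_to_one is a hypothetical name, but the behaviour is the same in spirit: the call is accepted by the language and only rejected when the value turns out not to be callable.

```python
def apply_to_one(x):
    # No static check happens here: x may be anything at this point.
    return x(1)

print(apply_to_one(lambda n: n + 1))  # 2

try:
    apply_to_one("not a function")
except TypeError as err:
    # The interpreter inspected the value at the call site and rejected it.
    print("runtime check failed:", err)
```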
By contrast in other languages, like C, variables carry types but values do not. In languages like Java, variables and values both carry types. In C++, some values (those with virtual functions) carry types and others do not. In some languages it is even possible for values to change types, although this is usually considered bad design.
This question is all about Semantics
If I give you this data: 12. What is its type? You have no way of knowing for sure. Could be an integer – could be a float – could be a string. In that sense it’s very much “untyped” data.
If I give you an imaginary language which lets you use operators like “add”, “subtract”, and “concatenate” on this data and some other arbitrary piece of data, the “type” is somewhat irrelevant (to my imaginary language). Example: perhaps add(12, a) yields 109, which is 12 plus the ASCII value of a.
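The imaginary add can be sketched in Python, assuming the second argument is the character a (ASCII 97), which is what makes the arithmetic 12 + 97 = 109 work out. The function and its rule for single characters are invented for illustration, not taken from any real language.

```python
def add(left, right):
    # Hypothetical "type-indifferent" add: a one-character string counts as its code point.
    def as_number(value):
        if isinstance(value, str) and len(value) == 1:
            return ord(value)
        return value

    return as_number(left) + as_number(right)

print(add(12, "a"))  # 109
```

The operation succeeds, but whether the result means anything is a separate question.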
Let’s talk C for a second. C pretty much lets you do whatever you want with any arbitrary piece of data. If you’re using a function that takes two uints – you could cast and pass anything you want – and the values will simply be interpreted as uints. In that sense C is “untyped” (if you treat it in such a way).
However – and getting to Brendan’s point – if I told you that “My age is 12” – then 12 has a type – at least we know it’s numeric. With context everything has a type – regardless of the language.
This is why I said at the beginning – your question is one of semantics. What is the meaning of “untyped”? I think Brendan hit the nail on the head when he said “no static types” – because that’s all it can possibly mean. Humans naturally classify things into types. We intuitively know that there is something fundamentally different between a car and a monkey – without ever being taught to make those distinctions.
Getting back to my example in the beginning – a language that “doesn’t care about types” (per-se) may let you “add” an “age” and a “name” without producing a syntax error… but that doesn’t mean it’s a logically sound operation.
Is a system/language which doesn’t enforce type safety at compile/build/interpretation time “untyped” or “dynamically typed”?
In my comment on someone else’s answer I said:
To perform the same operation with C#, for example, I’d NEED an interface called