Chomsky: Basic Property
Notes on Noam Chomsky’s What Kind of Creatures Are We? (2016):
pp.4-5:
- Basic Property: ‘each language provides an unbounded array of hierarchically structured expressions that receive interpretations at two interfaces, sensorimotor for externalization and conceptual-intentional for mental processes.’
- → This formulates Darwin’s “infinite power” and Aristotle’s classic dictum that language is sound with meaning, but recent work shows that “sound” is too narrow and the classic formulation may be misleading.
- Each language has a computational procedure that satisfies the Basic Property, making it an I-language.
- ‘At the very least, then, each language incorporates a computational procedure satisfying the Basic Property. Therefore a theory of the language is by definition a generative grammar, and each language is what is called in technical terms an I-language—“I” standing for internal, individual, and intensional: we are interested in discovering the actual computational procedure, not some set of objects it enumerates, what it “strongly generates” in technical terms, loosely analogous to the proofs generated by an axiom system’.
- There is also weak generation, the set of expressions generated, and E-language, which some identify with a corpus of data or with some weakly generated infinite set. It is unclear whether weak generation is even definable for human language, and E-language is in any case derivative of the more fundamental notion of I-language. These ideas were discussed in the 1950s but not properly assimilated.
- ‘I will restrict attention here to I-language, a biological property of humans, some subcomponent of (mostly) the brain, an organ of the mind/brain in the loose sense in which the term “organ” is used in biology. I take the mind here to be the brain viewed at a certain level of abstraction. The approach is sometimes called the biolinguistic framework. It is regarded as controversial but without grounds, in my opinion’.
I-language / E-language: In linguistics, “I-language” stands for internal language, with the “I” also glossed as individual and intensional. It refers to an individual’s internal knowledge of a particular language, including the rules and principles that allow them to produce and understand sentences in that language. An I-language is what enables an individual to generate an unbounded set of well-formed sentences and interpret them.
On the other hand, “E-language” stands for “external language.” It refers to the observable use of language in a community, including the words, sentences, and texts that are produced and used in communication. E-language is what can be observed by others and can be used to infer what an individual’s I-language may be like.
The distinction between I-language and E-language is an important one because it allows linguists to investigate the mental representation of language in the mind of an individual, separate from its use and function in a social context. While I-language focuses on the cognitive structures and processes that enable language use, E-language focuses on the empirical data that linguists can observe and analyze in their research.
Generative grammar, which is a theoretical framework for describing the structure of language, is primarily concerned with I-language, because it seeks to discover the underlying rules and principles that allow an individual to generate an unbounded set of sentences. E-language, on the other hand, is often used as a source of data for testing linguistic hypotheses and theories.
pp.5-6:
- The Basic Property of language resisted clear formulation in earlier years. Ferdinand de Saussure believed that language is a storehouse of word images in the minds of members of a community and exists only by virtue of a sort of contract signed by the members of that community. Leonard Bloomfield defined language as an array of habits to respond to situations with conventional speech sounds and to respond to these sounds with actions. William Dwight Whitney conceived language as the body of uttered and audible signs by which in human society thought is principally expressed.
- Edward Sapir defined language as a purely human and non-instinctive method of communicating ideas, emotions, and desires by means of a system of voluntarily produced symbols. The Boasian tradition, which holds that languages can differ arbitrarily and each new one must be studied without preconceptions, is based on such indefinite answers. Linguistic theory in this tradition consists of analytic procedures to reduce a corpus to organized form, basically techniques of segmentation and classification.
- In contemporary cognitive science, similar indefinite answers about language remain current. A study on the evolution of language defines language as the full suite of abilities to map sound to meaning, including the infrastructure that supports it, which is too vague to ground further inquiry.
- The lack of a clear formulation of the Basic Property is surprising, and no biologist would study the evolution of the visual system assuming no more about the phenotype than that it provides the full suite of abilities to map stimuli to percepts along with whatever supports it.
A century ago, Otto Jespersen raised the question of how the structures of language “come into existence in the mind of a speaker” on the basis of finite experience, yielding a “notion of structure” that is “definite enough to guide him in framing sentences of his own,” crucially “free expressions” that are typically new to speaker and hearer. The task of the linguist, then, is to discover these mechanisms and how they arise in the mind, and to go beyond to unearth “the great principles underlying the grammars of all languages,” and by unearthing them to gain “a deeper insight into the innermost nature of human language and of human thought”—ideas that sound much less strange today than they did during the structuralist/behavioral science era that came to dominate much of the field, marginalizing Jespersen’s concerns and the tradition from which they derived. (p.8)
As soon as the earliest attempts were made to construct explicit generative grammars sixty years ago, many puzzling phenomena were discovered, which had not been noticed as long as the Basic Property was not clearly formulated and addressed and syntax was just considered “use of words” determined by convention and analogy. […]
One puzzle about language that came to light sixty years ago, and remains alive and I think highly significant in its import, has to do with a simple but curious fact. Consider the sentence “instinctively, eagles that fly swim.” The adverb “instinctively” is associated with a verb, but it is “swim,” not “fly.” There is no problem with the thought that eagles that instinctively fly swim, but it cannot be expressed this way. Similarly the question “Can eagles that fly swim?” is about ability to swim, not to fly. What is puzzling is that the association of the clause-initial elements “instinctively” and “can” to the verb is remote and based on structural properties, rather than proximal and based solely on linear properties, a far simpler computational operation, and one that would be optimal for processing language. Language makes use of a property of minimal structural distance, never using the much simpler operation of minimal linear distance; in this and numerous other cases, ease of processing is ignored in the design of language. In technical terms, the rules are invariably structure-dependent, ignoring linear order. The puzzle is why this should be so—not just for English but for every language, not just for these constructions but for all others as well, over a wide range. (pp.
- In such cases of language acquisition, where evidence is slight or nonexistent, the child reflexively knows the right answer.
- Linear order is not available to language learners in such cases, as they are guided by a deep principle that restricts search to minimal structural distance. This principle is part of Universal Grammar (UG), a genetically determined property of language.
- The principle of minimal distance is extensively employed in language design and is an instance of the more general principle of Minimal Computation. There must be some special property of language design that restricts Minimal Computation to structural rather than linear distance.
Universal Grammar (UG) is a concept in linguistics proposed by Noam Chomsky, which refers to the innate linguistic abilities that humans are born with. According to Chomsky, UG includes a set of principles and parameters that are common to all languages, and which enable children to learn language rapidly and effortlessly. The principle of minimal structural distance, mentioned in the passage, is one of these principles. It is the idea that when a child is exposed to language input, they search for the simplest and most coherent structural explanation for the patterns they observe, rather than relying on linear order alone. This is thought to be a property of UG, and is what enables children to acquire language so efficiently.
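To make the contrast concrete, here is a minimal Python sketch (my illustration, not from the book): the sentence is represented both as a flat word sequence and as a nested hierarchy, and the verb picked out by minimal linear distance differs from the verb picked out by minimal structural distance. The tree shape and the depth measure are simplifying assumptions.

```python
# Sketch: linear vs. structural distance for "instinctively, eagles
# that fly swim". Assumed, simplified constituent structure:
#   [instinctively [[eagles [that fly]] swim]]

sentence = ["instinctively", "eagles", "that", "fly", "swim"]
verbs = {"fly", "swim"}

# Minimal linear distance: the first verb after the adverb is "fly".
linear_choice = next(w for w in sentence[1:] if w in verbs)

tree = ("instinctively", (("eagles", ("that", "fly")), "swim"))

def depth_of(node, target, depth=0):
    """Embedding depth of `target` inside `node`, or None if absent."""
    if node == target:
        return depth
    if isinstance(node, tuple):
        hits = [d for child in node
                if (d := depth_of(child, target, depth + 1)) is not None]
        if hits:
            return min(hits)
    return None

# Minimal structural distance: "swim" sits higher in the tree than the
# deeply embedded "fly", so it is the structurally closest verb.
structural_choice = min(verbs, key=lambda v: depth_of(tree, v))

print("linear pick:    ", linear_choice)      # fly  (the unavailable reading)
print("structural pick:", structural_choice)  # swim (the actual reading)
```

The linear measure would associate “instinctively” with “fly”, the first verb it encounters; the structural measure associates it with “swim”, the main-clause verb, which is the only reading the sentence actually has.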
There is a small industry in computational cognitive science attempting to show that these properties of language can be learned by statistical analysis of Big Data. This is, in fact, one of the very few significant properties of language that has been seriously addressed at all in these terms. Every attempt that is clear enough to be investigated has been shown to fail, irremediably. But more significantly, the efforts are beside the point in the first place. If they were to succeed, which is a virtual impossibility, they would leave untouched the original and only serious question: Why does language invariably use the complex computational property of minimal structural distance in the relevant cases, while always disregarding the far simpler option of minimal linear distance? Failure to grasp this point is an illustration of the lack of willingness to be puzzled that I mentioned earlier, the first step in serious scientific inquiry, as recognized in the hard sciences at least since Galileo. (p.12)
- Linear order is not available for computation in the core parts of language involving syntax-semantics. Linear order is a peripheral part of language, a reflex of properties of the sensorimotor system, which requires it.
- The auditory system of chimpanzees might be fairly well adapted for human speech, but apes cannot extract language-relevant data from the surrounding environment as human infants do.
- Fundamental language design ignores order and other external arrangements, and semantic interpretation in core cases depends on hierarchy, not the order found in externalized forms.
- The Basic Property of language is generation of an unbounded array of hierarchically structured expressions mapping to the conceptual-intentional interface, providing a kind of “language of thought.”
- Language is not sound with meaning but meaning with sound or some form of externalization, typically sound, though other modalities are readily available.
- Most use of language is never externalized and is a kind of internal dialogue.
- Fully formed expressions appear internally in an instant, too quickly for the articulators to be involved, or probably even for instructions to them to be issued. Many interesting questions and ramifications arise here.
The latter issue aside, investigation of the design of language gives good reason to take seriously a traditional conception of language as essentially an instrument of thought. Externalization then would be an ancillary process, its properties a reflex of the largely or completely independent sensorimotor system. Further investigation supports this conclusion. It follows that processing is a peripheral aspect of language, and that particular uses of language that depend on externalization, among them communication, are even more peripheral, contrary to virtual dogma that has no serious support. It would also follow that the extensive speculation about language evolution in recent years is on the wrong track, with its focus on communication. (pp.14-15)
It is, in the first place, odd to think that language has a purpose. Languages are not tools that humans design but biological objects, like the visual or immune or digestive system. Such organs are sometimes said to have functions, to be for some purpose. But that notion too is far from clear. Take the spine. Is its function to hold us up, to protect nerves, to produce blood cells, to store calcium, or all of the above? Similar questions arise when we ask about the function and design of language. (p.15)
Merge
pp.16-20:
The simplest computational operation, embedded in some manner in every relevant computational procedure, takes objects X and Y already constructed and forms a new object Z. Call it Merge. The principle of Minimal Computation dictates that neither X nor Y is modified by Merge, and that they appear in Z unordered. Hence Merge(X,Y) = {X,Y}. That does not of course mean that the brain contains sets, as some current misinterpretations claim, but rather that whatever is going on in the brain has properties that can properly be characterized in these terms—just as we don’t expect to find the Kekulé diagram for benzene in a test tube.
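As a toy illustration (mine, not Chomsky’s), Merge can be modeled as unordered set formation. A Python frozenset makes the two defining properties explicit: the inputs are not modified, and they appear in the output without order, so Merge(X,Y) = Merge(Y,X).

```python
# Merge takes two already-constructed syntactic objects X and Y and
# forms the unordered set {X, Y}, tampering with neither input.
# frozenset (rather than set) keeps the result hashable, so the output
# of one Merge can serve as the input to the next.

def merge(x, y):
    return frozenset({x, y})

np = merge("that", "book")    # {that, book}
vp = merge("read", np)        # {read, {that, book}}

assert merge("read", np) == merge(np, "read")  # {X, Y} = {Y, X}: no order
assert np in vp                                # inputs appear unchanged in Z
```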
Note that if language really does conform to the principle of Minimal Computation in this respect, we have a far-reaching answer to the puzzle of why linear order is only an ancillary property of language, apparently not available for core syntactic and semantic computations: language design is perfect in this regard (and again we may ask why). Looking further, evidence mounts in support of this conclusion. Suppose X and Y are merged, and neither is part of the other, as in combining read and that book to form the syntactic object corresponding to “read that book.” Call that case External Merge. Suppose that one is part of the other, as in combining Y = which book and X = John read which book to form which book John read which book, which surfaces as “which book did John read” by further operations to which I will return. That is an example of the ubiquitous phenomenon of displacement in natural language: phrases are heard in one place but interpreted both there and in another place, so that the sentence is understood as “for which book x, John read the book x.” In this case, the result of Merge of X and Y is again {X, Y}, but with two copies of Y (= which book), one the original one remaining in X, the other the displaced copy merged with X. Call that Internal Merge.
It is important to avoid a common misinterpretation, found in the professional literature as well. There is no operation Copy or Remerge. Internal Merge happens to generate two copies, but that is the outcome of Merge under the principle of Minimal Computation, which keeps Merge in its simplest form, not tampering with either of the elements Merged. New notions of Copy or Remerge not only are superfluous; they also cause considerable difficulties unless sharply constrained to apply under the highly specific conditions of Internal Merge, which are met automatically under the simplest notion of Merge.
External and Internal Merge are the only two possible cases of binary Merge. Both come free if we formulate Merge in the optimal way, applying to any two syntactic objects that have already been constructed, with no further conditions. It would require stipulation to bar either of the two cases of Merge, or to complicate either of them. That is an important fact. For many years it was assumed—by me, too—that displacement is a kind of “imperfection” of language, a strange property that has to be explained away by some more complex devices and assumptions about UG. But that turns out to be incorrect. Displacement is what we should expect on the simplest assumptions. It would be an imperfection if it were lacking. It is sometimes suggested that External Merge is somehow simpler and should have priority in design or evolution. There is no basis for that belief. If anything, one could argue that Internal Merge is simpler since it involves vastly less search of the workspace for computation—not that one should pay much attention to that.
Another important fact is that Internal Merge in its simplest form—satisfying the overarching principle of Minimal Computation—commonly yields the structure appropriate for semantic interpretation, as just illustrated in the simple case of “which book did John read.” However, these are the wrong structures for the sensorimotor system: universally in language, only the structurally most prominent copy is pronounced, as in this case: the lower copy is deleted. There is a revealing class of exceptions that in fact support the general thesis, but I will put that aside.
Deletion of copies follows from another uncontroversial application of Minimal Computation: compute and articulate as little as possible. The result is that the articulated sentences have gaps. The hearer has to figure out where the missing element is. As is well known in the study of perception and parsing, that yields difficult problems for language processing, so-called filler-gap problems. In this very broad class of cases too, language design favors minimal computation, disregarding the complications in the processing and use of language. Notice that any linguistic theory that replaces Internal Merge by other mechanisms has a double burden of proof to meet: it is necessary to justify the stipulation barring Internal Merge and also the new mechanisms intended to account for displacement—in fact, displacement with copies, generally the right forms for semantic interpretation.
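The copy-deletion pattern can be mimicked in a short sketch (my own, under heavily simplified assumptions about phrase structure, and with auxiliary inversion ignored, so the output is “which book John read” rather than “which book did John read”): Internal Merge leaves two copies of the wh-phrase, and externalization pronounces only the structurally highest one, leaving a gap below.

```python
# Sketch: Internal Merge creates two copies; externalization pronounces
# only the first (structurally highest) copy encountered, deleting the
# lower one and leaving a gap. Phrases are tuples of words.

wh = ("which", "book")
X = ("John", ("read", wh))   # John read [which book]

# Internal Merge: merge a subpart of X with X itself.
# Z contains two copies of `wh`: the displaced one and the original.
Z = (wh, X)

def externalize(node, pronounced=None):
    """Spell out a structure, pronouncing each phrase only once."""
    if pronounced is None:
        pronounced = set()
    if isinstance(node, str):
        return node
    if all(isinstance(w, str) for w in node):    # a word-level phrase
        if node in pronounced:
            return ""                            # lower copy deleted: a gap
        pronounced.add(node)
        return " ".join(node)
    return " ".join(filter(None, (externalize(c, pronounced) for c in node)))

print(externalize(Z))   # -> "which book John read" (gap after "read")
```

The hearer’s filler-gap problem is visible in the output: “read” surfaces with no audible object, and the displaced copy at the front must be interpreted in that position.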
The same conclusions hold in more complex cases. Consider, for example, the sentence “[which of his pictures] did they persuade the museum that [[every painter] likes best]?” It is derived by Internal Merge from the underlying structure “[which of his pictures] did they persuade the museum that [[every painter] likes [which of his pictures] best]?,” formed directly by Internal Merge, with displacement and two copies. The pronounced phrase “which of his pictures” is understood to be the object of “likes,” in the position of the gap, analogous to “one of his pictures” in “they persuaded the museum that [[every painter] likes [one of his pictures] best].” And that is just the interpretation that the underlying structure with the two copies provides.
Furthermore, the quantifier-variable relationship between every and his carries over in “[which of his pictures] did they persuade the museum that [[every painter] likes best]?” The answer can be “his first one”—different for every painter, as in one interpretation of “they persuaded the museum that [[every painter] likes [one of his pictures] best].” In contrast, no such answer is possible for the structurally similar expression “[which of his pictures] persuaded the museum that [[every painter] likes flowers]?,” in which case “his pictures” does not fall within the scope of “every painter.” Evidently, it is the unpronounced copy that provides the structure required for quantifier-variable binding as well as for the verb-object interpretation. The results once again follow straightforwardly from Internal Merge and copy deletion under externalization. There are many similar examples—along with interesting problems as complexity mounts.
Just as in the simpler cases, like “instinctively, eagles that fly swim,” it is inconceivable that some form of data processing yields these outcomes. Relevant data are not available to the language learner. The results must therefore derive “from the original hand of nature,” in Hume’s phrase—in our terms, from genetic endowment, specifically the architecture of language as determined by UG in interaction with such general principles as Minimal Computation. In ways like these we can derive quite far-reaching and firm conclusions about the nature of UG.
Concluding Remarks
pp.24-25:
A broader research project—in recent years called the minimalist program—is to begin with the optimal assumption—the so-called strong minimalist thesis, SMT—and to ask how far it can be sustained in the face of the observed complexities and variety of the languages of the world. Where a gap is found, the task will be to see whether the data can be reinterpreted, or principles of optimal computation can be revised, so as to solve the puzzles within the framework of SMT, thus producing some support, in an interesting and unexpected domain, for Galileo’s precept that nature is simple, and it is the task of the scientist to prove it. The task is of course a challenging one. It is fair to say, I think, that it seems a good deal more realistic today than it did only a few years ago, though enormous problems of course remain.
All of this raises at once a further question: Why should language be optimally designed, insofar as the SMT holds? This question leads us to consideration of the origin of language. The SMT hypothesis fits well with the very limited evidence we have about the emergence of language, apparently quite recently and suddenly in the evolutionary time scale, as Tattersall discussed. A fair guess today—and one that opens rich avenues of research and inquiry—is that some slight rewiring of the brain yielded Merge, naturally in its simplest form, providing the basis for unbounded and creative thought, the “great leap forward” revealed in the archaeological record, and the remarkable differences separating modern humans from their predecessors and the rest of the animal kingdom. Insofar as the surmise is sustainable, we would have an answer to questions about apparent optimal design of language: that is what would be expected under the postulated circumstances, with no selectional or other pressures operating, so the emerging system should just follow laws of nature, in this case the principles of Minimal Computation—rather the way a snowflake forms.
These remarks only scratch the surface. Perhaps they can serve to illustrate why the answer to the question “What is Language?” matters a lot, and also to illustrate how close attention to this fundamental question can yield conclusions with many ramifications for the study of what kind of creature humans are.
