
Synthese (2006) 153: 69–104 © Springer 2006. DOI 10.1007/s11229-005-4063-6

J. COLLINS

PROXYTYPES AND LINGUISTIC NATIVISM

ABSTRACT. Prinz (Furnishing the Mind: Concepts and Their Perceptual Basis, MIT Press, 2002) presents a new species of concept empiricism, under which concepts are off-line long-term memory networks of representations that are ‘copies’ of perceptual representations – proxytypes. An apparent obstacle to any such empiricism is the prevailing nativism of generative linguistics. The paper critically assesses Prinz’s attempt to overcome this obstacle. The paper argues that, prima facie, proxytypes are as incapable of accounting for the structure of the linguistic mind as are the more traditional species of empiricism. This position is then confirmed by looking in detail at two suggestions (one derived from recent connectionist research) from Prinz of how certain aspects of syntactic structure might be accommodated by the proxytype theory. It is shown that the suggestions fail to come to terms with both the data and theory of contemporary linguistics.

1. INTRODUCTION

Jesse Prinz (2002) offers a new defense of concept empiricism; his thesis is that “all (human) concepts are copies or combinations of copies of perceptual representations” (ibid., p. 108). Prinz dubs such “copies” proxytypes: long-term memory networks combined (typically, but not necessarily) from ‘copies’ of the representations of dedicated input faculties, i.e., perceptual systems. Prinz argues at length that this notion, when in full dress, satisfies the full range of desiderata of a theory of concepts, outperforming its more fashionable rivals that invariably fail on one or more of the desiderata (e.g., definitions, atomism, prototypes, theory theory, etc.).

The sequel will not be concerned with Prinz’s debate with his rivals over the nature of concepts; rather, its focus will be on what Prinz has to say about linguistic nativism in light of his general theory of concepts. It might seem to be an obvious desideratum for a modern empiricist to undermine, or at least question, the prevailing nativism of linguistics. Indeed, it would not be hyperbolic to say that the demise of concept empiricism is in large part due to the
success of generative linguistics, at least within cognitive science, if not philosophy. Prinz appears to think so.

Prinz follows the philosophical tradition in taking a theory of concepts to target, for the most part, what we may refer to as ‘lexical concepts’. Very roughly, a lexical concept is the propositional constituent that is expressed by the contribution a lexical item makes to the thought expressed by an utterance (relative to context, etc.) of a sentence that features the given lexical item as a proper constituent. For example, a typical theory of concepts is concerned with the concept CAT, qua that ‘thing’ which ‘cat’ contributes to the proposition expressed by ‘Cats make good pets’, ‘Dogs chase cats’, etc. Lexical concepts are not necessarily merely word ‘meanings’, but they at least are the thought constituents we express via language. There are, for sure, a host of familiar problems with the presumed isomorphism of a sentence and the thought it expresses that this schematic model of concepts enshrines, e.g., matters of implicature, context sensitivity, unarticulated constituents, polysemy, etc. Still, at some level and to some degree, the model serves as a default conception of a theory of concepts. If, however, this is one’s conception – and it is Prinz’s – then the claims of generative linguistics would appear to be quite orthogonal to one’s concerns, whether one is an empiricist or nativist. Put bluntly, the theories of generative linguistics are simply not accounts of what we believe, know or think about language in any sense which may answer to a theory of propositions and their constituents. Unfortunately, there is a philosophical tradition that insists on attributing to linguistic theory the task of providing the propositions that serve as the objects of our knowledge of language.
This is certainly a misreading of Chomsky in particular, and fails to make sense of the concerns that animate linguistic research (Collins 2004).1 So, it would seem that, for good or ill, generative linguistics is no obstacle at all to an empiricist theory of concepts construed as a theory of propositional content.

This first impression is true enough, but rather than giving a free ride to the empiricist, it simply reveals that a theory of concepts as one of propositional constituents will not help to explain linguistic competence. Linguistic competence remains the paradigmatic cognitive capacity of our species, whose theoretical understanding involves the attribution of “concepts” to the speaker/hearer, e.g., categories (noun, verb, etc.), agreement features (tense, person, gender, etc.), semantic roles (agent, patient, etc.), Case features (nominative, accusative, etc.), phonetic features (voiced, aspirant, etc.), and so on.

According to contemporary theories, the competent speaker/hearer, in some sense, possesses these notions, and her doing so is what constitutes her competence. But the speaker/hearer qua competent does not have any thoughts about these notions; her competence is not constituted by a set of propositional attitudes towards ‘truths’ about, say, the phrase structure of English or Japanese.

The moral, then, is that if one approaches the concepts of linguistics as if one were concerned with propositional constituents, then there can be little hope of insight, still less refutation. Prinz offers us an example of this moral. Just how we ought to understand the conceptual attribution linguistics makes to the speaker/hearer is a complex issue that will only be partially elaborated here (but see Collins 2004). Indeed, in a certain sense, there is nothing to elaborate beyond a theory of the language faculty and the evidence to support it. The notion of a ‘concept’ is colloquial, commonsensical; there is no a priori reason to expect it to give way to a more rigorous, theoretical understanding. Perhaps there just is nothing in our minds that answers to ‘concept’ under its colloquial or philosophical construal. For the purposes of the sequel, then, no general notion of a concept is required to stand against Prinz’s empiricism; all that is required is an account of the language faculty as a native structure that essentially falls outside of Prinz’s general empiricist account of linguistic competence.

2. PROXYTYPES

Prinz’s notion of a proxytype is a hybrid designed to explain how concepts might possess both content (intentional (reference) and cognitive (sense)) as propositional constituents and be employed in categorization. These desiderata seem to pull in opposite directions. On the one hand, the intentional content of a concept appears to be constituted independently of the employment of the concept in, say, perceptual discrimination. HORSE refers to horses, even if one persistently confuses horses with donkeys in the dark. On the other hand, the very idea of categorization appears to involve some degree of conceptual complexity, in terms of which categories may be detected by a matching of constituent structure between concept and category. The rich data we have on the matter suggests that categorization is made on the basis of the detection of prototypical features, whereby, say, members of the category fish are identified
as such by some weighting of features such as swims, has fins, lives in the sea, etc. (see Prinz 2002, chap. 3, for a sound discussion of the data). Familiarly, however, the properties that constitute real categories are not available for perceptual detection, e.g., genotypes, atomic numbers, etc. For instance, a typical weighting of fish features would classify dolphins as fish above eels, which would perhaps be classified along with snakes; but dolphins and eels are not, respectively, in the intentional contents of FISH and REPTILE.

Prinz’s way out of this dilemma, on the assumption that both intentional content and categorization are properly in the scope of a theory of concepts, is (i) to posit a nomological connection between proxytypes and members of the categories that constitute the respective extensions of the proxytypes, but (ii) to construe a proxytype itself not as a mere label at one end of a causal relation but as a complex that is usable in categorization. The details of this proposal are beyond the scope of this paper. Instead, a number of remarks on the empiricism of the model will suffice for our purposes.

Firstly, the account amounts to a version of (Lockean) concept empiricism to the extent that all concepts, qua proxytypes, are copies, or combinations of copies, of the elements of perceptual representations. The claim is not a species of phenomenalism, or sense-data theory. For Prinz, ‘perceptual’ marks a functionally defined class of systems dedicated to peripheral processing, such as vision, olfaction, haptic sense, etc. The representations involved in this processing are potentially quite rich, such as Biederman’s geons. The claim is that whatever elements serve in these perceptual representations suffice to exhaust the elements that comprise concepts in general.

Secondly, the notion of ‘copying’ is impressionistic; it indicates that the features of tokenings of perceptual representations may be stored in downstream systems and be available for recall in working memory. Perceptual representations are copied, then, just in the sense that their salient features, which make them apt to detect categories, can be stored outside of the perceptual faculties and be activated independent of the operation of the faculty (via stimulus) from which they derive. Also, Prinz doesn’t presume that there is a common code shared by all perceptual systems or a universal code – a cognitive lingua franca – into which all proxytypes are formatted. A proxytype is an inter-modal network made of the distinct representational features that comprise the resources of distinct perceptual faculties.

Thirdly, as just indicated, although proxytypes are made out of perceptual representations, they are unlike such representations in their original role; they are off-line in the sense of being endogenously controlled, not controlled by stimulus from the distal categories they are designed to detect (Prinz 2002, p. 197). This gives proxytypes a simulation role not unlike the simulation model of theory of mind: they go proxy for perceptions. To employ a concept is, in a very real sense, to simulate one’s perceptual detection of the members of the category that is the concept’s intentional content, just as, so it is argued, to explain an agent’s actions via mental state attribution is to simulate one’s being in the mental states attributed, where, qua off-line, the simulation produces a prediction of behaviour identified by the action one would perform if the simulation were on-line.2

Fourthly, a network of stored perceptual representations counts as a concept through its activation in thought. We can think of cognitive processes as, in part, selecting from memory structures that serve to categorise the aspects of the environment currently the concern of the agent. For example, there are very real differences between thinking about fish in a restaurant, a pet shop, or off Hawaii swimming with tiger sharks. In each case, we may presume, a distinct proxytype would be activated from a network within which all three and numerous potential others are integrated. For each situation, we employ that proxytype that includes the features that would be salient in the perceptual detection of fish in that situation. That said, we may also presume that each activated proxytype has a default set of features that are shared, these being a direct inheritance of the species-invariant structure of our perceptual faculties. Further, given a shared language, which may freely complement proxytypes (e.g., each FISH proxytype perhaps carries a ‘called “fish”’ feature), there exists a communicative capacity by which agents may focus on the precise concepts another is employing. With quite abstract concepts, such as DEMOCRACY, there might not be anything more to the proxytype than linguistic features.

Much more could be said about proxytypes. The brief of the present work, however, is not a wholesale critique of the theory. Independent of whatever failings or virtues the theory possesses in general, the argument of the sequel will be that it does not adequately characterise linguistic cognition in particular. It will be assumed hereafter that any unmentioned features of the theory will not vitiate this argument.

3. LINGUISTIC NATIVISM

Prinz readily acknowledges the apparent problem contemporary approaches to linguistics pose for his account and seeks to show that the empirical considerations in their favour are not ‘fully convincing’ (2002, p. 235); the bets are not yet off. It bears notice now that this is a very weak conclusion. No scientific bets are ever off, such is empirical inquiry. To highlight an instance of this general triviality is quite nugatory. It would be significant if reasonable doubt could be cast on the anti-empiricist considerations of linguistics such that its explanatory scope would be narrowed or a competitor account would to some degree be confirmed. As we shall see, this is not achieved.

Prinz (2002, chap. 8), after Cowie (1999) and Samuels (2002), finds nativism to be an obscure doctrine. Prinz (ibid., p. 193) rests upon an ‘operational’ identification of innateness whereby, centrally, we may say that an innate concept or capacity is one that is acquired under a poverty of stimulus. This identification does not define innateness; rather, it serves as a diagnostic, a reference fixer for an ‘essence’ we have yet to discover. The arguments to follow do not rest upon any particular conception of innateness, and certainly not one that begs any questions against Prinz’s brand of empiricism. All that is important for the sequel is that linguistic competence exhibits features – call them innate, if you wish – that are inconsistent with Prinz’s perceptual model of concepts. This holds independently of just how one defines innateness. Thus, as far as the logic of the following is concerned, ‘linguistic nativism’ may be understood to be idiom-like, with no necessary implication for how nativism should be understood in general.

Perhaps the most fundamental observation one can make about humans and language is that, while each ‘normal’ member of the species acquires (at least) a given language, any of an indefinite number of distinct languages could have been acquired with equal ease. This observation provides us with a vivid distinction between the generality of the human capacity for language and the specificity of the states it enables – linguistic endstates. The question which linguistic theory addresses is how, as a species trait, we all move from generality to specificity. The schematic answer it provides is that the initial ‘general’ capacity is not so general in the sense that it is essentially specifiable in terms of a range of conceptual structures, some finite selection from which is realised by each attainable
linguistic endstate. Reciprocally, the endstates are not so specific, in that they are circumscribed by the finite variation admitted in the initial conceptual structures. The conceptual structures constitute the language faculty, and the acquisition of language amounts to the development of the faculty (in the context of the faculty’s integration with other cognitive structures and mechanisms, especially ones governing sound articulation and conceptual/intentional understanding). The initial state of the faculty is dubbed Universal Grammar (UG), and the endstates into which the faculty may develop are dubbed I-languages. From this, we may characterise linguistic nativism in the following terms:

(LN) Human genotypes realise UG that determines a range of developmental paths terminating at I-languages. The paths are largely under genetic control, with experience offering only a shaping and/or selecting effect. Each I-language is a variation on a fixed, uniform theme – a repertoire of structures and concepts.

UG may be viewed as a schema, whose dimensions of variability admit a range of values determined by a set of parameters. An I-language is a certain incarnation of the UG schema. The notion of an I-language is an idealisation in the sense that it allows us to abstract from the undoubtedly horrendous complexity of the mental structures that determine how an individual speaks and understands her language the way she does. Even within the one ‘language’ such as English, each of us has distinct vocabularies and understands our words in many subtly varied ways. Some constructions which are in common use for one speaker are gibberish to another. Further and obviously, we all speak – articulate sounds – in different ways. The UG hypothesis, however, is that there is a structural stability to this manifold provided by the developmental paths parametrically determined by the initial state. So, while each of us has our own I-language, it will possess a structure shared by many other I-languages as determined by the shared initial state of UG. The extent to which I-languages may vary is a purely empirical question. Per hypothesis, though, there will be a universally shared element, which is just whatever UG contributes that is not parameterised; this might simply be the terms of the parameters, the X, as it were, of ‘Select X or not X’.

We have, then, three aspects to an I-language. First, there is the core structure that is inherited from UG. On current minimalist understanding, this includes a computational procedure, CHL, that maps from a selection of lexical items to pairs of structures –
an expression, <PHON, SEM> – that code, respectively, for features that determine phonological and conceptual/intentional articulation. (Here I assume a ‘level-free’ model of the faculty, as found in Chomsky, 2000a, 2001a. It matters little for the following arguments if the pairs are read as the more familiar <PF, LF> as found in Chomsky, 1995.) The structures are understood to be interfaces between the language faculty and other systems of the mind/brain that put ‘language’ to use. We can think of CHL as having two termini: the first is formed at a point (or cycle of points) of spell-out, where just those features of the lexical selection that are relevant to phonology are stripped away and, after morphological supplementation, code for ‘surface’ properties of both sound and linearity. The second terminus is another point of spell-out (or a cycle thereof) at which the remaining features of the initially selected lexical items, minus those that are semantically inert, interface with conceptual systems. In general, CHL is governed by operations of merging lexical items to form hierarchical structures, and the movement and deletion of lexical items and their features within and from such structures. Such operations guarantee that only those features respectively legible to external systems of ‘sound’ and ‘meaning’ make it to the appropriate interface. It follows from this model that both CHL and the features that comprise the lexical items to which CHL applies will be shared across speakers.

Second, there is the parameterised structure. Again, on minimalist understanding, this reduces to functional differences expressed in the morphological marking of agreement and Case. As just remarked, the extent of this structure is undetermined, although much of it will be shared. A simple example is the property of pro-drop: ‘Spanish’ and ‘Italian’ admit the absence of subjects in finite clauses; ‘English’ and ‘German’ do not. In technical terms, the latter languages require a phonologically realised specifier of a Tense head at PHON; the former languages do not.

Third, there will be the truly idiosyncratic, the variation particular to individual speakers. In essence, this variation amounts to lexical differences and their effect at the interfaces.

This characterisation of the language faculty is somewhat stipulative. Its purpose, though, is for us to have something concrete on the table before Prinz’s contrary thoughts are assessed; reasons for thinking that something like the above may well be true will be advanced in response to Prinz. Clearly, however, the considerations required for a full defense of the above minimalist understanding
cannot be given here and are, anyhow, unnecessary to establish the inadequacy of Prinz’s empiricist position. Before we turn to Prinz’s explicit arguments, let us first see why the above model cannot prima facie be accommodated by the proxytype theory.

A proxytype is a long-term memory network whose constituent elements are drawn from perceptual representations. In effect, a proxytype is an off-line, simulated detection of the members of its intentional content, the category of things to which it refers. The language faculty, or a steady state of it, an I-language, differs from this conception of concepts in two immediate respects. Firstly, the faculty is posited as a system distinct from those that govern linguistic performance, the production and understanding of utterances. The faculty interfaces with these systems in the form of <PHON, SEM> pairs, but, as indicated, the properties – the constituent ‘concepts’ or lexical features – of these structures are determined internal to the faculty; they are not read into the faculty from systems that detect speech in others. Thus, although the systems that detect meaningful speech exploit the resources of the faculty, insofar as PHON and SEM go to determine how we speak and understand and so enter into how we determine what meanings go with which sounds in the speech of others, they borrow the extant concepts of the autonomous faculty; they don’t determine them. Secondly, an I-language is so called because, in part, it is an internal mental structure. An I-language is not a public item, nor does it represent one. So, although performance systems make use of lexical features, it would be a mistake to think that such features serve to detect or refer to externally instantiated properties, for there are no such properties.
Speech (or, indeed, any orthographic system) does not come marked with phonological features of voiced or aspirant; nor are words marked with nominal or adjectival features; nor are words in phrases signaled to be heads or arguments or specifiers of heads; nor do pronouns come with arrows pointing to their antecedents, if any; nor are all the elements so much as pronounced, such as specifiers of Tense heads (in some constructions) in Italian, or ‘understood’ subjects of infinitives in English. All of these features, and many more, belong to the mind, inherent in structures that relate cognitive systems to each other; they are not out there in the world ready to be detected. As indicated just above, the features assuredly enter into speech detection and processing, but they are imposed or mapped onto whatever psychophysically transduced features are first constructed by the mind rather than referentially
stretching beyond the mind. As we shall see, much of Prinz’s effort is directed to distorting the linguistic features so that we might think of them as referring to external patterns, and so as being proxytypes, structures inherited from perceptual systems that detect the ‘special’ patterns.

More generally, the faculty approach to linguistic cognition is flatly inconsistent with any notion of general learning mechanisms that may construct linguistic features from identifiable or discriminable properties of the environment. The proxytype theory essentially appeals to perceptually detectable environmental properties and so contradicts the two central characteristics of I-languages sketched above. Under the I-language conception, linguistic competence is sui generis, i.e., the characteristics of the ensemble of interfaces are not derived from or constructed out of independently available, or otherwise specifiable, concepts or features. In effect, this means that the concepts a speaker/hearer possesses such that she is linguistically competent are not retrievable from her environment, although they may well be triggered or, to some degree, shaped by environmental factors.3 If this is so, then it seems that the PHON/SEM features must spring from the native capacities coded for by human genotypes.

4. PRINZ’S EMPIRICIST DEFLATION

First off, Prinz does not claim that proxytypes cannot be innate. Without endorsing it, he readily admits the coherence of what he calls weak antinativism about concepts: ‘All innate concepts are perceptual representations’ (Prinz 2002, p. 196). He goes on to explain and qualify: ‘If concept empiricism admits innate perceptual representations and innate perceptual concepts, it might also admit innate principles, provided they consist of entities of these two kinds. Admitting innate principles is a problem only in cases where those principles involve concepts that are not perceptual representations’ (op. cit.). In other words, proxytypes (= concepts) can be innate only if the constituents that make them up are inherited from perceptual representations that are innate. This concession, however, is somewhat of a dialectical manoeuvre on Prinz’s part, for innateness plays no role in the arguments for proxytypes, nor does Prinz essay any account of what it would be for perceptual representations to be innate, save for the general diagnostic thought that poverty of
stimulus characterises the fixation on innate concepts. Fortunately, the matter may be sidelined, for the thesis of linguistic nativism as presented above denies that linguistic concepts are inherited from any antecedent representations. So, whether or not some perceptual representations are innate, proxytype theory will still be stymied because linguistic concepts are not perceptual tout court. To Prinz’s arguments, then.

Prinz (2002, p. 200) avers that ‘one can challenge the supposition that the rules constituting our innate grammatical abilities are nonperceptual. The simplest strategy is to argue that the language faculty qualifies as a perceptual modality in its own right’.4 Well, one can say this, certainly, but there just is no ‘supposition’ that ‘grammatical abilities are nonperceptual’; rather, there are highly detailed proposals about grammatical competence that strongly indicate that the competence is not a species of perception. The issue here, however, clearly turns on what is meant by ‘perception’. I follow Prinz in thinking of perception as a cognitive system dedicated to the automatic detection (exogenously determined) of distal properties (see above). It is in this sense that linguistic competence looks not to be a perceptual modality. In the more familiar phenomenological sense of ‘perception’, which is as conceptually articulated as other propositional attitudes such as belief, one may innocently think of humans as perceiving meaning, sound and structure. But cognitive theory is not a priori constrained to capture the features of our colloquial concepts; as with any other science, its objects of study are determined by the development of our best theories. Thus, to deny that linguistic competence is a species of perception is not to reject perceptual phenomenology or legislate against our familiar propositional attitude verb, ‘perceive’; nor is it to deny that cognitive modules of perception are innately structured; rather, the denial amounts to the claim that linguistic competence is not a dedicated system for the processing of perceptual inputs. Let us look at just two considerations which clearly support the hypothesis that ‘grammatical abilities are nonperceptual’ in this cognitive sense.

Firstly, as has been well documented over several decades, linguistic competence admits the generation of infinitely many structures that are unparsable on-line. This is to say that our grammatical abilities generate structures that we cannot perceptually identify on-line as well-formed, coherent. The classic example here is the case of centrally embedded relative clauses. Consider: The boat the sailor the dog bit built sank.5 The sentence appears to be gibberish. Indeed,
somewhat like the persistence of visual illusions, even after one has understood why the structure is coherent, one still has difficulty in parsing it; and even if one manages to do so, if the example is changed (e.g., Sailors sailors sailors fight fight fight), the problem recurs. A number of responses might be made.

Firstly, it might be thought that centrally embedded relative constructions (and other such examples) are obviously perceptually identifiable as ‘well-formed’; otherwise, how would we in fact know that they are ‘well-formed’? This response misses the point. Of course, one can come to recognise a sentence as ‘well-formed’ and perhaps even subsequently employ it. But this achievement would not be merely perceptual in the relevant on-line cognitive sense; it requires breaking the sentence up in terms of concepts such as subject, relative clause, main verb, etc., which are not perceptually identifiable as such. If they were, then we would recognise the sentence as ‘well-formed’ straight off.

There is, for sure, no little difficulty in understanding how we comprehend signs as linguistic, and I certainly do not take Prinz to think that he has the answer. What unparsable but well-formed structures appear to show, however, is that our linguistic competence is not simply dedicated to the on-line detection of language (cf. Higginbotham 1987). Again, this is not to deny the phenomenological fact that we can get ourselves in a position to see that a sentence is well-formed, but so much is not to claim that language is a dedicated perceptual modality; it remains open to think that linguistic perception in the colloquial sense is a massive interaction effect, involving dedicated modules in combination with so-called central processes – the conceptual richness of the common notion would suggest that this is likely to be so.

Similar reasoning applies to the cases of ill-formed structures that are ‘mis-perceived’ to carry an otherwise coherent content: apparently parsable structures that are ill-formed. Consider the sentence: Many more people have been to Paris than I have. Initially, the sentence seems perfectly coherent, meaning that many more people than just me have been to Paris, but somewhat like an Escher drawing, on closer examination, this meaning is not encoded in the structure.6 The second clause is elliptical, and when we fill in the missing progressive phrase, the structure turns into patent gibberish: Many more people have been to Paris than I have been to Paris. Is it a perceptual achievement to recover from this illusion? Again, phenomenologically, it seems to be, but it bears emphasis that we regularly use and accept such gibberish; that is, our perceptual detection of linguistic structure does not match interpretability as determined by syntax. For example, it is difficult to see what is wrong with the structure unless we fill in the ellipsis, and even with this done, it is still hard to say precisely where the structure breaks down.7 In other words, we have to reflect linguistically on the case, expanding the structure, making use of ‘meanings’ and structural notions; it appears not to be an automatic on-line achievement in the scope of a dedicated input module. In this regard, the Escher cases seem to be different in kind; their ‘illusion’ is resolved by focusing on, say, certain vertices and shifting perspective; there is no need to appeal reflectively to concepts outside of those required to recognise the shapes featured in the picture. More could be said here. Nevertheless, at a minimum, we can conclude that syntactic well-formedness doubly dissociates from the on-line dedicated detection of linguistic ‘meaning’, the kind of detection Prinz takes to be constitutive of perception. The kind of outputs parsing delivers appear to be a good deal shallower than the structures that are reflectively available to us. I do not pretend that this consideration is decisive against any conception that treats language as a species of perception, but it is a prima facie refutation of the kind of model Prinz appears to have in mind.

Secondly, the bullet might be bit: ‘unparsable structures are not well-formed’. Minus empiricist dogma that linguistic structure must be perceptually detectable, there is little reason to take this contention seriously. Consider:

(1) [TP [DP1 The boat [CP1 [DP2 the sailor] built]] sank]8

This structure is easy to parse, but the difficult structure simply involves an iteration of the operation of inserting a further CP into a DP:

(2) [TP [DP1 The boat [CP1 [DP2 the sailor [CP2 [DP3 the dog] bit]] built]] sank].

From the perspective of the structures, as it were, the only salient difference is the exiguous one that the first is parsable, the second not, but that is the very phenomenon at issue. Thus, far from inferring that (2) is incoherent from the fact that it is unparsable, we perhaps should infer that linguistic competence is not a perceptual modality.
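The iteration at issue can be made vivid with a toy generator (a minimal Python sketch; the function name and word lists are my own illustrative assumptions): each further noun–verb pair nests one more relative clause, and the output quickly outruns on-line parsability while remaining well-formed.

```python
def center_embed(pairs):
    """Build a centre-embedded relative-clause sentence from (noun, verb)
    pairs, outermost pair first: the subject nouns appear in order,
    followed by the verbs in reverse order."""
    nouns = [f"the {noun}" for noun, _ in pairs]
    verbs = [verb for _, verb in reversed(pairs)]
    return " ".join(nouns + verbs)

# One level of embedding, as in (1), is readily parsable:
print(center_embed([("boat", "sank"), ("sailor", "built")]))
# the boat the sailor built sank

# One further iteration yields the gibberish-seeming but well-formed (2):
print(center_embed([("boat", "sank"), ("sailor", "built"), ("dog", "bit")]))
# the boat the sailor the dog bit built sank
```

Nothing in the generative procedure distinguishes the parsable depth-one case from the unparsable depth-two case, which is the point at issue.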


Thirdly, it might be thought that, while (2) is indeed non-parsable but coherent, it is generated from simpler perceptual representations. This might be the case, although we have so far seen no reason to think so, but the proposal is a non sequitur anyhow. We are considering the idea that linguistic competence is a perceptual modality, i.e., that the competence is dedicated to the perception and identification of distal patterns. The problem with (2) is that it is not such a pattern qua unparsable. Nevertheless, it is a perfectly coherent structure. So, even if our achieved understanding of (2) is somehow based on perceptual representations, that very fact would establish that linguistic competence is not merely an input device, for (i) it generates structures that remain unparsable and (ii) the extrapolation of these structures is not triggered or caused by transduced perceptual input. After all, if it were, then we would not need to extrapolate the structures in the first place.

The second principal problem with the idea that linguistic competence is a species of perception is the free generation – creativity – of linguistic performance. We are not, as it were, saddled with the structures we employ, nor, of course, are they structures predictable from the occurrent situation. Even routine linguistic engagements, such as asking for a pack of Marlboro or ordering your favourite meal in your favourite restaurant, may feature innumerably many different structures. Such a capacity enjoyed by every competent speaker/hearer appears to have nothing whatsoever to do with perception, for nothing is being perceived, nor are different structures elicited by different features of the prevailing situation. If one thinks that they might be, with the features being ‘subtle’, then simply consider that one can be perverse or merely decide to ask for your cigarettes in a baroque way. To insist that there must be features of the environment that control for the structures used or are otherwise (unconsciously) detected, is to insist on a dogma there is no reason to think true.9 Such considerations go back over forty years to Chomsky’s (1959) demolishing of Skinner.

Prinz, in fact, does not endorse the claim that language is a sense. A more attractive idea for him (2002, p. 200) is that ‘language abilities emerge from other faculties, including perceptual and motor systems.’10 This is but a minor amendment, for, again, the idea is that linguistic competence amounts to pattern recognition or some extrapolation from it. The proposal lacks any substance until we have reason to think that there is any pattern to be detected. Without this provision we are back to ‘subtle’ environmental features that have no independent credence. Recall, the nativist does not say that grammatical structure is an external pattern, yet one too complex for our perceptual systems to retrieve given poor data; the nativist claim is that there is no external syntactic pattern; instead, there is an internal procedure whose character is only partially reflected in its outputs at the ‘surface’. What Prinz requires, then, is some argument that linguistic structure is not so abstract, i.e., that it can be recovered from the surface properties available to the child. Prinz construes this challenge as one to explain the ‘arbitrariness’ and ‘structure sensitivity’ of grammatical rules. Let us take each in turn.

4.1. ‘Arbitrariness’: The Case of Subjacency

Prinz (2002, p. 206) says ‘proponents of an innate language faculty often assert that their opponents are unable to explain why some of our linguistic rules seem idiosyncratic or arbitrary.’ Prinz neglects to cite anyone who argues this way; this is not surprising, for no-one serious does so argue.11 Firstly, ‘rules’ are understood to be arbitrary only relative to surface patterns, i.e., regularity at the surface can only be captured by rules that appeal to notions not marked at the surface, such as structural Case and binding relations. Secondly and consequently, the challenge to empiricism is not to say why ‘rules’ seem arbitrary – a vacuous query – but to explain how the structures the rules govern may be arrived at from surface patterns given that, relative to such patterns, the grammaticality of a structure is, prima facie, arbitrary, unpredictable.12

Prinz attempts to turn the ‘argument from arbitrariness’, as he construes it, back on the nativist. His thought is that arbitrariness of structure would be a likely consequence for a system employed for a purpose it was not designed for, e.g., an ensemble of perceptual systems employed in the realm of language. Ceteris paribus, one would expect regularity to go with a specifically designed faculty, or so presumes Prinz. Thus: ‘if a system was evolved for language, one might expect less arbitrariness. Linguistic idiosyncrasies are better evidence for perceptual piggybacking than for an innate language faculty’ (ibid., p. 206).

This thought is mistaken, for the ‘rules’ are not arbitrary simpliciter. Indeed, in the curious absolute sense in which Prinz understands arbitrariness, the so-called ‘rules’ are not arbitrary at all. The methodology of recent linguistics is governed by the thought that particular constructions may be explained by the interaction of general principles. Theoretically, then, far from wallowing in arbitrary rules, linguistics seeks to explain all idiosyncrasies (modulo the highly marked and the lexical) according to simple, abstract principles. Further, the guiding hypothesis of the more recent minimalist program is that language is a perfectly designed system. The perfection at issue, however, does not pertain to our communicative use of language; in such a sense, language is transparently far from perfect. Rather, language is potentially perfect in the sense that it realises an optimal solution for the integration of independent systems of phonological articulation and conceptual interpretation. For our purposes, the crucial point is that each construction is the summed effect of principles, where the effect of a given principle is often far from evident in the resultant structure. It bears emphasis that this move away from the arbitrary and idiosyncratic happened over twenty years ago; what remains constant is the thought that the governing principles are arbitrary relative to surface patterns, e.g., one cannot assess acceptability on the basis of analogy to other acceptable structures. Without an appreciation of this methodology and the kind of data that support it, one is unlikely to register the force of linguistic argumentation. To see that this is so, let us turn to Prinz’s key example of how the seemingly arbitrary may be accommodated by general perceptual mechanisms.

Prinz (2002, p. 206) targets the Subjacency principle, ‘which has long been paraded as a rule that demands a language-specific explanation’. First off, Subjacency has never ‘been paraded as a rule that demands a language-specific explanation’, for no linguist has ever thought that Subjacency is a rule; it is a principle, i.e., it constrains acceptable structures at a certain level, it does not apply to structures to generate other ones. Further, it is far from clear that there is a single notion of Subjacency rather than a cluster of locality conditions on movement that issue in different levels of acceptability for different constructions in different languages. In general, there are ‘Subjacency effects’, where it is an empirical and theoretical problem to say just what principle or principles are in play that issue in these variable effects.13 Still, pro tem, let us follow Prinz’s statement of the principle: ‘phrases cannot be moved too far, when shifting location from one place in a sentence to another, as in the formation of questions’, where ‘too far’ amounts to ‘leap[ing] over two bounding categories’ (ibid., pp. 206–207).14

Prinz offers an example of this ‘rule’ in operation (my phrase bracketing):


(3) a. [CP [TP Sally learned [CP that [TP the French eat slugs]]]]
    b. [CP What_i did [TP Sally learn [CP that [TP the French eat t_i]]]]?
    c. [CP [TP Sally learned [DP that fact [CP that [TP the French eat slugs]]]]].
    d. ∗[CP What_i did [TP Sally learn [DP that fact [CP that [TP the French eat t_i]]]]]?

According to Prinz (2002, p. 207), ‘according to Chomskyan linguistics, the latter question [3d] is ungrammatical because the noun (‘slugs’) is converted to a question word (‘what’) that must leap over two bounding categories (a sentence [read: TP] and a noun phrase [read: DP]) as it makes its way to the beginning of the sentence. This, they [sic] say, is forbidden by a setting on an innate grammatical rule.’ As will be noted by counting the bounding nodes, the reasoning attributed to the ‘Chomskyan’ linguist is fallacious: the raised wh-object of eat in the acceptable 3b appears to “leap” over two bounding nodes, viz., the two TP projections, but the structure is perfectly in order. I’ll return to the proper understanding of Subjacency shortly; first, a crucial point must be made.

Subjacency is not innate, or at least not in the sense in which Prinz means it. It has long been understood that Subjacency effects vary across languages, i.e., it is simply false to think that there is a universal constraint against a moved element crossing a ‘sentence’ node (a TP node) and a DP node. One seminal idea due to Rizzi (1982) is that Subjacency is parametric, with Italian and other languages having a different selection of bounding nodes (DP and CP, rather than DP and TP) than English. In fact, it is presently unclear if there is any universal uniform Subjacency-like constraint. Thus, as Prinz presents it, Subjacency is not an innate principle, but, at best, a parametric feature of English and related languages. Of course, like any other principle, it is not an a priori matter how Subjacency should be understood. For present purposes, all that is important is that Subjacency does not mark out the same structures in all languages. With this in mind, let us turn to Prinz’s contention.

Prinz (ibid., p. 207) appeals to the work of Ellefson and Christiansen (2000), who ‘set out to refute’ the hypothesis that Subjacency can only be accounted for by appeal to innate linguistic concepts. Their study was in two parts. First, they trained a group of students on two artificial languages that comprised sets of symbol strings corresponding to ‘noun’, ‘verb’, etc. One language – NAT – was English-like in not admitting apparent Subjacency violations; the other language – UNAT – violated Subjacency (see below). When tested on the languages, the students were better at learning NAT than UNAT (overall classification of novel strings into the two languages, and judgments of grammaticality on the two languages, i.e., the students were generally more competent with NAT than UNAT). The second part of the study consisted of training a group of neural networks (40 in all) on the two artificial languages. The results roughly mirrored those attained in the behavioural tests; in essence, the networks were better predictors of new inputs (post-training) that conformed to ‘Subjacency’ than ones which did not. From this, Prinz (2002, p. 207) concludes:

Simple associative networks that learn by ‘hearing’ various sentences in a language find that language easier to learn if it does not allow words to move too far. Subjacency, on this view, is not an innate grammatical principle but a bias that emerges from domain-general systems that rely on sequential learning.

Prinz (op. cit.) readily points out that one such ‘result’ on one such ‘rule’ ‘hardly qualifies as a refutation of Chomsky.’ But it does ‘show that we must not draw a priori conclusions about which rules require language-specific mechanisms’ (ibid., p. 208). Well, no-one serious does draw a priori conclusions, certainly not Chomsky. What, then, does the study tell us about Subjacency? It is difficult to say.

We may leave to one side the general complaints against connectionist modeling of linguistic competence, e.g., the learning development of networks fails miserably to match up with the linguistic ontogeny of humans, and the capacity of networks to acquire any symbolic system stands in stark contrast to the highly restricted space of human language (cf. Marcus, 2001). Even if we assume that network-like processes offer an antecedently plausible model of linguistic cognition, the present research does not further the claim.

First let us look briefly at Ellefson and Christiansen’s (2000) reasoning; then we can return to Prinz’s reading of the study. Ellefson and Christiansen appeal just to the ‘English’ version of Subjacency, and the test sentences of the artificial languages were based on English (see ibid., p. 447, ‘Table 1’). Their stated purpose is to provide an ‘alternative explanation [that] suggests that subjacency violations are avoided, not because of a biological adaptation incorporating the subjacency principle, but because language itself has undergone adaptations to rule out such violations in response to non-linguistic constraints on sequential learning’ (ibid., pp. 646–647).15 The upshot of the experiment is that “constructions involving subjacency violations [are not only] hard[er] to learn in and by themselves, but their presence also makes the language as a whole harder to learn” (ibid., p. 450). It is difficult to say what the behavioural and network studies might intimate in general. What is clear is that the supposition here is false and the conclusion, whether it is supported by the data or not, is also false.

First off, it is unclear what relation NAT and UNAT have to English, or any other natural language; a fortiori, it is unclear what would constitute a Subjacency violation in the languages. To see the problem here, consider an example Ellefson and Christiansen provide:

(4) a. Who (did) Sara ask why everyone likes cats?
        WH N V WH N V N (NAT)

    b. ∗What (did) Sara ask why everyone likes?
        WH N V WH N V (UNAT)

The idea is that the symbol strings of the artificial languages mirror, in effect, the content of the Subjacency principle, i.e., to learn the languages is to learn that Subjacency applies or doesn’t apply, respectively, and here there are no innate linguistic concepts in play, just the order of the constituents of the symbol strings. Transparently, however, these strings do not reflect Subjacency. It matters to Subjacency from where items move, and from where items might have moved in a structure can be determined by knowing the selection restrictions inherent to the words of the structure. The learning of the symbol strings of NAT and UNAT does not provide this. To see the point here compare (4)a–b with (5)a–b respectively:

(5) a. ∗What (did) Sara ask why everyone likes cats?
        WH N V WH N V N (NAT)

    b. Why (did) Sara ask what everyone likes?
        WH N V WH N V (UNAT)

The structure of the symbol strings of either artificial language does not provide the means to reflect Subjacency. In simple terms: it matters whether the ‘question-word’ is an adjunct (e.g., why, how) or an argument (e.g., what, who), for adjuncts can move from adjoined positions on TPs, arguments cannot be so adjoined (by definition), and so cannot move from such positions; arguments can only move from subject or object position of a verb. So, (5)a is deviant because there is no position from which what might have moved. (5)b is perfectly acceptable and in line with Subjacency: what raises from object position of like to medial [SPEC CP] in line with Subjacency, and why raises from an adjoined position to the highest [SPEC CP]. Thus, whatever difference there is between NAT and UNAT, it does not appear to reflect a Subjacency violation, for strings of UNAT that putatively violate Subjacency can easily be seen to be consistent with it, and strings of NAT that putatively cleave to Subjacency can easily be seen to be deviant. Let us now look a bit more closely at the notion of Subjacency.
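The difficulty can be stated mechanically: the category strings for (4)a and (5)a, and likewise for (4)b and (5)b, are literally identical, so no statistic computed over the symbol sequences alone can separate the acceptable from the deviant member of either pair. A minimal sketch (Python, purely illustrative; the encoding follows the glosses above and the labels are my own):

```python
# Category glosses of examples (4) and (5), written as the kind of
# sequential symbol strings the artificial-language learners were given.
glosses = {
    "4a (acceptable)": "WH N V WH N V N",  # Who did Sara ask why everyone likes cats?
    "5a (deviant)":    "WH N V WH N V N",  # *What did Sara ask why everyone likes cats?
    "4b (deviant)":    "WH N V WH N V",    # *What did Sara ask why everyone likes?
    "5b (acceptable)": "WH N V WH N V",    # Why did Sara ask what everyone likes?
}

# Within each pair the acceptable and the deviant sentence share one and
# the same symbol sequence, so any learner trained only on such sequences
# is bound to assign them the same status.
assert glosses["4a (acceptable)"] == glosses["5a (deviant)"]
assert glosses["4b (deviant)"] == glosses["5b (acceptable)"]
```

Whatever the networks were tracking, then, it was not the distinction between structures that violate Subjacency and structures that do not.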

As explained above, Subjacency is not an ‘innate UG principle’ (ibid., p. 450); there is not a single principle but a cluster of constraints on movement that produce different levels of acceptability in different languages. So, it is not so much that we should seek alternative explanations to the one which says that the English version of Subjacency is an adaptation, a species trait, but that the very notion that it is such a trait is mistaken. Just so, an explanation that targets English Subjacency as a general constraint that will evolve under selective pressure on efficient sequential learning is also false; for there just is no such general constraint. We may see this clearly by turning to Ellefson and Christiansen’s conclusion.

Given that Subjacency (again, the English version) is not a universal feature, the idea that a non-Subjacency language will be harder to learn is preposterous. Are we to think that Italian or Chinese children have more difficulty learning their language than English children?16 Further, even in English, the effects of Subjacency are quite variable, with apparent exceptions (Culicover 1999, pp. 230–232). Now, of course, if we do assume that Subjacency is a parameter-like constraint, then it is coded in UG. The crucial point, however, is that the principle has different effects in different languages: all that is universal is that there is a descriptive constraint on what nodes a moved element may cross in one cycle. So, the presence of precisely what feature of symbol strings is supposed to make a language easier to learn? The experiment was wholly based on clear cases of English Subjacency violations; so, Ellefson and Christiansen appear to commit themselves to the claim that English is easier to learn than Italian or Chinese.


It is, then, independent of the status of NAT and UNAT vis-à-vis Subjacency, quite unclear what, if anything, either of the studies might tell us about natural language and Subjacency in particular. This is not surprising, for, as mentioned above, language acquisition does not proceed by a regime of error correction (see below). What the studies might show us, which would not be surprising, is that it is less costly (computationally speaking) to determine local relations between items than distant ones. It might be that the studies simply reflect this, with a resulting bias against Subjacency violations. It bears emphasis that the results of the studies were far from absolute; for example, the overall student classification rate of novel strings was 59–54% in favour of NAT; for the networks the difference was slightly narrower. Thus, without a differentiation between different types of movement (argument and adjunct), which was not tested for, all we can glean is a bias against long movement, or perhaps movement that crosses a certain repeated symbol sequence. Does this give us some ground for thinking that Subjacency effects may be explained by sequential relations? As we saw, Prinz reads the results as showing that Subjacency might be simply to do with short vs. long movement. We might consider this suggestion as a natural generalisation of the intent of the studies, i.e., even though Prinz, Ellefson and Christiansen are unaware of the complex variability of Subjacency, we might say that their hypothesis is that Subjacency effects in general are due to a prohibition on long movement, which is differentially realised according to different sequences of symbols. It is difficult not to read the studies in this way, for the test strings reflect nothing other than sequential relations of distance and paired repetition (i.e., the strings have no hierarchical structure). Let us, then, consider this suggestion.

Subjacency rules out certain long distance movement. So, if a system is, in general, biased against long distance movement, because, say, it favours local relations for non-linguistic computational (search) reasons, then it will, perforce, be biased against those long distance relations that constitute Subjacency violations. It might now be thought that Subjacency in general makes for a pattern in which one does not have to search far into a new sentence to see that it belongs to a Subjacency language – a veritable boon to a statistical learning mechanism. Yet this is simply not the case: a language is free to contain constructions of long distance movement. That is, Subjacency constrains certain long distance movement just because it constrains movement, not because it constrains long movement. Otherwise put, Subjacency might indeed be construed as a constraint on computational search (Chomsky 2000a), but the constraint does not issue from sequential demands of the kind Ellefson and Christiansen tested for, but from the domains formed by head–argument relations, i.e., hierarchical syntactic structure. Prinz, Ellefson and Christiansen are perhaps confusing two quite separate issues. Generally, we do indeed have a bias against long distance movement, in that we rarely utter such sentences, no doubt because they are relatively more difficult to parse. It might even be that the language acquiring child relies on short movement constructions, although there is no data to suggest that this is so.17

The point remains, though, that mature competence supports indefinitely long movement. The initial formulation of Subjacency had, in fact, as much to do with licensing long distance movement as it had with restricting it (Ross 1967; Chomsky 1973). This double aspect applies to principles generally, and is enshrined in Rizzi’s general movement constraint of relativized minimality.18 An identification of a potential bias against long movement, therefore, has, in itself, very little to do with Subjacency; we certainly cannot predict Subjacency from a performance bias against long movement. The idea that the principle might reduce to such a bias is merely a signal that the principle has been misunderstood. Let us see this by way of some examples.

Consider:

(6) a. Bob asked what_i [DP the film with all those Russians wandering about, praying, killing and casting iron bells [CP that [TP they saw just the other night at the local art house]]] was supposed to mean <what>.

    b. Why_i do [TP you think [CP that [TP Bob reckons [CP that [TP Bill believes [CP that [TP Russian films are depressing <why>]]]]]]]?

These sentences are perfectly acceptable and do not constitute Subjacency violations.19 First consider (6)a. Patently, the distance the wh-object of mean may move here is unbounded, for we can continue to embed complements in the bracketed DP. Further note that these complements may otherwise act as barriers to movement, i.e., they may be TPs and DPs. The acceptability of (6)a is due to the bracketed DP being a so-called island to movement, i.e., phrases can move within the island (e.g., the head of the relative clause from the object position of see) and across it (e.g., what), but they cannot move from inside the island out. Subjacency was initially proposed as part of the explanation of such island data. Now contrast (6)a with (7):

(7) a. You met a man that wrote what?
    b. ∗What did you meet a man that wrote?

(7)a is a perfectly acceptable echo question. The relatively short movement (linearly speaking) of the wh-object, however, results in (7)b – a clear Subjacency/island violation. Here the wh-object moves outside of a relative clause, and crosses three bounding categories in one leap (right to left: TP, DP, TP). Per relativized minimality, the wh-object is restricted to SPEC landing sites and there are no SPEC landing sites available other than the highest [SPEC CP] (we take (i) the SPEC of CP to be occupied by the empty antecedent of the trace/copy that has amalgamated with the overt COMP, and (ii) the SPEC of TP to be occupied by the DP you). In contrast, here we see why (6)a is acceptable: there is a landing site – the SPEC of the head C that – to which what may first move, crossing just the single TP, then from there it may move to the highest [SPEC CP], crossing just one DP. Thus, the island phenomena are explicated by Subjacency constraints on movement. The same remarks hold for (6)b, with the why adjunct cycling through the vacant SPEC positions of the clause heads. Again, the movement is long distance, but no violations arise. The relevant difference between (6) and (7)b is one of structure, not distance traversed. Whatever the status of Subjacency, it does not reduce to a bias in favour of short movement.20

The interesting issue is why English has certain locality constraints, and exhibits related phenomena, such as parasitic gaps, while structures in other languages have distinct constraints. The network results do not so much as intimate answers here. They do not even indicate that Subjacency-like constraints could be acquired by pattern identification. This is because Subjacency is a language-specific principle; its violations are not even identifiable, still less explainable, by appeal to symbol sequences. Like other linguistic principles, recourse to the sui generis interfaces of the faculty is necessary. What holds for Subjacency holds throughout linguistic structure. We shall see another example of this in the next section.


4.2. ‘Structure-Dependence’: The Case of Aux-Sub Inversion

Prinz’s second burden is to deflate the common claim within linguistics that grammatical ‘rules’ are inherently structure sensitive. The thought here is that the structure in question is understood to be essentially abstract, not dependent on, or recoverable from, the occurrent properties of the primary linguistic data – pld. Thus, if the claim holds and, as we are assuming, the generalisations of linguistic theory are accurate, then no empiricist model of the kind Prinz seeks to defend will be adequate to explain the linguistic phenomena that are currently accommodated by the abstract ‘rules’. In essence, the problem for the empiricist here is just another version of the one rehearsed above: the empiricist is obliged to reduce abstract principles to ones definable in terms of general processes and available data.

Prinz targets Chomsky’s long-standing example of auxiliary-subject inversion (aux-sub) in the formation of polar interrogatives (Yes–No questions). The inversion is familiarly witnessed in mono-clausal pairs such as the following:

(8) a. Bob is tall.
        Is Bob tall?

    b. Bob will sing.
        Will Bob sing?

    c. Bob can swim.
        Can Bob swim?

Here we see the auxiliary verb (for present purposes: the perfect have, the copula be, and modals) invert position with the subject DP in the move from declarative to interrogative mood. Competent speakers of English, we may say, in some sense know an ‘aux-sub’ rule, where this knowledge constitutes the speakers’ understanding that the members of the pairs are related as declarative to interrogative.21 Prima facie, the rule at issue appears to be non-structural, i.e., it can be expressed without recourse to syntactic categories. For instance, a rule that certainly works for the above cases is:

(S-I) To form an interrogative from a declarative, move the first ‘is’, ‘will’, ‘can’ . . . to the front of the sentence.

This rule relies merely on listing the morphological forms of the English auxiliary system (which is finite) and linear order. No abstract categories are mentioned. Thus, it appears that a child could easily learn the rule without innate information: all that it requires is there in the pld. The entertained rule, however, is transparently inadequate. One problem is that a sentence can feature as many auxiliary verbs as it does clauses, and the movement of any old auxiliary will not produce a coherent interrogative. An adequate rule must target the correct auxiliary. Consider:

(9) a. Bob, who is tall, can beat Bill.
        Can Bob, who is tall, beat Bill?

    b. Bob can beat Bill, who is short.
        Can Bob beat Bill, who is short?

    c. Bob, who is tall, can beat Bill, who is short.
        Can Bob, who is tall, beat Bill, who is short?

In a. the final auxiliary moves, in b. the first, in c. the middle one. Patently, any rule that merely instructs us to move a piece of morphology will not generalise, for the piece which should move can occur in any ordinal position in a sequence of auxiliaries. In simple terms, syntax can’t count! An alternative presents itself:

(S-D) To form an interrogative from a declarative, move the matrix auxiliary to the position immediately preceding the subject DP.

Suffice it to say, this is highly informal, although still quite abstract. S-D instructs us to find the main clause, find the auxiliary there that carries tense (as opposed to perfect or progressive aspect; e.g., may in Bill, who is tall, may have been sad), and, effectively, to move it to the front of the sentence.22 If, then, the child does fixate on a ‘rule’ at least as abstract as S-D, it requires sensitivity to a structure such as (10):

(10) [[[Bob][who is tall]] [can beat Bill]]

The crucial point here is that such structure appears not to be evident in the pld in two respects. Firstly, the structure is hierarchical, not marked either morphologically or linearly. Secondly, children appear to be able to fixate on the correct 'rule' from the mono-clausal cases, which are consistent with the morphological/linear rule S-I. If this is right, then children uniformly target a structure-sensitive rule on the basis of data that does not necessitate it, i.e., they understand the pld via an abstract structure that is not confirmed by the data. As it stands, a rational conclusion is that sensitivity to linguistic structure is native to the human mind/brain. This sort of argument is the one Prinz seeks to spike.
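The asymmetry between the two rules can be made concrete with a toy sketch. This is hypothetical code, not anything from the literature: S-I is trivially computable over the flat word string, whereas S-D requires identifying the matrix clause, which the string alone does not supply; accordingly, the matrix auxiliary's position is simply stipulated by hand below.

```python
# A toy auxiliary list standing in for the finite English auxiliary system.
AUX = {"is", "are", "was", "were", "can", "will", "must", "have", "has"}

def s_i(tokens):
    # Structure-independent S-I: front the linearly first auxiliary.
    i = next(j for j, w in enumerate(tokens) if w in AUX)
    return [tokens[i]] + tokens[:i] + tokens[i + 1:]

def s_d(tokens, matrix_i):
    # Structure-dependent S-D: front the matrix auxiliary. Finding it
    # requires hierarchical structure, so its index is given by hand here.
    return [tokens[matrix_i]] + tokens[:matrix_i] + tokens[matrix_i + 1:]

decl = "Bob who is tall can beat Bill".split()
print(" ".join(s_i(decl)))     # is Bob who tall can beat Bill -- deviant
print(" ".join(s_d(decl, 4)))  # can Bob who is tall beat Bill -- correct
```

The point of the sketch is only that the structure-independent rule is computable from the pld as the child linearly receives it, while the structure-dependent rule presupposes exactly the hierarchical information whose source is in question.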


About considerations along the above lines, Prinz (2002, p. 208) writes: 'Chomsky concludes that the [language acquiring] child must be using innate, language specific mechanisms. This conclusion is hasty' [my emphasis]. The conclusion is indeed hasty; in fact, absurd. The only source Prinz cites is Chomsky (1975), whose first chapter contains a 'popular' discussion of aux-sub inversion (ibid., pp. 30–33). The discussion is presented via the device of a rational alien scientist who has come to study human cognition. Being unburdened by 20th century behaviourist/empiricist dogma, the scientist, when observing acquisition of polar interrogative competence, is as predisposed to seek an innate explanation as an empiricist one. There is, however, no ready empiricist explanation, but there is an innate one, and, ceteris paribus, it does explain the phenomenon: why children target a rule that goes well beyond the apparent data available to them and why they do not make mistakes with constructions such as those in (9) when they are only exposed to those in (8). For Chomsky (ibid., pp. 32–33), the alien would judge that 'the only reasonable conclusion is that . . . the child's mind contains the instruction: Construct a structure-dependent rule, ignoring all structure-independent rules. The principle of structure-dependence is not learned, but forms part of the conditions for language learning.' The alien, however, is a scientist, not a metaphysician or a normative epistemologist; his conclusion is a hypothesis that he seeks to 'corroborate' by investigating other constructions to see if the same pattern is witnessed. The alien scientist, of course, is just a depiction of how we should approach the issue. Rather than hasty conclusions, we have a general hypothesis that is supported in one case at the expense of an alternative empiricist one.

Given Prinz's misunderstanding of the argument, we may expect his response to it to be similarly off-centre. Indeed it is. He claims that while the S-D rule is correct and more complex than the linear rule S-I, it does not follow that the child must be working with the abstract concepts of S-D rather than the linear/morphological ones of the inadequate rule S-I. The suggestion is this: 'the child considers her options [for movement] by considering small groups of words at a time and assesses the consequences of moving' a given auxiliary within that group (Prinz 2002, p. 209). So, consider the sentence, Bob who is tall can beat Bill. The child, we are to suppose, considers, say, the group 'Bob who is tall' and reasons that moving the 'is' results in the 'statistically bizarre phrase' 'Bob who tall', while moving the 'can' from 'can beat Bill' results in a 'perfectly familiar phrase' 'beat Bill'. So, the first choice involves a 'net loss in statistical probability', and this 'would not be the ideal choice for a system that worked by repeating familiar patterns'. Thus, we can account for aux-sub competence without appeal to abstract concepts.
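One way to picture the proposal is as a count of unattested word transitions in the residue an extraction leaves behind. The following is a toy sketch, not Prinz's own implementation: the 'database' of phrases and the word groups are invented here purely for illustration.

```python
from collections import Counter

def bigrams(tokens):
    # Adjacent word pairs: the crudest notion of a 'familiar pattern'.
    return list(zip(tokens, tokens[1:]))

# A stand-in 'database' of phrases drawn from simple pld (invented here).
PLD = ["Bob is tall", "Bill is short", "Bob can beat Bill",
       "who is tall", "can beat Bill", "beat Bill", "Bob who is tall"]
FAMILIAR = Counter(bg for s in PLD for bg in bigrams(s.split()))

def unfamiliarity(group_residue):
    # Count unattested bigrams in the word group left behind by extraction.
    toks = group_residue.split()
    return sum(1 for bg in bigrams(toks) if FAMILIAR[bg] == 0)

# Prinz's example, 'Bob who is tall can beat Bill', with the 'small groups
# of words' given by hand: moving 'is' leaves 'Bob who tall', moving 'can'
# leaves 'beat Bill'.
print(unfamiliarity("Bob who tall"))  # 1 -- statistically bizarre residue
print(unfamiliarity("beat Bill"))     # 0 -- perfectly familiar residue
```

On this scoring the child would move 'can' rather than 'is', which is the right result for this particular sentence; the arguments below turn on whether any such mechanism can deliver the right result in general.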

Much has been written for and against such empiricist models that see language acquisition as a mere instance of general statistical learning. Rather than go over the familiar debates, let us just concentrate on the proposal at hand. We shall see that it both begs the crucial issue and is demonstrably inadequate, i.e., it makes false predictions.

Firstly, the core idea of the suggestion is that the child seeks familiar patterns and so, if faced with some number of aux movement options, will select the aux whose movement will issue in the least disruption, where disruption amounts to groups of words that include the extraction site which are unfamiliar, improbable. The notion of probability here obviously must pertain to just what the child is aware of, rather than some abstract notion of grammaticality. The suggestion thus presupposes that the child has some 'database' of constructions drawn from its pld upon which it may define, say, a frequency distribution. Without such a database and metric it would make no sense to think of a child finding one construction more or less probable, and so it would make no sense to think of the child as assessing the probability of the results of movement options. But the database hypothesis, if it is to do the work asked of it in the particular case of aux-sub inversion, begs the general question at issue.

The question is how the child can determine which are the correct constructions in the first place. Assessments of probability will not help here, for any given sequence of words is initially improbable, and remains so, idioms and the like apart. Unless we imagine the child to be parrot-like – restricted to imitating what it is exposed to – it must, at some stage, generate 'correct' constructions that are novel, such as bi-clausal polar interrogatives. No-one seriously doubts this; the problem for the empiricist is to specify the (inductive) mechanism that can operate on the pld to produce the correct grammar of which novel constructions are a part, without recourse to abstract linguistic concepts. The problem may be posed by asking: Why isn't the child's database just a rag-bag of constructions that can be segmented along every possible dimension?


We can see the force of this query by considering a familiar subject-object asymmetry, under which transitive verbs and their objects are understood as 'units', but subjects and verbs are not. Thus: (i) the infinitive form lacks a (spelt out) subject, but there is no transitive construction that essentially lacks an object; (ii) passive constructions may lack subjects; they cannot lack objects: A candidate was elected/*Elected by them; (iii) subject positions may be filled with expletives; object positions cannot be; (iv) objects and VPs may overtly move; subjects cannot: Oranges, I like, but not apples/*Like oranges but not apples, I; (v) VPs and their objects undergo ellipsis; subjects cannot: Bob loves Bill, and so does Mary/*Bob loves Bill, and does love Bill too; (vi) VPs may work as semantic units which determine the semantic role of the subject; the converse relation is not witnessed: Bob broke his arm may be read with Bob as the agent or the patient of break, with the VP meaning invariant.

The crucial feature of this asymmetry is that it is perfectly general – every transitive construction exhibits it – but is only marked structurally, in terms of movement and interpretational possibilities. Thus, if the child is to segment the objects in its database so that they might properly serve as the basis of its assessment of movement options, then it must already, inter alia, have an understanding of the movement options that constitute the asymmetry. But this just means that a child following Prinz's strategy cannot discover the asymmetry: the child would be asked to employ movement criteria to form the basis of the assessment of movement options.

Might not Prinz argue that this is a virtuous circle, with a bootstrapping of the structural form of the simple sentences from movement options? This is a non-starter, for the movement results in 'groups' of words that are only witnessed via such movement; e.g., 'like but not apples', 'so does Mary'. Hence, there is no movement-independent basis for accepting the 'deviant' groups, and so none for accepting the movement. In effect, then, Prinz presumes that the child has somehow managed to infer correct phrase structure from an initial exposure to a simple subset of the language, and can then work out the correct aux-sub rule by excluding movement options that generate non-phrases. Even if this method would give the correct results, which it doesn't (see below), what we presently want to know is how the child can fixate on phrase structure tout court, exemplified by the aux-sub case, without appeal to abstract structure.

Secondly, Prinz's suggested mechanism does not generalise to the full range of cases of aux-sub inversion. So, even if the suggestion were plausible for the cases so far looked at, it cannot be the strategy children use in general. Consider cases where the declarative does not feature an auxiliary verb, e.g.,

(11) Ducks eat.

Here there is no auxiliary verb to invert with 'ducks', but a polar interrogative clearly should be available, one which may be answered by 'Yes, ducks eat'. The only possible option available appears to be to move the main verb itself: Eat ducks. Prinz's suggestion offers no interdiction here. The 'group' is a perfectly good verb phrase that is, if you like, probable. Indeed, with the right intonation/stress it can be an imperatival or interrogative sentence. Yet it cannot be the interrogative form of (11). In fact, we know that English, unlike French, does not attest to overt (main) verb movement. But our child does not know this; as far as she's concerned, if you will, she is learning language, not English. In short, checking (11) for improbable clusters of words post movement will not enable the child to learn the correct form. The correct form, of course, is Do ducks eat?, which, contrary to appearances, is a result of aux movement. This can be seen where the intransitive is not the bare verb stem:

(12) a. That duck eats.
     b. Does that duck eat?

Here the inflection features under the T(ense) head are morphologically realised on the verb in the declarative. In the interrogative case, we take the inflectional features under T to be morphologically realised under the CP head, where they are supported by the dummy verb do, with the verb left in its bare infinitive form. Just so with (11), where the inflectional element is phonologically null. This would explain, inter alia, why Does ducks eat? is deviant: if the inflection isn't overtly spelt out on the verb, as in a., then it must be realised under CP. It would be of no benefit to Prinz to doubt the existence of phonologically null elements, for the confounding data are independent of them: such elements serve to explain the data, which Prinz's suggestion does not do.
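The do-support pattern just described can be rendered as a small sketch. This is hypothetical code, not the paper's machinery; the two-entry lexicon is invented for illustration.

```python
# Toy lexicon: inflected verb form -> (bare stem, matching form of dummy 'do').
# For plural 'ducks', the inflection on 'eat' is phonologically null, yet it
# still surfaces on 'do' in the interrogative.
BARE = {"eats": ("eat", "does"), "eat": ("eat", "do")}

def polar(subject, verb):
    # The inflection that would be spelt out on the verb in the declarative
    # is instead realised under CP on dummy 'do'; the verb stays bare.
    stem, do_form = BARE[verb]
    return f"{do_form.capitalize()} {subject} {stem}?"

print(polar("ducks", "eat"))       # Do ducks eat?
print(polar("that duck", "eats"))  # Does that duck eat?
```

The sketch also makes plain why Does ducks eat? is deviant: do can only carry the inflection that T actually bears, which for plural ducks is null.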

Thirdly, Prinz's suggestion does not even work for bi-clausal cases. Consider:

(13) Bob who can beat Bill is tall.


According to Prinz's hypothesis, the child selects for movement the auxiliary that produces, relative to other options, the more familiar/probable pattern. Let us test this. There are two auxiliary verbs potentially apt for movement: can and is. The movement of 'can' produces the following 'groups' in which the extraction site occurs minus the moved element:

(14) Bob who beat Bill is tall (sentence)
     Bob who beat Bill (DP with relative)
     who(ever) beat Bill is tall (sentence)
     who beat Bill (sentence)

In short, moving 'can' leaves no deviance in its wake. Now consider the movement of 'is':

(15) *Bob who can beat Bill tall
     *who can beat Bill tall
     *can beat Bill tall
     *beat Bill tall
     *?Bill tall

The only potentially non-deviant result is the small clause Bill tall, as in, say, Bob considers Bill tall. But this, as Prinz might say, is an improbable construction. Besides which, the small clause requires a clause-taking verb, which beat is not. It might be thought that the deviance here is semantic rather than syntactic – one cannot beat a person tall, but one can beat a panel flat. For our purposes, this distinction is irrelevant; all that is important is deviance – hence low 'probability' – whatever its source might be. So, if Prinz's suggestion is right, after children have done their calculations, they should move 'can', but this, of course, they do not do.

(16) a. *Can Bob who beat Bill is tall
     b. Is Bob who can beat Bill tall?

This is striking. With (16a), deviance only arises with the post-movement construction, with all other phrases inclusive of the extraction site being perfectly coherent. The opposite is true for (16b), with all the phrases in the wake of the auxiliary being deviant (small clauses apart). Here's the rub. The child, we are assuming, is trying to work out the correct rule, and so has neither sentence of (16) in its database. Without the benefit of the marked comparison of (16), however, the child who follows Prinz's strategy will arrive at the wrong result. On the other hand, a 'rule' which targets the matrix inflection will move 'is' and ignore, as it were, the disruption its extraction leaves in its wake. As Chomsky, quoted above, put it, this is because the child does not so much as consider structure-independent rules. Further, much like the case of the lack of English (overt) verb movement (witnessed in French), the movement of 'can' out of the relative clause will constitute a Subjacency violation (not witnessed in Italian). Thus, if Subjacency is a parametric value of English, then the child will not so much as 'think' of moving 'can', regardless of any net probability function it might be trying to work out.

Fourthly, Prinz's suggestion is at odds with the experimental data. It was claimed above, after Chomsky, that at no stage do children make errors consistent with a structure-independent rule; that is, they invariably move the matrix inflection (whether auxiliary or verb stem affix). If this is so, then it reasonably follows that children are not relying on the pld, for children have many varied data but, ex hypothesi, all arrive at the same understanding. Conversely, if children were relying on the pld, then one would predict variation, which is not witnessed. The experiments and back-up studies on aux-sub inversion are striking.23 The results, for our purposes, are that not a single child tested either moved the 'wrong' inflectional element under elicitation or found such a construction to be acceptable. What is more, while the children did make errors, they were phonological/morphological (relative to English), not structural. For example, it is not uncommon, and was attested in the studies, for children to 'double' a moved element, so that it occurs at both extraction and landing sites. This is witnessed with medial-wh, as in What do you think what's red? Similarly, errors of the form, Is the man who is beating a donkey is mean?, were witnessed. (This error is consistent with a structure-independent rule, but if the children were doubling on such a rule, then one would predict errors such as Is the man who beating a donkey is mean? – no such errors were recorded.)

Two points. Firstly, some languages, in effect, do allow doubling (e.g., Gaelic and Japanese), and so, relative to UG, i.e., what is native to the species, the doubling errors of English children may be understood to be performance based. After all, no-one has ever thought that children don't make errors in acquiring their language (adults make errors); the relevant issue has only ever been whether children typically make errors inconsistent with UG. Secondly, according to the independently motivated 'copy theory', all movement is a process of copying (repetition) the 'moved' item higher up the structure. On this view, the errors are merely ones of 'forgetting' that the lower head should not be phonologically spelt out; it is not an error of moving the wrong item or inserting an item otherwise unrequired.

Any such hypotheses as these attract doubts about controls, the offering of alternative interpretations, and counter-studies. Such is the way in any science. Still, if the results are remotely on the right lines, then Prinz's suggestion is a non-starter. First off, as we saw just above, the hypothesis that children work out the cost of local disruption caused by movement matched against a database of common phrases predicts that children will make certain errors. The cited studies indicate that these errors are never made; that is, Prinz's hypothesis has false empirical consequences. Even if we were to set this aside, Prinz's hypothesis fails to explain any of the results; for example, why are errors in general rare? Prinz's hypothesis predicts a high error rate during a child's collection and marking of phrases as 'familiar', which then falls off as the database becomes sufficiently large to handle likely novel cases. But this is not the pattern of language acquisition: at no stage are children randomly in error. Also: Why should certain errors occur, but not others? As far as Prinz's hypothesis goes, all types of error should occur equally, at least initially; certainly the hypothesis lacks the resources to explain doubling errors. Indeed, since adult doubling is too rare to make any difference, as are aux-sub errors, i.e., children are typically exposed to neither, if the hypothesis predicts the absence of aux-sub errors, the hypothesis should perforce predict the absence of doubling – another false consequence.

5. CONCLUSION

In sum, then, we see that the essential abstractness of linguistic structure that militates for nativism against empiricism remains intact. Obviously, much more could be said in both riposte and counter-riposte to the above arguments. They suffice to show, however, that proxytypes are not apt to begin to explain the structure of linguistic competence. Whatever other virtues the theory possesses, it fails to accommodate perhaps the central aspect of human conceptuality.


NOTES

1 The point here is not that we don't know a language; any such claim would be a silly piece of linguistic legislation. The point, rather, is that what we know when we know a language – the object of linguistic theory – is not a set of propositions.

2 See Gordon (1986) for the seminal statement of the simulation theory; for debate, see Davis and Stone (1995a, 1995b).

3 It bears noting that the sui generis nature of linguistic competence is premised upon the uniqueness of the interface ensemble, not on the faculty comprising essentially linguistic rules. For Chomsky (2001b, 2002, 2004), a principled explanation of linguistic competence will view the faculty proper – syntax – as being a non-language specific combinatorial system; its unique features arise from its production of <PHON, SEM> pairs.

4 Erroneously, Prinz (ibid., p. 322, n. 4) attributes such a view to Fodor (1983). Fodor's view is that there is a parser module (a language perception device) that employs UG, with UG itself not being definable on perceptual grounds. See Fodor (1983, p. 134, n. 28; 2000, pp. 77–78). Ultimately, I don't think that this position is coherent, but it is certainly not a species of empiricism.

5 For the classic discussion of such cases, see Yngve (1960), Miller and Chomsky (1963), and Chomsky (1965, chp. 1, Section 2).

6 My thanks go to an anonymous referee for the analogy with Escher.

7 The problem arises with the nominative pronoun I serving as the subject of the elliptical clause. The preposition than requires an accusative object (Bill, them, me, etc.) for a comparison to hold; that is, the structure is read as meaning the same as Many more people have been to Paris than me. Here, than me is an adjunct licensed by more. In the actual structure, than appears as a phrasal coordinator and so fails to introduce a comparison class for more. The 'illusion' is enabled by acceptable structures such as She has been to Paris more than I have (been to Paris).

8 TP = tense phrase; DP = determiner phrase; CP = clausal phrase. In the present cases, at least, the labels we assign to the constituent structures are far less important than the bracketing of the constituents. Labels differ from theory to theory.

9 I do not here suggest that such creativity is restricted to language; it appears to be a general feature of thought. An anonymous referee pointed out that it is also a feature of what we might call perceptual imagination, a point recognised by 18th century empiricists. Again, however, the notion of perception here operative is the automatic, dedicated module notion that Prinz appears to commend. By definition, the operations of such modules are not free. More generally, the deliverances of our senses appear to be outside of decision: we cannot, say, decide to see what we see. The extent to which our senses are open to decision is perhaps the extent to which our perceptual mechanisms are not encapsulated, dedicated devices.

10 It is not clear from Prinz's discussion whether this idea of an emergent faculty is intended to be read diachronically or synchronically. The only argument Prinz provides suggests that he has the former construal in mind – see below.


11 'Rules' have not played a substantive role in generative linguistics for over twenty years. The demise of phrase structure rules was brought about by their redundancy given the requirement that words in the lexicon must be marked for sub-categorization and selection as idiosyncratic features; see Lasnik and Kupin (1977), Chomsky and Lasnik (1977), Stowell (1981), and Chomsky (1981). Likewise, given an enriched lexicon, and constraints on how lexical features may be realised, transformation rules were seen to be at best redundant, and were replaced with general movement operations such as 'move/affect α' (Chomsky 1981; Lasnik and Saito 1992). The minimalist program extends this general trend in ways which need not detain us.

12 It might be, of course, that the language faculty is arbitrary in the sense that if evolution had proceeded in a different way, then UG would be different from how it is now. Chomsky for one rejects even this modest sense of 'arbitrary', speculating that the character of the faculty arises from physical constraints, not ones of natural selection (see, e.g., Chomsky 2000b).

13 Chomsky (2000a, 2001a) has proposed that UG operates by a phase impenetrability constraint (PIC) that may explain subjacency effects. Roughly, PIC means that at certain stages or phases of a derivation (Chomsky speculates just full verbal projections and full clauses), complement structure is spelt out to the respective interfaces and so is no longer there to move higher up the structure. For the purpose of the following, we may eschew this recent development, for Prinz's arguments do not fail just in its light.

14 Prinz appears to have in mind the classic notion of Subjacency, but he misstates it: the crucial constraint is that the moved element does not cross more than one bounding node in one cycle. For the central seminal discussions of Subjacency, see Chomsky (1973, 1981) and Rizzi (1982).

15 Curiously, Ellefson and Christiansen presume that Subjacency is understood to be an adaptation. It is difficult to see what could conceivably be adaptive about Subjacency. Issues of phylogeny appear to be wholly orthogonal to the issue. They perhaps become relevant if Subjacency is misunderstood as a 'rule' to do with sequential learning. See below.

16 Of course, they might, but as a matter of fact they don't.

17 Lightfoot (1991) conjectures that all parameters are triggered by 0-embedded clauses, which, ipso facto, means that they are not triggered by clauses that feature long distant movement. The developmental issue, however, is to do with clauses, not movement; clauses can be multiply embedded without featuring overt movement at all.

18 Rizzi's (1990) notion of relativized minimality constrains movement such that X0s (heads, such as modals) must land at head positions and cannot skip head positions, and XPs (complements, such as wh-phrases) must land at SPEC positions and cannot skip SPEC positions – see below.

19 Ellefson and Christiansen give no indication that they controlled for long distant movement consistent with Subjacency. Further, they understand Subjacency to pertain just to wh-movement, which is in fact false: it applies to movement generally.

20 According to Chomsky's (2000a, 2001a) PIC, the data will be explained in more or less the same way, with the wh-items moving to the edge of their phases available for still higher movement, with the complements of the phases spelt out.


21 A rule of aux-sub inversion was a key transformational rule of early generative grammar. Linguistic theory has long since dropped any such rule. We may retain talk of 'rules' simply as a convenient description for an aspect of whatever constraint the speaker/hearer understands that makes him competent with polar interrogatives and their corresponding declaratives.

22 More contemporary treatments generally take the auxiliary to raise from head T(tense), with the subject at [SPEC TP], to head CP, attracted by a morphologically-zero interrogative feature. The matter is still very much an open one. For our purposes, the data themselves are more important than the theory.

23 See Crain and Thornton (1998, chp. 20) for an overview of the research on aux-sub inversion. For a defense of these studies against Cowie's (1999) riposte, see Collins (2003).

REFERENCES

Chomsky, N.: 1959, 'A Review of B.F. Skinner's Verbal Behaviour', Language 35, 26–58.
Chomsky, N.: 1965, Aspects of the Theory of Syntax, MIT Press, Cambridge, MA.
Chomsky, N.: 1973, 'Conditions on Transformations', in S. Anderson and P. Kiparsky (eds.), A Festschrift for Morris Halle, Holt, Rinehart and Winston, New York, pp. 232–286.
Chomsky, N.: 1975, Reflections on Language, Fontana, London.
Chomsky, N.: 1981, Lectures on Government and Binding, Foris, Dordrecht.
Chomsky, N.: 1995, The Minimalist Program, MIT Press, Cambridge, MA.
Chomsky, N.: 2000a, 'Minimalist Inquiries: The Framework', in R. Martin, D. Michaels, and J. Uriagereka (eds.), Step by Step: Essays on Minimalist Syntax in Honor of Howard Lasnik, MIT Press, Cambridge, MA, pp. 89–155.
Chomsky, N.: 2000b, The Architecture of Language, Oxford University Press, New Delhi.
Chomsky, N.: 2001a, 'Derivation by Phase', in M. Kenstowicz (ed.), Ken Hale: A Life in Language, MIT Press, Cambridge, MA, pp. 1–52.
Chomsky, N.: 2001b, 'Beyond Explanatory Adequacy', MIT Occasional Papers in Linguistics 20, 1–28.
Chomsky, N.: 2002, 'An Interview on Minimalism', in On Nature and Language, Cambridge University Press, Cambridge, pp. 92–161.
Chomsky, N.: 2004, The Generative Enterprise Revisited, Mouton de Gruyter, Berlin.
Chomsky, N. and H. Lasnik: 1977, 'Filters and Control', Linguistic Inquiry 8, 425–504.
Collins, J.: 2003, 'Cowie on the Poverty of Stimulus', Synthese 136, 159–190.
Collins, J.: 2004, 'Faculty Disputes', Mind and Language 19, 503–533.
Cowie, F.: 1999, What's Within? Nativism Reconsidered, Oxford University Press, Oxford.
Crain, S. and R. Thornton: 1998, Investigations in Universal Grammar: A Guide to Experiments on the Acquisition of Syntax and Semantics, MIT Press, Cambridge, MA.
Culicover, P.: 1999, Syntactic Nuts: Hard Cases, Syntactic Theory, and Language Acquisition, Cambridge University Press, Cambridge.
Davis, M. and T. Stone (eds.): 1995a, Mental Simulation: Evaluations and Explanations, Blackwell, Oxford.
Davis, M. and T. Stone (eds.): 1995b, Folk Psychology: The Theory of Mind Debate, Blackwell, Oxford.
Ellefson, M. and M. Christiansen: 2000, 'Subjacency Constraints without Universal Grammar: Evidence from Artificial Language Learning and Connectionist Modeling', in Proceedings of the Twenty-Second Annual Conference of the Cognitive Science Society, Lawrence Erlbaum, Mahwah, NJ, pp. 645–650.
Fodor, J.: 1983, Modularity of Mind, MIT Press, Cambridge, MA.
Fodor, J.: 2000, The Mind Doesn't Work That Way: The Scope and Limits of Computational Psychology, MIT Press, Cambridge, MA.
Gordon, R.: 1986, 'Folk Psychology as Simulation', Mind and Language 1, 158–171.
Higginbotham, J.: 1987, 'The Autonomy of Syntax and Semantics', in J. Garfield (ed.), Modularity in Knowledge Representation and Natural Language Understanding, MIT Press, Cambridge, MA, pp. 119–131.
Lasnik, H. and J. Kupin: 1977, 'A Restrictive Theory of Transformational Grammar', Theoretical Linguistics 4, 173–196.
Lasnik, H. and M. Saito: 1992, Move α, MIT Press, Cambridge, MA.
Lightfoot, D.: 1991, How to Set Parameters: Arguments from Language Change, MIT Press, Cambridge, MA.
Marcus, G.: 2001, The Algebraic Mind, MIT Press, Cambridge, MA.
Miller, G. and N. Chomsky: 1963, 'Finitary Models of Language Users', in R. Luce, R. Bush, and E. Galanter (eds.), Handbook of Mathematical Psychology, Vol. III, Wiley, New York, pp. 419–492.
Prinz, J.: 2002, Furnishing the Mind: Concepts and Their Perceptual Basis, MIT Press, Cambridge, MA.
Rizzi, L.: 1982, Issues in Italian Syntax, Foris, Dordrecht.
Rizzi, L.: 1990, Relativized Minimality, MIT Press, Cambridge, MA.
Ross, J.: 1967, 'Constraints on Variables in Syntax', Diss., MIT, Cambridge, MA.
Samuels, R.: 2002, 'Nativism and Cognitive Science', Mind and Language 17, 233–265.
Stowell, T.: 1981, 'Elements of Phrase Structure', Diss., MIT, Cambridge, MA.
Yngve, V.: 1960, 'A Model and an Hypothesis for Language Structure', Proceedings of the American Philosophical Society 104, 444–466.

University of East Anglia
School of Philosophy
Norwich NR4 7TJ
UK
E-mail: [email protected]