Seeing Patterns and Learning to Do Things and what that has to do with language David Tuggy SIL-Mexico

Slide 2
Is the language faculty a black box?
Is language something totally different from the rest of what we do in our minds?
If not, how are they connected? What does what we do mentally in general tell us about language?
(And what does language tell us about our mental capacities and activities generally?)

Slide 3
Cognitive Grammar (CG) claims that much that we find in language dovetails with what we know about other aspects of cognition.
Language is amazing, but it is not totally different from or unrelated to the rest of our mental activities.
Non-linguistic cognition is pretty amazing too.

Slide 4
Outline
I plan to divide this talk into two sections:
I. We have amazing abilities to:
  Acquire complex and flexible habits (learn to do things)
  Compare and categorize experiences (see patterns and apply them in novel ways)
II. Understanding these abilities can clarify our understanding of what language is and how it functions. In particular:
  We should be careful not to simplify by setting learning and the application of patterns against each other as if they were mutually exclusive.

Slide 5
We are good at learning to do things
Think about what is involved in driving a car.
One way to assess it is to consider what it would take to teach a robot to do the same.
There are a host of more basic skills that must be mastered and that are recruited into the skill of driving.

Slide 6
We are good at learning to do things
For instance (on the perception side of things):
Binocular visual perception: triangulation and depth perception.
Perception of 3-dimensional space and assessment of your position in it.
Calculation of your, and your car's, motion, rate of motion, direction of motion, etc.
Calculation of other vehicles', and pedestrians', etc. motion, rate, direction, etc.

Slide 7
We are good at learning to do things
(still on the perception side of things):
Perception of where the parts of your body are with respect to each other and to the immediate surroundings (like the car seat, gear shift, steering wheel).
Hearing car sounds, horns, and road noise, and evaluation of their significance.
Seeing and recognizing details like turn signals and brake lights.
Knowing where your mirrors are, and how to interpret what you see in them.

Slide 8
We are good at learning to do things
More on the motor side of things:
Turning your head and eyes for optimum seeing.
Knowing how to move other body parts.
Knowing how to move (without watching) to the controls of the car and then move the controls.
Assessing what your motions will do.
Assessing and controlling how hard, fast and far the motions will/should carry.
Adjusting all of the above to the perceptions mentioned before, in real time.

Slide 9
We are good at learning to do things
These (and other) skills are combined and coordinated in various sophisticated, highly flexible ways, into such higher-order skills as:
Starting and accelerating
Steering to the right or to the left
Staying on the road and in your lane
Shifting gears
Slowing and stopping, not running into cars ahead
Changing lanes, passing other vehicles
Obeying traffic signals and signs.

Slide 10
We are good at learning to do things
These in turn are likely to form part of such ordinary activities as:
Going to work.
Running an errand.
Visiting your parents.
The whole package is so complex that it takes considerable time to learn to do it well.
We continue to upgrade and relearn these skills even after we have mastered them.

Slide 11
We are good at learning to do things
They become so ordinary to us that we can do them on autopilot, as it were, hardly paying any attention to what we are doing, much less taking in the full complexity of it all.
We adapt them with exquisite precision to new situations.
The consequences of doing them poorly are likely to be lethal.
Yet we regularly and almost unthinkingly trust ourselves and thousands of others to do them right (or at least well enough).

Slide 12
We are good at learning to do things
Besides the perceptual and motor-related skills are more autonomous ones;
E.g. evaluation of other cars' motions and inferences about their drivers' intentions, reading signs, judgment of the passage of time, calculation of odds, calculations about what speed to take a corner or a speed bump at, and comparison of the result with the anticipated situation, etc. etc.

Slide 13
We are good at learning to do things
Language is a set of skills of this sort.
It involves coordinating hugely complex muscular, perceptual, and autonomous cognitive skills.

Slide 14
We are good at learning to do things
Many levels of such skills are recruited as parts of other, higher-level skills.
We can say "I wouldn't have believed that she would have said anything of the sort" and understand it in context while hardly paying any attention to it, certainly without consciously realizing its enormous complexity.

Slide 15
We are good at learning to do things
CG recognizes this, defining a language as a structured inventory of conventionalized linguistic units.
A unit is a skill we have mastered, a cognitive routine we can run through without having to put constructive effort into it.
There are hierarchies upon hierarchies of such skills involved in our use of language.

Slide 16
What is it that we learn to do?
An important point is that though we learn these skills (linguistic or otherwise) from our experiences, they cannot be equated with particular actual experiences.
Not every neuron that fired will fire again in exactly the same way the next time we implement the skill (e.g. of perceiving a car braking on the road ahead, or of saying "she wouldn't've said it").

Slide 17
What is it that we learn to do?
Rather, these are patterns of activation.
They permit a certain amount of slop or leeway.
This slop or leeway is extremely important.
It is what permits us to recognize a new situation as one of a kind we've seen before.
It also permits us to act in a new way that is nevertheless one of a kind we have done before.

Slide 18
Extracting patterns
CG talks about this in terms of all the units being schematic to one degree or another.
Think Schema = Pattern.
There are higher-level (more abstract) and lower-level (more specific) patterns, and patterns of many kinds.
It is patterns all the way down, as far as language is concerned.

Slide 19
Extracting patterns
Schemas arise as experiences are compared and commonalities noted.
A schema embodies the commonalities of its subcases.
Consider the (already schematic yet still rather specific) concept of a pencil.

Slide 20
Extracting patterns
As this concept is compared to the similar concept of a ballpoint pen, there are notable similarities.

Slide 21
Extracting patterns
These similarities together constitute a schema (pattern) we can call "writing instrument".

Slide 22
Extracting patterns
This kind of relationship is traditionally represented in CG by an arrow from schema to subcase: A → B means A is schematic for B; B is a subcase of A.

Slide 23
Extracting patterns
This relationship is by nature asymmetrical.
Every specification of the schema (pattern) holds true of the subcases;
Not vice versa.

Slide 24
Extracting patterns
There is an interesting sense in which either the subcase(s) or the pattern can be seen as basic to the other.
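As a rough illustration of the schema-subcase relationship in Slides 19-23, a concept can be modelled as a set of feature-value pairs and a schema as whatever its subcases share. This is only a sketch: the feature names and values below are invented for the example, and CG does not claim that concepts are flat feature lists.

```python
# Toy model: a concept is a dict of feature -> value.
# All features here are hypothetical, chosen only for this illustration.
pencil = {
    "shape": "thin rod",
    "held_in": "hand",
    "used_for": "writing",
    "marks_with": "graphite",
}
ballpoint_pen = {
    "shape": "thin rod",
    "held_in": "hand",
    "used_for": "writing",
    "marks_with": "ink",
}


def extract_schema(*subcases):
    """Keep only the feature-value pairs shared by every subcase."""
    shared = set(subcases[0].items())
    for case in subcases[1:]:
        shared &= set(case.items())
    return dict(shared)


def is_schematic_for(schema, subcase):
    """True if every specification of the schema holds of the subcase."""
    return all(subcase.get(key) == value for key, value in schema.items())


writing_instrument = extract_schema(pencil, ballpoint_pen)

print(is_schematic_for(writing_instrument, pencil))  # True
print(is_schematic_for(pencil, writing_instrument))  # False
```

The two calls at the end show the asymmetry: everything the schema specifies is true of the pencil, but the pencil specifies something (what it marks with) that the schema leaves open.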

Slide 25
Extracting patterns
(1) The schema is extracted from, and comes into being because of, the subcases. In this sense the system is built bottom-up.
(2) Once it is established (learned), the schema legitimizes (sanctions) its subcases in top-down fashion.

Slide 26
Applying patterns productively
Particularly, a well-established schema can sanction novel structures.
This includes partial sanction, where the subcase contradicts some of the schema's specifications.

Slide 27
Extracting and applying patterns
This is the way linguistic rules work under CG.
Rules are simply schemas. Applying a rule is letting the rule sanction a more specific subcase.
If the subcase is a new one, the rule is applied productively.
Like any other linguistic structures, rules are part of the language to the extent that they are learned conventionally (thus known and known to be known by all in the relevant group).
Once learned, they can sanction novel structures.

Slide 28
Learning and using patterns
E.g. a kid may learn the words "sugary" and "salty", and by comparing them, extract a schema FOOD-y.
FOOD-y is a nascent rule, and the child may use it to invent new words like "vinegary" or "orangey".

Slide 29
Learning & Patterns
From all of this it should be clear that learning things (establishing units) and extracting schemas (making generalizations) and applying them are not mutually exclusive activities.

Slide 30
Learning & Patterns
Everything we have learned (e.g. all the established structures in the diagram below) is a generalization (schema, pattern).

Slide 31
Learning & Patterns
The schemas aren't much good to us until we have learned them (mastered them as units).
Once we have done so, we can use them to come up with new subcases, which may in turn be learnt.

Slide 32
Learning & Patterns
Different people can learn slightly different units, as long as their system is close enough to somebody else's that they can talk.
"Vinegary" or "orangey" may be learned, but if not, they are still understandable because they are sanctioned by the schema (rule) FOOD-y.
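The FOOD-y example can be sketched as a small program in which a word may be learned, sanctioned by the rule, or both at once. This is a hedged illustration only: the contents of `learned_units` and `food_nouns` are assumptions made up for the example, and the rule ignores spelling adjustments that real English -y words sometimes need.

```python
# Hypothetical inventory: words this speaker has mastered as units.
learned_units = {"sugary", "salty"}

# Hypothetical set of food nouns this speaker knows.
food_nouns = {"sugar", "salt", "vinegar", "orange"}


def food_y(noun):
    """Apply the FOOD-y schema in its simplest form: food noun + 'y'."""
    return noun + "y"


def status(word):
    """Say whether a word is learned, sanctioned by FOOD-y, both, or neither."""
    learned = word in learned_units
    sanctioned = any(food_y(noun) == word for noun in food_nouns)
    if learned and sanctioned:
        return "learned and sanctioned by FOOD-y"
    if learned:
        return "learned"
    if sanctioned:
        return "novel, sanctioned by FOOD-y"
    return "not sanctioned"


for word in ("sugary", "vinegary", "orangey", "tabley"):
    print(word, "->", status(word))
```

The point of keeping two separate checks is that neither answer excludes the other: "sugary" passes both, while "vinegary" is understandable through the rule alone until the speaker happens to learn it.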

Slide 33
Learning & Patterns
Knowing (having mastered) a schema and knowing (having mastered) a subcase are not mutually exclusive propositions.
To the contrary, knowing the subcases helps extract the schema, and knowing the schema reinforces the subcases.

Slide 34
The traditional contrast between Regularity and Irregularity
(Shifting gears, downshifting??)
In most linguistics of the last 100 years, the contrast between what is regular and what is irregular is given enormous importance.
(Regular = according to rule, i.e. it fits a schema)

Slide 35
The traditional contrast between Regularity and Irregularity
It is often considered important to maximize the regular and minimize the irregular in our models of language (so as to be scientific).
The problem is it has been assumed that only irregular things are learned.

Slide 36
The traditional contrast between Regularity and Irregularity
It is assumed that:
Regular = systematic = predictable = produced by rule.
Irregular = idiosyncratic = arbitrary = learned.
There is assumed to be a dichotomy between these two categories.

Slide 37
The traditional contrast between Regularity and Irregularity
This difference is typically made into part of the architecture of linguistics. The regular/predictable is the province of grammar, the irregular is the province of the lexicon.

Slide 38
The traditional contrast between Regularity and Irregularity
The system assumes nice neat modules.
It is therefore considered important to establish if a particular kind of structure is to be accounted for in the grammar or in the lexicon.

Slide 39
The traditional contrast between Regularity and Irregularity
Structures are taken to be of fundamentally different sorts, and to be processed in very different ways, depending on whether they are in the grammar or in the lexicon.

Slide 40
The traditional contrast between Regularity and Irregularity
So, if a word like "sugary", or a phrase like "over the top", could be produced by rule, the presumption is that in fact it is produced by rule.

Slide 41
The traditional contrast between Regularity and Irregularity
The schema is real, the subcases are epiphenomenal.
In effect, if you first learned the specific structure, as soon as you learn how to produce it by rule, you forget it and remember only the rule.

Slide 42
The traditional contrast between Regularity and Irregularity
All members of the category alike are produced by the rule rather than learned.
This is justified because it makes the model simpler and more predictive. (Science is all about prediction, right?)

Slide 43
The traditional contrast between Regularity and Irregularity
Now this was so obviously wrong for many words that the model was modified: morphology (word-formation) was distinguished from syntax (real grammar),
because (oversimplifying) so often many examples of a morphological rule had clearly been learned.

Slide 44
The traditional contrast between Regularity and Irregularity
As a result, morphological structures and rules were taken to be different in kind from syntactic structures and rules; they were taken care of in a different module.

Slide 45
The CG view
For CG, the dimensions of the predictability distinction are gradual, and though they tend to line up, they are not exactly parallel.

Slide 46
The CG view
The distinction between what is produced by rule and what is learned is of especial interest to CG.
It is the only one of these four that is directly cognitive (dealing with how the system processes the structure).

Slide 47
The CG view
It is closely tied to the two abilities we have been discussing.
Producing something by rule is using a schema to sanction it, especially if it itself is not (yet) learnt.
Learning is (of course) learning, routinizing a skill, making a sequence of cognitive activations into a unit, then recalling that unit, as needed, from cognitive storage (memory).

Slide 48
The computer analogy
A standard (and largely useful) way to talk about these issues is on the analogy of a computer.
Learned information is analogous to what is stored on the hard drive, and information produced by rule is analogous to information produced by a program and not stored.

Slide 49
The computer analogy
This makes it less than immediately obvious that the distinction is one of degree.
What degree is there between information that is on the hard disk and information that is not?
In a sense, none.

Slide 50
The computer analogy
But that's like saying there is an absolute, binary, modular difference between the word "giraffe" written here on the screen and "giraffe" here, or in a book.
It's true in a sense, but for most purposes it's much more important to see that it's the same word (pattern = schema) either place.

Slide 51
The computer analogy
For information from a program or from the hard disk to be useful, it has to be brought to working memory (RAM).
Once it's there, it doesn't much matter where it came from.
The same information (pattern = schema) can be in both places at once, and transferred back and forth.

Slide 52
The computer analogy
There is not a dichotomic difference between kinds of information that are on the hard drive and those that are produced by computation.
Once it's in working memory, you can't necessarily tell, from the kind of information it is, where it came from.

Slide 53
The computer analogy
The original information, the program that massages it, and the resulting computed data can all be together on the hard drive, or all together in the working memory, or in both, at the same time, and still be accessible.
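The computer side of this analogy can be sketched in a few lines. The names `stored_facts`, `compute`, and `recall` are hypothetical, invented for the illustration; the sketch models only the computer analogy (store on command, once), not how human memory works.

```python
# "Hard drive": facts already stored. The single entry is an assumed example.
stored_facts = {(6, 7): 42}


def compute(a, b):
    """Produce a product 'by rule', here by repeated addition."""
    total = 0
    for _ in range(b):
        total += a
    return total


def recall(a, b):
    """Bring a product into 'working memory', reporting the route taken."""
    if (a, b) in stored_facts:
        return stored_facts[(a, b)], "retrieved"
    result = compute(a, b)
    stored_facts[(a, b)] = result  # the computed result is now stored as well
    return result, "computed"


first = recall(6, 7)    # already stored
second = recall(12, 3)  # not stored yet, so it is computed
third = recall(12, 3)   # same information, now retrieved

print(first, second, third)
```

The value that reaches the caller is the same whichever route produced it, and after the first computation the same information sits in storage while the rule that can regenerate it remains available.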

Slide 54
Non-linguistic cognition again
What's 9 x 8?
How did you figure it?
10 x 8 = 80, -8 = 72
10 x 9 = 90, -(9 x 2 = 18), = 72
8 x 8 = 64, +8 = 72
Or just, oh yeah, 9 x 8 = 72.
You can figure it (compute it, produce it by rule) in any of a number of different ways, or just remember it.
Different people can retrieve a date (say your father's birthday) in different ways.
For most practical purposes, how you get it doesn't matter at all, as long as you get it right.

Slide 55
Humans aren't computers anyway
Two important differences between computers and human cognition are (1) salience and (2) how storage works.

Slide 56
Humans aren't computers anyway
Patterns in humans' minds differ in salience (cognitive prominence). Non-salient patterns are less clearly there. Salient patterns stand out and attract attention.

Slide 57
Humans aren't computers anyway
The result is that categories are not homogenous. Some members may be novel, produced by rule; others, while sanctioned by the rule, are also learned in their own right.

Slide 58
Humans aren't computers anyway
And the cases which have been learned differ in their salience according to their usage.

Slide 59
Humans aren't computers anyway
Repetition increases salience; all else being equal, often-repeated structures are entrenched with ever-greater salience.
This may give a better picture of the general way a category might develop:
And so forth.

Slide 60
Humans aren't computers anyway
(To repeat:)
Repetition increases salience; all else being equal, often-repeated structures are entrenched with ever-greater salience.
(Repetition isn't the only thing that enhances salience, but we won't go into the others here.)
Frequency counts correlate highly with salience. Common structures are especially important.

Slide 61
Humans aren't computers anyway
Humans start storing experiences automatically, but need repetition to learn.
(Computers store on command, and only need one command.)
Humans can't very well avoid learning what they experience over and over. Again, frequency counts are a useful index of likelihood of being established.
Whether or not a structure is (ir)regular or (un)predictable is a much less useful indicator.

Slide 62
What difference does all this make?
So what? What does all this have to do with linguistics?
I would like to suggest 6 maxims having to do with this difference between learning a pattern and calculating it (producing it by rule). They contradict much of the received wisdom of traditional linguistics.
They bear especially on the modular architecture of many models.
They affect phonology, semantics, lexicon, and grammar alike.

Slide 63
Background for the maxims
The maxims generally take the form: You can't assume 100% computation and 0% storage, or vice versa. It is an empirical question how much of each.
I won't question that what is unpredictable yet known must have been learned.
But I do question the assumption that what is predictable or can be produced by rule therefore is not learned.
Slide 64 Background for the maxims
Many cognitive scientists have concluded that humans (in comparison with current digital computers) vastly maximize storage (with sophisticated retrieval) and minimize computation. I.e., generally, we learn more than we need to, not less. Experts are those who have learned more, not those who can figure things out faster. Our cognitive system has lots of redundancy. So (surprise, surprise) does language.

Slide 65 Maxim #1
Frequently linguists have argued that some member(s) of Category X are computed (= predictable = produced by rule), or, as the case might be, irregular (= unpredictable = learned = in the lexicon). They then conclude that all members of Category X are treated the same way.

Slide 66 Maxim #1
E.g. Chomsky (1967 Lexicalist Hypothesis): "Derived nominals are IRREGULAR. (cites exx.) A lexical treatment of DNs is the natural way to capture this irregular behavior." Crucial assumption: if some DNs (the exx. cited) are lexical, all are.

Slide 67 Maxim #1
CG denies this gratuitous assumption.
Rather, showing that one member of a class is learned or computed does not show that all members of the class are treated the same way. I.e. what happens to one form doesn't have to happen to all of them.

Slide 68 Maxim #1: What happens to one form doesn't have to happen to all of them.
Think of -er nominalizations. Are they lexical? Novel ones like flinger or gulper are presumably produced by rule. Others are clearly learned.
flinger < screamer < swimmer < reader < computer < propeller < rocker < ruler < drawer
So, are -er nominalizations learned, or produced by rule? Answer: It depends.

Slide 69 Maxim #1: What happens to one form doesn't have to happen to all of them.
flinger < screamer < swimmer < reader < computer < propeller < rocker < ruler < drawer
Some forms are well-learnt and well-established; others may be novel and only allowed because sanctioned by the rule. This is a more realistic model.

Slide 70 Maxim #1: What happens to one form doesn't have to happen to all of them.
Some forms are computed, some are learned and retrieved from memory. You will have to check each case: it is an empirical issue. Objection: This is redundant: it violates simplicity.
Why should we posit that people learn things when there's a perfectly good way to figure them? Answer: It is good that the model is redundant; that is true to the cognitive reality.

Slide 71 Maxim #1: What happens to one form doesn't have to happen to all of them.
Arguing otherwise is like arguing that a computer can't have on its hard disk information that could be calculated by a program, because that would be redundant. Whether or not it's redundant, it happens. Restating: It is ultimately an empirical issue whether a particular linguistic form is stored in language speakers' minds or whether it is computed from other information.

Slide 72 Maxim #1: What happens to one form doesn't have to happen to all of them.
Saying that it is an empirical issue doesn't mean there is necessarily any easy empirical test to let you know. It does mean that in principle it could be either way, and you must examine relevant data to settle the question in a given case.
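Maxim #1 can be made concrete with a small sketch of a speaker for whom some -er nominals are both stored and derivable, while others are available only by rule. Everything named here is hypothetical: the set `STORED_ER_NOUNS` is an invented inventory for one imaginary speaker, and `er_by_rule` is a toy spelling rule that ignores consonant doubling and meaning.

```python
# Hypothetical inventory of -er nominals that one imaginary speaker has
# learned in their own right. Which forms belong here is, per Maxim #1,
# an empirical question; this set is purely illustrative.
STORED_ER_NOUNS = {"reader", "computer", "ruler", "drawer"}


def er_by_rule(verb):
    """Toy V + -er rule (spelling only; no consonant doubling)."""
    return verb + "r" if verb.endswith("e") else verb + "er"


def available_routes(verb):
    """Return the -er form and every route by which it can be reached."""
    form = er_by_rule(verb)
    routes = ["rule"]              # the rule sanctions the form in any case
    if form in STORED_ER_NOUNS:
        routes.append("memory")    # redundant with the rule, but stored anyway
    return form, routes


for verb in ["fling", "gulp", "read", "compute"]:
    print(available_routes(verb))
```

The design choice worth noticing is that storage does not switch the rule off: for reader and computer both routes coexist, which is exactly the redundancy the objection on the slide complains about.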
Slide 73 Maxim #2
Chomsky 1965 (and almost everybody else): "Linguistic theory is concerned primarily with an ideal speaker-listener, in a completely homogeneous speech-community, who knows its language perfectly, and is unaffected by such grammatically irrelevant conditions as memory limitations, distractions, shifts of attention and interest, and errors (random or characteristic) in applying his knowledge of the language in actual performance." "Primarily" = in practice, "only"; "grammatically irrelevant" begs important questions. Speech communities are not homogeneous, however: far from it. And this fact is relevant.

Slide 74 Maxim #2
Some speakers' computing or learning a form doesn't necessarily mean all speakers do it. Your doing it doesn't mean everybody else has to.

Slide 75 Maxim #2: Your doing it doesn't mean everybody else has to.
Shifting from second to third gear may be perfectly automatic for one person, and require considerable thought and calculation for another. For a friend, "that was all she wrote" was a clichéd, dead metaphor he pulled off the shelf. For me it was brand new; it made me laugh out loud.
Slide 76 Maxim #2: Your doing it doesn't mean everybody else has to.
Relatedly, one type of computation by one speaker doesn't guarantee the same computation by another:
un[[believabl]y] vs. un[[believe]ably] vs. [[[unbelief]able]y] vs. [unbelievable]y, etc.
hangman = V + O ("guy who hangs a man") or V + S ("man who hangs people")
Like calculating 8 × 9 in different ways: who cares how you did it? The result is the same (near enough).

Slide 77 Maxim #2: Your doing it doesn't mean everybody else has to.
Restating: It is ultimately an empirical issue whether a particular form is learned by a particular individual speaker, or whether he (or she) computes it in one way or another.

Slide 78 Maxim #3
Most of us have had the experience of "tumbling to" an analysis of something that had previously been monomorphemic to us. E.g. rue + th = ruth(less) and vile + th = filth, like true + th = truth and heal + th = health. Similarly, derivedness can fade. Awesome used to be more saliently awe + some for me. Moral: what has been exclusively accessed from memory or exclusively computed can start to be processed the other way too.
Slide 79 Maxim #3: You can change the way you do it.
Restating: It is ultimately an empirical issue whether a particular form that's always been processed in one way by a particular speaker will be processed exclusively in that way in the future.

Slide 80 Maxim #4
Maxim #4 is in a way the same thing as #3: it just recognizes that you can make such changes as #3 repeatedly. Having computed ru-th or fil-th one time, you very likely won't bother to run through the computation the next time. But you might the time after that.

Slide 81 Maxim #4: You don't have to do next time what you did this time.
Restating: It is ultimately an empirical issue how a particular form is processed by a particular speaker on a particular occasion.

Slide 82 Maxim #5
Nothing stops you from bringing a form up from memory, then checking it by computation. Or computing it, then thinking "Yes, that's right, I remember that." Our minds have enough parallel processing capacity to do both simultaneously.

Slide 83 Maxim #5: You can do both at the same time.
Restating: It is ultimately an empirical issue for a particular form, on a particular occasion, whether a particular speaker exclusively calculates it, exclusively retrieves it from storage, or does both in some degree, either sequentially or simultaneously.

Slide 84 Maxim #6
These considerations apply to polysemy as well. Some have claimed that certain meanings are always derived from certain others, e.g. that Shakespeare_w, "the literary work(s) written by Shakespeare", is necessarily accessed via Shakespeare_p, "the person William Shakespeare".

Slide 85 Maxim #6
This fits a rule (schema) for naming literary works by their authors.

Slide 86 Maxim #6
The same rule can be used productively to let you use a name such as Harry Smith to refer to what Harry Smith wrote.

Slide 87 Maxim #6
You might say that Shakespeare (as opposed to Harry Smith) is lexically marked to undergo this particular kind of metonymic extension. The extended meaning Shakespeare_w then need not itself be learned and listed in the lexicon.
Slide 88 Maxim #6
CG's position on this should, by now, be no surprise. You can do it either way. You can access Shakespeare_w through Shakespeare_p.

Slide 89 Maxim #6
However, Shakespeare_w can also become established in its own right and so linked that you can also access it directly. So what if it's redundant; it happens.

Slide 90 Maxim #6: You can go the long way to get to a meaning or go to it directly.
Restating: It is ultimately an empirical issue whether a meaning is activated only as a result of a computational process starting with another meaning, or activated directly, or both (for a particular meaning in the mind of a particular speaker on a particular occasion).

Slide 91 The import of the Maxims
These Maxims have obvious applications to the way the lexicon is conceived, but also to phonology, syntax, and semantics. If CG is right on these points, the architecture of language must look rather different from what many other theories have portrayed it to be. We will zip past a few ways it affects things.
Slide 92 The import of the Maxims: the Lexicon-Grammar distinction
Obviously, it is not going to be possible to simply state, of many classes of structures or of many particular structures, "These are part of the lexicon, not part of the grammar," and act as if that ended the matter. By the same token, you won't be able to say "These are always produced by the grammar, and not learned." Lots of structures will be in-between, or oscillating between, in various ways.

Slide 93 The import of the Maxims: the Lexicon-Grammar distinction
You will be a lot better off if your theory doesn't make that mean that they are drastically changing their nature and functions each time they cross from one category to the other.

Slide 95 The import of the Maxims: the Lexicon-Grammar distinction
This assumes, of course, that "Lexicon" is defined as "the repository of what is learned & not produced by rule". It is also true under all other definitions of the lexicon I know.
For me, the lexicon is most usefully viewed as the set of structures clearly learned in fully detailed form (at least with all their phonemes specified). You may (if you wish) add "relatively simple in their morphemic structure".

Slide 96 The import of the Maxims: the Lexicon-Grammar distinction
All these parameters (learnedness, schematicity, complexity) are matters of degree. In this way, the lexicon differs only in degree from the grammar, and from what is not (yet) part of the language.

Slide 97 The import of the Maxims: the Lexicon-Grammar distinction
Like this:

Slide 98 The import of the Maxims: the Lexicon-Grammar distinction
Salty would be clearly lexical for most English speakers; vinegary is less so because it is not as thoroughly learnt, may be learnt by some but not others, etc. FOOD-y would be less lexical because part of it is too schematic (its phonemes are not specified). N-Adjr would be even less so, and more clearly part of the grammar. But they should all be basically the same sort of structure. (CG describes them so.)
Slide 99 The import of the Maxims: Conventional expressions
(Langacker 1987) This dichotomous perspective [of syntax vs. lexicon] made it inevitable that a large body of data belonging to neither category would be mostly ignored. I refer here to the huge set of stock phrases, familiar collocations, formulaic expressions, and standard usages that can be found in any language and thoroughly permeate its use.

Slide 100 The import of the Maxims: Conventional expressions
This is why a seemingly perfect knowledge of the grammar of a language (in the narrow sense) does not guarantee fluency in it; learning its full complement of conventional expressions is probably by far the largest task involved in mastering it.

Slide 101 The import of the Maxims: the Lexicon-Grammar distinction
Yet conventional expressions have received so little attention that I found it necessary to invent this term for the class as a whole. The grammar [i.e. the linguistic description] of a language is responsible for listing its full set of conventional expressions (such as go for a walk, absolutely incredible, have a good time, cheap imitation, the seconds are ticking away, and so on, and so on).
To furnish such a list would obviously be a vast undertaking, for there are many thousands of such expressions, and new ones are always forming.

Slide 102 The import of the Maxims: the Lexicon-Grammar distinction
The issue of whether conventional expressions should be included in a grammar is factual rather than methodological in a framework taking seriously the goal of psychological reality in linguistic description. If a speaker does in fact learn a large set of conventional expressions as fixed units, it is incumbent on the grammar to represent this fact by providing an inventory of these expressions.
Slide 103 The import of the Maxims: the Lexicon-Grammar distinction
The simplest description that accurately accommodates all the data must by definition include such a list.* (Langacker 1987:35-36, 41)
[*Footnote*: With apologies to Sapir, we can say that not only do all grammars leak, they also list (massively).]

Slide 104 The import of the Maxims: Phonology
An obvious application to phonology is that all common rule-governed forms will tend to be learned (stored). A rule may tell you that the v of liv "leaf" devoices word-finally. This does not mean that you don't learn livz "leaves", livd "leaved", and lif "leaf" in their own right.

Slide 105 The import of the Maxims: Phonology
Suppletion thus overlaps massively with rule-governed phonology, and often proves decisive in language change. E.g. people might (and in fact some do) start saying lifs "leaves" or lift "leaved", using the salient singular form as basic and recomputing the others, directly contradicting the rule. (Phonological systems empirically don't always change the ways the rules would lead you to expect. This is one reason why.)
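The leaf example can be sketched as code to show how stored surface forms and a rule can coexist, and how recomputing from a stored form yields the innovative variants. This is a rough illustration only: the transcriptions are the simplified ones used on the slides, the names `STEM`, `SUFFIXES`, `STORED`, and `DEVOICED` are hypothetical, and the suffix devoicing after f in `rebuilt_from_singular` is an assumption added here to produce lifs and lift.

```python
# Simplified transcriptions as on the slides; all names are illustrative.
STEM = "liv"                                   # posited underlying stem
SUFFIXES = {"leaf": "", "leaves": "z", "leaved": "d"}

# Surface forms learned in their own right, redundantly with the rule.
STORED = {"leaf": "lif", "leaves": "livz", "leaved": "livd"}

# Assumed voicing assimilation of suffixes after a voiceless consonant.
DEVOICED = {"z": "s", "d": "t"}


def final_devoicing(form):
    """Toy rule: a word-final v surfaces as f."""
    return form[:-1] + "f" if form.endswith("v") else form


def by_rule(gloss):
    """Compute the surface form from the underlying stem plus suffix."""
    return final_devoicing(STEM + SUFFIXES[gloss])


def rebuilt_from_singular(gloss):
    """Innovation: treat the salient stored singular as the base."""
    suffix = SUFFIXES[gloss]
    return STORED["leaf"] + DEVOICED.get(suffix, suffix)


for gloss in SUFFIXES:
    print(gloss, by_rule(gloss), STORED[gloss], rebuilt_from_singular(gloss))
```

In this toy, the rule and the stored forms agree on every form, so the storage is redundant; the innovative forms only become possible because the singular is stored as a unit that can serve as a new base.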
Slide 106 The import of the Maxims: Semantics vs. Pragmatics
One of the major criteria linguists seem to use to decide what is pragmatic as opposed to semantic is whether something is (in some degree) predictable (especially predictable from the context).

Slide 107 The import of the Maxims: Semantics vs. Pragmatics
E.g. the meaning Shakespeare_w, "Shakespeare's works", is a pragmatic extension (because it's predictable) of the "real" meaning Shakespeare_p, "the person William Shakespeare".

Slide 108 The import of the Maxims: Semantics vs. Pragmatics
As already argued, nothing stops both meanings being learned, conventionally associated with the phonological form ʃejkspir, and thus "real" meanings.

Slide 109 The import of the Maxims: Semantics vs. Pragmatics
Semantics is conventionalized pragmatics. A lot more gets conventionalized (learned and known to have been learned by all the relevant speakers) than might seem strictly necessary. A lot of what is presented as pragmatics is in fact semantic.
Slide 110 Summary
It has traditionally been assumed in linguistics that what is regular, systematic and predictable is not learned but rather produced by rule, while what is irregular, idiosyncratic and arbitrary is learned.

Slide 111 Summary
It is normally assumed as well that these distinctions are binary and coincide with each other so exactly as to be practically equivalent. Important aspects of the architecture of many theoretical models depend on these assumptions, notably the modular distinction between lexicon (where the irregular, idiosyncratic, arbitrary and learned structures reside) and grammar (the domain of rules for producing the unlearned, regular, systematic, and predictable phenomena of language).

Slide 112 Summary
CG maintains, on the contrary, that the distinctions are all gradual rather than binary, and that they do not coincide exactly. In particular, much that is regular, systematic and in some degree predictable may nevertheless be learned.
This has important implications for the structure of the CG framework, notably for the ways lexicon and other aspects of grammar grade into each other.

Slide 113 Summary
If CG is right on these points, the architecture of language must look rather different from what many other theories have portrayed it to be.

Slide 114
PowerPoint available at www.sil.org/~tuggyd