Week 8. Neurons and impoverished stimuli GRS LX 865 Topics in Linguistics

Rules and brains

(Generative) linguistics has traditionally been done in terms of symbolic rules.

S → NP VP
V[past] → V + -ed / -d / -t

But people studying neurophysiology complain that there’s no obvious way to “write” that in neurons.

A neuron

Neural connections

Individual neurons are connected to one another via excitatory and inhibitory connections, and each has a certain level of activation. When a neuron’s level of activation reaches a critical threshold, the neuron fires, spreading positive activation to the neurons it is excitatorily connected to and negative activation to the neurons it is inhibitorily connected to.

“Neurons that fire together wire together.” Connections are developed or strengthened between neurons whose firings temporally coincide. Function has changed. Memory. It now becomes likely that if one fires, the other will too. Long-term memory?

[Figure: a synapse, with labels: impulse, presynaptic neuron, vesicle, transmitters, synaptic cleft, receptors, postsynaptic neuron, postsynaptic activity]
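The "fire together, wire together" idea can be sketched as a simple weight update. This is only an illustration: the function name `hebbian_update`, the learning rate of 0.25, and the firing records are all made up for the example, not taken from any neurophysiological model.

```python
def hebbian_update(weight, pre_fired, post_fired, rate=0.25):
    """Strengthen a connection only when both neurons fire at the same time."""
    if pre_fired and post_fired:
        return weight + rate
    return weight


# Hypothetical firing records for two neurons over five time steps (1 = fired).
pre_neuron = [1, 0, 1, 1, 0]
post_neuron = [1, 1, 1, 0, 0]

weight = 0.0
for pre, post in zip(pre_neuron, post_neuron):
    weight = hebbian_update(weight, pre, post)

# The two neurons fired together at two time steps, so the weight is 2 * 0.25.
print(weight)  # 0.5
```

The stronger the weight becomes, the more likely it is that one neuron firing will push the other over its threshold, which is the sense in which the connection acts as a memory.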

Connectionism

A connectionist system consists of a set of interconnected nodes (“neurons”).

Each connection has a certain strength and polarity.

Each node has an activation level and a threshold value.

When a node reaches the threshold level, it fires, and transfers its activation (additively or subtractively) along the connections.

If this pushes a connected node over its threshold, it fires.

And so forth… Clearly, the interactions can quickly become mind-bogglingly complex.
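The firing cascade described above can be sketched in a few lines. The network below is hypothetical (four nodes A to D with invented weights and thresholds), and the sketch assumes each node fires at most once and passes on its full connection strength when it does.

```python
def run_network(weights, thresholds, activation, max_steps=10):
    """Spread activation through the network and return nodes in firing order.

    weights maps (source, target) to a signed strength:
    positive = excitatory connection, negative = inhibitory connection.
    """
    activation = dict(activation)  # work on a copy of the starting levels
    fired = []
    for _ in range(max_steps):
        # Nodes that have reached threshold and have not fired yet.
        ready = [node for node in activation
                 if node not in fired and activation[node] >= thresholds[node]]
        if not ready:
            break
        for node in ready:
            fired.append(node)
            # Firing adds (or subtracts) activation along outgoing connections.
            for (source, target), strength in weights.items():
                if source == node:
                    activation[target] += strength
    return fired


# Hypothetical network: A excites B, B excites C, A inhibits D, C weakly excites D.
weights = {("A", "B"): 1.0, ("B", "C"): 1.0, ("A", "D"): -1.0, ("C", "D"): 0.5}
thresholds = {"A": 1.0, "B": 1.0, "C": 1.0, "D": 1.0}
start = {"A": 1.0, "B": 0.0, "C": 0.0, "D": 0.0}

firing_order = run_network(weights, thresholds, start)
print(firing_order)  # ['A', 'B', 'C']
```

Node D never fires here: the inhibition from A outweighs the excitation from C. Even with four nodes the outcome takes some tracing, which is the point about complexity.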

Connectionism

Certain nodes are designated as input nodes. These have an activation level driven by the perceptual system. So, maybe the node will be active if the currently perceived word starts with t…

[Figure labels: world, input]

Connectionism

Other nodes are designated as output nodes. The status of these nodes determines the system’s reaction to the input.

So, it’s a complex way to compute a function from input (patterns) to output (patterns).

[Figure labels: world, input, output, decision]

Connectionism

Finally, the learning aspect. The way neural nets are trained is generally:
  Provide an input with a known “correct” output.
  Check the output the system provides.
  If the system’s output doesn’t match the correct output, adjust the connection weights in the network using a general “back-propagation” algorithm to make it come closer next time.
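The training loop above can be sketched for the simplest possible case: a single output node with no hidden layer, so the error only has to be passed back one step. Full back-propagation applies the same kind of adjustment layer by layer. The task (logical OR), the learning rate, and the number of passes are assumptions chosen for illustration.

```python
import math


def sigmoid(x):
    """Squash a summed input into an activation between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def predict(weights, bias, inputs):
    """Output activation of the single output node for one input pattern."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def train(examples, epochs=2000, rate=0.5):
    """Repeat: present an input, check the output, nudge the weights."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in examples:
            output = predict(weights, bias, inputs)
            error = target - output  # how far off the system was
            # Move each weight in the direction that shrinks the error.
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias


# Hypothetical training set: inputs paired with the known "correct" output (OR).
examples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights, bias = train(examples)
decisions = [round(predict(weights, bias, inputs)) for inputs, _ in examples]
print(decisions)
```

Nothing in the trained network states "output 1 if either input is 1"; there are only three numbers (two weights and a bias) that happen to produce that behavior.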

Connectionism

After a lot of training, the neural net can produce the appropriate outputs from the given inputs.

The neural net has abstracted out the systematicity in the input data, but in ways that are almost always far too mathematically complicated to fathom.

Connectionism

Then, when presented with novel inputs, the neural net will generalize its training to make decisions. This was previously considered to be a sure signal of following a rule.

When trained on rule-governed material, it tends to “follow the rule” even with novel forms.

Neural nets are also great pattern recognizers: they latch onto any kind of statistical regularity. Medical diagnosis, image reconstruction, …

Connectionism

What’s the point?

The point is that neural nets can “learn” rule-like behavior from statistical regularity without being taught the rule (and in fact without there even being a rule). There are just neurons and connections (vaguely like the human brain).

So maybe those rules were just approximations.

Connectionism is too hard

Connectionist research often has this property, looking at a very small problem on the boundary between grammatical knowledge and lexical memorization, where it is not at all clear that we could generalize the results to language as a whole grammatical system (or even come close to understanding what the network is even doing).

At this point, connectionism is too hard: for a network large enough to do anything interesting and predictive, the generalizations it reaches will be completely inaccessible to us analyzing it from the outside.

Rumelhart & McClelland

R&M 1986 created a connectionist network to learn English past tense.

English past tense forms come in regular and irregular kinds.
  Walk/walked, kick/kicked
  Tow/towed, rub/rubbed
  Melt/melted, right/righted
  Break/broke, sing/sang, light/lit, grow/grew

Regularities

The regular kind are the ones that are easily described in terms of rules.
  V[past] → V + -ed / -d / -t
  (depends on voicing and place of the last consonant of the stem).

The irregular kind you have to memorize. The regular kind you can build on the fly.

The claim of the connectionists is that they’re all really the same kind of thing.
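The regular rule above can be written out as a small function. This sketch makes two assumptions not spelled out on the slide: "-ed" is read as the extra-syllable form used after stems ending in t or d, and the final sound of each stem is supplied by hand, since English spelling does not reliably show it. The set of voiceless sounds is a rough listing for illustration.

```python
# Rough, hand-listed set of voiceless final sounds (an assumption for this sketch).
VOICELESS = {"p", "k", "f", "s", "sh", "ch", "th"}


def regular_past_suffix(final_sound):
    """Pick the regular past tense ending from the stem's final sound."""
    if final_sound in {"t", "d"}:
        return "-ed"  # same place as the suffix consonant: add a syllable
    if final_sound in VOICELESS:
        return "-t"   # voiceless stem ending: voiceless suffix
    return "-d"       # voiced consonant or vowel: voiced suffix


# Stems from the slides, each paired with its (hand-coded) final sound.
stems = {"walk": "k", "kick": "k", "tow": "ow", "rub": "b", "melt": "t", "right": "t"}
suffixes = {verb: regular_past_suffix(sound) for verb, sound in stems.items()}
print(suffixes)
```

Because the function looks only at the final sound, it applies equally well to a stem it has never seen, which is what "build on the fly" amounts to.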

Subregularities

The thing is, the “irregulars” have some similarities as well, or at least can be grouped. (sing/sang, ring/rang, sit/sat, …)

This has the feel of the kind of thing an associationist network would be good at: seeing similarities, regularities and subregularities.

He goed.

One of the arguments in favor of rules in morphology is the “overregularization” that kids are seen to do.
  For a while they seem to get things right, including irregulars (“he went”).
  But then they start saying things like “he goed” (over-regularizing).
  Eventually, they get it right again.

Idea: In that middle step, they learned the rule. In the first step, they’d just memorized everything.

RM

The connectionist model (taking advantage of the fact that regulars are also overwhelmingly common) will also tend to over-regularize for a while. Like kids?

There were various problems in the RM model, amply documented by Pinker & Prince (1988). Among them, the apparent overregularization RM saw seemed to be more a function of the input changing.

Generalization

Eventually RM’s model could come up with the right answers for everything it was trained on. Then, the test is: What will it do with novel words? Will it do what people do?

72 new regular verbs.

Generalization in RM

6: refused to answer
  jump, pump, soak, warm, trail, glare
4: grossly bizarre (not a human-type mistake)
  squat/squakt, mail/membled, tour/toureder, mate/maded
7: double-marked
  type/typeded, snap/snappeded, smoke/smokeded
…

This is a pretty poor simulation of adult knowledge.

Dual mechanisms

Pinker and various others have championed a version of morphology in which there are both rules and an associationist network.

Rules are for regulars.
Network (memory) is for irregulars.
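The division of labor can be sketched as a lookup with a fallback. This is a deliberately crude illustration, not Pinker's actual model: "memory" is a plain dictionary rather than an associative network, and the regular rule works on spelling only (it ignores consonant doubling, as in rub/rubbed). The names `IRREGULARS`, `regular_rule`, and `past_tense` are invented for the example.

```python
# Memorized irregular forms (examples taken from the slides).
IRREGULARS = {"break": "broke", "sing": "sang", "light": "lit",
              "grow": "grew", "go": "went"}


def regular_rule(stem):
    """Spelling-only version of the regular rule (simplified assumption)."""
    return stem + ("d" if stem.endswith("e") else "ed")


def past_tense(stem, memory=IRREGULARS):
    """Use a memorized form if there is one; otherwise apply the rule."""
    if stem in memory:
        return memory[stem]
    return regular_rule(stem)


print(past_tense("sing"))           # memorized irregular: sang
print(past_tense("walk"))           # built by rule: walked
print(past_tense("blick"))          # novel word, rule still applies: blicked
print(past_tense("go", memory={}))  # memory lookup fails: goed
```

The last call shows how an overregularization such as "goed" falls out of this picture: whenever the memorized form is not retrieved, the rule applies by default.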

Logical problem of language acquisition

The grammar that people end up with is very complicated, and underdetermined by the data.

The main argument for this (“poverty of the stimulus”) is that there are many generalizations that a kid could make on the basis of the input data that would be wrong, that would not result in a language that conforms to the principles that we’ve discovered seem to hold true of all adult languages.

Language is really complicated

1) Frasier threw out Martin’s chair.
2) Frasier threw Martin’s chair out.
3) Daphne walked out the door.
4) *Daphne walked the door out.

5) What did Roz say Niles bought?
6) What did Roz say that Niles bought?
7) Who did Roz say bought an espresso doppio?
8) *Who did Roz say that bought an espresso doppio?

Language is really complicated

9) His mother thinks Bill is a genius.

10) He thinks Bill is a genius, too.

11) Mary saw her.

12) Mary saw her duck.

13) I asked Mary to buy rum.

14) What did you ask Mary to buy?

15) I saw the book about snakes on the table.

16) *What did you see the book about on the table?

Language is really complicated

John ate.
John ate a fish.
John is too clever to catch.
John is too clever to catch a fish.

Who does Arnold wanna make breakfast for?
*Who does Arnold wanna make breakfast?

Do you know what that’s doing up there?
*Do you know what that’s up there?

Yet people know this stuff…

Adult native speakers uniformly and overwhelmingly agree.

To know English is to have knowledge of (how to determine) which sentences are possible and which are impossible in English.

How one comes to have this knowledge is going to be our primary focus.

Grammar

People eventually end up with a system with which they can produce (and rate) sentences: a grammar.

Even if a native speaker of English has never heard either of these sentences before, s/he knows which one is possible in English and which one isn’t:

15) Eight very adept sea lions played trombones.
16) Eight sea lions very adept trombones played.

How do people know this?

Every native speaker of English knows these things.

Nobody who speaks English as a first language was explicitly taught (growing up) “You can’t question a subject in a complement embedded with that” or “You can’t use a proper name if it’s c-commanded by something coindexed with it.”

Trying to use any simple kind of general learning principle based on (analogy to) the sentences you get seems almost sure to lead you astray.

That’s the setup

Language involves a complex grammar.

Adults end up with knowledge of this grammar, quite uniformly.

Children seem to go through advancing stages of language sophistication; they are learning, the end result being the adult language system.

Next question: What is the nature of the children’s learning?

Linguists

As linguists trying to figure out the grammatical system of a language, we…
  Look at which sentences are grammatical
  Look at which sentences are ungrammatical
  Compare them to describe generalizations about what the crucial factors are differentiating the grammatical from the ungrammatical.
  Check the predictions of the hypothesized generalization by looking at more complex sentences.

Are kids just little linguists?

Kids are not just little linguists.

*What did you see the book about on the table?
*Who did Mary say that bought coffee?
Eight very adept sea lions played trombones.

Linguists’ theories: built by considering both grammatical and ungrammatical sentences (often of a fairly complex type).

Kids: Don’t hear ungrammatical sentences, nor even all of the grammatical sentences (often of a simpler type).

So how do they do it?

One hypothesis suggests that parents actually help kids along (though not consciously).

It’s well known that people seem to instinctively talk to little kids in kind of a weird way: exaggerated intonation, simpler words, more repetition. “Baby talk” or, as it is sometimes known, “Motherese”.

Many have entertained the idea that this simpler, more carefully articulated speech might guide kids along the path of language acquisition.

Some properties of “Motherese”

  Slower speech, longer pauses
  Higher pitch, greater pitch range
  Exaggerated intonation and stress
  More varied loudness
  Fewer disfluencies
  More restricted vocabulary
  More rephrasings
  More repetitions
  Shorter, less complex utterances
  More imperatives and questions
  Fewer complex (multiclause) sentences

Does “Motherese” drive acquisition?

Initially tempting, perhaps, but no.

If “Motherese” were crucial for acquisition, it would have to be available to all language acquirers, universally.

Several documented cultures don’t even speak to the kids until they reach linguistic sophistication. (Of course, they’re exposed to language in the environment, but not directed at them in “Motherese”.)

Does “Motherese” drive acquisition?

If you give a 4-month-old the choice of whether to listen to “Motherese” or to normal adult-directed speech, the kid will choose to listen to “Motherese”…

…so it is quite likely that “Motherese” forms a significant part of the PLD (primary linguistic data) for the kid, but it can’t be necessary for successful language acquisition.

Simpler isn’t really better

Linguists look to complex sentences to differentiate between predictions of different hypotheses about how the grammar works.

Generally, prior to considering complex sentences, the data underdetermines the grammar; there are (at least) two systems compatible with the data observed so far.

If linguists need to look to complex sentences to figure out the intricacies of the rules (which all adult native speakers seem to end up with), kids should need this information too.

Positive and negative evidence

Kids need to know the grammatical system by the time they are adults.

Kids hear grammatical sentences (positive evidence).

Kids are not told which sentences are ungrammatical (no negative evidence).

Let’s consider no negative evidence further…

Negative evidence

Negative evidence (information that a given sentence is ungrammatical) could come in various conceivable forms.
  “The sentence Bill a cookie ate is not a sentence in English, Timmy. No sentence with SOV word order is.”
  Upon hearing Bill a cookie ate, an adult might
    Offer negative reinforcement
    Not understand
    Look pained
    Rephrase the ungrammatical sentence grammatically

Kids resist instruction…

McNeill (1966)
  Nobody don’t like me.
  No, say ‘nobody likes me.’
  Nobody don’t like me.
  [repeats eight times]
  No, now listen carefully; say ‘nobody likes me.’
  Oh! Nobody don’t likes me.

Kids resist instruction…

Braine (1971)
  Want other one spoon, daddy.
  You mean, you want the other spoon.
  Yes, I want other one spoon, please Daddy.
  Can you say ‘the other spoon’?
  Other…one…spoon
  Say ‘other’
  Other
  ‘Spoon’
  Spoon
  ‘Other spoon’
  Other…spoon. Now give me other one spoon?

Kids resist instruction…

Cazden (1972) (observation attributed to Jean Berko Gleason)
  My teacher holded the baby rabbits and we patted them.
  Did you say your teacher held the baby rabbits?
  Yes.
  What did you say she did?
  She holded the baby rabbits and we patted them.
  Did you say she held them tightly?
  No, she holded them loosely.

Negative evidence via feedback?

Do kids get “implicit” negative evidence?

Do adults understand grammatical sentences and not understand ungrammatical ones?

Do adults respond positively to grammatical sentences and negatively to ungrammatical ones?

Approval or comprehension?

Brown & Hanlon (1970):
  Adults understood 42% of the grammatical sentences.
  Adults understood 47% of the ungrammatical ones.
  Adults expressed approval after 45% of the grammatical sentences.
  Adults expressed approval after 45% of the ungrammatical sentences.

This doesn’t bode well for comprehension or approval as a source of negative evidence for kids.

Kids’ experience differs

Parents respond differently.
  Eve & Sarah’s parents ask clarification questions after ill-formed wh-questions.
  Adam’s parents ask clarification after well-formed wh-questions… and after past tense errors.

How can kids figure out what correlates with grammaticality in their situation?

Kids’ experience differs

Piedmont Carolinas: Heath (1983):
  Trackton adults do not see babies or young children as suitable partners for regular conversation…[U]nless they wish to issue a warning, give a command, provide a recommendation, or engage the child in a teasing exchange, adults rarely address speech specifically to young children.

Feedback disappears

Adam and Sarah showed almost no reply contingencies after age 4…

But they still made errors after age 4.

And they still stopped making those errors as adults (learning didn’t cease).

Three possible types of feedback

Complete: consistent response, indicates unambiguously “grammatical” or “ungrammatical.”

Partial: if there is a response, it indicates “grammatical” or “ungrammatical.”

Noisy: response given to both grammatical and ungrammatical sentences, but with different/detectable frequency.

Statistics (from Marcus 1993)

Suppose response R occurs 20% of the time for ungrammatical sentences, 12% of the time for grammatical sentences.

Kid gets response R to utterance U: there’s a 63% chance (20/32) that U is ungrammatical. Guess: ungrammatical, but 38% chance of being wrong.

Kid doesn’t get response R: 52% chance (88/168) it’s grammatical. Guess: grammatical, but 48% chance of being wrong.
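The 20/32 and 88/168 figures follow from Bayes’ rule, under the assumption (implicit in the fractions) that the kid’s utterances are grammatical and ungrammatical equally often. A minimal sketch of the arithmetic; the function name `posteriors` and the parameter `prior_ungram` are made up for this example:

```python
from fractions import Fraction


def posteriors(p_r_ungram, p_r_gram, prior_ungram=Fraction(1, 2)):
    """Return (P(ungrammatical | R), P(grammatical | no R)) by Bayes' rule."""
    prior_gram = 1 - prior_ungram
    # Probability of each kind of utterance co-occurring with R, or with no R.
    r_and_ungram = p_r_ungram * prior_ungram
    r_and_gram = p_r_gram * prior_gram
    no_r_and_ungram = (1 - p_r_ungram) * prior_ungram
    no_r_and_gram = (1 - p_r_gram) * prior_gram
    ungram_given_r = r_and_ungram / (r_and_ungram + r_and_gram)
    gram_given_no_r = no_r_and_gram / (no_r_and_gram + no_r_and_ungram)
    return ungram_given_r, gram_given_no_r


# R follows 20% of ungrammatical and 12% of grammatical sentences.
ungram_given_r, gram_given_no_r = posteriors(Fraction(20, 100), Fraction(12, 100))

print(ungram_given_r, float(ungram_given_r))    # 5/8 0.625
print(gram_given_no_r, float(gram_given_no_r))  # 11/21, about 0.524
```

If ungrammatical utterances are in fact much rarer than grammatical ones, a smaller `prior_ungram` lowers the first figure further, so response R becomes even less informative.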

Statistics (from Marcus Statistics (from Marcus 1993)1993)

Suppose response Suppose response RR occurs occurs 20%20% of the time of the time for ungrammatical sentences, for ungrammatical sentences, 12%12% of the of the time for grammatical sentences.time for grammatical sentences.

Suppose kid got response R to U, and is 63% confident that U is ungrammatical. OK, but nowhere near good enough to build a grammar.

This is a serious task; a kid’s going to want to be sure. Suppose kid is aiming for 99% confidence (adults make at most 1% speech errors of the relevant kind; pretend this reflects 99% confidence).

Lacking confidence

Based on R (20%-12% differential), they’d have to repeat U 446 times (and compile feedback results) to reach a 99% confidence level.

Based on various studies on noisy feedback, a realistic range might be from 85 times (for a 35%-14% differential) to 679 times (for a 11.3%-6.3% differential).

This sounds rather unlike what actually happens.

In a way, it’s moot anyway…

One of the striking things about child language is how few errors children actually make.

For negative feedback to work, the kids have to make the errors (so that they can get the negative response).

But they don’t make enough relevant kinds of errors to determine the complex grammar.

Yes-no questions

17) The man is here.
18) Is the man here?

Hypothesis 1: Move the first is (or modal, auxiliary) to the front.

Hypothesis 2: Move the first is after the subject noun phrase to the front.

19) The man who is here is eating dinner.

Yes-no questions

19) The man who is here is eating dinner.
20) *Is the man who here is eating dinner? (*H1)
21) Is the man who is here eating dinner? (√H2)

No kid’s ever said (20) to mean (21), which would have been necessary to distinguish hypotheses 1 and 2… Why not?

It seems that kids don’t even entertain Hypothesis 1.

And that’s fine, because it seems like Hypothesis 1 is a kind of rule not found in any adult language.
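To make the contrast between the two hypotheses concrete, here is a minimal Python sketch. The function names and the toy auxiliary list are illustrative assumptions, and Hypothesis 2 is handed the subject noun phrase boundary by the caller, standing in for the syntactic structure the rule depends on.

```python
AUXILIARIES = {"is"}  # toy inventory; the slides also mention modals


def hypothesis_1(words):
    """Linear rule: front the first auxiliary in the word string."""
    i = next(k for k, w in enumerate(words) if w in AUXILIARIES)
    return [words[i]] + words[:i] + words[i + 1:]


def hypothesis_2(subject_np, rest):
    """Structural rule: front the first auxiliary after the subject NP.

    The caller supplies the subject NP, which stands in for the
    structural analysis the rule needs (an assumption of this sketch).
    """
    i = next(k for k, w in enumerate(rest) if w in AUXILIARIES)
    return [rest[i]] + subject_np + rest[:i] + rest[i + 1:]


subject = "the man who is here".split()
rest = "is eating dinner".split()

h1 = " ".join(hypothesis_1(subject + rest))
h2 = " ".join(hypothesis_2(subject, rest))
print(h1)  # is the man who here is eating dinner   -> (20)
print(h2)  # is the man who is here eating dinner   -> (21)
```

On a simple sentence like (17) the two rules give the same output, which is why simple questions cannot tell them apart.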

Abstract principles

Principle C: Nothing coreferential can c-command a proper name.

*He_i believes John_i’s teacher.

His_i teacher believes John_i.

Study of adult grammar reveals that c-command is the appropriate abstract notion, defined on syntactic structures. But how do kids learn about c-command? You can’t hear c-command.

What’s more, study of adult grammar reveals that Principle C holds in every language!
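A small sketch of why c-command is a structural notion rather than something audible. It assumes binary-branching trees written as nested tuples, and the bracketings of the two example sentences are simplified assumptions made for illustration. Under those assumptions, a word c-commands everything dominated by its sister.

```python
def find_path(tree, leaf, path=()):
    """Return the path (child indices) from the root to `leaf`, or None."""
    if tree == leaf:
        return path
    if isinstance(tree, tuple):
        for i, child in enumerate(tree):
            found = find_path(child, leaf, path + (i,))
            if found is not None:
                return found
    return None


def c_commands(tree, a, b):
    """True if word `a` c-commands word `b`.

    With branching-only trees, this holds when the node immediately
    above `a` also dominates `b` (and `a` and `b` are distinct words).
    """
    pa, pb = find_path(tree, a), find_path(tree, b)
    if pa is None or pb is None:
        raise ValueError("both words must occur in the tree")
    if pa == pb or not pa:
        return False
    return pb[:len(pa) - 1] == pa[:-1]


# Assumed, simplified bracketings of the two examples.
t_bad = ("He", ("believes", (("John", "'s"), "teacher")))
t_ok = (("His", "teacher"), ("believes", "John"))

print(c_commands(t_bad, "He", "John"))   # True: Principle C violated
print(c_commands(t_ok, "His", "John"))   # False: coreference allowed
```

In both sentences the pronoun precedes the name, so word order alone cannot distinguish them; only the tree does.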

Kids don’t make as many mistakes as would be needed for hypothesis testing.

Kids seem to receive no relevant negative evidence while learning language anyway.

Kids learn fast.

Kids become adults with all the grammatical knowledge pertaining thereto (uniform, highly complex).

Kids come to know abstract principles (like Principle C) without access to evidence determining them. In many cases, these principles are observed in all human languages. “Poverty of the stimulus”

So, we’ve got…

Having language = being human

A linguistic capacity is part of being human.

Like having two arms, ten fingers, a vision system, humans have a language faculty.

Specification of having arms instead of wings, etc., is somehow encoded genetically.

Structure of the language faculty is predetermined, like the structure of the vision system is.

The language faculty (tightly) constrains what kinds of languages a child can learn.

= “Universal Grammar” (UG).

Universal Grammar

UG tightly constrains the learning process.

Study of syntax, phonology, etc., is generally trying to uncover properties of Language, to specify what kinds of languages a child can learn, to see what kinds of restrictions UG places on language.

But kids don’t just enter the world speaking like adults: there’s development.

And adults don’t all end up speaking the same language: there is learning.

Learnability

The Principles & Parameters model is designed to address the learnability problem we faced:

Languages are very complex.

Languages differ (something has to be learned).

Children get insufficient and variable evidence to deduce the uniform rules of grammar they end up with.

Children have adult-like grammars relatively quickly.

The proposed solution to the apparent paradox is to suppose that to a large extent all human languages are the same. The grammatical systems obey the same principles in all human languages.

[Diagram labels: UG, Japanese, English]

Principles and Parameters

Languages differ, but only in highly limited ways.

In the order between the verb and the object.

In whether the verb raises to tense.

…


Principles and Parameters

This reduces the task for the child immensely: all that the kid needs to do is to determine from the input which setting each of the parameters needs to have for the language in his/her environment.


Principles and Parameters

The standard picture

The way this is usually drawn schematically is like this. The Primary Linguistic Data (PLD) serves as input to a Language Acquisition Device (LAD), which makes use of this information to produce a grammar of the language being learned.

[Diagram: PLD → LAD → grammar]

The standard picture

This isolates the innately specified language faculty into a single component in the picture.

The LAD contains (a specification for) all of the principles and the parameters, and has a procedure for going from PLD to parameter settings.


We may be able to avoid confusion later, though, if we differentiate the innately provided system into its conceptual components.

This is my rendition of a way to think about UG, parameters, and LAD.

[Diagram labels: LAD, PLD, UG, Subjacency, Binding Theory]

Modeling human language capacity

UG provides the parameters and contains the grammatical system (including the principles, like Subjacency, Binding Theory, etc.) that makes use of them.

LAD sets the parameters based on the PLD. Responsible for getting language to kids.


Modeling human language capacity

The idea behind this diagram is that UG is something like the shape of language knowledge.

Knowledge of language can only take a certain, innately pre-specified “shape”.

A system with this “shape” has certain properties, among them Binding Theory, Subjacency, … the Principles.


Modeling human language capacity

The Parameters are different ways in which stored knowledge can conform to the “shape” of UG.

The LAD is a system which analyzes the PLD and sets the parameters.


Modeling human language capacity

So two languages which differ with respect to one parameter setting might be represented kind of like this.

This is of course a cartoon view of things, but perhaps it might be useful later.

[Diagram labels: Language A, Language B]

Principles and Parameters

So what are the Principles and Parameters?

Good question! And that’s what theoretical linguistics is all about.

Since 1981, many principles and parameters have been proposed. As our understanding of language grows, new evidence comes to light, and previous proposals are discarded in favor of better motivated ones. It’s hard to keep a current tally of “the principles we know of” because of the active nature of the field.

Principles and Parameters

Some of the (proposed) Parameters that have received a fair amount of press are:

Bounding nodes for Subjacency

Binding domain for anaphors and pronouns

Verb-object order

Overt verb movement (V moves to tense)

Allowability of null subject (pro) in tensed clauses

We’ll look at each of them in due course…

Verb-object order

The parameter for verb-object order (more generally, the “head parameter” setting out the order between X-bar-theoretic head and complement) comes out as:

Japanese: Head-final (X follows complement)

English: Head-initial (X precedes complement).

Figuring out which type the target language is is often fairly straightforward. Kids can hear evidence for this quite easily. (Not trivial, though: consider German SOV-V2)
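As a rough illustration of what a single head parameter buys, the sketch below linearizes the same abstract structure under both settings. The nested `(head, complement)` encoding and the category labels are assumptions made for this example, and it deliberately ignores complications such as German V2.

```python
def linearize(phrase, head_initial):
    """Flatten a nested (head, complement) structure into a word order.

    A phrase is either a string (a single word or category label) or a
    (head, complement) pair; head_initial is the parameter setting.
    """
    if isinstance(phrase, str):
        return [phrase]
    head, complement = phrase
    h = linearize(head, head_initial)
    c = linearize(complement, head_initial)
    return h + c if head_initial else c + h


# A verb whose complement is a PP, whose complement is an NP.
vp = ("V", ("P", "NP"))

english_like = linearize(vp, head_initial=True)
japanese_like = linearize(vp, head_initial=False)
print(english_like)   # ['V', 'P', 'NP']
print(japanese_like)  # ['NP', 'P', 'V']
```

One binary choice flips the order at every level of the structure, which is why a small amount of input can fix it.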

Principle A

22) Sam believes [that Harry overestimates himself]

23) Sam-wa [Harry-ga zibun-o tunet-ta to] it-ta
Sam-top Harry-nom self-acc pinch-past-that say-past
‘Sam said that Harry pinched him(self).’

Principle A

Principle A. A reflexive pronoun must have a higher antecedent in its binding domain.

Parameter: Binding Domain

Option (a): domain = smallest clause containing the reflexive pronoun

Option (b): domain = utterance containing the reflexive pronoun

But how can you set this parameter?

Every sentence a kid learning English hears is consistent with both values of the parameter!

If a kid learning English decided to opt for the “utterance” version of the domain parameter, nothing would ever tell the kid s/he had made a mistake.

S/he would end up with non-English intuitions.

But how can you set this parameter?

A kid learning Japanese can tell right away that their domain is the utterance, since they’ll hear sentences where zibun refers to an antecedent outside the clause.

But how can you set this parameter?

The set of sentences allowed in English is a subset of the set of sentences allowed in Japanese. If you started assuming the English value, you could learn the Japanese value, but not vice-versa.

Sentences allowed in Japanese (domain = utterance)

Sentences allowed in English (domain = clause)

Subset principle/defaults

Leads to: The acquisition device selects the most restrictive parametric value consistent with experience. (Subset principle)

That is, for the Principle A domain parameter, you (a LAD) start assuming you’re learning English and switch to Japanese only if presented with evidence.
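A toy sketch of the Subset principle for the binding domain parameter. It assumes each input sentence is reduced to one label, "local" (antecedent inside the reflexive's clause) or "long" (antecedent outside it); that encoding and the names below are illustrative assumptions, not part of the theory.

```python
# What each setting of the binding-domain parameter allows.
ALLOWED = {
    "clause": {"local"},             # English-type setting (the subset)
    "utterance": {"local", "long"},  # Japanese-type setting (the superset)
}


def learn(start, observations):
    """Positive-evidence-only learner: change setting only on an error."""
    setting = start
    for obs in observations:
        if obs not in ALLOWED[setting]:
            # The current setting cannot generate this sentence, so move
            # to a setting that can.
            setting = next(s for s, ok in ALLOWED.items() if obs in ok)
    return setting


english_input = ["local", "local", "local"]
japanese_input = ["local", "long", "local"]

print(learn("clause", english_input))      # clause
print(learn("clause", japanese_input))     # utterance
print(learn("utterance", english_input))   # utterance (never corrected)
```

The last call shows why the default matters: starting from the superset setting, English input never produces an error, so the learner never retreats.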

What it takes to set a parameter

Binding domain parameter

Option (a): Binding domain is clause.

Option (b): Binding domain is utterance.

English = option a, Japanese = option b.

[Diagram: sets E (English) and J (Japanese)]

What it takes to set a parameter

Binding domain parameter

Kids should start under the assumption that the parameter has the English setting.

If they hear only English sentences, they will stick with that setting.

If they hear Japanese sentences, they will have evidence to move to the Japanese setting.


What it takes to set a parameter

Null subject parameter

Option (a): Null subjects are permitted.

Option (b): Null subjects are not permitted.

Italian = option a, English = option b.

[Diagram: sets E (English) and I (Italian)]

Very sensible. Now, let’s consider another parameter of variation across languages.

What it takes to set a parameter

The Subset principle says that kids should start with the English setting and learn Italian if the evidence appears.

But even English kids are well known to drop subjects early on in acquisition. As if they had the Italian setting for this parameter.


Moreover…

English kids hear “looks good” and “seems ok” and “stop that right now”. Why don’t they end up speaking Italian? If they mis-set the parameter, how could they ever recover?

Italian kids hear subjectless sentences. Why don’t they interpret them as imperatives or fragments (so as not to have to change the parameter from the default)?

Triggers

It seems like actual occurrence of null subjects isn’t a very good clue as to whether a language is a null subject language or not.

Are there better clues? If a strapping young LAD were trying to set the null subject parameter, what should it look for?

Triggers

Turns out: Only true subject-drop languages allow null subjects in tensed embedded clauses.

24) *John knows that [__ must go]. (English)

25) Juan sabe que [__ debe ir]. (Spanish)
‘Juan knows that [he] must go.’

Perhaps the LAD “knows” this and looks for exactly this evidence. Null subjects in embedded tensed clauses would be a trigger for the (positive setting of the) null subject parameter.
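The trigger idea can be sketched as a filter on the input: the learner ignores subjectless main clauses and reacts only to a null subject in a tensed embedded clause. The clause records and field names below are hypothetical, invented for this illustration.

```python
def null_subject_setting(clauses, default=False):
    """Return True (null subjects permitted) only if the trigger is seen.

    Each clause is a dict with keys 'subject' (None if missing),
    'tensed' and 'embedded'; this record format is an assumption.
    """
    for clause in clauses:
        if clause["embedded"] and clause["tensed"] and clause["subject"] is None:
            return True  # the trigger: null subject in a tensed embedded clause
    return default


# "Looks good." / "Stop that right now!" are subjectless main clauses.
english_input = [
    {"subject": None, "tensed": True, "embedded": False},
    {"subject": None, "tensed": False, "embedded": False},
    {"subject": "John", "tensed": True, "embedded": False},
]
# "Juan sabe que [__ debe ir]" contributes a null-subject embedded clause.
spanish_input = [
    {"subject": "Juan", "tensed": True, "embedded": False},
    {"subject": None, "tensed": True, "embedded": True},
]

print(null_subject_setting(english_input))  # False
print(null_subject_setting(spanish_input))  # True
```

Subjectless main clauses in English never flip the setting here, which is how a trigger would avoid the mis-setting problem raised earlier.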

Triggers

A potential problem with the proposed subject-drop trigger is that it requires complex sentences: you need to look at an embedded sentence to check for the trigger.

Such sentences might be too complicated for kids to process.

Degree-1 learnability: Triggers need look no lower than 1 level of embedding.

Degree-0 learnability: Triggers need look only at main clauses.

Triggers

Many who work on learnability have adopted the hypothesis that triggers need to be degree-0 learnable.

Subjacency. *[wh … [α … [β … t … ] ] ], where α and β are bounding nodes.

Bounding node parameter for IP:

Option (a): IP is a bounding node (English).

Option (b): IP is not a bounding node (French, Italian).

(IP and TP are often used interchangeably.)

Triggers

Thus, a kid learning French couldn’t choose option (b) by hearing this…

28) Voilà une liste de gens… ‘there is a list of people…’

[à qui on n’a pas encore trouvé [quoi envoyer t t ]]
[to whom one has not yet found [what to send]]

…since that’s a degree-2 trigger. But…

Triggers

29) Combien as- [IP tu vu [NP t de personnes]]?
How-many have you seen of people
‘How many people did you see?’

If IP were a bounding node, this should be ungrammatical in French, so this can serve as (degree-0) evidence for option (b).

Triggers

Principles are part of UG

Parameters are defined by UG

Triggers for parameter settings are defined as part of the LAD.

Navigating grammar spaces

Regardless of the technical details, the idea is that in the space of possible grammars, there is a restricted set that corresponds to possible human grammars.

Kids must in some sense navigate that space until they reach the grammar that they’re hearing in the input data.

Learnability

So how do they do it?

Where do they start?

What kind of evidence do they need?

How much evidence do they need?

Research on learnability in language acquisition has concentrated on these issues.

Are we there yet?

There are a lot of grammars to choose from, even if UG limits them to some finite number.

Kids have to try out many different grammars to see how well they fit what they’re hearing.

We don’t want to require that kids remember everything they’ve ever heard, and sit there and test their current grammar against the whole corpus of utterances; that’s a lot to remember.

Are we there yet?

We also want the kid, when they get to the right grammar, to stay there.

Error-driven learning

Most theories of learnability rely on a kind of error-detection.

The kid hears something, it’s not generable by their grammar, so they have to switch their hypothesis, to move to a new grammar.

Plasticity

Yet, particularly as the navigation progresses, we want them to be zeroing in on the right grammar.

Finding an error doesn’t mean that you (as a kid) should jump to some random other grammar in the space.

Generally, you want to move to a nearby grammar that improves your ability to generate the utterance you heard: move in baby steps.

Triggers

Gibson & Wexler (1994) looked at learning word order in terms of three parameters (head, spec, V2).

Their triggering learning algorithm says if you hear something you can’t produce, try switching one parameter and see if it helps. If so, that’s your new grammar. Otherwise, stick with the old grammar and hope you’ll get a better example.
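The update rule just described can be sketched in a few lines of Python. This is a toy rendering, not Gibson & Wexler's actual grammar space: a grammar is a tuple of three binary parameters, and each input sentence is encoded as the parameter values it requires. That encoding, and the names `parses` and `tla_step`, are assumptions made for illustration.

```python
import random


def parses(grammar, sentence):
    """Toy stand-in for a parser.

    A sentence is a dict {parameter_index: required_value}; the grammar
    can produce it if every requirement is met.
    """
    return all(grammar[i] == v for i, v in sentence.items())


def tla_step(grammar, sentence, rng):
    """One step: on an error, flip ONE random parameter, keep it if it helps."""
    if parses(grammar, sentence):
        return grammar  # no error, no change
    i = rng.randrange(len(grammar))
    candidate = grammar[:i] + (1 - grammar[i],) + grammar[i + 1:]
    return candidate if parses(candidate, sentence) else grammar


rng = random.Random(0)
# Input from a target grammar (1, 0, 1); each sentence reveals one parameter.
target_input = [{0: 1}, {1: 0}, {2: 1}]
grammar = (0, 0, 0)
for step in range(300):
    grammar = tla_step(grammar, target_input[step % 3], rng)
print(grammar)
```

Because the learner changes only on errors and only one parameter at a time, it stays put once it reaches a grammar that handles all the input.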

Local maxima

A problem they encountered is that there are certain places in the grammar space where you end up more than one switch away from a grammar that will produce what you hear.

This is locally as good as it gets (nothing next to it in the grammar space is better), yet if you consider the whole grammar space, there is a better fit somewhere else; you just can’t get there with baby steps.

Local maxima

This is a point where any move you make is worse, so a conservative algorithm will never get you to the best place. Something a working learning algorithm needs to avoid. (And kids, after all, make it).
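A toy illustration of getting stuck, under the same kind of simplified encoding (a grammar is a tuple of binary parameters, a sentence is the set of parameter values it requires). The encoding is an assumption made for this sketch and is not Gibson & Wexler's actual grammar space. Here the input needs two parameters changed at once, so no single switch ever helps.

```python
import random


def parses(grammar, sentence):
    """Toy parser: sentence is {parameter_index: required_value}."""
    return all(grammar[i] == v for i, v in sentence.items())


def tla_step(grammar, sentence, rng):
    """On an error, flip one random parameter and keep it only if it helps."""
    if parses(grammar, sentence):
        return grammar
    i = rng.randrange(len(grammar))
    candidate = grammar[:i] + (1 - grammar[i],) + grammar[i + 1:]
    return candidate if parses(candidate, sentence) else grammar


rng = random.Random(0)
stuck = (0, 0, 1)
sentence = {0: 1, 1: 1}  # needs parameters 0 AND 1 switched

# Every grammar one switch away still fails on this sentence.
neighbours = [stuck[:i] + (1 - stuck[i],) + stuck[i + 1:] for i in range(3)]
any_neighbour_helps = any(parses(n, sentence) for n in neighbours)
print(any_neighbour_helps)  # False

for _ in range(1000):
    stuck = tla_step(stuck, sentence, rng)
print(stuck)  # (0, 0, 1): no amount of input moves the learner
```

A grammar two switches away, `(1, 1, 1)`, would handle the sentence, but the one-step rule can never reach it from here.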