Semantics and Pragmatics: Introduction

John Barnden

School of Computer Science, University of Birmingham

Natural Language Processing 1, 2010/11 Semester 2

In the following slides,

[R] = you Read it later – I won’t go through it.

But you Really do need to read it.

[O] = optional material.

Aims of This Section

• Give you an idea of what semantics and pragmatics are in general.

• Cover some specific semantic and pragmatic issues, including aspects of pronoun reference.

• Cover the concept of semantic compositionality and its link to syntactic compositionality, and consider ways it can fail to exist (some idioms, some metaphor, ...).

• Introduce one scheme for constructing semantic representations from syntax trees, via “quasi-logical forms”.

• And, through all of the above, give you info you’ll need for the exam!

Contributions to Meaning

• “The number three is the smallest odd prime number.” Said to us by Mary.

• We can envisage getting an understanding just by suitably putting together the meanings of the individual words, in a way that’s guided by the syntax tree.

  So individual word meanings and syntax make contributions to meaning of whole sentences, and these contributions are in some sense local to the sentence itself. What else?

• “That number is the smallest odd prime number.” Said to us by Mary, pointing at the blackboard.

  What number? We need knowledge of the discourse situation (the situation in which the discourse is taking place – also called the situational context), as well as general knowledge about blackboards and about how pointing works (dogs and very small children don’t know the latter), in order to know what specific number is at issue.

Contributions to Meaning, contd

• But do we need to know the specific number? Isn’t the following proposition at least one perfectly good meaning of the sentence?: The number Mary is pointing to, whatever it is, is the smallest odd prime number.

• So there are different types of meaning we could consider, differing as to specificity, richness, etc.

Contributions to Meaning, contd

• “That is the smallest odd prime.” Said to us by Mary, pointing at the blackboard.

  What number? Notice now that That is a (demonstrative) pronoun, not a demonstrative adjective as above. We don’t immediately know Mary is pointing at a number. If we have enough general knowledge about primes to know that they can be numbers (amongst other things), we can infer that what she’s pointing at is a number.

  Notice that smallest and odd help us interpret prime as a number. But primes can be orthographic marks and can be psychological stimuli, and can be odd in the sense of weird, and can be small in physical senses. So we have a problem with ambiguous word senses, and the possible need to use linguistic context to help us make choices.

  If we can see that the blackboard contains mainly numbers, or we already know that the discourse is about numbers, then of course we get great help. Discourse situation again.

Contributions to Meaning, contd

• “Robert is the tallest spy.”

  Who is Robert, and do we even need to know? Here, knowledge of the discourse situation, knowledge of the situation being described, and/or general knowledge about specific spies, could help.

• Let’s use the term world knowledge to cover those types of knowledge.

• “Robert McNamara and his assistants came to see me. Robert is the tallest spy.”

  At least we can be fairly sure that the second Robert is the same person as the first, although we may not know anything about the first. Again, use of linguistic context, and an assumption of discourse coherence between sentences.

• “Robert McNamara and his assistants came to see me. He is the tallest spy.” Similar comments, now concerning the He.

  Notice the linguistic knowledge that Robert is probably being used here as a person’s first name, and that as such it refers to a male person (usually).

Pronouns, contd

• “Robert and Bill came to see me. He is a fine fellow.”

  Which one is the fine fellow?

• “Robert came to see me. He brought Bill. He is a fine fellow.”

  Stronger tendency to regard the second He as referring to Bill (unless there’s pressure from context to think there’s something fine about bringing Bill). Tendency stronger in:

• “Robert and Mary came to see me. She brought Bill. He is a fine fellow.”

  So: recency of mention as well as the right gender, person (=third here) and number are significant in determining the antecedent of a pronoun.
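The recency-plus-agreement heuristic just described can be sketched in a few lines of Python. This is only an illustration, not a serious resolution algorithm: the Mention class, the PRONOUNS feature table and the resolve function are all hypothetical names made up for this sketch, and the sketch ignores the world-knowledge and inference effects discussed elsewhere in these slides.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Mention:
    text: str
    gender: Optional[str]  # "m", "f", or None when unknown
    number: str            # "sg" or "pl"


# Hypothetical feature table for a few third-person pronouns.
PRONOUNS = {
    "he": ("m", "sg"),
    "she": ("f", "sg"),
    "they": (None, "pl"),
}


def resolve(pronoun: str, mentions: List[Mention]) -> Optional[Mention]:
    """Return the most recently mentioned entity that agrees with the pronoun."""
    gender, number = PRONOUNS[pronoun.lower()]
    for m in reversed(mentions):  # most recent mention first
        if m.number != number:
            continue
        # An unknown gender on either side is treated as compatible.
        if gender is None or m.gender is None or m.gender == gender:
            return m
    return None


# "Robert and Mary came to see me. She brought Bill. He is a fine fellow."
history = [
    Mention("Robert", "m", "sg"),
    Mention("Mary", "f", "sg"),
    Mention("Bill", "m", "sg"),
]
print(resolve("she", history).text)
print(resolve("he", history).text)
```

With this history, “she” picks out Mary (the only agreeing mention) and “he” picks out Bill rather than Robert, purely because Bill was mentioned more recently.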

Pronouns, contd

• But what do “last mentioned” and “right person and number” mean? …

• “My boss came to see me. She’s a great administrator.”

  The first sentence doesn’t specify the gender. Rather, the pronoun is actually informative about the gender (assuming it does refer to the boss).

• “The Smiths came to dinner. He’s a cook and she’s an architect.”

  The mention of a man and a woman is only implicit, and indeed partly constituted by the he and she, which are thus somewhat informative. In other circumstances The Smiths could refer to an entirely male family or other group.

  We have inferred a man and a woman from The Smiths.

• And of course, a pronoun referent might merely be in the discourse context, rather than in the linguistic context. We saw this with the pronoun That above, but of course it can apply to he, she, etc. as well.

Pronouns and Anaphora

• Use of pronouns and names to refer to entities via previous linguistic context is a form of anaphora. The pronouns and names are anaphors. (There are other types of anaphor.)

  They can also refer to entities that are merely in the discourse situation or the described situation.

• More problems with pronoun anaphora:

• “The policeman was worried about the running man. He looked as though he might have a bomb.”

  “The policeman was worried about the running man. He thought he might be facing a terrorist situation.”

  It’s only quite elaborate general knowledge and inference, including about why a policeman might be worried, that allows us to interpret the He differently in these two cases.

Issues with Anaphora, etc., contd.

• “The policeman was worried about the running man. He looked as though he might have a bomb.”

  The definite description “The policeman” is somewhat similar to definite descriptions like that number.

• A definite description is an NP that uses a description and a definite determiner such as “the” or “that” or “John’s” to try to refer to a unique entity in context.

  When used to refer back via linguistic context, it’s an example of another type of anaphor.

  Or it may be that the referent is given by discourse situation, by described situation, or by general world knowledge.

Issues with Anaphora, contd.

• As with pronoun referents, the referent of a definite description may be implicit:

• “The teapot fell on the floor. The handle broke.”

  What handle? Another use of inference to achieve discourse coherence.

  Such inference to bridge between discourse chunks is called bridging inference.

• Similar examples:

• “Power companies have been scolded by the regulator. The bosses will be angry.”

  “Tom tried to cut his steak. But the knife, which was made of plastic, wasn’t up to the job.”

  “Tom tried to cut his steak. But the plastic knife wasn’t up to the job.”

  Notice that the plastic aspect doesn’t necessarily appeal to any existing knowledge – it can be informative.
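A very rough sketch of bridging inference for examples like the teapot one: if a definite description has no direct antecedent, look for a recently mentioned entity that is known to have such a part. The PARTS table stands in for general world knowledge and is entirely hypothetical, as is the function name resolve_definite; real bridging needs far richer knowledge and inference than a lookup table.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical fragment of world knowledge: typical parts of some entity types.
PARTS: Dict[str, Set[str]] = {
    "teapot": {"handle", "spout", "lid"},
    "steak dinner": {"knife", "fork", "plate"},
}


def resolve_definite(noun: str, mentioned: List[str]) -> Optional[Tuple[str, str]]:
    """Link "the <noun>" to earlier discourse, most recent mention first.

    Returns ("same", entity) for a direct antecedent, ("part-of", entity)
    for a bridging link, or None if nothing in the discourse fits.
    """
    for entity in reversed(mentioned):
        if entity == noun:
            return ("same", entity)
        if noun in PARTS.get(entity, set()):
            return ("part-of", entity)
    return None


# "The teapot fell on the floor. The handle broke."
print(resolve_definite("handle", ["teapot", "floor"]))
```

Here “the handle” has no direct antecedent, so the sketch bridges to the teapot via the part-of knowledge.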

I

• “I” is relatively straightforward, but still pretty complex!

• Unless the utterer is joking, imitating, acting, etc., it refers to the utterer.

• The utterer may not be the utterer of the overall sentence, but the utterer of a quoted expression.

• The understander may not know who the utterer (of a non-quoted utterance) is, and so may have to imagine one.

• The utterer is part of the discourse situation. The utterer, addressee, time and location of an utterance are considered especially significant aspects of the discourse situation.

• One account that has been suggested is to treat “I” by effectively replacing it by something like “the utterer of the present utterance”. But of course at some stage the understander often has to resolve this to a specific person.

• Note the metaphorical use of “I” as in a poster on a car that says “I’m for sale.” (More precisely, the car is being metaphorically viewed as an uttering person, licensing the use of “I”.)

[R] Normal You

• “You” is still relatively straightforward, but less so than “I”.

• For a start, of course, it’s ambiguous between singular or plural. The understander’s choice here could depend on linguistic context or world knowledge.

• “Could you lend me a pen”: probably singular.

• “You need to hand in your assignments by May 9th”: probably plural (even though one student could possibly have several assignments).

• When plural, the set of people meant may be unclear and/or vague.

[R] Generic You

• Generic use: “In Britain, when you go to the cinema, you often have to put up with noisy popcorn crunchers, sweet-packet crinklers and inconsiderate chatterers.”

  (My GOM-beef of the week.)

• NB: generic “you” is grammatically singular:

  * “If you blow your noses, hold your handkerchiefs to them”  [when not uttered to Zaphod Beeblebrox]

• Generic “you” is essentially synonymous with generic “one” (as in “When one goes to the cinema …”) but more informal, and probably also more strongly connotes that the situation described is one that the addressee(s) may be personally interested in.

• Notice cases like “You’re going down the street. You bump into someone. They immediately start shouting at you. What do you do?” Arguably this is not generic “you”: you’re being asked to imagine you yourself engaged in something. But there’s probably a gradation between non-generic and generic uses of “you.”

We

• Complex, vague, variable, highly context-dependent, though certainly includes the utterer (putting aside joking, acting, etc.).

• One division is between including the addressee(s) and excluding them:

  “We are going to create a new module in Astrological Computation”

  “We are going to study Fate and Music of the Spheres in this module.”

• Can include any number of people aside from utterer and addressees, even the whole of some large category of people, or even the whole of humanity:

  “We don’t yet have a reliable way of creating reliable large-scale IT systems.”

  “We in CS are constantly warning the Government about this.”

  “We’re affecting the climate in all sorts of ways.”

  “We are the only rational bipeds.”

• Exercise: “We are the champions.”

[R] They

• Much like “we”: Complex, vague, variable, highly context-dependent.

• Special colloquial use of “they” to mean roughly “people in authority over us or with authority about a topic” – as in “What are they going to inflict on us now”, “Do they know how to cure baldness yet?”.

• Special colloquial use of “they” as convenient replacement for “he and she”.

  NB: grammatically plural though semantically singular.

  Various people through history (including me!) have publicly proposed using “it” to cover “he, she or it”.

  My own motive is partly to cover future robots!

[R] Anaphora to Propositions, Events, etc.

• “The snake came in and curled itself round my leg. I could see its slobbery fangs. That’s the most frightening thing that’s happened to me.”

• “Oh, is that true? What about when you were mugged?”

• “It”, “those”, “these”, “they” can be used in similar ways.

• Computational work on anaphora hasn’t touched the above types very much.

• Extremely context-sensitive, and big opportunities for vagueness. The referent may be implicit over a large, ill-defined stretch of discourse.

[R] Special Cases / Exceptions / Weirdities

• Pleonastic “it” [very common]: “It’s raining”.

  Extraposed uses of “it” [v. common], as in “It’s Mary who has the prettiest pigtails.”

• Royal “we” by a (British) king or queen. Authorial “we”: single authors often say “we” to mean just themselves, especially in an academic paper, in a misguided avoidance of “I”. (But “we” OK if it includes reader.)

• “It” for baby or small child, in (at least) British English [somewhat old-fashioned; causes amazement in USA]

  “She” for boats and sometimes for other large, beautiful or impressive things. “She” used (at least at one time) by some male gays to refer to feminine-style partners. [Could be regarded as a case of metaphor.]

• Animals, especially pets, can be referred to by “he” or “she” as appropriate, but can also be referred to by “it” even when the gender is salient. Gendered pronouns can be used for dolls, gods and other things that are merely person-like in some way.

  Plants: not referred to by gendered pronouns even when their gender is salient?

So Far …

• We’ve seen that individual words and syntactic structure can at least contribute to meaning, aside from use of linguistic context, world knowledge (discourse situation, described situation or general knowledge) or inference.

• But meaning, or richer meaning, does often require use of context (= linguistic context or world knowledge), and inference that uses context.

• We’ve looked at how that arises in the case of referring expressions such as names, pronouns (including demonstrative ones) and definite descriptions.

• We now look at some other ways in which context needs to come in. (Not an exhaustive survey.)

Refinement of Meaning

• Recall previous discussion of “cut”.

  “Mary cut the cake / the grass / her hair / the string / her finger / the dress / the budget / …”

  If this is not a case of ambiguity (so that the task is selection from a pre-defined set of discrete forms of cutting), then our understanding of the particular form the cutting took involves a refinement of some (semi-)generic meaning.

  World knowledge and inference involving (at least) both the cutter and cuttee come into play.

• Similarly with “The apple is in that bowl over there” even when the apple is perched on top of a pile of fruit in the bowl, and not surrounded by the bowl volume. The meaning of “in” is very context-sensitive. “I’m in the swimming pool” allows part of me to be poking above the edge.

• Words like “here” and “there” are notoriously vague and context-sensitive. But they take a core of their meaning from the discourse situation.

Refinement of Meaning, contd

• “Robert is the tallest spy”:

  What’s the frame of reference for understanding “tallest” ? In the immediate discourse situation, or in the described situation, or in the whole world, or somewhere in between?

• Similarly with quantifiers:

• “When Robert came into the room, everyone laughed.” :

  Did everyone in the world laugh?! Did Robert himself laugh?

• “Does someone have a pencil?”

  Why isn’t the answer to this question necessarily always “yes”?!

[R] But …

• Are such refinements part of the “meanings” of the sentences? Or merely inferences from that meaning? Is this just a terminological matter?

• E.g., with “Robert is the tallest spy”, is the frame-of-reference issue strictly part of meaning? Should the meaning of the sentence merely be that: Robert is tallest in some relevant frame?

• Similarly with “here”: just take it to mean: some locality or other that includes the utterer location?

Choice between Meanings

• “Peter was standing beside the bank”.

  Need linguistic context or world knowledge to know whether it’s a river bank or a financial building or …

• General problem with ambiguous (including polysemous) words.

• Sometimes other wording within the same sentence helps:

  “Peter was changing money at the bank.”

  But it’s still a matter of world knowledge that a financial bank is a place where you can change money.

• NB: that knowledge does not necessarily need to be applied through inference in the normal sense – it can often be handled implicitly and statistically, through statistics about occurrence of words near to each other (“co-occurrence”).
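To make the co-occurrence idea concrete, here is a minimal sketch of choosing a sense of “bank” from nearby words. The counts in COOC are invented purely for illustration (in practice they would come from a sense-annotated corpus), and the sense labels and the function name choose_sense are likewise hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable

# Invented co-occurrence counts: how often each context word was seen
# near each sense of "bank" in some (imaginary) annotated corpus.
COOC: Dict[str, Counter] = {
    "financial-bank": Counter({"money": 40, "loan": 30, "changing": 5, "river": 1}),
    "river-bank": Counter({"river": 35, "water": 25, "fishing": 10, "money": 1}),
}


def choose_sense(context_words: Iterable[str]) -> str:
    """Pick the sense whose co-occurrence counts best match the context."""
    words = [w.lower() for w in context_words]
    # A Counter returns 0 for words it has never seen, so unknown words are harmless.
    scores = {sense: sum(counts[w] for w in words) for sense, counts in COOC.items()}
    return max(scores, key=scores.get)


print(choose_sense("Peter was changing money at the".split()))
print(choose_sense("Peter was fishing in the river beside the".split()))
```

No explicit inference about what banks are for is carried out: the world knowledge is only implicit in the counts.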

Lexical Meaning: Sparse or Encyclopaedic?

• In considering the contribution of “cat” to the meaning of “The cat sat on the mat”: is there a restricted lexical meaning (word sense) for “cat” (e.g., as a mammal with such-and-such an appearance), and similarly for the other words, so that we can construct a sort of meaning of the sentence based just on those meanings and the syntactic structure?

• So richer meaning would then rely on inference using encyclopaedic knowledge?

• Or is all our knowledge about cats, mats, etc. available at all times in meaning construction, there being no restricted bodies of knowledge that can be called the meanings or senses of the words?

• In a nutshell: is world knowledge separable from lexical knowledge?

Covert Parameters (?)

• “It’s raining,” says Mary to us.

We would usually take her to mean it’s raining in her locality.

This uses the utterer-location feature of the discourse situation.

• It’s as if the meaning of “to rain” has a covert parameter that gets filled by the utterer-location value, by default.

• But this default can be overridden, as in “It’s raining in New York” (said in London).

• But caution: “When it’s raining, it’s useful to have an umbrella.” There’s no use of the utterer location here, in general.

• Also: Bill says on the phone in NY, “I’m staying at home tonight”. Mary in London replies: “It’s raining, then.” She means in NY.

Also: astronomer peering at a planet through a telescope might say “it’s raining” meaning it’s raining somewhere or other on that planet.
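The covert-parameter idea is loosely analogous to an optional function argument with a context-supplied default. The sketch below is only that analogy, under made-up names (DiscourseSituation, raining); it handles the default and the explicit override, but not the harder cases above (the umbrella generalisation, the phone call, the astronomer), where the location comes from further inference.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DiscourseSituation:
    utterer: str
    utterer_location: str


def raining(situation: DiscourseSituation, location: Optional[str] = None) -> str:
    """Build a toy meaning for "It's raining [in <location>]".

    If no location is given explicitly, the covert location parameter is
    filled from the utterer-location feature of the discourse situation.
    """
    place = location if location is not None else situation.utterer_location
    return f"raining(at={place})"


mary_in_london = DiscourseSituation(utterer="Mary", utterer_location="London")
print(raining(mary_in_london))              # "It's raining."
print(raining(mary_in_london, "New York"))  # "It's raining in New York."
```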

[R] Covert Parameters (?) contd

• Similar phenomenon: “It’s horrid,” says Mary to us, looking at the squashed bird.

  But things aren’t objectively horrid. We take Mary to mean: horrid to her.

• Use of utterer-identity feature of the discourse situation.

[R] No Covert Parameters?

• Some language theorists say that the addition of the location information, etc., as above is a matter of free enrichment, i.e. addition of information by unrestricted processes of inference, not a filling of some discrete list of predefined parameters.

• This seems plausible to me in particular.

• In saying “It’s raining” Mary cannot simply mean that it’s raining somewhere or other, as that would normally be pointless. So we work out from context what location she would be focussing on, given that encyclopaedic knowledge tells us that rain is usually a localized thing.

• Such theorists feel: So many words have some degree of standard context-sensitivity (e.g., as to location or experiencer), and the types of context-sensitivity are so various and subtle, that it’s difficult to see how some straightforward, limited process of parameter-filling can do the trick.

So Far Again …

• We’ve seen various ways in which context needs to come in, be it linguistic context, or discourse situation, or described situation, or the world more generally.

Inference, often elaborate and subtle, can be involved, not just “knowledge”.

• “knowledge” is in quote marks because: it’s all very much about what utterers and addressees believe to be the case—about the world and each other—not what’s actually the case. [No time to go into this huge and fascinating topic further.]

• We now move to consider the notions of pragmatics and semantics, and then

• Go over a particular way of computing a particular, limited form of semantics.

• Read textbook sections 21.3 to 21.6.2 inclusive for more on anaphora and algorithms for resolving it.

Pragmatics

• Pragmatics is the study of how the meaning (in a broad sense: any understanding we get from utterances) of a language unit (e.g., paragraph, sentence, clause, phrase, word) is affected by

– linguistic context beyond that unit (especially if a sentence), or

– knowledge of the described situation, or

– knowledge of the world more generally, or

– (anything beyond the most trivial) inference using any such information.

• Semantics is the study of the non-pragmatic aspects of meaning, e.g. use of sparse lexical senses and of syntactic structure.

  Theorists differ as to whether use of the discourse situation is part of pragmatics or semantics. Semanticists like to include it in semantics because it’s difficult otherwise to get a useful semantics even started.

  The question of whether (some core aspects of) intra-sentential or inter-sentential anaphora is semantic or pragmatic is also an issue. Certainly, some aspects must be accepted to be pragmatic.

Better Terminology

• In my view, what’s usually called semantics would better be called something like “local semantics” [because it tries to treat syntactic constituents as isolated from context, except perhaps from the discourse situation].

Yet better would be “lexico-syntactic semantics”.

The term “syntax-driven semantic analysis” in the textbook (section 18.1) is quite good.

• Then “semantics” could mean what it most naturally should mean, i.e. the study of meaning in all its glory, including pragmatic aspects.

[R] Better Terminology

• The usual distinction between semantics and pragmatics is in any case murky because it rests on particular theoretical assumptions, e.g. (typically) that there are sparse word senses. If this doesn’t hold, then pragmatics enters even into local semantics. You’re just left with syntax and pragmatics.

So the usual distinction is highly “theory-laden”, not an objective description of the phenomena themselves.

• But we now press on with a look at (lexico-syntactic) semantics.

Towards Lexico-Syntactic Semantics

• Language has syntactic compositionality: there are composition rules that, whenever they are given constituents of particular types, produce constituents (or whole sentences) of particular types, where the resulting types are dependent only on the rule and the constituents given to it.

Semantic Compositionality

• (Lexico-syntactic) semantics is very much, though not exclusively, concerned with semantic compositionality: this occurs to the extent that

the meaning of a syntactic constituent that is syntactically composed from other constituents in a certain way is a function of (i.e., dependent only on)

– the meanings of those constituents

– the way they are syntactically composed (i.e. which grammar rule is used, taking into account any ancillary information such as grammatical-feature values).

• The base of the recursion: the meanings of the lexical forms (that don’t have syntactic structure) are almost always taken to be “word senses”: packages of information of some restricted sort that are associated in advance with the words.

• Depending on the theory, referents of names and pronouns may also be allowed instead of normal word senses.

Semantic Compositionality, contd

• For example: “Dogs are friendly”.

• Suppose for simplicity that this is viewed as having a syntactic structure coming from the (unrealistic) rule

  S → Noun(plural) BE-Verb(plural, Tense) Adj

  where the plural parameter-value and the Tense parameter are for handling grammatical features.

• Then the meaning of the sentence is fully defined by applying some semantic function associated with that particular rule, to the meanings of the noun, verb and adjective, and using the Number and Tense values.

• This function might, for example, do the following:

• when Number = plural, deliver the situation that all the entities of the entity-type that is the meaning of the noun have in some period (depending on the Tense value) the quality that is the meaning of the Adjective;

• when Number = singular, deliver the situation that the single entity that is the referent of the noun has in some period …
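A semantic function of this kind could be sketched as below. The string format of the output, the PERIOD table and the function name sem_noun_be_adj are all assumptions made for illustration; the sketch also assumes that the BE-verb contributes nothing beyond its Tense value, so the verb's meaning is not passed in separately.

```python
# Hypothetical mapping from a Tense value to a symbol for a time period.
PERIOD = {"present": "now", "past": "before-now"}


def sem_noun_be_adj(noun_sem: str, adj_sem: str, number: str, tense: str) -> str:
    """Toy semantic function for the rule S -> Noun BE-Verb Adj.

    The result depends only on the constituent meanings and on the
    grammatical-feature values, as semantic compositionality requires.
    """
    period = PERIOD[tense]
    if number == "plural":
        # noun_sem names an entity-type: all its members have the quality.
        return f"forall x ({noun_sem}(x) -> {adj_sem}(x, {period}))"
    # noun_sem names the single entity that is the referent of the noun.
    return f"{adj_sem}({noun_sem}, {period})"


# "Dogs are friendly"
print(sem_noun_be_adj("dog", "friendly", "plural", "present"))
```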

[R] Semantic Compositionality, contd

• If a grammar rule is non-branching, i.e. just has one component on its RHS as in NP → Noun,

  then the meaning of the constituent viewed as being of the RHS type (Noun) is typically just “carried up” without change to become the meaning of the constituent viewed as being of the LHS type (NP).

  E.g. the NP “dog” has the same meaning as the noun “dog” (but it needn’t in principle).

Ambiguity, Meaninglessness and Sem. Comp.

• Semantic compositionality is not defeated by lexical or syntactic ambiguity.

• You just get a different lexico-syntactic meaning for each selection of lexical meanings for the words and syntactic structure for the whole unit.

– But you don’t necessarily need to compute all the possibilities explicitly – see later.

• Also, potentially, a unit may not have any lexico-syntactic meaning, for a particular choice of lexical senses and syntactic structure, or perhaps even for any such choice.

  Some theorists claim that this is true of things like “The ATM ate my credit card” when “eat” has a normal biological sense, because it is simply impossible to combine that sense with the sense of “ATM”.

• So the definition above of semantic compositionality—which actually follows the usual way it is described, and indeed with some extra precision—needs some additional modification to account for ambiguity and cases of meaninglessness. Exercise: reword it suitably.

Failures of Semantic Compositionality

• One major way a language can fail to be fully compositional semantically is by virtue of (certain sorts of) (more or less) “fixed expression” (many of which are often called “idioms”), under certain assumptions about how they are processed:

• Kick the bucket [die]

• Hit the roof [quickly get very angry]

• By and large, spic and span, raining cats and dogs

• Spinning one’s wheels [doing stuff that isn’t advancing one’s goals]

Failures of Semantic Compositionality

• One approach, for some cases at least, would be to take such a phrase as just a multi-unit lexical item of a particular lexical category and with its own predefined lexical sense.

• Then there is no violation of semantic compositionality, because the phrase acts normally within bigger structures, and we are not decomposing the phrase syntactically.

• Works pretty well for the absolutely fixed phrases such as “by and large” and “spic and span”.

• But NB: need various versions of some such phrases, generated by need for different grammatical-feature values etc.:

  kicking the bucket / kicked the bucket / hadn’t kicked the bucket

  spinning his/her/their/your/…. wheels
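The multi-unit-lexical-item approach might look roughly like this: look for the longest stored phrase at each position before falling back to single words, after reducing inflected forms to a base form so that variants such as “kicked the bucket” still match. The MWE and LEMMAS tables, the sense labels and the function name segment are hypothetical. Note that the sketch always prefers the stored phrase, so it cannot give a literal reading of such a phrase.

```python
from typing import Dict, List, Tuple

# Hypothetical lexicon of multi-unit lexical items with predefined senses.
MWE: Dict[Tuple[str, ...], str] = {
    ("by", "and", "large"): "IN-GENERAL",
    ("spic", "and", "span"): "VERY-CLEAN",
    ("kick", "the", "bucket"): "DIE",
}

# Hypothetical table reducing a few inflected forms to their base form.
LEMMAS = {"kicked": "kick", "kicking": "kick", "kicks": "kick"}


def segment(words: List[str]) -> List[str]:
    """Replace stored fixed phrases by their sense, longest match first."""
    lemmas = [LEMMAS.get(w.lower(), w.lower()) for w in words]
    max_len = max(len(key) for key in MWE)
    out: List[str] = []
    i = 0
    while i < len(lemmas):
        for n in range(max_len, 1, -1):  # try longer phrases before shorter ones
            key = tuple(lemmas[i:i + n])
            if key in MWE:
                out.append(MWE[key])
                i += n
                break
        else:
            # No stored phrase starts here: keep the single word.
            out.append(lemmas[i])
            i += 1
    return out


print(segment("He kicked the bucket".split()))
```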

Failure of Semantic Compositionality, contd.

• But sometimes “hit the roof” , “kick the bucket” etc. are indeed meant literally.

• So it may be better to treat (many) “fixed” phrases as having ordinary syntactic structure going down to the ordinary wordform level.

• But then the actual meaning produced is not a function of any stored meanings of the words and the way they are put together, so semantic compositionality fails.

Failure of Semantic Compositionality, contd.

• Another way in which “fixed” phrases can lack semantic compositionality: you typically can’t do synonymous substitutions of key words, e.g. can’t replace “hit the roof” by “hit the house’s top surface” even if at some point one ends up putting the same meanings together.

• But some substitutions are possible: “kick the pail” is a known (though misguided) variant.

– It’s misguided because the origin of “kick the bucket” (under one theory) is pigs being slaughtered kicking a rail called a “bucket” from which they are hanging. It’s nothing to do with buckets in the normal sense.

• Also, jokey substitutions are almost always possible, as in replacing “playing a game of cat and mouse” by “engaging in a contest of feline pet and small squeaky rodent”.

Such things (the phenomena, not the animals!) are often brushed under the carpet, but they are nevertheless a language phenomenon that needs a proper account.

[R] Crazy Words and Grammar

• In some fixed phrases

• the grammar is crazy relative to the normal POSs of the words (as in “by and large”),

• and in some phrases the words are “cranberry” words, i.e. don’t occur elsewhere (cf. “spic” in “spic and span”), so it’s difficult to claim they have a lexical sense at all. (Cf. also “span”: although it is a wordform occurring elsewhere, it has no current meaning related to that of “spic and span”.)

[R] Fixed Phrases and Bogus Referring Phrases

• Fixed phrases often contain referring phrases that don’t actually have reference to anything (except possibly in some scenario being metaphorically appealed to):

• Hit the roof / kick the bucket / hit it / spinning your wheels

[R] Metaphor: a related case of compositionality problems

• Many idioms / fixed phrases have metaphorical aspects, as in “hit the roof” and “spill the beans”.

• In latter case, we can think metaphorically of our minds as containers and ideas as physical objects, so that revealing a secret is viewed as if it were spilling something out of a container ...

  though there’s no particular motivation for thinking of a secret idea as beans.

([R]?) Metaphor, contd

• Other examples of metaphor:

  “In the far reaches of her mind, Anne thought that Kyle was unfaithful.”

  “The ATM ate my credit card.”

  “Japan is the Britain of the Far East.”

  “Companies are always eating up other companies.”

  “We’re at a crossroads in our relationship.”

• These aren’t anything like fixed phrases, so it’s natural to think of them as having syntactic structure going down to the wordform level.

• To the extent that the actual meanings don’t arise merely from the structures and stored word senses, we have failures of semantic compositionality.

• This is even if the structure and senses help to work out the actual meaning (e.g., through analogy construction).

• And in cases of words or small phrases that have entrenched or conventional metaphorical meanings, such as “eat” and “eat up” perhaps, then those meanings can be regarded just as alternative word senses, so compositionality is restored.

[R] Metaphor, contd

• There’s dispute about whether non-conventional metaphor is a semantic or pragmatic matter, or sometimes one and sometimes the other, or usually both, etc.

• In some cases, you can see that meaning might still be worked out in a local sort of way, without using context or inference in any significant sense.

  E.g.: “This idea is rubbish” (if we assume “rubbish” doesn’t have a lexical metaphorical meaning that applies to ideas):

  We can take rubbish to be a prototypical example of something undesirable, useless and valueless.

  We then place the idea in question within the category of undesirable, useless and valueless things.

  [This follows the so-called “class-inclusion” account of metaphor.]

  Perhaps this could all be claimed still to be within a notion of local semantics that relies (albeit in an advanced way) just on lexical senses and syntactic structure.

[R] Metaphor, contd

• In other cases, analogy seems more the key, as in “Ideas were flying around in my head, crashing into each other.”

  Complex analogy between mental processes and animate objects moving around and accidentally disturbing each other.

• Such analogy may rest on already familiar basic analogies (systems of correspondences or “mappings”), such as between mind as a physical space and ideas as physical objects. Such analogies are often called “conceptual metaphors.”

• But there’s more of a case for saying we’re going beyond lexical senses and syntactic structure, so that we’re in the domain of pragmatics.

• Inference often seems to be involved in non-conventional metaphor. Possibly: if something is viewed as being “in the far reaches of a mind” then it is inferred to be physically inaccessible, and then a conceptual metaphor steps in to convert this to abstract, mental inaccessibility.

• Such inference is a matter of pragmatics.

A Way to Compute Lexico-Syntactic Meaning Representations

• We’re going to see how to construct first-order logic expressions in simple cases, using syntactic structure and symbols for word senses.

• Logic: you may need to review your Language and Logic module notes. See also a chapter in the textbook.

• We will be introducing also quasi-logical forms as an approximation to ordinary logical syntax.

• We will be making use of lambda calculus as well. (Only described briefly and in a simplified way.)

• We’ll assume that the choice of referents of proper names and (specific) pronouns, and the choices amongst different lexical senses for other words, have been made by some separate means (probably pragmatic).

Logical Forms (LFs) for Sentences

• “Mike is rich”: is-rich(mike123) where mike123 is a constant denoting the specific person referred to by the proper name “Mike”, and is-rich is a predicate symbol representing the property of being rich.

• “Mike loves Mary”: loves(mike123, mary456).

  “Everyone loves Mary”: ∀x (is-person(x) → loves(x, mary)).

  “Mike loves someone”: ∃x (is-person(x) ∧ loves(mike, x)).

• “Everyone loves someone”: two possibilities:

  ∀x ( is-person(x) → ∃y (is-person(y) ∧ loves(x,y)) ). {y can vary with x}

  ∃y ( is-person(y) ∧ ∀x (is-person(x) → loves(x,y)) ). {same y for all x}

  Quantifier scoping issue: whether y is in the scope of x or vice versa.
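The difference between the two scopings can be checked on a small model. The following Python sketch is only an illustration: the three-person domain and the `loves` relation are invented for the example and are not part of the slides. Since everything in the domain is a person, the is-person conditions are left implicit.

```python
# Tiny invented model: a set of people and a "loves" relation (lover, loved).
people = {"ann", "bob", "cat"}
loves = {("ann", "bob"), ("bob", "cat"), ("cat", "ann")}

# Reading 1: for every x there is some y that x loves (y can vary with x).
reading_1 = all(any((x, y) in loves for y in people) for x in people)

# Reading 2: there is some y that every x loves (same y for all x).
reading_2 = any(all((x, y) in loves for x in people) for y in people)

print(reading_1)  # True
print(reading_2)  # False
```

In this model everyone loves somebody, but nobody is loved by everyone, so only the first reading holds.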

Simple Illustrative Grammar

• S → NP is Adj

• S → NP Verb /* NB: Verb here does not include forms of “to be” */

• S → NP Verb NP

• NP → ProperName

• NP → QuantifyingPronoun   QuantifyingPronoun: someone, everyone

• NP → QuantifyingAdj CommonNoun  QuantifyingAdj: some, every

“Mike is rich”

S {is-rich(mike)}
    NP {mike}
        ProperName {mike}
            Mike {mike}
    is
    Adj {is-rich}
        rich {is-rich}

The expressions in braces are semantic expressions (SEMs).

Semantic function for any lexical category:

lexical category node’s SEM = the lexical form’s SEM.

Semantic function for NP → ProperName:

NP node’s SEM = ProperName node’s SEM.

Semantic function for S → NP is Adj:

S node’s SEM = Adj node’s SEM applied to NP node’s SEM.

“Mike loves Mary”

S {loves(mike,mary)}
    NP {mike}
        ProperName {mike}
            Mike {mike}
    Verb {loves}
        loves {loves}
    NP {mary}
        ProperName {mary}
            Mary {mary}

Semantic function for S → NP Verb NP:

S node’s SEM = Verb node’s SEM applied to 1st NP node’s SEM and 2nd NP node’s SEM, in that order.
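The semantic functions met so far can be sketched as ordinary functions, one per grammar rule. This is a minimal Python sketch under simplifying assumptions: SEMs are plain strings, and the names `LEXICON`, `apply_sem`, `sem_lex` and so on are made up for the illustration.

```python
# SEMs are plain strings here; "applying" a predicate SEM to argument SEMs
# just builds the text of a predicate application.
def apply_sem(pred, *args):
    return f"{pred}({','.join(args)})"

# Hypothetical lexicon: word form -> SEM.
LEXICON = {"Mike": "mike", "Mary": "mary", "rich": "is-rich", "loves": "loves"}

def sem_lex(word):
    # Lexical category node's SEM = the lexical form's SEM.
    return LEXICON[word]

def sem_np_proper_name(name):
    # NP -> ProperName: NP node's SEM = ProperName node's SEM.
    return sem_lex(name)

def sem_s_np_is_adj(np_sem, adj_sem):
    # S -> NP is Adj: Adj node's SEM applied to NP node's SEM.
    return apply_sem(adj_sem, np_sem)

def sem_s_np_verb_np(np1_sem, verb_sem, np2_sem):
    # S -> NP Verb NP: Verb node's SEM applied to the two NP SEMs, in order.
    return apply_sem(verb_sem, np1_sem, np2_sem)

mike = sem_np_proper_name("Mike")
mary = sem_np_proper_name("Mary")
print(sem_s_np_is_adj(mike, sem_lex("rich")))          # is-rich(mike)
print(sem_s_np_verb_np(mike, sem_lex("loves"), mary))  # loves(mike,mary)
```

Each function looks only at the SEMs of the daughter nodes, which is the sense in which the computation is compositional.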

“Mike is single”

S {¬ is-married(mike)}
    NP {mike}
        ProperName {mike}
            Mike {mike}
    is
    Adj {λx (¬ is-married(x))}
        single {λx (¬ is-married(x))}

λx (¬ is-married(x)) is an expression for the property of not being married. We assume that that property is anonymous in the logic – i.e. there’s no predicate symbol like “single”.

The λx,y,… introduces the variables x,y,… as local parameters, as in a program routine.

You can apply a λ expression to arguments (no more than the number of variables introduced by the λ).

(λx Expr(x)) a = Expr(a) = Expr with a substituted for x.

So (λx (¬ is-married(x))) (mike) = ¬ is-married(mike). This is called “β-reduction”.
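Programming languages with anonymous functions behave in much the same way, so β-reduction can be tried out directly. The Python sketch below is only an analogy: formulas are represented as strings, and the names `single` and `loves` are chosen for the illustration.

```python
# SEM for "single": λx (¬ is-married(x)), written as a Python lambda that
# builds the text of the resulting formula.
single = lambda x: f"¬ is-married({x})"

# β-reduction: applying the λ expression substitutes the argument for x.
print(single("mike"))  # ¬ is-married(mike)

# A λ expression with two variables, λx,y loves(x,y).
loves = lambda x, y: f"loves({x},{y})"
print(loves("mike", "mary"))  # loves(mike,mary)
```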

“Mike is single” {done a bit better}

S {¬ is-married(mike) ∧ is-adult(mike)}
    NP {mike}
        ProperName {mike}
            Mike {mike}
    is
    Adj {λx (¬ is-married(x) ∧ is-adult(x))}
        single {λx (¬ is-married(x) ∧ is-adult(x))}

The point of this example and the previous one:

They show the freedom with which lambda expressions can cause the creation of complex formulas.

A word doesn’t necessarily correspond to a predicate in the logical representation language.

“Someone is rich”

S {is-rich( [∃z is-person(z)] )}
    NP {[∃z is-person(z)]}
        Q’fyingPronoun {[∃z is-person(z)]}
            Someone {[∃z is-person(z)]}
    is
    Adj {is-rich}
        rich {is-rich}

The S node’s SEM is a QUASI-logical form: it is not in correct logical syntax. We’ll look at how to fix it later.

WANT: {∃z (is-person(z) ∧ is-rich(z))}

We use square brackets for a special purpose: to enclose quantified expressions that are going to end up as bogus arguments in predicate applications.

“Everyone is rich”

S {is-rich( [∀z is-person(z)] )}
    NP {[∀z is-person(z)]}
        Q’fyingPronoun {[∀z is-person(z)]}
            Everyone {[∀z is-person(z)]}
    is
    Adj {is-rich}
        rich {is-rich}

The S node’s SEM is a QUASI-logical form.

WANT: {∀z (is-person(z) → is-rich(z))}

“Every banker is rich”

S {is-rich( [∀z is-banker(z)] )}
    NP {[∀z is-banker(z)]}
        Q’fyingAdj {λp ([∀z p(z)])}
            Every {λp ([∀z p(z)])}
        CommonNoun {is-banker}
            banker {is-banker}
    is
    Adj {is-rich}
        rich {is-rich}

The S node’s SEM is a QUASI-logical form.

WANT: {∀z (is-banker(z) → is-rich(z))}

Semantic function for NP → Q’fyingAdj CommonNoun:

NP node’s SEM = Q’fyingAdj node’s SEM applied to CommonNoun node’s SEM.
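The SEM of a quantifying adjective is a function from a predicate to a quasi-term, which can be sketched as a function that returns a function. In this Python sketch the helper name `quasi_term` and the string representation of SEMs are assumptions made for the illustration.

```python
def quasi_term(quantifier, var):
    # Builds λp ([Q var p(var)]): a function from a predicate name to the
    # text of a quasi-term using the given quantifier and variable.
    return lambda p: f"[{quantifier}{var} {p}({var})]"

every = quasi_term("∀", "z")   # SEM of "every": λp ([∀z p(z)])

# NP -> Q'fyingAdj CommonNoun: Q'fyingAdj SEM applied to CommonNoun SEM.
np_sem = every("is-banker")
# S -> NP is Adj: Adj SEM applied to NP SEM, giving a quasi-logical form.
s_sem = f"is-rich({np_sem})"

print(np_sem)  # [∀z is-banker(z)]
print(s_sem)   # is-rich([∀z is-banker(z)])
```

The result is still a quasi-logical form: the quantified expression sits in argument position and has yet to be pulled out.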

[R] “Some banker is rich”

S {is-rich( [∃z is-banker(z)] )}
    NP {[∃z is-banker(z)]}
        Q’fyingAdj {λp ([∃z p(z)])}
            Some {λp ([∃z p(z)])}
        CommonNoun {is-banker}
            banker {is-banker}
    is
    Adj {is-rich}
        rich {is-rich}

The S node’s SEM is a QUASI-logical form.

WANT: {∃z (is-banker(z) ∧ is-rich(z))}

“Everyone loves Mary”

S {loves( [∀z is-person(z)], mary)}
    NP {[∀z is-person(z)]}
        Q’fyingPronoun {[∀z is-person(z)]}
            Everyone {[∀z is-person(z)]}
    Verb {loves}
        loves {loves}
    NP {mary}
        ProperName {mary}
            Mary {mary}

The S node’s SEM is a QUASI-logical form.

WANT: {∀z (is-person(z) → loves(z, mary))}

“Everyone loves someone”

S {loves( [∀z is-person(z)], [∃y is-person(y)] )}
    NP {[∀z is-person(z)]}
        Q’fyingPronoun {[∀z is-person(z)]}
            Everyone {[∀z is-person(z)]}
    Verb {loves}
        loves {loves}
    NP {[∃y is-person(y)]}
        Q’fyingPronoun {[∃y is-person(y)]}
            someone {[∃y is-person(y)]}

The S node’s SEM is a QUASI-logical form.

WANT: two possibilities (see later)

[R] “Every banker loves some yacht”

S {loves( [∀z is-banker(z)], [∃y is-yacht(y)] )}
    NP {[∀z is-banker(z)]}
        Q’fyingAdj {λp ([∀z p(z)])}
            Every {λp ([∀z p(z)])}
        CommonNoun {is-banker}
            banker {is-banker}
    Verb {loves}
        loves {loves}
    NP {[∃y is-yacht(y)]}
        Q’fyingAdj {λp ([∃y p(y)])}
            some {λp ([∃y p(y)])}
        CommonNoun {is-yacht}
            yacht {is-yacht}

The S node’s SEM is a QUASI-logical form.

WANT: two possibilities

Converting QLFs to LFs

• NB first: each Quantifying Pronoun/Adjective introduces a different variable for use in a bogus parameter like [... is-person(...)]. We mustn’t just use the same variable, say z, each time.

• Given the QLF: is-rich( [∃z is-person(z)] )  we want to be able to produce the proper LF

  ∃z (is-person(z) ∧ is-rich(z))

• Given the QLF: is-rich( [∃z is-person(z)] ) ∧ is-stupid( [∃z is-person(z)] )   we want to be able to produce the proper LF

  ∃z (is-person(z) ∧ is-rich(z) ∧ is-stupid(z))

• So we want to “pull out” the quantification, replacing each square-bracket expression just by the variable inside it.

  Pulling out ∃ means we introduce a conjunction symbol ∧, and pulling out ∀ means we introduce an implication symbol →.

Converting QLFs to Proper LFs

• Let’s call every expression of form [∃υ B(υ)] or [∀υ B(υ)] a “quasi-term”, where υ is a variable and B(υ) is some formula involving that variable. (The following assumes that B(υ) does not itself include quasi-terms.)

– There may be several occurrences of a given quasi-term arising from a wordform at a particular position in the sentence. These occurrences all use the same variable υ and all are associated with the same quantifier symbol, ∃ or ∀.

– But the quasi-terms arising from quantifying pronouns/adjectives at different places in the sentence use different variables.

• In some order, consider the variables featuring in some quasi-term in the QLF. For each such variable υ, replace all occurrences of the quasi-term using υ by the variable itself to get a partially revised form, QLF’, and

– If the quasi-term is of form [∃υ B(υ)], replace current QLF’ by ∃υ (B(υ) ∧ QLF’)

– If the quasi-term is of form [∀υ B(υ)], replace current QLF’ by ∀υ (B(υ) → QLF’)

• We can call these rules the “pull-out” rules.

Converting QLFs to LFs, contd

• So with: is-rich( [∃z is-person(z)] )

  There’s only one quasi-term variable, z, and we get

  ∃z (is-person(z) ∧ is-rich(z))

• With: is-rich( [∃z is-person(z)] ) ∧ is-stupid( [∃z is-person(z)] )

  There’s still only one quasi-term variable, and we get

  ∃z (is-person(z) ∧ is-rich(z) ∧ is-stupid(z))

Converting QLFs to LFs, contd

• With: loves( [∀z is-banker(z)], [∃y is-yacht(y)] )

  There are two quasi-term variables, z, y, and we get one of the following:

  ∀z (is-banker(z) → ∃y (is-yacht(y) ∧ loves(z,y) ))

  ∃y (is-yacht(y) ∧ ∀z (is-banker(z) → loves(z,y) ))

  depending on the order in which we do the pull-outs.
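The pull-out rules, and the way the order of pull-outs gives different scopings, can be sketched in a few lines of Python. This is a simplified illustration rather than a definitive implementation: formulas are handled as strings, quasi-terms are replaced by plain text substitution, and the names `show`, `pull_out` and `proper_lfs` are invented here. A real system would work on structured terms.

```python
from itertools import permutations

# A quasi-term is (quantifier, variable, restriction predicate),
# e.g. ("∀", "z", "is-banker") stands for [∀z is-banker(z)].
QUASI_TERMS = [("∀", "z", "is-banker"), ("∃", "y", "is-yacht")]

def show(qt):
    q, v, pred = qt
    return f"[{q}{v} {pred}({v})]"

def pull_out(formula, qt):
    """Apply one pull-out rule to the whole formula (the simplified rule)."""
    q, v, pred = qt
    body = formula.replace(show(qt), v)      # quasi-term -> its variable
    connective = "∧" if q == "∃" else "→"    # ∃ gives ∧, ∀ gives →
    return f"{q}{v} ({pred}({v}) {connective} {body})"

def proper_lfs(qlf, quasi_terms):
    """Return one proper LF per pull-out order."""
    results = []
    for order in permutations(quasi_terms):
        formula = qlf
        for qt in order:
            formula = pull_out(formula, qt)
        results.append(formula)
    return results

qlf = f"loves({show(QUASI_TERMS[0])}, {show(QUASI_TERMS[1])})"
for lf in proper_lfs(qlf, QUASI_TERMS):
    print(lf)
# ∃y (is-yacht(y) ∧ ∀z (is-banker(z) → loves(z, y)))
# ∀z (is-banker(z) → ∃y (is-yacht(y) ∧ loves(z, y)))
```

Note that the quantifier pulled out first ends up with the narrowest scope, because each later pull-out wraps itself around the formula built so far.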

[R] [O] Caution: Pull-Outs above are Oversimplified

• They don’t cater for quantifying adjectives and pronouns (some[one] and every[one]) being within “negative” contexts, which include explicit negations and conditional antecedents, as in:

– It’s not the case that someone / every banker is rich.

– If someone/everyone is rich then I am angry.

• To cater for this, we need to refine the pull-outs to work on subformulas of the QLF, not the whole QLF. A pull-out for a particular variable υ should be done in the innermost subformula that encloses all occurrences of υ.

  This allows the QLF for the “someone” version of the conditional example above,  is-rich( [∃z is-person(z)] ) → angry(i)

  to give rise to the proper LF:  ( ∃z (is-person(z) ∧ is-rich(z)) ) → angry(i)

  Notice that this can be manipulated into the logically equivalent formula

  ( ∀z (is-person(z) → ¬is-rich(z)) ) ∨ angry(i)
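The claimed equivalence can be sanity-checked by brute force on a small domain. The Python sketch below assumes the “someone” reading, i.e. it compares (∃z (is-person(z) ∧ is-rich(z))) → angry(i) with (∀z (is-person(z) → ¬is-rich(z))) ∨ angry(i). It tries every interpretation over a two-element domain, which is a check and not a proof; the function name is made up for the illustration.

```python
from itertools import product

domain = [0, 1]

def equivalent_everywhere():
    # Try every interpretation of is-person and is-rich (as truth assignments
    # over the domain) and every truth value for angry(i).
    for person in product([False, True], repeat=len(domain)):
        for rich in product([False, True], repeat=len(domain)):
            for angry in (False, True):
                # (∃z (is-person(z) ∧ is-rich(z))) → angry(i)
                lhs = (not any(person[z] and rich[z] for z in domain)) or angry
                # (∀z (is-person(z) → ¬is-rich(z))) ∨ angry(i)
                rhs = all((not person[z]) or (not rich[z]) for z in domain) or angry
                if lhs != rhs:
                    return False
    return True

print(equivalent_everywhere())  # True
```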

[R] Variant Grammar and More Lambda

• Instead of

  S → NP Verb NP

• We’re more likely to have

  S → NP VP     VP → Verb NP

• So for “Mike loves Mary” we need a VP SEM for “loves Mary” that is a predicate expressing the property of loving Mary. We then apply that predicate to Mike.

  That SEM will be λx loves(x, mary)

[R] “Mike loves Mary”

S {loves(mike,mary)}
    NP {mike}
        ProperName {mike}
            Mike {mike}
    VP {λx loves(x,mary)}
        Verb {loves}
            loves {loves}
        NP {mary}
            ProperName {mary}
                Mary {mary}

Semantic function for S → NP VP:

S node’s SEM = VP node’s SEM applied to NP node’s SEM.

Semantic function for VP → Verb NP:

VP node’s SEM = λx Verb-SEM(x, NP-SEM)
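The VP rule builds a one-place predicate out of a two-place verb SEM and the object NP’s SEM. A minimal Python sketch, with SEMs as strings and with the function names `apply_sem`, `sem_vp` and `sem_s` invented for the illustration:

```python
def apply_sem(pred, *args):
    # Build the text of a predicate application from SEM strings.
    return f"{pred}({','.join(args)})"

def sem_vp(verb_sem, np_sem):
    # VP -> Verb NP: VP node's SEM = λx Verb-SEM(x, NP-SEM).
    return lambda x: apply_sem(verb_sem, x, np_sem)

def sem_s(np_sem, vp_sem):
    # S -> NP VP: S node's SEM = VP node's SEM applied to NP node's SEM.
    return vp_sem(np_sem)

loves_mary = sem_vp("loves", "mary")   # the property of loving Mary
print(sem_s("mike", loves_mary))       # loves(mike,mary)
```

The subject is supplied only at the S node, so the same VP SEM could be applied to any other subject SEM.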

QLFs and Compact Ambiguity Handling

• The QLF use of quasi-terms is a compact way of handling the semantic ambiguity arising from different decisions on quantifier scoping.

  If there are n different quasi-term variables, then there are n! different orders for pull-out, and hence n! different proper LFs (although many may mean the same as each other – Exercise).

  If we tried to construct proper LFs as we went up the syntax tree, we would need to be explicitly handling all the different proper sub-LFs to cater for all the possibilities, and would ultimately create all the n! different proper LFs.

• The QLF approach allows us to postpone the decisions and apply pragmatic constraints at the end of the process (or later) in order to select an appropriate proper LF by doing the pull-outs in some appropriate order.

  It even allows us not to make the decisions at all in parts of the QLF that don’t interest us.

  These are two aspects of the Principle of Least Commitment.

QLFs and Ambiguity Handling, contd.

• Another form of compacting of ambiguity:

• Suppose the ambiguous noun bank is used in the sentence, and we have different logical predicates is-financial-bank-institution, is-financial-bank-building, is-river-bank, etc.

• Then in the middle of a QLF we can explicitly list some or all of the alternatives, as in

... [is-river-bank | is-financial-bank-building] (x) ....

  instead of having different QLFs of form

  ... is-river-bank(x) ....

  ... is-financial-bank-building(x) ....

• If there are k such words in the sentence, each with m possible meanings, then we compress m^k possible QLFs into just one.

• And words are very frequently ambiguous!!

• Once again we can postpone the decisions until after the semantic composition process, and can even not make some.
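The saving can be seen by counting. With k ambiguous words of m senses each, spelling out every combination gives m to the power k separate QLFs, whereas the compact form lists each word’s alternatives once. In the Python sketch below, the second word (“pen”) and its predicate names are hypothetical, added only to have k = 2 and m = 3.

```python
from itertools import product

# Sense inventories: "bank" uses predicates from the slide; "pen" is invented.
senses = {
    "bank": ["is-financial-bank-institution", "is-financial-bank-building", "is-river-bank"],
    "pen": ["is-writing-pen", "is-animal-pen", "is-play-pen"],
}

# Every way of choosing one sense per word: one separate QLF for each.
expanded = list(product(*senses.values()))

# Compact alternative: one bracketed list of options per word.
compact = [f"[{' | '.join(options)}]" for options in senses.values()]

print(len(expanded))  # 9
print(compact[0])     # [is-financial-bank-institution | is-financial-bank-building | is-river-bank]
```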

[R] (Q)LFs and Ambiguity Handling, contd.

• However, a similar thing could be done slightly more elaborately in proper LFs by disjunction, as in

... (is-river-bank(x) ∨ is-financial-bank-building(x) ) ....

• Moreover, there’s no law that says that logic predicates should be less ambiguous than natural-language words.

  As far as the technicalities of logic are concerned (both syntactic and semantic) there’s no reason we can’t have a logic predicate is-bank that covers some or all of the different senses of the word bank.

– These points are often missed by AI people and others.

• Using an inclusive predicate like is-bank makes the expression of knowledge about banks more complex, as additional restrictive qualifications need to be included.

  But it simplifies the expression of lexico-syntactic semantics, and NB we could refine such expression afterwards by replacing an inclusive predicate by a more restricted one like is-river-bank (if we needed to do so).

[R] Note

• Textbook chapter 18 may help you.

• But its use of lambda calculus is in the service of producing proper LFs directly rather than QLFs.

• Also it unnecessarily uses expressions like λx is-dog(x) rather than simpler expressions like is-dog. These two expressions actually mean the same property.

• It also uses an event-based form of logical representation that is interesting and good but is largely independent of the issues I address above.
