ESSLLI 2007 19th European Summer School in Logic, Language and Information August 6-17, 2007 http://www.cs.tcd.ie/esslli2007 Trinity College Dublin Ireland COURSE READER ESSLLI is the Annual Summer School of FoLLI, The Association for Logic, Language and Information http://www.folli.org


ESSLLI 2007

19th European Summer School in Logic, Language and Information

August 6-17, 2007

http://www.cs.tcd.ie/esslli2007

Trinity College Dublin

Ireland

COURSE READER

ESSLLI is the Annual Summer School of FoLLI,

The Association for Logic, Language and Information

http://www.folli.org


Metaphor.

Reading Materials (II) for the ESSLLI-07 course

on Figurative Language Processing

(Revised version)

Birte Lonneker-Rodman

July 12, 2007

1 The theory of Cognitive Metaphor

One of the first researchers to notice the abundance of metaphor in common language use was Michael Reddy. Reddy (1979) shows that speakers of English use a large number of conventionalized metaphorical expressions when talking about communication: to pack thoughts into words, the sentence was filled with emotion, hollow words, find good ideas in the essay, seal up meaning in sentences, to mention but a few examples. These common expressions suggest that signal-entities like words or sentences are containers which directly hold “reified” mental and emotional content.

The theory of Cognitive Metaphor as such was introduced by Lakoff and Johnson (1980) and since then further elaborated by many scholars. A recent introduction is Kovecses (2002).

According to Cognitive Metaphor theory, the primary basis of metaphor as a phenomenon is not language, but thought. The vast majority of metaphorical utterances like to grasp an idea rely on mental generalizations, which relate a conceptual source domain and a conceptual target domain. The target domain is understood and acted on in terms of the source domain. For example, the expression to grasp an idea makes use of the conceptual metaphor IDEAS ARE OBJECTS, in which OBJECTS are the source domain and IDEAS the target domain. In general, we can interpret those conceptual domains as follows:

• The source domain is a concept that is closer to basic concepts accessible by bodily experience, in a continuum of concepts. Example: OBJECT. Physical objects can be perceived visually, touched, and manipulated.

• The target domain is a concept that is closer to abstract concepts which cannot be immediately experienced, in the same continuum of concepts. Example: IDEA. An idea is an “abstract” object which cannot be immediately perceived by the senses.

Conventionalized metaphors like those discussed by Reddy are understood and produced by children already fairly early in life, cf. the chapter on Figurative Language, Subsection 2.2 (Literal).

1.1 Lexical metaphors as instantiations of conceptual metaphors

Individual metaphors are lexical instantiations of conceptual metaphors. For example, the figurative uses of the verbs to pack (as in Reddy’s example to pack thoughts into words) and to grasp (as in to grasp an idea) are lexical instantiations of the conceptual metaphor IDEAS ARE OBJECTS. Usually, a single conceptual metaphor accounts for the metaphorical meanings of a number of different words belonging to the source domain: “[The] unified way of conceptualizing [a domain] metaphorically is realized in many different linguistic expressions.” (Lakoff, 1993a, 209)

Lexical metaphors can be encountered in everyday life conversations, in ordinary newspaper texts, and in many other text types including academic writing. According to Martin (1994), the frequency of lexical metaphors in a newspaper text can be estimated to about 4 to 5 words per 100. Not only all humans, but also most systems dealing with natural language will thus encounter metaphor.

2 Metaphor as a form of linguistic creativity

Metaphor is probably one of the most widespread forms of linguistic creativity. At the same time, it is a phenomenon that itself occurs in many forms. One of the scales on which metaphor can be characterized is that of conventionality. At one end of the continuum, there are novel poetical and spontaneous metaphors, which are by definition of a very low frequency (example: My horse with a mane made of short rainbows, Navaho song cited by Lakoff (1993a, 230)). At the other end, there are conventionalized metaphors that can be very common, and sometimes even difficult to replace by a non-metaphoric expression (example: He defended his belief that the letters were genuine).

2.1 Main types of novel metaphor

Lakoff (1993a) distinguishes several types of novel lexical metaphors. Three main types will be briefly presented in what follows.

Lexical extension of conventional conceptual metaphors. Lexical metaphors of a higher degree of creativity, and having at the same time a high potential of “success” in terms of comprehensibility, are those that extend the set of conventionally mapped lexical items inside the source domain. A process aiming at producing a narrative of any kind could start out using some conventionally mapped lexical items of a selected source domain and continue using lexemes from the same source domain that are usually not encountered in a metaphorical sense. In fact, humans do creatively produce such metaphors. For example, the conceptual metaphor THEORIES ARE CONSTRUCTED OBJECTS shows conventionalized lexical mappings in the sentence He is trying to buttress his argument with a lot of irrelevant facts, but it is still so shaky that it will easily fall apart under criticism. A creative, but comprehensible lexical extension of this conceptual metaphor is exemplified in the sentence Your theory is constructed out of cheap stucco.

Image metaphors. Image metaphors are singular metaphors that most often map only one image onto another image, for example the image of an hour-glass onto the image of a woman. They do not refer to conventional conceptual metaphors, in which many elements of the source domain can be mapped onto many corresponding concepts in the target domain, nor do they establish new conventional metaphorical mappings. In most image metaphors aspects of a part-whole structure are mapped onto aspects of another part-whole structure, but also other aspects like attributes can be mapped. In the example My horse with a mane made of short rainbows discussed above, the colorfulness and beauty of the object in the source domain (rainbow) are mapped onto the object in the target domain (mane).

On image metaphors and rich image metaphors, see also Lakoff (1987, pp. 444–456) and Dobrovol’skij and Piirainen (2005, pp. 161–165).

Analogies. Analogies like the famous example my job is a jail usually make use of several well-established conceptual mappings. Understanding the expression my job is a jail involves the processing of the independently existing metaphors GENERIC IS SPECIFIC, PSYCHOLOGICAL FORCE IS PHYSICAL FORCE, and ACTIONS ARE SELF-PROPELLED MOVEMENTS (cf. Lakoff, 1993a).

2.2 Structure preservation

Lakoff (1993a, 231) suggests as a generalization that a certain structure is preserved in the metaphorical mapping of all metaphors, whether conventional, image metaphor or analogy. He calls this structure image-schema structure, where parts are mapped onto parts and wholes onto wholes (as in the hourglass-woman example), containers onto containers as in the IDEAS ARE OBJECTS example, and so on. As Green (2002) further elaborates, it seems most important to map the system of relationships present in the source domain. It is interesting to note that while Lakoff argues that “symbol manipulation systems cannot handle image-schemas” (Lakoff, 1993a, 249), most NLP or AI systems aiming at the processing of metaphors do exactly that (among other tasks like inferencing): They try, in one way or the other, to represent the structure of the source and target domains.

A different question is to which degree source and target domain need to be isomorphic, and to which extent it is necessary to represent individual mappings between single items or relations. Alternatives would be to a) map only central items or b) establish an abstract representation of domains and map the domains as such, as suggested by the rendering in Cognitive Linguistics (e.g. time is money). The alternatives are more economical in representation and account for the fact that “unused” portions of the mapping can suddenly and creatively be exploited and the involved items become part of the set of mapped items. On the other hand, they must compensate for the incomplete mapping by additional processing or reasoning, drawing inferences from known facts about both source and target domains and those mappings that are available to the computational algorithm. Two implemented systems for metaphorical reasoning will be presented below.

3 Cognitive Metaphor theory and other metaphor theories

The theory of conceptual metaphors and mappings has been further elaborated and extended in the Conceptual Blending theory (Fauconnier and Turner, 2002).


Comparison view: metaphor as feature matching (e.g. Ortony, 1979), cf. Figure 1. (The illustration is from Bowdle and Gentner (2005, p. 194).)

Example: Dew is a veil.

Figure 1: Metaphor as Feature Matching

Categorization view: metaphor as class inclusion (e.g. Glucksberg and Keysar, 1990); cf. Figure 2.

Example: My job is a jail.

Figure 2: Metaphor as Class Creation and Class Inclusion

4 How do humans process metaphors?

4.1 Models and Predictions

The standard pragmatic model (Grice, 1975; Searle, 1979) is based on a sequential account of metaphor processing. A literal reading is attempted first, a conflict is detected, and finally the metaphorical meaning is found. Longer processing times for metaphor than for literal language are predicted.

The Cognitive Linguistics model (Lakoff, 1993b) holds that conceptual metaphors are used with no noticeable effort. The same effort and processing times for metaphor and literal language are predicted.

Current psycholinguistic processing models (e.g. Gibbs, 1994, or Glucksberg et al., 1997) are based on the idea that the hearer or reader notes correspondences between domains (alignment) and selectively projects properties (cf. features, but also highlighting). More effort for processing metaphor than literal language is predicted, though not necessarily longer processing times, as the described processes can happen concurrently.

4.2 Experiments

An experiment by McElree and Nordlie (1999) found the same processing times for metaphors and literal language. The test question in this experiment was: Is this sentence meaningful?

Example 1: Some mouths are sewers. (figurative: novel image metaphor)
Example 2: Some tunnels are sewers. (literal)
Example 3: Some lamps are sewers. (nonsense)

The design of the experiment was later criticized, as little attention was paid to factors such as semantic plausibility and word frequency (Coulson and Petten, 2002).

Coulson and Petten (2002) measured event-related brain potentials (ERPs). Test sentences:

Example 1: That stone we saw in the natural history museum is a gem. (literal)
Example 2: After giving it some thought, I realized the new idea was a gem. (relatively novel metaphor extending IDEAS ARE OBJECTS)

The relevant words are in sentence-final position. The measurements indicate that the comprehension of metaphorical sentences is more difficult than that of literal sentences. The difference in difficulty is comparable to the difference between high- and low-frequency words. Still, literal and metaphoric language have been shown to share some processing mechanisms.

5 Are Mappings reality?

In favor

If metaphorical mappings are embodied like this, we should be able to do psychological experiments on them. Ray Gibbs and his colleagues have done several revealing studies (Gibbs, 1994). In one study, subjects were first asked to read little stories about anger, such as one in which someone returned John’s car with new dents in it. They were then given [a] lexical decision task [...] – decide as fast as possible whether some string of letters is an English word. They were given three different kinds of sentences as priming contexts:

(1) He blew his stack (appropriate idiom).

(2) He got very angry (literal).

(3) He saw many dents (neutral control phrase).

The target strings were either metaphorically connected words like “heat”, unrelated words, or nonwords. When primed by sentence 1, subjects responded faster for heat, but not for unrelated words or nonwords, even though sentence 1 has no direct mention of anything involving heat. Neither literal sentences nor different metaphors for anger [involving different source domains, BLR] had this priming effect. [...] [P]riming is believed to work by spreading activation among related neural representations. The obvious explanation for the findings is that a metaphorical sentence like sentence 1 evokes the mapping from anger to heat and thus makes it faster for people to identify the string heat as an English word. (Feldman, 2006, pp. 211–212)

Against–maybe

[U]sing fMRI¹, we tested the hypothesis that areas in human premotor cortex that respond both to the execution and observation of actions [...] are key neural structures in these processes. Participants observed actions and read phrases relating to foot, hand, or mouth actions. In the premotor cortex of the left hemisphere, a clear congruence was found between effector-specific activations of visually presented actions and of actions described by literal phrases. [...] [This] constitutes evidence for congruence between representations of visually presented actions and semantic representations of actions derived from linguistic phrases.

[On the other hand,] [f]or metaphorical sentences, the interaction was not significant in either the left [...] or the right [...] hemisphere. [This seems to contradict proposals] that metaphorical sentences may rely upon embodied representations [...]. However, the negative results regarding metaphorical sentences should be interpreted with caution due to [several] methodological considerations: [Supplement] First, the degree of correspondence between language and action observation may be important; whereas most of the literal phrases that were presented had high congruence with the actions that were presented (e.g., grasping the keys and observing someone grasping keys), metaphorical phrases inherently match actual physical actions less closely (e.g., grasp the meaning does not correspond to an action which could be presented visually). Second, it is possible that once a metaphor is learned, it no longer activates the same network that it may have initially. That is, although a metaphor like grasping the situation when first encountered may have utilized motor representations for its understanding, once it is overlearned, it no longer relies on these representations and instead utilizes other representations that have to do with the concept of understanding. This hypothesis could be tested by studying activations to metaphors on a continuum from novel to overlearned, with the prediction that novel metaphors would more strongly activate sensory-motor representations than overlearned metaphors. (Aziz-Zadeh et al., 2006)²

¹ Functional magnetic resonance imaging, the use of MRI to monitor brain activity for different tasks.

Many further psycholinguistic experiments have been done by Sam Glucksberg.

6 Metaphor in NLP and Reasoning

As outlined above, conventionalized metaphor is an everyday issue. Most systems dealing with NLP have to face it sooner or later. A successful handling of conventional metaphor is also the first step towards the processing of novel metaphor. However, only very few systems have been designed with special attention to metaphor handling. Some implications behind them are mentioned in the next subsection.

6.1 Processing of metaphor

Obvious problems for NLP systems caused by lexical metaphors consist in the incompatibility of metaphorically used nouns as arguments of verbs. In systems which constrain the type of arguments for every verb by semantic features like human, living, concrete or abstract (“selectional restrictions”), metaphors can cause inconsistencies that will have to be solved. For example, if the grammatical subject of the English verb go were restricted to entities classified as living in a given system, the following sentence (1) taken from Hobbs (1992) could not be parsed.

² Thanks to Gloria Yang for pointing out this reference.

(1) The variable N goes from 1 to 100.

Obviously, there is an open-ended number of such sentences. In fact, there have been attempts to increase the ability of systems to deal with incompatibilities of this kind, caused by instantiations of conceptual metaphors (or metonymies).

Systems explicitly aimed at metaphor processing encode a representation of at least a part of the conventionalized mapping. Those systems can be called knowledge-based systems; they “leverage knowledge of systematic language conventions in an attempt to avoid resorting to more computationally expensive methods” (Martin, 1994). The systems generally perform most of the necessary knowledge-based reasoning in the source domain and transfer the results back to the target domain using the provided mapping; this procedure is applied, for example, in KARMA’s networks (Narayanan, 1999; Feldman and Narayanan, 2004) or in the rules of TACITUS (Hobbs, 1992) and ATT-Meta (Barnden and Lee, 2001).
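The parsing problem described above can be illustrated with a minimal sketch. The lexicon, the feature names and the function names below are hypothetical and chosen only for illustration; they are not taken from Hobbs (1992) or from any of the systems discussed here. Instead of rejecting the sentence, the sketch flags a violated selectional restriction as a candidate metaphor.

```python
# Hypothetical toy lexicon: semantic features per noun (illustrative only).
NOUN_FEATURES = {
    "dog": {"concrete", "living"},
    "variable": {"abstract"},
}

# Hypothetical selectional restriction: the subject of "go" must be living.
SUBJECT_RESTRICTIONS = {"go": {"living"}}


def violates_restriction(verb, subject):
    """Return True if the subject lacks a feature that the verb requires."""
    required = SUBJECT_RESTRICTIONS.get(verb, set())
    return not required <= NOUN_FEATURES.get(subject, set())


def analyse(verb, subject):
    """Flag a violation as a candidate metaphor instead of failing the parse."""
    if violates_restriction(verb, subject):
        return "candidate-metaphor"
    return "literal"


print(analyse("go", "dog"))       # the subject satisfies the restriction
print(analyse("go", "variable"))  # "The variable N goes from 1 to 100."
```

A strict parser would stop at the violation; treating it as a signal for further interpretation is one simple way to keep such sentences analysable.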

6.2 Metaphor Reasoning Systems

Systems designed explicitly for reasoning with metaphors, and with explicit reference to conceptual metaphors, are KARMA (Narayanan, 1999) and ATT-Meta (Barnden and Lee, 2001).

These two systems, at least in their current state, offer slightly different functionality. KARMA can be seen as a story understanding system, producing a time-sliced representation of the development of (features in) the target domain, and/or an output of the end state in terms of feature structures; ATT-Meta can be seen as a Question-Answering system that verifies whether a fact (submitted as a user query) holds, given a possibly metaphorical representation of the current state of the world.

There are some fundamental differences in the computational design and particular abilities of the two systems. For example, at design and implementation level, KARMA uses x-schemas (“executing schemas” (Feldman and Narayanan, 2004, p. 387)) implemented as extended Stochastic Petri Nets for the source domain and Belief networks for the target domain. On the other hand, ATT-Meta uses situation-based or episode-based first-order logic throughout (Barnden and Lee, 2001, p. 34).

KARMA is ascribed the ability to detect interpreter’s bias in metaphorical utterances; for example, (1) implies that the speaker or writer in general opposes government control of economy, whereas (2) does not imply this. KARMA detects this difference in speaker attitude because it can draw source domain inferences of “strangle-hold”, which has a detrimental effect on business (in the target domain); cf. Narayanan (1999).

(1) Government loosened strangle-hold on business.

(2) Government deregulated business.

ATT-Meta is supposed to handle uncertainty and to be able to cope with a minimal set of inter-domain mappings, emphasising the importance of reasoning within domains, especially within the source domain; cf. Barnden and Lee (2001).

A commonality between the systems is that both have three main components, or representational domains, which reflect the influence of central ideas from metaphor theory:

1. Knowledge (common-sense knowledge, facts, default inferences) about the source domain, including relations in the form of (pre-)conditions.

2. Knowledge about the target domain. This is usually less elaborate than source domain knowledge.

3. Mapping information (KARMA) or conversion rules (ATT-Meta), which can be of different types and can also include information about features or facts not to be varied or changed, such as aspect (e.g. ongoing vs. completed action).

Both systems rely on extensive domain knowledge which has to be hand-coded, i.e. manually entered by the designer or user. The computational “engines” of the systems (x-schemas in KARMA or backchaining/backward reasoning of rules in ATT-Meta, starting from the user query) then derive new facts or consequences within the domains, usually within the source domain.³ Still, even though possibly simpler than the domain representations, the mapping rules are at the heart of these systems, as they allow information to flow from one domain to the other. For this reason, the remainder of this section will be devoted to the mapping rules.

ATT-Meta’s conversion rules are explained by Barnden and Lee (2001) using the following example (3):

(3) Then one day ... Anne found a recent ticket stub in her husband’s [Kyle’s] pocket; he’d never mentioned taking in a show. “If any of my girlfriends had told me a similar story, I would’ve asked if they were sure their mate wasn’t seeing someone else,” says Anne. “Instead, I just kept saying, ’This is the thin part of the through-thick-and-thin section of our wedding vows.’ ” In the far reaches of her mind, Anne knew Kyle was having an affair, but “to acknowledge the betrayal would mean I’d have to take a stand. I’d never be able to go back to what I was familiar with,” she says. Not until eight months had passed and she finally checked the phone bill did Anne confront the reality of her husband’s deception. [In Linden Gross, “Facing up to the Dreadful Dangers of Denial,” Cosmopolitan, 216(3), USA ed., March 1994. Bold font added, italics in original.]

³ Barnden et al. (2002, p. 1190) explain why it is sometimes necessary to reason in the target domain.

The clause shown in bold font (In the far reaches of her mind, Anne knew Kyle was having an affair) contains the metaphor to be reasoned upon. It is assumed that this clause manifests two metaphorical views, MIND AS PHYSICAL SPACE and IDEAS AS PHYSICAL OBJECTS, for an understander U.

Conversion Rules for IDEAS AS PHYSICAL OBJECTS. The mapping U needs to possess is as follows: “When an idea entertained by a person is being viewed as physical object, and the person’s conscious self is viewed as a person, then the ability of the conscious self to physically operate on the idea corresponds to the real person’s ability to operate in a conscious mental way on the idea.”

This metaphorical correspondence is captured in four user-supplied conversion rules, including:

IF is-idea(J) AND is-person(P)
AND WITHIN PRETENCE: is-physical-object(J)
AND WITHIN PRETENCE: is-person(conscious-self-of(P))
AND WITHIN PRETENCE: to-degree-at-least(Degree):
    can-physically-operate-on(conscious-self-of(P), J)
THEN {presumed} to-degree-at-least(Degree):
    can-consciously-mentally-operate-on(P,J).

Pretence is the “metaphor pretence cocoon” where the utterance istaken as literally true.

Source domain reasoning: The far reaches of a space are a physical part of that space, so what is in the far reaches of your mind is physically “in” your mind.

Apply conversion rule: You can mentally operate on an idea that is in the far reaches of your mind, to a very low degree at least.


Conversion Rules for MIND AS PHYSICAL SPACE. The mapping relationship that U is assumed to possess as part of MIND AS PHYSICAL SPACE is: “When a person’s mind is being viewed as a physical space, an idea’s being physically located in the space corresponds to the person’s being able to operate mentally on the idea, to a very low degree at least.”

This correspondence is also captured in four conversion rules, the most important for the Anne and Kyle example being:

IF is-idea(J) AND is-person(P)
AND WITHIN PRETENCE: is-physical-region(mind-of(P))
AND WITHIN PRETENCE: physically-in(J, mind-of(P))
THEN {presumed} to-degree-at-least(very-low):
    can-mentally-operate-on(P, J).

Source domain reasoning: something that is far away is accessible to physical operation only to a very low degree; the far reaches of something are far away from the center of that thing, so the degree of possible physical operation on an object in the far reaches is very low.

Apply conversion rule: default degree (very-low) from previous rule is not increased.

The result of the reasoning is that the system “knows” (can confirm a logical representation of) the following statement: Anne had a very low degree of ability to consciously process the idea that Kyle was having an affair.
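To make the shape of such a conversion rule concrete, the following is a minimal sketch of the MIND AS PHYSICAL SPACE rule applied to the Anne and Kyle example. It is not ATT-Meta’s implementation: the tuple encoding of facts, the constant names (anne, affair-idea) and the function name are assumptions made for illustration, and the rule is applied forwards over a fact set, whereas ATT-Meta backchains from a user query and also handles uncertainty.

```python
# Facts about the real situation and facts holding only inside the
# "pretence cocoon", written as plain tuples (hypothetical encoding).
real_facts = {("is-idea", "affair-idea"), ("is-person", "anne")}
pretence_facts = {
    ("is-physical-region", "mind-of(anne)"),
    ("physically-in", "affair-idea", "mind-of(anne)"),
}


def mind_as_physical_space(real, pretence):
    """Apply the MIND AS PHYSICAL SPACE conversion rule to all matching facts."""
    ideas = [fact[1] for fact in real if fact[0] == "is-idea"]
    persons = [fact[1] for fact in real if fact[0] == "is-person"]
    conclusions = set()
    for person in persons:
        mind = f"mind-of({person})"
        for idea in ideas:
            # Both pretence conditions of the rule must hold for this pair.
            if (("is-physical-region", mind) in pretence
                    and ("physically-in", idea, mind) in pretence):
                conclusions.add(
                    ("can-mentally-operate-on", person, idea,
                     "at-least:very-low")
                )
    return conclusions


print(mind_as_physical_space(real_facts, pretence_facts))
```

The conclusion is a target-domain fact: it mentions no physical region at all, which is the point of a conversion rule.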

KARMA has three kinds of mappings that connect (source) x-schema based representations to (target) belief networks representing abstract knowledge (e.g. about economics).

1. Ontological maps map entities and objects between embodied and abstract domains, e.g. MOVERS ARE ACTORS.

2. Schema maps map events, actions and processes, including an invariant mapping of their aspect. E.g. MOVING IS ACTING, FALLING IS FAILING.

3. Parameter maps project x-schema parameters from source to target domains. Examples: velocities–rate of progress made; distance traveled–degree of plan completion.
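The three kinds of maps listed above can be pictured as typed source-to-target pairs. The sketch below is only a data-structure illustration using the examples from the list; the class name, field names and lookup function are hypothetical, and KARMA itself connects x-schemas to belief networks rather than looking up strings in a table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetaphorMap:
    kind: str    # "ontological", "schema" or "parameter"
    source: str  # item in the embodied source domain
    target: str  # corresponding item in the abstract target domain


# Entries taken from the examples in the list above.
MAPS = [
    MetaphorMap("ontological", "mover", "actor"),
    MetaphorMap("schema", "moving", "acting"),
    MetaphorMap("schema", "falling", "failing"),
    MetaphorMap("parameter", "velocity", "rate of progress"),
    MetaphorMap("parameter", "distance traveled", "degree of plan completion"),
]


def project(kind, source_item):
    """Return the target-domain counterpart of a source item, or None."""
    for entry in MAPS:
        if entry.kind == kind and entry.source == source_item:
            return entry.target
    return None


print(project("schema", "falling"))
print(project("parameter", "velocity"))
```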


6.3 Problems

As Martin (1994) points out, one of the problems for knowledge-based systems with integrated metaphor handling is the acquisition of sufficient and suitable knowledge. Even today, systems like KARMA and ATT-Meta have to be provided with knowledge by the users. It would therefore be useful to provide more knowledge about metaphor in lexical resources, which could be either directly used in NLP systems, or used as a basis for building rules and networks in systems designed especially for metaphor handling. If well-studied linguistic knowledge supported by attestations in corpora were encoded in lexical resources, they could also be regarded as a common starting point for different systems, and the results of the systems would become more directly comparable.

A different but related challenge of NLP is to provide suitable input representations, and interpret the output representations, for and from such systems. As long as the input has to be pre-parsed or encoded in some logic formalism, the reasoning capabilities of the systems are not accessible to the normal user.

7 Resources: Metaphor Databases, Metaphor Annotation

Current general-domain lexicons are of very restricted usefulness for creative systems that aim at understanding or creating metaphorical expressions. However, lexical databases like WordNet have the potential to become useful basic resources for metaphor representation.

One of the reasons why general lexical resources have not been used as input for metaphor processing systems is their fine-grained and sometimes arbitrary sense distinctions (Martin, 1994). In order to overcome the lack of knowledge in his MIDAS system, Martin (1994) therefore built the MetaBank database, which is independent of any other lexicon or knowledge base. MetaBank contains mappings of single lexical metaphors like enter or kill. The grammatical objects of these lexical items can refer to containers (source) or processes (target), among others: enter a computer program, kill a process. For both senses of the words (in the source and target domain), representations of the meaning are built (including for example the super-concept as well as type restrictions for result, victim, and actor), which are then mapped by so-called metaphor maps (Martin, 1992). For building the knowledge base used in MIDAS, Martin (1994) and his colleagues analysed three sources:

1. the Berkeley Master Metaphor List, the latest version of which is Lakoff et al. (1991),

2. a specialized corpus from the computer domain containing questions and answers about UNIX;

3. (probes of) a newspaper corpus (three years of the Wall Street Journal).

The Berkeley list helps to perform directed searches (for example, on container metaphors), as Martin (1994) exemplifies using the computer corpus. He points out that an exhaustive metaphor analysis of general corpora would be most fruitful for building knowledge resources, but is not feasible with a large collection. Therefore, Martin (1994) analyses six newspaper samples of about 100 sentences each. One of his empirical insights is that a relatively small number of general conventional metaphors accounts for a high number of lexical metaphors, according to frequency counts. The consequence of these findings is that it is worthwhile to undertake a thorough study of metaphor domains that are lexicalized with a high frequency in general corpora, because a better representation of those areas could help many systems.

7.1 Metaphor annotation

Lee (2006) compiled a 42,000-word corpus of transcribed doctor-patient dialogues, exhaustively hand-annotated for stretches of metaphorical language. These are provided with conceptual labels enabling the author to identify prevalent and interrelated metaphorical mappings used as part of communicative strategies in this domain. Agreement on the identification of stretches of metaphor and other tropes in text ranges from kappa 0.82 to 0.85.

The Hamburg Metaphor Database (Lonneker and Eilts, 2004; Lonneker, 2004) combines data from corpora, EuroWordNet and a freely available metaphor list. Reining and Lonneker-Rodman (2007) annotated more than 1,000 instances of lexical metaphors from the MOTION and BUILDING domains in a French newspaper corpus centered on the European Union. These have been integrated into the Hamburg Metaphor Database. Inter-annotator agreement between two annotators, on a sample of 920 sentences from the MOTION domain, was found to be 0.897 (Cohen’s kappa).⁴
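Cohen's kappa corrects the raw agreement between two annotators for the agreement expected by chance. The following is a minimal sketch of that computation; the function name `cohen_kappa` and the toy labels are invented for illustration and are not taken from any of the annotation projects cited here.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the annotators' label proportions,
    # summed over all labels.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one and the same label throughout
    return (observed - expected) / (1.0 - expected)


# Toy data (invented): "met" = metaphorical, "lit" = literal.
ann_1 = ["met", "met", "lit", "lit", "met", "lit", "met", "met", "lit", "met"]
ann_2 = ["met", "met", "lit", "met", "met", "lit", "met", "lit", "lit", "met"]
kappa = cohen_kappa(ann_1, ann_2)
```

In this toy case the annotators agree on 8 of 10 items, but because both use "met" 60% of the time, chance agreement is already 0.52, so kappa is noticeably lower than the raw agreement of 0.8.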

Projects annotating metaphor in corpora also include Martin (1994) and Barnden et al. (2002).

8 Metaphor detection

Automated work on metaphor recognition, such as Mason (2004), Gedigian et al. (2006) and Birke and Sarkar (2006), focuses on verbs as the part of speech. A further recent approach to metaphor recognition is Krishnakumaran and Zhu (2007).

8.1 Birke and Sarkar 2006

Birke and Sarkar (2006) implemented a clustering approach for separating literal from non-literal language use, with verbs as targets. The algorithm is a modification of the similarity-based word-sense disambiguation algorithm developed by Karov and Edelman (1998).

Similarities are calculated between sentences containing the word to be disambiguated (target word), and collections of seed sentences for each word sense (feedback sets). In this particular case, there are only two feedback sets for each verb: literal or nonliteral. The authors do not make reference to any particular theory of metaphor or of figurative language in general. Accordingly, their literal-nonliteral distinction (important for building and cleaning feedback sets and for annotating a test set) is relatively vague and based mainly on examples. “[W]e will take the simplified view that literal is anything that falls within accepted selectional restrictions [...] or our knowledge of the world [...]. Nonliteral is then anything that is “not literal”, including most tropes, such as metaphors, idioms, as well as phrasal verbs and other anomalous expressions [...].” (Birke and Sarkar, 2006, p. 330)

Two sentences are similar if they contain similar words, and two words are similar if they are contained in similar sentences (transitive similarity).
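The mutual definition of word and sentence similarity can be illustrated with a small iterative sketch. This is a simplified, unweighted variant written for illustration only: the function name `transitive_similarity`, the uniform averaging, and the toy feature sets are assumptions, and the published algorithm of Karov and Edelman (1998) uses weighting schemes that are omitted here.

```python
def transitive_similarity(sentences, iterations=2):
    """Iteratively compute word and sentence similarities.

    sentences: list of non-empty lists of feature words.
    Returns (word_sim, sent_sim), dictionaries keyed by pairs.
    """
    vocab = sorted({w for s in sentences for w in s})
    n = len(sentences)
    # Start: every word is similar only to itself.
    word_sim = {(u, v): 1.0 if u == v else 0.0 for u in vocab for v in vocab}
    sent_sim = {}
    for _ in range(iterations):
        # Sentences are similar if they contain similar words: each word
        # of s is matched with its most similar word in t.
        for i in range(n):
            for j in range(n):
                s, t = sentences[i], sentences[j]
                sent_sim[(i, j)] = sum(
                    max(word_sim[(u, v)] for v in t) for u in s
                ) / len(s)
        # Words are similar if they occur in similar sentences: each
        # sentence containing u is matched with the most similar
        # sentence containing v.
        new_word_sim = {}
        for u in vocab:
            with_u = [i for i in range(n) if u in sentences[i]]
            for v in vocab:
                if u == v:
                    new_word_sim[(u, v)] = 1.0
                    continue
                with_v = [j for j in range(n) if v in sentences[j]]
                new_word_sim[(u, v)] = sum(
                    max(sent_sim[(i, j)] for j in with_v) for i in with_u
                ) / len(with_u)
        word_sim = new_word_sim
    return word_sim, sent_sim


# Toy feature sets (invented); the target verb itself is not a feature.
toy = [
    ["idea", "student"],
    ["concept", "student"],
    ["concept", "teacher"],
    ["stock", "market"],
]
_, sims_round_1 = transitive_similarity(toy, iterations=1)
_, sims_round_2 = transitive_similarity(toy, iterations=2)
```

Sentences 0 and 2 share no words, so their similarity is zero after the first round. In the second round it becomes positive, because "idea" and "concept" have both been seen next to "student". Sentence 3 stays unrelated to the others.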

⁴ Birte Lonneker-Rodman: “Behind the scenes at the Hamburg Metaphor Database.” Talk at the ROLAP-Group, Princeton, NJ, 1 May 2007.

In the original version of the algorithm, a target sentence is attracted to the feedback set containing the (one) sentence to which it shows the highest similarity. In the modified version used for literal/nonliteral discrimination, a target sentence is attracted to the feedback set to which it is most similar when its similarities to all sentences of each feedback set are summed. Birke and Sarkar (2006) also set a threshold below which the algorithm does not attract a sentence to either feedback set (in evaluation, those unclustered sentences are counted as if they had been wrongly clustered).
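The difference between the two attraction rules can be sketched as follows. The function names, the toy similarity values, and the choice to compare the best summed similarity against the threshold are assumptions made for illustration; the text above does not specify exactly which quantity is thresholded.

```python
def attract_original(sims_by_set):
    """Original rule: choose the feedback set that contains the single
    most similar sentence."""
    return max(sims_by_set, key=lambda name: max(sims_by_set[name]))


def attract_modified(sims_by_set, threshold=0.0):
    """Modified rule: sum the similarities per feedback set and choose the
    set with the highest total; return None (unclustered) if that total
    falls below the threshold."""
    totals = {name: sum(sims) for name, sims in sims_by_set.items()}
    best = max(totals, key=totals.get)
    return best if totals[best] >= threshold else None


# Similarities of one target sentence to each feedback sentence (invented).
sims = {
    "literal": [0.9, 0.1, 0.1],
    "nonliteral": [0.5, 0.5, 0.4],
}
by_original = attract_original(sims)
by_modified = attract_modified(sims)
unclustered = attract_modified(sims, threshold=2.0)
```

Here the original rule picks the literal set because of one very similar sentence, while the summed rule picks the nonliteral set, which is moderately similar throughout. With a high threshold the sentence remains unclustered.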

Features to measure the similarity are simply certain words in the sentence: stemmed nouns and verbs, excluding the target and seed words (see below for seed words) and the 374 most frequent English words.

For clustering and testing, sentences from the Wall Street Journal (WSJ) corpus are used.

For the nonliteral feedback set, the following sources are used:

1. WSJ sentences containing the target verb or synonyms found in two lists of nonliteral language use: Wayne Magnuson’s English Idioms Sayings & Slang⁵, and the Master Metaphor List,

2. as well as example sentences from these sources.

For example, for the target verb grasp, there might be sentences containing comprehend in the nonliteral feedback set. Exactly which method was used in this case to find synonyms of the target word, or to find the right sense in WSJ sentences, is not clear.

For the literal feedback set, sentences from two sources are used:

1. WSJ sentences containing the target word or one of its synonyms from within the same WordNet synset: for example, for the target verb grasp, there might be sentences containing grip;

2. as well as example sentences for these synset members extracted directly from WordNet itself.

The flaw of this set is that it might be quite noisy: “When we use WordNet as a source of example sentences, or of seed words for pulling sentences out of the WSJ, [...] we cannot tell if the WordNet synsets, or the collected feature sets, are actually literal.” (Birke and Sarkar, 2006, p. 331)

Although WordNet synsets are not flagged as literal or nonliteral, the authors could have opted for more “supervision” at this stage and checked the intended meaning of each synset from which they extracted WordNet examples. For the additional sentences extracted from the WSJ, such supervision would have been more difficult; the authors could have used PropBank (as did Gedigian et al. (2006)), although PropBank does not use WordNet sense distinctions, which might have complicated the work.

⁵ http://bits.westhost.com/idioms/ [28 June, 2006]


Given the data as they were, Birke and Sarkar (2006) devise several methods to clean the possibly noisy literal feedback sets. Cleaning can either move feedback sentences/feature sets from the literal to the nonliteral set, or remove them altogether. Moving is preferred (gets a higher weight) over deleting data.

The cleaning procedure relies on two factors: the presence of phrasal verbs (e.g. throw away vs. throw, knock out vs. knock) and the overlap of features (i.e. noun and verb stems) with sentences in the nonliteral feedback set, which is “trusted”.

We should note, however, that there are also sentences containing phrasal verbs which have been annotated as literal for the test set:⁶

(4) In northern Botswana , where the poor quality of their ivory protects elephants from poachers , the elephant population is out of control and has eaten and knocked down many square miles of scrubby forest.

(5) Another trader almost had his glasses knocked off in the jostling.

(6) Where Chekhov has Lopakhin knocking over a small table , Mr. Dennehy goes crashing into a big , folding screen that ’s standing in for a wall.

This illustrates one of the general problems with distinguishing only two sense categories: the “cluster” of literal uses contains several variants itself, including the one of knocking at a door, all of them describing different ways of colliding, entailing different consequences.

With the (thus automatically) cleaned data sets and two additional features, including a larger context (“a specified number of surrounding sentences”) which improves the results, the modified WSD algorithm achieves an average f-score⁷ of 53.8% over the 25 words in the manually annotated test set. Precision and recall are not provided individually. A baseline using the original WSD algorithm with attraction to the set containing the most similar sentence achieves an f-score of 36.9%. Active learning, where some examples are given back to a human annotator for decision during the classification, achieves an f-score of 64.9%.
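The f-score reported here is the harmonic mean of precision and recall given in footnote 7. As a minimal sketch (the function name and the example values are invented, and returning 0.0 when both inputs are zero is a convention chosen here, not something the source discusses):

```python
def f_score(precision, recall):
    """(2 x precision x recall) / (precision + recall)."""
    if precision + recall == 0:
        return 0.0  # assumed convention for the undefined 0/0 case
    return 2 * precision * recall / (precision + recall)


# Invented example values, not results from Birke and Sarkar (2006).
example = f_score(0.5, 0.75)
```

Because the harmonic mean is pulled towards the lower of the two values, a single f-score cannot be decomposed back into precision and recall, which is why their absence from the reported results matters.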

⁶ http://www.cs.sfu.ca/~anoop/students/jbirke/TroFiExampleBase.txt [28 June, 2006].

⁷ (2 x precision x recall) / (precision + recall).

Inter-annotator agreement on 200 sentences in the test set annotated by two annotators was found to be 0.77, using Cohen’s kappa (κ). “The second annotator was given no instructions besides a few examples of literal and nonliteral usage (not covering all target verbs).” (Birke and Sarkar, 2006, p. 333) In fact, possibly due to the loose definition of literal and nonliteral, several inconsistencies can also be found in the annotated data. This is normal for a semantic task of this kind, but causes and remedies should be sought against the background of a theory or approach that provides annotators with better criteria.

• Annotated as nonliteral in test set:⁸

– The executives ’ positions at their previous agencies have n’t yet been filled ./.

– Their flagship center in Cambridge ’s booming Kendall Square has filled all 44 openings , and 90 children are on the waiting list ./.

• Annotated as literal in test set:

– Also , Charles L. Magee , formerly deputy general counsel , was named vice president , general counsel and secretary , filling vacancies created when Richard Stewart resigned to pursue private law practice ./.

– Allen J. Krowe , 56 years old , leaves his job as executive vice president and a director of IBM to fill the positions at Texaco currently held by Richard G. Brinkman , who at 61 , will retire early ./.

8.2 Gedigian et al. 2006

Gedigian et al. (2006) use a maximum entropy (ME) classifier to identify metaphors. This classifier needs labeled training data. The authors annotated a subset of the Wall Street Journal for the senses of verbs from Motion-related, Placing, and Cure frames which were extracted from FrameNet (Fillmore et al., 2003).

The annotation shows that more than 90% of the 4,186 occurrences of these verbs in the corpus data are lexical metaphors. Gedigian et al. (2006) conclude that in the domain of economics, Motion-related metaphors are used conventionally to describe market fluctuations and policy decisions.

The annotator could choose one of three labels, standing for metaphor, literal, or unsure, respectively. For training the classifier, the cases labeled unsure were ignored.

⁸ http://www.cs.sfu.ca/~anoop/students/jbirke/TroFiExampleBase.txt [28 June, 2006].


The theoretical background of this work is the Cognitive Theory of Metaphor, exemplified by the conventionalized and wide-spread Event Structure Metaphor mapping, from the source domain of motion and forces to the target domain of abstract actions. A submapping within the Event Structure Metaphor is DIFFICULTIES ARE IMPEDIMENTS TO MOTION, which is illustrated by sentences such as It’s been uphill all the way. or We’ve been hacking our way through a jungle of regulations.

Metaphorical uses (sense potentials) can of course also just be treated as separate senses of the verbs (and, in fact, in many lexical databases this is the case). However, knowing that we are dealing with a metaphorical use (and, presumably, from which other (literal) sense it has been derived) benefits inferencing (as in the reasoning systems discussed in Section 6.2 above, or in other QA systems): “[A] metaphorical usage and a literal usage share inferential structure. For example, the aspectual structure of run is the same in either domain whether it is literal or metaphorical usage. Further, this sharing of inferential structure between the source and target domains simplifies the representational mechanisms used for inference making it easier to build the world models necessary for knowledge-intensive tasks like question answering [...].”

A classifier is trained on the annotated corpus to discriminate between literal and metaphorical usages of the verbs in unseen utterances.

The features used by the classifier include information about the arguments of the verbs, which are believed to be an important factor for determining whether a verb is being used metaphorically (cf. Subsection 6.1 on selectional restrictions):

1. Argument information is extracted from PropBank (Kingsbury and Palmer, 2002).

2. The head word of each argument is extracted using a method proposed by Collins (1999).

3. A type (“semantic type”) is assigned to the head word of each argument. How this is done depends on the type of head word:

• If the head is a pronoun, a pronoun type (human/non-human/ambiguous) is assigned (cf. Schneider (2007)).

• If the head is a named entity, the NE tag assigned by the named entity recognizer Identifinder is used as the type of the argument.

• Otherwise, the name of the head’s WordNet synset is used as the type of the argument.
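The three-way case distinction for typing an argument head can be sketched as below. Everything concrete in this sketch is hypothetical: the lookup tables stand in for the output of a named entity recognizer and for WordNet, and the pronoun list, tag names, and type prefixes are invented for illustration rather than taken from Gedigian et al. (2006).

```python
# Hypothetical pronoun classification, invented for this sketch.
PRONOUN_TYPES = {
    "he": "human", "she": "human", "who": "human",
    "it": "non-human", "which": "non-human",
    "they": "ambiguous", "them": "ambiguous",
}


def semantic_type(head, ne_tags, synset_names):
    """Assign a semantic type to the head word of an argument.

    ne_tags:      dict mapping a head word to a named-entity tag
                  (stand-in for a named entity recognizer's output).
    synset_names: dict mapping a head word to a synset name
                  (stand-in for a WordNet lookup).
    """
    word = head.lower()
    # Case 1: pronouns get a human/non-human/ambiguous type.
    if word in PRONOUN_TYPES:
        return "PRON:" + PRONOUN_TYPES[word]
    # Case 2: named entities get the recognizer's tag.
    if head in ne_tags:
        return "NE:" + ne_tags[head]
    # Case 3: everything else gets the name of its synset.
    return "WN:" + synset_names.get(word, "UNKNOWN")


# Toy lookups (invented).
ne_tags = {"IBM": "ORGANIZATION"}
synset_names = {"stocks": "stock", "window": "window"}

types = [semantic_type(h, ne_tags, synset_names)
         for h in ["They", "IBM", "stocks", "regulations"]]
```

The order of the checks mirrors the list above: a head is only looked up in WordNet if it is neither a pronoun nor a named entity. The "UNKNOWN" fallback for words without a synset is an addition of this sketch.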


Another feature used for the classifier is the bias of the target verb itself. The authors use this feature because they noticed that most verbs show a clear tendency towards literal or metaphorical uses.

The verbs are chosen from concrete frames that are likely to yield metaphors, especially motion-related frames, but also placing and cure. The verbs are collected from the FrameNet frames as lexical units, and sentences from the PropBank Wall Street Journal corpus containing these verbs are extracted and annotated.

The classifier has been trained on the hand-discriminated data with different combinations of features. On a validation set, the best results were obtained when a combination of the following features was used:

1. verb bias

2. semantic type of argument 1 (ARG1), typically realized as the direct object in active English sentences such as Traders threw stocks out of the windows;

3. (optionally,) semantic type of argument 3 (ARG3), the semantics and syntactic form of which is difficult to generalize due to verb-specific interpretations in PropBank.

On a test set of 861 targets, the classifier (trained with features 1 and 2 above) achieves an accuracy of 95.12% over all verbs in all frames. This is above the overall baseline of 92.9%, achieved by selecting the majority class of the training set, and above the alternative baseline of 94.89%, achieved by selecting the majority class of each verb specifically.
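The two baselines differ only in the level at which the majority class is chosen. A minimal sketch, assuming the data are available as (verb, label) pairs; the function names and the toy data are invented and do not reproduce the figures above:

```python
from collections import Counter, defaultdict


def majority_baseline(train, test):
    """Accuracy when always predicting the most frequent training label."""
    majority = Counter(label for _, label in train).most_common(1)[0][0]
    return sum(label == majority for _, label in test) / len(test)


def per_verb_baseline(train, test):
    """Accuracy when predicting, for each verb, its own most frequent
    training label (falling back to the overall majority for unseen verbs)."""
    overall = Counter(label for _, label in train).most_common(1)[0][0]
    by_verb = defaultdict(Counter)
    for verb, label in train:
        by_verb[verb][label] += 1
    correct = 0
    for verb, label in test:
        if verb in by_verb:
            guess = by_verb[verb].most_common(1)[0][0]
        else:
            guess = overall
        correct += guess == label
    return correct / len(test)


# Toy data (invented): "met" = metaphorical, "lit" = literal.
train = [("fall", "met")] * 6 + [("fall", "lit")] + [("treat", "lit")] * 3
test = [("fall", "met"), ("fall", "met"), ("treat", "lit"), ("fall", "lit")]

overall_acc = majority_baseline(train, test)
per_verb_acc = per_verb_baseline(train, test)
```

The per-verb baseline corresponds to relying on verb bias alone, which is why it is the harder baseline to beat when most verbs lean strongly towards one reading.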

Accuracy varies a little over frames. It is equal to or higher than the baselines in all frames, with the exception of the Cure frame. It remains open whether this is due to verbs with strong biases in the training data (treat had no metaphorical uses in the training data) or whether a different feature set might be more successful with this frame.

The classifier has been ported to a different Maximum Entropy implementation, and similar methods have been applied to metonymy, by Schneider (2007).

References

Aziz-Zadeh, L., Wilson, S. M., Rizzolatti, G., and Iacoboni, M. (2006). Congruent embodied representations for visually presented actions and linguistic phrases describing actions. Current Biology, 16, 16.


Barnden, J., Glasbey, S., Lee, M., and Wallington, A. (2002). Reasoning in metaphor understanding: The ATT-Meta approach and system. In Proceedings of the 19th International Conference on Computational Linguistics (COLING-2002), pages 1188–1192, San Francisco, CA. Morgan Kaufmann.

Barnden, J. A. and Lee, M. G. (2001). Understanding open-ended usages of familiar conceptual metaphors: an approach and artificial intelligence system. CSRP 01-05, School of Computer Science, University of Birmingham.

Birke, J. and Sarkar, A. (2006). A clustering approach for the nearly unsupervised recognition of nonliteral language. In Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics, pages 329–336, Trento, Italy.

Bowdle, B. F. and Gentner, D. (2005). The career of metaphor. Psychological Review, 112(1), 193–216.

Collins, M. (1999). Head-Driven Statistical Models of Natural Language Parsing. Ph.D. thesis, University of Pennsylvania.

Coulson, S. and Petten, C. V. (2002). Conceptual integration and metaphor: an event-related brain potential study. Memory and Cognition, 30(6), 958–968.

Dobrovol’skij, D. and Piirainen, E. (2005). Figurative Language: Cross-cultural and Cross-linguistic Perspectives. Elsevier.

Fauconnier, G. and Turner, M. (2002). The way we think: Conceptual blending and the mind’s hidden complexities. Basic Books, New York.

Feldman, J. (2006). From Molecule to Metaphor. MIT Press, Cambridge, MA.

Feldman, J. and Narayanan, S. (2004). Embodied meaning in a neural theory of language. Brain and Language, 89, 385–392.

Fillmore, C. J., Johnson, C. R., and Petruck, M. R. L. (2003). Background to FrameNet. International Journal of Lexicography, 16(3), 235–250.

Gedigian, M., Bryant, J., Narayanan, S., and Ciric, B. (2006). Catching metaphors. In Proceedings of the 3rd Workshop on Scalable Natural Language Understanding, pages 41–48, New York City.


Gibbs, R. (1994). The Poetics of mind: Figurative thought, language, and understanding. Cambridge University Press, Cambridge.

Glucksberg, S. and Keysar, B. (1990). Understanding metaphorical comparisons: Beyond similarity. Psychological Review, 97(1), 3–18.

Glucksberg, S., McGlone, M., and Manfredi, D. (1997). Property attribution in metaphor comprehension. Journal of Memory and Language, 36, 50–67.

Green, R. (2002). Internally-Structured Conceptual Models in Cognitive Semantics. In R. Green, C. Bean, and S. H. Myaeng, editors, The Semantics of Relationships, pages 73–89. Kluwer, Dordrecht et al.

Grice, H. P. (1975). Logic and conversation. In Syntax and Semantics, volume 3, pages 41–58. Academic Press, New York.

Hobbs, J. R. (1992). Metaphor and abduction. In A. Ortony, J. Slack, and O. Stock, editors, Communication from an Artificial Intelligence Perspective: Theoretical and Applied Issues, pages 35–58. Springer, Berlin.

Karov, Y. and Edelman, S. (1998). Similarity-based word sense disambiguation. Computational Linguistics, 24(1), 41–59.

Kingsbury, P. and Palmer, M. (2002). From Treebank to PropBank. In Proceedings of the 3rd International Conference on Language Resources and Evaluation (LREC-2002), Las Palmas, Canary Islands, Spain.

Kovecses, Z. (2002). Metaphor: a practical introduction. Oxford University Press, New York.

Krishnakumaran, S. and Zhu, X. (2007). Hunting elusive metaphors using lexical resources. In Proceedings of the Workshop on Computational Approaches to Figurative Language, pages 13–20, Rochester, New York. ACL.

Lakoff, G. (1987). Women, fire, and dangerous things. What categories reveal about the mind. University of Chicago Press, Chicago.

Lakoff, G. (1993a). The contemporary theory of metaphor. In A. Ortony, editor, Metaphor and thought. Second edition, pages 202–251. Cambridge University Press, Cambridge.


Lakoff, G. (1993b). The contemporary theory of metaphor. In A. Ortony, editor, Metaphor and thought, pages 202–251. Cambridge University Press, Cambridge, England, 2nd edition.

Lakoff, G. and Johnson, M. (1980). Metaphors we live by. University of Chicago Press.

Lakoff, G., Espenson, J., and Schwartz, A. (1991). Master metaphor list. Second draft copy. Technical report, Cognitive Linguistics Group, University of California Berkeley. http://cogsci.berkeley.edu.

Lee, M. (2006). Methodological issues in building a corpus of doctor-patient dialogues annotated for metaphor. In Cognitive-linguistic approaches: What can we gain by computational treatment of data? A Theme Session at DGKL-06/GCLA-06, pages 19–22, Munich, Germany.

Lonneker, B. (2004). Lexical databases as resources for linguistic creativity: Focus on metaphor. In Proceedings of the LREC 2004 Satellite Workshop on Language Resources and Evaluation: Language Resources for Linguistic Creativity, pages 9–16, Lisbon, Portugal. ELRA.

Lonneker, B. and Eilts, C. (2004). A current resource and future perspectives for enriching WordNets with metaphor information. In Proceedings of the Second International WordNet Conference – GWC 2004, pages 157–162, Brno, Czech Republic.

Martin, J. H. (1992). Computer understanding of conventional metaphoric language. Cognitive Science, 16(2), 233–270.

Martin, J. H. (1994). MetaBank: A knowledge-base of metaphoric language conventions. Computational Intelligence, 10(2), 134–149.

Mason, Z. J. (2004). CorMet: A computational, corpus-based conventional metaphor extraction system. Computational Linguistics, 30(1), 23–44.

McElree, B. and Nordlie, J. (1999). Literal and figurative interpretations are computed in equal time. Psychonomic Bulletin and Review, 6(3), 486–494.

Narayanan, S. (1999). Moving right along: A computational model of metaphoric reasoning about events. In Proceedings of the National Conference on Artificial Intelligence (AAAI ’99), pages 121–129, Orlando, Florida. AAAI Press.


Ortony, A. (1979). Beyond literal similarity. Psychological Review, 86, 161–180.

Reddy, M. J. (1979). The conduit metaphor: A case of frame conflict in our language about language. In A. Ortony, editor, Metaphor and thought, pages 284–324. Cambridge University Press, Cambridge.

Reining, A. and Lonneker-Rodman, B. (2007). Corpus-driven metaphor harvesting. In Proceedings of the HLT/NAACL-07 Workshop on Computational Approaches to Figurative Language, pages 5–12, Rochester, NY. ACL.

Schneider, N. (2007). A metonymy classifier. Manuscript.

Searle, J. (1979). Metaphor. In A. Ortony, editor, Metaphor and Thought, pages 92–123. Cambridge University Press, Cambridge, England, 1st edition.
