
Available online at www.sciencedirect.com

www.elsevier.com/locate/cogsys

Cognitive Systems Research 10 (2009) 270–285

Skill transfer through goal-driven representation mapping

Action editor: Angela Schwering

Tolga Konik *, Paul O’Rorke, Dan Shapiro, Dongkyu Choi, Negin Nejati, Pat Langley 1

Computational Learning Laboratory, Center for the Study of Language and Information, Stanford University, Stanford, CA 94305, USA

Received 19 April 2008; accepted 19 September 2008; available online 10 January 2009

Abstract

In this paper, we present an approach to transfer that involves analogical mapping of symbols across different domains. We relate this mechanism to ICARUS, a theory of the human cognitive architecture. Our system can transfer skills across domains by hypothesizing maps between representations, improving performance in novel domains. Unlike previous approaches to analogical transfer, our method uses an explanatory analysis that compares how well a new domain theory explains previous solutions under different mapping hypotheses. We present experimental evidence that the new mechanism improves transfer over ICARUS' basic learning processes. Moreover, we argue that the same features which distinguish ICARUS from other architectures support representation mapping in a natural way and operate synergistically with it. These features enable our analogy system to translate a map among concepts into a map between skills, and to support transfer even if two domains are only partially analogous. We also discuss our system's relation to other work on analogy and outline directions for future research.
© 2009 Elsevier B.V. All rights reserved.

Keywords: Analogy; Representation mapping; Cognitive architectures; Transfer; Transfer learning; Agent architectures

1. Introduction

Many computational learning methods require far more training instances than humans to achieve reasonable performance in a domain. A key reason is that humans often reuse knowledge gained in early settings to aid learning in ones they encounter later. This phenomenon is known as transfer in cognitive psychology, where it has received far more attention than in AI and machine learning.

Much of this research has focused on transfer of complex skills for tasks that involve action over time (e.g. Kieras & Bovair, 1986; Singley & Anderson, 1988). This paper reports on a computational approach to transfer that takes a similar perspective. We focus on the acquisition of cognitive skills from experience and on how transfer improves behavior on distinct but related tasks.


* Corresponding author.
E-mail addresses: [email protected] (T. Konik), [email protected] (P. O'Rorke), [email protected] (D. Shapiro), dongkyuc@stanford.edu (D. Choi), [email protected] (N. Nejati), [email protected] (P. Langley).
1 Current address: School of Computing and Informatics, Arizona State University, P.O. Box 87-8809, Tempe, AZ 85287, USA.


We share with many psychologists the idea that transfer is mainly a structural phenomenon. This suggests that transfer is linked closely to how an agent represents knowledge in memory, how its performance methods use these structures, and how its learning elements acquire this knowledge. Theoretical commitments to representation, performance, and learning are often associated with the notion of a cognitive architecture (Newell, 1990). Thus, it seemed natural for us to study transfer in the context of ICARUS (Langley & Choi, 2006), an architecture that takes positions on each of these issues.

In previous research (Choi, Konik, Nejati, Park, & Langley, 2007), we showed that ICARUS is natively capable of producing near transfer due to its commitments to relational, hierarchically composable, and goal-indexed knowledge representations, plus its mechanisms for using and acquiring such structures. This form of transfer is due to the application of literally similar knowledge in multiple tasks, represented in ICARUS as generalized skills acquired by a flexible learning method.

However, literal similarity does not explain all transfer phenomena. Humans often reason analogically between situations that are very different on the surface but contain structural correspondences, apparent in statements such as "an electric battery is like a reservoir". Structure-mapping theory (Gentner, 1983) explains this kind of analogy by detecting structural correspondences between two domains. This paper investigates a related approach. Our focus is to explain far transfer, defined as transfer between domains that employ different symbols to represent relations among objects. We introduce a new mechanism for goal-directed analogical reasoning that identifies mappings between such domains, and we use that mapping to port both conceptual and procedural knowledge into a new setting. We combine this mechanism with features of ICARUS, resulting in a system that acquires skills from experience in one problem domain and employs them in another, despite significant differences in vocabulary.

More specifically, we develop a novel algorithm for representation mapping, called GAMA (goal-driven analogical mapping), that identifies correspondences between relational representations through explanatory analysis. This process is mainly guided by pragmatic (i.e., related to the purpose of analogy) considerations, using the terminology of Holyoak and Thagard (1989), but it is also biased to satisfy structural and semantic constraints as defined by Gentner (1983). We claim that:

• Goal-directed analogy enables far transfer in a cognitive system.
• The same features that distinguish ICARUS from other cognitive architectures support goal-directed analogy in a natural way.

We support the first claim via computational experiments on far transfer tasks. In particular, we demonstrate a significant increase in problem-solving speed due to mapped knowledge, and then present the results of a lesion study that shows the speedup is due to the GAMA algorithm. We support the second claim by illustrating the dependence of GAMA on core features of the ICARUS architecture:

• two separate long-term memories for procedural (skills) and declarative knowledge (concepts);
• skills indexed by the goals they achieve (thus associating all procedural knowledge elements with declarative knowledge elements); and
• hierarchical and composable representations for both skills and concepts.

These features work synergistically with GAMA by letting it translate a map among concept predicates into a map between skills, and by supporting far transfer even in the case of partial analogy.

We elaborate on these ideas in the pages that follow. The next section presents an experimental testbed for computer games that illustrates the benefits of reusing learned knowledge structures. In Section 3, we review ICARUS' assumptions about representation, performance, and learning, and how these assumptions support near transfer. In Section 4, we describe our representation mapping algorithm, GAMA, showing how it transfers ICARUS concepts and skills across domains. In that section, we also discuss how ICARUS' architectural commitments help GAMA to support transfer. In Section 5, we evaluate our claims using specific experiments with game playing domains. We conclude by reviewing related efforts on structural transfer (Section 6), describing our priorities for future research on this topic (Section 7), and stating our conclusions (Section 8).

2. The transfer setting

Our research has studied transfer via a collection of challenge problems phrased as source-target pairs. Here, the object is to solve the source problem, extract transferable knowledge from that experience, and demonstrate that these structures improve the agent's ability to solve the target problem. We measure transfer by comparing the time required to solve the target problem with and without exposure to the source.

We have employed the general game playing (GGP) framework (Genesereth, Love, & Pell, 2005) to structure this task. GGP encodes tasks (typically games) in a first-order logic language that employs separate theory elements to describe legal moves, state transitions, and the initial state associated with game instances. GGP also enforces a careful evaluation model by presenting game rules at the same time as game instances. This requires agents to employ general mechanisms for performance and learning, while constraining the role of background knowledge.

We have studied three types of far transfer tasks, characterized by the nature of the analogy within each source-target pair. Abstraction tasks admit a one-to-one correspondence between symbols denoting objects and/or relations, and allow elements in the source with no target counterpart. Fig. 1a gives an example drawn from a game called Escape, where the agent's goal is to direct the explorer to an exit. The source task requires nailing logs together to create a bridge over the river, while the target requires tying barrels together with rope. The predicates for nailing and tying differ from source to target, and while the logs correspond to the barrels, the hammer has no target counterpart. This makes transfer difficult because the agent must discover the mapping between source and target symbols and abstract away the parts of the source knowledge (e.g., knowledge about the hammer) that are not applicable in the target.
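To make this concrete, the sketch below shows the flavor of representation map the agent must discover for this pair. The doesnail/doestie and hammer correspondences reappear later in Table 5; the log/barrel symbol names are our guesses at the game vocabulary, not the actual game definitions.

```python
# Hypothetical representation map for the Escape abstraction task.
# Mapping a symbol to None marks it as having no target counterpart,
# so source knowledge that mentions it must be abstracted away.
escape_map = {
    "doesnail": "doestie",   # the nailing predicate maps to tying
    "log": "barrel",         # logs correspond to barrels (assumed names)
    "hammer": None,          # the hammer has no target counterpart
}
```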


Fig. 1. Source-target scenario pairs.


Reformulation scenarios consist of isomorphic problems created by consistently replacing all source symbols to obtain the target task. Fig. 1b gives an example taken from a domain called Wargame, where the goal is to maneuver the soldier to the exit while avoiding or defeating enemies. The enemies actively seek the soldier and move twice as fast, but they can become stuck at walls. Supply points contain weapons and ammunition. These transfer problems are difficult because the agent must discover a deliberately obscured source-target relation.

Finally, cross-domain scenarios draw the source and target problems from different games. Fig. 1c shows an Escape to mRogue example (after the ancient text game), where the agent gets points for gathering treasure, defeating monsters, and exiting the playing field. These scenarios do not deliberately support analogies, but all games occur on 2D grids and involve reaching a goal after surmounting obstacles by collecting and employing appropriate tools. Transfer pairs in this class are difficult because the agent must identify both the symbol mapping and the portions of the source solution that are preserved.

3. The ICARUS architecture

ICARUS is a cognitive architecture in the tradition of work on unified theories of cognition (Newell, 1990). As such, it provides general-purpose reasoning mechanisms and representations. We discuss these components in the traditional order, beginning with the key representational elements (concepts and skills), followed by the operations that employ them (inference, execution, and learning). We place these components in context at the section's end by describing the capacity for near transfer they naturally support.

Table 1. Example ICARUS concepts.

3.1. Representation of concepts and skills

The ICARUS architecture makes several commitments about knowledge representation. First, it supports two distinct structures to represent tasks; concepts describe situations in the environment, while skills describe how to achieve such situations. Both concepts and skills are organized in hierarchies, which consist of primitive structures in the terminal nodes and nonprimitive ones in the internal nodes. This means that ICARUS can employ multiple layers of abstraction to describe the current state and the procedures for manipulating that state.

ICARUS' long-term conceptual knowledge is described by concept clauses that resemble traditional Horn clauses in first-order logic (Table 1). An ICARUS agent uses primitive concepts like location to describe symbolic and numeric features of the perceived state. Higher-level concepts such as atExit are defined in terms of other concepts and enable the agent to form beliefs (concept instances) given other concept instances. Since concepts can also represent features of desired states, ICARUS uses them to describe goals.

ICARUS' skills and concepts correspond to hierarchical task networks like those in SHOP2 (Nau, Cao, Lotem, & Munoz-Avila, 1999) with one additional constraint: the skills are indexed by the goals they are intended to achieve. All skill clauses (Table 2) have a head, which indicates the skill's goal, and start conditions, which describe the situation in which it is applicable. Primitive skills, like location in Table 2, achieve their goal by executing an action (move in this case), while nonprimitive skills specify more complex activities that involve decomposing the skill into subskills (e.g., atExit is decomposed into subskills such as holding).

It is important to note that skills and concepts contain different kinds of information, even though ICARUS uses the same predicate symbols to refer to both.


Table 2. Primitive and non-primitive ICARUS skills.


A concept p defines what p is and when its instances can be inferred, while the skill p is a constructive definition that describes how instances of p can be achieved or created through actions. For example, while the atExit concept (Table 1) defines when an agent is at an exit location, the atExit skill (Table 2) describes how to bring the agent to that exit location.
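Since Tables 1 and 2 did not survive extraction, the following minimal sketch reconstructs the two clause types from the prose alone; the field names and clause bodies are our assumptions about their spirit, not the published notation.

```python
from dataclasses import dataclass, field

# Predicates are tuples such as ("atExit", "?agent"); strings beginning
# with "?" are variables.

@dataclass
class ConceptClause:
    """Says WHEN instances of `head` can be inferred from other beliefs."""
    head: tuple
    relations: list            # subconcepts that justify the head

@dataclass
class SkillClause:
    """Says HOW instances of `head` can be achieved; indexed by its goal."""
    head: tuple                # the goal this skill achieves
    start: list                # conditions under which it is applicable
    subgoals: list = field(default_factory=list)  # nonprimitive case
    action: tuple = None       # primitive case: one executable action

# The atExit CONCEPT defines when the agent is at an exit location ...
atExit_concept = ConceptClause(
    head=("atExit", "?agent"),
    relations=[("location", "?agent", "?x", "?y"), ("exit", "?x", "?y")])

# ... while the atExit SKILL describes how to bring the agent there,
# e.g., by first achieving a holding subgoal (per the text's example).
atExit_skill = SkillClause(
    head=("atExit", "?agent"),
    start=[("location", "?agent", "?x0", "?y0")],
    subgoals=[("holding", "?tool"), ("location", "?agent", "?x", "?y")])

# A primitive skill such as location achieves its goal with an action.
location_skill = SkillClause(
    head=("location", "?agent", "?x", "?y"),
    start=[("adjacent", "?x0", "?y0", "?x", "?y")],
    action=("move", "?agent", "?x", "?y"))
```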

ICARUS skill clauses refer to concepts in their head, start conditions, and subgoals. The SHOP2 formalism also refers to defined predicates in its methods' preconditions, but their heads and bodies refer to arbitrarily named tasks and subtasks. ICARUS' use of goals (desired concepts) in their place plays important roles in execution and learning.

3.2. Execution of hierarchical skills

The ICARUS architecture operates in cognitive cycles, spanning conceptual inference, skill selection, and physical execution (Fig. 2). The system derives its beliefs via a bottom-up inference process, initiated by object descriptions that arrive in the agent's perceptual buffer. After it infers primitive beliefs from these percepts, inference over higher-level concepts generates additional beliefs. In contrast, ICARUS performs skill selection in a top-down manner, starting with the unsatisfied goal having the highest priority. On each cycle, it finds a path from this goal downward through the skill hierarchy; each skill along this path must have an unsatisfied head and be applicable given current beliefs, with the terminal node being a primitive skill that ICARUS executes in the environment. This differs from traditional production systems (Neches, Langley, & Klahr, 1987), which require multiple cycles and use of working memory to traverse levels of an implicit goal hierarchy.

Fig. 2. A schematic of memories and processes in the ICARUS architecture.

Since the architecture repeats this procedure on each cycle, agents combine reactivity with goal-driven behaviors. ICARUS is goal-driven because an applicable skill path corresponds to an abstract partial plan for achieving the goal. The system is reactive because it reevaluates the skill path on each execution cycle. Moreover, there is a bias toward persistence because the execution module prefers portions of the path selected on the previous cycle as long as they are still applicable.
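The top-down selection step can be pictured as a recursive descent over goal-indexed skills. A minimal sketch under the clause structure assumed above, restricted to ground literals (unification and the persistence bias are elided):

```python
def select_skill_path(goal, beliefs, skills, path=()):
    """Top-down search for an applicable skill path: from `goal` down the
    hierarchy to a primitive skill whose start conditions hold. `skills`
    maps a goal predicate name to candidate SkillClause objects. Returns
    a tuple of clauses ending in a primitive skill, or None."""
    for clause in skills.get(goal[0], ()):       # skills indexed by goal
        if any(cond not in beliefs for cond in clause.start):
            continue                             # not applicable right now
        if clause.action is not None:            # primitive: execute this
            return path + (clause,)
        # nonprimitive: descend on the first unsatisfied subgoal
        for sub in clause.subgoals:
            if sub not in beliefs:
                deeper = select_skill_path(sub, beliefs, skills,
                                           path + (clause,))
                if deeper is not None:
                    return deeper
                break        # this clause's next step failed; try another
    return None
```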

3.3. Learning hierarchical skills

Nejati, Langley, and Konik (2006) reported an analytical method to learn hierarchical skills. This mechanism inputs the goal and a solution trace that achieves it, stated as a sequence of observed states and actions that produce them. The mechanism interprets the solution trace in the context of the system's conceptual knowledge and primitive skills and explains how the solution path produced the desired goal. Then, it outputs a skill hierarchy for achieving that goal.

The learning component starts its analysis with the top-level goal and explains it with a recursive procedure, where a set of goals is either decomposed into subgoals using conceptual definitions or explained in terms of the effects of primitive skills. The system then applies the same procedure recursively on the subgoals and the corresponding parts of the solution trace, until it generates all the explanations necessary for the overall task. The architecture transforms the resulting explanation structure into hierarchical skills and adds them to skill memory for future use.
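A compressed sketch of that recursive analysis, under heavy simplifying assumptions: concept definitions are stored as lists of subgoal literals keyed by predicate name, and the role of the solution trace in selecting which clause actually applied is elided.

```python
def explain(goal, concepts, primitive_effects):
    """Recursively explain how a goal was achieved: either one primitive
    skill's effects include it (base case), or a concept definition
    decomposes it into subgoals. In the real method the solution trace
    selects WHICH clause applied at each point; that choice is elided."""
    if goal[0] in primitive_effects:             # achieved by one action
        return {"goal": goal, "children": []}
    subgoals = concepts[goal[0]]                 # decompose via definition
    return {"goal": goal,
            "children": [explain(g, concepts, primitive_effects)
                         for g in subgoals]}

def skills_from(explanation, acc=None):
    """Turn the explanation tree into goal-indexed skill clauses: each
    internal node becomes a skill whose subgoals are its children."""
    acc = [] if acc is None else acc
    kids = explanation["children"]
    if kids:
        acc.append({"head": explanation["goal"],
                    "subgoals": [c["goal"] for c in kids]})
        for c in kids:
            skills_from(c, acc)
    return acc
```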

This learning method is related to previous techniques for explanation-based learning (Segre, 1987), but instead of compiling away the explanation structure, it uses that structure to determine the structure of the skill hierarchy. We reported another approach (Langley & Choi, 2006) for learning skill hierarchies that relies on means-ends problem solving instead of an explanatory analysis of solution traces, although it requires more search through the problem space.

3.4. Near transfer in ICARUS through generalized skills

In our previous research (Choi et al., 2007), we demonstrated ICARUS' near transfer capabilities on Urban Combat, a real-time, first-person perspective game.2 We claimed that the architecture achieves this effect due to its use of relational, hierarchically composable, and goal-indexed skills, along with mechanisms for using and acquiring them. These representational commitments mean that skills exist as independent knowledge elements that can be composed in novel ways. The learning algorithm just described utilizes this capability by acquiring many small skills that will be composed at run-time, as opposed to creating a single macro with a long sequence of actions (Mooney, 1990). ICARUS' flexible execution mechanism supports this ability by making skill decomposition decisions at run-time based on observed situations.

ICARUS' relational representation facilitates transfer by increasing the generality of the encoded skills such that they apply in circumstances that are only qualitatively similar to the situations that led to their acquisition. Learning supports this generality by employing relational background knowledge. As a result, the learned skills reference the system's conceptual vocabulary, letting ICARUS retrieve them more flexibly during execution. For example, suppose an agent has acquired a skill that has the goal of reaching a target location. Its start conditions will indicate that this location is collectively-blocked (a background concept that matches when multiple objects collectively impede a path), and it will have subgoals for reaching the closest blocking object and trying to overcome it (e.g., by jumping over it). The generality of the collectively-blocked concept lets the agent apply this learned skill in novel situations where multiple objects collectively block a path in very different configurations.

2 Details of this game are available at http://www.urban-combat.net.

ICARUS' goal-indexed skill representation facilitates transfer by enabling the skills learned on different occasions to be composed in new situations in novel ways. In cases where learned skills are partially incorrect or inaccurate given the current situation, the execution module can dynamically replace the unusable subskills with others, or it can return control to alternative higher-level skills. For example, suppose that ICARUS is executing a navigation skill for reaching a target location in a partial information game. When the agent encounters an unexpected impasse such as a destroyed bridge, its source navigation skill for traversing that region may become inapplicable. Given alternate learned skills, the agent could try to achieve the same subgoal by constructing a raft, or by falling back to default exploration skills that search for an alternative path to a goal location.

4. Far transfer through representation mapping

Although the architectural mechanisms described in the previous section account for interesting transfer phenomena, ICARUS does not exhibit far transfer in cases where the source and target domains are represented with different symbols. This section describes a new method for representation mapping and transfer that enables better reuse of knowledge in such settings.

Analogical mechanisms are often described in terms of distinct steps. Here, we extend Thagard, Holyoak, Nelson, and Gochfeld's (1990) decomposition by including an additional first and fifth step:

1. Performance and learning in the source domain.
2. Retrieving or selecting plausibly useful source knowledge.
3. Mapping between the source and the target domains.
4. Translating source information for appropriate use with the target.
5. Performing with the transferred knowledge.
6. Subsequent learning in the target domain.

We have developed a new representational transfer algorithm called GAMA (goal-driven analogical mapping) to support far transfer by handling the mapping and translation tasks identified above (steps 3 and 4). We use GAMA to transfer executable knowledge, and demonstrate it in a setting (see Section 5) that includes performance and learning in the source domain as well as performance in the target domain (steps 1 and 5, above). GAMA consists of two parts: the representation mapper finds the correspondences between source and target symbols, while the representation translator uses those correspondences to translate source skills and concepts into skills and concepts in the target domain. The following subsections describe these components of GAMA in more detail. Note that we have not fully integrated GAMA into the ICARUS architecture, so we make no commitments about how and when the retrieval of relevant past experiences (step 2 above) and transfer should occur in general.

The input of GAMA consists of source and target domain theories,3 a solution for a source task, and source skills learned from that solution. The output comprises target skills and concepts constructed from transferred source knowledge. The new transfer mechanism uses these learned source skills to improve performance in the target domain.

4.1. Representation mapping

The intuition behind the representation mapper is that there is an opportunity for analogy if one can explain the solution for a source problem in target terms. In overview, the algorithm takes as input a source problem and an explanation of how that problem was solved. Next, it generates an analogous explanation from the perspective of the target domain theory by matching target theory clauses against the source explanation. This first forges links between the source and target theories, then returns as output correspondences between concept predicates (and objects) of the source and target theories.

In more detail, the input consists of source and target domain theories, a source solution, and learned source knowledge. The domain theories describe the initial state, rules, and goal using ICARUS concepts and primitive skills.4 The source solution consists of a goal instance and a sequence of action-state pairs achieving that goal. Finally, learned source knowledge contains an explanation of the source solution and learned source skills automatically generated by the learning system described in Section 3.3. The source explanation is an intermediate structure produced by this learning method, consisting of a directed graph in which the nodes are facts that were relevant for achieving the source goal and the links represent their justification in terms of source theory clauses. The output is a representation map, a set of source-target predicate (or object) correspondences that can convert source skills and concepts to the target representation. GAMA hypothesizes correspondences between source and target predicates by aligning target theory clauses against the source explanation.

3 In the GGP setting, we assume that an agent knows the rules of both source and target games but initially does not possess strategic knowledge in either domain.

4 We use a preprocessor that automatically translates GGP game rules to ICARUS' representation.

The system starts this process by forming a correspondence between the source and target goals. Next, GAMA expands the source goal using the explanation for how it was achieved, and the target goal using the target theory for how it might be achieved. GAMA then searches for mappings among predicates and objects that let the target theory match onto the source explanation.

GAMA's search is guided by constraints and heuristics described in the next section. Whereas hard constraints cause GAMA to backtrack, heuristics and soft constraints determine the order of search through the space of correspondences and the preference among alternative mappings. Once GAMA finds a consistent alignment, it returns as output a representation map M, which is a hypothesized set of correspondences between the source and target concept predicates such that, given M, the target theory can explain the source solution in a similar fashion to the given source explanation. GAMA also facilitates transfer between domains that are analogous at an abstract level by allowing partial mappings that do not perfectly align the source explanation with the target theory. However, it tries to minimize the amount of abstraction with a bias that prefers mappings which cover as much of the source explanation as possible.

Table 3 presents an example of an alignment process. In overview, GAMA expands a given correspondence into antecedents using the explanation on the source side and the domain theory on the target side, creating an opportunity to establish new mappings. The alignment starts with the correspondence (atExit explorer) ↔ (atGoal hero) between the top-level goals. On the target side, GAMA unifies the goal instance (atGoal hero) with the target concept clause (atGoal ?agent), which it will expand later to continue the mapping process. It matches those subconcepts against the antecedents of (atExit explorer) in the source explanation. Consequently, GAMA generates a new set of correspondences between the antecedent pairs, such as (location explorer 5 0) ↔ (place hero ?X ?Y), which it will expand later. Note that, in this table, the antecedent correspondence (not (blocked east 4 0)) ↔ true exemplifies a case where GAMA could not construct a perfect alignment and leaves the source explanation node without a target correspondence.

Table 4 presents the pseudo-code for the GAMA algorithm. Given a source explanation E, target theory T, and the source and target goals GS and GT, GAMA starts the alignment process by forming a correspondence between GS and GT. At any given time, it keeps track of a partially expanded correspondence between E and T using two data structures. The set of candidate correspondences C contains correspondences that have not been examined yet, whereas the correspondence support set S accumulates the correspondences committed to so far. The algorithm uses S to construct its output and to ensure that the collected correspondences are globally consistent. C and S are initialized with the top-level goal correspondence GS ↔ GT. On each iteration, GAMA picks the most promising correspondence s0 ↔ t0 ∈ C based on a heuristic function h (Section 4.2).


Table 3. Example alignment process of a source explanation against a target theory.


Next, it applies a correspondence expansion on s0 ↔ t0, where correspondences are formed between the antecedents of s0 and t0. During correspondence expansion, it generates source antecedents s1, ..., sn that justify s0 in the source explanation, and target antecedents t1, ..., tm that regress the goal t0 through a single concept or primitive skill clause. These antecedents are concepts in the source and target domains, which GAMA seeks to analogize. A correspondence expansion returns an antecedent alignment A = {..., (si ↔ tj), ...} that is a pairwise correspondence between source and target antecedents. If the number of source and target antecedents is not the same, the system allows a partial antecedent alignment with correspondences such as true ↔ tj or si ↔ true, which is the basis for learning mappings between domains that are similar only at an abstract level. However, the method assigns a high heuristic cost to such abstract correspondences.

After each correspondence expansion, GAMA updates its data structures. It removes the expanded correspondence from C, and adds the newly created antecedent correspondences to C as well as to S. The algorithm backtracks if the newly added correspondences create an inconsistency in S based on constraints we describe in the next subsection. This process continues recursively until C is empty, when it returns a representation mapping constructed using the information in S.

This pseudo-code defines a search process generated by the nondeterministic correspondence expansion function. This function calculates all possible antecedent alignments for a given correspondence, but returns them one by one (per a heuristic order) each time the algorithm backtracks to a given choice point. For example, in Table 3 the first correspondence expansion results in (location explorer 5 0) ↔ (place hero ?X ?Y), but GAMA also considers an incorrect alternative correspondence (location exit 5 0) ↔ (place hero ?X ?Y) during its search. The algorithm sorts its alternatives at choice points using a heuristic function h() that we describe shortly. When new additions to the correspondence support set S violate constraints, GAMA backtracks.

Page 8: Skill transfer through goal-driven representation mappingdgs/papers/icarus.mapping09.pdf · Skill transfer through goal-driven representation mapping Action editor: Angela Schwering

Table 4. The GAMA algorithm, with line numbers for nondeterministic steps in bold.

T. Konik et al. / Cognitive Systems Research 10 (2009) 270–285 277

It also backtracks to an earlier choice point if the heuristic value of the collected correspondences in S is too low (e.g., if there are too many abstractions).
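Because the pseudo-code in Table 4 did not survive extraction, the following is a loose Python reconstruction of the loop just described, built from the prose alone. The data structures, the padding scheme for partial alignments, and the cutoff threshold are our simplifications; the published algorithm regresses goals through theory clauses rather than consuming precomputed antecedent tables, and the brute-force permutation step here is exponential where the real system searches heuristically.

```python
from itertools import permutations

THRESHOLD = -1.0   # give up on maps whose average confidence is lower

def gama(source_goal, target_goal, src_ante, tgt_ante, h):
    """Backtracking alignment of a source explanation against a target
    theory. C holds unexamined correspondences, S those committed so far.
    src_ante maps a source predicate to the antecedents justifying it in
    the explanation; tgt_ante maps a target predicate to antecedents that
    regress it through one theory clause. h scores sets of pairs."""

    def expand(s, t):
        # Yield candidate antecedent alignments, best first by h; unmatched
        # antecedents on either side pair with True (an abstraction).
        ss, ts = list(src_ante.get(s, [])), list(tgt_ante.get(t, []))
        n = max(len(ss), len(ts))
        ss += [True] * (n - len(ss))
        ts += [True] * (n - len(ts))
        for perm in sorted(set(permutations(ts)),
                           key=lambda p: -h(set(zip(ss, p)))):
            yield set(zip(ss, perm))

    def consistent(S):
        # one-to-one structural constraint on the source side
        seen = {}
        return all(seen.setdefault(s, t) == t
                   for s, t in S if s is not True)

    def search(C, S):
        if not C:                              # nothing left to examine
            return {s: t for s, t in S
                    if s is not True and t is not True}
        s, t = max(C, key=lambda c: h({c}))    # most promising pair
        for A in expand(s, t):                 # nondeterministic step
            S2 = S | A
            if consistent(S2) and h(S2) > THRESHOLD:
                found = search((C - {(s, t)}) | A, S2)
                if found is not None:
                    return found               # otherwise backtrack
        return None

    top = (source_goal, target_goal)
    return search({top}, {top})
```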

4.2. Constraints and heuristics used in representation mapping

If not controlled, the space of all possible representation mappings is very large. Like other techniques that find analogies, ours is guided by several constraints and heuristics. These fall into three groups: pragmatic, structural, and semantic constraints, using the terminology of Holyoak and Thagard (1989).

Pragmatic constraints concern the purpose of analogy and play a central role in GAMA. In our case, the purpose of analogy is to transfer source skills that can achieve a target goal, and the search process is guided through formation of an explanation of that target goal. Moreover, GAMA considers only correspondences that are necessary to transfer relevant source skills. In particular, it forms correspondences for the symbols that occur in the explanation of the source goal that corresponds to the target goal, and, consequently, for the symbols in the source skills learned from that explanation.

Structural constraints are based on structural similarity between the source and target theories. GAMA uses structural constraints motivated by Gentner's (1983) structure-mapping theory and its implementation SME (Falkenhainer, Forbus, & Gentner, 1989). One major structural constraint is the assumption that there is a one-to-one relation between objects; for example, the system treats the correspondences hero ↔ ?x and exit ↔ ?x as inconsistent. On the other hand, the correspondences hero ↔ ?x and hero ↔ ?y are consistent, and if both are in the correspondence support set, the system unifies the variables ?x and ?y. GAMA also employs a second structural constraint, called parallel connectivity by Klenk and Forbus (2007), which assumes that correspondences between predicates imply correspondences between arguments. For example, the correspondences (location ?a ?b ?c) ↔ (place ?a ?b ?c) and (location explorer 1 2) ↔ (place hero ?x ?y) imply the correspondences explorer ↔ hero, 1 ↔ ?x, and 2 ↔ ?y.
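A small sketch of how parallel connectivity propagates correspondences, using the location/place example from the text:

```python
def argument_correspondences(source_lit, target_lit):
    """Parallel connectivity: a correspondence between two predicate
    instances implies pairwise correspondences between their arguments."""
    if len(source_lit) != len(target_lit):
        return None                    # arities differ; no full alignment
    return list(zip(source_lit[1:], target_lit[1:]))

# ("location", "explorer", 1, 2) <-> ("place", "hero", "?x", "?y") implies
# explorer <-> hero, 1 <-> ?x, and 2 <-> ?y:
pairs = argument_correspondences(("location", "explorer", 1, 2),
                                 ("place", "hero", "?x", "?y"))
print(pairs)   # [('explorer', 'hero'), (1, '?x'), (2, '?y')]
```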

GAMA is mainly guided by the pragmatic and structural constraints, but to order alternatives at choice points it also uses semantic constraints, which involve literal similarity in the meaning of source and target concepts. Before the algorithm starts its explanatory analysis, it calculates a similarity measure for each source and target predicate (and object) pair, then inserts these into a conceptual similarity matrix. At present, we use a naive similarity measure based on shared symbols between source and target domains. If source and target predicates have the same symbols, the correspondence is assigned the maximum similarity value. If they do not have the same symbol, their similarity is derived from the average similarity of the predicates that occur in their concept definitions. GAMA uses the values from the conceptual similarity matrix in the heuristic function that orders options at choice points.

The heuristic function h(x) (Table 4) returns a confidence value for a set of correspondences x. The value of a correspondence h(s ↔ t) is a measure of how well the source predicate s matches against the target predicate t, and it is determined from a mixture of structural and semantic considerations. If a correspondence s ↔ t is justified by previous elements in the correspondence support set S, GAMA assigns a high positive value to h(s ↔ t) based on structural consistency. For example, in Table 3, the alignment (location explorer 4 0) ↔ (place hero ?X2 ?Y) is justified by the previously recorded correspondence (location ?A ?B ?C) ↔ (place ?A ?B ?C) and therefore has a high confidence value. On the other hand, if s ↔ t cannot be justified by structural evidence, GAMA determines its confidence value using semantic considerations. In that case, our implementation assigns h(s ↔ t) a value proportional to the conceptual similarity metric5 between s and t. When h must determine the value of a set of correspondences A = {s1 ↔ t1, s2 ↔ t2, ..., sn ↔ tn}, as when A is an alignment of antecedents (Table 4, line 16), the heuristic function simply returns the average of the confidence values assigned to each correspondence in that set, such that

h({s1 ↔ t1, s2 ↔ t2, ..., sn ↔ tn}) = average_i(h(si ↔ ti)).

If the number of source and target antecedents is not the same in a correspondence expansion, GAMA can still find a correspondence by abstracting away some of the antecedents. In particular, if GAMA does not find a correspondence for a source explanation predicate s, it can choose to construct an abstraction mapping s ↔ true, in which case it will also skip aligning the descendants justifying s in the source explanation. If abstraction were allowed without additional bias, the system would generate too many alternative mappings and would not have sufficient information to prefer among them. For example, the trivial mapping "abstract all antecedents of top-level goals" is always valid, so GAMA needs criteria to prefer better mappings if they exist.

GAMA has a structural preference to prioritize mappings that have less abstraction and that explain more of the source solution with the target theory. Abstraction correspondences have a high negative value, so expansions that generate abstractions are considered only as a last resort. GAMA also prefers smaller abstractions over larger ones. If the system forms a source abstraction s ↔ true, which means that the source predicate does not have any target correspondence, the cost of abstraction is proportional to the size of the explanation structure not aligned due to the abstraction. The algorithm estimates the magnitude of that value as proportional to the number of all descendants of s in the explanation structure. On the other hand, if no correspondence can be found for a target predicate t, the system forms an abstraction true ↔ t. This abstraction has a constant cost, since abstraction of a theory clause does not leave any part of the source explanation uncovered.

5 In our current implementation, conceptual similarity is merely determined by the static similarity matrix generated during the initial phase and does not use information in the correspondence support set. Evaluating conceptual similarity given a correspondence support set may yield a more accurate heuristic function, but only at the expense of more computational cost.
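Pulling the last few paragraphs together, here is a sketch of what h might look like. The constants, the name-level structural check, and the exact cost scaling are assumptions on our part, not the published implementation.

```python
def make_h(similarity, support, descendants):
    """Build a confidence function h over correspondences. `similarity`
    is the precomputed conceptual similarity matrix as a dict keyed by
    (source_name, target_name); `support` holds predicate-level pairs
    already in S; `descendants[s]` counts explanation nodes below s."""
    STRUCTURAL = 1.0    # justified by an earlier correspondence in S
    TARGET_ABS = -1.0   # true <-> t: constant cost
    SOURCE_ABS = -0.5   # s <-> true: cost grows with what goes unaligned

    def h_one(s, t):
        if s is True:                      # abstracted target clause
            return TARGET_ABS
        if t is True:                      # abstracted source subtree
            return SOURCE_ABS * (1 + descendants.get(s, 0))
        if (s[0], t[0]) in support:        # structural justification
            return STRUCTURAL
        return similarity.get((s[0], t[0]), 0.0)   # semantic fallback

    def h(pairs):                          # average over an alignment
        pairs = list(pairs)
        return sum(h_one(s, t) for s, t in pairs) / max(len(pairs), 1)
    return h
```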

4.3. Translating skills and concepts

GAMA passes a hypothesized representation map to its representation translator component, which is responsible for translating the source skills and concepts into the skills and concepts of the target theory.

Translation of concepts is a simple recursive process. As the base case, given a source concept s for which GAMA has a predicate correspondence s ↔ t, all references to s are replaced with t. A concept that does not have a representation translation can still be imported to the target domain by translating its definition. In that case, we say that s is translated analogically, and a new concept definition analogical-s is created in the target domain by translating the definition of s into target symbols. Moreover, GAMA adds the correspondence s ↔ analogical-s to the representation map to be used in the translation of other concepts and skills.
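A compact sketch of this recursion, assuming concept definitions are stored as lists of literals keyed by predicate name:

```python
def translate_concept(s, rep_map, definitions, target_defs):
    """Translate source concept predicate `s` to the target vocabulary.
    Base case: a mapped predicate is replaced directly. Otherwise the
    definition of s is translated literal by literal, a new concept
    analogical-s is added to the target, and the map is extended so
    later translations can reuse it. Symbols that are neither mapped nor
    defined would be untranslatable; error handling is elided here."""
    if s in rep_map:
        return rep_map[s]
    new_name = "analogical-" + s
    rep_map[s] = new_name            # record before recursing
    target_defs[new_name] = [
        (translate_concept(lit[0], rep_map, definitions, target_defs),
         *lit[1:])
        for lit in definitions[s]]
    return new_name
```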

The ability to translate conceptual knowledge given a mapping between concept predicates is not very surprising. However, the ability to transfer procedural knowledge using a conceptual mapping is less obvious. One of the main representational commitments of ICARUS, associating all skills with the goals they achieve, facilitates the transfer of skills given mappings between concept predicates, because the skills are referenced with those conceptual predicates.

The representation translator imports skills from source to target by converting predicates in the goal, subgoals, and start conditions of these skills using hypothesized representation correspondences. The translator matches the left-hand side of a correspondence to each conceptual predicate that occurs in a source skill definition and replaces it with the right-hand side. Table 5 presents a representation map, a source skill, and the transferred target skill, including three different kinds of correspondences. While predicate and object mappings such as (holding ?x1) ↔ (holding ?x1) and doesnail ↔ doestie translate source skill symbols into target symbols, abstraction correspondences like (property ?Item3 hammer) ↔ true and (holding ?Item3) ↔ true mean that the predicates on the left-hand side do not have a target correspondence and must be dropped in start conditions or subgoals of the transferred skills.

Table 5. Importing source skills into the target domain.

As the representation translator converts skills using abstraction maps, it uses additional information the representation mapper provides. For example, the translator does not apply an abstraction correspondence like (holding ?Item3) ↔ true everywhere it matches on a source skill, since that would generate incorrect results, such as removing all holding subgoals in Table 5. Instead, the system keeps track of the point in the explanation structure at which the abstraction occurred and only abstracts the corresponding part of the source skill.
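A sketch of the skill-translation step, including the locality bookkeeping just described. The representation of skills and of recorded abstraction points is invented for illustration, and we assume the head itself is always translatable.

```python
def translate_skill(skill, rep_map, abstracted_at):
    """Port a source skill clause through the representation map. A
    symbol mapped to None is untranslatable, so its literal is dropped
    wherever it appears; literals listed in `abstracted_at` (the nodes
    where the mapper recorded an abstraction correspondence) are dropped
    only at that point, not everywhere the predicate matches."""
    def rewrite(field_name, literals):
        out = []
        for i, lit in enumerate(literals):
            if (field_name, i) in abstracted_at:
                continue                  # abstracted at this node only
            name = rep_map.get(lit[0], lit[0])
            if name is None:
                continue                  # untranslatable: generalize away
            out.append((name,) + tuple(rep_map.get(a, a) for a in lit[1:]))
        return out
    return {"head": rewrite("head", [skill["head"]])[0],
            "start": rewrite("start", skill["start"]),
            "subgoals": rewrite("subgoals", skill["subgoals"])}
```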

4.4. Transfer in partial analogy

When the source and target domains have a one-to-one correspondence, it is easy to see how GAMA can transfer knowledge. In this subsection, we describe how it can transfer between source-target domains that are quite different, as in the case of cross-domain scenarios. To that end, we highlight four features that equip our system to handle partial analogies.

First, GAMA's basic explanatory analysis naturally supports abstract correspondences between domains, because explanations refer only to the relevant parts of a source domain that play a role in a solution. All other features of a source domain are automatically abstracted during transfer.

We have also described a second mechanism GAMA uses to find abstract similarities. It can heuristically ignore portions of the source explanation that do not align against the target theory, or portions of the target theory that do not match against the explanation structure. As a result of this process, given a source predicate s that does not have a target correspondence, GAMA acquires a source abstraction map of the form s ↔ true, and applies it to remove portions of the transferred source skills that are not meaningful in the target domain.

A third method for transferring abstract similarities is related to untranslatable source symbols: source conceptual predicates and objects that cannot be translated into the target domain through correspondences or translations of their concept definitions. If the translator were to leave such symbols in the start conditions of a skill, the skill would never be applicable in the target domain. Therefore, the representation translator generalizes those skills by removing predicates in the start condition that cannot be translated. For example, the heavy-to-carry concept in Table 6 does not exist in the target game, since that game does not include a weight limit. As a result, GAMA generalizes the transferred skill during translation by removing that condition from the start field.

Table 6. Transfer of an mRogue source skill into Wargame. GAMA removes untranslatable start-condition concepts such as heavy-to-carry and imports skills with untranslatable goals such as pickedup by translating their definitions.

Finally, GAMA can sometimes transfer a skill even if it contains untranslatable symbols in its head and subgoals. Just as in the case of transferring concepts by translating their definitions, our system can transfer skills with untranslatable goals by translating the skill definitions of those goals. A source skill s is translatable to a target skill analogical-s if all of its subgoals are translatable. For example, the source skill carrying in Table 6 refers to a subgoal pickedup for which the translator could not find a conceptual translation. Nevertheless, this skill can still be translated since its subgoals can be translated. In this case, GAMA can create a new target skill analogical-pickedup6 because pickedup's only subgoal nextHeroLocation has a translation, namely, intendedSoldierLocation. This shows that the system can transfer skills even if there is a partial mapping that cannot translate all skill symbols. ICARUS' commitment to hierarchically composable skills plays a major role in this ability, because it lets the agent interleave target skills with analogically translated source skills.

6 Since an analogically translated skill analogical-s does not have a corresponding concept definition, the interpreter never tests to see if such goals are achieved. This limits reactivity but enables flexible transfer.

5. Experimental evaluation

We have claimed that goal-directed analogy supports far transfer. This section provides support for that claim in the form of two computational experiments. The first examines the magnitude of the transfer effect produced on a battery of far transfer tasks designed by an independent agency.7

We expect a substantial increase in problem-solving speed due to transferred knowledge. The second experiment involves a lesion study that determines the source of power behind the observed transfer, specifically whether it is due to reuse of generalized skills without representation mapping (the lesion case) or transfer of generalized skills with representation mapping (the non-lesion case). We hypothesize that the ability to map skills across problem representations will qualitatively improve transfer in the non-lesion case relative to the lesion case.

7 This system was one of three cognitive architectures tested in the DARPA Transfer Learning Program, via a formal evaluation process controlled by the Navy Center for Applied Research.

5.1. Experimental methodology

Details of the general game playing framework led us to conduct these experiments with two important changes to the ICARUS architecture. In particular, we replaced the bottom-up inference module with a top-down mechanism to handle the number of entities perceived in the environment. Because GGP provides fully modeled, deterministic domains, we also replaced the system's reactive execution module with one that mentally simulated execution with N-step look-ahead, using technology adapted from Nau et al.'s (2003) SHOP2 planner. We employed this execution module to generate solution traces for input to the learning algorithm described earlier and to evaluate learned skills in the target domain.

The system begins by translating the GGP game specification into concepts and primitive skills, then invoking automatically generated exploration skills to search for a solution to the source problem. Upon finding one, the system acquires new hierarchical skills (as discussed in Section 3.3) that both generalize the solution and become the object of transfer to the target problem. After porting these concepts and skills via representation mapping, the performance system employs them in the target domain, falling back on exploration behavior when guidance from source knowledge is exhausted.


In more detail, our experiments measure transfer as the difference in agent performance (solution speed) on a given target problem with and without exposure to the corresponding source (normalized to facilitate comparison). The non-transfer (N) agent sees the target problem alone and must solve it using a base set of exploration skills. In contrast, the transfer (T) agent has the opportunity to solve the source problem, acquire knowledge from that experience, and make it available for transfer. The T protocol involves several steps that are consistent with the general game playing format:

1. download the source game definition and initial state;
2. solve the source task via exploration;
3. learn hierarchical skills and concepts from the solution;
4. download the target game definition;
5. select a representation map;
6. instantiate the mapped skills and concepts in the target;
7. if learned skills fail to improve performance on the source task, then set mapped skills = {};
8. download initial state data for the target problem; and
9. solve the target problem using the mapped knowledge.

Since this experiment concerns the potential benefits of transferred knowledge, not the cost of acquiring that knowledge, the protocol only reflects the time required for the last two steps in the transfer agent's performance score (although steps 4–6 were short by comparison). The transfer and non-transfer agents both had access to the same exploration skills and supporting background knowledge to solve the target task. These act as a fallback for the transfer agent if mapped knowledge does not suffice. Step 7 is intended to avoid negative transfer. It ensures that the system will only attempt to transfer high-quality learned knowledge from the source to the target task and will rely on exploration if that test fails.
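For reference, the normalized score plotted in Figs. 3 and 4 is (N - T)/(N + T), where N and T are the non-transfer and transfer solution times. A quick worked example:

```python
def transfer_score(n_time, t_time):
    """Normalized transfer: 0 means no effect, positive values mean the
    transfer agent was faster, and 1.0 is the limiting best case."""
    return (n_time - t_time) / (n_time + t_time)

print(transfer_score(100.0, 50.0))    # 0.333...: twice as fast
print(transfer_score(100.0, 100.0))   # 0.0: no transfer
# A score of 0.93, as in CrossDomain-4, implies N/T = 1.93/0.07, i.e.,
# the agent solved the target more than ten times faster with transfer.
```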

5.2. Experimental results

Fig. 3 presents the results of the experiment to measure far transfer. It examines the magnitude of the effect obtained in 17 scenarios, where the first five are abstraction tasks (A), the next five are reformulation examples (R), and the last six are cross-domain transfer tasks. Each data point averages across ten trials of the given target problem for both the transfer agent and the non-transfer agent. Values above zero indicate positive transfer (the transfer-case solution is faster than the non-transfer case), values near zero indicate no transfer, and values below zero indicate negative transfer.

Fig. 3. Far transfer in 17 scenarios measured as the normalized difference of the non-transfer and transfer cases.

The results show that the system produces positive transfer in 10 of 17 cases. These include several instances of very positive transfer. For example, the 0.93 score in CrossDomain-4 indicates that the agent solved the problem more than ten times faster with transferred knowledge (and five to six times faster in Escape A-1 and Wargame R-2). The system obtains positive transfer from examples in each of the abstraction, reformulation, and cross-domain classes. The cross-domain case is somewhat surprising, since these scenarios were not in any way designed to afford positive transfer.

The four instances of near-zero transfer in Escape A-3, Wargame A-1, Escape R-1, and CrossDomain-5 correspond to cases where the source-source filter failed (step 7 above). These scenarios should produce near-zero transfer, given that both the transfer and non-transfer agents employ exploration to solve the target task. Escape R-2 also falls into this class, suggesting its moderate positive transfer is due to random variation.

The instances of negative transfer in Fig. 3 are also easy to explain. The result for Wargame R-1 is due to an incorrect representation map (found in eight of ten transfer trials) applied to correct skills, while the effect in CrossDomain-1 is due to a partial map that creates actionable but misleading skills. The effect in CrossDomain-2 is due to random fluctuation, as that scenario also failed the source-source filter.

In summary, we can explain the instances of zero transfer as the (automatically diagnosed) failure of the learning system to acquire useful skills, and the one case of significant negative transfer as the product of an incorrect mapping. Interestingly, the instances of positive transfer are more enigmatic. The benefit could be due to either of the two transfer mechanisms: direct application of learned skills or transfer of learned skills via representation mapping. The second experiment disambiguates these effects.

Fig. 4. A lesion study showing the impact of representation mapping on transfer. [Figure: bar chart comparing the lesion and non-lesion conditions across the 11 scenarios that passed the source–source filter; the y-axis is the difference in normalized solution time (N−T)/(N+T), ranging from −0.40 to 1.00.]

The lesion study examined the source of transfer in the 11 scenarios (out of 17) where the system acquired reliable knowledge from solving the source problem that it could then employ as a basis for transfer (i.e., the scenarios that passed step 7). Fig. 4 shows the results of this study. Here, the lesion case employs generalized skills without representation mapping, while the non-lesion condition duplicates a subset of the cases shown with representation mapping in Fig. 3. As before, each data point averages across ten trials of the given target problem for both the transfer and non-transfer agents.

The results show that the system without representation mapping (the lesion condition) produces essentially zero transfer in 9 of 11 cases. This effect is easy to understand. Lacking a representation map, the system has no mechanism for relating source skills to target needs, so the transfer and non-transfer agents both rely on exploratory behavior, which produces no net transfer. The main exception is Wargame A-2, where the terms that differed between source and target were bound to variables in the skills. This made the source skills directly applicable for the transfer agent, leading to positive transfer.

More broadly, we can draw two conclusions from the lesion study. First, representation mapping generates virtually all of the positive transfer observed in the 11 scenarios. In particular, it provides the system with the capacity to exploit learned knowledge across the representational divide characteristic of far transfer tasks. Second, this effect appears robust across problem classes, as it explains all but one case of positive transfer.

Finally, we note that the scenarios were designed to afford positive transfer. This is least true in the cross-domain cases, where the source–target relationship is unconstrained. However, the abstraction and reformulation scenarios obey what might be called a "constant content assumption": given the correct mapping among symbols, solutions for the source can solve the target task. This is evident in the reformulation case, while the transformations defining the abstraction class admit no new information in the target, implying that source solutions are preserved. Experiments with even more disparate source and target tasks would clarify the limits of the GAMA representation mapping algorithm.

6. Related research

The research reported in this article integrates analogical transfer based on representation mapping into a larger architectural framework that learns to solve problems more effectively. There are large bodies of related work on analogy, learning, and problem-solving. In this section, we describe the work that we believe to be most relevant, concentrating primarily on analogy, since it plays a key role in our approach to transfer.

There are many surveys of analogy in cognitive science. Hall (1989) gave a comparative analysis of early computational approaches to analogy, while Gentner and Holyoak (1997) discussed analogy in the context of reasoning and learning. Two brief, relatively recent encyclopedia articles on analogy have emphasized the perspectives of cognitive science and psychology (Gentner, 1999, 2003). French (2002) and Kokinov and French (2003) provide recent summaries of computational models of analogy, while Holyoak (2005) gives a broad and detailed overview of research on analogy in psychology, including computational models. One conclusion drawn in the most recent of these articles is that "models of analogy have not been well integrated with models of problem-solving..." and that "integration of analogy models with models of general problem-solving remains an important research goal". Our work integrates analogy with problem-solving and thus makes significant progress on this goal.

In the discussion that follows, we focus on a subset of the research described in these surveys that is particularly relevant to our work. We also discuss some related work not mentioned in the reviews.

Gentner (1983) presented an early analysis of analogy in which she proposed key hypotheses, including the one-to-one constraint and the systematicity principle. The first states that, in most human-formed analogies, an object in one domain will not map to two objects in the other. The systematicity principle states that, in general, analogies map relations that preserve higher-order structure and relations such as causes and implies. Psychological studies provided experimental support for these hypotheses (Clement & Gentner, 1988; Krawczyk, Holyoak, & Hummel, 2005). Our approach not only incorporates both the one-to-one constraint and the systematicity principle, but is also guided by the purpose of the analogy (pragmatic constraints).

Falkenhainer et al. (1989) used Gentner's analogy theory as the basis for an early computational model, the structure-mapping engine (SME). In later work, the PHINEAS system (Falkenhainer, 1990) used analogical mapping as part of a larger theory-formation process that involved explanation and that included retrieval, transfer, and qualitative simulation. While his system uses qualitative models of dynamic systems as background knowledge and applies them in a qualitative simulation setting, our system uses action models and applies them in a problem-solving setting. Unlike our system, which learns how to achieve goals, Falkenhainer's system uses analogy to improve imperfect target domain theories.

Klenk and Forbus (2007) describe another system built upon SME that learns new general domain knowledge in the domain of rotational kinematics by analogy with translational kinematics. This system finds analogies between worked source and target solutions to physics problems, which involve mental activities such as the simplification of formulas. In contrast, the solutions our system uses are sequences of actions with physical effects, and instead of solutions in the target domain, we use the target domain theory to form analogies. Moreover, while our system learns how an agent should behave in the target, Klenk and Forbus's system learns how the target domain functions.

Burstein (1988) described CARL, a system that constructed analogies mapping actions, preconditions, and results. The program also constructed maps between relations and objects; however, unlike our algorithm, which automatically maps whole source solutions and explanations against a target theory, CARL incrementally constructs and refines maps by interacting with a teacher that specifies partial maps, demonstrates actions one by one, and gives feedback on predictions about the expected effects.

Holyoak and Thagard (1989) advocated the use of pragmatic constraints involving goals. They implemented an analogy engine called ACME and applied it to a collection of examples, including the atom–solar system analogy. However, their algorithm assumed that pragmatic values for predicate matching were given, leaving open the question of how they arise. In contrast, our system's goal-driven explanatory analysis is implicitly based on the pragmaticity principle.

Several researchers have used goals in analogy. For instance, one of the earliest analogy systems (Winston, 1982) used causal networks to infer the motives of an agent based on a similar situation with another agent. However, this work used small examples that involved bits of plots or stories; it did not deal with complex plans, and it did not involve cross-domain transfer, as does our approach. Purpose-directed analogy (Kedar-Cabelli, 1985, 1988) used goals, causal relations, and explanations to show how an object satisfies a given purpose. For example, a styrofoam cup lets one drink hot liquids just as a ceramic mug with a handle does. Her system introduced the use of goal regression to guide analogical reasoning. Veloso and Carbonell's (1993) PRODIGY/ANALOGY used a goal-directed form of analogy to improve the performance of a means-ends planner. They conducted experiments in a logistics domain that involved moving objects using trucks and planes. Neither work incorporates representation mapping between relations nor involves cross-domain transfer.

Our approach also bears similarities to work on analogical problem-solving. VanLehn and Jones' (1993) CASCADE used a form of analogical search control to guide top-down chaining through an AND–OR tree. Their system stored a semi-persistent mapping that it reused later in a given problem, but it did not map across domains or representations. Jones and Langley's (2005) EUREKA also used analogical search control, in this case to direct a means-ends problem solver that spread activation through stored subproblem decompositions. This system exhibited a limited ability for cross-domain transfer, but only when provided with connections between predicates.

In summary, our approach is closely related to, and builds upon, earlier research that involved structural analogy and pragmatic constraints. It has been guided by psychological results on structural analogy, but it incorporates pragmatic information about goals and explanations. Moreover, it is distinguished by its use of analogical representation mapping to achieve cross-domain transfer.

7. Directions for future research

Our future work will advance our ability to reuse previously acquired knowledge in robust ways. We plan to pursue this goal by improving the representation mapping and translation mechanisms, and by completing their integration into the ICARUS cognitive architecture.

We intend to improve the representation mapping technology by supporting more complex mappings between source and target domains. For example, while our current work allows predicate abstraction in cases where there is no exact match, we can also consider the elimination and reordering of predicate arguments, as illustrated in the sketch below. This change will increase the number of possible representation maps and require additional attention to search control.
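As an illustration, the following Python fragment shows one hypothetical encoding of such extended maps; the predicates and the encoding are invented for exposition and do not reflect GAMA's internal representation.

    # Hypothetical extended representation map: each entry pairs a source
    # predicate (name, arity) with a target predicate plus an argument
    # permutation, where None marks an eliminated argument.
    extended_map = {
        ("holds", 2): ("carried-by", (1, 0)),     # swap agent/item order
        ("at", 3):    ("located", (0, 1, None)),  # drop the third argument
    }

    def translate_literal(literal, rep_map):
        """Rewrite (pred, args...) under the map; identity if unmapped."""
        pred, args = literal[0], literal[1:]
        target = rep_map.get((pred, len(args)))
        if target is None:
            return literal
        tpred, perm = target
        return (tpred, *[args[i] for i in perm if i is not None])

    print(translate_literal(("holds", "agent1", "key3"), extended_map))
    # -> ('carried-by', 'key3', 'agent1')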

We can also improve the quality of representation maps by employing feedback from the application of transferred skills. For example, we can test mapped skills and reduce the heuristic score for any map that produces a failed skill in the target domain; this process can be interleaved with map generation. We can also remodel the map evaluation heuristic to reflect an expected utility of transferred knowledge, again updated from experience. This change would focus GAMA on the most useful mappings; one simple version of such a feedback rule is sketched after this paragraph. In addition, we can construct mechanisms for cumulative transfer that incrementally augment the representation map as the agent accrues experience in the target domain over time. We can expand the impact of representation mapping by employing learning in ICARUS to improve transferred knowledge in the target domain. We are pursuing this avenue now by tuning and patching somewhat incorrect skills via theory refinement.
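The paper does not commit to a specific update rule; as one hypothetical realization, a map's heuristic score could be nudged toward the observed outcomes of its translated skills:

    # Hypothetical feedback rule (not from the paper): move a map's
    # heuristic score toward 1 when a translated skill succeeds in the
    # target and toward 0 when it fails.
    def update_map_score(score: float, skill_succeeded: bool,
                         rate: float = 0.2) -> float:
        outcome = 1.0 if skill_succeeded else 0.0
        return (1.0 - rate) * score + rate * outcome

Interleaved with map generation, such a rule would let failed skills progressively demote the maps that produced them.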

Our larger goal is to integrate representation mapping into ICARUS at the architectural level. This would require us to address the retrieval problem and to fold representation mapping into the architecture's cognitive processing, such that it is invoked under appropriate conditions. These are non-trivial extensions, but they would produce a version of ICARUS that seeks opportunities to retrieve, transfer, and apply knowledge from prior experience to current tasks, in analogy to human behavior.

8. Concluding remarks

In this paper, we discussed an approach to transfer that involves analogical mapping of symbols across different domains. We combined this method with representational assumptions from ICARUS, a theory of the human cognitive architecture. Our transfer algorithm, GAMA, conducts representation mapping to generate correspondences between source and target concepts by matching the explanation of a source solution against the target theory. Next, it conducts representation translation by using those correspondences to import source skills and concepts into the target domain. GAMA is a pragmatic system guided by the purpose of the analogy, since its search is directed by the target goal that the system aims to achieve using transferred skills. It further constrains search by using structural and semantic constraints. We showed how GAMA finds abstract similarities even when the source and target domains are only partially analogous. Its representation mapper contributes to abstraction by omitting parts of an explanation that cannot be aligned with the target theory, whereas its representation translator handles unmapped source symbols by translating their concept and skill definitions or by omitting them when generalization is appropriate.

We claimed that the same features which distinguish ICARUS from other architectures facilitate representational transfer. Most notably, indexing skills by their goals supports transfer by enabling the use of conceptual correspondences in translating skills. Moreover, the hierarchically composable skill representation allows transfer even in the case of partial mappings, letting the agent interleave analogically translated skills with mapped skills. We presented experimental evidence that our new representational transfer mechanism improves on more basic learning processes. We also demonstrated far transfer among reformulation tasks, abstraction tasks, and less constrained cross-domain scenarios. In this research, we have taken steps towards the adaptation of past experience to current goals, which is important for building cognitive architectures that learn continuously and cumulatively.

Acknowledgements

We thank Ugur Kuter, Nate Waisbrot, Tom Fawcett, and Chunki Park for their contributions to this work. This paper reports research sponsored by DARPA under agreement FA8750-05-2-0283. The US Government may reproduce and distribute reprints for Governmental purposes notwithstanding any copyrights. The authors' views and conclusions should not be interpreted as representing official policies or endorsements, expressed or implied, of DARPA or the Government.

References

Burstein, M. (1988). Incremental learning from multiple analogies. In A. Prieditis (Ed.), Analogica (pp. 37–62). London: Pitman.

Choi, D., Konik, T., Nejati, N., Park, C., & Langley, P. (2007). Structural transfer of cognitive skills. In Proceedings of the eighth international conference on cognitive modeling (pp. 115–120). Oxford, UK: Taylor & Francis.

Clement, C. A., & Gentner, D. (1988). Systematicity as a selection constraint in analogical mapping. In Proceedings of the 10th annual conference of the Cognitive Science Society (pp. 412–418). Montreal: Lawrence Erlbaum.

Falkenhainer, B. (1990). A unified approach to explanation and theory formation. In J. Shrager & P. Langley (Eds.), Computational models of scientific discovery and theory formation (pp. 157–196). San Francisco, CA: Morgan Kaufmann.

Falkenhainer, B., Forbus, K. D., & Gentner, D. (1989). The structure-mapping engine: Algorithm and examples. Artificial Intelligence, 41, 1–63.

French, R. M. (2002). The computational modeling of analogy-making. Trends in Cognitive Sciences, 6, 200–205.

Genesereth, M. R., Love, N., & Pell, B. (2005). General game playing: Overview of the AAAI competition. AI Magazine, 26, 62–72.

Gentner, D. (1983). Structure-mapping: A theoretical framework for analogy. Cognitive Science, 7, 155–170.

Gentner, D. (1999). Analogy. In R. A. Wilson & F. C. Keil (Eds.), The MIT encyclopedia of the cognitive sciences (pp. 17–20). Cambridge, MA: MIT Press.

Gentner, D. (2003). Analogical reasoning, psychology of. In Encyclopedia of cognitive science. London: Nature Publishing Group.

Gentner, D., & Holyoak, K. (1997). Reasoning and learning by analogy. American Psychologist, 52, 32–34.

Hall, R. (1989). Computational approaches to analogical reasoning: A comparative analysis. Artificial Intelligence, 39, 39–120.

Holyoak, K. J. (2005). Analogy. In K. J. Holyoak & R. G. Morrison (Eds.), The Cambridge handbook of thinking and reasoning (pp. 117–142). Cambridge, UK: Cambridge University Press.

Holyoak, K. J., & Thagard, P. (1989). Analogical mapping by constraint satisfaction. Cognitive Science, 13, 295–355.

Jones, R. M., & Langley, P. (2005). A constrained architecture for learning and problem solving. Computational Intelligence, 21, 480–502.

Kedar-Cabelli, S. (1985). Purpose-directed analogy. In Proceedings of the seventh annual conference of the Cognitive Science Society. Hillsdale, NJ: Lawrence Erlbaum.

Kedar-Cabelli, S. (1988). Toward a computational model of purpose-directed analogy. In A. Prieditis (Ed.), Analogica (pp. 89–107). London: Pitman.

Kieras, D. E., & Bovair, S. (1986). The acquisition of procedures from text: A production-system analysis of transfer of training. Journal of Memory and Language, 25, 507–524.

Klenk, M., & Forbus, K. D. (2007). Learning domain theories via analogical transfer. In Proceedings of the 21st international workshop on qualitative reasoning, Aberystwyth, UK.

Kokinov, B., & French, R. M. (2003). Computational models of analogy-making. In Encyclopedia of cognitive science (pp. 113–118). London: Nature Publishing Group.

Krawczyk, D. C., Holyoak, K. J., & Hummel, J. E. (2005). The one-to-one constraint in analogical mapping and inference. Cognitive Science, 29, 29–38.

Langley, P., & Choi, D. (2006). Learning recursive control programs from problem solving. Journal of Machine Learning Research, 7, 493–518.

Mooney, R. J. (1990). A general explanation-based learning mechanism and its application to narrative understanding. San Francisco, CA: Morgan Kaufmann.

Nau, D., Au, T.-C., Ilghami, O., Kuter, U., Murdock, J. W., Wu, D., et al. (2003). SHOP2: An HTN planning system. Journal of Artificial Intelligence Research, 20, 379–404.

Nau, D., Cao, Y., Lotem, A., & Muñoz-Avila, H. (1999). SHOP: Simple hierarchical ordered planner. In Proceedings of the 16th international joint conference on artificial intelligence (pp. 968–973). San Francisco, CA: Morgan Kaufmann.

Neches, R., Langley, P., & Klahr, D. (1987). Learning, development, and production systems. In D. Klahr, P. Langley, & R. Neches (Eds.), Production-system models of learning and development. Cambridge, MA: MIT Press.

Nejati, N., Langley, P., & Konik, T. (2006). Learning hierarchical task networks by observation. In Proceedings of the 23rd international conference on machine learning (pp. 665–672). New York: ACM Press.

Newell, A. (1990). Unified theories of cognition. Cambridge, MA: Harvard University Press.

Segre, A. (1987). A learning apprentice system for mechanical assembly. In Proceedings of the third IEEE conference on AI for applications (pp. 112–117).

Singley, M. K., & Anderson, J. R. (1988). A keystroke analysis of learning and transfer in text editing. Human–Computer Interaction, 3, 223–274.

Thagard, P., Holyoak, K. J., Nelson, G., & Gochfeld, D. (1990). Analogical retrieval by constraint satisfaction. Artificial Intelligence, 46, 259–310.

VanLehn, K., & Jones, R. M. (1993). Integration of analogical search control and explanation-based learning of correctness. In S. Minton (Ed.), Machine learning methods for planning (pp. 231–273). San Francisco, CA: Morgan Kaufmann.

Veloso, M. M., & Carbonell, J. (1993). Derivational analogy in PRODIGY: Automating case acquisition, storage, and utilization. Machine Learning, 10, 249–278.

Winston, P. H. (1982). Learning new principles from precedents and exercises. Artificial Intelligence, 19, 321–350.