Paul Groth @ AAAI Symposium Mar 2009 : Agents that Learn from Human Teachers SCAFFOLDING INSTRUCTIONS TO LEARN PROCEDURES FROM USERS Paul Groth and Yolanda

Paul Groth @ AAAI Symposium Mar 2009 : Agents that Learn from Human Teachers

SCAFFOLDING INSTRUCTIONS TO LEARN PROCEDURES FROM USERS

Paul Groth and Yolanda Gil
Information Sciences Institute
University of Southern California


Learning Procedures Naturally

• Humans learn procedures using a variety of mechanisms
  – Observation, practice, reading textbooks
• Human tutorial instruction
  – Broad descriptions of actions and explanations of their dependencies
• The computer is told what to do by the instructor.
• Goal: learn procedures from instruction that is natural to provide


What is Instruction by Telling?

• General statements
  – Do not refer to a specific state
• Descriptive statements
  – About types, functions, processes

“To dial a number, lift the receiver and punch the number”
“Place a pot with water on the stove. Turn the stove on and wait until the water bubbles”
“A good hovering area is behind a tall building that is more than 200 ft away”
“A vehicle is parked if it is stopped on the side of the road or if it is stopped in a parking lot for more than 3 minutes”
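The last statement above is a general definition rather than a description of one situation, so it maps naturally onto a predicate. The sketch below is only an illustration: the `VehicleState` fields and the location labels are hypothetical, and it assumes that "for more than 3 minutes" qualifies only the parking-lot case, which the sentence itself leaves ambiguous.

```python
from dataclasses import dataclass


@dataclass
class VehicleState:
    # Hypothetical fields; the slide does not define a vehicle representation.
    stopped: bool
    location: str           # assumed labels: "roadside", "parking_lot", "road"
    minutes_stopped: float


def is_parked(v: VehicleState) -> bool:
    """Parked = stopped at the roadside, or stopped in a lot for over 3 minutes."""
    if not v.stopped:
        return False
    if v.location == "roadside":
        return True
    # Assumption: the 3-minute threshold applies to the parking-lot case only.
    return v.location == "parking_lot" and v.minutes_stopped > 3
```

Reading the time limit as covering both cases would give a different predicate, which is the kind of ambiguity a learner of told instructions has to resolve.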


Why is Instruction by Telling Important?

• [Scaffidi et al 06]: by 2012, 90M end user programmers in the US alone
  – 13M would describe themselves as programmers
  – 55M will use spreadsheets and databases
• [Adams 08]: we have gone from dozens of markets of millions of users to millions of markets of dozens of users
  – The “long tail of programming” [Anderson 08]
• Today most successful end user applications focus on data manipulation through spreadsheets and web forms
• We need approaches to allow end users to specify procedures to process data or to control a physical environment
  – With examples
  – By telling: a natural method for humans, needed if procedures are complex and hard to generalize from examples


Shortcomings of Human Instruction [Gil 09]

• Organization
• Omissions
• Structure
• Errors
• Student’s preparation
• Student’s ability
• Teacher’s skills

“[…] Procedures are complex relational structures and the mapping between these structures and a linear sequence of propositions expressed in discourse is not easy to define.” – [Donin et al 02]


TellMe Learning Procedures by Being Told

• Developed four-stage process:
  1. Ingestion: create initial procedure stub from given instruction
  2. Elaboration: map terms to existing knowledge, infer missing information using heuristics, create hypotheses of procedures
  3. Elimination: rule out hypotheses through symbolic execution
  4. Selection: select one procedure hypothesis using heuristics that maximize consistency
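The four stages can be sketched end to end on a toy version of a solitaire setup loop whose exit condition was never stated. This is not TellMe's implementation: the `Hypothesis` class, the three candidate exit conditions, the 52-card deck, and the use of a concrete simulation in place of symbolic execution are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Hypothesis:
    """One candidate completion of the procedure stub (hypothetical structure)."""
    until_label: str
    until: Callable[[int], bool]   # loop exit test on numOfCards
    size: int                      # crude complexity measure used by selection


def ingest(instruction: List[str]) -> dict:
    """Stage 1: build a stub; the loop's exit condition is an annotated gap."""
    return {"steps": instruction, "until": None}


def elaborate(stub: dict) -> List[Hypothesis]:
    """Stage 2: fill the gap with candidate exit conditions (made up here)."""
    assert stub["until"] is None
    return [
        Hypothesis("numOfCards == 0", lambda n: n == 0, size=1),
        Hypothesis("numOfCards == 1", lambda n: n == 1, size=1),
        Hypothesis("numOfCards == 1 or numOfCards == 0", lambda n: n in (0, 1), size=2),
    ]


def run(h: Hypothesis, deck: int = 52, n: int = 7, max_iters: int = 20) -> Optional[List[int]]:
    """Simulate the procedure; return the deal sizes, or None if execution fails."""
    trace = [n]                    # initial Deal of 7 cards, then Layout
    deck -= n
    for _ in range(max_iters):
        n -= 1                     # Decrement
        if n <= 0 or n > deck:
            return None            # cannot deal zero or unavailable cards
        trace.append(n)            # Deal, then Layout
        deck -= n
        if h.until(n):
            return trace
    return None                    # loop never terminated


def eliminate(hyps: List[Hypothesis]) -> List[Hypothesis]:
    """Stage 3: drop hypotheses whose execution fails."""
    return [h for h in hyps if run(h) is not None]


def select(hyps: List[Hypothesis]) -> Hypothesis:
    """Stage 4: prefer the simplest surviving hypothesis."""
    return min(hyps, key=lambda h: h.size)


def learn(instruction: List[str]) -> Hypothesis:
    return select(eliminate(elaborate(ingest(instruction))))
```

In this toy, the candidate that waits for `numOfCards == 0` is eliminated because it would require dealing zero cards, and selection then prefers the smaller of the two survivors.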


An Example

2. start SetupKlondikeSolitaire hasId 8887
3. resultIs name=solitaireGameSetup isa type=GameSetup
4. initSetup name=deck isa type=CardDeck
5. doThis name=Deal basedOn deck, 7 expect name=hand
6. name=hand isa type=Hand
7. doThis name=Layout basedOn hand
8. name=numOfCards isa type=Integer
9. name=numOfCards value=7
10. repeat doThis name=Decrement basedOn numOfCards expect numOfCards

Create initial interpretations based on prior knowledge, annotate gaps

<j.0:Procedure rdf:about="#SetupKlondikeSolitaire">
  <j.0:hasSteps rdf:parseType="Collection">
    <j.0:Procedure rdf:about="#Deal"/>
    <j.0:Procedure rdf:about="#Layout"/>
    <j.0:Loop>
      <j.0:until rdf:resource="#Unknown"/>
      <j.0:repeat rdf:parseType="Collection">
        <j.0:Procedure rdf:about="#Decrement"/>
        <j.0:Procedure rdf:about="#Deal"/>
        <j.0:Procedure rdf:about="#Layout"/>
      </j.0:repeat>
    </j.0:Loop>
  </j.0:hasSteps>
  <j.0:hasInput rdf:resource="#deck"/>
  <j.0:hasResult rdf:resource="#solitaireGameSetup"/>
</j.0:Procedure>

Elaborate using heuristics for filling gaps


Repeat until when?

Instruction is ambiguous & incomplete

Which player? What is “closest”?

Closest could have meant closest sideline or it could have meant closest player

Repeat until left sideline is reached or right sideline or front line

Repeat until teammate or opponent has ball

Eliminate through symbolic execution and reasoning

Select based on heuristics that maximize consistency



Example Heuristics

• Ingestion
  – If a variable is assigned a constant in the instruction, then find a consistent basic type for it.
• Elaboration
  – If the input of a component (i.e. subtask) is type compatible with the result of a preceding component, then that result could be connected to the input.
  – If two variables share any typing information they could be unified.
• Elimination
  – Hypotheses with matching symbolic execution traces can be considered to be the same.
• Selection
  – Pick the simplest hypothesis (with the fewest components).
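Three of these heuristics are simple enough to sketch directly. The code below is a hedged illustration, not TellMe's code: the `Component` record is hypothetical, "type compatible" is reduced to exact equality of type names, and traces are assumed to be plain tuples of step names.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Component:
    # Hypothetical record for a subtask with at most one input and one result.
    name: str
    input_type: Optional[str]
    result_type: Optional[str]


def candidate_links(steps: List[Component]) -> List[Tuple[str, str]]:
    """Elaboration: a result may feed any later input of the same type."""
    links = []
    for i, consumer in enumerate(steps):
        if consumer.input_type is None:
            continue
        for producer in steps[:i]:
            if producer.result_type == consumer.input_type:
                links.append((producer.name, consumer.name))
    return links


def dedupe_by_trace(traces: Dict[str, Tuple[str, ...]]) -> List[str]:
    """Elimination: keep one hypothesis per distinct execution trace."""
    first_with_trace: Dict[Tuple[str, ...], str] = {}
    for name, trace in traces.items():
        first_with_trace.setdefault(trace, name)
    return list(first_with_trace.values())


def simplest(component_counts: Dict[str, int]) -> str:
    """Selection: the hypothesis with the fewest components wins."""
    return min(component_counts, key=component_counts.get)


steps = [
    Component("GetCurrentPosition", None, "Position"),
    Component("FindClosestOpponent", "Position", "Position"),
    Component("FaceAwayFrom", "Position", None),
]
links = candidate_links(steps)
```

Here `FaceAwayFrom` receives two candidate links, one from each earlier step that produces a `Position`, so type compatibility alone yields several hypotheses that the later stages have to narrow down.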


Instructions that TellMe Can Process

1: begin lesson
2: start GetOpen hasId 8888
3: repeat
   // Find your closest opponent.
4: doThis name=GetCurrentPosition expect originalPosition
5: name=originalPosition isa type=Position
6: doThis name=FindClosestOpponent basedOn=originalPosition expect=opponentLocation
   // Dash away from them.
7: name=opponentLocation isa type=Position
8: doThis name=FaceAwayFrom basedOn opponentLocation
9: doThis name=Dash expect=currentPosition
10: name=currentPosition isa type=Position
   // Stop and move back to your previous position (e.g. cut back).
11: doThis name=MoveTowards
12: basedOn originalPosition expect=currentPosition
   // If you are not open, do this again.
13: until
14: name=Open basedOn currentPosition
   // Once you're open, find the ball and face it.
15: doThis name=FindTheBall expect=ballLocation
16: name=ballLocation isa type=Position
17: doThis name=Face basedOn ballLocation
18: end
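The `doThis` lines above mix two spellings for the same field (`basedOn=originalPosition` and `basedOn opponentLocation`). A minimal sketch of a tolerant reader for such lines follows; the function name and the returned dictionary shape are hypothetical, and this is not a description of TellMe's actual parser. It handles only single-line `doThis` steps with one value per field.

```python
import re
from typing import Dict, Optional

# Field keywords seen on the doThis lines of this lesson.
KEYS = ("name", "basedOn", "expect")


def parse_do_this(line: str) -> Optional[Dict[str, str]]:
    """Parse one 'doThis' line, accepting both 'key=value' and 'key value'."""
    line = re.sub(r"^\s*\d+:\s*", "", line)      # drop a leading 'N:' label
    tokens = line.replace("=", " ").split()      # treat '=' like whitespace
    if not tokens or tokens[0] != "doThis":
        return None                              # not a doThis step
    fields: Dict[str, str] = {}
    i = 1
    while i + 1 < len(tokens):
        if tokens[i] in KEYS:
            fields[tokens[i]] = tokens[i + 1]    # keyword followed by its value
            i += 2
        else:
            i += 1                               # skip anything unrecognized
    return fields
```

Type declarations such as `name=originalPosition isa type=Position` return `None` here and would need their own rule, as would steps that continue onto a second line.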




Example of Procedure Learned by TellMe


Applying the Framework

• “A Scientific Workflow Construction Command Line” [Groth & Gil IUI-09]
• Real natural language descriptions of procedures:
  – Protocols in GenePattern [Reich et al 08]
  – Workflows in MyExperiment [DeRoure et al 09]
• Example used: “This workflow performs data cleansing on genes, clusters the results, and then displays a heatmap.”


Conclusion

• TellMe provides a framework for addressing learning from instruction given by humans

• The approach can be applied to different domains

• Future work includes:
  – More and better heuristics
  – Dealing with more sophisticated language constructs
  – Towards natural language input
