Bayesian Network backbone of clippy


Bayesian Network backbone of

Microsoft Office Assistant – Clippy:

A CASE STUDY

Naveen Bharathi Pitchandi

North Carolina State University

Abstract

User interface agents are increasingly employed to enhance software products. Websites (e.g. buy.com, extempo.com, ananova.com, mysimon.com) now use characters to guide users through processes or present information. The Microsoft Office Assistant, Clippy, is a well-known animated character that was part of the graphical user interface of the Microsoft Office suite. Clippy was first included in the 1997 release of the Office suite and remained part of the product line until 2007, when it was permanently removed because of widespread hatred towards the animated character. Clippy was one of several cartoon characters available as the front end through which the engine interacted with the user.

The Office Assistant was an Intelligent User Interface for Microsoft Office that assisted

users by way of an interactive animated character, which interfaced with the Office help content.

The back end of the Office Assistant was based on Bayesian Networks, to provide the

corresponding assistance to the user by inferring from user background, user actions, and

program state. The Bayesian method of analysis computes the likelihood of alternative concepts

based on the user’s inputs/actions and models the user’s intention to provide assistance.

Within a few years of its release, the Office Assistant was widely hated, deemed “annoying”, and named one of the Top 50 worst inventions. The purpose of this paper is to investigate how the Assistant worked with the help of a sample network, to examine the reasons behind Clippy’s massive failure, and to propose a solution: including a “historical acknowledgement” of elicited assistance as a node in the network, so that the updated probabilities better capture the user’s expertise level and need for assistance.


1. ORIGIN OF CLIPPY

The Microsoft Office Assistant was a partial implementation of the Lumiere project at Microsoft Research, which was initiated in 1993 with the goal of developing methods and an architecture for reasoning about the goals and needs of software users as they work with software. At the heart of Lumiere are Bayesian models that capture the uncertain relationships between the goals and needs of a user and observations about program state, sequences of actions over time, and words in a user's query (when such a query has been made).

Early on in the Lumiere project, studies were performed in the Microsoft usability labs to

investigate key issues in determining how best to assist a user as they worked. The studies were

aimed at exploring how experts in specific software applications worked to understand problems

that users might be having with software from the user's behaviors. The Lumiere prototypes have

explored the combination of a Bayesian perspective on integrating information from user

background, user actions, and program state, along with a Bayesian analysis of the words in a

user's query (although the commercially available Office products had a separation between user

interface events and word-based queries). Research on new applications and extensions continued

in parallel with efforts to integrate portions of the prototype into commercially available software

applications. In January 1997, a derivative of Lumiere research shipped as the Office Assistant in

Microsoft Office '97 applications.

2. DOWNFALL OF CLIPPY

“It looks like you’re writing a letter” is one of the most irritating phrases used by Clippy, which offered help with writing a letter whenever a user typed “Dear”. Accurate as it may have been in estimating the user's intentions from actions and word queries, its repeated prompting irked many users. Clippy offered to help incessantly and popped up with suggestions related to the user's actions even after the user had repeatedly declined assistance. Clippy became one of the most hated features of the Office suite and was eventually deemed one of the Top 50 worst inventions.


3. REASONS BEHIND CLIPPY’S FAILURE

Clippy was permanently removed from the Office suite in 2007, and there has been no attempt since to revive the assistant (apart from reusing the character in a game, Ribbon Hero 2, in 2011). Two research papers shed some light on where Clippy fell short.

The first paper, by Luke Swartz (2003) [1], analyses from a psychological perspective what Clippy lacked. Swartz concludes that the Office Assistant’s proactive letter-writing help feature breaks every relevant etiquette rule: it ignores social conventions of when to disturb someone, it does not learn from its mistakes, it does not develop a long-term relationship, and (one might argue) it does not even provide a helpful service. Since this feature is the most cited annoyance in the popular press, one cannot help but wonder how much better the Office Assistant would have been perceived if this one feature had been fixed or eliminated before its release.

While the above addresses human perception, the second paper sets out the technical shortcomings of the Office Assistant relative to its parent project, Lumiere. The paper by E. Horvitz et al. [2], published by the Lumiere team in 1998 at the Conference on Uncertainty in Artificial Intelligence (UAI), describes the inner workings of the Assistant’s inference engine and how much of it was included in the released version of Office 97. 1) The system does not maintain a persistent user profile: the Assistant does not take account of all the actions taken by the user or adapt to the user’s competency level when deciding what assistance to offer. Instead, it worked from a single profile and was therefore more rigid than one would like. 2) The system does not use rich combinations of events over time; it considers only a small set of relatively atomic user actions, and only those actions present in a small event queue. 3) The system separates the analysis of words from the analysis of events. When words are available, the system does not exploit information about context and recent actions. Finally, 4) the automated facility for providing assistance based on the likelihood that a user may need assistance, or on the expected utility of such autonomous action, was not employed.


Based on the two papers [1], [2], the absence of the automated facility for providing assistance based on the likelihood that the user needs it appears to have contributed most to the failure of the Office Assistant. This paper proposes a possible solution that could be incorporated into the Office Assistant: updating the probability that the user needs assistance for a particular action given how the user responded to earlier offers of assistance for the same action, so that the Assistant becomes smarter as it gains more knowledge. This would reduce the unnecessary pop-ups that irk the user by offering assistance when it is not warranted.

Another possible way to improve the performance of the Office Assistant would be to integrate the analysis of words and actions, which would also reduce the most common error, prompting to assist with a letter. To show how the Bayesian network infers the user's intentions from the user's actions, the following sections explain a segment of how it works.

4. BAYESIAN NETWORK: TEACHING NOTES

A Bayesian Network is a graphical model that applies probability theory and graph theory to construct probabilistic inference and reasoning models. The nodes represent variables, events or evidence, while an arc between two nodes represents a conditional dependence between them.

A Bayesian Network is defined as a Directed Acyclic Graph (DAG): the arcs are unidirectional and feedback loops are not allowed. Because of this, it is easy to identify the parent-child relationship, and hence the probabilistic dependency, between two nodes.
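The acyclicity requirement can be checked mechanically. The following is a minimal Python sketch (Python 3.9 or later, for the standard-library graphlib module). The arcs are an assumption modelled on the chart example discussed later in this section (Insert Chart, then Line, then three observed user actions), and the name topological_order is hypothetical.

```python
from graphlib import CycleError, TopologicalSorter

# Assumed arcs, written as child -> set of parents. They mirror the chart
# example used later in this section; the real Office Assistant network
# is not known.
PARENTS = {
    "Insert Chart": set(),
    "Line": {"Insert Chart"},
    "Mouse over insert": {"Line"},
    "Mouse over Line": {"Line"},
    "Highlight Cells": {"Line"},
}


def topological_order(parents):
    """Return the nodes with every parent listed before its children.

    Raises ValueError if the arcs contain a feedback loop, i.e. if the
    graph is not a DAG and so cannot be a Bayesian Network.
    """
    try:
        return list(TopologicalSorter(parents).static_order())
    except CycleError as exc:
        # The second element of args holds the detected cycle.
        raise ValueError(f"not a DAG, cycle found: {exc.args[1]}") from exc


print(topological_order(PARENTS))
```

An ordering in which parents precede children is also the order in which the factors of the joint distribution can be multiplied, which is why the check is useful before any inference is attempted.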

A small network based on the user’s recent actions, here the use of the Chart option in Excel, has been mapped (Fig. 1). The actual network would typically be bigger and more complex; a simple network is used here to explain the basic theory behind eliciting appropriate assistance from user actions. It does not replicate the actual network behind the Office Assistant; it is a hypothesized network built from the observed working of the Assistant. Given evidence of any of the user actions (the nodes in the bottom row), the network yields the likelihood that the user is looking to create or edit a chart, and of which type.


Fig. 1

To simplify the computation and make the example easier to follow, consider the sub-network of Fig. 1 shown in Fig. 2.

Fig. 2


To explain the network, probabilities were assigned to each node [3] (they are not the actual probability distribution used in the real network behind the Assistant). Each node has several states, each with an assigned probability. The relationship between connected nodes can be derived from the network using Bayes' rule. For an arc from node A to node B:

$$P(B \mid A) = \frac{P(A, B)}{P(A)}$$

Hence, the inference from the network can be obtained by:

• Conditional probability:

$$P(Q = q \mid E = e) = \frac{P(q, e)}{P(e)}$$

where Q is the query variable and E is the evidence. Let X1, ..., Xn be the unknown network variables that q depends on.

• Joint probability:

$$P(Q = q, E = e) = \sum_{x_1, \ldots, x_n} P(q \mid e, x_1, \ldots, x_n)\, P(e, x_1, \ldots, x_n)$$

• General product rule for Bayesian Networks:

$$P(X_1, \ldots, X_n) = \prod_{i=1}^{n} P(X_i \mid \mathrm{parents}(X_i))$$

Table 1. Prior probability for node Insert Chart

State               Yes     No
Prior probability   0.500   0.500


A neutral prior probability (50-50) is assigned to the Insert Chart node, so that with further evidence the posterior probability indicates whether the user’s intention is to insert a chart. The conditional probabilities of the remaining nodes are given below.

Table 2. Conditional probabilities for node Line

State                Yes    No
Insert Chart = Yes   0.65   0.35
Insert Chart = No    0.35   0.65

Table 3. Conditional probabilities for node Mouse over insert

State        Yes    No
Line = Yes   0.85   0.15
Line = No    0.15   0.85

Table 4. Conditional probabilities for node Mouse over Line

State        Yes    No
Line = Yes   0.85   0.15
Line = No    0.15   0.85

Table 5. Conditional probabilities for node Highlight Cells

State        Yes    No
Line = Yes   0.85   0.15
Line = No    0.15   0.85

The Assistant can infer whether the user wishes to create a line chart when the user highlights cells, hovers the mouse over Insert Chart, or hovers the mouse over Line Chart. The conditional probabilities in the tables above are used to derive these inferences. The first step is to find the joint probability distribution.


$$P(I, L, MOI, MOL, H) = P(I)\, P(L \mid I)\, P(MOI \mid L, I)\, P(MOL \mid L, I)\, P(H \mid L, I)$$

where I = Insert Chart, L = Line, MOI = Mouse over insert, MOL = Mouse over Line and H = Highlight Cells. Because MOI, MOL and H each have Line as their only parent, the last three factors reduce to P(MOI | L), P(MOL | L) and P(H | L).

The model can now answer queries such as

$$P(L = \text{Yes} \mid MOL = \text{Yes}, H = \text{Yes}) = \frac{P(L = \text{Yes} \mid H = \text{Yes})\, P(MOL = \text{Yes} \mid L = \text{Yes}, H = \text{Yes})}{P(MOL = \text{Yes} \mid H = \text{Yes})}$$

Here P(L = Yes | MOL = Yes, H = Yes) is the posterior probability, P(L = Yes | H = Yes) is the prior probability, and P(MOL = Yes | L = Yes, H = Yes) is the likelihood, which gives the probability of the evidence assuming L = Yes and H = Yes. P(MOL = Yes | H = Yes) is the expectedness, or how expected the evidence is given H = Yes.
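As a worked instance of this query, the terms can be evaluated by hand from the node tables. The sketch below assumes that P(Line = Yes | Insert Chart = No) = 0.35, which is the value consistent with the posteriors reported for this example, and uses the fact that Mouse over Line depends only on Line.

```latex
\begin{align*}
P(L = \text{Yes}) &= 0.5 \times 0.65 + 0.5 \times 0.35 = 0.5 \\
P(L = \text{Yes} \mid H = \text{Yes})
  &= \frac{0.85 \times 0.5}{0.85 \times 0.5 + 0.15 \times 0.5} = 0.85 \\
P(MOL = \text{Yes} \mid L = \text{Yes}, H = \text{Yes})
  &= P(MOL = \text{Yes} \mid L = \text{Yes}) = 0.85 \\
P(MOL = \text{Yes} \mid H = \text{Yes})
  &= 0.85 \times 0.85 + 0.15 \times 0.15 = 0.745 \\
P(L = \text{Yes} \mid MOL = \text{Yes}, H = \text{Yes})
  &= \frac{0.85 \times 0.85}{0.745} \approx 0.9698
\end{align*}
```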

The network was simulated in BayesiaLab, a software package that calculates the posterior probabilities based on the above formula:

Fig. 3


From the inference it is observed that if the user’s actions are “Highlight Cells” and “Mouse over Line”, the user model estimates that the user is 64.09% likely to want to insert a chart and 96.98% likely to want a line chart. Similar simulations were carried out for different evidence of user actions. This is how the Assistant maps user actions to the assistance required. However, an additional network is needed to determine whether the user actually “needs” this assistance.
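The two posteriors quoted above can be checked without BayesiaLab by brute-force enumeration of the joint distribution. The Python sketch below is an independent check and not the software used in this study. It assumes the sub-network Insert Chart → Line → {Mouse over insert, Mouse over Line, Highlight Cells}, and it assumes P(Line = Yes | Insert Chart = No) = 0.35, the value that is consistent with the reported 64.09% and 96.98%. All function and variable names are hypothetical.

```python
from itertools import product

# Node order used throughout: Insert Chart, Line, Mouse over insert,
# Mouse over Line, Highlight Cells.
NODES = ("I", "L", "MOI", "MOL", "H")

P_I_YES = 0.5                                    # prior for Insert Chart
P_L_YES_GIVEN_I = {True: 0.65, False: 0.35}      # 0.35 for I = No is assumed
P_CHILD_YES_GIVEN_L = {True: 0.85, False: 0.15}  # shared by MOI, MOL and H


def prob(p_yes, value):
    """Probability that a binary node takes `value`, given P(node = Yes)."""
    return p_yes if value else 1.0 - p_yes


def joint(i, l, moi, mol, h):
    """Product rule: P(I) P(L|I) P(MOI|L) P(MOL|L) P(H|L)."""
    child = P_CHILD_YES_GIVEN_L[l]
    return (prob(P_I_YES, i) * prob(P_L_YES_GIVEN_I[i], l)
            * prob(child, moi) * prob(child, mol) * prob(child, h))


def posterior(query, evidence):
    """P(query = Yes | evidence), summing the joint over unobserved nodes."""
    matching = 0.0  # probability mass where the evidence holds and query = Yes
    total = 0.0     # probability mass where the evidence holds
    for values in product((True, False), repeat=len(NODES)):
        assignment = dict(zip(NODES, values))
        if all(assignment[name] == seen for name, seen in evidence.items()):
            p = joint(*values)
            total += p
            if assignment[query]:
                matching += p
    return matching / total


evidence = {"MOL": True, "H": True}
print(f"P(Line = Yes | evidence)         = {posterior('L', evidence):.2%}")  # 96.98%
print(f"P(Insert Chart = Yes | evidence) = {posterior('I', evidence):.2%}")  # 64.09%
```

Enumeration visits all 2^5 = 32 joint states, which is fine for a teaching example but grows exponentially with the number of nodes; tools such as BayesiaLab use more efficient inference algorithms.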

4.1 Likelihood Network

One of the major reasons for the failure of the Assistant was that there was no user profiling and the Assistant never considered the likelihood that assistance was needed. Although the Lumiere project included a network to assess this likelihood, it was not fully incorporated into the Office suite. Translating the Bayesian computation into an existing software package posed great difficulty. Hence some functionality, though Bayesian in origin, was not computationally the same once coded into the software package.

Prompted by the negative reviews and the repeated mistakes made by Clippy, a network based on the likelihood network of the Lumiere team, with one minor change, was built in BayesiaLab and analysed to see how that addition could improve the likelihood estimate.

Fig. 4 Fig. 5

Fig. 4 represents part of the network that was designed by the Lumiere team. Fig. 5 represents a network based on that of Fig. 4, with the addition of one node, “Historical Assistance Acknowledgement”. The reason for including this node is that the network, though it successfully elicited the user’s intentions, did not learn over time from whether the user actually used the assistance. The “Historical Assistance Acknowledgement” node captures whether the user “Accepts” or “Denies” the help offered by the Assistant.

“User needs assistance” is the decision-making node that decides whether the user needs assistance. The remaining nodes in the network are the factors that affect this decision. The “Pause after activity” node captures whether the user has been idle for longer than a threshold time (approximately 10 seconds). Such a pause suggests that the user is unsure how to proceed. But the user could also have paused because of a distraction; that possibility affects “Pause after activity” and is captured by “User Distracted”. “Difficulty of current task” and “User expertise” also determine the need for assistance: the difficulty of the current task is predetermined by the software, while the user expertise is provided by the user.

Sample data were used to capture the history of actions performed by the user and whether assistance was provided for each action. The data also recorded whether the user accepted or denied the assistance; this information was used only in the second network, the modification of the Lumiere network. The analysis was confined to a single goal, “Chart function”, for easier analysis and comparison. The posteriors of both networks were calculated for the same inference. The results are shown below:

Fig. 6


Fig. 7

From Fig. 6 and Fig. 7 it can be seen that the posterior probability that the user needs assistance is lower in Fig. 7 (36.36%) than in Fig. 6 (45.83%). Fig. 7 can be claimed to perform better because Fig. 6 accounts only for the number of times assistance was provided, whereas Fig. 7 also accounts for the number of times that assistance was denied. Hence the second network was able to learn that the user needed less assistance than the Assistant assumed. For further confirmation, a few more data points were added and associated with the network:

Fig. 8


Fig. 9

Fig. 8 and Fig. 9 depict the two networks after the additional data points were associated and the posteriors recalculated. In the additional data points the user denied most of the assistance offered, to show more clearly how differently the network performs without “Historical Assistance Acknowledgement”.

Without “Historical Assistance Acknowledgement”, the network in Fig. 8 computes a posterior of 79.25%, whereas the network in Fig. 9 accounts for the number of times the user denied assistance, reducing the posterior to 50%. Hence the modified network performs better than the original, and this would in turn help the Assistant learn the user’s expertise level: if the user repeatedly denies assistance for a particular action, the likelihood of offering assistance for that action is reduced.
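The effect of repeated denials can be illustrated with a much simpler model than the network built in BayesiaLab. The Python sketch below keeps accept and deny counts for one action and estimates the probability that the next offer will be accepted as the posterior mean of a Beta prior (Laplace smoothing when both pseudo-counts are 1). It is an illustration only: the class name, the pseudo-counts and the 0.4 threshold are assumptions, and the sketch does not reproduce the posteriors in Fig. 6 to Fig. 9.

```python
from dataclasses import dataclass


@dataclass
class AcknowledgementHistory:
    """Counts of how the user responded to offers of help for one action."""
    accepted: int = 0
    denied: int = 0

    def record(self, accepted: bool) -> None:
        """Store one response to an offer of assistance."""
        if accepted:
            self.accepted += 1
        else:
            self.denied += 1

    def p_accepts(self, pseudo_accept: float = 1.0,
                  pseudo_deny: float = 1.0) -> float:
        """Posterior mean of a Beta(pseudo_accept, pseudo_deny) prior
        after the recorded responses."""
        total = self.accepted + self.denied + pseudo_accept + pseudo_deny
        return (self.accepted + pseudo_accept) / total


def should_offer_help(history: AcknowledgementHistory,
                      threshold: float = 0.4) -> bool:
    """Offer help only while the estimated acceptance probability
    stays at or above the (assumed) threshold."""
    return history.p_accepts() >= threshold


history = AcknowledgementHistory()
for _ in range(4):           # the user dismisses the Assistant four times
    history.record(accepted=False)
print(round(history.p_accepts(), 4), should_offer_help(history))
```

With no history the estimate is 0.5, and each denial pulls it down, so an assistant using such a rule would stop interrupting a user who keeps dismissing it for the same action.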

4.2 Summary

A more precise network for capturing the user’s decisions and need for assistance was created, and it performed better than the model used for the Office Assistant. The simulation of the two networks showed that the original network kept prompting for assistance despite the user denying it, whereas the second network captured this information. Making this change to the back end of the Assistant could make it smarter and less annoying. An intelligent assistant would be highly desirable, as help provided at the right time could change the impression that the Assistant created earlier.


4.3 Key Assumptions

The data behind the networks built in this paper are not the data actually used in the network of the Office Assistant. Hence the probabilities computed might not be the same. They are only a representation of what the data could be, used to explain the working of the network.

The search for the user’s intentions considered here covers only user actions, not query-based search. The integration of the two, mentioned earlier in the paper, has not been addressed.

4.4 Difficulties in Implementation

Tracking the user’s actions was inherently difficult to integrate with the software when it was first attempted in the Lumiere project. Hence the ability to track the user’s decisions about the assistance offered, and the complexity of storing this information, remain to be assessed.

The paper does not consider how much information must be stored, i.e. if the system keeps storing user actions indefinitely, the efficiency of the search could fall, since all of the information would have to be processed to compute a likelihood.
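One possible way to bound the storage, offered only as an illustration and not evaluated in this paper, is to keep just the most recent responses for each action. The Python sketch below uses a fixed-length queue; the class name and the default window of 50 events are assumptions.

```python
from collections import deque


class BoundedResponseLog:
    """Keeps only the most recent accept/deny responses for one action."""

    def __init__(self, max_events: int = 50):
        # A deque with maxlen silently discards the oldest entry when full,
        # so memory use and processing time stay bounded.
        self._events = deque(maxlen=max_events)

    def record(self, accepted: bool) -> None:
        """Append one response; the oldest is dropped if the log is full."""
        self._events.append(accepted)

    def acceptance_rate(self, default: float = 0.5) -> float:
        """Fraction of retained offers that were accepted.

        Returns `default` when nothing has been recorded yet.
        """
        if not self._events:
            return default
        return sum(self._events) / len(self._events)

    def __len__(self) -> int:
        return len(self._events)


log = BoundedResponseLog(max_events=3)
for response in (True, False, False, False):
    log.record(response)
print(len(log), log.acceptance_rate())
```

A sliding window also lets the estimate follow a user whose expertise changes over time, at the cost of forgetting older behaviour.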


REFERENCES

[1] Luke Swartz, WHY PEOPLE HATE THE PAPERCLIP: LABELS, APPEARANCE,

BEHAVIOR AND SOCIAL RESPONSES TO USER INTERFACE AGENTS,

Symbolic Systems Program, Stanford University, June 12 2003

[2] E. Horvitz, J. Breese, D. Heckerman, D. Hovel, and K. Rommelse. The Lumiere Project:

Bayesian User Modeling for Inferring the Goals and Needs of Software Users. Proceedings of

the Fourteenth Conference on Uncertainty in Artificial Intelligence, July 1998.

[3] Michael Y.K. Kwan, K.P. Chow, Frank Y.W. Law, Pierre K.Y. Lai, Computer Forensics

using Bayesian Network: A Case Study.


EXECUTIVE SUMMARY

This case study is about the much-hated Microsoft Office Assistant, Clippy: its inception, its demise and the technical flaws behind its failure. Clippy used Bayesian networks to infer what assistance to provide from the user’s recent actions or from query-based help requests.

The Microsoft Office Assistant came out of the Lumiere research project, which was established by Microsoft researchers from the Decision Theory & Adaptive Systems Group to study and improve human-computer interaction using Bayesian methods. The Lumiere prototypes explored the combination of a Bayesian perspective on integrating information from user background, user actions, and program state, along with a Bayesian analysis of the words in a user’s query.

Clippy eventually became one of the most hated inventions (Time, 2010). It offered assistance even when the user did not need it. Secondary research into why people hated Clippy made it clear that Clippy popping up incessantly with help messages was what irked users the most. The technical background of this problem is that the Office integration excluded some features of the Lumiere project. The main aspects are: 1) no persistent user profiles; 2) separation between user interface events and word-based queries, so that for word-based queries the engine ignored context and user actions; 3) the automated facility for providing assistance based on the likelihood that a user may need assistance was not employed.

A sample network explaining how Clippy inferred which assistance to provide from the user’s actions was built in BayesiaLab. To address Clippy’s inability to estimate accurately the likelihood that the user needs assistance, the likelihood network proposed by Lumiere was modified to include a node, “Historical Assistance Acknowledgement”, which captures the user’s response to the assistance offered by Clippy: whether the user accepted or rejected it. This response serves as prior information when the likelihood is computed for a similar action in the future, so Clippy has more information about whether the user needs assistance. Lumiere’s original network and the modified network were simulated, and the modified network was found to perform better. This solution could make Clippy smarter and able to decide more precisely when the user needs help.
