UCI
Large-Scale Collection of Application Usage Data to Inform Software
Development
David M. Hilbert
Information and Computer Science
University of California, Irvine
Irvine, California 92697-3425
[email protected]
http://www.ics.uci.edu/~dhilbert/
Overview
• Background and Motivation
• Dissertation and Evaluation
• Insights and Hypotheses
• Progress and Schedule
• Dissertation Outline
• Future Research
Background and Motivation
• Expectations influence designs; designs embody expectations
• Mismatches between expectations and how applications are actually used can lead to breakdowns
• Identifying and resolving mismatches can help improve the fit between design and use
• The behavior of applications, users, and usage environments is complex and unpredictable enough that observation is required
• Research area: theories, methods, and techniques to enable large-scale incorporation of application usage data in development
Impact of the Internet
• On the positive side
– cheap, rapid, large-scale distribution of software for evaluation
– simple transport mechanism for usage information and feedback
– use and development becoming increasingly concurrent
– should make incorporating usage information easier
• On the negative side
– reduces opportunities for traditional user testing
– increases variety and distribution of users and usage situations
– lack of scalable techniques and methods for incorporating usage information on a large scale
Current Approaches
• Current approaches suffer from significant limitations
– usability testing => limited scale (size, scope, location, duration)
– beta testing => limited data quality (incentives, knowledge, detail)
• The user feedback paradox
– users not having problems => provide feedback, negative reactions
– users having problems => withhold feedback, positive reactions
• The impact assessment problem
– difficult to assess the impact on the user population of suspected or reported problems and of potential changes
Research Goals
• Address issues of scale
– enable larger scale evaluations (size, scope, location, duration) than currently possible with existing usability testing techniques
• Address issues of data quality
– enable higher quality data to be collected than currently possible with beta testers alone or existing automated techniques
• Provide a complementary source of information
– help address the feedback paradox and impact assessment problem in making design and effort allocation decisions
Research Direction
• Explore the use of automated software monitoring techniques
– capture information about user interactions on a large scale
– compare actual use against developers’ expectations
– help automate the mismatch identification and resolution process
– make incorporating information about users more palatable to developers
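Comparing actual use against developers’ expectations can be pictured with a small sketch. It is only an illustration under stated assumptions: the `SequenceExpectation` class, the event names, and the "A should precede B" form of expectation are hypothetical and are not taken from the slides or from any prototype.

```python
from dataclasses import dataclass


@dataclass
class SequenceExpectation:
    """Hypothetical expectation: `first` should be observed before `then`."""
    name: str
    first: str
    then: str
    seen_first: bool = False
    mismatches: int = 0

    def observe(self, event: str) -> None:
        """Update the expectation with one observed event."""
        if event == self.first:
            self.seen_first = True
        elif event == self.then:
            if not self.seen_first:
                # Actual use departed from the expected order.
                self.mismatches += 1
            # Start fresh for the next attempt.
            self.seen_first = False


def count_mismatches(expectation: SequenceExpectation, events: list[str]) -> int:
    """Feed a stream of events to the expectation and return the mismatch count."""
    for event in events:
        expectation.observe(event)
    return expectation.mismatches


# Assumed event names, for illustration only.
expectation = SequenceExpectation("address before submit", "edit:address", "press:submit")
observed = ["edit:address", "press:submit", "press:submit"]
print(count_mismatches(expectation, observed))
```

A mismatch count of this kind is one possible input to the mismatch identification step; how mismatches are then resolved is a design decision the sketch does not address.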
Dissertation
• Technical issues
– Abstraction Problem (data quality)
– Selection Problem (data quality/scale)
– Context Problem (data quality)
– Reduction Problem (scale)
– Evolution Problem (scale)
• Hypothesis
– all these problems can be addressed by embedding the right kinds of data collection mechanisms within an appropriate data collection architecture
Dissertation (cont’d)
• Theoretical/methodological issues
– aside from “technical issues”, it isn’t clear what data to collect and why, and how to incorporate results in development
– since data collection and analysis can be expensive, guidance can increase the chances that the cost/benefit ratio will be favorable
• Hypothesis
– a theory and method based on usage expectations can be elaborated to provide motivation and guidance for incorporating data collection and analysis in development
Contributions
• Identification of key issues limiting scalability and data quality inherent in current techniques
• Solutions to the abstraction, selection, context, reduction, and evolution problems within a single data collection architecture
• A reference architecture to provide design guidance regarding key components and relationships
• Theory to motivate the significance of usage expectations in development and importance of collecting usage information
• Methodological guidance regarding collection, analysis, interpretation, and incorporation of results in development
Evaluation
• Prototype
– demonstrate solutions to the abstraction, selection, context, reduction, and evolution problems within a single data collection architecture
• Informal empirical evaluation
– assess usability and utility of approach based on feedback from independent developers who integrated the prototype in a research demonstration scenario
• Participant observation of an industrial project
– foundation for an analytical evaluation of the techniques, reference architecture, theory, and method
The Abstraction Problem
• Observation
– questions about usage typically occur in terms of concepts at higher levels of abstraction than represented in data provided by application components
– questions of usage can occur at multiple levels of abstraction
• Hypothesis
– simple “data abstraction” mechanisms (based on grammatical techniques) can be constructed to allow low-level data to be related to higher-level concepts such as UI and application features as well as users’ tasks and goals
– this can impact the results of human and automated analyses
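One way to picture relating low-level events to higher-level concepts is a set of rewrite rules applied to the event stream. The sketch below is a simplified stand-in for the grammatical techniques named on the slide, not the dissertation's mechanism; the rule table, the event names, and the `abstract` function are all assumptions made for illustration.

```python
# Hypothetical rules: a sequence of low-level events maps to one abstract event.
RULES = {
    ("mouse_down:File", "mouse_up:File", "mouse_down:Print", "mouse_up:Print"): "invoke:Print",
    ("key:Ctrl+P",): "invoke:Print",
}


def abstract(events, rules):
    """Rewrite low-level event sequences into higher-level events.

    Longer patterns are tried first; events that match no rule pass through.
    """
    patterns = sorted(rules, key=len, reverse=True)
    out = []
    i = 0
    while i < len(events):
        for pattern in patterns:
            if tuple(events[i:i + len(pattern)]) == pattern:
                out.append(rules[pattern])
                i += len(pattern)
                break
        else:
            # No rule matched at this position.
            out.append(events[i])
            i += 1
    return out


low_level = [
    "mouse_move",
    "key:Ctrl+P",
    "mouse_down:File", "mouse_up:File", "mouse_down:Print", "mouse_up:Print",
]
print(abstract(low_level, RULES))
```

Both the menu path and the keyboard shortcut end up as the same feature-level event, which is the level at which a question such as "how often is Print used?" is usually asked.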
The Selection Problem
• Observation
– the amount of data necessary to answer usage questions will typically be a relatively small subset of the much larger set of data that might be recorded at any given time
– collecting too much data can make it difficult to separate events and patterns of interest from the “noise”
• Hypothesis
– simple “data selection” mechanisms (based on events, event sequences, values, and value vectors) can be constructed to allow important data to be captured, and unimportant data filtered, prior to reporting
– this can impact the results of human and automated analyses, not to mention scalability
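Selection by event and by value can be sketched as a filter placed in front of reporting. This is a minimal illustration that assumes events are dictionaries with `type` and `value` keys; that record layout and the `select` function are hypothetical, and selection over event sequences and value vectors is left out.

```python
def select(events, wanted_types, value_predicate=None):
    """Yield only events whose type is wanted and whose value passes the predicate."""
    for event in events:
        if event["type"] not in wanted_types:
            continue
        if value_predicate is not None and not value_predicate(event["value"]):
            continue
        yield event


# Assumed event records, for illustration only.
events = [
    {"type": "mouse_move", "value": (10, 12)},
    {"type": "field_changed", "value": ""},
    {"type": "field_changed", "value": "Irvine"},
    {"type": "invoke:Print", "value": None},
]

# Keep only non-empty field changes; everything else is filtered before reporting.
selected = list(select(events, {"field_changed"}, lambda value: value != ""))
print(selected)
```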
The Context Problem
• Observation
– information required to interpret the significance of events may not be available in the events produced by application components
– contextual information may be spread across multiple events or missing altogether, but is frequently available “for the asking” from the application, artifacts, or user
• Hypothesis
– simple “context-capture” mechanisms (that provide access to application, artifact, and user state information) can be exploited to allow context to be used in interpreting the significance of events
– this can also help in capturing important information not available in events
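Capturing context that is available "for the asking" can be sketched as a state query made at the moment an event of interest occurs. The `DemoApplication` class, its `get_property` method, and the property names are assumptions for illustration; a real application would expose its own state interface.

```python
class DemoApplication:
    """Stand-in for an application that exposes state on request."""

    def __init__(self):
        self._properties = {"document_dirty": True, "selected_items": 0}

    def get_property(self, name):
        """Return the current value of a named property, or None if unknown."""
        return self._properties.get(name)


def capture_with_context(event, app, property_names):
    """Attach application state, queried at event time, to the event record."""
    context = {name: app.get_property(name) for name in property_names}
    return {"event": event, "context": context}


record = capture_with_context("invoke:Delete", DemoApplication(), ["selected_items"])
print(record)
```

In this example the event alone says only that Delete was invoked; the queried state shows that nothing was selected at the time, which changes how the event should be interpreted.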
The Reduction Problem
• Observation
– much of the analysis that will ultimately be performed to answer usage questions can actually be performed during data collection, resulting in greatly reduced data reporting and post-hoc analysis needs
– when analysis is left as the last step, it is often not performed
• Hypothesis
– simple “data reduction” mechanisms (e.g., for performing counts and other simple analyses during collection) can be constructed to reduce the amount of data that must ultimately be reported and analyzed
– this can impact scalability and likelihood that data will be analyzed
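Performing counts during collection can be sketched with a small reducer that keeps a running tally and reports only the summary. The `CountingReducer` class and the event names are hypothetical and serve only to illustrate the idea of reducing data before it is reported.

```python
from collections import Counter


class CountingReducer:
    """Keep per-event counts during collection instead of storing every event."""

    def __init__(self):
        self.counts = Counter()

    def observe(self, event):
        """Record one occurrence of an event."""
        self.counts[event] += 1

    def report(self):
        """Return the summary that would be reported in place of the raw log."""
        return dict(self.counts)


reducer = CountingReducer()
for event in ["invoke:Print", "invoke:Save", "invoke:Print"]:
    reducer.observe(event)
print(reducer.report())
```

The report grows with the number of distinct events rather than with the length of the session, which is where the reduction in reported data comes from.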
The Evolution Problem
• Observation
– data collection needs will typically evolve over time (perhaps due to results of earlier data collection) more rapidly than the application
– unnecessary coupling of data collection and application code can increase cost and even cripple evolution of data collection
• Hypothesis
– “evolvable” data collection mechanisms (based on encapsulating abstraction, selection, context-capture, and reduction decisions) can be constructed to allow data collection to evolve over time without impacting application deployment or use
– this can impact the practicality of performing data collection
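Decoupling data collection from application code can be sketched by treating collection decisions as data that is loaded separately from the application. The JSON spec format, the field names `name` and `select`, and both functions below are assumptions made for illustration; they are not a described file format.

```python
import json


def load_agent_spec(text):
    """Parse a collection spec that is kept outside the application code."""
    return json.loads(text)


def run_agent(spec, events):
    """Count only the events the spec asks for."""
    wanted = set(spec["select"])
    counts = {}
    for event in events:
        if event in wanted:
            counts[event] = counts.get(event, 0) + 1
    return {"agent": spec["name"], "counts": counts}


events = ["invoke:Print", "invoke:Save", "invoke:Print"]

# Two versions of the spec; the application and run_agent are unchanged between them.
spec_v1 = load_agent_spec('{"name": "print-usage", "select": ["invoke:Print"]}')
spec_v2 = load_agent_spec('{"name": "save-usage", "select": ["invoke:Save"]}')

print(run_agent(spec_v1, events))
print(run_agent(spec_v2, events))
```

Swapping the spec changes what is collected without touching the application, which is the property the evolution hypothesis is after.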
Approach
• Expectation-Driven Event Monitoring (EDEM)
EDEM Architecture
[Figure: EDEM architecture. On the development computer and on the user computer, EDEM active agents run in the Java Virtual Machine alongside the application's UI components; the agents receive top-level window and UI events and issue property queries that return property values. Agent specs are saved with a URL on the development side, where an HTTP server holds the agent specs and an EDEM server holds the collected data. On the user computer, agent specs are loaded via URL and agent reports are sent via e-mail.]
Reference Architecture
[Figure: reference architecture. A system model of the UI and application (components, events, properties, methods; drawn as a set of objects) is related by a mapping to an analyst model of the UI and application (features, dialogs, controls, user-supplied values, user tasks). Components shown: data capture; abstraction, selection, context, and reduction; data packaging; data transport; data prep; data analysis.]
Reference Architecture (Word IV)
[Figure: the Reference Architecture diagram annotated for Word IV: instrumentation intertwined with the application.]
Reference Architecture (Office IV)
[Figure: the Reference Architecture diagram annotated for Office IV: event monitoring infrastructure; TestWizard database of the Office UI.]
Reference Architecture (EDEM)
[Figure: the Reference Architecture diagram annotated for EDEM: event monitoring infrastructure; expectation agents; “pluggable” data abstraction, selection, context-capture, and reduction.]
Dissertation Progress
Product                   Status       Comments
Survey                    Done         N/A
Theory and Method         Needs Work   Theory and method require further elaboration
Reference Architecture    Near Done    Design guidance requires further elaboration
Informal evaluation       Done         N/A
Prototype                 Needs Work   Prototype requires porting and other extensions
Participant observation   Near Done    Further analysis of observations required
Dissemination Progress
Description    Venue         Status
Conf. Demo     ICS97         Accepted
Conf. Demo     IUI98         Accepted
Conf. Paper    ICSE98        Accepted
Work. Paper    CSCW98        Accepted
Conf. Paper    Agents98      Accepted
Journ. Paper   IEEE TSE      In Review
Journ. Paper   ACM Surveys   In Review
[Topic columns marked with X per venue; the row placement of the marks did not survive extraction. Marks per column: Prototype 6, Theory/Method 5, Reference Arch. 2, Techniques 2, Survey 1.]
Schedule for Work Remaining
Product                   Schedule      Comments
Prototype extension       Dec-Jan ’99   port; update event model; explicit support for 5 techniques
Theoretical elaboration   Jan-Feb ’99   elaborate theory/method based on “participant observation”
Document results          Feb ’99       should already be well into writing
Buffer period             May-Jul ’99   wrap up any loose ends
Final defense             May ’99       schedule ahead of time w/ Grudin
Dissertation Outline
• Introduction (General Introduction)
– Expectations in Software Development (highlight theory)
– Impact of the Internet (problems and opportunities)
– Problems with Current Practice (usability and beta testing)
– Proposed Solution (foreshadow insights, approach, contributions)
• Extracting Usage Data from User Interaction Events (State of the Art)
– Synch and Search
– Abstraction, Filtering, and Recoding
– Counts and Summary Statistics
– Sequence Detection
– Sequence Comparison
– Sequence Characterization
– Visualization
– Integrated Support
Dissertation Outline (cont’d)
• Key Problems and Insights (Problem Statement)
– The Abstraction Problem (meaningfulness)
– The Selection Problem (meaningfulness)
– The Context Problem (meaningfulness)
– The Reduction Problem (scalability/practicality)
– The Evolution Problem (scalability/practicality)
– Interdependencies and Interactions
– Need for Theoretical and Methodological Guidance
Dissertation Outline (cont’d)
• Expectation-Driven Event Monitoring (Solution Statement)
– Theory and Method (based on research and Microsoft experience)
• Expectations in development
• Identifying expectations
• Integrating data collection in the development process
• Analyzing data and interpreting results
• A sample usage data collection process
– Techniques for Addressing Current Limitations (description of prototype)
• Data Abstraction
• Data Selection
• Context Capture
• Data Reduction
• Evolution
– Reference Architecture (based on prototype and Microsoft experience)
• Architectural components and relationships
• Supporting large-scale data collection
Dissertation Outline (cont’d)
• Experience and Evaluation (Evaluation of Solution)
– The GTN scenario
• Study Goals
• Description
• Results
– Participant observation of an industrial project
• Study Goals
• Description
• Results
– Collection, analysis, and reporting goals
– Challenges and limitations (addressed by this research)
– Lessons learned (informing this research)
Dissertation Outline (cont’d)
• Conclusions
– Conclusions
– Summary of Contributions
– Future Research
• References
• Appendices
Future Research
• Large-scale evaluation of research in practice
– nature of usage information
– issues in interpretation and incorporation of results
– evolution and maintenance issues
• Other possible extensions
– exploit relationships between expectations and other requirements-related artifacts, e.g., use cases, cognitive walkthroughs, task analysis
– explore issues of adaptability and reuse of infrastructure and default analyses
– analysis of changes in usage over time
– analysis of usage involving multiple cooperating users
Other Possible Applications
• Support for adaptive UI/application behavior based on long-term information about user (or users’) actions
• Support for “smarter” delivery of help/suggestions/assistance based on long-term information about user (or users’) actions
• Support for monitoring of other component-based software systems
– low-level data must be related to higher level concepts of interest
– available information exceeds that which can practically be collected
– data collection needs evolve over time more quickly than application
Research Process
[Figure: research process. Nodes: Microsoft Experience, Survey, Theory/Method, Prototype, Reference Architecture, GTN Scenario. Arrows between the nodes are labeled Motivation, Insight, and Evaluation.]