
A Seminar Report On

SIKULI SCRIPT

AT,

MAHATMA GANDHI INSTITUTE OF TECHNICAL EDUCATION & RESEARCH CENTRE-NAVSARI

Year-2011


MAHATMA GANDHI INSTITUTE OF

TECHNICAL EDUCATION &

RESEARCH CENTRE

A Seminar Report on

SIKULI SCRIPT

Submitted to,

Department of Computer Science Engineering

Submitted by,

SNEHAL M PATEL

B.E 3rd COMPUTER (5th SEM)

Enrolment No. 090330131025

Under the Esteemed guidance of

DIYA VADHWANI


Shree Navsari Paschim Vibhag Koli Samaj Kalyankari Trust Sanchalit

MAHATMA GANDHI

INSTITUTE OF TECHNICAL

EDUCATION & RESEARCH CENTRE

Navsari- 396 450

C E R T I F I C A T E

This is to certify that the seminar entitled “Sikuli Script” is submitted by Snehal M Patel, bearing Enrolment No. 090330131025, of the Computer Science Engineering Department (B.E 3rd, Sem V) in fulfillment of the requirement, and that he has satisfactorily completed his work for the academic year JUNE-2011 to OCT-2011.

Diya Vadhwani, Internal Guide

Prof. Mukesh M Patel, Head of Department (CSE)


En. No.: 090330131025 Sikuli Script

MGITER/ CO/2011 P a g e | ii

ACKNOWLEDGEMENT

I am extremely grateful to Prof. Mukesh Patel, Head of the Department of Computer Science, MGITER, Navsari, for providing all the resources required for the successful completion of my seminar.

My heartfelt gratitude goes to my internal guide, Diya Vadhwani, Associate Professor, for her valuable suggestions and guidance in the preparation of the seminar report.

I would be failing in my duty if I did not gratefully acknowledge the authors of the references and other literature referred to in this seminar.

Last but not least, I am very thankful to my parents, who guided me in every step I took.

- SNEHAL M PATEL


ABSTRACT

The system, designed by MIT associate professor Rob Miller, graduate student Tsung-Hsiang Chang, and the University of Maryland’s Tom Yeh, is called Sikuli, which means “God’s eye” in the language of Mexico’s Huichol Indians. In a paper that won the best-student-paper award at the Association for Computing Machinery’s User Interface Software and Technology (UIST) conference in 2009, the researchers showed how Sikuli could aid in the construction of “scripts,” short programs that combine or extend the functionality of other programs. Using the system requires some familiarity with the common scripting language Python, but it requires no knowledge of the code underlying the programs whose functionality is being combined or extended. When the programmer wants to invoke the functionality of one of those programs, she simply draws a box around the associated GUI element, clicks the mouse to capture a screenshot, and inserts the screenshot directly into a line of Python code.

Sikuli Script is a visual technology, still under development, for searching and automating graphical user interfaces (GUIs) using images (screenshots). The first release of Sikuli contains Sikuli Script, a visual scripting API for Jython, and Sikuli IDE, an integrated development environment for easily writing visual scripts with screenshots.

Sikuli Script enables the programmer to write programs against the user interface instead of an API. It automates anything we see on the screen without needing support from an internal API. We can programmatically control a web page, a desktop application running on Windows/Linux/Mac OS X, or even an iPhone / Android / Symbian application running in an emulator.

The developers behind this project are Tom Yeh, Tsung-Hsiang Chang, and Rob Miller.


TABLE OF CONTENTS

Acknowledgement
Abstract
1. Introduction
2. How It Works
2.1. Sikuli: Seeing Pixels
2.2. User Interface for Taking Screenshots
2.3. Editor for Writing Sikuli Scripts
3. Functions
3.1. Handling Applications
3.2. Interacting with the User
3.3. General Information and Settings
3.4. Other Important Functions
4. Extensions
4.1. How to Download
4.2. How to Develop an Extension
4.3. How to Test an Extension
5. Working Procedure
6. Application & Example
6.1. Application of Sikuli Script
6.2. Example of Sikuli Script
7. Evaluation of Sikuli Script
7.1. Evaluation
7.2. Testability Analysis
7.3. Reusability Analysis
8. Limitations
9. Future Development
10. Summary & Conclusion
11. Who is Using?
12. References


INTRODUCTION

Until the 1980s, using a computer program meant memorizing a lot of

commands and typing them in a line at a time, only to get lines of text back. The

graphical user interface, or GUI, changed that. By representing programs, program

functions, and data as two-dimensional images — like icons, buttons and windows —

the GUI made intuitive and spatial what had been memory intensive and laborious.

But while the GUI made things easier for computer users, it didn’t make them

any easier for computer programmers. Underlying GUI components is a lot of

computer code, and usually, building or customizing a program, or getting different

programs to work together, still means manipulating that code. Researchers in MIT’s

Computer Science and Artificial Intelligence Lab hope to change that, with a system

that allows people to write programs using screen shots of GUIs. Ultimately, the

system could allow casual computer users to create their own programs without

having to master a programming language.

Researchers at the University of Maryland and the Massachusetts Institute of Technology have developed a screen-capture-based scripting environment that could signal a new programming paradigm, one that leverages the graphical interface as a sort of API. The Sikuli system lets users with minimal programming experience use GUI screenshots to create scripts that interact with applications. Ultimately, it will open opportunities to develop scripts that touch multiple applications without requiring any understanding of the underlying programs' APIs.

In human-to-human communication, asking for information about tangible objects can be naturally accomplished by making direct visual references to them. For example, to instruct a mover to put a lamp on top of a nightstand, we would say “put this over there” while pointing to the lamp and the nightstand respectively.

Likewise, in human-to-computer communication, finding information or

issuing commands involving GUI elements can be accomplished naturally by making

direct visual reference to them.

Sikuli allows a user or programmer to make direct visual references to GUI elements. To search a documentation database about a GUI element, a user can draw a rectangle around it and take a screenshot as a query. Similarly, to automate interactions with a GUI element, a programmer can insert screenshots directly into a script statement and specify what keyboard or mouse action to invoke when this element is seen on screen.


HOW IT WORKS

2.1 Sikuli: Seeing Pixels

Sikuli's greatest value is its generality: "If it has pixels that Sikuli can see, then it's open to automation". The technique is open to any application with a GUI that can be displayed on a Windows, Mac, or Linux desktop. Users have already applied it not just to desktop applications, but also to Web pages, video games, mobile phone apps (running in a simulator or using a remote connection between the desktop and the phone), and applications from other platforms running in a virtual machine.

2.2 User Interface for Taking Screenshots

Sikuli Search allows a user to select a region of interest on the screen, submit the image in the region as a query to the search engine, and browse the search results. To specify the region of interest, a user presses a hot-key to switch to Sikuli Search mode and drags out a rubber-band rectangle around it. Users do not need to fit the rectangle perfectly around a GUI element, since the screenshot representation scheme allows inexact matches. After the rectangle is drawn, a search button appears next to it, which submits the image in the rectangle as a query to the search engine and opens a web browser to display the results.

Sikuli Script’s annotation interface allows a user to save screenshots with custom annotations that can later be looked up using screenshots. To save a screenshot of a GUI element, the user draws a rectangle around it; the captured screenshot is saved in the visual index. The user then enters the annotation to be linked to the screenshot. Optionally, the user can mark a specific part of the GUI element (e.g., a button in a dialog box) to which the annotation is directed.

(Example: Taking Screenshot)


2.3 Editor for writing Sikuli scripts:

(Fig: Sikuli Editor)

The Sikuli developers built an editor to help users write visual scripts (see the figure above). To take a screenshot of a GUI element to add to a script, a user can click on the camera button (a) in the toolbar to enter the screen capture mode. The editor hides itself automatically to reveal the desktop underneath, and the user can draw a rectangle around an element to capture its screenshot. The captured image can be embedded in any statement and is displayed as an inline image.

The editor also provides code completion. When the user types a command,

the editor automatically displays the corresponding command template to remind the

user what arguments to supply. For example, when the user types find, the editor will

expand the command. The user can click on the camera button to capture a screenshot

to be the argument for this find() statement. Alternatively, the user can load an

existing image file from disk (b), or type the filename or URL of an image, and the

editor automatically loads it and displays it as a thumbnail. The editor also allows the

user to specify an arbitrary region of screen to confine the search to that region (c).

Finally, the user can press the execute button (d) and the editor will be hidden and the

script will be executed.


FUNCTIONS

3.1 Handling Applications

closeApp - exit - openApp - run – switchApp

3.1.1 openApp( application )

application: The name of an application (case-insensitive) that can be found through the environment variable PATH, or the full path to an application (Windows: use a double backslash \\ as the path separator).

Opens the given application and brings it to the front.

openApp("cmd.exe") # Windows: found through PATH
openApp("c:\\Program Files\\Mozilla Firefox\\firefox.exe") # Windows: full path specified
openApp("Safari") # Mac: opens Safari
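The double backslash is needed because a single backslash starts an escape sequence in a Python string literal. A minimal sketch in plain Python (no Sikuli calls) of what can go wrong; the Firefox path is the one from the examples above, and the raw-string spelling is a general Python alternative, not something the source prescribes:

```python
# Two ways to spell the same Windows path in a Python string literal.
escaped = "c:\\Program Files\\Mozilla Firefox\\firefox.exe"
raw = r"c:\Program Files\Mozilla Firefox\firefox.exe"  # raw string: backslashes stay literal

same_path = escaped == raw

# A single backslash followed by certain letters is an escape sequence:
# "\t" is one tab character, not a backslash followed by "t".
accidental_tab = "c:\temp"
intended_path = "c:\\temp"

print(same_path)            # True
print(len(accidental_tab))  # 6
print(len(intended_path))   # 7
```

The length difference shows that the unescaped spelling silently produced a different string, which is why the path would not be found.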

3.1.2 switchApp( application )

application: The name of an application (case-insensitive).

Switches to the given application and brings it to the front. If the application is not running, it will be launched by openApp(). On Windows, the application is identified by looking for the given text in the window titles.

switchApp("cmd.exe") # Windows: switches to an open command prompt or starts one
switchApp("c:\\Program Files\\Mozilla Firefox\\firefox.exe") # Windows: opens a new browser window (since the text cannot be found in the window title)
switchApp("mozilla firefox") # Windows: switches to the frontmost open browser window (no window open: does nothing)
switchApp("Safari") # Mac: switches to Safari or starts it


3.1.3 closeApp( application )

application: The name of an application (case-insensitive).

Closes the given application. It does nothing if no open window (Windows) or running app (Mac) can be found.

Note: On Windows, the application is matched by window title, as with switchApp(). The whole application owning the matching window will be closed.

closeApp("cmd.exe") # Windows: closes an open command prompt
closeApp("c:\\Program Files\\Mozilla Firefox\\firefox.exe") # Windows: does nothing, since the text cannot be found in the window title
closeApp("mozilla firefox") # Windows: stops Firefox including all its windows
closeApp("Safari") # Mac: closes Safari including all its windows

3.1.4 run( command )

command: a command that can be run from the command line.

Executes the given command. The script waits for completion.
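The "waits for completion" behaviour can be illustrated with a plain Python 3 analogue. This is only a sketch using the standard subprocess module, not a description of how Sikuli's run() is implemented; the child command is an arbitrary stand-in chosen so the example runs anywhere Python is installed:

```python
import subprocess
import sys

# Launch a short child process and block until it exits.
# sys.executable (the running Python interpreter) stands in for any
# program that could be started from the command line.
completed = subprocess.run(
    [sys.executable, "-c", "print('child finished')"],
    capture_output=True,
    text=True,
)

# These lines are reached only after the child has terminated.
exit_code = completed.returncode
child_output = completed.stdout.strip()
print(exit_code, child_output)
```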

3.1.5 exit()

Stops the script gracefully at this point.


3.2 Interacting with the User

popup - input

3.2.1 popup( text )

text: a string that is used as the message.

Displays a dialog box with an OK button and text as the message. The script waits for the user to click OK.

popup("Hello World!\nHave fun with Sikuli!") # \n breaks the line

3.2.2 input( [text] )

text: a string that is used as the message. If omitted, the message is left blank.

Displays a dialog box with an input field, a Cancel button, an OK button and text as the message. The script waits for the user to click either Cancel or OK.


3.3 General Information and Settings

getBundlePath - setBundlePath – setShowActions

3.3.1 setBundlePath( path-to-a-folder )

path-to-a-folder: a fully qualified path to a folder containing the images used for finding patterns. Windows: use double backslashes.

Sets the path used for searching images in all Sikuli Script methods. Sikuli IDE sets this automatically to the path of the folder where it saves the script (.sikuli). Therefore, you should use this function only if you really know what you are doing. Using it generally means that you want to manage your captured images yourself.

3.3.2 getBundlePath()

returns: a string containing a fully qualified path to the folder containing the images used for finding patterns. Note: Sikuli IDE sets this automatically to the path of the folder where it saves the script (.sikuli). You may use this function, for example, to package your private files together with the script or to access the picture files in the .sikuli bundle for other purposes. Sikuli only gives you access to the path name, so you may need other Python modules for I/O or other purposes.
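Since only the path name is provided, file access is left to ordinary Python modules. A minimal sketch, assuming the string in bundle_path is what getBundlePath() would return; here it is a temporary folder created for the demonstration, and the helper name and file names are hypothetical:

```python
import os
import tempfile


def list_bundle_images(bundle_path):
    """Return the sorted .png file names found directly inside bundle_path."""
    return sorted(
        name for name in os.listdir(bundle_path)
        if name.lower().endswith(".png")
    )


# Build a throwaway folder that mimics a .sikuli bundle.
with tempfile.TemporaryDirectory() as workspace:
    bundle_path = os.path.join(workspace, "demo.sikuli")  # stand-in for getBundlePath()
    os.mkdir(bundle_path)
    for name in ("demo.py", "ok_button.png", "close_button.png"):
        open(os.path.join(bundle_path, name), "w").close()

    images = list_bundle_images(bundle_path)
    first_image = os.path.join(bundle_path, images[0])
    first_image_exists = os.path.isfile(first_image)

print(images)
```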

3.3.3 setShowActions( False | True )

If set to True, when a script is run, Sikuli shows a visual effect for about 2 seconds on the spot where an action (e.g. click, dragDrop, type) will take place, before executing the action. The default setting is False.


3.4 Other Important Functions

Find – Pattern – Action – Region

3.4.1 Find

The find() function locates a particular GUI element to interact with. It takes a visual pattern that specifies the element's appearance, searches the whole screen or part of the screen, and returns the regions matching this pattern, or false if no such region can be found.

For example, find([screenshot of a Word document icon]) returns regions containing a Word document icon.
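To make the idea of searching a screen for a visual pattern concrete, here is a deliberately naive sketch in plain Python. It is not the computer-vision algorithm Sikuli uses: the "screen" and "pattern" are small grids of integers standing in for pixels, and the function names and the min_similarity parameter are assumptions made for illustration.

```python
def similarity(screen, pattern, top, left):
    """Fraction of pattern pixels equal to the screen pixels underneath them."""
    rows, cols = len(pattern), len(pattern[0])
    same = sum(
        1
        for r in range(rows)
        for c in range(cols)
        if screen[top + r][left + c] == pattern[r][c]
    )
    return same / (rows * cols)


def find_all(screen, pattern, min_similarity=1.0):
    """Return (top, left, score) for every window scoring at least min_similarity."""
    rows, cols = len(pattern), len(pattern[0])
    matches = []
    for top in range(len(screen) - rows + 1):
        for left in range(len(screen[0]) - cols + 1):
            score = similarity(screen, pattern, top, left)
            if score >= min_similarity:
                matches.append((top, left, score))
    # Best match first, so matches[0] plays the role of the top match.
    return sorted(matches, key=lambda match: match[2], reverse=True)


screen = [
    [0, 0, 0, 0, 0],
    [0, 1, 2, 0, 0],
    [0, 3, 4, 0, 0],
    [0, 0, 0, 1, 2],
    [0, 0, 0, 3, 9],
]
pattern = [[1, 2], [3, 4]]

exact = find_all(screen, pattern)                       # pixel-by-pixel matches only
fuzzy = find_all(screen, pattern, min_similarity=0.75)  # tolerate one differing pixel
print(exact)
print(fuzzy)
```

The exact search finds only the copy at row 1, column 1; lowering the threshold also admits the near-copy at row 3, column 3, where one of four pixels differs.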

3.4.2 Pattern

The Pattern class is an abstraction for visual patterns. A pattern object can be created from an

image or a string of text. When created from an image, the computer vision algorithm

described earlier is used to find matching screen regions. When created from a string, OCR is

used to find screen regions matching the text of the string.

An image-based pattern object has four methods for tuning how general or specific the

desired matches must be:

exact(): Require matches to be identical to the given search pattern pixel-by-pixel.

similar(float similarity): Allow matches that are somewhat different from the given pattern.

A similarity threshold between 0 and 1 specifies how similar the matching regions must be

(1.0 = exact).

anyColor(): Allow matches with different colors than the given pattern.

anySize(): Allow matches of a different size than the given pattern.

Each method produces a new pattern, so they can be chained together. For example,

Pattern([image]).similar(0.8).anyColor().anySize()

matches screen regions that are 80% similar to the given image, of any size and of any color composition.

Note that these pattern methods can affect the computational cost of the search: the more general the pattern, the longer it takes to find it.
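The statement that each method produces a new pattern is what makes chaining safe: the original pattern is never modified. A minimal sketch of that design in plain Python; PatternSketch, its field names, and the snake_case method names are hypothetical and only mirror the methods listed above, they are not Sikuli's Pattern class:

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PatternSketch:
    """Immutable stand-in for a visual pattern and its matching options."""
    image: str
    min_similarity: float = 1.0
    ignore_color: bool = False
    ignore_size: bool = False

    def exact(self):
        return replace(self, min_similarity=1.0)

    def similar(self, similarity):
        if not 0.0 <= similarity <= 1.0:
            raise ValueError("similarity must be between 0 and 1")
        return replace(self, min_similarity=similarity)

    def any_color(self):
        return replace(self, ignore_color=True)

    def any_size(self):
        return replace(self, ignore_size=True)


base = PatternSketch("close_button.png")  # hypothetical image file name
loose = base.similar(0.8).any_color().any_size()

print(base)
print(loose)
```

Because every call returns a fresh object, base still requires an exact match after loose has been built from it.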


3.4.3 Action

The action commands specify what keyboard and/or mouse events are issued to the center of a region found by find(). The commands currently supported in the API are:

click(Region), doubleClick(Region): These two commands issue mouse-click events to the center of a target region. For example, click([close button image]) performs a single click on the first close button found on the screen. Modifier keys such as Ctrl and Command can be passed as a second argument.

dragDrop(Region target, Region destination): This command drags the element in the center of a target region and drops it in the center of a destination region. For example, dragDrop([Word icon image], [recycle bin image]) drags a Word icon and drops it in the recycle bin.

type(Region target, String text): This command enters the given text in a target region by sending keystrokes to its center. For example, type([Google search box image], "UIST") types the string UIST in the Google search box.

3.4.4 Region

The Region class provides an abstraction for the screen region(s) returned by the find() function matching a given visual pattern. Its attributes are x and y coordinates, height, width, and similarity score. Typically, a Region object represents the top match; for example, r = find([image]) finds the region most similar to the image and assigns it to the variable r. When used in conjunction with an iterative statement, a Region object represents an array of matches. For example, for r in find([image]) iterates through an array of matching regions, and the programmer can specify what operations to perform on each region represented by r. Another use of a Region object is to constrain the search to a particular region instead of the entire screen. For example,

find([dialog box image]).find([OK button image]) constrains the search space of the second find() for the OK button to only the region occupied by the dialog box returned by the first find().
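The attributes listed for a region, and the idea of confining a second search to the area of a first match, can be sketched in plain Python as follows. RegionSketch and find_within are hypothetical names, the candidate regions are hard-coded rather than produced by image matching, and none of this is Sikuli's own Region class:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionSketch:
    """Rectangle on the screen plus the similarity score of the match."""
    x: int
    y: int
    width: int
    height: int
    score: float = 1.0

    def center(self):
        # Actions such as click are described as targeting the region's center.
        return (self.x + self.width // 2, self.y + self.height // 2)

    def contains(self, other):
        return (
            self.x <= other.x
            and self.y <= other.y
            and other.x + other.width <= self.x + self.width
            and other.y + other.height <= self.y + self.height
        )


def find_within(outer, candidates):
    """Return the best-scoring candidate lying entirely inside outer, or None."""
    inside = [region for region in candidates if outer.contains(region)]
    return max(inside, key=lambda region: region.score) if inside else None


dialog = RegionSketch(100, 100, 200, 120)
ok_in_dialog = RegionSketch(220, 180, 60, 24, score=0.90)
ok_elsewhere = RegionSketch(10, 10, 60, 24, score=0.95)

target = find_within(dialog, [ok_in_dialog, ok_elsewhere])
print(target.center())  # (250, 192)
```

Although the button outside the dialog has the higher score, it is ignored because the search space is limited to the dialog's rectangle.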


EXTENSIONS

4.1 How to Download and use:

The IDE supports downloading extensions through the menu Tools > Extensions. A popup lists the available and already installed extensions and lets you download new packages or updates for installed ones.

This popup shows a new package not yet installed:

If you need more information about the features of the extension, just click More Info - this

will open the related documentation from the web in a browser window.

If you want to install the extension, just click the Install... button. The package will be

downloaded and added to your extensions repository.

This popup shows an installed package:

If a new version is available, the Install... button becomes active again and shows the new version number. You can then click it to download the new version.

How to Use an Extension

To use the features of an installed extension in one of your scripts, just write from extension-name import *. For a usage example, read Sikuli Guide.

For information about features, usage and API, use the menu Tools > Extensions > More Info in the IDE.


4.2 How to develop an extension

The source structure of an extension named extension-name looks like this:

Java
- org/com
-- your-organization-or-company
--- extension-name
---- yourClass1.java
---- yourClass2.java
---- .... more classes
python
- extension-name
-- __init__.py
-- extension-name.py

The final structure of a JAR (filename extension-name-X.Y where X.Y is the version

string) looks like this:

org/com
- your-organization-or-company
-- extension-name
--- yourClass1.class
--- yourClass2.class
--- .... more classes
extension-name
- __init__.py
- extension-name.py
META-INF
- MANIFEST.MF

The file __init__.py contains at least from extension-name import * to avoid one qualification level. So in a script you might either use:

import extension-name
extension-name.functionXYZ()

or:

from extension-name import *
functionXYZ()

The second case requires more investment in a naming convention that avoids naming conflicts.

The file extension-name.py contains the classes and methods that represent the API one might use in a Sikuli script.

As an example, you may take the source of the extension Sikuli Guide.
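The role of __init__.py in removing one qualification level can be tried out with a throwaway package. This sketch runs in plain Python 3, so it differs from the Jython setup in two labelled ways: the package is called my_extension (a hypothetical name; a real Python module name cannot contain the hyphen of the placeholder extension-name), and __init__.py uses the explicit relative form from .my_extension import *, which Python 3 requires.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as root:
    package = Path(root) / "my_extension"
    package.mkdir()
    # The module that holds the extension's API.
    (package / "my_extension.py").write_text(
        "def functionXYZ():\n    return 'called functionXYZ'\n"
    )
    # Re-export the API at package level to avoid one qualification level.
    (package / "__init__.py").write_text("from .my_extension import *\n")

    sys.path.insert(0, root)
    importlib.invalidate_caches()
    try:
        # Style 1: import the package and qualify the call.
        module = importlib.import_module("my_extension")
        qualified_result = module.functionXYZ()

        # Style 2: star-import, then call the function unqualified.
        namespace = {}
        exec("from my_extension import *", namespace)
        unqualified_result = namespace["functionXYZ"]()
    finally:
        sys.path.remove(root)

print(qualified_result, unqualified_result)
```

Without the re-export in __init__.py, the first style would need the longer my_extension.my_extension.functionXYZ().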


4.3 How to test extension

While developing your extension, you can put the JAR file in Sikuli’s extensions directory or in the same .sikuli folder as your test script. The JAR file should not have a version number in its file name, e.g. extension-name.jar. Because Sikuli searches for extensions first in the .sikuli folder of the running script and then in the Sikuli extensions folder, it is usually a good idea to put an extension under development in the .sikuli folder of your test script.

Another option is to use the load() function with an absolute path to your extension-name.jar. If this fails, Sikuli goes on searching in the current .sikuli folder and then in the Sikuli extensions folder. If load() succeeds, it returns True and puts absolute-path-to-your-extension-name.jar into sys.path, so you can use import extension-name afterwards.
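The search order described above (absolute path first, then the script's .sikuli folder, then the extensions folder) can be modelled with a few lines of plain Python. This is a sketch of the described order only, not Sikuli's load() implementation; resolve_extension, the folder names, and my-extension.jar are hypothetical.

```python
import os
import tempfile


def resolve_extension(jar_name, script_folder, extensions_folder, absolute_path=None):
    """Return the first existing location of the JAR, or None if it is nowhere."""
    candidates = []
    if absolute_path is not None:
        candidates.append(absolute_path)
    candidates.append(os.path.join(script_folder, jar_name))
    candidates.append(os.path.join(extensions_folder, jar_name))
    for candidate in candidates:
        if os.path.isfile(candidate):
            return candidate
    return None


with tempfile.TemporaryDirectory() as workspace:
    script_folder = os.path.join(workspace, "test.sikuli")
    extensions_folder = os.path.join(workspace, "extensions")
    os.mkdir(script_folder)
    os.mkdir(extensions_folder)

    # Only the installed copy exists: it is the one that gets picked up.
    installed = os.path.join(extensions_folder, "my-extension.jar")
    open(installed, "w").close()
    found_installed = resolve_extension(
        "my-extension.jar", script_folder, extensions_folder) == installed

    # A development copy next to the test script takes precedence.
    in_development = os.path.join(script_folder, "my-extension.jar")
    open(in_development, "w").close()
    found_development = resolve_extension(
        "my-extension.jar", script_folder, extensions_folder) == in_development

    # A wrong absolute path falls back to the folder search.
    wrong_path = os.path.join(workspace, "nowhere", "my-extension.jar")
    fell_back = resolve_extension(
        "my-extension.jar", script_folder, extensions_folder,
        absolute_path=wrong_path) == in_development

print(found_installed, found_development, fell_back)
```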


WORKING PROCEDURE

5.1 How It Works

(Fig. Working of Sikuli Script)

Saving

.sikuli (Recognized as source code and opened in editor)

Consists of python file (.py) and all (.png) images used.

Also creates (.html) file for easy web sharing.

Executing

.skl (Executable script, zipped .sikuli directory)

Recognized and run without opening IDE

org.python.util.PythonInterpreter Created

Headers passed to handle Jython

Execution of .py code
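Since a .skl file is described above as a zipped .sikuli directory, the packaging step can be sketched with Python's standard zipfile module. The exact internal layout Sikuli expects inside a .skl file is not given in this report, so keeping the folder name as a prefix inside the archive is an assumption, as are the function name and the file names:

```python
import tempfile
import zipfile
from pathlib import Path


def zip_script_folder(sikuli_dir, archive_path):
    """Zip every file of a .sikuli folder into archive_path and return that path."""
    sikuli_dir = Path(sikuli_dir)
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(sikuli_dir.rglob("*")):
            if path.is_file():
                # Assumed layout: entries keep the "<name>.sikuli/" prefix.
                arcname = path.relative_to(sikuli_dir.parent).as_posix()
                archive.write(str(path), arcname)
    return archive_path


with tempfile.TemporaryDirectory() as workspace:
    script = Path(workspace) / "demo.sikuli"
    script.mkdir()
    (script / "demo.py").write_text("popup('Hello')\n")
    (script / "ok_button.png").write_bytes(b"placeholder, not a real image")
    (script / "demo.html").write_text("<html></html>\n")

    archive_path = zip_script_folder(script, str(Path(workspace) / "demo.skl"))
    with zipfile.ZipFile(archive_path) as archive:
        packaged = sorted(archive.namelist())

print(packaged)
```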


Java Library Core

Java.awt.Robot: Keyboard & Mouse handling

C++ Engine: Image pattern recognition

Jython Encapsulation

End-User commands

Python (Jython) Interpreter in a Java Runtime Environment.

Scope: Static (lexical)

Memory Management / Variables & Bindings:

Heap dynamic: all objects and data structures

Handled by the interpreter, no user control

In CPython, malloc(), realloc(), free() etc. can be called by importing the C library, but this results in mixed calls between the C allocator and the Python memory manager

Garbage Collection:

Reference counting in CPython; under Jython, objects are reclaimed by the Java Virtual Machine's garbage collector

Data Types & Type Checking

No static type checking: data types exist, but a name can be rebound freely to an object of any type

Methods can require a specific type, which is then checked at run time

Comments

“#” starts a comment in Python
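The points about type checking can be seen in a few lines of ordinary Python; the function name checked_length is made up for the illustration:

```python
# A name is just a binding: it can be rebound to an object of another type.
value = 42
first_type = type(value).__name__
value = "forty-two"
second_type = type(value).__name__


def checked_length(text):
    """A function that insists on a string and verifies it when called."""
    if not isinstance(text, str):
        raise TypeError("expected a string")
    return len(text)


length = checked_length("Sikuli")

# The type error surfaces only at run time, when the call is executed.
try:
    checked_length(5)
    rejected = False
except TypeError:
    rejected = True

print(first_type, second_type, length, rejected)
```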


APPLICATION

6.1 Application of Sikuli Script

Sikuli Script can be used to execute repeated tasks, such as changing system settings every time the user logs in or doing the same kind of work every day. Sikuli Script can also monitor the screen for a particular event and run a specific task when that event occurs. Last but not least, Sikuli Script can perform any task that a user can do on the computer, on one condition: the task must involve pixels that Sikuli can see. If it has pixels, it can be automated by Sikuli Script.

6.2 Sikuli Script Examples

We present six example scripts to demonstrate the basic features of Sikuli Script. For

convenience in Python programming, we introduce two variables: find.region and

find.regions, that respectively cache the top region and all the regions returned by the last call

to find. While each script can be executed alone, it can also be integrated into a larger Python

script that contains calls to other Python libraries and/or more complex logic statements.

6.2.1 Minimizing All Active Windows

6.2.2 Deleting Documents of Multiple Types

6.2.3 Tracking Bus Movement

6.2.4 Navigating a Map

6.2.5 Responding to Message Boxes Automatically

6.2.6 Monitoring a Baby


EXAMPLES

6.2.1 Minimizing All Active Windows:

This script minimizes all active windows by calling find repeatedly in a while loop (1) and

calling click on each minimize button found (2), until no more can be found.
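The control flow of this script (search, click, repeat until nothing is found) can be simulated without a real screen. In the sketch below the screen is a list of labels and fake_find and fake_click are hypothetical stand-ins for Sikuli's find and click, so only the loop structure corresponds to the description:

```python
def fake_find(screen, label):
    """Return the position of the first element with this label, or None."""
    for position, element in enumerate(screen):
        if element == label:
            return position
    return None


def fake_click(screen, position):
    """Simulate the effect of clicking a minimize button: the window goes away."""
    screen[position] = "minimized"


screen = ["minimize_button", "desktop_icon", "minimize_button", "minimize_button"]
clicks = 0

while True:
    target = fake_find(screen, "minimize_button")  # step (1): look for a button
    if target is None:
        break                                      # no more buttons: stop
    fake_click(screen, target)                     # step (2): click it
    clicks += 1

print(clicks, screen)
```

The loop terminates because every click removes one match from the screen, which is the same reason the real script stops once all windows are minimized.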

6.2.2 Deleting Documents of Multiple Types:

This script deletes all visible Office files (Word, Excel, PowerPoint) by moving them to the

recycle bin. First, it defines a function recycleAll() to find all icons matching the pattern of a

given file type and move them to the recycle bin (1-3). Since icons may appear in various

sizes depending on the view setting, anySize is used to find icons of other sizes (2). A for

loop iterates through all matching regions and calls dragDrop to move each match to the

recycle bin (3). Next, an array is created to hold the patterns of the three Office file types (4)

and recycleAll() is called on each pattern (5-6) to delete the files. This example demonstrates

Sikuli Script’s ability to define reusable functions, treat visual patterns as variables, perform

fuzzy matching (anySize), and interact with built-in types (array).


6.2.3 Tracking Bus Movement:

This script tracks bus movement in the context of a GPS-based bus tracking application. Suppose a user wishes to be notified when a bus is just around the corner so that the user can head out and catch the bus. First, the script identifies the region corresponding to the street corner (1). Then, it enters a while loop and tries to find the bus marker inside the region every 60 seconds (2-3). Notice that about 30% of the marker is occupied by the background, which may change as the marker moves. Thus, the similar pattern modifier is used to look for a target 70% similar to the given pattern. Once such a target is found, a popup is shown to notify the user that the bus is arriving (4). This example demonstrates how Sikuli Script can help with everyday tasks.
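The polling logic of this script (check, wait 60 seconds, check again, accept a 70% match) can be sketched independently of any screen capture. Everything here is simulated: read_similarity stands in for "how similar is the best match in the street-corner region", the list of readings is invented for the demonstration, and the sleep function is injectable so the example finishes instantly.

```python
import time


def wait_for_marker(read_similarity, threshold=0.7, interval_seconds=60,
                    max_checks=10, sleep=time.sleep):
    """Poll until a reading reaches the threshold; return the check number or None."""
    for check in range(1, max_checks + 1):
        if read_similarity() >= threshold:
            return check
        sleep(interval_seconds)
    return None


# Simulated similarity scores for three successive checks of the region.
readings = iter([0.10, 0.35, 0.72])
waits = []  # records the requested pauses instead of really sleeping

found_on_check = wait_for_marker(lambda: next(readings), sleep=waits.append)

if found_on_check is not None:
    print("Bus is arriving (detected on check %d)" % found_on_check)
```

With a strict threshold of 1.0 the third reading would be rejected, which mirrors why the script relaxes the match to 70% when part of the marker is changing background.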

6.2.4 Navigating a Map:

This script automatically navigates east to Houston following Interstate 10 on the map (by

dragging the map to the left). A while loop repeatedly looks for the Interstate 10 symbol and

checks if a string Houston appears nearby (1). Each time the string is not found, the position

100 pixels to the left of the Interstate 10 symbol is calculated and the map is dragged to that

position (3), which in effect moves the map to the east. This movement continues until the

Interstate 10 can no longer be found or Houston is reached.


6.2.5 Responding to Message Boxes Automatically:

This script generates automatic responses to a predefined set of message boxes. A screenshot

of each message box is stored in a visual dictionary d as a key and the image of the button to

automatically press is stored as a value. A large number of message boxes and desired

responses are defined in this way (1-100). Suppose the win32gui library is imported (101) to

provide the function getActiveWindow(), which is called periodically (102) to obtain the

handle to the active window (103). Then, we take a screenshot by calling getScreenshot()

(104) and check if it is a key of d (105). If so, this window must be one of the message boxes

specified earlier. To generate an automatic response, the relevant button image is extracted

from d (106) and the region inside the active window matching the button image is found

and clicked (107). This example shows that Sikuli Script can interact with any Python library to accomplish tasks that neither can do alone.


6.2.6 Monitoring a Baby:

This script demonstrates how visual scripting can go beyond the realm of the desktop to interact with the physical world. The purpose of this script is to monitor for baby rollover through a webcam that streams video to the screen. A special green marker is placed on the baby’s forehead. By periodically checking whether the marker is present (1-2), the script can detect baby rollover when the marker is absent and issue a notification (3).


EVALUATION OF SIKULI SCRIPT

7.1 Evaluation:

To evaluate Sikuli Test, we performed a testability analysis (how diverse the visual behaviors are that GUI testers can test automatically) and a reusability analysis (how likely it is that testers can reuse a test script as a GUI evolves).

7.2 Testability Analysis:

We performed testability analysis on a diverse set of visual behaviors. Each visual behavior can be defined as a pairing of a GUI widget and a visual effect rendered on it. We considered 27 common widgets (e.g., button, check box, slider) and 25 visual effects (e.g., appearance, highlight, focus). Out of the 675 possible pairings, we identified 368 to be valid, excluding those that are improbable (e.g., scrollbar + font changing). We began the analysis by applying Sikuli Test to the visual behavior exhibited by four real GUI applications (1: Capivara, 2: jEdit, 3: DrJava, and 4: System Preferences on Mac OS X).

Table 1 summarizes the result of the testability analysis. Each cell corresponds to a visual behavior. Out of 368 valid visual behaviors, 139 (indicated by the number of the application used for testing) are empirically testable: the visual behavior was found in the four applications and could be tested; 181 (indicated by a triangle) are theoretically testable: the visual behavior was not found in the four applications but could be inferred from the testability of other similar visual behaviors; and 48 (indicated by an “F”) are not testable.


In addition to these valid visual behaviors, there are 307 rarely paired, improbable visual behaviors, indicated by an “X”. As can be seen, the majority of the valid visual behaviors considered in this analysis can be tested by Sikuli Test. However, complex visual behaviors such as those involving animations (e.g., fading) are currently not testable, which is a topic for future work.
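The counts quoted in this analysis are mutually consistent, which a short calculation confirms. All numbers are taken from the text above; only the variable names and the derived percentage are additions:

```python
widgets = 27
effects = 25
possible_pairings = widgets * effects  # every widget paired with every effect

empirically_testable = 139
theoretically_testable = 181
not_testable = 48
valid_pairings = empirically_testable + theoretically_testable + not_testable

improbable_pairings = possible_pairings - valid_pairings

# Share of valid behaviors that are testable, empirically or theoretically.
testable_share = (empirically_testable + theoretically_testable) / valid_pairings

print(possible_pairings, valid_pairings, improbable_pairings)
print("testable share: %.1f%%" % (100 * testable_share))
```

The three categories add up to the 368 valid pairings, the remainder matches the 307 improbable ones, and roughly 87% of the valid behaviors fall into one of the two testable categories.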

7.3 Reusability Analysis:

We performed reusability analysis of test scripts based on two real GUI applications: Capivara, a file synchronization tool, and jEdit, a rich-text editor. These two applications were selected from SourceForge.net using two criteria: each must have a GUI, and each must have at least 5 major releases available for download.


LIMITATIONS

Sikuli Script has two main limitations:

1. Theme variations: Many users prefer a personalized appearance theme with

different colors, fonts, and desktop backgrounds, which may pose challenges

to a screenshot search engine. Possible solutions would be to tinker with the

image-matching algorithm to make it robust to theme variation or to provide a

utility to temporarily switch to the default theme whenever users wish to

search for screenshots. UI automation is less affected by theme variations

when users write scripts to run on their own machines. However, sharing

scripts across different themes may be difficult. Possible solutions would be to

derive a conversion function to map patterns between themes or to require

users to normalize the execution environment by switching to the default

theme when writing sharable scripts.

2. Visibility constraints: Currently, Sikuli Script operates only in the visible screen space and thus is not applicable to invisible GUI elements, such as those hidden underneath other windows, in another tab, or scrolled out of view. One solution would be to automate scrolling or tab switching actions to bring the GUI elements into view and interact with them visually. Another solution would resort to platform- or application-specific techniques to obtain the full contents of windows and scrolling panes, regardless of their visibility.


FUTURE DEVELOPMENT

Automated scrolling or tab switching to bring GUI elements into view so that they can be interacted with visually

Fast and accurate on-screen OCR

Accessibility API integration


SUMMARY AND CONCLUSION

Platform Independence:

Works on ANY GUI that can be displayed on Windows/Linux/Mac

Virtual machines

Remote desktop

Mobile simulators: Android, iPhone

Web: Flash, HTML+Javascript

Program Against UI:

Sikuli programs are written against the user interface instead of an API

UI: visible, familiar, always exists

API: faster, probably more stable

Readability of test cases:

The semantic gap between the test scripts and the test tasks automated by the

scripts is small. It is easy to read a test script and understand what GUI feature

the script is designed to test.


WHO IS USING ?


REFERENCES

http://sikuli.org

http://blog.sikuli.org/

http://hcc.cc.gatech.edu/documents/

http://twitter.com/#!/sikuli

http://groups.csail.mit.edu/uid/sikuli/

http://sikuli.csail.mit.edu/demo.shtml

http://downloadsquad.switched.com/2010/01/30/sikuli-uses-screen-shots-to-run-scripts-is-amazing/

http://www.makeuseof.com/tag/create-automation-scripts-easily-screenshots/

T. Yeh, T.-H. Chang, and R. C. Miller. Sikuli: Using GUI screenshots for search and automation. In UIST ’09, pages 183–192. ACM, 2009.

http://hcc.cc.gatech.edu/documents/104_Edwards_week2.pdf