
Project Work

Andre Da Cruz Guerreiro

GUI Ripping of iOS Mobile Applications

April 26, 2016

supervised by:

Prof. Dr. Sibylle Schupp

Hamburg University of Technology (TUHH)
Technische Universität Hamburg-Harburg
Institute for Software Systems
21073 Hamburg

Contents

1 Introduction

2 Background
  2.1 GUI Testing
  2.2 GUI Ripping
  2.3 iOS Programming Concepts
    2.3.1 iOS Application Structure
    2.3.2 iOS Event Handling

3 Approach
  3.1 Definitions
    3.1.1 GUI Elements
    3.1.2 GUI Model
  3.2 Ripper Algorithm
  3.3 Differences to Original Approach

4 Design and Implementation
  4.1 Overview
  4.2 GUI Model Structure
  4.3 State Extraction
    4.3.1 Visibility
    4.3.2 Extraction Algorithm
  4.4 Executable Event Identification
  4.5 State Comparison
  4.6 Next Events
  4.7 Event Synthesis
    4.7.1 Xcode UI Testing Framework
    4.7.2 Widget-Proxy Correlation

5 Conclusion

6 Future Work


1 Introduction

In recent years there has been a staggering rise in the adoption of mobile computing devices [13]. In 2016 mobile device shipments will reach 2.6 billion units [24]. In fact, multiple studies [18] [14] [12] suggest that the sales of mobile devices have long since outpaced the sales of personal computers. Moreover, the usage of mobile devices in general and native mobile applications in particular, compared to that of personal computers, is even more overwhelming: smartphone usage was up 394 percent and tablet usage was up 1721 percent in 2015 [21] [22], while personal computer usage is decreasing. It is reasonable to consider mobile applications ubiquitous in today's society. Currently, more than 1.6 million mobile applications are offered in the Android Play Store and more than 1.5 million in the Apple App Store. While Android has a significantly higher market share, several studies indicate that iOS devices and their applications are used far more often [17] [16] and also attract more paying customers [19] [15] [20]. A very recent study supports this by pointing out that, despite having half as many downloads as the Android Play Store, the iOS App Store brings in nearly twice as much revenue [23]. This makes the iOS platform an indispensable component for many businesses in the digital industry and encourages a high standard of quality.

Needless to say, together with the incredible increase of mobile devices and applications, the mobile development industry also saw a strong upturn [44]. At the same time, the quickly changing demands of the mobile market [36] led to a strong preference for fast-paced small teams, and naturally, agile methodologies like Scrum have established themselves as a very popular choice of work method [30]. This means that a typical mobile development team has limited resources for manual testing, which conflicts with the high application quality aimed for in order to prevail amid the strong competition. Automated testing is a possible solution to this dilemma and, hardly coincidentally, belongs to the fundamentals of most agile methodologies.

With the multitude of continuous integration technologies available for virtually all development platforms, automated execution of tests is mostly a solved problem and widely used [9]. Automated creation of tests, on the other hand, is an essentially untouched issue in the industry [28] [39]. Indeed, to the best of our knowledge, there are no tools that automatically generate test cases for GUI testing of iOS applications. Whereas model-based testing is a recognized approach to automatically generating test cases from a model and has seen a strong rise in popularity in the research sector for quite some time, its adoption for automated GUI testing in the industry is negligible [28]. This huge gap between research and industry becomes apparent when comparing the number of approaches proposed for various platforms, including the desktop [40] [43], the web [42] [31], Android [26] [46] [25] and iOS [35], to their very modest adoption in the industry.

As mentioned above, there exists one published paper addressing the iOS platform, describing a method able to automatically generate a model of the graphical user interface of a certain set of iOS applications. Unfortunately, it considers only a limited number of GUI elements and user interactions, and is not very adaptable to various applications and specific needs. These characteristics make it unsuitable for application in the industry. The objective of this work is to define an abstract approach for the automatic model generation of iOS application GUIs, to comprehensively analyze every integral part of that approach with regard to the iOS platform, and to design and build a foundation for an efficient, extensive and customizable tool that hopefully is the first step in paving the way for automated GUI test case generation that is actually applicable in real iOS development. Specifically, this paper makes the following contributions:

• A model for mobile application GUIs, as well as an approach to dynamically reverse engineer that model.

• A comprehensive analysis of the challenges, requirements and needs the iOS platform entails for the various integral parts of that approach.

• Methods to realize all these parts for the iOS platform:
  – A method to extract the GUI's state, which includes the visibility detection of its GUI elements.
  – A method to identify executable GUI elements that is customizable and extendable.
  – A heuristic method to compare states that is also customizable.
  – A GUI exploration strategy.
  – A method to hook into an application and dynamically explore it.
  – A method to convert a path in the GUI model to a test case that is executable with the official Xcode UI Testing framework.

• A tool that partially implements these approaches.

This paper is divided into six chapters. This introduction is followed by a background chapter, which establishes the conceptual foundations of this work by giving an overview of the GUI testing domain in section 2.1, introducing the concept of GUI ripping and describing approaches for various platforms in section 2.2, and presenting the general programming concepts of the iOS platform in section 2.3. In chapter 3, we first define all the elements of a mobile application GUI in 3.1.1, then propose a model capable of representing such a GUI in 3.1.2, as well as an approach to dynamically reverse engineer it from mobile applications in 3.2. The final section of that chapter points out the differences between the proposed approach and the original approach described in [40]. Chapter 4 first gives an overview, in section 4.1, of the individual components that need to be addressed when designing an implementation of our approach. Subsequently, the design and some implementation details for these components are provided. This includes details about the data structure of the model in 4.2, the extraction algorithm in 4.3, the event identification in 4.4, the state comparison in 4.5, the exploration strategy in 4.6 and the event synthesis in 4.7. In chapter 5 we come to a conclusion, and finally the last chapter describes an evaluation experiment that should be carried out as an immediate next step.


2 Background

This chapter gives an introduction to the topic and theoretical background of GUI testing, GUI ripping and the iOS platform. First, section 2.1 introduces the field of GUI testing by providing a definition, describing the most common approaches, naming some of the challenges and noting the extent of its application. After that, in section 2.2, we give an overview of and reflect on related work in the domain of GUI ripping for several platforms. Finally, for a better understanding of our approach, the last section explains the general programming concepts related to the structure and functionality of iOS applications.

2.1 GUI Testing

GUI testing is a subdomain of the very broad testing domain and concentrates on tests that interact with the GUI of an application. GUI tests mainly assess the following two criteria:

1. The GUI meets its specifications.

2. The application’s intended functionality can be executed correctly using the GUI.

The first point includes, for example, checking that all elements have the correct size and position (even when displayed on screens with different resolutions), that text has the correct font and alignment, that images are clear, and that colors are aesthetically pleasing. The second point includes functional tests that are executed by triggering events on the actual GUI elements. Considering that GUI tests are the closest a test can get to actually using an application, they are arguably one of the best ways to do functional testing.

There are different approaches to carrying out GUI tests. The most common ones are:

• Manual: Tests are executed manually by a tester in conformance with a checklist of the requirements.

• Random: The GUI is tested by randomly executing events on GUI elements. This mainly makes sense for smoke tests, but can also be used to verify globally applying assertions.

• Scripted: For every test case, the tester writes a script that executes a chain of events on GUI elements and verifies specified assertions.

• Record-Replay: A tester uses a tool to record a test case, which the tool then replays when executing the test, verifying specified assertions.

• Model-Based: A model of the application's GUI is created and test cases are deduced from it. Model-based testing spans a wide-ranging area. Each of its components, whether creating the model, deducing test cases from the model or executing deduced test cases, has a variety of approaches on the full spectrum from entirely manual to completely automated.

Two major challenges are recognizable in GUI testing. First, the often radical and fast-paced changes in GUIs make it difficult and expensive to maintain tests. Second, certain behavior is only reached through complex combinations of actions. For example, when presented with a login form, first the username text field has to be selected, a username has to be entered, the password text field has to be selected, the password has to be entered, and only then can the login button be tapped to proceed to the next screen. This is referred to as the sequencing problem. It makes random GUI testing almost useless, and it makes manual, scripted and record-replay testing, as well as any manual part of model-based testing, very work intensive and tedious. Both challenges and the considerable effort they cause can only be tackled by automation. In effect, this means that fully automated model-based testing appears to be the most promising approach to tackling the major challenges in GUI testing. However, model-based testing comes with its own problems that need to be addressed. The biggest are the sequencing problem mentioned before and the number of possible paths, which grows exponentially with the length of the sequences for which paths are to be generated. Another problem is the prevention of a state explosion, since many GUI elements can be manipulated in an infinite number of ways, e.g. a text field in which any combination of characters can be entered.
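To make the exponential growth of paths concrete, the following minimal sketch counts event sequences for a hypothetical GUI in which every state offers the same number of executable events. The function name and the numbers are illustrative assumptions, not measurements from this work.

```python
def count_event_sequences(events_per_state: int, max_length: int) -> int:
    """Count all event sequences of length 1..max_length.

    Assumes (hypothetically) that every GUI state offers the same number
    of executable events k, so there are k**n sequences of length n.
    """
    return sum(events_per_state ** n for n in range(1, max_length + 1))


if __name__ == "__main__":
    # Ten events per state: the count grows by a factor of ten per step.
    for length in (1, 3, 6):
        print(length, count_event_sequences(10, length))
```

Even under this simplified assumption, exhaustive path generation quickly becomes infeasible, which is why the length of generated sequences has to be bounded in practice.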

Even though model-based testing is the most promising approach, it is remarkably neglected in the industry. While manual GUI testing is performed at practically every company developing mobile applications [35], and many tools allowing random [33] [11], scripted [1] [8] [7] or record-replay [10] [3] [6] testing exist, there are very few tools for model-based testing, if any (depending on platform and programming language), in use in the industry [28] [39]. In contrast, the popularity of model-based GUI testing has been rising very rapidly for quite some time in the research area [28].

2.2 GUI Ripping

As mentioned in the previous section, there is a multitude of research projects involving model-based testing. This section describes some of the most relevant to this work, all of which belong to the GUI ripping domain. GUI ripping is the process of reverse engineering the GUI of an application by systematically triggering events, thereby completely exploring its state space, and creating a model of the GUI from the collected information. It plays an essential role in the advancement of fully automated GUI testing, as it automates the creation of the GUI model, which is the enabler for all other components in the GUI testing workflow.

GUI ripping was first introduced by Memon et al. in [40], who presented an approach to automatically reverse engineer desktop applications. A detailed description of this approach, which also forms the basis for this work, is given later in chapter 3. Around their ripper the authors further created a GUI testing framework, called GUITAR, containing additional GUI testing tools. Over the years the same group of researchers from the University of Maryland has extended GUITAR to include the complete range of tools for model-based GUI testing [43] [26] [34], by developing, steadily improving and evaluating methods for test case generation and execution [48] [47] [38] [49] [27] and test oracles [45] [41], as well as addressing several other challenges in model-based GUI testing [37] [29] [32]. A decade after their original GUI ripping paper, they published a secondary study [39] that provides an overview of the work published in the GUI ripping domain, summarizes the advancements made and discusses the broader impact that GUI ripping had on GUI testing, model-based testing, various other testing areas, and even outside the testing domain.

Mesbah et al. applied the same principle to tackle the challenge of the largely collapsed concept of web pages with unique URLs, which arose with the increasing significance of AJAX in web applications. In order to test the dynamic states of web applications, a new method to infer a web application's model was necessary. In [42] an approach is presented in which the DOM tree is scanned for promising executable elements that probably change the state of the application, events are triggered on those elements, and the discovered state changes are incorporated into the resulting model. Motivated by the lack of satisfactory search in dynamic web application content, the authors of [31] had previously proposed a similar approach to extract a model of AJAX applications. The main difference between the two approaches is the method applied for identifying executable elements: while in [42] the DOM is inspected, in [31] the underlying JavaScript source code is analyzed.

The authors of [25] first adapted the GUI ripper concept to Android mobile applications. Their approach automatically explores the GUI by triggering events on its elements and creates a GUI tree with the distinct GUI states as nodes and the triggered events that segue between them as edges. However, their method is limited to certain supported GUI elements and events, and their strategy to explore the GUI and derive test cases is preset and cannot be adjusted to satisfy alternative exploration requirements. [26], on the other hand, presents an approach and tool that allow the ripping process to be adapted to the specific application under test (AUT) or to testing objectives by providing the possibility to tweak parameters that guide the ripping behavior. Although their tool supports the customization of the ripping process, it does not output a model of the application's GUI, but instead generates and executes test cases during the ripping process. All of the approaches and tools mentioned thus far are incapable of deriving executable GUI elements for the user interactions supported by the AUT's platform. Instead they limit the GUI exploration to a basic subset of interactions that must be supplied to the tools. The work done in [46] addresses this shortcoming by using a mixture of dynamic GUI crawling and static analysis of the AUT's source code. In the first step of their approach, the source code is analyzed to extract a list of user interactions handled by the AUT, in order to then systematically rip the GUI's model similarly to previous approaches, but with a complete event input set. A further improvement over other approaches is the efficacy of their Android ripping tool, which results from restricting event triggers to elements that actually handle the event.

For the iOS platform the situation is quite different with, to the best of our knowledge, only one published approach to date. The limited attention dedicated to GUI ripping in this domain may be caused by the “idiosyncrasies of the iOS platform”, as put by the authors of [46]. Unsurprisingly, the focus of the paper [35] is on overcoming the challenges of implementing a ripper for iOS mobile applications. Specifically, this work addresses how to automatically exercise the GUI, since at the time of its publication no officially supported method existed. To solve this, an open source testing framework called KIF [4] is utilized, which makes use of undocumented iOS APIs to enable the automation of tests. With the foundation that KIF provides, the authors manage to hook into the running AUT with a combination of code injection, method swizzling and reflection, as well as to simulate user interaction events. In this manner the GUI state space is explored, similarly to the previously described approaches, and a GUI tree model is created. The authors do not aspire to cover the complete set of GUI elements and user interactions, which is why their tool limits these to the most common ones when exploring the GUI.

2.3 iOS Programming Concepts

iOS is the operating system running on iPhones, iPod touches and iPads. Cocoa Touch is a collection of frameworks that serves as an abstraction layer for iOS and other services running on top of iOS. It allows developers to write applications for the platform. Cocoa Touch is mainly written in Objective-C but provides its APIs for both Objective-C and Swift. UIKit is one of the main frameworks of Cocoa Touch and provides APIs to create the GUI of iOS applications. In the following, the application structure and the event handling of iOS applications, as implemented by the UIKit framework, are described.

2.3.1 iOS Application Structure

An iOS application follows the Model-View-Controller (MVC) pattern. MVC ensures some separation of concerns, where one part of the program is responsible for managing the information and implementing information-related logic (model), a second part is responsible for the representation of information through the GUI (view), and a third part is responsible for serving as an intermediary between model and view, implementing the logic to convert input from the view to commands for the model and updating the view according to changing model states.

The UIKit framework implements the fundamentals of MVC's view part with the UIView class, along with the fundamentals of the controller part with the UIViewController class. The model part is not relevant for this work. This means that in an iOS application every GUI element, with very few exceptions, is an instance of UIView or one of its subclasses. Some typical iOS GUI elements are visible in figure 2.1.

Figure 2.1: iOS GUI Element Examples

A UIViewController, the controller part of the MVC pattern, is responsible for a rectangular part of the screen, its root view. This UIView is the canvas for any GUI elements managed by this controller. The elements are added as sub views of the root view and can themselves have multiple sub views, as shown in figure 2.2. Besides the usual controller tasks, a UIViewController is also responsible for the creation and destruction of its views, as well as for managing transitions, when sub views of its root view change their state or move to another position, or when the entire root view transitions off the screen to make space for another view controller's root view.

Figure 2.2: View Controller Structure [2]

A UIViewController's root view can cover the entire screen, but multiple UIViewControllers, responsible for different parts of the screen, can also be composed with a so-called container view controller. Figure 2.3 shows a container view controller with two child view controllers, one responsible for the part of the screen denoted A and the other for the part denoted B. Both view controllers' root views are sub views of the container view controller's root view.

Figure 2.3: Container View Controller Structure [2]

UIKit comes with some default container and normal view controllers that solve common tasks. As an example, the most commonly used container is the UINavigationController. As visible in figure 2.4, the navigation controller manages multiple view controllers in a stack and displays the topmost controller's view together with a navigation bar on the screen. It also enables navigation through the stack by providing the possibility to push and pop view controllers in response to user interactions. In most instances, the interaction that pushes a new view controller onto the stack is with a GUI element belonging to the topmost view controller, which is currently presented on screen, while the interaction that pops the topmost view controller from the stack is with a back button displayed on the left side of the navigation bar.
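The stack behavior described above can be sketched with a small toy model. This is not UIKit code: the class and method names are hypothetical, view controllers are represented by plain strings, and the rule that the bottom controller is never popped is an assumption of this sketch.

```python
from typing import List, Optional


class NavigationStack:
    """Toy model of a navigation controller's view controller stack."""

    def __init__(self, root: str) -> None:
        # The root controller always stays at the bottom of the stack.
        self._stack: List[str] = [root]

    @property
    def top(self) -> str:
        """The controller whose view is currently displayed."""
        return self._stack[-1]

    def push(self, controller: str) -> None:
        """E.g. triggered by tapping an element of the topmost controller."""
        self._stack.append(controller)

    def pop(self) -> Optional[str]:
        """E.g. triggered by the back button; the root is never popped."""
        if len(self._stack) == 1:
            return None
        return self._stack.pop()
```

For a ripper, this matters because a push and the subsequent back-button pop form a pair of transitions that lead back to an already known state.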

Figure 2.4: Navigation Controller Structure [2]

Besides containment, it is also possible to modally present a normal or container view controller. When a view controller is presented modally, it slides in from the bottom and covers the entire screen until it is dismissed.

These are the building blocks of an iOS application, and they all come together in the UIApplication object. This singleton object is a centralized point of control for the application and serves as an interface between it and the operating system. Its main task is coordinating incoming events and dispatching them to the responsible party. The UIApplication also manages the key window, which is the canvas for an application's GUI. When an application is launched, the window's root view controller is set, which marks the start of the application's view controller hierarchy.

Figure 2.5 puts it all together with an example. When an application is launched, iOS creates a UIApplication object and informs it that it will appear on screen. The UIApplication then commands the creation of the application's UIWindow and its root view controller. In this example the root view controller is a container view controller that has multiple child view controllers. Each of them has a root view that it is responsible for, and these root views are assembled on the corresponding parts of the window. In this fashion the hierarchy of view controllers and the hierarchy of views grow according to the application's demands. In the example, the UINavigationController, which manages a stack of view controllers, assembles the topmost view controller's view with its navigation bar view. However, the root view controller ultimately presents another view controller modally, which ends up covering the entire screen.

Figure 2.5: iOS Application Structure [2]

2.3.2 iOS Event Handling

After the application has launched, iOS informs the UIApplication object of any events that occur. In iOS an event is represented by an object of the UIEvent class. An event belongs to one of the following event types:


• Touch: a touch or a multitouch (multiple touches at the same time) on the device's screen, e.g. tapping a button.

• Motion: a motion made with the device, e.g. shaking it.

• Press: a press on a hardware button of the device, e.g. pressing the volume up button.

• Remote: an event from a remote control device, e.g. pressing the play/pause button on a headset.

UIApplication dispatches the events to the responsible party, which can handle them accordingly. This is an object whose class inherits from UIResponder (as UIApplication, UIView and UIViewController do) and overrides at least one of the entry points that the UIResponder class specifies. To determine the responsible responder for an event, first a hit test (comparable to ray tracing) is performed at the position of its occurrence to determine the topmost UIView object at that position and thereby the GUI element the user interacted with. If this object does not override the corresponding responder methods mentioned above, the next object in the responder chain becomes responsible for handling the event. This goes on until an object that handles the event is found or the entire chain has been traversed. The responder chain is formed by another method of the UIResponder class, namely nextResponder. Its default implementation returns the next responder as follows for the following object classes:

• UIView, that is not a root view: its super view.

• UIView, that is a root view: its managing UIViewController.

• UIViewController: its root view’s super view.

• UIWindow: its UIApplication.

• UIApplication: nothing.
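The traversal of the responder chain can be illustrated with a minimal sketch. The Responder class below is a hypothetical stand-in for UIResponder, not the UIKit API; the set of handled event types stands in for the overridden entry points, and the hit test is assumed to have already selected the starting view.

```python
from typing import FrozenSet, Optional


class Responder:
    """Toy stand-in for UIResponder; names are illustrative only."""

    def __init__(self, name: str,
                 next_responder: Optional["Responder"] = None,
                 handles: FrozenSet[str] = frozenset()) -> None:
        self.name = name
        self.next_responder = next_responder
        self.handles = handles  # event types whose entry points are overridden


def find_handler(hit_view: Responder, event_type: str) -> Optional[Responder]:
    """Walk the chain from the hit-tested view to the first handling responder."""
    responder: Optional[Responder] = hit_view
    while responder is not None:
        if event_type in responder.handles:
            return responder
        responder = responder.next_responder
    return None  # the entire chain was traversed without finding a handler


# Chain: button -> root view -> view controller -> window -> application
app = Responder("UIApplication")
window = Responder("UIWindow", app)
controller = Responder("UIViewController", window, frozenset({"motion"}))
root_view = Responder("root UIView", controller)
button = Responder("UIButton", root_view, frozenset({"touch"}))
```

In this example a touch on the button is handled by the button itself, a motion event falls through to the view controller, and a press event reaches the end of the chain unhandled.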

However, UIKit also implements some helper classes to simplify the most common ways in which developers react to events. Handling events with such a helper is far more common than the raw handling described above. The first set of convenience classes are UIControl and its subclasses. All UIControl subclasses represent a GUI element with the purpose of enabling the user to exercise control over the application. UIButton, UISwitch, UISlider and UIDatePicker, visible in figure 2.1, are all examples of UIControl subclasses. Developers can assign target-action pairs to each UIControl object for certain control events, e.g. TouchDown, triggered as soon as the control is touched on the screen, TouchDragInside, triggered when dragging inside the control's view, or ValueChanged, triggered when a value of the control is changed, e.g. the month value in a UIDatePicker. When such a control event occurs on a UIControl object, the UIApplication executes the target-action pairs associated with it by telling the target to execute the corresponding action.
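The target-action mechanism can be sketched as a simple registry. The Control and LoginScreen classes and their method names are hypothetical and only mimic the idea; in particular, this sketch lets the control invoke the action directly instead of routing it through the UIApplication.

```python
from collections import defaultdict
from typing import DefaultDict, List, Tuple


class Control:
    """Toy stand-in for a UIControl; not the UIKit API."""

    def __init__(self) -> None:
        # control event name -> list of (target, action name) pairs
        self._pairs: DefaultDict[str, List[Tuple[object, str]]] = defaultdict(list)

    def add_target(self, target: object, action: str, control_event: str) -> None:
        self._pairs[control_event].append((target, action))

    def send_actions(self, control_event: str) -> None:
        """Tell every registered target to execute its action."""
        for target, action in self._pairs.get(control_event, []):
            getattr(target, action)(self)


class LoginScreen:
    """Hypothetical target object owning the action method."""

    def __init__(self) -> None:
        self.login_requested = False

    def login_touched(self, sender: Control) -> None:
        self.login_requested = True
```

A registry of this kind is also what makes executable controls identifiable: a control with at least one registered pair for a control event reacts to the corresponding user interaction.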



UIKit also implements some other classes that are not UIControls, but behave very similarly in the sense of saving developers the work of handling the raw events. Some examples are the very elemental UITableViewCell, which implements a cell in a table, or UITabBarItem and UIBarButtonItem, which implement the buttons in a tab bar and in a navigation or tool bar, respectively. All of them either implement a default behavior, like popping the topmost view controller when the back button in a navigation controller's navigation bar is tapped, or provide the developer with an entry point to handle the event with the aid of the delegate technique, like notifying the table's delegate when one of its table cells is selected.

Besides this, UIKit also provides the UIGestureRecognizer class, along with its subclasses, each of which deals with one of these specific gestures: tap, pinch, rotation, swipe, pan and long press. Gesture recognizers observe certain event sequences, recognize a predefined gesture and provide an entry point for developers to handle it. A gesture recognizer can be added to any UIView in order to detect a corresponding gesture performed on it. In the example of a pinch, the UIPinchGestureRecognizer recognizes the start of this gesture on a view as soon as two simultaneous touches occur on it, and the corresponding target is notified that the gesture began. Since a pinch is a continuous gesture, every time the two touches move toward or away from each other (typically signaling zoom-out and zoom-in) the target is notified of the state change of the gesture along with relevant parameters, in this case scale and velocity of the movement. When the touches end, the gesture also ends and the target is notified. Besides the default recognizers mentioned above, developers can also implement their own gesture recognizers by subclassing UIGestureRecognizer.
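The life cycle of a continuous gesture can be sketched as follows. This is a toy recognizer under simplifying assumptions, not the UIKit implementation: the callback signature and state names are hypothetical, touches are plain coordinate pairs, and velocity is omitted.

```python
import math
from typing import Callable, Optional, Tuple

Point = Tuple[float, float]


class PinchRecognizer:
    """Toy continuous-gesture recognizer reporting began/changed/ended."""

    def __init__(self, on_state_change: Callable[[str, Optional[float]], None]) -> None:
        self._notify = on_state_change  # hypothetical target callback
        self._start_distance: Optional[float] = None

    def touches_began(self, p1: Point, p2: Point) -> None:
        # Two simultaneous touches start the gesture with scale 1.0.
        self._start_distance = math.dist(p1, p2)
        self._notify("began", 1.0)

    def touches_moved(self, p1: Point, p2: Point) -> None:
        # Scale is the current touch distance relative to the start distance.
        if self._start_distance:
            self._notify("changed", math.dist(p1, p2) / self._start_distance)

    def touches_ended(self) -> None:
        if self._start_distance is not None:
            self._notify("ended", None)
            self._start_distance = None
```

Synthesizing such a gesture during ripping therefore requires a whole sequence of touch events rather than a single one.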


3 Approach

This chapter describes the approach we developed to reverse engineer a mobile application GUI. It is heavily based on the approach introduced in [40], which addresses desktop application GUIs, but is adjusted to the requirements imposed by typical mobile application GUIs. First, in section 3.1, we define all components of a mobile application GUI and the resulting model that is extracted. Following that, in section 3.2, we describe the algorithm that extracts that model from the GUI. Finally, section 3.3 elaborates on the differences between our approach and the original approach.

3.1 Definitions

A graphical user interface (GUI) is the front-end component of an application responsible for mediating between the application's logic and its user. It displays the application's content and enables the user to interact with it. The following formally defines all components of a mobile application's GUI, as well as the resulting model that we want to extract from it.

3.1.1 GUI Elements

An application's GUI changes constantly during its use to adapt itself to the content that is presented and the user interactions that are enabled. The various distinct compositions of graphical user interface elements on the screen each represent a state of the GUI. We refer to every graphical user interface element as a widget. Each widget has a set of properties, a subset of which consists of elementary properties that are shared by all widgets, while the possession of further properties depends on the type of the widget. Figure 3.1 shows a state of a mobile application's GUI. Some of the widget types present in this state are (a) cells, (b) labels, (c) images and (d) buttons. All these widgets have, amongst others, the following properties: vertical and horizontal position, width and height, and background color. Additional properties possessed only by certain widget types are, in the example of a label widget, its text, font, text color and text alignment.

Figure 3.1: GUI State

Some widgets are graphical elements only, while others are also executable, meaning that certain user interactions with them trigger specified actions that modify the GUI. We call this an event and define it as a pair consisting of a predefined user interaction type and a list of parameters. Examples of user interactions are a simple tap, a swipe gesture or the typing of text. Not all user interactions can be performed on all widgets, e.g. while it is possible to type text in a text field, it is not possible to interact in this way with a button. A widget is not restricted to only one event: a tap on a table cell might open a corresponding view, while a swipe to the left on the same cell might remove the cell from the table containing it.

Formally we define a state as the quadruple {W, P, V, E}, where:

• W = {w1, ..., wl} is the set of all l widgets contained in the state.

• P = {p1, ..., pm} is the set of m properties for each widget in W.

• V = {(v11, ..., v1m), ..., (vl1, ..., vlm)} is the set of values for each property in P of each widget in W.

• E = {e1, ..., eg} is the set of the g executable events in the state.
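As a minimal sketch, the quadruple could be expressed in Swift as follows. All type and property names are hypothetical and not taken from the tool; the sketch also assumes that P can be derived as the union of the property names found on the widgets, and that each widget stores its own row of V.

```swift
/// An event: a user interaction type plus its parameters, targeted at one widget.
struct Event: Hashable {
    enum Interaction: Hashable { case tap, swipeLeft, typeText }
    let widgetID: Int
    let interaction: Interaction
    let parameters: [String]
}

/// One widget together with its row of V: a value for each of its property names.
struct Widget: Hashable {
    let id: Int
    let type: String
    let propertyValues: [String: String]
}

/// A GUI state: W is `widgets`, P and V are derived from them, E is `events`.
struct GUIState: Hashable {
    let widgets: Set<Widget>
    let events: Set<Event>

    /// P: every property name that occurs on any widget (assumption, see above).
    var propertyNames: Set<String> {
        return Set(widgets.flatMap { $0.propertyValues.keys })
    }
}

let cell = Widget(id: 1, type: "cell",
                  propertyValues: ["x": "0", "y": "64", "width": "320", "height": "44"])
let label = Widget(id: 2, type: "label",
                   propertyValues: ["x": "16", "y": "76", "width": "200", "height": "20",
                                    "text": "Carbon"])

// The same cell carries two events, as in the table cell example above.
let state = GUIState(
    widgets: [cell, label],
    events: [
        Event(widgetID: 1, interaction: .tap, parameters: []),
        Event(widgetID: 1, interaction: .swipeLeft, parameters: []),
    ]
)
print(state.propertyNames.sorted())
```

The label contributes the additional property text, which the cell does not possess, mirroring the distinction between elementary and type-specific properties.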

3.1.2 GUI Model

This section describes how the GUI’s structure and behavior are captured in a model. This model has the form of an event-flow graph (EFG), representing all possible executions of the application. The EFG consists of the following components:

• Y = {y1, ..., yn} is the set of the n unique states of the GUI.

• i ∈ Y is the initial state of the GUI, meaning the state the GUI is in immediately after launching.

• L = {(e1, c1, d1), ..., (ef, cf, df)} is the set of f directed edges that model the follows-relation of all states in Y. L contains an edge (e, c, d) if the event e invokes a transition from state c to state d.

Figure 3.2 shows an example of one possible EFG. Any path on it that starts with the initial state represents one possible execution of the application. For example, the path marked in red in figure 3.2 corresponds to first tapping the second button in the tab bar, then tapping on the Carbon cell and finally tapping the Details button at the bottom.
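The EFG definition and the notion of a path as one execution can be illustrated with a small Swift sketch. States and events are reduced to strings here, and the state names and event labels are hypothetical stand-ins for the example path described above, not values from the tool.

```swift
/// An edge (e, c, d): event `event` leads from state `source` to state `target`.
struct Edge: Hashable {
    let event: String
    let source: String
    let target: String
}

/// The EFG {Y, i, L}, with states identified by hypothetical string names.
struct EventFlowGraph {
    var states: Set<String>      // Y
    let initialState: String     // i
    var edges: Set<Edge>         // L

    /// Follows `events` from the initial state and returns the state reached,
    /// or nil if some event has no matching edge in the state reached so far.
    func replay(_ events: [String]) -> String? {
        var current = initialState
        for event in events {
            guard let edge = edges.first(where: { $0.source == current && $0.event == event }) else {
                return nil
            }
            current = edge.target
        }
        return current
    }
}

let efg = EventFlowGraph(
    states: ["Home", "Elements", "Carbon", "CarbonDetails"],
    initialState: "Home",
    edges: [
        Edge(event: "tap tab 2", source: "Home", target: "Elements"),
        Edge(event: "tap Carbon cell", source: "Elements", target: "Carbon"),
        Edge(event: "tap Details", source: "Carbon", target: "CarbonDetails"),
    ]
)
let reached = efg.replay(["tap tab 2", "tap Carbon cell", "tap Details"])
print(reached ?? "no such path")
```

An event sequence that does not start in the initial state, such as tapping Details first, is rejected because no matching edge leaves the initial state.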


Figure 3.2: Example EFG

3.2 Ripper Algorithm

With the previous definitions of a mobile application’s GUI and its components, this section formulates an abstract algorithm that reverse engineers the defined model of a GUI by dynamically executing its widgets, traversing its entire state space and ripping the required information.

The algorithm is recursive and consists of two parts. The first sets up the initial values of the model and the second defines a recursive step that extracts the current state of the application. The recursive nature of the algorithm leads to the model being ripped in a depth-first manner.

In line 3 the initial state is extracted, its executable events are identified in line 4 and it is added to the state space in line 5. Afterwards, in line 6, the recursive ripping process is initiated.

A recursive step first checks in line 9 whether any unvisited events are left for the current state. If so, it does the following: first, in line 10, the event is executed, then the invoked state is extracted in line 11, all its executable events are identified in line 12 and, if not already contained in the state space, the state is added to it in line 13. In the


Algorithm 1 GUI Ripper
 1: EFG {Y, i, L} = {∅, ∅, ∅}
 2: function RIP( )
 3:     i = extract-current-state( )
 4:     identify-executable-events(i)
 5:     Y = {i}
 6:     RIP-RECURSIVE(i)
 7: end function
 8: function RIP-RECURSIVE(State x)
 9:     if e = next-unvisited-event(x) then
10:         execute(e)
11:         y = extract-current-state( )
12:         identify-executable-events(y)
13:         Y = Y ∪ {y}
14:         L = L ∪ {(e, x, y)}
15:         RIP-RECURSIVE(y)
16:     end if
17: end function

next step the edge from the previous to the current state, annotated with the executed event, is added to the EFG in line 14. Finally, the recursive step is called on the current state in line 15.
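The following Swift sketch mirrors Algorithm 1 line by line against a stand-in application that is nothing more than a fixed transition table. FakeApp, Ripper and all state and event names are hypothetical; a real ripper would drive a running application instead. Note that, exactly as in the abstract algorithm, exploration always continues from the state an event leads to and stops once that state has no unvisited events left; returning to states with remaining events is the task of the work list described in section 4.6.

```swift
/// Stand-in for the application under test: a fixed transition table.
final class FakeApp {
    let transitions: [String: [String: String]]
    private(set) var current: String

    init(initial: String, transitions: [String: [String: String]]) {
        self.current = initial
        self.transitions = transitions
    }
    func extractCurrentState() -> String { return current }
    func executableEvents(in state: String) -> [String] {
        return (transitions[state] ?? [:]).keys.sorted()
    }
    func execute(_ event: String) {
        current = transitions[current]?[event] ?? current
    }
}

struct Edge: Hashable { let event: String; let source: String; let target: String }

final class Ripper {
    let app: FakeApp
    private(set) var states: Set<String> = []          // Y
    private(set) var initial: String?                  // i
    private(set) var edges: Set<Edge> = []             // L
    private var unvisited: [String: [String]] = [:]    // untried events per state

    init(app: FakeApp) { self.app = app }

    func rip() {
        let i = app.extractCurrentState()               // line 3
        identifyExecutableEvents(i)                     // line 4
        initial = i
        states = [i]                                    // line 5
        ripRecursive(i)                                 // line 6
    }

    private func identifyExecutableEvents(_ state: String) {
        // Record events only the first time a state is seen.
        if unvisited[state] == nil { unvisited[state] = app.executableEvents(in: state) }
    }

    private func ripRecursive(_ x: String) {
        guard var events = unvisited[x], !events.isEmpty else { return }  // line 9
        let e = events.removeFirst()
        unvisited[x] = events
        app.execute(e)                                      // line 10
        let y = app.extractCurrentState()                   // line 11
        identifyExecutableEvents(y)                         // line 12
        states.insert(y)                                    // line 13
        edges.insert(Edge(event: e, source: x, target: y))  // line 14
        ripRecursive(y)                                     // line 15
    }
}

let app = FakeApp(initial: "List", transitions: [
    "List": ["tap cell": "Detail"],
    "Detail": ["show info": "Info", "tap back": "List"],
    "Info": ["close": "Detail"],
])
let ripper = Ripper(app: app)
ripper.rip()
print(ripper.states.sorted(), ripper.edges.count)
```

In this example every event eventually leads back to a state with untried events, so all three states and all four edges are discovered.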

3.3 Differences to Original Approach

While very similar to the original GUI model and ripping algorithm described in [40], the model and algorithm established above differ from them in some substantial ways. These differences are due to the fact that the original paper describes an approach to rip GUIs of desktop applications, which have the concept of modal and modeless windows that is absent in mobile application GUIs. A modal window in a desktop application, once invoked, restricts all user interactions with the GUI to widgets contained in itself. Only after terminating the modal window can the user interact with widgets of underlying windows again. A modeless window, on the other hand, only adds the user interactions that it enables to the existing set of possible interactions. That is, after invoking a modeless window, the user can still interact with widgets in other windows, as well as with the newly added widgets. This distinction between modal and modeless windows is necessary for desktop GUIs because multiple windows can be displayed side by side at the same time. A mobile application GUI, however, consists of one single window that covers the entire screen, and thus the combination of multiple windows displayed at the same time is not a scenario that has to be dealt with. Although a new state in a mobile application


GUI can be presented modally, as described in section 2.3.1, it nonetheless behaves and hence is treated like every other state. This essential difference between desktop and mobile application GUIs makes several portions of the original model unsuitable for our purpose.

The triple {W, P, V} representing a window in the original approach is similar to what we defined as a state in our model. Just like in a state, W is the set of widgets in the window, P is the set of properties for each widget in W, and V is the set of values for each property of each widget. The big difference is that a window doesn’t include executable events. For this reason the original approach creates two models to capture the application’s behavior, namely a GUI tree and an EFG. The GUI tree (or GUI forest, since a desktop application can initiate its GUI with multiple simultaneous windows and hence have multiple roots) is the triple {W, T, E}, where W is the set of windows in the GUI, T is the set of top-level windows, meaning the windows that are displayed when the application is launched, and E is the set of directed edges, with an edge from window A to window B if an event in window A invokes window B. Besides the GUI tree, an EFG is created that, similar to the EFG in our approach, models the behavior of the application. In contrast to our approach, however, the nodes in the EFG of the original approach are the executable events of the application, and an edge from event A to event B indicates that event B is executable immediately after A was executed. Our EFG is a combination of both the GUI tree and the EFG of the original approach.

Further, the original approach introduces the concept of a GUI component. This is a group of one modal window and any number of modeless windows that were invoked by an event contained either in the modal window or in a modeless window belonging to that very same group. Hence, a GUI component is the aggregation of all windows that a user can interact with while this GUI component is active. A new GUI component always starts with a newly invoked modal window and ends when that same modal window is terminated. Without the concept of modal windows, GUI components are not needed in our model.

Moreover, in the original approach the EFG over the events mentioned above is actually constructed separately for every GUI component and an integration tree is then used to connect the EFGs with each other. The integration tree is a 3-tuple consisting of the set of GUI components, the main component, which is the initial component of the GUI, and the set of directed edges representing the invokes relation between components. Again, the absence of modal windows makes the integration tree unnecessary for our model and allows us to create an EFG for the entire application instead.

Finally, the original method makes a distinction between several kinds of events, mainly in order to differentiate events invoking modal windows, called restricted-focus events, events invoking modeless windows, called unrestricted-focus events, and events terminating modal windows, called termination events. Furthermore, events that don’t open windows but menus are categorized as menu-open events, and events that don’t modify the GUI but only perform some action with the software are categorized as system-interaction events. Distinguishing events related to modal and modeless windows is not necessary for mobile applications. The same applies to menu-open events,


since the classical desktop GUI menu element is not usually part of a mobile application GUI. Lastly, system-interaction events by definition have no effect on the GUI and are therefore not considered events according to our definition in section 3.1.1. Thus, our model makes no distinction between different kinds of events.


4 Design and Implementation

In this chapter the design of the GUI ripper tool for iOS applications, which we developed based on the approach detailed in chapter 3, is described along with some implementation details. First, section 4.1 gives an overview of the entire ripping process and section 4.2 presents the resulting data structure. The subsequent sections then go into more detail on the individual steps: 4.3 explains the state extraction, 4.4 the event identification, 4.5 the state comparison, 4.6 the state exploration strategy and 4.7 the event synthesis.

4.1 Overview

Figure 4.1 illustrates the entire ripping process. The AUT, as described in section 2.3.1, is visible on the top left with its controller part, consisting of the UIApplication and the UIViewController hierarchy, and its view part, consisting of the key UIWindow and the UIView hierarchy.

As mentioned before, the UIApplication object is notified by the operating system of incoming events. In order to run the application for our dynamic analysis we make use of the iOS Simulator that is part of the Xcode IDE. Furthermore, we use the UI Testing framework, also part of Xcode, to control the application running in the simulator. These elements are visible on the top right of figure 4.1.

The middle part of figure 4.1 illustrates the tool developed in this work and its components. Some of the components are implemented in an extension that augments the actual AUT, in order to attain the necessary access to the AUT’s view and widget hierarchies. An extension is a programming mechanism supported by Objective-C and Swift that allows developers to add functionality to existing classes at runtime. In this way new methods and properties can be added to a class, or existing methods and properties can be overridden, without even having access to the source code.

Finally, the GUI’s model, which is the tool’s resulting output, is depicted at the bottom of figure 4.1.

The particular steps of the ripping process are annotated with a number corresponding to their execution order:

1. The ripping tool tells the UI Testing framework to launch the AUT.

2. The UI Testing framework launches the AUT in the iOS Simulator.

3. The state extraction component extracts the AUT’s currently visible state.

4. The event identification component determines the executable events in the current state.

5. The extracted state and the identified events are handed to the state comparison component.

6. The state is compared to all states in the current state space:


a) If it is not part of the state space, it is added to the EFG and connected with an edge starting in the state that invoked it. The new state is also pushed onto the work list of the next events component.

b) Otherwise, the current state is passed to the next events component.

7. Based on the work list and the EFG, the event sequence that will be executed next is determined.

8. The event sequence is passed to the event synthesis.

9. The event synthesis correlates the widgets that the event sequence contains with the corresponding UI Testing proxies.

Figure 4.1: Tool Components


10. The event synthesis instructs the UI Testing framework to synthesize the event sequence.

11. The UI Testing framework synthesizes the event sequence.

Steps 3–11 are repeated until, in step 7, the work list is found to be empty, at which point the ripping tool continues with step 12.

12. The UI Testing framework is instructed to terminate the application.

13. The application is terminated by the UI Testing framework.

When considering the abstract algorithm presented in section 3.2, the following table shows the correspondence between the components of the tool and the part of the algorithm they implement:

Algorithm Part | Tool Component
extract-current-state() | State Extraction
identify-executable-events(y) | Event Identification
Y ∪ {y} | State Comparison
next-unvisited-event(x) | Next Events
execute(e) | Event Synthesis

The tool is written in the Swift programming language.

4.2 GUI Model Structure

According to the basic definition in section 3.1.1, the information composing a state is limited to the state’s widgets, their properties and corresponding values, as well as executable events. However, in order to have a better context for automatically creating test cases later on, the resulting data structure of an extracted state also includes some additional meta information, namely the relationship information that the UIView and UIViewController hierarchies contain. Another piece of GUI-relevant information that can be discerned from the UIViewController hierarchy is the entity that a UIViewController, its root widget and all its sub widgets form. We call this entity a view.

Since not all elements that are part of the UIView hierarchy are also visible on the screen, the state data structure mirrors the two hierarchies but includes only the currently visible elements. It is shown in figure 4.2.

A state consists of view objects that contain one or more widgets and represent parts of the currently visible GUI. The views and the widgets are each connected in a hierarchy, mirroring the visible parts of the UIViewController and UIView hierarchy, respectively. Furthermore, a widget can have executable events associated with it. All three classes (views, widgets and events) are GUI components that have a type and optionally some properties. A property has a name and a value, as well as a weight that is used in the state comparison, discussed in section 4.5.



Figure 4.2: Model Data Structure

With this as the data structure for a state, the final model of the GUI is a set of all unique states that the GUI can adopt, which are the nodes of the EFG described in section 3.1.2. The EFG is captured in a standard graph structure, where every node holds a reference to its corresponding state object.

4.3 State Extraction

As hinted in the previous section, not every element present in the widget hierarchy is visible on the screen. The challenge in the state extraction is to determine the currently visible elements and to extract only them and their properties.

4.3.1 Visibility

In order to determine the visibility of a widget on the screen, we consider the following conditions, each of which is necessary but not sufficient on its own:

1. The widget is part of the widget hierarchy that starts with the key window.

2. The widget’s hidden property is set to false.

3. The widget’s alpha property is greater than 0.01.

4. The widget’s frame is at least 1 point × 1 point.

5. The widget is not outside of the screen bounds, meaning its frame and the screen’s frame intersect in an area of at least 1 point × 1 point.


6. The widget is not covered completely by other widgets, meaning its frame has an area of at least 1 point × 1 point that doesn’t intersect with the frame of any opaque widget above it. Widget A is above widget B if it has a higher order than widget B when traversing the hierarchy in level-order. A widget is opaque if the value of its alpha property is 1.0.

Furthermore, the properties in conditions 1–3 are overridden by those of a widget’s ancestors, yielding this additional necessary condition:

7. Conditions 1–3 must also be true for all of the widget’s ancestors in the widget hierarchy.

Conditions 4–6 are a little more complicated when considering the widget’s ancestors, since they depend on the property clipsToBounds, which either confines all sub widgets of a widget to its bounds or allows sub widgets to extend beyond them. This again results in a necessary condition:

8. If the clipsToBounds property of any of the widget’s ancestors is set to true, all of the conditions 4–6 must be true for that same ancestor widget.

The only condition for a view to be visible is that at least one of its widgets is visible.
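A simplified, UIKit-free Swift sketch of these checks is given below. Rect and Node are hypothetical stand-ins for frames and widgets, frames are assumed to be given in screen coordinates already, condition 1 is assumed to hold because only widgets reachable from the window are constructed, and condition 6 (coverage by opaque widgets) is left out for brevity.

```swift
struct Rect {
    var x, y, width, height: Double

    /// True if the overlap with `other` is at least 1 point x 1 point.
    func overlapsByAtLeastOnePoint(_ other: Rect) -> Bool {
        let w = min(x + width, other.x + other.width) - max(x, other.x)
        let h = min(y + height, other.y + other.height) - max(y, other.y)
        return w >= 1 && h >= 1
    }
}

/// Hypothetical stand-in for a widget in the hierarchy.
final class Node {
    let frame: Rect
    var isHidden = false
    var alpha = 1.0
    var clipsToBounds = false
    private(set) weak var parent: Node?
    private(set) var children: [Node] = []

    init(frame: Rect) { self.frame = frame }
    func add(_ child: Node) { child.parent = self; children.append(child) }
}

/// Conditions 4 and 5 for a single widget.
func passesFrameChecks(_ n: Node, screen: Rect) -> Bool {
    return n.frame.width >= 1 && n.frame.height >= 1       // condition 4
        && n.frame.overlapsByAtLeastOnePoint(screen)       // condition 5
}

func isVisible(_ widget: Node, screen: Rect) -> Bool {
    // Conditions 2 and 3, then 4 and 5, for the widget itself.
    guard !widget.isHidden, widget.alpha > 0.01,
          passesFrameChecks(widget, screen: screen) else { return false }
    var ancestor = widget.parent
    while let a = ancestor {
        // Condition 7: hidden and alpha checks also apply to every ancestor.
        if a.isHidden || a.alpha <= 0.01 { return false }
        // Condition 8 (without condition 6): a clipping ancestor must itself
        // pass the frame checks.
        if a.clipsToBounds && !passesFrameChecks(a, screen: screen) { return false }
        ancestor = a.parent
    }
    return true
}

let screen = Rect(x: 0, y: 0, width: 320, height: 568)
let window = Node(frame: screen)
let container = Node(frame: Rect(x: 0, y: 0, width: 320, height: 0.5))
let label = Node(frame: Rect(x: 10, y: 10, width: 100, height: 20))
window.add(container)
container.add(label)

print(isVisible(label, screen: screen))   // container does not clip its sub widgets
container.clipsToBounds = true
print(isVisible(label, screen: screen))   // container clips and is too small itself
```

The example shows why clipsToBounds matters: the same label inside a container that is less than one point high counts as visible only while the container does not clip its sub widgets.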

4.3.2 Extraction Algorithm

Before delving into the extraction algorithm, we recall exactly which information is extracted. As detailed in section 3.1.1, a state consists of the view and widget hierarchies and the events associated with the widgets. As described in section 4.2, all of these are GUIComponents that have a type and certain properties. Yet the properties are not the same for the three classes, nor for the different types within a class. In order to know which properties are extracted for which object, a lookup table is used that maps the types (specified in the ViewType, WidgetType and EventType enums) to their corresponding properties. With these as the extraction target and the visibility conditions specified in the previous section, we define the recursive extraction algorithm listed in Algorithm 2.

In the first part of the algorithm the roots of the view and widget hierarchy are set (lines 5 and 8) by extracting the properties of the application object (line 4) and key window (line 7), respectively. Then the extracted view and widget are connected (line 9), and the recursive process is initiated by calling extract-recursive on all child widgets of the key window, supplying the extracted widget for the window as well as the extracted view for the application as the remaining parameters (lines 10–12).

The second part describes one step of the recursion. The input parameters are the target widget (w) that should be extracted, its extracted super widget (esw) from a previous step, the extracted view for the current phase of the extraction process (ev), and the last extracted view that was actually added to the view hierarchy (esv). First, it is checked whether the target widget is a root widget, which would mark the beginning of a new view phase. If this is the case, the new view is extracted with the corresponding



Algorithm 2 Extraction
 1: View Hierarchy V = ∅
 2: Widget Hierarchy W = ∅
 3: function extract-current-state(Application a)
 4:     ev = extract-view-with-properties(a)
 5:     V = ev
 6:     k = get-key-window(a)
 7:     ew = extract-widget-with-properties(k)
 8:     W = ew
 9:     connect(ev, ew)
10:     for all cw ∈ get-child-widgets(k) do
11:         extract-recursive(cw, ew, ev, ev)
12:     end for
13: end function
14: function extract-recursive(Widget w, Extracted Widget esw, Extracted View ev, Extracted View esv)
15:     if is-root-widget(w) then
16:         v = get-view-controller(w)
17:         ev = extract-view-with-properties(v)
18:     end if
19:     if check-visibility(w) then
20:         if esv ≠ ev then
21:             add-child(esv, ev)
22:             esv = ev
23:         end if
24:         ew = extract-widget-with-properties(w)
25:         add-child(esw, ew)
26:         esw = ew
27:         connect(ev, ew)
28:     end if
29:     for all cw ∈ get-child-widgets(w) do
30:         extract-recursive(cw, esw, ev, esv)
31:     end for
32: end function

properties from the root widget’s view controller (lines 16 and 17). In the next step, the conditions established in the previous section are checked to determine the current widget’s visibility. If it is visible, then the corresponding view is also visible, as per the only condition stated in the previous section: a view is visible as long as at least one of its widgets is visible. Therefore, if it isn’t currently part of the view hierarchy (checked in line 20), it is added to it and the variable holding the last added view is updated (line


22). Following that, the current widget and its properties are extracted (line 24), added to the widget hierarchy, and connected with its extracted view (line 27). Finally, the next recursion step is set in motion if any child widgets are still remaining (lines 29–31).
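One consequence of passing esw down the recursion is that an invisible widget is skipped while its visible sub widgets are still extracted and attached to the nearest extracted ancestor. The Swift sketch below illustrates only this widget part of the recursion (lines 19 and 24–31) and leaves out the view hierarchy. RawWidget, ExtractedWidget and the precomputed isVisible flag are hypothetical simplifications, not the tool’s data structures.

```swift
/// Hypothetical stand-in for a widget in the running app; `isVisible` is
/// assumed to hold the precomputed result of the visibility check.
struct RawWidget {
    let name: String
    let isVisible: Bool
    var children: [RawWidget] = []
}

/// Node of the extracted hierarchy, which contains visible widgets only.
final class ExtractedWidget {
    let name: String
    private(set) var children: [ExtractedWidget] = []
    init(name: String) { self.name = name }
    func addChild(_ child: ExtractedWidget) { children.append(child) }
}

/// One recursion step for the widget hierarchy, without views.
func extractRecursive(_ w: RawWidget, superWidget: ExtractedWidget) {
    var esw = superWidget
    if w.isVisible {
        let ew = ExtractedWidget(name: w.name)
        esw.addChild(ew)
        esw = ew            // sub widgets of w now attach to ew
    }
    // An invisible widget is skipped, but its sub widgets are still visited.
    for cw in w.children {
        extractRecursive(cw, superWidget: esw)
    }
}

let window = RawWidget(name: "window", isVisible: true, children: [
    RawWidget(name: "container", isVisible: false, children: [
        RawWidget(name: "label", isVisible: true),
    ]),
    RawWidget(name: "button", isVisible: true),
])
let root = ExtractedWidget(name: window.name)
for child in window.children {
    extractRecursive(child, superWidget: root)
}
print(root.children.map { $0.name })
```

Here the invisible container, for instance a zero-sized widget that does not clip its sub widgets, does not appear in the extracted hierarchy, and its label becomes a direct child of the window.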

4.4 Executable Event Identification

The first thing that must be clarified is the set of circumstances under which it is possible to execute an event on a widget. Naturally it must be visible, but this is already ensured, since the extraction algorithm only extracts widgets that are visible and only these are analyzed for potential executable events. Besides this, the only other condition is that the widget’s userInteractionEnabled property is true. Given that, potentially every widget could be executable.

Now, one possible way to identify the executable widgets is to try all possible user interactions on all potential widgets. Of course, for those interactions that have an infinite input space, like typing text, we would have to confine the attempted inputs to a finite subset. While this is a valid approach, it causes a massive overhead that slows down the ripping process significantly. It is also unnecessarily excessive, since the overwhelming majority of events is handled by the designated classes in the UIKit framework, introduced in section 2.3.2.

In order to improve upon this approach, we propose to identify events by considering the ways in which they can be handled and the respective best practices from the developer guidelines [2]. Based on this information, a list of identification policies is processed, each of which indicates which widget types it applies to and what conditions must hold to presume the associated user interaction to be executable on that widget. The conditions are checked by using the runtime’s reflection APIs [5]. The policies listed in table 4.1 are sufficient to detect the vast majority of simple taps.

There are two event handling methods named in section 2.3.2 but missing in table 4.1, namely raw event handling and custom UIGestureRecognizers. Unfortunately, from the information available through reflection, we can only tell that event handling with these methods takes place, but not what kind of interaction is handled. Therefore, for these two universal ways to handle events all possible interactions must be tried, which results in the policies listed in table 4.2.

The list of policies can easily be augmented in order to identify other user interaction types, as well as to cover very specific exceptions.

It is important to note that even if all conditions in a policy are met, it doesn’t mean that the corresponding event was identified. The policies only indicate which events are probable, so that the number of events that are tried can be reduced. Only after executing the event and asserting that it causes a state change is it considered identified.
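The policy mechanism can be sketched in Swift as a list of values that each pair an applicable widget type with a condition and the presumed interactions. The sketch is UIKit-free: WidgetInfo is a hypothetical snapshot of what reflection might report about a widget, widget types are matched by exact name rather than by class hierarchy, and only three of the policies are modeled.

```swift
enum Interaction: Hashable { case tap, swipe, typeText }

/// Hypothetical snapshot of what reflection reports about one widget.
struct WidgetInfo {
    let type: String
    var isEnabled = true
    var targetActionEvents: Set<String> = []
    var hasCustomGestureRecognizer = false
}

/// One identification policy: where it applies, what must hold, what it presumes.
struct Policy {
    let applicableType: String
    let condition: (WidgetInfo) -> Bool
    let presumed: Set<Interaction>
}

let tapControlEvents: Set<String> = ["TouchDown", "TouchUpInside", "TouchUpOutside"]

let policies: [Policy] = [
    Policy(applicableType: "UIControl",
           condition: { $0.isEnabled && !$0.targetActionEvents.isDisjoint(with: tapControlEvents) },
           presumed: [.tap]),
    Policy(applicableType: "UITabBarButton",
           condition: { _ in true },
           presumed: [.tap]),
    Policy(applicableType: "UIView",
           condition: { $0.hasCustomGestureRecognizer },
           presumed: [.tap, .swipe, .typeText]),   // stands for "all possible interactions"
]

/// Interactions worth trying on `w`. These are candidates only; an event counts
/// as identified once executing it has caused a state change.
func candidateInteractions(for w: WidgetInfo) -> Set<Interaction> {
    return policies
        .filter { $0.applicableType == w.type && $0.condition(w) }
        .reduce(into: Set<Interaction>()) { $0.formUnion($1.presumed) }
}

let button = WidgetInfo(type: "UIControl", targetActionEvents: ["TouchUpInside"])
var disabledButton = button
disabledButton.isEnabled = false
let plainView = WidgetInfo(type: "UIView")

print(candidateInteractions(for: button).count)          // one candidate: tap
print(candidateInteractions(for: disabledButton).count)  // no candidates
```

Because the policies are plain values in an array, covering a further interaction type or a specific exception amounts to appending another entry.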

As already indicated at the beginning of this section, another challenge that must be addressed is the infinite input space for parameters of some events. One easy solution is to randomly select the input, but it is also possible to create policies to address specific cases. A useful policy, for example, could be to use predefined values when a text field



Table 4.1: Event Identification Policies

Applicable Widgets | Conditions | Presumed Interaction
UIControl | 1. Property enabled is true; 2. At least one target-action pair for any of the following UIControlEvents: TouchDown, TouchUpInside, TouchUpOutside | Simple tap
UINavigationButton | – | Simple tap
UITabBarButton | – | Simple tap
UITableViewCell | The managing table view controller implements at least one delegate method to handle cell selection | Simple tap
UIView | At least one associated UITapGestureRecognizer with the properties: numberOfTapsRequired = 1, numberOfTouchesRequired = 1 | Simple tap

Table 4.2: Raw Event Policies

Applicable Widgets | Conditions | Presumed Interaction
UIView | The managing controller implements at least one UIResponder method | All possible interactions
UIView | At least one associated custom UIGestureRecognizer | All possible interactions

has a placeholder¹ saying username or password.

4.5 State Comparison

The task of the state comparison component is to determine whether the current state has already been visited by comparing it to all previously extracted states. To solve this we use the following algorithm, which takes the current state and the state space as parameters.

The algorithm loops over all states in the state space (line 2) and compares them to the current state. The comparison of two states is done by looping over all GUI components, meaning all views, widgets and events, of the first state (line 5) and comparing their properties with the properties of the GUI components at the same index in the corresponding hierarchy of the second state (lines 6–9). As mentioned in section 4.2, each property has a weight, which is added to the accumulated weight w if the properties are equal (line 9). The counter of compared properties, however, is incremented in any case (line 10). Finally, the accumulated weight is divided by the number of compared properties and compared to the threshold t (line 13). If the threshold is exceeded, the

¹ Text shown in the text field when nothing has been entered yet, to indicate its purpose.


Algorithm 3 Comparison
 1: function compare(State s1, Set<State> vs)
 2:     for all s2 ∈ vs do
 3:         w = 0
 4:         n = 0
 5:         for all g1 ∈ s1.guiComponents do
 6:             g2 = get-component-at-index(s2.guiComponents, g1.index)
 7:             for all p1 ∈ g1.properties do
 8:                 p2 = get-property-with-name(g2, p1.name)
 9:                 w = w + p1.weight × (p1 ≡ p2)
10:                 n = n + 1
11:             end for
12:         end for
13:         if (w/n) > t then
14:             return s2
15:         end if
16:     end for
17:     return nil
18: end function

two states are considered equal and the previously extracted state is returned (line 14). Otherwise, if none of the comparisons to the states in the current state space exceeds the threshold, nil is returned (line 17), meaning the current state is a new state.
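Algorithm 3 translates fairly directly into Swift. In the sketch below, the types, the example weights and the threshold of 0.5 are hypothetical choices for illustration. Two details go beyond the pseudocode and are assumptions: a component or property that is missing in the second state contributes no weight, and a state without any properties never matches.

```swift
struct Property { let name: String; let value: String; let weight: Double }
struct Component { let properties: [Property] }
struct State { let name: String; let components: [Component] }

/// Returns the previously visited state considered equal to `s1`, or nil.
func compare(_ s1: State, with visited: [State], threshold t: Double) -> State? {
    for s2 in visited {
        var w = 0.0
        var n = 0
        for (index, g1) in s1.components.enumerated() {
            // Component at the same index in the second state, if there is one.
            let g2 = index < s2.components.count ? s2.components[index] : nil
            for p1 in g1.properties {
                let p2 = g2?.properties.first(where: { $0.name == p1.name })
                if let p2 = p2, p2.value == p1.value {
                    w += p1.weight          // line 9: equal properties add their weight
                }
                n += 1                      // line 10: counted in any case
            }
        }
        if n > 0, w / Double(n) > t {       // line 13
            return s2
        }
    }
    return nil
}

/// Builds a label component; the text gets a low weight because it is content.
func label(x: String, text: String) -> Component {
    return Component(properties: [
        Property(name: "type", value: "label", weight: 1.0),
        Property(name: "x", value: x, weight: 1.0),
        Property(name: "text", value: text, weight: 0.2),
    ])
}

let carbon = State(name: "Carbon", components: [label(x: "16", text: "Carbon")])
let oxygen = State(name: "Oxygen", components: [label(x: "16", text: "Oxygen")])
let settings = State(name: "Settings", components: [label(x: "80", text: "Settings")])

print(compare(oxygen, with: [carbon], threshold: 0.5)?.name ?? "new state")
print(compare(settings, with: [carbon], threshold: 0.5)?.name ?? "new state")
```

With these weights, two detail screens that differ only in their label text score 2.0/3 and are merged, while a screen with a different layout scores 1.0/3 and is treated as a new state.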

4.6 Next Events

The next events component maintains a work list that determines which events are executed next, in order to explore the entire GUI in a systematic manner. The underlying data structure is a stack holding elements that each represent a state. These elements maintain a list of untried events in that state. Every time the state comparison detects a new state, a new element representing the state, with all identified events in that state, is pushed on top of the work list. When the last untried event of an element is popped, the element itself is also popped from the work list. To select the next events, the following simple rules are followed:

• If the topmost element on the stack represents the currently visible state, one ofits untried events is popped and returned.

• Otherwise, the event sequence that leads to the state belonging to the topmost element is obtained by searching the EFG with a:

1. Breadth-first search for a path starting at the current state and ending in the desired state.

2. Depth-first search for a path starting at the initial state of the EFG and ending in the desired state.

For the first search, breadth-first is more suitable, because the current state is probably very close to the desired state. The reasoning behind this is that an event executed in either the desired state or one of its adjacent states² invoked the current state, meaning there is a path from the desired to the current state with a length of at most two. Because of the hierarchical structure of mobile application GUIs described before, in the vast majority of cases there should also be a path of length two in the other direction. The breadth-first search should almost always find a path, but there are some rare cases where the model doesn’t contain a path, at least at this point of the process. In this case the depth-first search finds a path from the application’s initial state to the desired state, the app is terminated and launched again, and the found event sequence is executed in order to reach the desired state.

The ripping process is finished when the work list is empty.
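The path search behind the second rule can be sketched in Swift as follows. States and events are reduced to strings, and all names are hypothetical. For brevity the sketch uses breadth-first search for the fallback from the initial state as well, whereas the text above uses a depth-first search for that case.

```swift
struct Edge { let event: String; let source: String; let target: String }

/// Shortest event sequence from `start` to `goal`, or nil if the EFG has no path.
func breadthFirstPath(from start: String, to goal: String, in edges: [Edge]) -> [String]? {
    if start == goal { return [] }
    var visited: Set<String> = [start]
    var queue: [(state: String, path: [String])] = [(state: start, path: [])]
    var head = 0
    while head < queue.count {
        let (state, path) = queue[head]
        head += 1
        for edge in edges where edge.source == state && !visited.contains(edge.target) {
            let next = path + [edge.event]
            if edge.target == goal { return next }
            visited.insert(edge.target)
            queue.append((state: edge.target, path: next))
        }
    }
    return nil
}

/// Events to execute in order to reach `desired`, and whether the application
/// has to be terminated and launched again first.
func route(to desired: String, from current: String, initial: String,
           edges: [Edge]) -> (relaunch: Bool, events: [String])? {
    if let path = breadthFirstPath(from: current, to: desired, in: edges) {
        return (relaunch: false, events: path)
    }
    if let path = breadthFirstPath(from: initial, to: desired, in: edges) {
        return (relaunch: true, events: path)
    }
    return nil
}

let edges = [
    Edge(event: "tap cell", source: "List", target: "Detail"),
    Edge(event: "tap back", source: "Detail", target: "List"),
    Edge(event: "tap info", source: "Detail", target: "Info"),
]

// A path exists from the current state, so no relaunch is needed.
let direct = route(to: "List", from: "Detail", initial: "List", edges: edges)
// No edge leaves Info yet, so the route starts over from the initial state.
let fallback = route(to: "Detail", from: "Info", initial: "List", edges: edges)
print(direct?.events ?? [], fallback?.relaunch ?? false)
```

The second call corresponds to the rare case described above: the model does not yet contain a path out of the current state, so the application is relaunched and the sequence from the initial state is replayed.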

4.7 Event Synthesis

Event synthesis is fundamental to the GUI ripping process, since it is the means through which the GUI is explored. This part of the process has to meet the following requirements. First, the actual GUI should be exercised and not the underlying methods that are called. This is important in order to execute only events that are actually executable, as a GUI element might not be present on the screen or might be disabled. Secondly, after executing an event, the ripping process should proceed only once all animations have terminated. Lastly, it should be possible to run generated test cases on their own without the entire GUI ripping process attached to them.

4.7.1 Xcode UI Testing Framework

The Xcode UI Testing framework allows developers to record or script GUI test cases and execute them. It works by running the AUT in the iOS Simulator and emulating user interactions on its GUI elements. This framework satisfies all three requirements imposed above and therefore we take advantage of its APIs to do the event synthesis.

The UI Testing APIs are the following:

• XCUIApplication serves as a proxy for the AUT. Its main purpose is to launch and terminate the AUT.

• XCUIElementQuery defines a query to obtain GUI widgets currently on the screen. It enables queries for children or descendants of a widget, for its type, as well as for matching predicates. By querying widgets for matching predicates, we can filter them by some of their properties.

² If an executed event is the last untried event in a state, this state is popped from the work list, leaving an adjacent state on top of the stack.


• XCUIElement serves as a proxy for a widget. This proxy allows us to access some of the widget’s properties and synthesize events.

Listing 4.1 shows how the UI Testing APIs can be used to start an AUT, find GUI widgets on the screen and synthesize events on them.

// Launch the application
let app = XCUIApplication()
app.launch()

// Queries to get the proxy for a specific cell widget
let tables: XCUIElementQuery = app.descendantsMatchingType(.Table)
let table: XCUIElement = tables.elementBoundByIndex(0)
let cells: XCUIElementQuery = table.descendantsMatchingType(.Cell)
let someCells: XCUIElementQuery = cells.matchingIdentifier("Carbon")
let carbonCell: XCUIElement = someCells.elementBoundByIndex(0)

// Synthesize a tap event
carbonCell.tap()

// Terminate the application
app.terminate()

Listing 4.1: UI Testing API Example

4.7.2 Widget-Proxy Correlation

In order to synthesize an event, a query needs to be developed that returns the proxy for the widget the event should be synthesized on. This is achieved by starting at the target widget and going up the extracted widget hierarchy, saving at each hierarchy level the information that distinguishes the widget or its ancestor from the other widgets at that level. The saved information consists of the type of the widget and an index identifying the widget among all siblings of the same type. With this information a query chain is constructed that starts at the window widget and, for each step, queries the immediate children, filters the result by the desired type and finally picks the element at the corresponding index. In this way the last query returns the desired proxy element.
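The construction of the saved information can be sketched in plain Swift, independent of the UI Testing framework. ExtractedWidget and QueryStep are hypothetical names; each resulting step stands for one query on the immediate children, filtered by type and bound to an index.

```swift
/// Hypothetical node of the extracted widget hierarchy.
final class ExtractedWidget {
    let type: String
    private(set) weak var parent: ExtractedWidget?
    private(set) var children: [ExtractedWidget] = []

    init(type: String) { self.type = type }

    @discardableResult
    func add(_ child: ExtractedWidget) -> ExtractedWidget {
        child.parent = self
        children.append(child)
        return child
    }
}

/// One step of the query chain: children of the given type, element at `index`.
struct QueryStep: Equatable {
    let type: String
    let index: Int
}

/// Walks up from `target` to the window and returns the steps in top-down order.
func queryChain(to target: ExtractedWidget) -> [QueryStep] {
    var steps: [QueryStep] = []
    var node = target
    while let parent = node.parent {
        // Index of `node` among its siblings of the same type.
        let sameType = parent.children.filter { $0.type == node.type }
        guard let index = sameType.firstIndex(where: { $0 === node }) else { break }
        steps.append(QueryStep(type: node.type, index: index))
        node = parent
    }
    return steps.reversed()
}

let window = ExtractedWidget(type: "Window")
let table = window.add(ExtractedWidget(type: "Table"))
window.add(ExtractedWidget(type: "Button"))
table.add(ExtractedWidget(type: "Cell"))
table.add(ExtractedWidget(type: "Label"))
let target = table.add(ExtractedWidget(type: "Cell"))

print(queryChain(to: target).map { "\($0.type)[\($0.index)]" }.joined(separator: " > "))
```

For the second cell of the first table the chain consists of two steps, first the table at index 0 among the window’s children and then the cell at index 1 among the table’s children.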

Another point that should be mentioned is that the UI Testing framework runs in a different process than the AUT, where our extension gathers the information for the query chain. Therefore a communication channel in the form of a TCP socket is established between the two processes to exchange that information.


5 Conclusion

In this paper we have proposed a model to represent mobile application GUIs and an approach to automatically reverse engineer these models from the running applications. We have analyzed all constituent components of the approach with regard to the iOS platform and designed methods to implement them.

Specifically, we found the detection of which GUI elements are visible on the screen to be the biggest challenge for the state extraction and proposed a recursive method to extract the visible GUI elements, which can be customized with respect to the properties that are extracted. For the event identification the main challenge is deducing which widgets are probably executable, since trying every interaction on every widget has a huge negative impact on the efficiency of the tool. By considering the different event handling mechanisms and their best practices we suggested a method that uses a list of policies to identify presumable events. While the most common ways to interact with the GUI are covered, not all user interactions are identified by the current policies. But since the policy list is extendable, further policies can be added easily. Furthermore, a heuristic state comparison method with customizable weights was proposed in order to address the important task of distinguishing whether two states differ only in their content or in a more substantial manner. We also described a state exploration strategy that is exhaustive with respect to the identified events. Beyond that, we proposed a method to hook into the application and dynamically explore the GUI by using the official GUI testing framework of the Xcode IDE. We identified the correlation between the actual GUI element and its corresponding proxy to be the biggest challenge and solved it with a method that can similarly be used to convert paths on the EFG to test cases.

The developed tool is mainly aimed at generating a model for automatic test case generation, but it can be extended to a variety of other uses. The GUI ripping process, for example, is in itself a smoke testing practice that can be used to detect crashes. Further, the generated model can be used to get an overview of the application’s structure and functionality, which is useful for all kinds of tasks. Moreover, the tool could also be used to automatically create screenshots of the entire application’s GUI, even for multiple languages, by slightly extending it.

Clearly, a considerable shortcoming of this work is the missing evaluation of the presented approach. To this end, an experiment is planned, which is described in the next section.

6 Future Work

After finalizing the tool’s implementation, the immediate next step should be an empirical evaluation of the tool’s quality. Our measurable definition of quality revolves around the following criteria:

1. Coverage of unique states, as well as executable events. That is, every reachable state in the application should be included in the model, and at the same time none of these should be recognized as multiple distinct states. Further, all executable events in every state should be identified and synthesized at least once.

2. Precision in the event identification. The ripper should not synthesize events that are not supported by widgets in a given state.

3. Efficiency of the ripping process. The ripper’s execution time should not exceed a threshold beyond which integration into a typical agile development cycle becomes infeasible.

4. Effectiveness of application in industry. The tool’s benefits should justify the costs of maintaining it when it is used in industry.

To assess the tool’s quality, a case study should be carried out on various simple iOS applications with the following research questions to address the above criteria:

RQ1) How many unique states of a simple iOS application are recognized in comparison to the actual number?

RQ2) How many unique states are recognized as multiple distinct states, and as how many?

RQ3) How many executable events are recognized in comparison to the actual number?

RQ4) How many events are synthesized that do not invoke a state change?

RQ5) How much time elapses between starting the ripping process and its completion?
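The research questions above translate into simple ratios and counts once the actual numbers of states and events have been determined manually for a subject application. The following Python sketch shows one possible way to summarize a single ripping run; the record fields and metric names are assumptions chosen for this illustration and do not prescribe the final experimental setup.

```python
from dataclasses import dataclass


@dataclass
class RipReport:
    """Raw counts for one ripping run (all fields hypothetical)."""
    actual_states: int          # unique states counted manually
    recognized_states: int      # actual states that appear in the model
    split_states: dict          # actual state -> number of model states
    actual_events: int          # executable events counted manually
    recognized_events: int      # actual events identified by the ripper
    synthesized_events: int     # events the ripper executed
    events_without_state_change: int
    duration_seconds: float


def summarize(report):
    """Map the raw counts to one figure per research question."""
    return {
        # RQ1: share of actual states found by the ripper
        "state_coverage": report.recognized_states / report.actual_states,
        # RQ2: how many states were split, and how many extra states resulted
        "split_state_count": len(report.split_states),
        "redundant_model_states": sum(
            n - 1 for n in report.split_states.values()
        ),
        # RQ3: share of actual events found by the ripper
        "event_coverage": report.recognized_events / report.actual_events,
        # RQ4: share of synthesized events that changed nothing
        "no_change_ratio": (
            report.events_without_state_change / report.synthesized_events
        ),
        # RQ5: wall-clock time of the run
        "duration_seconds": report.duration_seconds,
    }
```

For example, a run that recognizes 8 of 10 actual states yields a state coverage of 0.8, and a state that is represented by three model states contributes two redundant model states.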

Since the only related work that implements a GUI ripper for iOS applications includes an empirical evaluation that addresses research questions comparable to RQ1 and RQ5, it makes sense to use the same subjects chosen in [35] for our experiment and to compare the results. These are six simple open source applications from Apple’s official sample code website, GitHub, and a tutorial website.

Before conducting the evaluation, the best values for the comparison weights, the list of extracted properties, and the event identification policies to be used should be determined by trial-and-error experimentation on an application not belonging to the above subjects.
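To make concrete what tuning the comparison weights means, the following Python sketch uses a deliberately simplified similarity measure: each state is summarized by a few properties, and the weights of all matching properties are summed and normalized. The property names, the weights and the threshold are assumptions for illustration only and do not reproduce the comparison method of Section 4.5.

```python
def similarity(state_a, state_b, weights):
    """Weighted share of properties on which both states agree (0.0 to 1.0)."""
    total = sum(weights.values())
    matched = sum(
        weight
        for prop, weight in weights.items()
        if state_a.get(prop) == state_b.get(prop)
    )
    return matched / total


def same_state(state_a, state_b, weights, threshold):
    """Treat two states as one model state if they are similar enough."""
    return similarity(state_a, state_b, weights) >= threshold


# Two screens with the same structure but different text content (assumed data).
inbox = {"classes": ("table", "cell", "cell"), "element_count": 3,
         "texts": ("Mail A", "Mail B")}
inbox_later = {"classes": ("table", "cell", "cell"), "element_count": 3,
               "texts": ("Mail C", "Mail D")}

# Low weight on text: the screens count as the same state.
structural = {"classes": 3, "element_count": 2, "texts": 1}
# High weight on text: the same screens are split into two states.
content_heavy = {"classes": 3, "element_count": 2, "texts": 3}
```

With a threshold of 0.8, the first weighting gives a similarity of 5/6 and merges the two screens, whereas the second gives 5/8 and separates them. A trial-and-error search would vary such weights until the resulting model matches the manually determined states of the training application.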

Regarding quality criterion 4, a separate experiment should be performed on more complex industrial apps.

Bibliography

[1] Frank. http://www.testingwithfrank.com. [Online; accessed 26-April-2016].

[2] iOS Human Interface Guidelines. https://developer.apple.com/library/ios/documentation/UserExperience/Conceptual/MobileHIG/. [Online; accessed 26-April-2016].

[3] JFCUnit. http://jfcunit.sourceforge.net. [Online; accessed 26-April-2016].

[4] KIF iOS Integration Testing Framework. https://github.com/kif-framework/KIF. [Online; accessed 26-April-2016].

[5] Objective-C Runtime Reference. https://developer.apple.com/library/mac/documentation/Cocoa/Reference/ObjCRuntimeRef/. [Online; accessed 26-April-2016].

[6] Pounder. http://pounder.sourceforge.net. [Online; accessed 26-April-2016].

[7] ROBOT FRAMEWORK. http://robotframework.org. [Online; accessed 26-April-2016].

[8] Selenium. http://docs.seleniumhq.org. [Online; accessed 26-April-2016].

[9] Survey says: Agile and Continuous Integration Have Altered How Dev Approaches Testing. https://saucelabs.com/press-room/press-coverage/survey-says-agile-and-continuous-integration-have-altered-how-dev-approaches-testing. [Online; accessed 26-April-2016].

[10] UI Testing in Xcode. https://developer.apple.com/videos/play/wwdc2015/406/. [Online; accessed 26-April-2016].

[11] UI/Application Exerciser Monkey. https://developer.android.com/tools/help/monkey.html. [Online; accessed 26-April-2016].

[12] The numbers don’t lie: Mobile devices overtaking PCs. http://fortune.com/2010/08/11/the-numbers-dont-lie-mobile-devices-overtaking-pcs/, 2010. [Online; accessed 26-April-2016].

[13] http://www.businessinsider.com/the-12-most-important-slides-about-mobile-from-mary-meekers-presentation-2012-5?op=1&IR=T, 2012. [Online; accessed 26-April-2016].

[14] http://www.businessinsider.com/mobile-will-eclipse-desktop-by-2014-2012-6?IR=T, 2012. [Online; accessed 26-April-2016].

[15] http://www.forbes.com/sites/johngaudiosi/2012/05/05/new-research-shows-apple-still-winning-the-video-game-war-against-android/#6c3eb94d5a76, 2012. [Online; accessed 26-April-2016].

[16] Android Vs. IOS: Usage and Engagement Patterns. http://www.tech-thoughts.net/2012/11/ios-android-usage-engagement-patterns.html#.Vx6uyWMRquN, 2012. [Online; accessed 26-April-2016].

[17] IBM 2012 Holiday Benchmark Reports. http://www-01.ibm.com/software/marketing-solutions/benchmark-reports/black-friday-2012.html?cm_mmc=holiday2012-benchmark-reports-_-press-release-_-wire-_-text-link, 2012. [Online; accessed 26-April-2016].

[18] Smart phones overtake client PCs in 2011. http://www.canalys.com/newsroom/smart-phones-overtake-client-pcs-2011, 2012. [Online; accessed 26-April-2016].

[19] Alert: Mobile Traffic and Sales Surge on Christmas Day 2013. http://www-01.ibm.com/software/marketing-solutions/benchmark-hub/dec26.html, 2013. [Online; accessed 26-April-2016].

[20] Apple’s users spend 4X as much as Google’s. http://fortune.com/2014/06/27/apples-users-spend-4x-as-much-as-googles/, 2014. [Online; accessed 26-April-2016].

[21] Mobile Internet Usage Skyrockets in Past 4 Years to Overtake Desktop as Most Used Digital Platform. https://www.comscore.com/Insights/Blog/Mobile-Internet-Usage-Skyrockets-in-Past-4-Years-to-Overtake-Desktop-as-Most-Used-Digital-Platform, 2015. [Online; accessed 26-April-2016].

[22] UK Adults Spend More Time on Mobile Devices than on PCs. http://www.emarketer.com/Article/UK-Adults-Spend-More-Time-on-Mobile-Devices-than-on-PCs/1012356, 2015. [Online; accessed 26-April-2016].

[23] App Annie Index Market Q1 2016. http://go.appannie.com/report-app-annie-index-market-q1-2016, 2016. [Online; accessed 26-April-2016].

[24] Mobile device market to reach 2.6 billion units by 2016. http://www.canalys.com/newsroom/mobile-device-market-reach-26-billion-units-2016, 2016. [Online; accessed 26-April-2016].

[25] Domenico Amalfitano, Anna Rita Fasolino, and Porfirio Tramontana. A gui crawling-based technique for android mobile application testing. In Software Testing, Verification and Validation Workshops (ICSTW), 2011 IEEE Fourth International Conference on, pages 252–261. IEEE, 2011.

[26] Domenico Amalfitano, Anna Rita Fasolino, Porfirio Tramontana, Salvatore De Carmine, and Atif M Memon. Using gui ripping for automated testing of android applications. In Proceedings of the 27th IEEE/ACM International Conference on Automated Software Engineering, pages 258–261. ACM, 2012.

[27] Stephan Arlt, Ishan Banerjee, Cristiano Bertolini, Atif M Memon, and Martin Schäf. Grey-box gui testing: Efficient generation of event sequences. arXiv preprint arXiv:1205.4928, 2012.

[28] Ishan Banerjee, Bao Nguyen, Vahid Garousi, and Atif Memon. Graphical user interface (gui) testing: Systematic mapping and repository. Information and Software Technology, 55(10):1679–1694, 2013.

[29] Penelope A Brooks and Atif M Memon. Introducing a test suite similarity metric for event sequence-based test cases. In Software Maintenance, 2009. ICSM 2009. IEEE International Conference on, pages 243–252. IEEE, 2009.

[30] Vanessa N Cooper and Hisham M Haddad. Study of agility in mobile application development. In Proceedings of the International Conference on Software Engineering Research and Practice (SERP), page 1. The Steering Committee of The World Congress in Computer Science, Computer Engineering and Applied Computing (WorldComp), 2013.

[31] Cristian Duda, Gianni Frey, Donald Kossmann, Reto Matter, and Chong Zhou. Ajax crawl: Making ajax applications searchable. In Data Engineering, 2009. ICDE’09. IEEE 25th International Conference on, pages 78–89. IEEE, 2009.

[32] Ethar Elsaka, Walaa Eldin Moustafa, Bao Nguyen, and Atif Memon. Using methods & measures from network analysis for gui testing. In Third International Conference on Software Testing, Verification, and Validation Workshops, pages 240–246. IEEE, 2010.

[33] Patrice Godefroid, Nils Klarlund, and Koushik Sen. Dart: directed automated random testing. In ACM Sigplan Notices, volume 40, pages 213–223. ACM, 2005.

[34] Daniel R Hackner and Atif M Memon. Test case generator for guitar. In Companion of the 30th international conference on Software engineering, pages 959–960. ACM, 2008.

[35] Mona Erfani Joorabchi and Ali Mesbah. Reverse engineering ios mobile applications. In Reverse Engineering (WCRE), 2012 19th Working Conference on, pages 177–186. IEEE, 2012.

[36] Heejin Kim, Byoungju Choi, and Eric W Wong. Performance testing of mobile applications at the unit test level. In Secure Software Integration and Reliability Improvement, 2009. SSIRI 2009. Third IEEE International Conference on, pages 171–180. IEEE, 2009.

[37] Scott McMaster and Atif M Memon. Call-stack coverage for gui test suite reduction. Software Engineering, IEEE Transactions on, 34(1):99–115, 2008.

[38] Scott McMaster and Atif M Memon. An extensible heuristic-based framework for gui test case maintenance. In Software Testing, Verification and Validation Workshops, 2009. ICSTW’09. International Conference on, pages 251–254. IEEE, 2009.

[39] A. Memon, I. Banerjee, B. N. Nguyen, and B. Robbins. The first decade of gui ripping: Extensions, applications, and broader impacts. In 2013 20th Working Conference on Reverse Engineering (WCRE), pages 11–20, Oct 2013.

[40] Atif Memon, Ishan Banerjee, and Adithya Nagarajan. Gui ripping: Reverse engineering of graphical user interfaces for testing. In null, page 260. IEEE, 2003.

[41] Atif M Memon and Qing Xie. Studying the fault-detection effectiveness of gui test cases for rapidly evolving software. Software Engineering, IEEE Transactions on, 31(10):884–896, 2005.

[42] Ali Mesbah, Arie Van Deursen, and Stefan Lenselink. Crawling ajax-based web applications through dynamic analysis of user interface state changes. ACM Transactions on the Web (TWEB), 6(1):3, 2012.

[43] Bao N Nguyen, Bryan Robbins, Ishan Banerjee, and Atif Memon. Guitar: an innovative tool for automated testing of gui-driven software. Automated Software Engineering, 21(1):65–105, 2014.

[44] Thomas L Rakestraw, Rangamohan V Eunni, and Rammohan R Kasuganti. The mobile apps industry: A case study. Journal of Business Cases and Applications, 9:1, 2013.

[45] Qing Xie and Atif M Memon. Designing and comparing automated test oracles for gui-based software applications. ACM Transactions on Software Engineering and Methodology (TOSEM), 16(1):4, 2007.

[46] Wei Yang, Mukul R Prasad, and Tao Xie. A grey-box approach for automated gui-model generation of mobile applications. In Fundamental Approaches to Software Engineering, pages 250–265. Springer, 2013.

[47] Xun Yuan, Myra B Cohen, and Atif M Memon. Towards dynamic adaptive automated test generation for graphical user interfaces. In Software Testing, Verification and Validation Workshops, 2009. ICSTW’09. International Conference on, pages 263–266. IEEE, 2009.

[48] Xun Yuan and Atif M Memon. Alternating gui test generation and execution. In Practice and Research Techniques, 2008. TAIC PART’08. Testing: Academic & Industrial Conference, pages 23–32. IEEE, 2008.

[49] Xun Yuan and Atif M Memon. Generating event sequence-based test cases using gui runtime state feedback. Software Engineering, IEEE Transactions on, 36(1):81–95, 2010.

Statutory Declaration

I declare in lieu of an oath that I have written this bachelor thesis independently and have used no sources or aids other than those indicated. This thesis has not previously been submitted in this or a similar form to any examination board.

Hamburg, April 26, 2016

Andre Da Cruz Guerreiro
