24
CS 290C: Formal Models for Web Software Lecture 18: Verification of Data Driven Web Applications Instructor: Tevfik Bultan

CS 290C: Formal Models for Web Software Lecture 18: Verification of Data Driven Web Applications Instructor: Tevfik Bultan

  • View
    214

  • Download
    0

Embed Size (px)

Citation preview

CS 290C: Formal Models for Web Software

Lecture 18: Verification of Data Driven Web Applications

Instructor: Tevfik Bultan

WAVE: A Verification Tool

• Wave is a verification tool for web applications

• Focuses on data driven web applications– Such as the web applications targeted by WebML

• Wave is capable of verifying temporal properties of WebML style web application specifications– Temporal properties are specified using a variant of LTL– Web application is specified using a core query

language

Interactive, data-driven web applications

• Web application generates web pages dynamically by sending queries to a backend database

• Web application receives input from the user– It responds by taking some action, updating its internal

state and generating a new web page determined by a query

– In Wave queries are specified in FO: first order queries• FO is an abstraction of the data manipulation core of

SQL • A run is a sequence of inputs together with the web pages,

states and actions generated by the web application

Infinite State Verification

• Data driven web applications interact with a back-end database

• Since the possible configurations of the back-end database is unbounded, verification of data-driven web applications is an infinite state verification problem

• Most infinite state verification problems are undecidable• There are two main approaches to handle this:

– Perform unsound verification by restricting the state space (for example like in Alloy)

• This approach might miss bugs (false negative)– Perform sound verification by mapping the infinite state

space to a finite abstraction• This approach might generate false alarms (false

positive)

Infinite State Verification

• Wave uses a different approach– It focuses on a restricted class of systems and

implements a sound and complete verification technique for that restricted class

– For the specifications that are not in that class, Wave can still be used as an unsound verification tool

• The restricted class of systems analyzed by Wave are called input-bounded systems– The range of quantifications in the queries used to

specify the system are restricted to the input values– Example: for all x, for all y [pay(x,y) => price(x,y)]

where pay(x,y) is an input and price is a database relation

Verification with Wave

• Wave takes a web application specification WA and a property p as input

• The property p is specified using LTL-F0 – LTL-FO is an extension of LTL that allows specification

of data related properties• Verification approach

– Explicitly specify the tuples in the database that use only a small set of relevant constants C computed from WA and p. This is called the core of the database and remains unchanged throughout the run

– At each step in the run, make additional assumptions about the content of the database, needed to determine the next possible configurations. The assumptions involve only a small set of additional values

Formal Model

• Wave uses the following formal model for web applications• Each web application WA consists of

– a finite set of web page schemas, one of which is designated to be the home page

– A database relational schema D– A state relational schema S

• Each web page schema W specifies the following:– The types of inputs accepted by W– State update rules specifying the tuples to be inserted or

deleted from the state relations of S– Actions taken in response to user input– Target web page rules

Input schemas

• Each web page W has an input schema identifying the types of inputs accepted by W– Inputs can be either text inputs or a choice from an

option list (to model pull-down menus, radio buttons, scroll-down lists etc.)

• Input is specified as an input schema containing constants (corresponding to text input, which is undefined until entered by the user) and relations (representing input option lists)

• For each input relation R the options generated by the web page is defined as an input rule

OptionsR(x) <- q(x) where q is a query on the database, state relations and inputs provided by the user at the previous step

Rules for States, Actions and Target pages

• These specify the tuples to be inserted or deleted from the state relations of S– Insertions: S(x) <- q(x)– Deletions: !S(x) <- q(x)

• Actions taken in response to user input– Such as sending an e-mail, an invoice, etc.– Specified as insertions to actions relations, like the state

insertion rules• Target web page rules

– Specify for each possible target page the condition under which the transition occurs

– The conditions are queries to the database, current state and current or previous user inputs

Semantics

• A web application WA produces a sequence of pages in response to user inputs, starting at the home page

• Transitions occur as follows:– Each web page generates input options corresponding

to its input schema– User responds by making one choice for each input

relation and providing values for the input constants– Then the web application takes the actions defined by

the action rules, updates the state based on the state insertion and deletion rules and moves to the web page based on the target rules

Semantics

• The content of the database, state relations, current web page current input choices and computed actions form a configuration of WA

• A run over a database instance D is a sequence of configurations C0, C1, C2, … where C0 is the initial configuration of he home page

• The database does not change during a run

An Example

• An e-commerce website for online computer shopping

• The web application consists of a set of pages:– HP: home page– RP: new user registration page– CP: customer page– LSP: laptop search page– PIP: displays products returned by he search– CC: allows the user to view the cart contents and order

items in it

Specification for LSP

Page LSP

Inputs: laptopsearch(ram,hdisk,display), button(x)

Input Rules:

Optionsbutton(x) <- x = “search” or x=“new-cart” or x=“logout”

Optionslaptopsearch(r,h,d) <- criteria(“laptop”, “ram”, r) and

criteria(“laptop”,”hdd”, h) and criteria(“laptop”, “display”, d)

State Rules:

userchoice(r,h,d) <- laptopsearch(r,h,d) and button(“search”)

Target Web Page Rules:

HP <- button(“logout”)

PIP <- exists r, h, d laptopsearch(r,h,d) and button(“search”)

C <- button(“view cart”)

End Page

Specification of the properties

• Properties are specified in LTL-FO using temporal operators G, F, X, U, B where– p B q holds if either q never holds or it eventually does

and p must hold sometime before q becomes true

• An example LTL-FO formula:

for all x, y, id ( pay(id,x,y) and price(x,y) B ship(id,x) )

• Input bounded LTL-FO formulas: all quantified variables range over values from user inputs– Given an input bounded web application WA and an

input-bounded LTL-FO formula f, it is decidable to determine if WA satisfies f

Finite State LTL model checking

• Given an LTL property one can construct a Buchi automaton (an automaton that accepts infinite runs) that accepts precisely the sequences that satisfy the LTL property

• Finite state LTL model checking algorithm (used in Spin)– Given a transition system and a LTL property, construct

the Buchi automaton for the negated LTL property– Search for accepting runs in the product automaton that

corresponds to the product of the transition system and the negation of the input property. If such a run exists then the property is violated and the accepting run is the counter example

– An accepting run is always an accepting cycle reachable from an initial state. Search for an accepting run can be done using a nested depth first search

Finite State LTL Model Checking

• Finite state LTL model checking works because during the nested depth first search we eventually run out of new states to visit– There are only finitely many states to visit

• We either find a counter-example path and conclude that the property is violated or

• When there are no more new states left to visit we conclude that there are no counter-example paths, hence the property is satisfied

• This approach does not work for an infinite state system since we are not able to visit all states– For the web application verification problem, the issue is

that there are an infinite number of databases that should be searched

The Verification Approach in Wave

• An earlier result by the authors show that to verify a web application WA with respect to a LTL-FO property f it is sufficient to inspect only a finitely many runs– These are runs on a finite set of databases constructed

over a domain which depends only on the specification and the property

• Then, it is possible to use a nested-depth first search algorithm– For each representative database, execute a nested-

depth first search– Note that the only nondeterminism in a web application

model is due to user input, so the search algorithm can try all possible user inputs

The Verification Approach in Wave

• The number of representative databases to be investigated is exponential in the size of the web application specification and the property

• Since we also have to search for all combinations of user choices, the resulting algorithm ends up being double-exponential

• The first optimization of the algorithm constructs the representative database lazily– at each step in the run only construct the portions of he

database that is required for evaluating the rules and the property

• The second optimization prunes irrelevant configurations based on a dataflow analysis

• When the two optimization are combined the verification takes a few seconds

Further Optimizations

• Representing database configurations efficiently using bitmaps

• Detecting the visited configurations efficiently using a trie data structure (linear time update and membership tests in the size of the bitmap)

• The database configurations are stored in a database using a DBMS for efficient evaluation of the schema rules

• Efficient translation between the bitmap representation and the database tables

• Efficient evaluation of FO queries in page schema rules• Efficient evaluation of FO property components• Using an efficient DBMS

Experiments

• They experimented with the following representative web applications– An online computer shopping application (based on Dell

web site)– A specification of a sports website (based on Motorcycle

Grand Prix web site)– An airline reservation site (based on Expedia web site)– A specification of a book shopping applications (based

on Barnes & Noble web site)

Property Categories

• Sequence: p B q• Session: Gp => Gq• Correlation: Fp => Fq• Response: p => F q• Reachability: G p or F q• Progress: G(F p)• Strong non-progress: F(G p)• Weak non-progress: G(p => X p)• Guarantee: F p• Invariance: G p

Example properties

• F HP: The home page is eventually reached in all runs – This is trivially true since all runs start at the home page

• Any confirmed product was previously paid for• An order must have status ordered before it can be

cancelled • If a product is added into the cart, then the user must

eventually view the details of this product• Every run mus reach the error page EP and be trapped

there forever– This property should not hold

• Wave verifies these properties (and others)– The maximum verification time is 4 seconds

Wave Tool

• The tool consists of four modules– Specification module: the input can either be a text file in

the input language of Wave tool or it can be a WebML specification that can be translated to a Wave specification using a WebML import module

– Verification module: Implements the verification algorithm

– Explanation module: Prints out the counter-example runs in an informative way

– Code generation module: Automatically generates JSP pages from the input specification

Extensions

• The authors extended the verification approach used in Wave to verification of compositions of web services

• Message based asynchronous communication– Flat & nested messages– Rules can refer to messages currently read and sent

• They consider asynchronous communication with FIFO channels– In order to obtain decidability they have to bound the

channel sizes• They also investigated modular verification of service

compositions where only a subset of the peers are analyzed