14
Towards Automatic Resource Bound Analysis for OCaml Jan Hoffmann Shu-Chun Weng Yale University {jan.hoffmann, shu-chun.weng}@yale.edu Abstract Understanding the resource usage of programs is crucial for devel- oping software that is safe, secure, and efficient. Consequently, there is ongoing interest in the development of techniques that provide software developers with support for inferring resource bounds at compile time. This article introduces a new resource analysis system for OCaml programs. The system automatically derives worst-case resource bounds for higher-order, polymorphic programs with side- effects and user-defined inductive types. The technique is parametric in the resource and can derive bounds for time, memory, and energy usage. The derived bounds are multivariate resource polynomials that are functions of different size parameters that depend on the standard OCaml types. Bound inference is fully automatic and re- duced to a standard linear optimization problem that is passed to an off-the-shelf LP solver. Technically, the analysis system is based on a novel multivariate automatic amortized resource analysis (AARA). It builds on existing work on linear AARA for higher-order pro- grams with user-defined inductive types and on multivariate AARA for first-order programs with built-in lists and binary trees. For the first time, it is possible to automatically derive polynomial bounds for higher-order functions and polynomial bounds that depend on user-defined inductive types. Moreover, the analysis handles side effects and even outperforms the linear bound inference of previous systems. At the same time, it preserves the expressivity and effi- ciency of existing AARA techniques. The practicality of the analysis system is demonstrated with an implementation and the integration with Inria’s OCaml compiler. In a case study, the system infers bounds on the number of queries that are sent by OCaml programs to DynamoDB, a commercial NoSQL cloud database service. 1. Introduction The quality of software crucially depends on the amount of resources —such as time, memory, and energy—that are required for its exe- cution. Statically understanding and controlling the resource usage of software continues to be a pressing issue in software develop- ment. Performance bugs are very common and among the bugs that are most difficult to detect [40, 46] and large software systems are plagued by performance problems. Moreover, many security vulnerabilities exploit the space and time usage of software [21, 42]. Developers would greatly profit from high-level resource-usage information in the specifications of software libraries and other interfaces, and from automatic warnings about potentially high- resource usage during code review. Such information is particularly relevant in contexts of mobile applications and cloud services, where resources are limited or resource usage is a major cost factor. Recent years have seen fast progress in developing frameworks for statically reasoning about the resource usage of programs. Many advanced techniques for imperative integers programs apply abstract interpretation to generate numerical invariants. The obtained size- change information forms the basis for the computation of actual bounds on loop iterations and recursion depths; using counter instrumentation [27], ranking functions [2, 6, 15, 48], recurrence relations [3, 4], and abstract interpretation itself [18, 54]. Automatic resource analysis techniques for functional programs are based on sized types [50], recurrence relations [23], term-rewriting [10], and amortized resource analysis [32, 34, 41, 47]. Despite major steps forward, there are still many obstacles to overcome to make resource analysis technologies available to developers. On the one hand, typed functional programs are particularly well-suited for automatic resource-bound analysis since the use of pattern matching and recursion often results in a relatively regular code structure. Moreover, types provide detailed information about the shape of data structures. On the other hand, existing automatic techniques for higher-order programs can only infer linear bounds [41, 50]. Furthermore, techniques that can derive polynomial bounds are limited to bounds that depend on predefined lists and binary trees [29, 32] or integers [15, 48]. Finally, resource analyses for functional programs have been implemented for custom languages that are not supported by mature tools for compilation and development [32, 34, 41, 47, 50]. The goal of a long term research effort is to overcome these obstacles by developing Resource Aware ML (RAML), a resource- aware version of the functional programming language OCaml. RAML is based on an automatic amortized resource analysis (AARA) that derives multivariate polynomials that are functions of the sizes of the inputs. In this paper, we report on three main contributions that are part of this effort. 1. We present the first implementation of an AARA that is inte- grated with an industrial-strength compiler. 2. We develop the first automatic resource analysis system that infers multivariate polynomial bounds that depend on size parameters of complex user-defined data structures. 3. We present the first AARA that infers polynomial bounds for higher-order functions. The techniques we develop are not tied to a particular resource but are parametric in the resource of interest. RAML infers tight bounds for many complex example programs such as sorting algorithms with complex comparison functions, Dijkstra’s single-source shortest- path algorithm, and the most common higher-order functions such as (sequences) of nested maps, and folds. The technique is naturally compositional, tracks size changes of data across function bound- aries, and can deal with amortization effects that arise, for instance, from the use of a functional queue. Local inference rules generate linear constraints and reduce bound inference to off-the-shelf LP solving, despite deriving polynomial bounds. To ensure compatibility with OCaml’s syntax, we reuse the parser and type inference engine from Inria’s OCaml compiler. We extract a type-annotated syntax tree to perform (resource preserving) code transformations and the actual resource-bound analysis. To precisely model the evaluation of OCaml, we introduce a novel operational semantics that makes the efficient handling of function closures in Draft 1 2015/7/27

Towards Automatic Resource Bound Analysis for OCamlcs- · 2015. 7. 27. · Towards Automatic Resource Bound Analysis for OCaml Jan Hoffmann Shu-Chun Weng Yale University fjan.ho mann,

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

  • Towards Automatic Resource Bound Analysis for OCaml

    Jan Hoffmann Shu-Chun WengYale University

    {jan.hoffmann, shu-chun.weng}@yale.edu

    AbstractUnderstanding the resource usage of programs is crucial for devel-oping software that is safe, secure, and efficient. Consequently, thereis ongoing interest in the development of techniques that providesoftware developers with support for inferring resource bounds atcompile time. This article introduces a new resource analysis systemfor OCaml programs. The system automatically derives worst-caseresource bounds for higher-order, polymorphic programs with side-effects and user-defined inductive types. The technique is parametricin the resource and can derive bounds for time, memory, and energyusage. The derived bounds are multivariate resource polynomialsthat are functions of different size parameters that depend on thestandard OCaml types. Bound inference is fully automatic and re-duced to a standard linear optimization problem that is passed to anoff-the-shelf LP solver. Technically, the analysis system is based ona novel multivariate automatic amortized resource analysis (AARA).It builds on existing work on linear AARA for higher-order pro-grams with user-defined inductive types and on multivariate AARAfor first-order programs with built-in lists and binary trees. For thefirst time, it is possible to automatically derive polynomial boundsfor higher-order functions and polynomial bounds that depend onuser-defined inductive types. Moreover, the analysis handles sideeffects and even outperforms the linear bound inference of previoussystems. At the same time, it preserves the expressivity and effi-ciency of existing AARA techniques. The practicality of the analysissystem is demonstrated with an implementation and the integrationwith Inria’s OCaml compiler. In a case study, the system infersbounds on the number of queries that are sent by OCaml programsto DynamoDB, a commercial NoSQL cloud database service.

    1. IntroductionThe quality of software crucially depends on the amount of resources—such as time, memory, and energy—that are required for its exe-cution. Statically understanding and controlling the resource usageof software continues to be a pressing issue in software develop-ment. Performance bugs are very common and among the bugsthat are most difficult to detect [40, 46] and large software systemsare plagued by performance problems. Moreover, many securityvulnerabilities exploit the space and time usage of software [21, 42].

    Developers would greatly profit from high-level resource-usageinformation in the specifications of software libraries and otherinterfaces, and from automatic warnings about potentially high-resource usage during code review. Such information is particularlyrelevant in contexts of mobile applications and cloud services, whereresources are limited or resource usage is a major cost factor.

    Recent years have seen fast progress in developing frameworksfor statically reasoning about the resource usage of programs. Manyadvanced techniques for imperative integers programs apply abstractinterpretation to generate numerical invariants. The obtained size-change information forms the basis for the computation of actualbounds on loop iterations and recursion depths; using counter

    instrumentation [27], ranking functions [2, 6, 15, 48], recurrencerelations [3, 4], and abstract interpretation itself [18, 54]. Automaticresource analysis techniques for functional programs are based onsized types [50], recurrence relations [23], term-rewriting [10], andamortized resource analysis [32, 34, 41, 47].

    Despite major steps forward, there are still many obstaclesto overcome to make resource analysis technologies availableto developers. On the one hand, typed functional programs areparticularly well-suited for automatic resource-bound analysis sincethe use of pattern matching and recursion often results in a relativelyregular code structure. Moreover, types provide detailed informationabout the shape of data structures. On the other hand, existingautomatic techniques for higher-order programs can only inferlinear bounds [41, 50]. Furthermore, techniques that can derivepolynomial bounds are limited to bounds that depend on predefinedlists and binary trees [29, 32] or integers [15, 48]. Finally, resourceanalyses for functional programs have been implemented for customlanguages that are not supported by mature tools for compilationand development [32, 34, 41, 47, 50].

    The goal of a long term research effort is to overcome theseobstacles by developing Resource Aware ML (RAML), a resource-aware version of the functional programming language OCaml.RAML is based on an automatic amortized resource analysis(AARA) that derives multivariate polynomials that are functionsof the sizes of the inputs. In this paper, we report on three maincontributions that are part of this effort.

    1. We present the first implementation of an AARA that is inte-grated with an industrial-strength compiler.

    2. We develop the first automatic resource analysis system thatinfers multivariate polynomial bounds that depend on sizeparameters of complex user-defined data structures.

    3. We present the first AARA that infers polynomial bounds forhigher-order functions.

    The techniques we develop are not tied to a particular resource butare parametric in the resource of interest. RAML infers tight boundsfor many complex example programs such as sorting algorithms withcomplex comparison functions, Dijkstra’s single-source shortest-path algorithm, and the most common higher-order functions suchas (sequences) of nested maps, and folds. The technique is naturallycompositional, tracks size changes of data across function bound-aries, and can deal with amortization effects that arise, for instance,from the use of a functional queue. Local inference rules generatelinear constraints and reduce bound inference to off-the-shelf LPsolving, despite deriving polynomial bounds.

    To ensure compatibility with OCaml’s syntax, we reuse the parserand type inference engine from Inria’s OCaml compiler. We extracta type-annotated syntax tree to perform (resource preserving) codetransformations and the actual resource-bound analysis. To preciselymodel the evaluation of OCaml, we introduce a novel operationalsemantics that makes the efficient handling of function closures in

    Draft 1 2015/7/27

  • Inria’s compiler explicit. The semantics is complemented by a newtype system that refines function types.

    To express a wide range of bounds, we introduce a novel classof multivariate resource polynomials that map data of a given typeto a non-negative number. These novel multivariate resource poly-nomials are a substantial generalization of the resource polynomialsthat have been previously defined for lists and binary trees [32]. Todeal with realistic OCaml code, we develop a novel multivariateAARA that handles higher-order functions. To this end, we drawinspirations from multivariate AARA for first-order programs [32]and linear AARA for higher-order programs [41]. However, our newsolution is more than the combination of existing techniques. Forinstance, we infer linear bounds for the curried append function forlists, which has not been possible previously [41].

    We performed experiments on more then 3000 lines of OCamlcode. While it is still not straightforward to automatically analyzecomplete existing applications, it is easy to develop and analyzereal OCaml applications if we keep the current capabilities of thesystem in mind. In Section 8, we present a case study in which weautomatically bound the number of queries that an OCaml programissues to Amazon’s DynamoDB NoSQL cloud database service.Such bounds are interesting since Amazon charges DynamoDBusers based on the number of queries made to a database.

    2. OverviewBefore we describe the technical development, we give a shortoverview of the challenges and achievements of our work.Currying and Function Closures. Currying and function closurespose a challenge to automatic resource analysis systems that hasnot been addressed in the past. To see why, assume that we wantto design a type system to verify resource usage. Now consider forexample the curried append function which has the type append :α list Ñ α list Ñ α list in OCaml. At first glance, we might saythat the time complexity of append is Opnq if n is the length of thefirst argument. But a closer inspection of the definition of appendreveals that this is a gross simplification. In fact, the complexity ofthe partial function call app par = append ` is constant. Moreover,the complexity of the function app par is linear—not in the lengthof the argument but in the length of the list ` that is captured in thefunction closure. We are not aware of any existing approach that canautomatically derive a worst-case time bound for the curried appendfunction. For example, previous AARA systems would fail withoutderiving a bound [32, 41].

    In Inria’s OCaml implementation, the situation is even morecomplex since the resource usage (time and space) depends on howa function is used at its call sites. If append is partially applied to oneargument then a function closure is created as expected. However—and this is one of the reasons of OCaml’s great performance—ifappend is applied to both of its arguments at the same time thenthe intermediate closure is not created and the performance of thefunction is even better than that of the curried version since we donot have to create a pair before the application.

    To model the resource usage of curried functions accurately werefine function types to capture how functions are used at their callsites. For example, append can have both of the following types

    α list Ñ α list Ñ α list and rα list, α lists Ñ α list .The first type implies that the function is partially applied and thesecond type implies that the function is applied to both argumentsat the same time. Of course, it is possible that the function has bothtypes (technically we achieve this using let polymorphism). For thesecond type, our system automatically derives tight time and spacebounds that are linear in the first argument. However, our systemfails to derive a bound for the first type. The reason is that we madethe design decision to not derive bounds that asymptotically depend

    on data captured in function closures to keep the complexity of thesystem at a manageable level.

    Fortunately, append belongs to a large set of OCaml functions inthe standard library that is defined in the from let rec f x y z = e. Ifsuch a function is partially applied, the only computation that hap-pens is the creation of a closure. As a result, eta expansion does notchange the resource behavior of programs. This means for examplethat we can safely replace the expression let app par = append ` in ewith the expression let app par x = append ` x in e prior the analy-sis. Consequently, we can always use the type rα list, α lists Ñα list of append that we can successfully analyze.

    The conditions under which functions can be analyzed mightlook complex at first but they can be boiled down to simple principle:

    The worst-case resource usage of a function must be express-ible as a function of the sizes of its arguments.

    Higher-Order Arguments. The other main challenge with higher-order resource analysis is functions with higher-order arguments. Toa large extend, this problem has been successfully solved for linearresource bounds in previous work [41]. Basically, the higher-ordercase is reduced to the first-order case if the higher-order argumentsare available. It is not necessary to reanalyze such higher-orderfunctions for every call site since we can abstract the resourceusage with a constraint system that has holes for the constraintsof the function arguments. However, a presentation of the systemin such a way mixes type checking with the constraint-based typeinference. Therefore, we chose to present the analysis system in amore declarative way in which the bound of a function with higher-order arguments is derived with respect to a given set of resourcebehaviors of the argument functions.

    A concrete advantage of our declarative view is that we canderive a meaningful type for a function like map for lists even whenthe higher-order argument is not available. The function map canhave the following types.

    pαÑ βq Ñ α list Ñ β list rαÑ β, α lists Ñ β listUnlike append, the resource usage of map does not depend on thesize of the first argument. So both types are equivalent in our systemexcept for the cost of creating an intermediate closure. If the higher-order argument is not available then previous systems [41] producea constraint system that is not meaningful to a user. An innovationin this work is that we are also able to report a meaningful resourcebound for map if the arguments are not available. To this end, weassume that the argument function does not consume resources. Forexample, we report in the case of map that the number of evaluationsteps needed is 11n ` 3 and the number of heap cells needed is4n`2 where n is the length of the input list. Such bounds are usefulfor two purposes. First, a developer can see the cost that map itselfcontributes to the total cost of a program. Second, the time bound formap proves that map is guaranteed to terminate if the higher-orderargument terminates for every input.

    In contrast, consider the function rec scheme : pα list Ñα listq Ñ α list Ñ β list that is defined as follows.

    let rec rec_scheme f l =match l with | [] Ñ []

    | x::xs Ñ rec_scheme f (f l);;let g = rec_scheme tail;;

    Here, RAML is not able to derive an evaluation-step bound forrec scheme since the number of evaluation steps (and even termina-tion) depends on the argument f . However, RAML derives the tightevaluation-step bound 12n` 7 for the function g.Polynomial Bounds and Inductive Types. Existing AARA sys-tems are either limited to linear bounds [34, 41] or to polynomialbounds that are functions of the sizes of simple predefined list and

    Draft 2 2015/7/27

  • let comp f x g = fun z Ñ f x (g z)

    let rec walk f xs =match xs with | [] Ñ (fun z Ñ z)

    | x::ys Ñ match x with | Left _ Ñfun y Ñ comp (walk f) ys (fun z Ñ x::z) y

    | Right l Ñlet x’ = Right (quicksort f l) infun y Ñ comp (walk f) ys (fun z Ñ x’::z) y

    let rev_sort f l = walk f l []

    RAML output for rev sort (after 0.68s run time):

    10 + 23*K*M + 32*L*M + 20*L*M*Y + 13*L*M*Y^2where

    M is the num. of ::-nodes of the 2nd comp. of the arg.L is the fraction of Right-nodes in the ::-nodes of

    the 2nd component of the argumentY is the maximal number of ::-nodes in the Right-nodes

    in the ::-nodes of the 2nd component of the arg.K is the fraction of Left-nodes in the ::-nodes of the

    2nd component of the argument

    Figure 1. Modified challenge example from Avanzini et al. [10]and shortened output of the automatic bound analysis performed byRAML for the function rev sort. The derived bound is a tight boundon the number of evaluation steps in the big-step semantics if we donot take into account the cost of the higher-order argument f.

    binary-tree data structures [32]. In contrast, this work presents thefirst analysis that can derive polynomial bounds that depend on sizeparameters of complex user-defined data structures.

    The bounds we derive are multivariate resource polynomials thatcan take into account individual sizes of inner data structures. Whileit is possible to simplify the resource polynomials in the user output,it is essential to have this more precise information for intermediateresults to derive tight whole-program bounds.

    In general, the resource bounds are built of functions that countthe number of specific tuples that one can form from the nodesin a tree-like data structure. In their simplest form (i.e., withoutconsidering the data stored inside the nodes), they have the form

    λa.|t~a | ai is an Aki -node in a and if i ă j then ai ăapre aju|

    where a is an inductive data structure with constructorsA1, . . . , Am,~a “ pa1, . . . , anq, and ăapre denotes the pre-order on the tree a.We are able to keep track of changes of these quantities in patternmatches and data construction fully automatically by generatinglinear constraints. At the same time, they allow us to accuratelydescribe the resource usage of many common functions in thesame way it has been done previously for simple types [28]. Asan interesting special case, we can also derive conditional boundsthat describe the resource usage as a conditional statement. Forinstance, for an expression such as

    match x with | True Ñ quicksort y | False Ñ y

    we derive a bound that is quadratic in the length of y if and only if xis True.

    Effects. Our analysis handles references and arrays by ensuringthat resource cost does not asymptotically depend on values thathave been stored in mutable cells. While it has been shown thatit is possible to extend AARA to handle mutable state [17], wedecided not to add the feature in the current system to focus onthe presentation of the main contributions. There are still a lot ofpossible interactions with mutable state, such as storing functions inreferences.

    Example Bound Analysis. To demonstrate some of the capabili-ties of the new analysis system, Figure 1 shows the output of RAMLfor a concrete example. The code is an adoption of a challengingexample that has been recently presented by Avanzini et al. [10] asa function that can not be handled by existing tools. To illustrate thechallenges of resource analysis for higher-order programs, Avanziniet al. implemented a (somewhat contrived) reverse function rev forlists using higher-order functions. RAML can automatically derivea tight linear bound on the number of evaluation steps used by rev.

    To show more features of our analysis, we modified Avanzini etal.’s rev in Figure 1 by adding an additional argument f and a patternmatch to the definition of the function walk. The resulting type ofwalk ispαÑ αÑ boolq Ñ rpβ ˚ α listq either list; pβ ˚ α listq either lists

    Ñ pβ ˚ α listq either listLike before the modification, walk is essentially the append reversefunction for lists. However, we assume that the input lists containnodes of the form Left a or Right b so that b is a list. During thereverse process of the first list in the argument, we sort each list thatis contained in a Right-node using the standard implementation ofquick sort (not given here). RAML derives the tight evaluation-stepbound that is shown in Figure 1. Since the comparison function forquicksort (argument f) is not available, RAML assumes that it doesnot consume any resources during the analysis. If rev sort is appliedto a concrete argument f then the analysis is repeated to derive abound for this instance.

    3. Setting the StageWe describe and formalize the new resource analysis using CoreRAML, a subset of the intermediate language that we use to performthe analysis. Expressions in Core RAML are in share-let-normalform, which means that syntactic forms allow only variables insteadof arbitrary terms whenever possible without restricting expressivity.We automatically transform user-level OCaml programs to CoreRAML without changing their resource behavior before the analysis.Syntax. For the purpose of this article, the syntax of Core RAMLexpressions is defined by the following grammar. The actual coreexpressions also contain constants and operators for primitivedata types such as integer, float, and boolean; arrays and built-inoperations for arrays; conditionals; and free versions of syntacticforms. These free versions are semantically identical to the standardversions but do not contribute to the resource cost. This is neededfor the resource preserving translation of user-level code to share-let-normal form.

    e ::“ x | x x1 ¨ ¨ ¨xn | C x | λx.e | ref x | !x | x1 :“ x2| matchxwithC y Ñ e1 | e2| px1, . . . , xnq | matchxwith px1, . . . , xnq Ñ e| sharex as px1, x2q in e | letx “ e1 in e2 | let rec F in e

    F ::“ f “ λx.e | F1 andF2The syntax contains forms for variables, function application, dataconstructors, lambda abstraction, references, tuples, pattern match-ing, and (recursive) binding. For simplicity, we only allow recursivedefinitions of functions. In the function application we allow theapplication of several arguments at once. This is useful to staticallydetermine the cost of closure creation but also introduces ambigu-ity. The type system will determine if an expression like f x1 x2is parsed as pf x1 x2q or pf x1q x2. The sharing expressionssharex as px1, x2q in e is not standard and used to explicitly intro-duce multiple occurrences of a variable. It binds the free variablesx1 and x2 in e.

    We focus on this set of language features since it is sufficient topresent the main contributions of our work. We sometimes take the

    Draft 3 2015/7/27

  • V pxq “ `¨, V,H M$x ó p`,Hq |M var

    (E:VAR)S ‰ ¨ HpV pxqq “ pλx.e, V 1q S, V 1, H M$λx.e ó w | pq, q1q

    S, V,H M$x ó w |M var¨pq, q1q(E:VARAPP)

    S, V,H M$ e ó ˝ | 0(E:ABORT)

    V px1q:: ¨ ¨ ¨ ::V pxnq, V,H M$x ó w | pq, q1q S “ ¨ _ w P tK, ˝uS, V,H M$x x1 ¨ ¨ ¨xn ó w |M appn ¨pq, q1q

    (E:APP)

    S ‰ ¨ V px1q:: ¨ ¨ ¨ ::V pxnq, V,H M$x ó pλx.e, V 1q | pq, q1q S, V 1, H M$λx.e ó w | pp, p1qS, V,H M$x x1 ¨ ¨ ¨xn ó w |M appn ¨pq, q1q¨pp, p1q

    (E:APPAPP)

    S, V rx ÞÑ `s, H M$ e ó w | pq, q1q`::S, V,H M$λx.e ó w |Mbind¨pq, q1q

    (E:ABSBIND)¨, V,H M$ e1 ó w | pq, q1q w P tK, ˝uS, V,H M$ letx “ e1 in e2 ó w |M let1 ¨pq, q1q

    (E:LET1)

    H 1 “ H, ` ÞÑ pλx.e, V q¨, V,H M$λx.e ó p`,H 1q |M abs

    (E:ABSCLOS)¨, V,H M$ e1 ó p`,H 1q | pq, q1q S, V rx ÞÑ `s, H 1M$ e2 ó w | pp, p1q

    S, V,H M$ letx “ e1 in e2 ó w |M let1 ¨pq, q1q¨M let2 ¨pp, p1q¨w(E:LET2)

    F fi f1 “ λx1.e1 and ¨ ¨ ¨ and fn “ λxn.en V 1 “ V rf1 ÞÑ `1, . . . , fn ÞÑ `nsH 1 “ H, `1 ÞÑ pλx1.e1, V 1q, . . . , `n ÞÑ pλxn.en, V 1q S, V 1, H 1M$ e0 ó w | pq, q1q

    S, V,H M$ let recF in e0 ó w |M rec¨pq, q1q¨w(E:LETREC)

    Figure 2. Selected rules of the operational big-step semantics.

    liberty to describe examples in user level syntax and to use featuressuch as built-in data types that are not described in this article.

    Big-Step Operational Cost Semantics. The resource usage ofRAML programs is defined by a big-step operational cost semantics.The semantics has three interesting non-standard features. First, itmeasures (or defines) the resource consumption of the evaluationof an RAML expression by using a resource metric that defines aconstant cost for each evaluation step. If this cost is negative thenresources are returned. Second, it models terminating and divergingexecutions by inductively describing finite subtrees of infiniteexecution trees. Third, it models OCaml’s stack-based mechanismfor function application, which avoids creation of intermediatefunction closures.

    The semantics of Core RAML is formulated with respect to astack (to store arguments for function application), an environment,and a heap. Let Loc be an infinite set of locations modeling memoryaddresses. A heap is a finite partial mapping H : Loc á Val thatmaps locations to values. An environment is a finite partial mappingV : Var á Loc from variable identifiers to locations. An argumentstack S ::“ ¨ | `::S is a finite list of locations. The set of RAMLvalues Val is given by

    v ::“ ` | p`1, . . . , `kq | pλx.e, V q | pC, `q

    A value v P Val is either a location ` P Loc, a tuple of locationsp`1, . . . , `kq, a function closure pλx.e, V q, or a node of a datastructure pC, `q where C is a constructor and ` is a location. In afunction closure pλx.e, V q, V is an environment, e is an expression,and x is a variable.

    Since we also consider resources like memory that can becomeavailable during an evaluation, we have to track the watermarkof the resource usage, that is, the maximal number of resourceunits that are simultaneously used during an evaluation. To derivea watermark of a sequence of evaluations from the watermarks ofthe sub evaluations one has to also take into account the number ofresource units that are available after each sub evaluation.

    The big-step operational evaluation rules in Figure 2 are formu-lated with respect to a resource metric M . They define an evaluation

    judgment of the form

    S, V,H M$ e ó p`,H 1q | pq, q1q .It expresses the following. If the argument stack S, the environmentV , and the initial heap H are given then the expression e evaluatesto the location ` and the new heap H 1. The evaluation of e needsq P Q`0 resource units (watermark) and after the evaluation there areq1 P Q`0 resource units available. The actual resource consumptionis then δ “ q ´ q1. The quantity δ is negative if resources becomeavailable during the execution of e.

    There are two other behaviors that we have to express in thesemantics: failure (i.e., array access outside array bounds) anddivergence. To this end, our semantic judgement not only evaluatesexpressions to values but also to an error K and to incompletecomputations expressed by ˝. The judgement has the general formS, V,H M$ e ó w | pq, q1q where w ::“ p`,Hq | K | ˝ .

    Intuitively, this evaluation statement expresses that the watermark ofthe resource consumption after some number of evaluation steps is qand there are currently q1 resource units left. A resource metricM : K ˆ N Ñ Q defines the resource consumption in eachevaluation step of the big-step semantics where K is a set ofconstants. We write Mkn for Mpk, nq and Mk for Mpk, 0q.

    It is handy to view the pairs pq, q1q in the evaluation judgmentsas elements of a monoid Q “ pQ`0 ˆQ`0 , ¨q. The neutral elementis p0, 0q, which means that resources are neither needed beforethe evaluation nor returned after the evaluation. The operationpq, q1q ¨ pp, p1q defines how to account for an evaluation consistingof evaluations whose resource consumptions are defined by pq, q1qand pp, p1q, respectively. We define

    pq, q1q ¨ pp, p1q “"

    pq ` p´ q1, p1q if q1 ď ppq, p1 ` q1 ´ pq if q1 ą p

    If resources are never returned (as with time) then we only haveelements of the form pq, 0q and pq, 0q ¨ pp, 0q is just pq ` p, 0q.We identify a rational number q with an element of Q as follows:q ě 0 denotes pq, 0q and q ă 0 denotes p0,´qq. This notationavoids case distinctions in the evaluation rules since the constantsK that appear in the rules can be negative. In the semantic rules

    Draft 4 2015/7/27

  • we use the notation H 1 “ H, ` ÞÑ v to indicate that ` R dompHq,dompH 1q “ dompHq Y t`u, H 1p`q “ v, and H 1pxq “ Hpxq forall x ‰ `.

    To model the treatment of function application in Inria’s OCamlcompiler, we use a stack S on which we store the locations offunction arguments. The only rules that push locations to S areE:APP and E:APPAPP. To pop locations from the stack we modifythe leaf rules that can return a function closure, namely, the rulesE:VAR and E:ABS for variables and lambda abstractions: Wheneverwe would return a function closure pλx.e, V q we inspect theargument stack S. If S contains a location ` then we pop it formthe stack S, bind it to the argument x, and evaluate the functionbody e in the new environment V rx ÞÑ `s. This is defined by therule E:ABSBIND and indirectly by the rule E:VARAPP. Anotherrule that modifies the argument stack is E:LET2. Here, we evaluatethe subexpression e1 with an empty argument stack because thearguments on the stack when evaluating the let expressions areconsumed by the result of the evaluation of e2.

    The argument stack accurately captures Inria’s OCaml compiler’sbehavior to avoid the creation of intermediate function closures. Italso extends naturally to the evaluation of expressions that are notin share-let-normal form. As we will see in Section 6, the argumentstack is also necessary to prove the soundness of the multivariateresource bound analysis.

    Another important feature of the big-step semantics, is that itcan model failing and diverging evaluations by allowing partialderivation judgments that can be used to derive the resource usageafter n steps. Technically, this is realized by the rule E:ABORTwhich can be applied at any point to abort the current evaluationwithout additional resource cost. The mechanism of abording anevaluation is most visible in the rules E:LET1 and E:LET2: Duringthe evaluation of a let expression we have two possibilties. The firstpossibility is that the evaluation of the subexpression e1 is abortedusing E:ABORT at some point. We can then apply the rule E:LET1to pass on the resource usage before the abort. The second possibilityis that e1 evaluates to a location `. We can then apply the E:LET2to bind ` to the variable x and evaluate the expression e2.

    4. Simple Type SystemIn this section, we introduce a type system that is a refinement ofOCaml’s type system. In this type system, we mirror the resource-aware type system and introduce some particularities that explainfeatures of the resource-aware types. For the purpose of this article,we define simple types as follows.

    T ::“ X | T ref | T1 ˚ ¨ ¨ ¨ ˚ Tn | rT1, . . . , Tns Ñ T| µX. 〈C1 : T1˚Xn1 , . . . , Ck : Tk˚Xnk 〉

    A (simple) type T is an uninterpreted type variable X P X , atype T ref of references of type T , a tuple type T1 ˚ ¨ ¨ ¨ ˚ Tn,a function type rT1, . . . , Tns Ñ T , or an inductive data typeµX. 〈C1 : T1˚Xn1 , . . . , Ck : Tk˚Xnk 〉.

    Two parts of this definition are non-standard and deserve furtherexplanation. First, bracket function types rT1, . . . , Tns Ñ T cor-respond to the standard function type T1 Ñ ¨ ¨ ¨ Ñ Tn Ñ T . Themeaning of rT1, . . . , Tns Ñ T is that the function is applied to itsfirst n arguments at the same time. The type T1 Ñ ¨ ¨ ¨ Ñ Tn Ñ Tindicates that the function is applied to its first n arguments one afteranother. These two uses of a function can result in a very differentresource behavior. For instance, in the latter case we have to createn´ 1 function closures. Also we have n different costs to accountfor: the evaluation cost after the first argument is present, the cost ofthe closure when the second argument is present, etc. Of course, it ispossible that a function is used in different ways in program. We ac-count for that with let polymorphism (see the following subsection).

    Also note that rT1, . . . , Tns Ñ T still describes a higher-order func-tion while T1 ˚ ¨ ¨ ¨ ˚ Tn Ñ T describes a first-order function withn arguments.

    Second, inductive types are required to have a particular form.This makes it possible to track cost that depends on size parametersof values of such types. It is of course possible to allow arbritaryinductive types and not to track such cost. Such an extension isstraighforward and we do not present it in this article.

    We assume that each constructor C P C is part of at mostone recursive type. Furthermore we assume that each recursivetype has at least one constructor. For an inductive type T “µX. 〈C1 : T1˚Xn1 , . . . , Ck : Tk˚Xnk 〉 we sometimes write T “〈C1 : pT1, n1q, . . . , Ck : pTk, nkq〉. We say that Ti is the node typeand ni is the branching number of the constructor Ci. The maximalbranching number n “ maxtn1, . . . , nku of the constructors is thebranching number of T .

    Let Polymorphism and Sharing. Modelling the design of theresource-aware type system, our simple type system is affine. Thatmeans that a variable in a context can be used at most once in anexpression. However, we enable multiple uses of a variable with thesharing expression sharex as px1, x2q in e that denotes that x canbe used twice in e using the (different) names x1 and x2. For inputprograms we allow multiple uses of a variable x an expression ein RAML. We then introduce sharing constructs, and replace theoccurrences of x in e with the new names before the analysis.

    Interestingly, this mechanism is closely related to let polymor-phism. To see this relation, first note that our type system is poly-morphic but that a value can only be used with a single type in anexpression. In practice, that would mean for instance that we have todefine a different map function for every list type. A simple and well-known solution to this problem that is often applied in practice islet polymorphism. In principle, let polymorphism replaces variableswith their definitions before type checking. For our map function itwould mean to type the expression rmap ÞÑ emapse instead of typingthe expression let map “ emap in e.

    In principle, it would be possible to treat sharing of variables in asimilar way as let polymorphism. But if we start form an expressionletx “ e1 in e2 and replace the occurrences of x in the expressione2 with e1 then we also change the resource consumption of theevaluation of e2 because we evaluate e1 multiple times. Interestingly,this problem coincides with the treatment of let polymorphism forexpressions with side effects (the so called value restriction).

    In RAML, we support let polymorphism for function closuresonly. Assume we have a function definition let f “ λx.ef in e thatis used twice in e. Then the usual approach to enable the analysis inour system would be to use sharing

    let f “ λx.ef in share f as pf1, f2q in e1 .

    To enable let polymorphism, we will however define f twice andensure that we only pay once for the creation of the closure and thelet binding:

    let f1 “ λx.ef in let f2 “ λx.ef in e1

    The functions f1 and f2 can now have different types. This methodcan cause an exponential blow up of the size of the expression. Itis nevertheless appealing because it enables us to treat resourcepolymorphism in the same way as let polymorphism.

    Type Judgements. Type judgements have the form

    Σ; Γ $ e : T

    where Σ “ T1, . . . , Tn is a list of types, Γ : Var á T is a typecontext that maps variables to types, e is a core expression, and Tis a (simple) type. The intuitive meaning (which is formalized laterin this section) is as follows. Given an evaluation environment that

    Draft 5 2015/7/27

  • ¨;x : T $ x : T (T:VAR) Σ;x : Σ Ñ T $ x : T (T:VARPUSH) ¨;x : rT1, . . . , TnsÑT, x1:T1, . . . , xn:Tn $ xx1 ¨ ¨ ¨xn : T(T:APP)

    Σ; Γ, x:T1 $ e : T2T1::Σ; Γ $ λx.e : T2

    (T:ABSPUSH)Σ;x : rT1, . . . , TnsÑΣÑT, x1 : T1, . . . , xn : Tn $ xx1 ¨ ¨ ¨xn : T

    (T:APPPUSH)

    Σ; Γ $ λx.e : T¨; Γ $ λx.e : Σ Ñ T (T:ABSPOP)

    T “ µX. 〈. . . C : U˚Xn . . .〉¨;x : U˚Tn $ C x : T (T:CONS)

    ¨; Γ1 $ e1 : T1 Σ; Γ2, x : T1 $ e2 : T2Σ; Γ1,Γ2 $ letx “ e1 in e2 : T2

    (T:LET)

    Σ; Γ $ e : BΣ; Γ, x:A $ e : B (T:WEAK)

    F fi f1 “ λx1.e1 and ¨ ¨ ¨ and fn “ λxn.en∆ “ f1 : T1, . . . , fn : Tn @i : ¨; Γi,∆ $ λxi.ei : Ti Σ; Γ0,∆ $ e : T

    Σ; Γ0, . . . ,Γn $ let recF in e : T(T:LETREC)

    Figure 3. Selected rules of the simple affine type system.

    matches the type context Γ and an argument stack that matches thetype stack Σ then e evaluates to a value of type T .

    The most interesting feature of the type judgements is thehandling of bracket function types rT1, . . . , Tns Ñ T . Even thoughfunction types can have multiple forms, a well-typed expressionhas often a unique type (in a given type context). This type isderived from the way a function is used. For instance, we haveλf.λx.λy.f x y : prT1, T2s Ñ T q Ñ T1 Ñ T2 Ñ T andλf.λx.λy.pf xq y : pT1 Ñ T2 Ñ T q Ñ T1 Ñ T2 Ñ T , andthe two function types are unique.

    A type T of an expression e has a unique type derivation thatproduces a type judgement ¨,Γ $ e : T with an empty typestack. We call this canonical type derivation for e and a closedtype judgement. If T is a function type Σ Ñ T 1 then there is asecond type derivation for e that we call an open type derivation. Itderives the open type judgement Σ; Γ $ e : T 1 where |Σ| ą 0. Thefollowing lemma can be proved by induction on the type derivations.

    Lemma 1. ¨; Γ $ e : Σ Ñ T if and only if Σ; Γ $ e : T .

    Open and canonical type judgements are not interchangeable.An open type judgement Σ; Γ $ e : T can only appear in aderivation with an open root of the form Σ1,Σ; Γ $ e : T , orin a subtree of a derivation whose root is a closed judgement of theform ¨; Γ $ e : Σ2,Σ Ñ T where |Σ2| ą 0. In other words, in anopen derivation Σ; Γ $ e : T , the expression e is a function thathas to be applied to n ą |Σ| arguments at the same time. In a giventype context and for a fixed function type, a well-typed expressionhas as most one open type derivation.

    Type Rules. Figure 3 presents selected type rules of the typesystem. As usual Γ1,Γ2 denotes the union of the type contexts Γ1and Γ2 provided that dompΓ1q X dompΓ2q “ H. We thus have theimplicit side condition dompΓ1qXdompΓ2q “ H whenever Γ1,Γ2occurs in a typing rule. Especially, writing Γ “ x1:T1, . . . , xk:Tkmeans that the variables xi are pairwise distinct.

    There is a close correspondence between the evaluation rules andthe type rules in the sense that every evaluation rule corresponds toexactly one type rule. (We view the two rules for pattern match andlet binding as one rule, respectively.) The type stack is modified bythe rules T:VARPUSH, T:APPPUSH, T:ABSPUSH, and T:ABSPOP.For every leaf rule that can return a function type, such as T:VAR,T:APP, and T:APPPOP, we add a second rule that derives theequivalent open type. The reason becomes clear in the resource-aware type system in Section 6. The rules that directly control theshape of the function types are T:ABSPUSH and T:ABSPOP forlambda abstraction. While the other rules are (deterministically)syntax driven, the rules for lambda abstraction introduce a non-

    deterministic choice. However, there is often only one possiblechoice depending on how the abstracted function is used.

    As mentioned, the type system is affine and every variable in acontext can at most be used once in the typed expression. Multipleuses have to be introduced explicitly using the rule T:SHARE. Theonly exception is the rule T:LETREC. Here we allow the use of thecontext ∆ in the body of all defined functions. The reason for this isapparent in the resource aware version: sharing of function types isalways possible without any restrictions.

    Well-Formed Environments. For each simple type T we induc-tively define a set JT K of values of type T . Our goal here is not toadvance the state of the art in denotational semantics but rather tocapture the tree structure of data structures stored on the heap. Tothis end, we distinguish mainly inductive types (possible inner nodesof the trees) and other types (leaves). For the formulation of typesoundness, we also require that function closures are well-formed.We simply interpret polymorphic data with the set of locations Loc.

    JXK “ LocJT refK “ tRpaq | a P JT Ku

    JΣ Ñ T K “ tpλx.e, V q | DΓ : H ( V :Γ^ ¨; Γ $ λx.e : ΣÑT uJT1 ˚¨ ¨ ¨˚ TnK “ JT1Kˆ ¨ ¨ ¨ ˆ JTnK

    JBK “ TrpBq if B “ 〈C1:pT1, n1q, . . . , Cn:pTk, nkq〉

    Here, T “ Trp〈C1:pT1, n1q, . . . , Cn:pTk, nkq〉q is the set of treesτ with node labels C1, . . . , Ck which are inductively defined asfollows. If i P t1, . . . , ku, ai P JTiK, and τj P T for all 1 ď j ď nithen Cipai, τ1, . . . , τniq P T .

    If H is a heap, ` is a location, A is a type, and a P JAK thenwe write H ( ` ÞÑ a :A to mean that ` defines the semanticvalue a P JAK when pointers are followed in H in the obvious way.The judgment is formally defined in Figure 4. For a heap H theremay exist different semantic values a and simple types A such thatH ( ` ÞÑ a :A . However, if we fix a simple type A and a heap Hthen there exists at most one value a such that H ( ` ÞÑ a :A .

    We write H ( ` :A to indicate that there exists a, necessarilyunique, semantic value a P JAK so that H ( v ÞÑ a :A . Anenvironment V and a heap H are well-formed with respect to acontext Γ if H ( V pxq : Γpxq holds for every x P dompΓq. Wethen writeH ( V : Γ. Similarly, an argument stack S “ `1, . . . , `nis well-formed with respect to a type stack Σ “ T1, . . . , Tn in heapH , written H ( S : Σ, if H ( `i :Ti for all 1 ď i ď n.

    Note that the rules in Figure 4 are interpreted coinductively. Thereason is that in the rule V:FUN, the location ` can be part of theclosure environment V if the closure has been created with the ruleE:LETREC. The influence of the coinductive definition on the proofsis minimal since all proofs in this article are by induction.

    Draft 6 2015/7/27

  • X P X ` P dompHqH ( ` ÞÑ ` : X

    (V:TVAR)

    Hp`q “ `1 H ( `1 ÞÑ a : TH ( ` ÞÑ Rpaq : T ref

    (V:REF)

    Hp`q “ pλx.e, V qDΓ : H ( V : Γ ^ ¨; Γ $ λx.e : Σ Ñ T

    H ( ` ÞÑ pλx.e, V q : Σ Ñ T(V:FUN)

    Hp`q “ p`1, . . . , `nq @i : H ( `i ÞÑ ai : TiH ( ` ÞÑ pa1, . . . , anq : T1 ˚ ¨ ¨ ¨ ˚ Tn

    (V:TUPLE)

    B “ µX. 〈. . . , C : T˚Xn, . . .〉Hp`q “ pC, `1q H ( `1 ÞÑ pa, b1, . . . , bnq : T˚Bn

    H ( ` ÞÑ Cpa, b1, . . . , bnq : B(V:CONS)

    Figure 4. Coinductively relating heap cells to semantic values.

    Type Preservation. Theorem 1 shows that the evaluation of a well-typed expression in a well-formed environment results in a well-formed environment.

    Theorem 1. If Σ; Γ $ e : T , H ( V : Γ, H ( S : Σ, andS, V,H M$ e ó p`,H 1q | pq, q1q then H 1 ( V : Γ, H 1 ( S : Σ,and H 1 ( ` : T .

    Theorem 1 is proved by induction on the evaluation judgement.

    5. Multivariate Resource PolynomialsIn this section we define the set of resource polynomials which is asearch space of our automatic resource bound analysis. A resourcepolynomial p : JT K Ñ Q`0 maps a semantic value of some simpletype T to a non-negative rational number.

    An analysis of typical polynomial computations operating ona list ra1, . . . , ans shows that it consists of operations that areexecuted for every k-tuple pai1 , . . . , aik q with 1 ď i1 ă ¨ ¨ ¨ ăik ď n. The simplest examples are linear map operations thatperform some operation for every ai. Other common examples aresorting algorithms that perform comparisons for every pair pai, ajqwith 1 ď i ă j ď n in the worst case.

    In this article, we generalize this observation to user-defined tree-like data structures. In lists of different node types with constructorsC1, C2 and C3, a linear computation is for instance often carriedout for all C1-nodes, all C2-nodes, or all C1 and C3 nodes. Ingeneral, a typical polynomial computation is carried out for alltuples pa1, . . . , akq such that ai is a list element with constructorCj for some j and ai appears in the list before ai`1 for all i.

    As in previous work, which considered binary trees, we willessentially interpret all tree-like data structures as lists with differentnodes by flattening them in pre-order. As a result, our resourcepolynomials only depend on the number of nodes of a certain kindin tree but not on structural measures like the height of the tree.To include the height into resource polynomials in a general way,we would need a way to express a maximum (or a choice) in theresource polynomials. We leave this for future research in favorof compositionality and modularity. In practice, it is useful thatthe potential of a data structure is invariant under changes in thestructure of the tree.Base Polynomials and Indices. In Figure 5, we define for eachsimple type T a set PpT q of functions p : JT K Ñ N that map valuesof type T to natural numbers. The resource polynomials for type Tare then given as non-negative rational linear combinations of thesebase polynomials.

    λa. 1 P PpT q@i : pi P PpTiq

    λ~a.ź

    i“1,...,kpipaiq P PpT1 ˚ ¨ ¨ ¨ ˚ Tkq

    B “ 〈C1 : pT1, n1q, . . . , Cm : pTk, nmq〉C “ rCj1 , . . . , Cjk s @i : pi P PpTjiqλ b.

    ÿ

    ~aPτBpC,bq

    ź

    i“1,...,kpipaiq P PpBq

    Figure 5. Defining the set PpT q of base polynomials for type T .

    ˚ P IpT q@j : Ij P IpTjq

    pI1, . . . , Ikq P IpT1 ˚ ¨ ¨ ¨ ˚ Tkq

    B “ 〈C1 : pT1, n1q, . . . , Ck : pTm, nmq〉 @i : Iji P IpTjiqr〈I1, Cj1〉 , . . . , 〈Ik, Cjk 〉s P IpBq

    Figure 6. Defining the set IpT q of indices for type T .

    Let B “ 〈C1 : pT1, n1q, . . . , Ck : pTk, nmq〉 be an inductivetype. Let C “ rCj1 , . . . , Cjk s be a list of B-constructors andb P JBK. We inductively define a set τBpC, bq of k-tuples asfollow: τBpC, bq is the set of k-tuples pa1, . . . , akq such thatCj1pa1,~b1q, . . . , Cjk pak,~bkq are nodes in the tree b P JBK andCj1pa1,~b1q ăpre ¨ ¨ ¨ ăpre Cjk pak,~bkq for the pre-order ăpre on b.

    Like in the lambda calculus, we use the notation λa. epaq for theanonymous function that maps an argument a to the natural numberthat is defined by the expression epaq. Every set PpT q contains theconstant function λa. 1. In the case of an inductive data type B thisarises also for C “ rs (one element sum, empty product).

    In Figure 6, we indicatively define for each simple type T a setof indices IpT q. For tuple types T1 ˚ ¨ ¨ ¨ ˚ Tk we identify the index‹ with the index p‹, . . . , ‹q. Similarly, we identify the index ‹ withthe index rs for inductive types.

    Let T be a base type. For each index i P IpT q, we define a basepolynomial pi : JT K Ñ N as follows.

    p‹paq “ 1ppI1,...,Ikqpa1, . . . , akq “

    ź

    j“1,...,kpIj pajq

    pr〈I1,C1〉,...,〈Ik,Ck〉spbq “ÿ

    ~aPτBprC1,...,Cks,bq

    ź

    j“1,...,kpIj pajq

    Examples. To illustrate the definitions, we construct the set ofbase polynomials for different data types. It is handy to use the unittype that is treated like a type variable X in the previous definitionsbut only has a single semantic value, that is, JunitK “ tpqu.

    We first consider the inductive type singleton that has only oneconstructor without arguments.

    singleton “ µX 〈Nil : unit〉

    Then we have JsingletonK “ tNilppqqu and Ppsingletonq “tλa. 1, λ a. 0u. To see why, we first examine the set of tuplesT pCq “ τsingletonpC,Nilppqqq for different list of constructors C. If|C| ą 1 then T pCq “ H because the tree Nilppqq does not containany tuples of size 2. Thus we have pr〈I1,C1〉,...,〈Ik,Ck〉spNilppqqq “0 in this case (empty sum). The only list of remaining construc-tor lists C are rs and r〈‹,Nil〉s. As always prspNilppqqq “ 1

    Draft 7 2015/7/27

  • (singleton sum). Furthermore pr〈‹,Nil〉spNilppqqq “ 1 becauseτsingletonpr〈‹,Nil〉s,Nilppqqq “ tNilppqqu and P (unit) = {λa. 1}.

    Let us now consider the usual sum type

    sumpT1, T2q “ µX 〈Left : T1,Right : T2〉 ; .Then JsumpT1, T2qK “ tLeftpaq | a P JT1Ku Y tRightpbq | b PJT2Ku. If we define

    σCppqpC 1paqq"

    ppaq if C “ C 10 otherwise

    then PpsumpT1, T2qq “ tx ÞÑ 1, x ÞÑ 0uYtσLeftppq | p P PpT1quY tσRightppq | p P PpT2qu.

    The next example is the list type

    listpT q “ µX 〈Cons : T˚X,Nil : unit〉 .Then JlistpT qK “ tNilppqq,Conspa1,Nilppqqq, . . .u and we writeJlistpT qK “ trs, ra1s, ra1, a2s, . . . | ai P JT Ku. It holds thatτlistpr〈‹,Cons〉s, ra1, . . . , ansq “ ta1, . . . , anu and moreoverτlistpr〈‹,Cons〉 , 〈‹,Cons〉s, ra1, . . . , ansq “ tpai, ajq | 1 ď i ăj ď nu. More general, τlistpC, ra1, . . . , ansq “ tpai1 , . . . , aik q |1 ď i1 ă ¨ ¨ ¨ ă ik ď nu if C “ r〈‹,Cons〉 , . . . , 〈‹,Cons〉s orC “ r〈‹,Cons〉 , . . . , 〈‹,Cons〉 , 〈‹,Nil〉s for lists of length k andk`1, respectively. On the other hand, τlistpD, ra1, . . . , ansq “ H ifD “ 〈‹,Nil〉 ::D1 for someD1 ‰ rs. Since

    ř

    ~aPτlistpC,ra1,...,ansq 1 “`

    nk

    ˘

    and λa. 1 P PpT q we have tλ b.`|b|n

    ˘

    | n P Nu Ă PplistpT qq.Finally consider a list type with two different Cons-nodes

    list2pT1, T2q “ µX 〈C1 : T1˚X,C2 : T2 ˚X,Nil : unit〉 .Then Jlist2pT qK “ trs, ra1s, ra1, a2s, . . . | ai P tC1, C2u ˆ JT Ku.We furthermore have τlist2pr〈‹,C1〉s, rb1, . . . , bnsq “ tb1, . . . , bn |@iDa : bi “ pC1, aqu and τlist2pr〈‹,C1〉 , 〈‹,C2〉s, rb1, . . . , bnsq “tpbi, bjq | @i, jDa, a1 : bi “ pC1, aq ^ bj “ pC2, a1q ^ 1 ď i ăj ď nu. Let C “ r〈‹,Cons〉 , . . . , 〈‹,Cons〉s and |C| “ k. Itfurthermore holds that

    ř

    ~aPτlist2pC,bq 1 “`|b|C1k

    ˘

    where |b|C1 de-notes the number of C1-nodes in the list b. Therefore we havetλ b.

    `|b|C1n

    ˘

    | n P Nu Ă Pplist2pT qq.Coinductive types like streampT q “ µX 〈St : T˚X〉 are not

    inhabited in our language since we interpret them inductively. Adata structure of such a type cannot be created since we allowrecursive definitions only for functions.

    Spurious Indices. The previous examples illustrate that for someinductive data structures, different indices encode the same re-source polynomial. For example, for the type listpT q we havepr〈‹,Nil〉spaq “ prspaq “ 1 for all lists a. Additionally, some in-dices encode a polynomial that is constantly zero. For the typelistpT q this is for example the case for p〈‹,Nil〉 ::C if |C| ą 0. Wecall such indices spurious.

    In practice, it is not beneficial to have spurious indices in the in-dex sets since they slow down the analysis without being useful com-ponents of bounds. It is straightforward to identify spurious indicesfrom the data type definition. The index r〈I1, C1〉 , . . . , 〈Ik, Ck〉s isfor example spurious if k ą 1 and the branching number of Ci is 0for an i P t1, . . . , k ´ 1u.Resource Polynomials. A resource polynomial p : JT K Ñ Q`0for a simple type T is a non-negative linear combination of basepolynomials, i.e.,

    p “ÿ

    i“1,...,mqi ¨ pi

    for m P N, qi P Q`0 and pi P PpT q. We write RpT q for the set ofresource polynomials for the base type T .

    Selecting a Finite Index Set. Every resource polynomial is de-fined by a finite number of base polynomials. In an implementation,

    we also have to fix a finite set of indices to make possible an effec-tive analysis. The selection of the indices to track can be customizedfor each inductive data type and for every program. However, wecurrently allow the user only to select a maximal degree of thebounds and then track all indices that correspond to polynomials ofthe same or a smaller degree.

    6. Resource-Aware Type SystemIn this section, we describe the resource aware type system. Es-sentially, we annotate the simple type system from Section 4 withresource annotations so that type derivations correspond to proofsof resource bounds.

    Type Annotations. We use the indexes and base polynomials todefine type annotations and resource polynomials.

    A type annotation for a simple type T is defined to be a family

    QT “ pqIqIPIpT q with qI P Q`0We write QpT q for the set of type annotations for the type T .

    An annotated type is a pair pA,Qq where Q is a type annotationfor the simple type |A| where A and |A| are defined as follows.

    A ::“ X | A ref | A1 ˚ ¨ ¨ ¨ ˚An | 〈rA1, . . . , Ans Ñ B,F〉| µX. 〈C1 : A1˚Xn1 , . . . , Ck : Ak˚Xnk 〉

    We define |A| to be the simple type T that can be obtained from Aby removing all type annotations from function types.

    A function type 〈rA1, . . . , Ans Ñ B,F〉 is annotated with a setF Ď tpQA, QBq | QA P Qp|A1 ˚ ¨ ¨ ¨ ˚ An|q ^ QB P Qp|B|qu.The set F potentially contains multiple valid resource annotationsfor arguments and the result of the function.

    Potential of Annotated Types and Contexts. Let pA,Qq be anannotated type. LetH be a heap and let v be a value withH ( v ÞÑa : |A|. Then the type annotation Q defines the potential

    ΦHpv : pA,Qqq “ÿ

    IPIpT qqI ¨ pIpaq

    Usually, we define type annotations Q by only stating the values ofthe non-zero coefficients qI .

    If a P J|A|K and Q P Qp|A|q is a type annotation then we alsowrite Φpa : pA,Qqq for

    ř

    I qI ¨ pIpaq.For use in the type system we need to extend the definition of

    resource polynomials to type contexts and stacks. We treat themlike tuple types. Let Γ “ x1:A1, . . . , xn:An be a type context andlet Σ “ B1, . . . , Bm be a list of types. The index set IpΣ; Γq isdefined through

    IpΣ; Γq“tpI1, . . . , Im, J1, . . . , Jnq | IjPIp|Bj |q, JiPIp|Ai|u .

    A type annotation Q for Σ; Γ is a family

    Q “ pqIqIPIpΣ;Γq with qI P Q`0 .

    We denote a resource-annotated context with Σ; Γ;Q. Let H bea heap and V be an environment with H ( V : Γ whereH ( V pxjq ÞÑ axj : |Γpxjq| . Let furthermore S “ `1, . . . , `m bean argument stack with H ( S : Σ where H ( `i ÞÑ bi : |Bi| forall i. The potential of Σ; Γ;Q with respect to H and V is

    ΦS,V,HpΣ; Γ;Qq “ÿ

    ~IPIpΣ;Γq

    q~I

    j“1pIj pbjq

    m`nź

    j“m`1pIj paxj q

    Here, ~I “ pI1, ¨ ¨ ¨ , Im`nq. In particular, if Σ “ Γ “ ¨ thenIpΣ; Γq “ tpqu and ΦV,HpΣ; Γ; qpqq “ qpq. We sometimes alsowrite q‹ for qpq.

    Draft 8 2015/7/27

  • Q “ Q1 `M var

    ¨;x:B;QM$x : pB,Q1q(A:VAR)

    pP, P 1q P F πΣ;¨‹ pQq “ P `M var P 1 “ Q1

    Σ;x: 〈Σ Ñ B,F〉 ;QM$x : pB,Q1q(A:VARPUSH)

    Γ “ x1:A1, . . . , xn:An pP, P 1q P F πΓ‹ pQq “ P `M appn Q1 “ P 1

    ¨;x: 〈rA1, . . . , AnsÑB,F〉 ,Γ;QM$xx1 ¨ ¨ ¨xn : pB,Q1q(A:APP)

    Γ “ x1:A1, . . . , xn:An pP, P 1q P F pR,R1q P F 1 πΓ‹ pQq “ P `M appn πΣ‹ pQq´q‹`p1‹ “ R R1 “ Q1

    Σ;x:〈rA1, . . . , AnsÑ

    〈ΣÑB,F 1

    〉,F

    〉,Γ;QM$xx1 ¨ ¨ ¨xn : pB,Q1q

    (A:APPPUSH)

    Σ; Γ, x:A;P M$ e : pB,Q1q Q “ R`Mbind @ I, ~J : rpI, ~Jq “ pp~J,IqA::Σ; Γ;QM$λx.e : pB,Q1q

    (A:ABSPUSH)

    Q “ Q1 `M abs @pP, P 1q P F : Σ; Γ;RM$λx.e : pB,P 1q ^ rp~I, ~Jq “"

    p~I if ~J “ ~‹0 otherwise

    ¨; Γ;QM$λx.e : p〈Σ Ñ B,F〉 , Q1q(A:ABSPOP)

    B “ µX. 〈. . . C : A˚Xn . . .〉 Σ; Γ, y:A˚Bn;P M$ e1 : pA1, P 1qΣ; Γ, x:B;RM$ e2 : pA1, R1q CCBpQq “ P`Mmat1 P 1 “ Q1 Q “ R`Mmat2 R1 “ Q1

    Σ; Γ, x:B;QM$match x with C y Ñ e1 | e2 : pA1, Q1q(A:MAT)

    B “ µX. 〈. . . C : A˚Xn . . .〉 Q “ CCBpQ1q`M cons

    ¨;x:A˚Bn;QM$C x : pB,Q1q(A:CONS)

    Σ; Γ, x1:A, x2:A;P M$ e : pB,Q1q Q “M share` .pP qΣ; Γ, x:A;QM$ sharex as px1, x2q in e : pB,Q1q

    (A:SHARE)

    Σ; Γ2,Γ1;P M$ e1 Σ; Γ2, x:A;P 1 Σ; Γ2, x:A;RM$ e2 : pB,Q1q Q “ P `M let1 P 1 “ R`M let2Σ; Γ2,Γ1;QM$ letx “ e1 in e2 : pB,Q1q

    (A:LET)

    F fi f1 “ λx1.e1 and ¨ ¨ ¨ and fn “ λxn.en ∆ “ f1:A1, . . . , fn:An@i : ¨; Γi,∆;Pi M$λxi.ei : pAi, P 1i q πΣ;Γ0~‹ pQq “ π

    Σ;Γ0~‹ pP q `M

    rec ` n¨M abs Σ; Γ0,∆;P M$ e : pB,Q1qΣ; Γ0, . . . ,Γn;QM$ let recF in e : pB,Q1q

    (A:LETREC)

    Σ; Γ;P $ e : pB,P 1q Q ě P ` c Q1 ď P 1 ` cΣ; Γ;Q $ e : pB,Q1q

    (A:WEAK-A)Σ; Γ;πΓ‹ pQqM$ e : pB,Q1qΣ; Γ, x:A;QM$ e : pB,Q1q

    (A:WEAK-C)

    Σ; Γ;QM$ e : pB1, Q1q B1 ă: BΣ; Γ;QM$ e : pB,Q1q

    (A:SUBTYPE-R)Σ; Γ, x:A1;QM$ e : pB,Q1q A ă: A1

    Σ; Γ, x:A;QM$ e : pB,Q1q(A:SUBTYPE-C)

    ˛ ˛ ˛

    @j P IpΣ; ∆q: j“~‹ ùñ ¨; Γ;πΓj pQqM$ e : pA, πx:Aj pQ1qq j‰~‹ ùñ ¨; Γ;πΓj pQq cf$ e : pA, πx:Aj pQ1qqΣ; ∆,Γ;QM$ e Σ; ∆, x:A;Q1

    (B:BIND)

    Figure 7. Selected type rules for annotated types.

    Folding of Potential Annotations. A key notion in the type systemis the folding for potential annotations that is used to assign potentialto typing contexts that result from a pattern match (unfolding) orfrom the application of a constructor of an inductive data type(folding). Folding of potential annotations is conceptually similar tofolding and unfolding of inductive data types in type theory.

    Let B “ µX. 〈. . . , C : A˚Xn, . . .〉 be an inductive datatype. Let Σ be a type stack, Γ, b:B be a context and let Q “pqIqIPIpΣ;Γ,y:Bq be a context annotation. The C-unfolding CCBpQqof Q with respect to B is an annotation CCBpQq “ pq1IqIPIpΣ;Γ1qfor a context Γ1 “ Γ, x:A˚Bn that is defined by

    q1pI,pJ,L1,...,Lnqq “"

    qpI,〈J,C〉 ::L1¨¨¨Lnq ` qpI,L1¨¨¨Lnq j “ 0qpI,〈J,C〉 ::L1¨¨¨Lnq j ‰ 0

    Here, L1 ¨ ¨ ¨Ln is the concatenation of the lists L1, . . . , Ln.Lemma 2. Let B “ µX. 〈. . . , C : A˚Xn, . . .〉 be an inductivedata type. Let Σ; Γ, x:B;Q be an annotated context, H ( V :Γ, x:B, H ( S : Σ, HpV pxqq “ pC, `q, and V 1 “ V ry ÞÑ`s. Then H ( V 1 : Γ, y:A˚Bn and ΦS,V,HpΣ; Γ, x:B;Qq “ΦS,V 1,HpΣ; Γ, y:A˚Bn;CCBpQqq.

    Sharing. Let Σ; Γ, x1:A, x2:A;Q be an annotated context. Thesharing operation . Q defines an annotation for a context ofthe form Σ; Γ, x:A. It is used when the potential is split betweenmultiple occurrences of a variable. Lemma 3 shows that sharing is alinear operation that does not lead to any loss of potential.

    Lemma 3. Let A be a data type. Then there are natural numberscpi,jqk for i, j, k P Ip|A|q such that the following holds. For ev-

    Draft 9 2015/7/27

  • ery context Σ; Γ, x1:A, x2:A;Q and every H,V with H ( V :Γ, x:A and H ( S : Σ it holds that ΦS,V,HpΣ,Γ, x:A;Q1q “ΦS,V 1,HpΣ; Γ, x1:A, x2:A;Qq where V 1 “ V rx1, x2 ÞÑ V pxqsand q1p`,kq “

    ř

    i,jPIpAq cpi,jqk qp`,i,jq.

    The coefficients cpi,jqk can be computed effectively. We werehowever not able to derive a closed formula for the coefficients.The proof is similar as in previous work [33]. For a contextΣ; Γ, x1:A, x2:A;Q we define .Q to be Q1 from Lemma 3.Type Judgements. A resource-aware type judgement has the form

    Σ; Γ;QM$ e : pA,Q1qwhere Σ; Γ;Q is an annotated context, M is a resource metric,A is an annotated type and Q1 is a type annotation for |A|. Theintended meaning of this judgment is that if there are more thanΦpΣ; Γ;Qq resource units available then this is sufficient to coverthe evaluation cost of e under metric M . In addition, there are atleast Φpv:pA,Q1qq resource units left if e evaluates to a value v.Notations. Families that describe type and context annotations aredenoted with upper case letters Q,P,R, . . . with optional super-scripts. We use the convention that the elements of the families arethe corresponding lower case letters with corresponding superscripts,i.e., Q “ pqIqIPI and Q1 “ pq1IqIPI .

    If Q,P and R are annotations with the same index set I thenwe extend operations on Q pointwise to Q,P and R. For example,we write Q ď P `R if qI ď pI ` rI for every I P I. For K P Qwe write Q “ Q1 `K to state that q‹ “ q1‹ `K ě 0 and qI “ q1Ifor I ‰ ‹ P I. Let Q be an annotation for a context Σ; Γ1,Γ2. ForJ P IpΓ2q we define the projection πΓ1pJ,J1qpQq of Q to Γ1 to be theannotation Q1 for ¨; Γ1 with q1I “ qpJ,I,J1q. In the same way, wedefine the annotations πΣJ pQq for Σ; ¨ and πΣ;Γ1J pQq for Σ; Γ1.Cost Free Types. We write Σ; Γ;Q cf$ e : pA,Q1q to refer to cost-free type judgments where cf is the cost-free metric with cfpKq “ 0for constants K. We use it to assign potential to an extended contextin the let rule. More info is available in previous work [30].Subtyping. As usual, subtyping is defined inductively so that typeshave to be structurally identical. The most interesting rule is the onefor function types:

    F 1 Ď F @i : A1i ă: Ai B ă: B1

    〈rA1, . . . , Ans Ñ B,F〉 ă:〈rA11, . . . , A1ns Ñ B1,F 1

    〉 (S:FUN)A function type is a subtype of another function type if it allows moreresource behaviors (F 1 Ď F ). Result types are treated covariant andarguments are treated contravariant.

    Unsurprisingly, our type system does not have principle types.This is to allow the typing of examples such as rec scheme fromSection 2. In a principle type, we would have to assume the weakesttype for the arguments, that is, function types that are annotatedwith empty sets of type annotations. This would mean that wecannot use functions in the arguments. However, it is possible toderive a principle type 〈Σ Ñ B,F〉 for fixed argument types Σ.Here, we would derive all possible annotations pQ,Q1q P F in thefunction annotation and all possible annotations pQ,Q1q that appearin function annotations of the result type.

    If we take the more algorithmic view of previous work [41]then we can express a principle type for a function with a set ofconstraints that has holes for the constraint sets of the higher-orderarguments. It is however unclear what such a type means for a userand we prefer a more declarative view that clearly separates typechecking and type inference. An open problem with constraint basedprinciple types is polymorphism.Type Rules. Figure 7 contains selected type rules for annotatedtypes. Many of the rules are similar to the rules in previous pa-pers [31, 33, 41] and detailed explanations can be found there.

    Soundness. Our goal is to prove the following soundness state-ment for type judgements. Intuitively, it says that the initial potentialis an upper bound on the watermark resource usage, no matter howlong we execute the program.

    If Σ; Γ;QM$ e : pA,Q1q and S, V,H M$ e ó ˝ | pp, p1qthen p ď ΦS,V,HpΣ; Γ;Qq.

    To prove this statement by induction, we need to prove a strongerstatement that takes into account the return value and the annotatedtype pA,Q1q of e. Moreover, the previous statement is only trueif the values in S, V and H respect the types required by Σ andΓ. Therefore, we adapt our definition of well-formed environmentsto annotated types. We simply replace the rule V:FUN in Figure 4with the following rule. Of course, H ( V : Γ refers to the newlydefined judgment.

    Hp`q “ pλx.e, V q DΓ, Q,Q1 : H ( V : Γ^¨; Γ;QM$λx.e : p〈Σ Ñ B,F〉 , Q1qH ( ` ÞÑ pλx.e, V q : 〈Σ Ñ B,F〉 (V:FUN)

    In addition to the aforementioned soundness, the Theorem 2 statesa stronger property for terminating evaluations. If an expressione evaluates to a value v in a well-formed environment then thedifference between initial and final potential is an upper bound onthe resource usage of the evaluation.

    Theorem 2 (Soundness). Let H ( V : Γ, H ( S : Σ, andΣ; Γ;QM$ e : pB,Q1q.1. If S, V,H M$ e ó p`,H 1q | pp, p1q then p ď ΦS,V,HpΣ; Γ;Qq,p´ p1 ď ΦS,V,HpΣ; Γ;Qq ´ΦH1p`:pB,Q1qq, and H ( ` : B.

    2. If S, V,H M$ e ó ˝ | pp, p1q then p ď ΦS,V,HpΣ; Γ;Qq.Theorem 2 is proved by a nested induction on the derivation

    of the evaluation judgment and the type judgment Σ; Γ;Q $e:pB,Q1q. The inner induction on the type judgment is neededbecause of the structural rules. There is one proof for all possibleinstantiations of the resource constants. An sole induction on thetype judgement fails because the size of the type derivation canincrease in the case of the function application in which we retrievea type derivation for the function body from the well-formedjudgement as defined by the (updated) rule V:FUN.

    The structure of the proof matches the structure of the previoussoundness proofs for type systems based on AARA [31, 33, 34, 41].The induction case of many rules is similar to the induction casesof the corresponding rules for multivariate AARA for first-orderprograms [33] and linear AARA for higher-order programs [41].For one thing, additional complexity is introduced by the newresource polynomials for user-defined data types. We designedthe system so that this additional complexity is dealt with locallyin the rules A:MAT, A:CONS, and A:SHARE. The soundness ofthese rules follows directly from an application of Lemma 2 andLemma 3, respectively. As in previous work [34] the well-formedjudgement that captures type derivations enables us to treat functionabstraction and application in a very similar fashion as in the first-order case [33]. The coinductive definition of the well-formednessjudgement does not cause any difficulties. A major novel aspectin the proof is the typed argument stack S : Σ that also carriespotential. Surprisingly, this typed stack is simply treated like a typedenvironment V : Γ in the proof. It is already incorporated in theshift and share operations (Lemma 2 and Lemma 3).

    We deal with the mutable heap by requiring that array elementsdo not influence the potential of an array. As a result, we can provethe following lemma, which is used in the proof of Theorem 2.

    Lemma 4. If H ( V :Γ, H ( S : Σ, Σ; Γ;QM$ e : pB,Q1qand stack, V,H M$ e ó p`,H 1q | pp, p1q then ΦS,V,HpΓ;Qq “ΦS,V,H1pΓ;Qq.

    Draft 10 2015/7/27

  • Parser

    Type Inference

    Input Program

    Typed OCamlSyntax Tree

    OCamlBytecode

    RAML Compiler

    Bracket-Type Inference

    Share-Let Normal Form

    Stack-Based Type Checking

    Explicit Let Polymorphism

    Typed RAML Syntax Tree

    OCaml - C Bindings

    RAML Analyzer

    Resource Type Interpretation

    LP Solver Frontend

    Multivariate AARA

    CLP

    Resource Metrics

    Resource Bounds

    Figure 8. Implementation of RAML.

    7. Implementation and Bound InferenceFigure 8 shows an overview of the implementation of RAML. Itconsists of about 12000 lines of OCaml code, excluding the parts thatwe reused from Inria’s OCaml implementation. The developmenttook around 8 person months. We found it very helpful to developthe implementation and the theory in parallel, and many theoreticalideas have been inspired by implementation challenges.

    We reuse the parser and type inference algorithm from OCaml4.01 to derive a typed OCaml syntax tree from the source program.We then analyze the function applications to introduce bracketfunction types. To this end, we copy a lambda abstraction for everycall site. We still have to implement a unification algorithm sincefunctions, such as let g = f x, that are defined by partial applicationmay be used at different call sites. Moreover, we have to deal withfunctions that are stored in references.

    In the next step, we convert the typed OCaml syntax tree into atyped RAML syntax tree. Furthermore, we transform the programinto share-let-normal form without changing the resource behavior.For this purpose, each syntactic form has a free flag that specifieswhether it contributes to the cost of the original program. Forexample, all share forms that are introduced are free. We also inserteta expansions whenever they do not influence resource usage.

    After this compilation phase, we perform the actual multivariateAARA on the program in share-let-normal form. Resource metricscan be easily specified by a user. We include a metric for heap cells,evaluation steps, and ticks. The letter allows the user to flexiblyspecify the resource cost of programs by inserting tick commandsRaml.tick(q) where q is a (possibly negative) floating-point number.

    In principle, the actual bound inference works similarly as inprevious AARA systems [32, 34]: First, we fix a maximal degreeof the bounds and annotate all types in the derivation of the simpletypes with variables that correspond to type annotations for resourcepolynomials of that degree. Second, we generate a set of linearinequalities, which express the relationships between the addedannotation variables as specified by the type rules. Third, we solvethe inequalities with Coin-Or’s fantastic LP solver CLP. A solutionof the linear program corresponds to a type derivation in whichthe variables in the type annotations are instantiated accordingto the solution. The objective function contains the coefficientsof the resource annotation of the program inputs to minimize theinitial potential. Modern LP solvers provide support for iterativesolving that allows us to express that minimization of higher-degreeannotations should take priority.

    The type system we use in the implementation significantlydiffers from the declarative version we describe in this article. Forone thing, we have to use algorithmic versions of the type rules inthe inference in which the non-syntax-directed rules are integratedinto the syntax-directed ones [33]. For another thing, we annotatefunction types not with a set of type annotations but with a functionthat returns an annotation for the result type if presented with anannotation of the return type. The annotations here are symbolicand the actual number are yet to be determined by the LP solver.Function annotations have the side effect of sending constraintsto the LP solver. It would be possible to keep a constraint set forthe respective function in memory and to send a copy with fresh

    variables to the LP solver at every call. However, it is more efficientto lazily trigger the constraint generation from the function body atevery call site when the function is provided with a return annotation.

    To make the resource analysis more expressive, we also allowresource-polymorphic recursion. This means that we need a typeannotation in the recursive call that differs from the annotation inthe argument and result types of the function. To infer such typeswe successively infer type annotations of higher and higher degree.Details can be found in previous work [30].

    For the most part, our constraints have the form of a so-callednetwork (or network-flow) problem [49]. LP solvers can handlenetwork problems very efficiently and in practice CLP solves theconstraints RAML generates in linear time. Because our problemsizes are large, we can save memory and time by reducing the num-ber of constraints that are generated during typing. A representativeexample of an optimization is that we try to reuse constraint namesinstead of producing constraints like p “ q.

    RAML provides two ways of analyzing a program. In main modeRAML derives a bound for evaluation cost of the main expressionof the program, that is, the last expression in the top-level list oflet bindings. In module mode, RAML derives a bound for everytop-level let binding that has a function type.

    Apart from the analysis itself, we also implemented the con-version of the derived resource polynomials into easily-understoodpolynomial bounds and a pretty printer for RAML types and expres-sions. Additionally, we implemented an efficient RAML interpreterthat we use for debugging and to determine the quality of the bounds.

    8. Case Study: Bounds for DynamoDB QueriesHaving integrated the analysis with Inria’s OCaml compiler enablesus to analyze and compile real programs. An interesting use caseof our resource bound analysis is to infer worst-case bounds onDynamoDB queries. DynamoDB is a commercial NoSQL clouddatabase service, which is part of Amazon Web Services (AWS).Amazon charges DynamoDB users on a combination of numberof queries, transmitted fields, and throughput. Since DynamoDBis a NoSQL service, it is often only possible to retrieve the wholetable—which can be expensive for large data sets—or single entriesthat are identified by a key value. The DynamoDB API is availablethrough the Opam package aws. We make the API available to theanalysis by using tick functions that specify resource usage. Sincethe query cost for different tables can be different, we provide onefunction per action and table.

    let db_query student_id course_id =Raml.tick(1.0); Awslib.get_item ...

    In the following, we describe the analysis of a specific OCamlapplication that uses a database that contains a large table that storesgrades of students for different courses. Our first function computesthe average grade of a student for a given list of courses.

    let avge_grade student_id course_ids =let f acc cid =

    let (length,sum) = acc inlet grade = match db_query student_id cid with

    | Some q Ñ q| None Ñ raise (Not_found (student_id,cid))

    in(length +. 1.0, sum +. grade)

    inlet (length,sum) = foldl f (0.0,0.0) course_ids insum /. length

    In 0.03s RAML computes the tight bound 1 ¨ m where m is thelength of the argument course ids. We omit the standard definitionsof functions like foldl and map. However, they are not built-in intoour systems but the bounds are derived form first principles.

    Draft 11 2015/7/27

  • Next, we sort a given list of students based on the average gradesin a given list of classes using quick sort. As a first approximationwe use a comparison function that is based on average grade.

    let geq sid1 sid2 cour_ids =avge_grade sid1 cour_ids >= avge_grade sid2 cour_ids

    This results in Opn2mq database queries where n is the number ofstudents andm is the number of courses. The reason is that there areOpn2q comparisons during a run of quick sort. Since the resourceusage of quick sort depends on the number of courses, we have tomake the list of courses an explicit argument and cannot store it inthe closure of the comparison function.

    let rec partition gt acc l =match l with

    | [] Ñ let (cs,bs,_) = acc in (cs,bs)| x::xs Ñ let (cs,bs,aux) = acc in

    let acc’ = if gt x aux then (cs,x::bs,aux)else (x::cs,bs,aux)

    in partition gt acc’ xs

    let rec qsort gt aux l = match l with | [] Ñ []| x::xs Ñ

    let ys,zs = partition (gt x) ([],[],aux) xs inappend (qsort gt aux ys) (x::(qsort gt aux zs))

    let sort_students s_ids c_ids = qsort geq c_ids s_ids

    In 0.31s RAML computes the tight bound n2m ´ nm forsort students where n is the length of the argument s ids and m isthe length of the argument c ids. The negative factor arises from thetranslation of the resource polynomials to the standard basis.

    Given the alarming cubic bound, we reimplement our sortingfunction using memoization. To this end we create a table thatlooks up and stores for each student and course the grade inthe DynamoDB. We then replace the function db query with thefunction lookup.

    let lookup sid cid table =let cid_map = find (fun id Ñ id = sid) table infind (fun id Ñ id = cid) cid_map

    For the resulting sorting function, RAML computes the tight boundnm in 0.87s.

    9. Related WorkOur work builds on past research on automatic amortized resourceanalysis (AARA). AARA has been introduced by Hofmann andJost for a strict first-order functional language with built-in datatypes [34]. The technique has been applied to higher-order func-tional programs and user defined types [41], to derive stack-spacebounds [16], to programs with lazy evaluation [47, 52], to object-oriented programs [35, 38], and to low-level code by integratingit with separation logic [8]. All the aforementioned amortized-analysis–based systems are limited to linear bounds. Hoffmann etal. [29, 32, 33] presented a multivariate AARA for a first-order lan-guage with built-in lists and binary trees. Hofmann and Moser [37]have proposed a generalization of this system in the context of(first-order) term rewrite systems. However, it is unclear how toautomate this system. In this article, we introduce the first AARAthat is able to automatically derive (multivariate) polynomial boundsthat depend on user-defined inductive data structures. Our systemis the only one that can derive polynomial bounds for higher-orderfunctions. Even for linear bounds, our analysis is more expressivethan existing systems for strict languages [41]. For instance, we canfor the first time derive an evaluation-step bound for the curriedappend function for lists. Moreover, we integrated AARA for thefirst time with an existing industrial-strength compiler.

    Type systems for inferring and verifying resource bounds havebeen extensively studied. Vasconcelos et al. [50, 51] described anautomatic analysis system that is based on sized-types [39] andderives linear bounds for higher-order functional programs. Herewe derive polynomial bounds.

    Dal Lago et al. [43, 44] introduced linear dependent typesto obtain a complete analysis system for the time complexity ofthe call-by-name and call-by-value lambda calculus. Crary andWeirich [20] presented a type system for specifying and certifyingresource consumption. Danielsson [22] developed a library, basedon dependent types and manual cost annotations, that can be usedfor complexity analyses of functional programs. The advantage ofour technique is that it is fully automatic.

    Classically, cost analyses are often based on deriving and solvingrecurrence relations. This approach was pioneered by Wegbreit [53]and is actively studied for imperative languages [1, 5, 7, 25]. Theseworks are not concerned with higher-order functions and bounds donot depend on user-defined data structures.

    Benzinger [11] has applied Wegbreit’s method in an automaticcomplexity analysis for Nuprl terms. However, complexity infor-mation for higher-order functions has to be provided explicitly.Grobauer [26] reported a mechanism to automatically derive costrecurrences from DML programs using dependent types. Danner etal. [23, 24] propose an interesting technique to derive higher-orderrecurrence relations from higher-order functional programs. Solvingthe recurrences is not discussed in these works and in contrast to ourwork they are not able to automatically infer closed-form bounds.

    Abstract interpretation based approaches to resource analysis [12,18, 27, 48, 54] focus on first-order integer programs with loops.Cicek et al. [19] study a type system for incremental complexity.

    In an active area of research, techniques from term rewritingare applied to complexity analysis [9, 15, 45]; sometimes in combi-nation with amortized analysis [36]. These techniques are usuallyrestricted to first-order programs and time complexity. Recently,Avanzini et al. [10] proposed a complexity preserving defunctional-iztion to deal with higher-order programs. While the transformationis asymptotically complexity preserving, it is unclear whether thistechnique can derive bounds with precise constant factors.

    Finally, there exists research that studies cost models to formallyanalyze parallel programs. Blelloch and Greiner [13] pioneeredthe cost measures work and depth. There are more advanced costmodels that take into account caches and IO (see, e.g., Blelloch andHarper [14]), However, these works do not provide machine supportfor deriving static cost bounds.

    10. ConclusionWe have presented important first steps towards a practical automaticresource bound analysis system for OCaml. Our three main contribu-tions are (1) the integration of automatic amortized resource analysiswith the OCaml compiler, (2) a novel automatic resource analysissystem that infers multivariate polynomial bounds that depend onsize parameters of user-defined data structures, and (3) the firstAARA that infers polynomial bounds for higher-order functions.

    As the title of this article indicates, there are many open problemsleft on the way to a usable resource analysis system for OCaml. Inthe future, we plan to improve the bound analysis for programswith side-effects and exceptions. We will also work on mechanismsthat allow user interaction for manually deriving bounds if theautomation fails. Furthermore, we will work on taking into accountgarbage collection and the runtime system when deriving time andspace bounds. Finally, we will investigate techniques to link the high-level bounds with hardware and the low-level code that is producedby the compiler. These open questions are certainly challenging butwe now have the tools to further push the boundaries of practicalquantitative software verification.

    Draft 12 2015/7/27

  • References[1] E. Albert, P. Arenas, S. Genaim, G. Puebla, and D. Zanardini. Cost

    Analysis of Java Bytecode. In 16th Euro. Symp. on Prog. (ESOP’07),pages 157–172, 2007.

    [2] E. Albert, P. Arenas, S. Genaim, and G. Puebla. Closed-Form UpperBounds in Static Cost Analysis. Journal of Automated Reasoning,pages 161–203, 2011.

    [3] E. Albert, P. Arenas, S. Genaim, M. Gómez-Zamalloa, and G. Puebla.Automatic Inference of Resource Consumption Bounds. In Logic forProgramming, Artificial Intelligence, and Reasoning, 18th Conference(LPAR’12), pages 1–11, 2012.

    [4] E. Albert, P. Arenas, S. Genaim, G. Puebla, and D. Zanardini. CostAnalysis of Object-Oriented Bytecode Programs. Theor. Comput. Sci.,413(1):142 – 159, 2012.

    [5] E. Albert, J. C. Fernández, and G. Román-Dı́ez. Non-cumulativeResource Analysis. In Tools and Algorithms for the Construction andAnalysis of Systems - 21st International Conference, (TACAS’15), pages85–100, 2015.

    [6] C. Alias, A. Darte, P. Feautrier, and L. Gonnord. Multi-dimensionalRankings, Program Termination, and Complexity Bounds of FlowchartPrograms. In 17th Int. Static Analysis Symposium (SAS’10), pages117–133, 2010.

    [7] D. E. Alonso-Blas and S. Genaim. On the limits of the classicalapproach to cost analysis. In 19th Int. Static Analysis Symp. (SAS’12),pages 405–421, 2012.

    [8] R. Atkey. Amortised Resource Analysis with Separation Logic. In 19thEuro. Symp. on Prog. (ESOP’10), pages 85–103, 2010.

    [9] M. Avanzini and G. Moser. A Combination Framework for Complex-ity. In 24th International Conference on Rewriting Techniques andApplications (RTA’13), pages 55–70, 2013.

    [10] M. Avanzini, U. D. Lago, and G. Moser. Analysing the Complexityof Functional Programs: Higher-Order Meets First-Order. In 29th Int.Conf. on Functional Programming (ICFP’15), 2012.

    [11] R. Benzinger. Automated Higher-Order Complexity Analysis. Theor.Comput. Sci., 318(1-2):79–103, 2004.

    [12] R. Blanc, T. A. Henzinger, T. Hottelier, and L. Kovács. ABC: AlgebraicBound Computation for Loops. In Logic for Prog., AI., and Reasoning- 16th Int. Conf. (LPAR’10), pages 103–118, 2010.

    [13] G. E. Blelloch and J. Greiner. A Provable Time and Space EfficientImplementation of NESL. In 1st Int. Conf. on Funct. Prog. (ICFP’96),pages 213–225, 1996.

    [14] G. E. Blelloch and R. Harper. Cache and I/O Efficent FunctionalAlgorithms. In 40th ACM Symp. on Principles Prog. Langs. (POPL’13),pages 39–50, 2013.

    [15] M. Brockschmidt, F. Emmes, S. Falke, C. Fuhs, and J. Giesl. Alternat-ing Runtime and Size Complexity Analysis of Integer Programs. InTools and Alg. for the Constr. and Anal. of Systems - 20th Int. Conf.(TACAS’14), pages 140–155, 2014.

    [16] B. Campbell. Amortised Memory Analysis using the Depth of DataStructures. In 18th Euro. Symp. on Prog. (ESOP’09), pages 190–204,2009.

    [17] Q. Carbonneaux, J. Hoffmann, and Z. Shao. Compositional CertifiedResource Bounds. In 36th Conf. on Prog. Lang. Design and Impl.(PLDI’15), 2015.

    [18] P. Cerný, T. A. Henzinger, L. Kovács, A. Radhakrishna, and J. Zwirch-mayr. Segment Abstraction for Worst-Case Execution Time Analysis.In 24th European Symposium on Programming (ESOP’15), pages 105–131, 2015.

    [19] E. Çiçek, D. Garg, and U. A. Acar. Refinement Types for IncrementalComputational Complexity. In 24th European Symposium on Program-ming (ESOP’15), pages 406–431, 2015.

    [20] K. Crary and S. Weirich. Resource Bound Certification. In 27th ACMSymp. on Principles of Prog. Langs. (POPL’00), pages 184–198, 2000.

    [21] S. A. Crosby and D. S. Wallach. Denial of service via algorithmiccomplexity attacks. In 12th USENIX Security Symposium (USENIX’12),2003.

    [22] N. A. Danielsson. Lightweight Semiformal Time Complexity Analysisfor Purely Functional Data Structures. In 35th ACM Symp. on PrinciplesProg. Langs. (POPL’08), pages 133–144, 2008.

    [23] N. Danner, D. R. Licata, and R. Ramyaa. Denotational Cost Semanticsfor Functional Languages with Inductive Types. In 29th Int. Conf. onFunctional Programming (ICFP’15), 2012.

    [24] N. Danner, J. Paykin, and J. S. Royer. A Static Cost Analysis for aHigher-Order Language. In 7th Workshop on Prog. Languages MeetsProg. Verification (PLPV’13), pages 25–34, 2013.

    [25] A. Flores-Montoya and R. Hähnle. Resource Analysis of ComplexPrograms with Cost Equations. In Programming Languages andSystems - 12th Asian Symposiu (APLAS’14), pages 275–295, 2014.

    [26] B. Grobauer. Cost Recurrences for DML Programs. In 6th Int. Conf.on Funct. Prog. (ICFP’01), pages 253–264, 2001.

    [27] S. Gulwani, K. K. Mehra, and T. M. Chilimbi. SPEED: Precise andEfficient Static Estimation of Program Computational Complexity. In36th ACM Symp. on Principles of Prog. Langs. (POPL’09), pages127–139, 2009.

    [28] J. Hoffmann. Types with Potential: Polynomial Resource Boundsvia Automatic Amortized Analysis. PhD thesis, Ludwig-Maximilians-Universität München, 2011.

    [29] J. Hoffmann and M. Hofmann. Amortized Resource Analysis withPolynomial Potential. In 19th Euro. Symp. on Prog. (ESOP’10), 2010.

    [30] J. Hoffmann and M. Hofmann. Amortized Resource Analysis withPolymorphic Recursion and Partial Big-Step Operational Semantics. InProg. Langs. and Systems - 8th Asian Symposium (APLAS’10), 2010.

    [31] J. Hoffmann and Z. Shao. Type-Based Amortized Resource Analy-sis with Integers and Arrays. In 12th International Symposium onFunctional and Logic Programming (FLOPS’14), 2014.

    [32] J. Hoffmann, K. Aehlig, and M. Hofmann. Multivariate AmortizedResource Analysis. In 38th ACM Symp. on Principles of Prog. Langs.(POPL’11), 2011.

    [33] J. Hoffmann, K. Aehlig, and M. Hofmann. Multivariate AmortizedResource Analysis. ACM Trans. Program. Lang. Syst., 2012.

    [34] M. Hofmann and S. Jost. Static Prediction of Heap Space Usage forFirst-Order Functional Programs. In 30th ACM Symp. on Principles ofProg. Langs. (POPL’03), pages 185–197, 2003.

    [35] M. Hofmann and S. Jost. Type-Based Amortised Heap-Space Analysis.In 15th Euro. Symp. on Prog. (ESOP’06), pages 22–37, 2006.

    [36] M. Hofmann and G. Moser. Amortised Resource Analysis and TypedPolynomial Interpretations. In Rewriting and Typed Lambda Calculi(RTA-TLCA;14), pages 272–286, 2014.

    [37] M. Hofmann and G. Moser. Multivariate Amortised Resource Analysisfor Term Rewrite Systems. In 13th International Conference on TypedLambda Calculi and Applications (TLCA’15), pages 241–256, 2015.

    [38] M. Hofmann and D. Rodriguez. Automatic Type Inference for Amor-tised Heap-Space Analysis. In 22nd Euro. Symp. on Prog. (ESOP’13),pages 593–613, 2013.

    [39] J. Hughes, L. Pareto, and A. Sabry. Proving the Correctness of ReactiveSystems Using Sized Types. In 23th ACM Symp. on Principles of Prog.Langs. (POPL’96), pages 410–423, 1996.

    [40] G. Jin, L. Song, X. Shi, J. Scherpelz, and S. Lu. Understandingand Detecting Real-World Performance Bugs. In Conference onProgramming Language Design and Implementation PLDI’12, pages77–88, 2012.

    [41] S. Jost, K. Hammond, H.-W. Loidl, and M. Hofmann. Static Deter-mination of Quantitative Resource Usage for Higher-Order Programs.In 37th ACM Symp. on Principles of Prog. Langs. (POPL’10), pages223–236, 2010.

    [42] P. C. Kocher. Timing Attacks on Implementations of Diffie-Hellman,RSA, DSS, and Other Systems. In Advances in Cryptology - 16thAnnual International Cryptology Conference (CRYPTO’96), pages 104–113, 1996.

    [43] U. D. Lago and M. Gaboardi. Linear Dependent Types and RelativeCompleteness. In 26th IEEE Symp. on Logic in Computer Science(LICS’11), pages 133–142, 2011.

    Draft 13 2015/7/27

  • [44] U. D. Lago and B. Petit. The Geometry of Types. In 40th ACM Symp.on Principles Prog. Langs. (POPL’13), pages 167–178, 2013.

    [45] L. Noschinski, F. Emmes, and J. Giesl. Analyzing Innermost RuntimeComplexity of Term Rewriting by Dependency Pairs. J. Autom.Reasoning, 51(1):27–56, 2013.

    [46] O. Olivo, I. Dillig, and C. Lin. Static Detection of Asymptotic Per-formance Bugs in Collection Traversals. In Conference on Program-ming Language Design and Implementation (PLDI’15), pages 369–378,2015.

    [47] H. R. Simões, P. B. Vasconcelos, M. Florido, S. Jost, and K. Ham-mond. Automatic Amortised Analysis of Dynamic Memory Allocationfor Lazy Functional Programs. In 17th Int. Conf. on Funct. Prog.(ICFP’12), pages 165–176, 2012.

    [48] M. Sinn, F. Zuleger, and H. Veith. A Simple and Scalable Approachto Bound Analysis and Amortized Complexity Analysis. In ComputerAided Verification - 26th Int. Conf. (CAV’14), page 743–759, 2014.

    [49] R. J. Vanderbei. Linear Programming: Foundations and Extensions.Springer US, 2001.

    [50] P. Vasconcelos. Space Cost Analysis Using Sized Types. PhD thesis,School of Computer Science, University of St Andrews, 2008.

    [51] P. B. Vasconcelos and K. Hammond. Inferring Costs for Recursive,Polymorphic and Higher-Order Functi