Research Article
Modeling and Querying Business Data with Artifact Lifecycle

Danfeng Zhao,1 Wei Zhao,2 Le Sun,1 and Dongmei Huang1

1 College of Information, Shanghai Ocean University, Shanghai 201306, China
2 Information & Telecommunication Branch, Heilongjiang Electric Power Company Ltd., State Grid Corporation of China, Heilongjiang 150090, China

Correspondence should be addressed to Danfeng Zhao; dfzhao@shou.edu.cn and Dongmei Huang; dmhuang@shou.edu.cn

Received 28 August 2014; Accepted 11 September 2014

Academic Editor: L. W. Zhang

Hindawi Publishing Corporation, Mathematical Problems in Engineering, Volume 2015, Article ID 506272, 9 pages, http://dx.doi.org/10.1155/2015/506272

Copyright © 2015 Danfeng Zhao et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Business data has been one of the current and future research frontiers, with such big data characteristics as high volume, high velocity, high privacy, and so forth. Most corporations view their business data as a valuable asset and make efforts to develop and make optimal use of these data. Unfortunately, data management technology at present lags behind the requirements of the business big data era. Based on previous business process knowledge, a lifecycle of business data is modeled to achieve a consistent description of the data and the processes. On this basis, a business data partition method based on user interest is proposed, which aims to obtain the minimum number of interferential tuples. Then, to balance data privacy and data transmission cost, our strategy is to explore techniques to execute SQL queries over encrypted business data, split the computation of queries across the server and the client, and optimize the queries with a syntax tree. Finally, an instance is provided to verify the usefulness and availability of the proposed method.

1. Introduction

With the advent of Big Data, attention from all walks of life has gradually focused on exploiting controllable data so as to realize a satisfactory profit. Against this background, data resources are widely recognized to be equal in status and value to mineral resources. In an enterprise-led dataspace, data generated in business processes are the most significant factor affecting the performance of process execution. As business processes are closely related to an enterprise's business strategy and market competitiveness, research on business data will benefit enterprises in coping with the challenges brought by Big Data and is significant for predicting and responding to potential business risks in a timely way as well as for offering business opportunities. Recently, data management in business processes has gradually become a research hotspot.

During business process execution, there is usually a large data transfer, which falls into the scope of Big Data. For example, China Unicom currently stores more than 2 trillion records monthly; the data volume is over 525 TB, and the highest data volume has reached a peak of 5 PB [1]. China UnionPay handles more than 60 billion transactions daily, so the generated data are exceptionally large. Google supports a great many services, both processing over 20 petabytes ($10^{15}$ bytes) of data and monitoring 7.2 billion pages per day [2]. Since 2005, NTDB (National Trauma Data Bank) has tracked more than half a million trauma patients and stored their records, and many service retailers collect data from multiple sales channels, catalogs, stores, and online interaction, such as Client-Side Click-to-Action [3]. Hence, Big Data is ubiquitous (business process data arises in enterprises large or small) and grows exponentially, which poses huge challenges in data management. To address this, the first priority is to build an adaptive data model, which provides the basis and direction for efficient data acquisition. Second, the next big issue is devising a suitable query strategy for business data, which is a prerequisite for data processing and analysis.

Data modeling is the foundation for dataspace building. Early research focused on dataspace modeling where the subject is an individual [4, 5]. iDM (iMeMex data model) [6] is the first model able to represent all heterogeneous personal information in a single model. This data model uses a database approach and so is easy to understand, but it introduces a new query language, iQL, which is a little hard




for normal users to learn. UDM (unified data model) [7] uses the integrated IR-DB approach, which is able to represent partial sections of a file but does not support relational data queries. The triple model [8] represents heterogeneous data in triple form, which is a simple and flexible solution but does not support path expression queries, uncertainty, or lineage queries. PDM (probabilistic semantic model) [9] supports top-k query answering, but it is difficult to obtain reliable probability functions. The methods above are based on personal dataspaces. Unfortunately, for today's enterprise-led dataspace scenarios there is little research on data modeling.

Query ability is the basis for exploiting the value of Big Data. The query language iQL [6] realizes rule-based query optimization but ignores the evaluation of optimization cost. UDM [7] introduces a new query language based on SQL with some extended core operations, called TALZBRA operations. The triple model [8] supports a subject-predicate-object (SPO) query language that can be enhanced by an RDF-based query language. DSSP (dataspace support platforms) supports some useful services on a dataspace, helps to recognize the correlations among the sources of the dataspace, and provides a basic query schema over these data sources. In an enterprise-led dataspace, business process data is the key element in data modeling, with such characteristics as large volume, strong temporal correlation, and a stable lifecycle. These characteristics make it an extreme challenge for current query schemes.

Business data realistically records the whole execution process of a single task, including execution status, resource status, real-time usage, and correlation with other business process instances. Executing a business process generates additional data for a variety of reasons, such as monitoring for performance or business concerns, auditing, and compliance checking. Even business process schemas and enactments can be viewed as data, so that they can be managed, queried, mined for process schemas, and analyzed [10]. An artifact is a widely recognized kind of business process data representing key business entities. The artifact-centric approach [11] is the representative method in data-centric business process management and has been applied in various client engagements, including financial [12], supply chain, retailer [13], bank, pharmaceutical research [14], and cooperative work [15]. In this paper, we first adopt the artifact as a basic element, analyze its evolution process, and then model business data through the corresponding artifact lifecycle. Second, we devise a safe and quick query strategy in consideration of the privacy and storage distribution of artifacts.

The rest of the paper is organized as follows. In Section 2, we introduce the concept of artifact in the field of workflow management and model business data with its lifecycle from the perspective of process. In Section 3, we propose a business data partition method concerning user interest, based on which we further present a cryptograph query for off-site storage data. Then we give a detailed instance to verify the proposed method in Section 4. In the last section, we draw a conclusion.

2. Business Data Modeling

As stated above, artifacts describe the business-relevant data and their lifecycles. The lifecycle is an important property of business data: it describes the whole dynamic process of the data and also contains specific time information. To take advantage of these characteristics, our strategy is to model business data with its lifecycle, which aims to realize a complete description of dynamic business data. In this section, we introduce artifact-relevant notions and adopt the artifact-centric process description method to model business data with the artifact lifecycle. Furthermore, we adopt a business process logic model to illustrate the lifecycle of business data and then measure the quality of that model.

2.1. Basic Definition

Definition 1. Artifact [16] is an objective data entity which records the business process. An artifact comprises both a unique immutable identity and self-describing mutable content.

Definition 2. An artifact lifecycle captures the end-to-end process of a specific artifact from creation to completion and archiving.

Definition 3. The Artiflow model (artifact logical flow) [17] is a 5-tuple $(N, S, R, C, Ru)$, where $N$ is the name of the model, $S$ is a finite set of services, $R$ is a finite set of repositories, $C$ is a finite set of transport channels, and $Ru$ is a finite set of business rules.

Definition 4. The states of an artifact are a set described by the conjunction expression $\bigwedge_{i=1}^{m} IsDefine(A_i)$, where $IsDefine(A_i)$ is a mapping function that assigns a Boolean value in $\{0, 1\}$ to each single attribute $A_i$ in the attribute set $A$, $A_i \in A$, $i \in \{1, \ldots, m\}$, and $m$ is the number of attributes in the artifact. If the attribute is defined and has a value, the function returns 1; otherwise it returns 0.
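As an illustration of Definition 4, the following minimal Python sketch evaluates IsDefine for each attribute and then takes the conjunction over all attributes. The dictionary representation of an artifact, the function names, and the attribute names are assumptions made for this example only; they are not part of the original model.

```python
from typing import Any, Mapping, Sequence, Tuple


def is_define(artifact: Mapping[str, Any], attribute: str) -> int:
    """IsDefine(A_i): 1 if the attribute is defined and has a value, else 0."""
    return 1 if artifact.get(attribute) is not None else 0


def state_vector(artifact: Mapping[str, Any], attributes: Sequence[str]) -> Tuple[int, ...]:
    """One IsDefine value per attribute A_1..A_m, in the given order."""
    return tuple(is_define(artifact, a) for a in attributes)


def conjunction(artifact: Mapping[str, Any], attributes: Sequence[str]) -> int:
    """Conjunction of IsDefine(A_i) over all m attributes."""
    return int(all(state_vector(artifact, attributes)))


# Hypothetical attribute set and artifact instance (illustrative names only).
ATTRIBUTES = ["product", "method", "result"]
sheet = {"product": "P-17", "method": "M-2", "result": None}

print(state_vector(sheet, ATTRIBUTES))  # (1, 1, 0)
print(conjunction(sheet, ATTRIBUTES))   # 0
```

In this sketch the per-attribute vector distinguishes intermediate states of the lifecycle, while the conjunction becomes 1 only once every attribute has a value.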

Definition 5. Service is a 5-tuple $(n, V_r, V_w, P, E)$, where $n$ is the name of a certain service, $n \in S$; $V_r$ and $V_w$ are finite sets of artifact classes, where $V_r$ is the set of artifacts which the service is about to read and $V_w$ is the set of artifacts which the service is about to rewrite; $P$ is the description of artifact states inputted by $V$; and $E$ is the description of activities on $V$.

Definition 6. Repository is a 4-tuple $R = (re, R_a, R_r, C_t)$, where $re$ is the name of the repository; $R_a$ and $R_r$ are the sets of stored and read artifacts, respectively; and $C_t$ is the reading condition for $R_r$.

Definition 7. Transport channel is a 2-tuple $(Cn, Cs)$, where $Cn$ is the name of the channel and $Cs$ is a 3-tuple (prior service/repository name, rear service/repository name, channel type), $Cs \in R \times S \times \{Read, ReadOnly\} \cup S \times R \times \{Write\}$, where $R$ and $S$ are the finite sets of repository elements and service elements in Artiflow, respectively, and the set of transport channel types is $\{Read, ReadOnly, Write\}$.
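Definitions 3 and 5 to 7 can be read as plain record types. The sketch below is one possible encoding in Python, under the assumption that names identify services and repositories; the class and field names are hypothetical, and the wiring of the example channels is invented for illustration even though the element names are taken from the quality inspection example. The only rule enforced is the channel typing of Definition 7: repository to service with type Read or ReadOnly, or service to repository with type Write.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class Service:
    name: str                           # n
    reads: FrozenSet[str] = frozenset()   # V_r
    writes: FrozenSet[str] = frozenset()  # V_w
    precondition: str = ""              # P
    effect: str = ""                    # E


@dataclass(frozen=True)
class Repository:
    name: str                             # re
    stored: FrozenSet[str] = frozenset()    # R_a
    readable: FrozenSet[str] = frozenset()  # R_r
    read_condition: str = ""              # C_t


@dataclass(frozen=True)
class Channel:
    name: str    # Cn
    source: str  # prior service/repository name
    target: str  # rear service/repository name
    kind: str    # Read, ReadOnly, or Write


@dataclass
class Artiflow:
    name: str                                                      # N
    services: Dict[str, Service] = field(default_factory=dict)      # S
    repositories: Dict[str, Repository] = field(default_factory=dict)  # R
    channels: List[Channel] = field(default_factory=list)          # C
    rules: List[str] = field(default_factory=list)                 # Ru

    def add_channel(self, ch: Channel) -> None:
        """Accept only channels in R x S x {Read, ReadOnly} or S x R x {Write}."""
        repo_to_service = (ch.source in self.repositories and ch.target in self.services
                           and ch.kind in {"Read", "ReadOnly"})
        service_to_repo = (ch.source in self.services and ch.target in self.repositories
                           and ch.kind == "Write")
        if not (repo_to_service or service_to_repo):
            raise ValueError(f"channel {ch.name!r} violates Definition 7")
        self.channels.append(ch)


# Illustrative wiring (an assumption, not read off the original figure).
flow = Artiflow("quality inspection")
flow.services["task assignment"] = Service("task assignment")
flow.repositories["task library"] = Repository("task library")
flow.repositories["assignment task library"] = Repository("assignment task library")
flow.add_channel(Channel("c1", "task library", "task assignment", "Read"))
flow.add_channel(Channel("c2", "task assignment", "assignment task library", "Write"))
```

Keeping the typing check in `add_channel` means an Artiflow instance built this way is a bipartite graph of services and repositories by construction.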

2.2. Data Modeling with Artifact Lifecycle. As suggested in Definitions 2 and 3, Artiflow is a logical model that records the artifact lifecycle, in which elements of repository, service, artifact type, and transport channel are abstracted to represent a realistic business process. Artiflow views a business process as a graph whose nodes are either "service" or "repository." We formalize Artiflow to facilitate data analysis and illustrate it to facilitate process analysis. Figure 1 illustrates a quality inspection process instance of a certain enterprise, where the main artifact is the "monitoring information sheet." The artifact captures the detected product's evolvement from creation to archiving, which includes all the business-relevant data in this process. The whole process comprises detection task registration, task assignment, task inspection, task auditing, and so forth. Note that the artifact "monitoring information sheet" depends throughout its lifecycle on coordination with other artifacts such as "product standard" and "method & standard." When the "monitoring information sheet" completes its lifecycle, it serves as a reference to form a new artifact, the "detection information sheet" (DIS for short).

[Figure 1: Lifecycle of business data in "monitoring information." Labels recoverable from the diagram include the messages M1 (request for detection), M2 (update method and standard), M3 (alter product standard), M4 (submit auditing; query detection result), and M5 (edit report); the events E1 to E5; nodes such as detection task registration, method and standard management, product standard management, task assignment, Lab 1, Lab 2, auditing, report editing, query and management, and task archiving; libraries such as the task library, assignment task library, detection result library, and audited information library; and the DIS artifact carried along the channels.]

In this figure, there are nine services ("task assignment," "auditing," etc.), seven repositories ("assignment task library," etc.), and a series of transport channels between these repositories and services.

2.3. Model Quality Evaluation. The same business object can be achieved by implementing different business processes, and each different business process corresponds to a different Artiflow model. We measure an Artiflow based on two factors: (1) the number of services, which determines the flexibility of the model, and (2) the repositories through which services read and update artifacts. It is in this context that we define the following theorem to measure the quality of artifact models.

Theorem 8. Given an Artiflow $(N, S, R, C, Ru)$ with $j$ Artifacts, where the number of attributes in any $Artifact_i$ is $n_i$, suppose $|S_i|$ and $|R_i|$ represent the service amount and repository amount of the corresponding $Artifact_i$, respectively. Formula (1) is defined to calculate the Artiflow's web service granularity and repository service proportion so as to measure the quality of models:

$$\pi = \frac{\sum_{i=1}^{j} \rho_i \left( \alpha \frac{|S_i|}{n_i} + \beta \left( 1 - \frac{|R_i|}{|S_i| + |R_i|} \right) \right)}{\sum_{i=1}^{j} \left( \alpha \frac{|S_i|}{n_i} + \beta \left( 1 - \frac{|R_i|}{|S_i| + |R_i|} \right) \right)}, \quad (1)$$

where $\alpha$, $\beta$, and $\rho_i$ are known.

Theorem Proving. For a given Artiflow $(N, S, R, C, Ru)$, each $Artifact_i$ comprises both a service sequence and a repository sequence, marked as $(S_x, R_u, \ldots, S_y, R_v)$, where the service sequence is described as $S_i = (S_x, \ldots, S_y)$ and the repository sequence is described as $R_i = (R_u, \ldots, R_v)$. Each $Artifact_i$ also contains $m$ attributes (that is, $m = n_i$ in the notation of Theorem 8).

Suppose $|S_i|$ and $|R_i|$ represent the service amount and repository amount, respectively; then $|S_i|/m$ represents the granularity of services when dividing the whole lifecycle of the artifact by its attribute number $m$. A larger value indicates that the lifecycle is divided into more blocks and the granularity is finer, which contributes to building a more flexible model.

In Artiflow, each artifact normally has a following repository to store its intermediate state, with the exception that some services can communicate with each other directly and do not need intermediate repositories. Therefore, for the same Artifact, the fewer the repository elements are, the less the redundancy is. $|R_i|/(|S_i| + |R_i|)$ represents the proportion of repository elements among both service and repository elements within the corresponding artifact lifecycle. The smaller the value is, the better the designed lifecycle is.

The quality of $Artifact_i$ is computed by the following formula:

$$\pi_i = \alpha \left( \frac{|S_i|}{n_i} \right) + \beta \left( 1 - \frac{|R_i|}{|S_i| + |R_i|} \right), \quad (2)$$

where $\alpha$ and $\beta$ are predefined constants used to balance the different magnitudes of the terms before and after the plus sign.

Each Artiflow comprises multiple Artifacts, so the quality measurement formula for the whole Artiflow is $\pi = \sum_{i=1}^{j} \rho_i \pi_i / \sum_{i=1}^{j} \pi_i$, where $j$ is the number of artifacts, $\sum_{i=1}^{j} \rho_i = 1$, and $\rho_i$ represents the importance of $Artifact_i$. The optimization of key Artifacts has a great impact on the whole model, while the optimization of less valuable artifacts does not contribute much to model efficiency. Note that $\rho_i$ can be either given by the user or obtained by data analysis.

By integrating both repository element redundancy and service element granularity,

$$\pi = \frac{\sum_{i=1}^{j} \rho_i \pi_i}{\sum_{i=1}^{j} \pi_i} = \frac{\sum_{i=1}^{j} \rho_i \left( \alpha \frac{|S_i|}{n_i} + \beta \left( 1 - \frac{|R_i|}{|S_i| + |R_i|} \right) \right)}{\sum_{i=1}^{j} \left( \alpha \frac{|S_i|}{n_i} + \beta \left( 1 - \frac{|R_i|}{|S_i| + |R_i|} \right) \right)} \quad (3)$$

can be deduced and taken to measure the model quality.
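To make formulas (2) and (3) concrete, here is a small Python sketch that computes the per-artifact score and the weighted model score. The function names, the default values of alpha and beta, and the sample numbers are assumptions chosen for illustration; the paper does not prescribe them.

```python
from typing import Sequence, Tuple


def artifact_quality(services: int, repositories: int, attributes: int,
                     alpha: float = 1.0, beta: float = 1.0) -> float:
    """Formula (2): alpha*|S_i|/n_i + beta*(1 - |R_i|/(|S_i| + |R_i|))."""
    granularity = services / attributes
    repository_share = repositories / (services + repositories)
    return alpha * granularity + beta * (1.0 - repository_share)


def artiflow_quality(artifacts: Sequence[Tuple[int, int, int]],
                     weights: Sequence[float],
                     alpha: float = 1.0, beta: float = 1.0) -> float:
    """Formula (3): sum(rho_i * pi_i) / sum(pi_i), where sum(rho_i) must be 1.

    Each artifact is given as (|S_i|, |R_i|, n_i).
    """
    if len(artifacts) != len(weights):
        raise ValueError("one weight per artifact is required")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights rho_i must sum to 1")
    scores = [artifact_quality(s, r, n, alpha, beta) for s, r, n in artifacts]
    return sum(w * p for w, p in zip(weights, scores)) / sum(scores)


# Hypothetical model with two artifacts: (services, repositories, attributes).
sample = [(4, 2, 8), (2, 2, 4)]
rho = [0.6, 0.4]
print(round(artiflow_quality(sample, rho), 4))
```

For the sample above the per-artifact scores are 7/6 and 1, so the model score is (0.6 * 7/6 + 0.4) / (7/6 + 1) = 33/65.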

3. Business Data Querying

Enterprises like Google and Amazon provide plenty of cloud services, which offer an open storage solution for data such as process data all over the world. But off-site storage, even in a public cloud, is unsafe with respect to data privacy. In this case, these data need to be encrypted and then stored in the database. It is hard, however, to make a trade-off between data security and query speed, because process data need to be frequently queried, modified, and transmitted. In this section, we study how to partition encrypted artifacts and how to devise a superior query plan for cryptograph queries that minimizes the execution cost.

3.1. Business Data Partition. In order to ensure the efficiency of a business process, a superior data partition is required. When the Bucket partition method is used, the query result on the cryptograph is actually a superset of the true results generated by the relevant operators; it is then filtered at the client after decryption. Thus a superior partition method is of great help; it aims to minimize the work done as much as possible, for example by minimizing the number of interferential results.

3.1.1. Data Analysis

Definition 9 (Bucket [18]). The domain of attribute $A$ is mapped into a set of partitions $\{p_1, \ldots, p_M\}$, where $p_i \cap p_j = \emptyset$ for $i \ne j$, $1 \le i, j \le M$. Each partition $p_i$ is named a Bucket, and $M$ is the Bucket number.

Definition 10 (the user interest on artifact). Suppose the Artifact's attribute $A$ is queried $n$ times, and $q(a_i)$ represents any single result of queries $q$ that contains value $a_i$. Let $f(q(a_i))$ be the frequency of $q(a_i)$ occurring in $n$ trials; as $n$ increases, the frequency stabilizes at a certain value, which is expressed as $p(q(a_i))$. In other words, $p(q(a_i))$ is the probability that artifact attribute $a_i$ emerges in the query result dataset, called user interest.

Definition 11 (interferential artifact). An interferential artifact (Intf-Artifact) is an artifact which is an incorrect result but belongs to the cryptograph query result $q^{*}(a_i)$; the set of such artifacts is named $INTFA(q^{*}(a_i))$.

3.1.2. Min-Interference Partition. All Artifacts in each Bucket correspond to a given index number in the Bucket-based cryptograph partition. A cryptograph query returns all the encrypted Artifacts in the Bucket where a true result exists. The rest of the Bucket is transmitted to users as Intf-Artifacts, which must then be deciphered and further filtered. Hence, the Bucket partition method determines the number of Intf-Artifacts, which further affects the query processing cost.

Suppose a cryptograph relation contains $n$ tuples $Artifact_1, Artifact_2, \ldots, Artifact_n$ and $k$ is a large integer; we then pose $k$ random queries. In total there are $k_i$ queries whose final query result is $Artifact_i$, and another $l_i$ tuples are returned as the provisional result. In this case, the expectation of Intf-Artifacts is $l_i \cdot (k_i/k)$.

There are $n$ tuples in the relation in all, so the expectation of total Intf-Artifacts is

$$l_1 \cdot \frac{k_1}{k} + l_2 \cdot \frac{k_2}{k} + \cdots + l_n \cdot \frac{k_n}{k}. \quad (4)$$
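Formula (4) is a weighted average, which the following short Python sketch evaluates directly. It assumes that every one of the k queries has exactly one target tuple, so that k is the sum of the k_i; the function name and the sample counts are hypothetical.

```python
from typing import Sequence


def expected_interference(extra_tuples: Sequence[int], query_counts: Sequence[int]) -> float:
    """Formula (4): sum over i of l_i * (k_i / k).

    extra_tuples[i] is l_i, the number of additional tuples returned with Artifact_i.
    query_counts[i] is k_i, the number of queries whose true result is Artifact_i.
    Assumption: k equals the sum of all k_i.
    """
    if len(extra_tuples) != len(query_counts):
        raise ValueError("l and k must have the same length")
    k = sum(query_counts)
    if k == 0:
        raise ValueError("at least one query is required")
    return sum(l_i * k_i for l_i, k_i in zip(extra_tuples, query_counts)) / k


# Hypothetical relation with three tuples and 100 queries.
print(expected_interference([2, 2, 0], [50, 30, 20]))  # 1.6
```

In the sample, the frequently queried tuples each drag two extra tuples along, which dominates the expected cost; moving them to smaller Buckets would lower it.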

For each Bucket containing $n$ different attribute values, its user interest is $p(q(B)) = p(q(a_1)) + p(q(a_2)) + \cdots + p(q(a_n))$. If the user interest on the $i$th artifact in a given Bucket is $p(q(a_i))/p(q(B))$ $(1 \le i \le n)$, then the number of Intf-Artifacts brought by the above query is $|INTFA(q^{*}(a_i))| = f_1 + f_2 + \cdots + f_{i-1} + f_{i+1} + \cdots + f_n$.

For Bucket $j$ $(1 \le j \le M)$, based on the user interest on each artifact and the number of Intf-Artifacts in each Bucket, we can describe the Bucket Intf-Artifact as follows:

$$
\begin{aligned}
WINTFA &= \sum_{j=1}^{M} \left[ p(q(B_j)) \right] \cdot \left| INTFA(Bucket_j) \right| \\
&= \sum_{j=1}^{M} \left[ p(q(B_j)) \cdot \sum_{i=1}^{n} \left( \frac{p(q(a_i))}{p(q(B_j))} \left| INTFA(q^{*}(a_i)) \right| \right) \right] \\
&= \sum_{j=1}^{M} \left[ p(q(B_j)) \cdot \sum_{i=1}^{n} \left( \frac{p(q(a_i))}{p(q(B_j))} \left( \sum_{t=1}^{n} f_t^{j} - f_i^{j} \right) \right) \right] \\
&= \sum_{j=1}^{M} \left[ \sum_{i=1}^{n} p(q(a_i)) \sum_{t=1}^{n} f_t^{j} - \sum_{i=1}^{n} p(q(a_i)) f_i^{j} \right] \\
&= \sum_{j=1}^{M} \left[ \sum_{i=1}^{n} p(q(a_i)) F^{j} - \sum_{i=1}^{n} p(q(a_i)) f_i^{j} \right],
\end{aligned} \quad (5)
$$

where $F^{j} = \sum_{t=1}^{n} f_t^{j}$, as follows from the preceding line.

From here we see that, for a fixed Bucket number, the smaller the value of formula (5) is, the better the index is. A larger value brings a heavy cost when querying and renders the Bucket partition inefficient. From the probability angle, a Bucket that contains an artifact with higher user interest should contain fewer Artifacts. Therefore, user interest on each artifact should be viewed as the weight in the whole process. Moreover, when the index is being built, formula (5) is used to determine which Bucket each artifact is stored in, which helps to obtain an optimal partition result.
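The effect of formula (5) can be checked numerically. The Python sketch below evaluates the last line of the derivation for a given partition and compares two candidate partitions of the same four attribute values. It assumes that each value is described by a pair (user interest, number of artifacts carrying that value); this pair representation, the function name, and the sample numbers are illustrative assumptions.

```python
from typing import Sequence, Tuple

# (user interest p(q(a_i)), number of artifacts f_i with that value); an assumed encoding.
Value = Tuple[float, int]


def weighted_interference(buckets: Sequence[Sequence[Value]]) -> float:
    """Formula (5): sum over buckets j of sum_i p(q(a_i)) * (F^j - f_i^j)."""
    total = 0.0
    for bucket in buckets:
        bucket_size = sum(f for _, f in bucket)  # F^j
        total += sum(p * (bucket_size - f) for p, f in bucket)
    return total


hot = (0.7, 1)   # one value with high user interest
cold = (0.1, 1)  # three values with low user interest

# Candidate A isolates the hot value; candidate B mixes it with a cold one.
partition_a = [[hot], [cold, cold, cold]]
partition_b = [[hot, cold], [cold, cold]]

print(weighted_interference(partition_a))
print(weighted_interference(partition_b))
```

With these numbers candidate A evaluates to about 0.6 and candidate B to about 1.0, which matches the observation that a Bucket holding a high-interest artifact should contain fewer Artifacts.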

3.2. Business Data Query. The cloud service stores encrypted artifact information and the corresponding index information, while other information, such as the partitioning of attributes, the mapping function, and so forth, is stored at the client. When a user issues a query request, query $q$ should be rewritten to its server-side cryptograph query $q^{*}$, which is then executed on the cloud. The purpose of rewriting SQL queries is to split the query computation across the client and the cloud.
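The split between cloud and client can be sketched end to end in a few lines of Python. In this sketch the cloud sees only opaque blobs and Bucket IDs, and the client keeps the partition metadata, rewrites the query, and filters out Intf-Artifacts after decoding. Everything named here is hypothetical, and base64 is used purely as a stand-in for encryption: it offers no confidentiality, and a real deployment would use an actual cipher.

```python
import base64
import json
from typing import Dict, List, Tuple

# Hypothetical client-side metadata: inclusive value ranges of `aid` mapped to Bucket IDs.
AID_BUCKETS = [(0, 100, 3), (101, 200, 7)]


def bucket_id(aid: int) -> int:
    """Client-side mapping from a plaintext value to its Bucket ID."""
    for low, high, bid in AID_BUCKETS:
        if low <= aid <= high:
            return bid
    raise ValueError(f"aid {aid} is outside every bucket")


def seal(row: Dict) -> str:
    """Stand-in for encryption. base64 is NOT secure; it only makes the row opaque here."""
    return base64.b64encode(json.dumps(row, sort_keys=True).encode("utf-8")).decode("ascii")


def unseal(blob: str) -> Dict:
    """Stand-in for decryption at the client."""
    return json.loads(base64.b64decode(blob).decode("utf-8"))


def upload(rows: List[Dict]) -> List[Tuple[str, int]]:
    """What the cloud stores: (opaque tuple, Bucket ID of aid)."""
    return [(seal(r), bucket_id(r["aid"])) for r in rows]


def server_query(table: List[Tuple[str, int]], bid: int) -> List[str]:
    """Server-side part q*: select by Bucket ID only."""
    return [blob for blob, b in table if b == bid]


def client_query(table: List[Tuple[str, int]], aid: int) -> Tuple[List[Dict], int]:
    """Client-side part: rewrite, fetch the superset, decode, and filter."""
    candidates = [unseal(b) for b in server_query(table, bucket_id(aid))]
    hits = [r for r in candidates if r["aid"] == aid]
    return hits, len(candidates) - len(hits)


cloud = upload([{"aid": 10, "aname": "a"}, {"aid": 20, "aname": "b"}, {"aid": 150, "aname": "c"}])
print(client_query(cloud, 10))  # ([{'aid': 10, 'aname': 'a'}], 1)
```

The second element of the result counts the Intf-Artifacts that were transmitted and discarded, which is exactly the cost the partition method tries to reduce.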

3.2.1. Basic Definitions

Definition 12. $\xi_{(*,a]}(x)$ is a function which returns the set of all Bucket IDs whose right boundary value $B^{*}_{j.right}$ is not greater than $x$ when the Buckets are partitioned once; that is, $\xi_{(*,a]}(x) = \{ BID^{*}_{j} \mid B^{*}_{j.right} \le x \}$.

Definition 13. $\xi_{[a,*)}(x)$ is a function which returns the set of all Bucket IDs whose left boundary value $B^{*}_{j.left}$ is not less than $x$ when the Buckets are partitioned once; that is, $\xi_{[a,*)}(x) = \{ BID^{*}_{j} \mid B^{*}_{j.left} \ge x \}$.
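Definitions 12 and 13 amount to two set comprehensions over the Bucket boundaries. The Python sketch below implements them literally, using the comparison operators of the set expressions (right boundary at most x, left boundary at least x). The `Bucket` record, the function names, and the sample boundaries and IDs are assumptions for illustration.

```python
from typing import NamedTuple, Sequence, Set


class Bucket(NamedTuple):
    bid: int    # Bucket ID stored on the server
    left: int   # left boundary value
    right: int  # right boundary value


def xi_upto(buckets: Sequence[Bucket], x: int) -> Set[int]:
    """Definition 12: IDs of all Buckets whose right boundary is <= x."""
    return {b.bid for b in buckets if b.right <= x}


def xi_from(buckets: Sequence[Bucket], x: int) -> Set[int]:
    """Definition 13: IDs of all Buckets whose left boundary is >= x."""
    return {b.bid for b in buckets if b.left >= x}


# Hypothetical partition; the IDs are deliberately not in boundary order.
BUCKETS = [Bucket(3, 0, 100), Bucket(7, 100, 200), Bucket(5, 200, 300)]

print(sorted(xi_upto(BUCKETS, 200)))  # [3, 7]
print(sorted(xi_from(BUCKETS, 100)))  # [5, 7]
```

Because the functions return sets of IDs rather than a single threshold, they work even when Bucket IDs are assigned randomly and carry no order information.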

Definition 14. $\xi_{(*,p(q(a_i))]}(x)$ is a function which returns the set of all Bucket IDs whose maximum artifact query probability $B^{*}_{j.p.right}$ is not greater than $x$ when the Buckets are partitioned twice; that is, $\xi_{(*,p(q(a_i))]}(x) = \{ BID^{*}_{j} \mid B^{*}_{j.p.right} \le x \}$.

Definition 15. $\xi_{[p(q(a_i)),*)}(x)$ is a function which returns the set of all Bucket IDs whose minimum artifact query probability $B^{*}_{j.p.left}$ is not less than $x$ when the Buckets are partitioned twice; that is, $\xi_{[p(q(a_i)),*)}(x) = \{ BID^{*}_{j} \mid B^{*}_{j.p.left} \ge x \}$.

Definition 16. $\delta_{cond}(C)$ is a function that translates specific query conditions to encrypted ones.

Definition 17. The query rewriting function is described as $\delta_{query}(q) \Rightarrow q^{*}$, where $q$ is the original query and $q^{*}$ is the cryptograph query.

322 Query Rewriting Rules In view of grammatical rulesquery condition cond includes V 119860 119860 cond

1or cond

2

cond1

and cond2 where ldquordquo is the operator such as equal less

than not greater than greater than and not less than We listthe rewrite formulas for various query conditions as shownin Formulas (6) to (8)

(1) $A \;\mathrm{op}\; V$:

\[
\begin{aligned}
\delta_{\mathrm{cond}}(x = e) &\overset{1}{\Longrightarrow} A^{*} = \xi_{e}(x),\\
\delta_{\mathrm{cond}}(x < e) &\overset{2}{\Longrightarrow} A^{*} \le \xi_{e}(x),\\
\delta_{\mathrm{cond}}(x < e) &\overset{3}{\Longrightarrow} A^{*} \in \xi_{(*,e]}(x),\\
\delta_{\mathrm{cond}}(x > e) &\overset{4}{\Longrightarrow} A^{*} \ge \xi_{e}(x),\\
\delta_{\mathrm{cond}}(x > e) &\overset{5}{\Longrightarrow} A^{*} \in \xi_{[e,*)}(x),
\end{aligned}
\tag{6}
\]

where both Map 2 and Map 4 are order preserving and both Map 3 and Map 5 are random.

(2) $A \;\mathrm{op}\; A$:

\[
\begin{aligned}
\delta_{\mathrm{cond}}(A_i < A_j) &\overset{1}{\Longrightarrow} \bigvee \bigl(A_i^{*} = \mathrm{Bid}_{A_i}(p_k) \land A_j^{*} \ge \xi_{A_i}(p_{\mathrm{left}})\bigr),\\
\delta_{\mathrm{cond}}(A_i < A_j) &\overset{2}{\Longrightarrow} \bigvee \bigl(A_j^{*} = \mathrm{Bid}_{A_j}(p_l) \land A_i^{*} \ge \xi_{A_j}(p_{\mathrm{right}})\bigr),\\
\delta_{\mathrm{cond}}(A_i < A_j) &\overset{3}{\Longrightarrow} \bigvee \bigl(\xi_{A_i}(p_{k,\mathrm{left}}) \le \xi_{A_j}(p_{l,\mathrm{right}})\bigr),\\
\delta_{\mathrm{cond}}(A_i < A_j) &\overset{4}{\Longrightarrow} \bigvee \bigl(A_i^{*} = \mathrm{Bid}_{A_i}(p_k) \land A_j^{*} = \mathrm{Bid}_{A_j}(p_l)\bigr),
\end{aligned}
\tag{7}
\]

where $p_k \in \mathrm{partition}(A_i)$, $p_l \in \mathrm{partition}(A_j)$, and $p_{l,\mathrm{high}} \ge p_{k,\mathrm{low}}$. When the condition is $A_i < A_j$, in Map 1 $A_i$ is order preserving, while in Map 3 both $A_i$ and $A_j$ are order preserving. Meanwhile, in Map 2 $A_j$ is order preserving, and in Map 4 both $A_i$ and $A_j$ are random.

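(3) $\mathrm{cond}_1 \lor \mathrm{cond}_2$ and $\mathrm{cond}_1 \land \mathrm{cond}_2$:

\[
\begin{aligned}
\delta_{\mathrm{cond}}(\mathrm{cond}_1 \lor \mathrm{cond}_2) &\Longrightarrow \delta_{\mathrm{cond}}(\mathrm{cond}_1) \lor \delta_{\mathrm{cond}}(\mathrm{cond}_2),\\
\delta_{\mathrm{cond}}(\mathrm{cond}_1 \land \mathrm{cond}_2) &\Longrightarrow \delta_{\mathrm{cond}}(\mathrm{cond}_1) \land \delta_{\mathrm{cond}}(\mathrm{cond}_2).
\end{aligned}
\tag{8}
\]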

For instance, suppose there are two artifact plaintext tables in the cloud database, app(aid, aname, time, content, cid) and check(cid, aid, result), where the range of attribute aid is divided into 6 partitions: $\mathrm{id}_{\mathrm{app.aid}}([0, 100]) = 3$, $\mathrm{id}_{\mathrm{app.aid}}((100, 200]) = 7$, $\mathrm{id}_{\mathrm{app.aid}}((200, 300]) = 5$, $\mathrm{id}_{\mathrm{app.aid}}((300, 400]) = 1$, $\mathrm{id}_{\mathrm{check.aid}}([0, 200]) = 2$, and $\mathrm{id}_{\mathrm{check.aid}}((200, 400]) = 6$.

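Given the above partition results, we rewrite the following query conditions based on the above formulas:

\[
\begin{aligned}
\delta_{\mathrm{cond}}(\mathrm{aid} = 256) &\Longrightarrow \mathrm{aid}^{*} = 5,\\
\delta_{\mathrm{cond}}(\mathrm{aid} < 180) &\Longrightarrow \mathrm{aid}^{*} \in \{3, 7\},\\
\delta_{\mathrm{cond}}(\mathrm{aid} > 240) &\Longrightarrow \mathrm{aid}^{*} \in \{5, 1\}.
\end{aligned}
\tag{9}
\]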

120575cond (appdid = checkdid)rArr (applowastdidlowast = 3 and checklowastdidlowast = 2) or (applowastdidlowast = 7 and checklowastdidlowast = 2) or (applowastdidlowast =5 and checklowastdidlowast = 6) or (applowastdidlowast = 1 and checklowastdidlowast = 6)

120575cond (appdid lt checkdid)rArr (applowastdidlowast = 3 and checklowastdidlowast = 2) or (applowastdidlowast = 3 and checklowastdidlowast = 6) or (applowastdidlowast =7 and checklowastdidlowast = 2) or (applowastdidlowast = 7 and checklowastdidlowast = 6) or

(applowastdidlowast = 5 and checklowastdidlowast = 6) or (applowastdidlowast = 1 and checklowastdidlowast = 6)
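The rewrites of this example can be reproduced with a short Python sketch. It is an illustration under stated assumptions rather than the implementation used in this paper: the bucket tables copy the six partitions of app and check given above, whereas the function names (`rewrite_eq`, `rewrite_lt`, `rewrite_gt`, `join_eq`, `join_lt`) and the tuple layout are hypothetical. Attribute values are treated as continuous, and every attribute is assumed to be randomly mapped, so each condition is rewritten into a set of bucket IDs or of bucket-ID pairs.

```python
from itertools import product

# Each partition is (low, high, low_closed, bucket_id); the upper bound is closed.
APP = [(0, 100, True, 3), (100, 200, False, 7), (200, 300, False, 5), (300, 400, False, 1)]
CHECK = [(0, 200, True, 2), (200, 400, False, 6)]


def contains(part, x):
    low, high, low_closed, _ = part
    return (low <= x if low_closed else low < x) and x <= high


def rewrite_eq(parts, value):
    """Bucket IDs that may hold a tuple with attribute == value."""
    return [p[3] for p in parts if contains(p, value)]


def rewrite_lt(parts, value):
    """Bucket IDs that may hold a tuple with attribute < value."""
    return [p[3] for p in parts if p[0] < value]


def rewrite_gt(parts, value):
    """Bucket IDs that may hold a tuple with attribute > value."""
    return [p[3] for p in parts if p[1] > value]


def overlaps(p, q):
    low, high = max(p[0], q[0]), min(p[1], q[1])
    if low != high:
        return low < high
    # The ranges touch in one point; it is shared only if each range
    # that starts at that point is closed on the left.
    return all(r[2] for r in (p, q) if r[0] == low)


def join_eq(left, right):
    """Bucket-ID pairs that may contain equal values."""
    return [(p[3], q[3]) for p, q in product(left, right) if overlaps(p, q)]


def join_lt(left, right):
    """Bucket-ID pairs (p, q) that may contain a < b with a in p and b in q."""
    return [(p[3], q[3]) for p, q in product(left, right) if p[0] < q[1]]


print(rewrite_eq(APP, 256))   # [5]
print(rewrite_lt(APP, 180))   # [3, 7]
print(rewrite_gt(APP, 240))   # [5, 1]
print(join_eq(APP, CHECK))    # [(3, 2), (7, 2), (5, 6), (1, 6)]
print(join_lt(APP, CHECK))    # [(3, 2), (3, 6), (7, 2), (7, 6), (5, 6), (1, 6)]
```

Every returned bucket may still contain tuples that fail the original condition, which is why the client has to filter the decrypted result once more.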

3.2.3. Query Optimization Principles. Because data is encrypted and stored in various places, in order to reduce the transmission cost and improve the query efficiency, we should run operations on cloud services as much as possible, so that the answers can be computed with little effort by the client.

Figure 2: Initial syntax tree.

Figure 3: Syntax tree applied to cloud DB.

For clear expression, the operation procedure is expressed using a syntax tree. The decryption operation splits the tree into cryptograph operations and plaintext operations. Because every operation on the original tree ends with a selection after decryption, the principle of query optimization with the syntax tree is to iteratively pull up the selection.

For example, given the query "SELECT chairman FROM Airway, Price, Plane WHERE price < 900 AND begin = 'Shanghai' AND end = 'Beijing' AND Price.planeid = Plane.planeid AND Plane.airway = Airway.airway", we use the query tree to illustrate how to optimize this query and describe the detailed procedure.

In Figure 2, the SQL statement is converted into an initial syntax tree. If the enterprise uses cloud services or other off-site storage platforms, we need to first decrypt the cryptograph and then query the data at the client, as shown in Figure 3, where the cryptograph database on the cloud is bounded by the dotted line. Query objects (Price, Plane, and Airway) are converted to cryptograph tables (Price∗, Plane∗, and Airway∗) in the cloud database.

Operations on the syntax tree are performed from bottom to top. In Figure 3, the first step is to execute the selection; the following steps include rewriting the condition of each selection operation, converting it to a selection on the cryptograph in the cloud database, and then decrypting and further filtering the result at the client. A new syntax tree is derived, as shown in Figure 4.

Figure 4: Rewriting syntax tree.

Figure 5: Moving selections in syntax tree.

According to the optimization principles described above, we should iteratively pull up selections. Therefore, by exchanging the positions of the selection operations (price < 900, begin = "Shanghai", and end = "Beijing") and the join operation and then combining the corresponding conditions, we obtain a new syntax tree, as shown in Figure 5.

Moreover, based on the operation rewriting rules, the join operation in Figure 5 should be converted into two parts: the join on the cryptograph in the cloud database and the selection on the decrypted provisional results, as shown in Figure 6. We repeat the above steps, rewrite all kinds of operations, and continuously exchange the positions of selection operations and other operations until no selection can be pulled up any further. As a result, we get the final syntax tree, as shown in Figure 7. Operations within the dotted line are executed on the cloud service, whereas the user only needs to execute the last selection. The above method thus takes full advantage of the cloud service to reduce the cost of transmission and postprocessing and to improve the efficiency of artifact querying in business processes.
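The division of work between cloud and client can be sketched in Python for the single selection price < 900. The sketch rests on several assumptions that are not part of this paper: the price bucket boundaries and IDs are invented, `encrypt` and `decrypt` are base64 placeholders standing in for a real cipher, and all function names are hypothetical.

```python
import base64
import json

# Hypothetical (low, high] price ranges with their bucket IDs.
PRICE_BUCKETS = [(0, 500, 4), (500, 1000, 9), (1000, 2000, 2)]


def bucket_id(price):
    for low, high, bid in PRICE_BUCKETS:
        if low < price <= high:
            return bid
    raise ValueError(f"price {price} is outside every bucket")


def encrypt(row):
    # Placeholder only: base64 is an encoding, NOT encryption.
    return base64.b64encode(json.dumps(row).encode()).decode()


def decrypt(blob):
    return json.loads(base64.b64decode(blob))


def outsource(rows):
    """Build the cloud table Price*: encrypted tuple plus bucket ID of price."""
    return [{"etuple": encrypt(r), "price_bid": bucket_id(r["price"])} for r in rows]


def server_select(cloud_table, bucket_ids):
    """Cloud side: coarse selection on bucket IDs, no plaintext needed."""
    return [t["etuple"] for t in cloud_table if t["price_bid"] in bucket_ids]


def query_price_below(cloud_table, limit):
    """Client side: rewrite the condition, then decrypt and filter exactly."""
    bucket_ids = {bid for low, _, bid in PRICE_BUCKETS if low < limit}
    candidates = server_select(cloud_table, bucket_ids)
    rows = [decrypt(c) for c in candidates]
    return len(candidates), [r for r in rows if r["price"] < limit]


cloud = outsource([
    {"planeid": 1, "price": 300},
    {"planeid": 2, "price": 850},
    {"planeid": 3, "price": 950},
    {"planeid": 4, "price": 1500},
])
shipped, answer = query_price_below(cloud, 900)
print(shipped, [r["planeid"] for r in answer])  # 3 [1, 2]
```

In this run the cloud ships three candidate tuples and the client discards one of them (price 950), which illustrates the postprocessing cost that the partition method tries to keep low.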

4. Case Study

In this section, we introduce a business instance of a certain enterprise. Based on the method in Section 2, we complete the data modeling with artifact lifecycle from a given process instance and illustrate the query process through the query tree described in Section 3.

An enterprise's process of equipment purchase/scrap involves the following steps. At first, the equipment division fills out the equipment purchase/scrap application and hands it to department managers and the company's leadership for approval. If the application is approved, it is archived; otherwise, it is withdrawn. The purchasing department makes the purchase according to a copy of the application, and when the purchase is completed, the documents are archived. The equipment division scraps the equipment based on specific methods and standards and then archives the processing results. The assets department regularly verifies the company's assets based on the purchase/scrap equipment information. The archive department has permission to query all the archived information.

Figure 6: Rewriting syntax tree.

This process involves multiple departments and multiple sets of information. If we manage the data alone, management is difficult because business data are complicated and even a single attribute has different values in different events. If we manage the process alone, only the department activities are involved, while the business data in the process are ignored. In this context, we analyze the process with respect to both data and process and describe this instance with an Artiflow $(N, S, R, C, \mathrm{Ru})$, where

$N$: "EPS"

$S$: {FilloutEPA, Audit1, Audit2, Query, Purchase, AssetVerification}

$R$: {NewEPA, PrimaryEPA, FinalEPA, UnapprovedEPA, PO, FAL}

$C$: {FtoN, NtoA}

$\mathrm{Ru}$: {constraint(EPA) = (FilloutEPA, Audit1, Audit2)}

The model contains multiple artifacts. "EPA" describes the equipment purchase application; its lifecycle runs from filling and auditing to archiving. Once archived, it supports asset verification and query processing. "ESA" describes the equipment scrap application; its lifecycle likewise runs from filling and auditing to archiving. It is associated with another artifact called "method & standard." The lifecycle of "PO"/"SL" captures the process from purchase/scrap to application archive. The whole process is shown in Figure 8.

Figure 7: The ultimate syntax tree.

Artifact Example

Artifact $= (C, A, \tau, Q, s, F)$

$C$ = "EPA"

$A$: {EquipmentName, PurchaseAmount, UnitPrice, ApplicationDate, Applicant, AuditingComment, AuditingDate}

$Q$: {empty table (initial state $s$), basic information filling, delivery auditing, auditing completion, audited application archiving (terminate state $F$)}

$\tau$: {EquipmentName: Varchar, PurchaseAmount: Int, UnitPrice: Int, ApplicationDate: Date, Applicant: Varchar, AuditingComment: Varchar, AuditingDate: Date}

Service Example

service $= (n, V_r, V_w, P, E)$, where

$n$ = "Audit2"

$V_r$: EPA

$V_w$: EPA

$P$: DEFINED(EquipmentName) ∧ DEFINED(PurchaseAmount) ∧ DEFINED(UnitPrice) ∧ DEFINED(ApplicationDate) ∧ DEFINED(Applicant) ∧ ¬DEFINED(AuditingComment) ∧ ¬DEFINED(AuditingDate)

$E$: DEFINED(EquipmentName) ∧ DEFINED(PurchaseAmount) ∧ DEFINED(UnitPrice) ∧ DEFINED(ApplicationDate) ∧ DEFINED(Applicant) ∧ DEFINED(AuditingComment) ∧ DEFINED(AuditingDate)
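The precondition P and effect E of "Audit2" can be checked mechanically. The following Python sketch shows one possible encoding, assuming that an EPA artifact is held as a dictionary and that an attribute counts as defined when its value is not None; the helper names (`defined`, `precondition`, `effect`, `audit2`) and the sample values are hypothetical.

```python
from datetime import date

FILLED_BEFORE_AUDIT = ("EquipmentName", "PurchaseAmount", "UnitPrice",
                       "ApplicationDate", "Applicant")
WRITTEN_BY_AUDIT = ("AuditingComment", "AuditingDate")


def defined(artifact, attribute):
    """DEFINED(attribute): the attribute exists and has a value."""
    return artifact.get(attribute) is not None


def precondition(epa):
    """P: application data is filled in, auditing data is still empty."""
    return (all(defined(epa, a) for a in FILLED_BEFORE_AUDIT)
            and not any(defined(epa, a) for a in WRITTEN_BY_AUDIT))


def effect(epa):
    """E: every attribute of the artifact is defined."""
    return all(defined(epa, a) for a in FILLED_BEFORE_AUDIT + WRITTEN_BY_AUDIT)


def audit2(epa, comment, audit_date):
    """Run the service only from a state satisfying P; the result satisfies E."""
    if not precondition(epa):
        raise ValueError("EPA is not in the state required by P")
    audited = {**epa, "AuditingComment": comment, "AuditingDate": audit_date}
    assert effect(audited)
    return audited


epa = {
    "EquipmentName": "depth sounder",
    "PurchaseAmount": 2,
    "UnitPrice": 5000,
    "ApplicationDate": date(2014, 8, 1),
    "Applicant": "equipment division",
}
audited_epa = audit2(epa, "approved", date(2014, 8, 5))
print(effect(audited_epa))  # True
```

Calling `audit2` a second time on the audited artifact raises an error, because the auditing attributes are already defined and P no longer holds.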

Repository Example

$R = (\mathrm{re}, R_a, R_r, C_t)$

re: "FinalEPA"

$R_a$: EPA

$R_r$: EPA

$C_t$: IsDefine(AuditingComment)

There is a repository named "FinalEPA" which reads and stores artifact "EPA" only if "AuditingComment" has been assigned.

Figure 8: Lifecycle of business data in equipment purchase/scrap process.

Figure 9: Initial query on EPS.

Given a query "SELECT app_name FROM FAL, PO, FinalEPA WHERE FAL.name = 'depth sounder' AND FAL.POid = PO.Id AND PO.EPAid = FinalEPA.Id AND FinalEPA.Audit2 = 'Doctor Li'", it can be converted into a syntax tree, as shown in Figure 9, which can be further converted into a new syntax tree, shown in Figure 10. Queries will be issued on this syntax tree.

Figure 10: Ultimate query on EPS.

5. Conclusion

There is no doubt that more and more large datasets will be produced during business process execution, and these business data are extremely valuable. In this paper, we modeled business data through its lifecycle from the perspective of process, which ensures the integrity of dynamic business data. Furthermore, we presented the notion of user interest on business data, which helps to keep the number of interferential tuples to a minimum during data partition and ensures a lower cost of the postprocessing brought by data partition. Considering that current business data are mostly stored on the cloud, we proposed a query rewriting strategy for off-site encrypted data, which reduces the postprocessing cost. Currently, there is little research on business data modeling and querying in the true sense. Our research lays a foundation for the application of business data in enterprises. It is the initial step of a business data management architecture, and we will further research business data analysis with its lifecycle to fully exploit the value of business data.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (61272098) and the Science and Technology Development Foundation of Shanghai Ocean University.

References

[1] W. Huang, Z. Chen, W. Dong, H. Li, B. Cao, and J. Cao, "Mobile internet big data platform in China Unicom," Tsinghua Science and Technology, vol. 19, no. 1, pp. 95–101, 2014.

[2] S. Sagiroglu and D. Sinanc, "Big data: a review," in Proceedings of the International Conference on Collaboration Technologies and Systems (CTS '13), pp. 42–47, San Diego, Calif, USA, May 2013.

[3] M. Chui, B. Brown, J. Bughin et al., Big Data: The Next Frontier for Innovation, Competition, and Productivity, McKinsey Global Institute, 2011.

[4] M. Singh and S. K. Jain, "A survey on dataspace," in Advances in Network Security and Applications, vol. 196 of Communications in Computer and Information Science, pp. 608–621, Springer, 2011.

[5] K. Belhajjame, N. W. Paton, S. M. Embury, A. A. A. Fernandes, and C. Hedeler, "Incrementally improving dataspaces based on user feedback," Information Systems, vol. 38, no. 5, pp. 656–687, 2013.

[6] J.-P. Dittrich and M. A. Vaz Salles, "IDM: a unified and versatile data model for Personal Dataspace Management," in Proceedings of the 32nd International Conference on Very Large Data Bases (VLDB '06), pp. 367–378, September 2006.

[7] S. Pradhan, "Towards a novel desktop search technique," in Database and Expert Systems Applications, pp. 192–201, Springer, Berlin, Germany, 2007.

[8] M. Zhong, M. Liu, and Q. Chen, "Modeling heterogeneous data in dataspace," in Proceedings of the IEEE International Conference on Information Reuse and Integration (IEEE IRI '08), pp. 404–409, Las Vegas, Nev, USA, July 2008.

[9] A. Sarma, X. Dong, and A. Halevy, "Data modeling in dataspace support platforms," in Conceptual Modeling: Foundations and Applications, pp. 122–138, Springer, Berlin, Germany, 2009.

[10] R. Hull and J. Su, Report on NSF Workshop on Data-Centric Workflows, 2012, http://dcw2009.cs.ucsb.edu/report.pdf.

[11] R. Hull, J. Su, and R. Vaculin, "Data management perspectives on business process management," in Proceedings of the ACM SIGMOD Conference on Management of Data (SIGMOD '13), pp. 943–947, June 2013.

[12] K. Bhattacharya, N. S. Caswell, S. Kumaran, A. Nigam, and F. Y. Wu, "Artifact-centered operational modeling: lessons from customer engagements," IBM Systems Journal, vol. 46, no. 4, pp. 703–721, 2007.

[13] K. Bhattacharya, R. Guttman, K. Lyman et al., "A model-driven approach to industrializing discovery processes in pharmaceutical research," IBM Systems Journal, vol. 44, no. 1, pp. 145–162, 2005.

[14] R. Vaculín, R. Hull, T. Heath, C. Cochran, A. Nigam, and P. Sukaviriya, "Declarative business artifact centric modeling of decision and knowledge intensive business processes," in Proceedings of the 15th IEEE International EDOC Enterprise Computing Conference (EDOC '11), pp. 151–160, September 2011.

[15] R. Vaculín, R. Hull, M. Vukovic, T. Heath, N. Mills, and Y. Sun, "Supporting collaborative decision processes," in Proceedings of the IEEE 10th International Conference on Services Computing (SCC '13), pp. 651–658, July 2013.

[16] A. Nigam and N. S. Caswell, "Business artifacts: an approach to operational specification," IBM Systems Journal, vol. 42, no. 3, pp. 428–445, 2003.

[17] G. Liu, X. Liu, H. Qin et al., "Automated realization of business workflow specification," in Proceedings of the 1st International Workshop on SOA, Globalization, People, and Work (SG-PAW '09), pp. 8–9, 2009.

[18] H. Hacigumus, B. Iyer, C. Li, and S. Mehrotra, "Executing SQL over encrypted data in the database-service-provider model," in Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 216–227, June 2002.


Page 2: Research Article Modeling and Querying Business Data with ...downloads.hindawi.com/journals/mpe/2015/506272.pdf · monitoring for performance or business concerns, auditing, and compliance

2 Mathematical Problems in Engineering

for normal users to learn. UDM (unified data model) [7] uses the integrated IR-DB approach, which can represent partial sections of a file but cannot support relational data queries. The triple model [8] represents heterogeneous data in triple form, which is a simple and flexible solution but does not support path expression queries, uncertainty, or lineage queries. PDM (probabilistic semantic model) [9] supports top-k query answering, but it is difficult to obtain reliable probability functions. The methods above are based on personal dataspaces. Unfortunately, in today's enterprise-led dataspace scenarios, there is little research on data modeling.

Query ability is the basis of the exploitation of Big Data's value. The query language iQL [6] realizes rule-based query optimization but ignores the evaluation of optimization cost. UDM [7] introduces a new query language that is based on SQL with some extended core operations, called TALZBRA operations. The triple model [8] supports a subject-predicate-object (SPO) query language that can be enhanced by an RDF-based query language. DSSP (dataspace support platforms) supports some useful services on dataspaces, helps to recognize the correlation among the sources of a dataspace, and provides a basic query schema upon these data sources. In enterprise-led dataspaces, business process data is the key element in data modeling, with such characteristics as large volume, strong temporal correlation, and stable lifecycle. These characteristics pose an extreme challenge for current query schemes.

Business data realistically records the whole execution process of a single task, including execution status, resource status, real-time usage, and correlation with other business process instances. Executing a business process generates additional data for a variety of reasons, such as monitoring for performance or business concerns, auditing, and compliance checking. Even business process schemas and enactments can be viewed as data, so that they can be managed, queried, mined for process schemas, and analyzed [10]. An artifact is a widely recognized kind of business process data that represents key business entities. The artifact-centric approach [11] is the representative method in data-centric business process management and has been applied in various client engagements, including financial [12], supply chain, retailer [13], bank, pharmaceutical research [14], and cooperative work [15]. In this paper, we first adopt the artifact as a basic element, analyze its evolution process, and then model business data through the corresponding artifact lifecycle. Second, we devise a safe and quick query strategy that takes the privacy and storage distribution of artifacts into consideration.

The rest of the paper is organized as follows. In Section 2, we introduce the concept of artifact in the field of workflow management and model business data with its lifecycle from the perspective of process. In Section 3, we propose a business data partition method concerning user interest, based on which we further present a cryptograph query for off-site storage data. Then we give a detailed instance to verify the proposed method in Section 4. In the last section, we draw a conclusion.

2. Business Data Modeling

As noted above, artifacts describe the business-relevant data and their lifecycles. The lifecycle is an important property of business data: it describes the whole dynamic process of the data and also contains specific time information. To take advantage of these characteristics, our strategy is to model business data with its lifecycle, which aims at a complete description of dynamic business data. In this section, we introduce artifact-relevant notions and use the artifact-centric process description method to model business data with artifact lifecycle. Furthermore, we adopt a business process logic model to illustrate the lifecycle of business data and then measure the quality of this model.

2.1. Basic Definition

Definition 1. An artifact [16] is an objective data entity which records the business process. An artifact comprises both a unique immutable identity and self-describing mutable content.

Definition 2. An artifact lifecycle captures the end-to-end process of a specific artifact from creation to completion and archiving.

Definition 3. An Artiflow model (artifact logical flow) [17] is a 5-tuple $(N, S, R, C, \mathrm{Ru})$, where $N$ is the name of the model, $S$ is a finite set of services, $R$ is a finite set of repositories, $C$ is a finite set of transport channels, and $\mathrm{Ru}$ is a finite set of business rules.

Definition 4. The states of an artifact are described by the conjunction expression $\bigwedge_{i=1}^{m} \mathit{IsDefine}(A_i)$, where $\mathit{IsDefine}(A_i)$ is a mapping function that assigns a Boolean value in $\{0, 1\}$ to each single attribute $A_i$ in the attribute set $A$, $A_i \in A$, $i \in \{1, \dots, m\}$, and $m$ is the number of attributes in the artifact. If the attribute is defined and has a value, the function returns 1; otherwise it returns 0.

Definition 5. A service is a 5-tuple $(n, V_r, V_w, P, E)$, where $n$ is the name of a certain service, $n \in S$; $V_r$ and $V_w$ are finite sets of artifact classes, where $V_r$ is the set of artifacts that the service is about to read and $V_w$ is the set of artifacts that the service is about to rewrite; $P$ is the description of the artifact states inputted by $V$; and $E$ is the description of the activities on $V$.

Definition 6. A repository is a 4-tuple $R = (\mathrm{re}, R_a, R_r, C_t)$, where re is the name of the repository, $R_a$ and $R_r$ are the sets of stored and read artifacts, respectively, and $C_t$ is the reading condition for $R_r$.

Definition 7. A transport channel is a 2-tuple (Cn, Cs), where Cn is the name of the channel and Cs is a 3-tuple (prior service/repository name, rear service/repository name, channel type), $\mathrm{Cs} \in R \times S \times \{\mathrm{Read}, \mathrm{ReadOnly}\} \cup S \times R \times \{\mathrm{Write}\}$, where $R$ and $S$ are the finite sets of repository elements and service elements in the Artiflow, respectively, and the set of transport channel types is $\{\mathrm{Read}, \mathrm{ReadOnly}, \mathrm{Write}\}$.
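The constraint on transport channels in Definition 7 can be made concrete with a small Python sketch. The class and method names (`Channel`, `Artiflow`, `add_channel`) are hypothetical, business rules are omitted, and the endpoints chosen for the channels "FtoN" and "NtoA" are our reading of their names in the later case study rather than something the model prescribes.

```python
from dataclasses import dataclass, field

READ_TYPES = {"Read", "ReadOnly"}


@dataclass(frozen=True)
class Channel:
    name: str    # Cn
    source: str  # prior service/repository name
    target: str  # rear service/repository name
    kind: str    # channel type


@dataclass
class Artiflow:
    name: str
    services: set = field(default_factory=set)
    repositories: set = field(default_factory=set)
    channels: list = field(default_factory=list)

    def add_channel(self, ch):
        """Accept Cs only if it lies in R x S x {Read, ReadOnly} or S x R x {Write}."""
        reads = (ch.source in self.repositories and ch.target in self.services
                 and ch.kind in READ_TYPES)
        writes = (ch.source in self.services and ch.target in self.repositories
                  and ch.kind == "Write")
        if not (reads or writes):
            raise ValueError(f"channel {ch.name} violates Definition 7")
        self.channels.append(ch)


flow = Artiflow("EPS", services={"FilloutEPA", "Audit1"}, repositories={"NewEPA"})
flow.add_channel(Channel("FtoN", "FilloutEPA", "NewEPA", "Write"))
flow.add_channel(Channel("NtoA", "NewEPA", "Audit1", "Read"))
print(len(flow.channels))  # 2
```

A channel that links two services directly, or that writes from a repository into a service, is rejected by `add_channel`.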

2.2. Data Modeling with Artifact Lifecycle. As suggested in Definitions 2 and 3, Artiflow is a logical model that records the artifact lifecycle, in which elements of repository, service, artifact type, and transport channel are abstracted to represent a realistic business process. Artiflow views a business process as a graph whose nodes are either "service" or "repository." We formalize Artiflow to facilitate data analysis and illustrate it to facilitate process analysis. Figure 1 illustrates a quality inspection process instance of a certain enterprise, where the main artifact is the "monitoring information sheet." The artifact captures the detected product's evolvement from creation to archiving, which includes all the business-relevant data in this process. The whole process comprises detection task registration, task assignment, task inspection, task auditing, and so forth. Note that the artifact "monitoring information sheet" depends on coordination with other artifacts, such as "product standard" and "method & standard," within its lifecycle. When the "monitoring information sheet" completes its lifecycle, it serves as a reference to form a new artifact, the "detection information sheet" (DIS for short).

Figure 1: Lifecycle of business data in "monitoring information."

In this figure, there are nine services ("task assignment," "auditing," etc.), seven repositories ("assignment task library," etc.), and serial transport channels between these repositories and services.

2.3. Model Quality Evaluation. The same business objective can be achieved by implementing different business processes, and each different business process corresponds to a different Artiflow model. We measure an Artiflow based on two factors: (1) the number of services determines the flexibility of the model; (2) the repository services read and update artifacts. In this context, we define the following theorem to measure the quality of artifact models.

Theorem 8. Given an Artiflow $(N, S, R, C, \mathrm{Ru})$, it has $j$ Artifacts, where the number of attributes in any $\mathit{Artifact}_i$ is $n_i$. Suppose $|S_i|$ and $|R_i|$ represent the service amount and repository amount of the corresponding $\mathit{Artifact}_i$, respectively; formula (1) is defined to calculate Artiflow's web service granularity and repository service proportion so as to measure the quality of models:

\[
\pi = \frac{\sum_{i=1}^{j} \rho_i \left( \alpha \dfrac{|S_i|}{n_i} + \beta \left( 1 - \dfrac{|R_i|}{|S_i| + |R_i|} \right) \right)}{\sum_{i=1}^{j} \left( \alpha \dfrac{|S_i|}{n_i} + \beta \left( 1 - \dfrac{|R_i|}{|S_i| + |R_i|} \right) \right)},
\tag{1}
\]

where $\alpha$, $\beta$, and $\rho_i$ are known.

Theorem Proving. For a given Artiflow $(N, S, R, C, \mathrm{Ru})$, each $\mathit{Artifact}_i$ comprises both a service sequence and a repository sequence, marked as $(S_x, R_u, \dots, S_y, R_v)$, where the service sequence is described as $S_i = (S_x, \dots, S_y)$ and the repository sequence is described as $R_i = (R_u, \dots, R_v)$. Each $\mathit{Artifact}_i$ also contains $m$ attributes.

Suppose $|S_i|$ and $|R_i|$ represent the service amount and repository amount, respectively; then $|S_i|/m$ represents the granularity of services when dividing the whole lifecycle of the artifact by its attribute number $m$. A larger value indicates that the lifecycle is divided into more blocks and the granularity is finer, which contributes to building a more flexible model.

In Artiflow, each artifact normally has a following repository to store its intermediate state, but as an exception some services can communicate with each other directly and do not need intermediate repositories. Therefore, for the same Artifact, the fewer the repository elements are, the less the redundancy is. $|R_i|/(|S_i| + |R_i|)$ represents the proportion of repository elements among the service and repository elements within the corresponding artifact lifecycle. The smaller the value is, the better the designed lifecycle is.

The quality of Artifact$_i$ is computed by the following formula:

$$\pi_i = \alpha \frac{|S_i|}{n_i} + \beta \left( 1 - \frac{|R_i|}{|S_i| + |R_i|} \right), \tag{2}$$

where $\alpha$ and $\beta$ are predefined constants used to balance the different magnitudes of the two terms on either side of the plus sign.


Each Artiflow comprises multiple Artifacts, so the quality measurement formula for the whole Artiflow is $\pi = \sum_{i=1}^{j} \rho_i \pi_i / \sum_{i=1}^{j} \pi_i$, where $j$ is the number of artifacts and $\sum_{i=1}^{j} \rho_i = 1$; $\rho_i$ represents the importance of Artifact$_i$. Optimizing key Artifacts has a great impact on the whole model, while optimizing less valuable artifacts contributes little to model efficiency. Note that $\rho_i$ can be either given by the user or obtained by data analysis.

By integrating both repository element redundancy and service element granularity,

$$\pi = \frac{\sum_{i=1}^{j} \rho_i \pi_i}{\sum_{i=1}^{j} \pi_i} = \frac{\sum_{i=1}^{j} \rho_i \left( \alpha \frac{|S_i|}{n_i} + \beta \left( 1 - \frac{|R_i|}{|S_i| + |R_i|} \right) \right)}{\sum_{i=1}^{j} \left( \alpha \frac{|S_i|}{n_i} + \beta \left( 1 - \frac{|R_i|}{|S_i| + |R_i|} \right) \right)} \tag{3}$$

can be deduced and taken to measure the model quality.
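The quality measure can be illustrated with a minimal Python sketch. It is not part of the original method description: the class name `ArtifactStats`, the function names, the default values of `alpha` and `beta`, and the example numbers are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class ArtifactStats:
    services: int      # |S_i|, number of services in the lifecycle
    repositories: int  # |R_i|, number of repositories in the lifecycle
    attributes: int    # n_i, number of attributes of the artifact
    weight: float      # rho_i, importance of the artifact (weights sum to 1)


def artifact_quality(a: ArtifactStats, alpha: float, beta: float) -> float:
    """Formula (2): service granularity plus (1 - repository proportion)."""
    granularity = a.services / a.attributes
    repository_share = a.repositories / (a.services + a.repositories)
    return alpha * granularity + beta * (1.0 - repository_share)


def artiflow_quality(artifacts, alpha: float = 1.0, beta: float = 1.0) -> float:
    """Formula (3): weighted sum of artifact qualities over their plain sum."""
    qualities = [artifact_quality(a, alpha, beta) for a in artifacts]
    weighted = sum(a.weight * q for a, q in zip(artifacts, qualities))
    return weighted / sum(qualities)


# Hypothetical Artiflow with two equally important artifacts.
example = [
    ArtifactStats(services=3, repositories=2, attributes=6, weight=0.5),
    ArtifactStats(services=4, repositories=4, attributes=8, weight=0.5),
]
print(artiflow_quality(example))
```

In this sketch the weights are passed in by the caller, which corresponds to the case where $\rho_i$ is given by the user rather than derived from data analysis.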

3. Business Data Querying

Enterprises such as Google and Amazon provide plenty of cloud services that offer an open storage solution for data, including process data, all over the world. However, off-site storage, even on a public cloud, puts data privacy at risk. In this case, the data need to be encrypted before being stored in the database. It is hard to make a trade-off between data security and query speed, because process data need to be frequently queried, modified, and transmitted. In this section, we study how to partition encrypted artifacts and how to derive a query plan for cryptograph queries that minimizes the execution cost.

3.1. Business Data Partition. To ensure the efficiency of business processes, a good data partition is required. When the Bucket partition method is used, the query result on the cryptograph is actually a superset of the true results generated by the relevant operators; it is then filtered at the client after decryption. A good partition method is therefore of great help: it aims to minimize the extra work as much as possible, for example by minimizing the number of interferential results.

3.1.1. Data Analysis

Definition 9 (Bucket [18]). Mapping the domain of attribute $A$ into a set of partitions $\{p_1, \ldots, p_M\}$, where $p_i \cap p_j = \emptyset$ for $i \neq j$, $1 \le i, j \le M$, each partition $p_i$ is named a Bucket; $M$ is the Bucket number.

Definition 10 (the user interest on artifact). Suppose an Artifact's attribute $A$ is queried $n$ times, $q(a_i)$ represents any single result of the queries $q$ that contains the value $a_i$, and $f(q(a_i))$ is the frequency of $q(a_i)$ occurring in the $n$ trials. As $n$ increases, the frequency stabilizes at a certain value, which is expressed as $p(q(a_i))$. In other words, $p(q(a_i))$ is the probability that the artifact attribute value $a_i$ appears in the query result dataset, called the user interest.

Definition 11 (interferential artifact). An interferential artifact (Intf-Artifact) is an artifact that is not a correct result but belongs to the cryptograph query result $q^*(a_i)$; the set of such artifacts is named $\mathrm{INTFA}(q^*(a_i))$.
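Definitions 10 and 11 can be sketched in a few lines of Python. This is an illustrative reading only: the query log is assumed to be a plain list of requested attribute values, and the function names `user_interest` and `interferential` are hypothetical.

```python
from collections import Counter


def user_interest(query_log):
    """Estimate p(q(a_i)) as the relative frequency of each queried value."""
    counts = Counter(query_log)
    total = len(query_log)
    return {value: count / total for value, count in counts.items()}


def interferential(bucket, wanted):
    """Artifacts returned together with the Bucket that do not match the query."""
    return [value for value in bucket if value != wanted]


# Hypothetical log of four queries on attribute A.
interest = user_interest([10, 10, 20, 30])
print(interest)
# A Bucket holding three artifacts; a query for value 20 also drags in 10 and 30.
print(interferential([10, 20, 30], wanted=20))
```

The relative frequency is only an estimate of the user interest; as the definition notes, it becomes meaningful once the number of logged queries is large.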

3.1.2. Min-Interference Partition. In Bucket-based cryptograph partition, all Artifacts in each Bucket correspond to a given index number. A cryptograph query returns all the encrypted Artifacts in every Bucket where a true result exists. The rest of each such Bucket is transmitted to the user as Intf-Artifacts, which must then be deciphered and filtered out. Hence, the Bucket partition method determines the number of Intf-Artifacts, which in turn affects the query processing cost.

Suppose a cryptograph relation contains $n$ tuples Artifact$_1$, Artifact$_2$, $\ldots$, Artifact$_n$, and $k$ is a large integer; we then pose $k$ random queries. In total, there are $k_i$ queries whose final query result is Artifact$_i$, and another $l_i$ tuples are returned as the provisional result. In this case, the expectation of Intf-Artifacts is $l_i \cdot (k_i / k)$.

There are $n$ tuples in the relation in all, so the expectation of the total number of Intf-Artifacts is

$$l_1 \cdot \frac{k_1}{k} + l_2 \cdot \frac{k_2}{k} + \cdots + l_n \cdot \frac{k_n}{k}. \tag{4}$$

For each Bucket containing $n$ different attribute values, its user interest is $p(q(B)) = p(q(a_1)) + p(q(a_2)) + \cdots + p(q(a_n))$. If the user interest on the $i$th artifact in a given Bucket is $p(q(a_i))/p(q(B))$ $(1 \le i \le n)$, then the number of Intf-Artifacts brought by a query for $a_i$ is $|\mathrm{INTFA}(q^*(a_i))| = f_1 + f_2 + \cdots + f_{i-1} + f_{i+1} + \cdots + f_n$, where $f_i$ is understood as the number of artifacts in the Bucket whose attribute value is $a_i$.

For Bucket $j$ $(1 \le j \le M)$, based on the user interest on artifacts and the number of Intf-Artifacts in each Bucket, we can describe the Bucket Intf-Artifact as follows:

$$\begin{aligned}
\mathrm{WINTFA} &= \sum_{j=1}^{M} p(q(B_j)) \cdot \left| \mathrm{INTFA}(\mathrm{Bucket}_j) \right| \\
&= \sum_{j=1}^{M} \left[ p(q(B_j)) \cdot \sum_{i=1}^{n} \left( \frac{p(q(a_i))}{p(q(B_j))} \left| \mathrm{INTFA}(q^*(a_i)) \right| \right) \right] \\
&= \sum_{j=1}^{M} \left[ p(q(B_j)) \cdot \sum_{i=1}^{n} \left( \frac{p(q(a_i))}{p(q(B_j))} \left( \sum_{l=1}^{n} f^j_l - f^j_i \right) \right) \right] \\
&= \sum_{j=1}^{M} \left[ \sum_{i=1}^{n} p(q(a_i)) \sum_{l=1}^{n} f^j_l - \sum_{i=1}^{n} p(q(a_i)) f^j_i \right] \\
&= \sum_{j=1}^{M} \left[ \sum_{i=1}^{n} p(q(a_i)) F_j - \sum_{i=1}^{n} p(q(a_i)) f^j_i \right],
\end{aligned} \tag{5}$$

where $f^j_i$ is the count for the $i$th attribute value in Bucket $j$ and $F_j = \sum_{l=1}^{n} f^j_l$ is the total count in Bucket $j$.

From here we see that, for a fixed Bucket number, the smaller the value of formula (5) is, the better the index would be. A larger value brings a heavy cost when querying and makes the Bucket partition inefficient. From the probability angle, a Bucket that holds an artifact with higher user interest should contain fewer Artifacts. Therefore, the user interest on each artifact should be viewed as the weight in the whole process. Moreover, when the index is being built, formula (5) is used to determine which Bucket each artifact is stored in, which helps to obtain an optimal partition result.
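The last line of formula (5) can be evaluated directly to compare candidate partitions. The following Python sketch is an illustration under stated assumptions: each Bucket is represented as a dictionary from attribute value to its count $f^j_i$, the interest values are made up, and `weighted_interference` is a hypothetical name.

```python
def weighted_interference(buckets, interest):
    """Evaluate formula (5): sum over Buckets of p(q(a_i)) * (F_j - f_i^j)."""
    total = 0.0
    for bucket in buckets:
        bucket_size = sum(bucket.values())  # F_j
        for value, count in bucket.items():
            total += interest[value] * (bucket_size - count)
    return total


# Hypothetical user interest: value "a" is queried far more often than "b" or "c".
interest = {"a": 0.6, "b": 0.3, "c": 0.1}

# Two candidate partitions into M = 2 Buckets; every value occurs in 2 artifacts.
isolate_hot_value = [{"a": 2}, {"b": 2, "c": 2}]
mix_hot_value = [{"a": 2, "b": 2}, {"c": 2}]

print(weighted_interference(isolate_hot_value, interest))
print(weighted_interference(mix_hot_value, interest))
```

With these made-up numbers, the partition that isolates the frequently queried value scores lower, which matches the observation that a Bucket holding an artifact with higher user interest should contain fewer Artifacts.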

3.2. Business Data Query. The cloud service stores the encrypted artifact information and the corresponding index information, while other information, such as the partitioning of attributes and the mapping functions, is stored at the client. When a user issues a query request, query $q$ should be rewritten to its server-side cryptograph query $q^*$, which is then executed on the cloud. The purpose of rewriting SQL queries is to split the query computation across the client and the cloud.

3.2.1. Basic Definitions

Definition 12. $\xi_{(*,a]}(x)$ is a function which returns the set of all Bucket IDs whose right boundary value $B^*_{j,\mathrm{right}}$ is not greater than $x$ when the Buckets are partitioned once; that is, $\xi_{(*,a]}(x) = \{\mathrm{BID}^*_j \mid B^*_{j,\mathrm{right}} \le x\}$.

Definition 13. $\xi_{[a,*)}(x)$ is a function which returns the set of all Bucket IDs whose left boundary value $B^*_{j,\mathrm{left}}$ is not less than $x$ when the Buckets are partitioned once; that is, $\xi_{[a,*)}(x) = \{\mathrm{BID}^*_j \mid B^*_{j,\mathrm{left}} \ge x\}$.
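A literal reading of Definitions 12 and 13 can be sketched in Python as follows. The `Bucket` tuple, the function names `xi_upto` and `xi_from`, and the sample boundaries are assumptions made for illustration; the sketch follows the two set definitions as written and therefore does not return Buckets that merely straddle $x$.

```python
from typing import NamedTuple


class Bucket(NamedTuple):
    bid: int      # Bucket ID stored with the encrypted tuples
    left: float   # left boundary of the Bucket's value range
    right: float  # right boundary of the Bucket's value range


def xi_upto(buckets, x):
    """Definition 12: IDs of Buckets whose right boundary is not greater than x."""
    return {b.bid for b in buckets if b.right <= x}


def xi_from(buckets, x):
    """Definition 13: IDs of Buckets whose left boundary is not less than x."""
    return {b.bid for b in buckets if b.left >= x}


# Hypothetical first-level partition of one attribute into four Buckets.
buckets = [Bucket(3, 0, 100), Bucket(7, 100, 200), Bucket(5, 200, 300), Bucket(1, 300, 400)]
print(xi_upto(buckets, 200))
print(xi_from(buckets, 200))
```

Because the Bucket IDs are assigned arbitrarily, the returned sets reveal nothing about the order of the underlying values, which is the point of a random (non-order-preserving) mapping.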

Definition 14. $\xi_{(*,p(q(a_i))]}(x)$ is a function which returns the set of all Bucket IDs whose maximum artifact query probability $B^*_{j,p\mathrm{right}}$ is not greater than $x$ when the Buckets are partitioned twice; that is, $\xi_{(*,p(q(a_i))]}(x) = \{\mathrm{BID}^*_j \mid B^*_{j,p\mathrm{right}} \le x\}$.

Definition 15. $\xi_{[p(q(a_i)),*)}(x)$ is a function which returns the set of all Bucket IDs whose minimum artifact query probability $B^*_{j,p\mathrm{left}}$ is not less than $x$ when the Buckets are partitioned twice; that is, $\xi_{[p(q(a_i)),*)}(x) = \{\mathrm{BID}^*_j \mid B^*_{j,p\mathrm{left}} \ge x\}$.

Definition 16. $\delta_{cond}(C)$ is a function that translates specific query conditions to encrypted ones.

Definition 17. The query rewriting function is described as $\delta_{query}(q) \Rightarrow q^*$, where $q$ is the original query and $q^*$ is the cryptograph query.

3.2.2. Query Rewriting Rules. In view of grammatical rules, a query condition cond takes one of the forms $A$ op $v$, $A$ op $A$, $cond_1 \lor cond_2$, and $cond_1 \land cond_2$, where op is a comparison operator such as equal, less than, not greater than, greater than, and not less than. We list the rewrite formulas for the various query conditions in Formulas (6) to (8).

(1) $A$ op $v$:

$$\begin{aligned}
\delta_{cond}(x = e) &\overset{1}{\Longrightarrow} A^* = \xi_e(x), \\
\delta_{cond}(x < e) &\overset{2}{\Longrightarrow} A^* \le \xi_e(x), \\
\delta_{cond}(x < e) &\overset{3}{\Longrightarrow} A^* \in \xi_{(*,e]}(x), \\
\delta_{cond}(x > e) &\overset{4}{\Longrightarrow} A^* \ge \xi_e(x), \\
\delta_{cond}(x > e) &\overset{5}{\Longrightarrow} A^* \in \xi_{[e,*)}(x),
\end{aligned} \tag{6}$$

where both Map 2 and Map 4 are order preserving and both Map 3 and Map 5 are random.

(2) $A$ op $A$:

$$\begin{aligned}
\delta_{cond}(A_i < A_j) &\overset{1}{\Longrightarrow} \bigvee \left( A^*_i = \mathrm{Bid}_{A_i}(p_k) \land A^*_j \ge \xi_{A_i}(p_{\mathrm{left}}) \right), \\
\delta_{cond}(A_i < A_j) &\overset{2}{\Longrightarrow} \bigvee \left( A^*_j = \mathrm{Bid}_{A_j}(p_l) \land A^*_i \ge \xi_{A_j}(p_{\mathrm{right}}) \right), \\
\delta_{cond}(A_i < A_j) &\overset{3}{\Longrightarrow} \bigvee \left( \xi_{A_i}(p_k.\mathrm{left}) \le \xi_{A_j}(p_l.\mathrm{right}) \right), \\
\delta_{cond}(A_i < A_j) &\overset{4}{\Longrightarrow} \bigvee \left( A^*_i = \mathrm{Bid}_{A_i}(p_k) \land A^*_j = \mathrm{Bid}_{A_j}(p_l) \right),
\end{aligned} \tag{7}$$

where $p_k \in \mathrm{partition}(A_i)$, $p_l \in \mathrm{partition}(A_j)$, and $p_l.\mathrm{high} \ge p_k.\mathrm{low}$.

When the condition is $A_i < A_j$, in Map 1 $A_i$ is order preserving, while in Map 3 both $A_i$ and $A_j$ are order preserving. Meanwhile, in Map 2 $A_j$ is order preserving, and in Map 4 both $A_i$ and $A_j$ are random.

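(3) $cond_1 \lor cond_2$ and $cond_1 \land cond_2$:

$$\begin{aligned}
\delta_{cond}(cond_1 \lor cond_2) &\Longrightarrow \delta_{cond}(cond_1) \lor \delta_{cond}(cond_2), \\
\delta_{cond}(cond_1 \land cond_2) &\Longrightarrow \delta_{cond}(cond_1) \land \delta_{cond}(cond_2).
\end{aligned} \tag{8}$$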

For instance, suppose there are two artifact plaintext tables in the cloud database, app (aid, aname, time, content, cid) and check (cid, aid, result), where the range of attribute aid is divided into 6 partitions: $id_{app.aid}([0, 100]) = 3$, $id_{app.aid}((100, 200]) = 7$, $id_{app.aid}((200, 300]) = 5$, $id_{app.aid}((300, 400]) = 1$, $id_{check.aid}([0, 200]) = 2$, and $id_{check.aid}((200, 400]) = 6$.

Given the above partition results, we rewrite the following query conditions based on the above formulas:

$$\begin{aligned}
\delta_{cond}(aid = 256) &\Longrightarrow aid^* = 5, \\
\delta_{cond}(aid < 180) &\Longrightarrow aid^* \in \{3, 7\}, \\
\delta_{cond}(aid > 240) &\Longrightarrow aid^* \in \{5, 1\}.
\end{aligned} \tag{9}$$

$\delta_{cond}(app.aid = check.aid) \Rightarrow (app^*.aid^* = 3 \land check^*.aid^* = 2) \lor (app^*.aid^* = 7 \land check^*.aid^* = 2) \lor (app^*.aid^* = 5 \land check^*.aid^* = 6) \lor (app^*.aid^* = 1 \land check^*.aid^* = 6)$.

$\delta_{cond}(app.aid < check.aid) \Rightarrow (app^*.aid^* = 3 \land check^*.aid^* = 2) \lor (app^*.aid^* = 3 \land check^*.aid^* = 6) \lor (app^*.aid^* = 7 \land check^*.aid^* = 2) \lor (app^*.aid^* = 7 \land check^*.aid^* = 6) \lor (app^*.aid^* = 5 \land check^*.aid^* = 6) \lor (app^*.aid^* = 1 \land check^*.aid^* = 6)$.
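The rewritten conditions in this example can be reproduced with a short Python sketch. It is an overlap-based reading of the random mappings, not the authors' implementation: the function names are hypothetical, the attribute domain is treated as continuous, and each Bucket is stored as (low, high, id) with only the first Bucket of an attribute closed on the left.

```python
# Bucket tables from the example: (low, high, bucket_id).
APP_AID = [(0, 100, 3), (100, 200, 7), (200, 300, 5), (300, 400, 1)]
CHECK_AID = [(0, 200, 2), (200, 400, 6)]


def rewrite_eq(parts, v):
    """aid = v  ->  the single Bucket id whose range contains v."""
    for idx, (lo, hi, bid) in enumerate(parts):
        lower_ok = lo <= v if idx == 0 else lo < v  # first Bucket is [lo, hi]
        if lower_ok and v <= hi:
            return bid
    raise ValueError(f"{v} is outside the partitioned domain")


def rewrite_lt(parts, v):
    """aid < v  ->  ids of Buckets that can hold a value below v."""
    return [bid for lo, hi, bid in parts if lo < v]


def rewrite_gt(parts, v):
    """aid > v  ->  ids of Buckets that can hold a value above v."""
    return [bid for lo, hi, bid in parts if hi > v]


def rewrite_join_eq(left, right):
    """A_i = A_j  ->  pairs of Bucket ids whose value ranges overlap."""
    return [(lb, rb) for llo, lhi, lb in left for rlo, rhi, rb in right
            if max(llo, rlo) < min(lhi, rhi)]


def rewrite_join_lt(left, right):
    """A_i < A_j  ->  pairs where some left value can lie below a right value."""
    return [(lb, rb) for llo, lhi, lb in left for rlo, rhi, rb in right
            if llo < rhi]


print(rewrite_eq(APP_AID, 256), rewrite_lt(APP_AID, 180), rewrite_gt(APP_AID, 240))
print(rewrite_join_eq(APP_AID, CHECK_AID))
print(rewrite_join_lt(APP_AID, CHECK_AID))
```

Each returned pair corresponds to one conjunction in the disjunctions listed above; the server evaluates the disjunction on Bucket ids only, and the client removes the remaining interferential tuples after decryption.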

3.2.3. Query Optimization Principles. Because data is encrypted and stored in various places, in order to reduce the transmission cost and improve the query efficiency, we should run operations on cloud services as much as possible, so that the answers can be computed with little effort by the client.

Figure 2: Initial syntax tree. (Nodes shown: projection $\Pi_{chairman}$; selections $\sigma_{price < 900}$, $\sigma_{begin = \text{"Shanghai"}}$, and $\sigma_{end = \text{"Beijing"}}$; joins on Price.planeid = Plane.planeid and Plane.airway = Airway.airway; tables Price, Plane, and Airway.)

Figure 3: Syntax tree applied to cloud DB. (Same operators as Figure 2, applied after Decrypt nodes placed above the cryptograph tables Price$^*$, Plane$^*$, and Airway$^*$.)

For clarity, the operation procedure is expressed using a syntax tree. The decryption operation splits the tree into cryptograph operations and plaintext operations. Because any single operation on the original tree ends with a selection after decryption, the principle of query optimization using the syntax tree is to iteratively pull up the selections.

For example, given a selection "SELECT chairman FROM Airway, Price, Plane WHERE price < 900 AND begin = "Shanghai" AND end = "Beijing" AND Price.planeid = Plane.planeid AND Plane.airway = Airway.airway", we use a query tree to illustrate how to optimize this query and describe the detailed procedure.

In Figure 2, the SQL statement is converted into an initial syntax tree. If the enterprise uses cloud services or other off-site storage platforms, we first need to decrypt the cryptograph and then query the data at the client, as shown in Figure 3, where the cryptograph database on the cloud is bounded by the dotted line. The query objects (Price, Plane, and Airway) are converted to cryptograph tables (Price$^*$, Plane$^*$, and Airway$^*$) in the cloud database.

Operations on the syntax tree are performed from bottom to top. In Figure 3, the first step is to execute the selections, while the following steps include rewriting the conditions of the selection operations, converting them to selections on the cryptograph in the cloud database, and then decrypting and further filtering the result at the client. A new syntax tree is derived, as shown in Figure 4.

Figure 4: Rewriting syntax tree. (Nodes shown: cryptograph selections $\sigma^*_{\delta_{cond}(price < 900)}$, $\sigma^*_{\delta_{cond}(begin = \text{"Shanghai"})}$, and $\sigma^*_{\delta_{cond}(end = \text{"Beijing"})}$ on Price$^*$, Plane$^*$, and Airway$^*$; Decrypt nodes; plaintext selections $\sigma_{price < 900}$, $\sigma_{begin = \text{"Shanghai"}}$, and $\sigma_{end = \text{"Beijing"}}$; the two joins; projection $\Pi_{chairman}$.)

Figure 5: Moving selections in syntax tree. (Nodes shown: cryptograph selections $\sigma^*_{\delta_{cond}(price < 900)}$ and $\sigma^*_{\delta_{cond}(begin = \text{"Shanghai"} \land end = \text{"Beijing"})}$; Decrypt nodes; the two joins; the combined plaintext selection $\sigma_{price < 900 \land begin = \text{"Shanghai"} \land end = \text{"Beijing"}}$; projection $\Pi_{chairman}$.)

According to the optimization principles described above, we should iteratively pull up selections. Therefore, by exchanging the positions of the selection operations (price < 900, begin = "Shanghai", and end = "Beijing") and the join operation and then combining the corresponding conditions, we obtain a new syntax tree, as shown in Figure 5.

Moreover, based on the operation rewriting rules, the join operation in Figure 5 should be converted into two parts: the join on the cryptograph in the cloud database and the selection on the decrypted provisional results, as shown in Figure 6. We repeat the above steps, rewrite all kinds of operations, and continuously exchange the positions of selection operations and other operations until no selection can be pulled up any further. As a result, we get the final syntax tree, as shown in Figure 7. Operations within the dotted line are executed on the cloud service, whereas the user only needs to execute the last selection. From here we see that the above method takes full advantage of the cloud service to reduce the cost of transmission and postprocessing and to improve the efficiency of artifact querying in business processes.
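The division of work between cloud and client can be sketched in Python for the single condition price < 900. Everything specific here is an assumption for illustration: the price Buckets and their ids are invented, the function names are hypothetical, and `encrypt` uses base64 purely as a stand-in for a real cipher (base64 is an encoding and offers no confidentiality).

```python
import base64
import json

# Hypothetical price Buckets: (low, high, bucket_id), each covering [low, high).
PRICE_BUCKETS = [(0, 500, 11), (500, 1000, 4), (1000, 2000, 9)]


def encrypt(row):
    # Placeholder only: base64 is an encoding, NOT encryption.
    return base64.b64encode(json.dumps(row).encode()).decode()


def decrypt(blob):
    return json.loads(base64.b64decode(blob))


def bucket_of(price):
    for low, high, bid in PRICE_BUCKETS:
        if low <= price < high:
            return bid
    raise ValueError(f"price {price} is outside the partitioned domain")


def store(rows):
    """Cloud-side table: (Bucket id of price, opaque encrypted tuple)."""
    return [(bucket_of(r["price"]), encrypt(r)) for r in rows]


def server_select(table, bucket_ids):
    """Cryptograph selection: runs on the cloud and sees only Bucket ids."""
    return [blob for bid, blob in table if bid in bucket_ids]


def client_query_price_below(table, limit):
    """Rewrite price < limit, fetch the superset, then decrypt and filter locally."""
    bucket_ids = {bid for low, high, bid in PRICE_BUCKETS if low < limit}
    candidates = [decrypt(b) for b in server_select(table, bucket_ids)]
    result = [r for r in candidates if r["price"] < limit]
    return candidates, result


table = store([{"planeid": i, "price": p} for i, p in enumerate([300, 850, 950, 1500])])
candidates, result = client_query_price_below(table, 900)
print(len(candidates) - len(result), "interferential tuple(s) filtered at the client")
```

The final list comprehension plays the role of the last plaintext selection in the syntax tree: it is the only step the client performs besides decryption.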

4. Case Study

In this section, we introduce a business instance of a certain enterprise. Based on the method in Section 2, we complete the data modeling with artifact lifecycle from a given process instance and illustrate the query process through the query tree described in Section 3.

An enterprise's process of equipment purchase/scrap involves the following steps. At first, the equipment division fills out the equipment purchase/scrap application and hands it to department managers and the company's leadership for approval. If the application is approved, it is archived; otherwise, it is withdrawn. The purchasing department makes the purchase according to a copy of the application, and when the purchase is completed the documents are archived. The equipment division scraps the equipment based on specific methods and standards and then archives the processing results. The assets department regularly verifies the company's assets based on the purchase/scrap equipment information. The archive department has permission to query all the archived information.

Figure 6: Rewriting syntax tree. (Nodes shown: cryptograph selections $\sigma^*_{\delta_{cond}(price < 900)}$ and $\sigma^*_{\delta_{cond}(begin = \text{"Shanghai"} \land end = \text{"Beijing"})}$; cryptograph join on $\delta_{cond}(price.planeid = plane.planeid)$ over Price$^*$ and Plane$^*$; Decrypt nodes; plaintext selection $\sigma_{price.planeid = plane.planeid}$; join on Plane.airway = Airway.airway with Airway$^*$; selection $\sigma_{price < 900 \land begin = \text{"Shanghai"} \land end = \text{"Beijing"}}$; projection $\Pi_{chairman}$.)

This process involves multiple departments and multiple sets of information. If we manage the data alone, it is difficult, because business data are complicated and even one attribute has different values in different events. If we manage the process alone, only the department activities are involved, while the business data in the process are ignored. In this context, we analyze the process with respect to both data and process and describe this instance with an Artiflow $(N, S, R, C, Ru)$, where

$N$: "EPS",

$S$: FilloutEPA, Audit1, Audit2, Query, Purchase, AssetVerification,

$R$: NewEPA, PrimaryEPA, FinalEPA, UnapprovedEPA, PO, FAL,

$C$: FtoN, NtoA,

$Ru$: constraint (EPA) = (FilloutEPA, Audit1, Audit2).

The model contains multiple artifacts. "EPA" describes the equipment purchase application; its lifecycle runs from filling and auditing to archiving. Once archived, it supports asset verification and query processing. "ESA" describes the equipment scrap application; its lifecycle also runs from filling and auditing to archiving. It is associated with another artifact called "method & standard". The lifecycle of "PO"/"SL" captures the process from purchase/scrap to application archive. The whole process is shown in Figure 8.

Figure 7: The ultima syntax tree. (Nodes shown: cryptograph selections $\sigma^*_{\delta_{cond}(price < 900)}$ and $\sigma^*_{\delta_{cond}(begin = \text{"Shanghai"} \land end = \text{"Beijing"})}$; cryptograph joins on $\delta_{cond}(price.planeid = plane.planeid)$ and $\delta_{cond}(plane.airway = airway.airway)$ over Price$^*$, Plane$^*$, and Airway$^*$; a single Decrypt node; the final plaintext selection on price.planeid = plane.planeid, plane.airway = airway.airway, price < 900, begin = "Shanghai", and end = "Beijing"; projection $\Pi_{chairman}$.)

Artifact Example

Artifact $(C, A, \tau, Q, s, F)$, where

$C$ = "EPA",

$A$: EquipmentName, PurchaseAmount, UnitPrice, ApplicationDate, Applicant, AuditingComment, AuditingDate,

$Q$: empty table (initial state $s$), basic information filling, delivery auditing, auditing completion, audited application archiving (terminate state $F$),

$\tau$: EquipmentName: Varchar, PurchaseAmount: Int, UnitPrice: Int, ApplicationDate: Date, Applicant: Varchar, AuditingComment: Varchar, AuditingDate: Date.

Service Example

service = $(n, V_r, V_w, P, E)$, where

$n$ = "Audit2",

$V_r$: EPA,

$V_w$: EPA,

$P$: DEFINED(EquipmentName) $\land$ DEFINED(PurchaseAmount) $\land$ DEFINED(UnitPrice) $\land$ DEFINED(ApplicationDate) $\land$ DEFINED(Applicant) $\land$ $\lnot$DEFINED(AuditingComment) $\land$ $\lnot$DEFINED(AuditingDate),

$E$: DEFINED(EquipmentName) $\land$ DEFINED(PurchaseAmount) $\land$ DEFINED(UnitPrice) $\land$ DEFINED(ApplicationDate) $\land$ DEFINED(Applicant) $\land$ DEFINED(AuditingComment) $\land$ DEFINED(AuditingDate).
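The precondition $P$ and effect $E$ of the "Audit2" service can be checked mechanically. The Python sketch below is an assumption-laden illustration: an EPA artifact is modeled as a dictionary, an attribute counts as DEFINED when its value is not `None`, and the function names and sample values are hypothetical.

```python
BASIC = ["EquipmentName", "PurchaseAmount", "UnitPrice", "ApplicationDate", "Applicant"]
AUDIT = ["AuditingComment", "AuditingDate"]


def defined(artifact, attribute):
    """DEFINED(x): the attribute has been assigned a value."""
    return artifact.get(attribute) is not None


def audit2_precondition(epa):
    """P: basic information is filled in and no auditing data exists yet."""
    return all(defined(epa, a) for a in BASIC) and not any(defined(epa, a) for a in AUDIT)


def audit2_effect(epa):
    """E: every attribute, including the auditing ones, is defined."""
    return all(defined(epa, a) for a in BASIC + AUDIT)


def run_audit2(epa, comment, date):
    """Apply the service only if P holds, then check that E holds."""
    if not audit2_precondition(epa):
        raise ValueError("precondition P of Audit2 is not satisfied")
    updated = dict(epa, AuditingComment=comment, AuditingDate=date)
    assert audit2_effect(updated)
    return updated


# Hypothetical application that has been filled out but not yet audited.
epa = {"EquipmentName": "depth sounder", "PurchaseAmount": 2, "UnitPrice": 900,
       "ApplicationDate": "2014-08-28", "Applicant": "equipment division"}
audited = run_audit2(epa, comment="approved", date="2014-09-11")
print(audit2_effect(audited))
```

Running the service a second time on the audited artifact raises an error, because the precondition requires the auditing attributes to be undefined.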

Repository Example

$R = (re, R_a, R_r, C_t)$, where

$re$: "FinalEPA",

$R_a$: EPA,

$R_r$: EPA,

$C_t$: IsDefine(AuditingComment).

Figure 8: Lifecycle of business data in equipment purchase/scrap process. (Diagram of services Fill out EPA, Fill out ESA, Audit1, Audit2, Purchase, Scrap, Query, and Asset verification, and repositories NewEPA, NewESA, PrimaryEPA, PrimaryESA, UnapprovedEPA, FinalEPA, FinalESA, PO, SL, FAL, and Method and standard, connected by channels carrying the artifacts EPA, ESA, PO, SL, MS, and FAL; messages M1 purchase request, M2 query, M3 submit 1, M4 submit 2, M5 scrap request, and M6 perform; events E1 to E6.)

Figure 9: Initial query on EPS. (Nodes shown: cryptograph tables FinalEPA$^*$, FAL$^*$, and PO$^*$ with Decrypt nodes; selections $\sigma_{FinalEPA.audit1 = \text{"Doctor Li"}}$ and $\sigma_{FAL.name = \text{"depth sounder"}}$; joins on FAL.POid = PO.Id and PO.EPAid = FinalEPA.Id; projection $\Pi_{app\_name}$.)

There is a repository named "FinalEPA", which reads and stores artifact "EPA" only if "AuditingComment" has been assigned.

Given a query "SELECT app_name FROM FAL, PO, FinalEPA WHERE FAL.name = 'depth sounder' AND FAL.POid = PO.Id AND PO.EPAid = FinalEPA.Id AND FinalEPA.Audit2 = 'Doctor Li'", it can be converted into a syntax tree, as shown in Figure 9, which can be further converted into a new syntax tree, shown in Figure 10. Queries will be issued on this syntax tree.

Figure 10: Ultima query on EPS. (Nodes shown: cryptograph selections $\sigma^*_{\delta_{cond}(FAL.name = \text{"depth sounder"})}$ and $\sigma^*_{\delta_{cond}(FinalEPA.audit2 = \text{"Doctor Li"})}$; cryptograph joins on $\delta_{cond}(FAL.POid = PO.Id)$ and $\delta_{cond}(PO.EPAid = FinalEPA.Id)$ over FinalEPA$^*$, FAL$^*$, and PO$^*$; a single Decrypt node; the final plaintext selection on FAL.name = "depth sounder", FAL.POid = PO.Id, PO.EPAid = FinalEPA.Id, and FinalEPA.audit2 = "Doctor Li"; projection $\Pi_{app\_name}$.)

5. Conclusion

There is no doubt that more and more large datasets will be produced during business process execution; meanwhile, these business data are extremely valuable. In this case, we modeled business data through its lifecycle from the perspective of process, which ensures the integrity of dynamic business data. Furthermore, we presented the notion of user interest on business data, which helps to minimize the number of interferential tuples during data partition and ensures a lower cost of the postprocessing brought by data partition. Considering that current business data are mostly stored on the cloud, we proposed a query rewriting strategy for off-site encrypted data, which reduces the postprocessing cost. Currently, there is little research on business data modeling and querying in the true sense. Our research lays a foundation for the application of business data in enterprises. It is the initial step toward a business data management architecture, and we will further research business data analysis with its lifecycle to fully exploit the value of business data.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (61272098) and the Science and Technology Development Foundation of Shanghai Ocean University.

References

[1] W. Huang, Z. Chen, W. Dong, H. Li, B. Cao, and J. Cao, "Mobile internet big data platform in China Unicom," Tsinghua Science and Technology, vol. 19, no. 1, pp. 95–101, 2014.

[2] S. Sagiroglu and D. Sinanc, "Big data: a review," in Proceedings of the International Conference on Collaboration Technologies and Systems (CTS '13), pp. 42–47, San Diego, Calif, USA, May 2013.

[3] M. Chui, B. Brown, J. Bughin et al., Big Data: The Next Frontier for Innovation, Competition, and Productivity, McKinsey Global Institute, 2011.

[4] M. Singh and S. K. Jain, "A survey on dataspace," in Advances in Network Security and Applications, vol. 196 of Communications in Computer and Information Science, pp. 608–621, Springer, 2011.

[5] K. Belhajjame, N. W. Paton, S. M. Embury, A. A. A. Fernandes, and C. Hedeler, "Incrementally improving dataspaces based on user feedback," Information Systems, vol. 38, no. 5, pp. 656–687, 2013.

[6] J.-P. Dittrich and M. A. Vaz Salles, "IDM: a unified and versatile data model for Personal Dataspace Management," in Proceedings of the 32nd International Conference on Very Large Data Bases (VLDB '06), pp. 367–378, September 2006.

[7] S. Pradhan, "Towards a novel desktop search technique," in Database and Expert Systems Applications, pp. 192–201, Springer, Berlin, Germany, 2007.

[8] M. Zhong, M. Liu, and Q. Chen, "Modeling heterogeneous data in dataspace," in Proceedings of the IEEE International Conference on Information Reuse and Integration (IEEE IRI '08), pp. 404–409, Las Vegas, Nev, USA, July 2008.

[9] A. Sarma, X. Dong, and A. Halevy, "Data modeling in dataspace support platforms," in Conceptual Modeling: Foundations and Applications, pp. 122–138, Springer, Berlin, Germany, 2009.

[10] R. Hull and J. Su, Report on NSF Workshop on Data-Centric Workflows, 2012, http://dcw2009.cs.ucsb.edu/report.pdf.

[11] R. Hull, J. Su, and R. Vaculin, "Data management perspectives on business process management," in Proceedings of the ACM SIGMOD Conference on Management of Data (SIGMOD '13), pp. 943–947, June 2013.

[12] K. Bhattacharya, N. S. Caswell, S. Kumaran, A. Nigam, and F. Y. Wu, "Artifact-centered operational modeling: lessons from customer engagements," IBM Systems Journal, vol. 46, no. 4, pp. 703–721, 2007.

[13] K. Bhattacharya, R. Guttman, K. Lyman et al., "A model-driven approach to industrializing discovery processes in pharmaceutical research," IBM Systems Journal, vol. 44, no. 1, pp. 145–162, 2005.

[14] R. Vaculín, R. Hull, T. Heath, C. Cochran, A. Nigam, and P. Sukaviriya, "Declarative business artifact centric modeling of decision and knowledge intensive business processes," in Proceedings of the 15th IEEE International EDOC Enterprise Computing Conference (EDOC '11), pp. 151–160, September 2011.

[15] R. Vaculín, R. Hull, M. Vukovic, T. Heath, N. Mills, and Y. Sun, "Supporting collaborative decision processes," in Proceedings of the IEEE 10th International Conference on Services Computing (SCC '13), pp. 651–658, July 2013.

[16] A. Nigam and N. S. Caswell, "Business artifacts: an approach to operational specification," IBM Systems Journal, vol. 42, no. 3, pp. 428–445, 2003.

[17] G. Liu, X. Liu, H. Qin et al., "Automated realization of business workflow specification," in Proceedings of the 1st International Workshop on SOA, Globalization, People, and Work (SG-PAW '09), pp. 8–9, 2009.

[18] H. Hacigumus, B. Iyer, C. Li, and S. Mehrotra, "Executing SQL over encrypted data in the database-service-provider model," in Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 216–227, June 2002.


Figure 1: Lifecycle of business data in "monitoring information". (Diagram of services such as detection task registration, task assignment, Lab 1, Lab 2, auditing, report editing, method and standard management, product standard management, and query and management, and repositories such as the task library, assignment task library, detection result library, and audited information library, connected by channels carrying the DIS artifact; messages M1 request for detection, M2 update method and standard, M3 alter product standard, M4 submit auditing, M5 edit report, and a query detection result message; events E1 to E5.)

the artifact lifecycle, in which the elements of repository, service, artifact type, and transport channel are abstracted to represent a realistic business process. Artiflow views a business process as a graph whose nodes are either "services" or "repositories". We formalize Artiflow to facilitate data analysis and illustrate it to facilitate process analysis. Figure 1 illustrates a quality inspection process instance of a certain enterprise, where the main artifact is the "monitoring information sheet". The artifact captures the detected product's evolution from creation to archiving, which includes all the business-relevant data in this process. The whole process comprises detection task registration, task assignment, task inspection, task auditing, and so forth. Note that, within its lifecycle, the artifact "monitoring information sheet" is inseparable from its coordination with other artifacts such as "product standard" and "method & standard". When the "monitoring information sheet" completes its lifecycle, it serves as a reference for forming a new artifact, the "detection information sheet" (DIS for short).

In this figure, there are nine services ("task assignment", "auditing", etc.), seven repositories ("assignment task library", etc.), and a series of transport channels between these repositories and services.

2.3. Model Quality Evaluation. The same business objective can be achieved by implementing different business processes, and each business process corresponds to a different Artiflow model. We measure an Artiflow model based on two factors: (1) the number of services, which determines the flexibility of the model, and (2) the repositories through which services read and update artifacts. In this context, we define the following theorem to measure the quality of artifact models.

Theorem 8. Given an Artiflow $(N, S, R, C, Ru)$ with $j$ Artifacts, where the number of attributes in any $Artifact_i$ is $n_i$, suppose $|S_i|$ and $|R_i|$ represent the service amount and repository amount of the corresponding $Artifact_i$, respectively. Formula (1) is defined to calculate the Artiflow's web service granularity and repository service proportion so as to measure the quality of models:

$$\pi = \frac{\sum_{i=1}^{j} \rho_i \left( \alpha \frac{|S_i|}{n_i} + \beta \left( 1 - \frac{|R_i|}{|S_i| + |R_i|} \right) \right)}{\sum_{i=1}^{j} \left( \alpha \frac{|S_i|}{n_i} + \beta \left( 1 - \frac{|R_i|}{|S_i| + |R_i|} \right) \right)}, \tag{1}$$

where $\alpha$, $\beta$, and $\rho_i$ are known.

Theorem Proving. For a given Artiflow $(N, S, R, C, Ru)$, each $Artifact_i$ comprises both a service sequence and a repository sequence, marked as $(S_x, R_u, \ldots, S_y, R_v)$, where the service sequence is described as $S_i = (S_x, \ldots, S_y)$ and the repository sequence is described as $R_i = (R_u, \ldots, R_v)$. Each $Artifact_i$ also contains $m$ attributes ($m = n_i$ in the notation of Theorem 8).

Suppose $|S_i|$ and $|R_i|$ represent the service amount and repository amount, respectively; then $|S_i|/m$ represents the granularity of services when the whole lifecycle of the artifact is divided by its attribute number $m$. A larger value indicates that the lifecycle is divided into more blocks and that the granularity is finer, which contributes to building a more flexible model.

In Artiflow, each artifact normally has a subsequent repository to store its intermediate state; the exception is that some services can communicate with each other directly and need no intermediate repository. Therefore, for the same Artifact, the fewer the repository elements, the less the redundancy. $|R_i|/(|S_i| + |R_i|)$ represents the proportion of repository elements among all service and repository elements within the corresponding artifact lifecycle. The smaller this value, the better the designed lifecycle.

The quality of $Artifact_i$ is computed by the following formula:

$$\pi_i = \alpha \frac{|S_i|}{n_i} + \beta \left( 1 - \frac{|R_i|}{|S_i| + |R_i|} \right), \tag{2}$$

where $\alpha$ and $\beta$ are predefined constants used to balance the different magnitudes of the two terms on either side of the plus sign.


Each Artiflow comprises multiple Artifacts, so the quality measurement formula for the whole Artiflow is $\pi = \sum_{i=1}^{j} \rho_i \pi_i / \sum_{i=1}^{j} \pi_i$, where $j$ is the number of artifacts and $\sum_{i=1}^{j} \rho_i = 1$. $\rho_i$ represents the importance of $Artifact_i$. Optimizing key Artifacts has a great impact on the whole model, while optimizing less valuable artifacts contributes little to model efficiency. Note that $\rho_i$ can be either given by the user or obtained by data analysis.

By integrating both repository element redundancy and service element granularity,

$$\pi = \frac{\sum_{i=1}^{j} \rho_i \pi_i}{\sum_{i=1}^{j} \pi_i} = \frac{\sum_{i=1}^{j} \rho_i \left( \alpha \frac{|S_i|}{n_i} + \beta \left( 1 - \frac{|R_i|}{|S_i| + |R_i|} \right) \right)}{\sum_{i=1}^{j} \left( \alpha \frac{|S_i|}{n_i} + \beta \left( 1 - \frac{|R_i|}{|S_i| + |R_i|} \right) \right)} \tag{3}$$

can be deduced and taken to measure the model quality.
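As a rough illustration of how formulas (2) and (3) might be evaluated, the following Python sketch computes the per-artifact quality and the weighted Artiflow quality. The class name `ArtifactStats`, the function names, and the default values of `alpha` and `beta` are assumptions made for this example; the paper only states that these constants are predefined.

```python
from dataclasses import dataclass


@dataclass
class ArtifactStats:
    """Hypothetical container for the quantities used in formula (2)."""
    services: int      # |S_i|, number of services in the lifecycle
    repositories: int  # |R_i|, number of repositories in the lifecycle
    attributes: int    # n_i, number of attributes of the artifact
    weight: float      # rho_i, importance of the artifact (weights sum to 1)


def artifact_quality(a, alpha, beta):
    """Formula (2): service granularity plus (1 - repository proportion)."""
    granularity = a.services / a.attributes
    repo_share = a.repositories / (a.services + a.repositories)
    return alpha * granularity + beta * (1 - repo_share)


def artiflow_quality(artifacts, alpha=1.0, beta=1.0):
    """Formula (3): weighted sum of qualities divided by their plain sum."""
    qualities = [artifact_quality(a, alpha, beta) for a in artifacts]
    weighted = sum(a.weight * q for a, q in zip(artifacts, qualities))
    return weighted / sum(qualities)
```

Because the denominator is the unweighted sum of the $\pi_i$, the result shifts toward larger values when the more important artifacts (larger $\rho_i$) are also the better designed ones.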

3. Business Data Querying

Enterprises such as Google and Amazon provide many cloud services that offer an open storage solution for data, including process data, all over the world. However, off-site storage, even in a public cloud, is unsafe with respect to data privacy. In this case, the data need to be encrypted before being stored in the database. It is nevertheless hard to trade off data security against query speed, because process data need to be frequently queried, modified, and transmitted. In this section, we study how to partition encrypted artifacts and how to derive a query plan for cryptograph queries that minimizes the execution cost.

3.1. Business Data Partition. To ensure the efficiency of the business process, a good data partition is required. With the Bucket partition method, the query result on the cryptograph is actually a superset of the true results generated by the relevant operators, and it is filtered at the client after decryption. A good partition method therefore aims to minimize the work done, for example by minimizing the number of interferential results.

3.1.1. Data Analysis

Definition 9 (Bucket [18]). The domain of attribute $A$ is mapped into a set of partitions $\{p_1, \ldots, p_M\}$, where $p_i \cap p_j = \emptyset$ for $1 \le i, j \le M$ and $i \ne j$. Each partition $p_i$ is called a Bucket, and $M$ is the Bucket number.

Definition 10 (the user interest on artifact). Suppose the Artifact's attribute $A$ is queried $n$ times, and let $q(a_i)$ denote any single query result that contains value $a_i$. Let $f(q(a_i))$ be the frequency with which $q(a_i)$ occurs in the $n$ trials. As $n$ increases, the frequency stabilizes at a certain value, which is expressed as $p(q(a_i))$. In other words, $p(q(a_i))$ is the probability that the artifact attribute value $a_i$ appears in the query result dataset; it is called the user interest.
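Definition 10 can be approximated from a query log by taking relative frequencies. The sketch below assumes, purely for illustration, that the log is a flat list with one queried attribute value per entry; the function name `estimate_user_interest` is hypothetical.

```python
from collections import Counter


def estimate_user_interest(query_log):
    """Estimate p(q(a_i)) as the relative frequency of each value in the log."""
    n = len(query_log)
    if n == 0:
        return {}
    counts = Counter(query_log)
    return {value: count / n for value, count in counts.items()}
```

With a longer log the estimates settle toward the stable value that the definition calls the user interest.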

Definition 11 (interferential artifact). An interferential artifact (Intf-Artifact) is an artifact that belongs to the cryptograph query result $q^*(a_i)$ but is not a correct result. The set of such artifacts is denoted $\mathrm{INTFA}(q^*(a_i))$.

3.1.2. Min-Interference Partition. In Bucket-based cryptograph partitioning, all Artifacts in a Bucket correspond to a given index number. A cryptograph query returns all the encrypted Artifacts in every Bucket that contains a true result. The remaining Artifacts in such a Bucket are transmitted to the user as Intf-Artifacts and must then be deciphered and filtered out. Hence, the Bucket partition method determines the number of Intf-Artifacts, which in turn affects the query processing cost.
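The interplay between the server-side superset and client-side filtering can be sketched as follows. This is a minimal illustration under stated assumptions: the bucket boundaries and IDs are hypothetical, values are assumed to lie in [0, 400], and the "payload" is kept as plaintext where a real system would hold ciphertext.

```python
import bisect

# Hypothetical layout: bucket i covers values up to UPPER_BOUNDS[i] (inclusive).
UPPER_BOUNDS = [100, 200, 300, 400]
BUCKET_IDS = [3, 7, 5, 1]  # opaque index numbers, deliberately unordered


def bucket_id(value):
    """Client side: map a plaintext value in [0, 400] to its Bucket ID."""
    return BUCKET_IDS[bisect.bisect_left(UPPER_BOUNDS, value)]


def build_store(values):
    """Stand-in for the encrypted table: each row keeps its Bucket ID as index.
    The payload would be ciphertext in practice; it is plaintext here."""
    return [{"bucket": bucket_id(v), "payload": v} for v in values]


def server_select(store, target_bucket):
    """Server side: return every row of the requested Bucket (a superset)."""
    return [row for row in store if row["bucket"] == target_bucket]


def client_filter(rows, value):
    """Client side: keep the true results and count the Intf-Artifacts."""
    hits = [row["payload"] for row in rows if row["payload"] == value]
    return hits, len(rows) - len(hits)
```

The second return value of `client_filter` is the number of Intf-Artifacts that were transmitted and decrypted only to be discarded, which is the cost the partition method tries to keep small.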

Suppose a cryptograph relation contains $n$ tuples $Artifact_1, Artifact_2, \ldots, Artifact_n$, and $k$ is a large integer; we then pose $k$ random queries. In total, there are $k_i$ queries whose final query result is $Artifact_i$, and for each of them $l_i$ other tuples are returned as part of the provisional result. In this case, the expectation of Intf-Artifacts is $l_i \cdot (k_i/k)$.

The relation contains $n$ tuples in all, so the expectation of the total number of Intf-Artifacts is

$$l_1 \cdot \frac{k_1}{k} + l_2 \cdot \frac{k_2}{k} + \cdots + l_n \cdot \frac{k_n}{k}. \tag{4}$$

For each Bucket $B$ containing $n$ different attribute values, its user interest is $p(q(B)) = p(q(a_1)) + p(q(a_2)) + \cdots + p(q(a_n))$. If the user interest on the $i$th artifact in a given Bucket is $p(q(a_i))/p(q(B))$ $(1 \le i \le n)$, then the number of Intf-Artifacts brought by the above query is $|\mathrm{INTFA}(q^*(a_i))| = f_1 + f_2 + \cdots + f_{i-1} + f_{i+1} + \cdots + f_n$.

For Bucket $j$ $(1 \le j \le M)$, based on the user interest on artifacts and the number of Intf-Artifacts in each Bucket, we can describe the Bucket Intf-Artifact as follows:

$$\begin{aligned}
W_{\mathrm{INTFA}} &= \sum_{j=1}^{M} p(q(B_j)) \cdot \left| \mathrm{INTFA}(\mathrm{Bucket}_j) \right| \\
&= \sum_{j=1}^{M} \left[ p(q(B_j)) \cdot \sum_{i=1}^{n} \left( \frac{p(q(a_i))}{p(q(B_j))} \left| \mathrm{INTFA}(q^*(a_i)) \right| \right) \right] \\
&= \sum_{j=1}^{M} \left[ p(q(B_j)) \cdot \sum_{i=1}^{n} \left( \frac{p(q(a_i))}{p(q(B_j))} \left( \sum_{l=1}^{n} f_l^j - f_i^j \right) \right) \right] \\
&= \sum_{j=1}^{M} \left[ \sum_{i=1}^{n} p(q(a_i)) \sum_{l=1}^{n} f_l^j - \sum_{i=1}^{n} p(q(a_i)) f_i^j \right] \\
&= \sum_{j=1}^{M} \left[ \sum_{i=1}^{n} p(q(a_i)) F^j - \sum_{i=1}^{n} p(q(a_i)) f_i^j \right],
\end{aligned} \tag{5}$$

where $F^j = \sum_{l=1}^{n} f_l^j$ is the total over Bucket $j$.

From this we see that, for a fixed Bucket number, the smaller the value of formula (5), the better the index. A larger value brings a heavy cost when querying and renders the Bucket partition inefficient. From the probability angle, a Bucket that contains an artifact with higher user interest should contain fewer Artifacts. Therefore, the user interest on each artifact should be viewed as its weight in the whole process. Moreover, when the index is being built, formula (5) is used to determine which Bucket each artifact is stored in, which helps to obtain an optimal partition result.
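The last line of formula (5) can be evaluated directly for a candidate partition, which makes it easy to compare two layouts. The sketch below assumes, since the text does not define it explicitly, that $f_i^j$ is the number of stored artifacts carrying value $a_i$ in Bucket $j$; the function name and the sample numbers are hypothetical.

```python
def weighted_interference(buckets):
    """Evaluate the last line of formula (5).

    `buckets` is a list of Buckets; each Bucket is a list of (p, f) pairs,
    where p is the user interest p(q(a_i)) and f is the assumed count f_i^j.
    """
    total = 0.0
    for bucket in buckets:
        bucket_total = sum(f for _, f in bucket)            # F^j
        total += sum(p * (bucket_total - f) for p, f in bucket)
    return total


# Two hypothetical ways of placing four values into two Buckets.
grouped = [[(0.6, 10), (0.2, 10)], [(0.1, 10), (0.1, 10)]]
isolated = [[(0.6, 10)], [(0.2, 10), (0.1, 10), (0.1, 10)]]
```

In this toy comparison, `isolated` yields the smaller value because the value with the highest user interest sits alone in its Bucket, which matches the observation that a Bucket holding a high-interest artifact should contain fewer Artifacts.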

3.2. Business Data Query. The cloud service stores the encrypted artifact information and the corresponding index information, while other information, such as the partitioning of attributes, the mapping functions, and so forth, is stored at the client. When a user issues a query request, query $q$ is rewritten to its server-side cryptograph query $q^*$, which is then executed on the cloud. The purpose of rewriting SQL queries is to split the query computation across the client and the cloud.

3.2.1. Basic Definitions

Definition 12. $\xi_{(*,a]}(x)$ is a function that returns the set of all Bucket IDs whose right boundary value $B^{*}_{j,\mathrm{right}}$ is not greater than $x$ when Buckets are partitioned once; that is, $\xi_{(*,a]}(x) = \{\mathrm{BID}^{*}_{j} \mid B^{*}_{j,\mathrm{right}} \le x\}$.

Definition 13. $\xi_{[a,*)}(x)$ is a function that returns the set of all Bucket IDs whose left boundary value $B^{*}_{j,\mathrm{left}}$ is not less than $x$ when Buckets are partitioned once; that is, $\xi_{[a,*)}(x) = \{\mathrm{BID}^{*}_{j} \mid B^{*}_{j,\mathrm{left}} \ge x\}$.

Definition 14. $\xi_{(*,p(q(a_i))]}(x)$ is a function that returns the set of all Bucket IDs whose maximum artifact query probability $B^{*}_{j,p\mathrm{right}}$ is not greater than $x$ when Buckets are partitioned twice; that is, $\xi_{(*,p(q(a_i))]}(x) = \{\mathrm{BID}^{*}_{j} \mid B^{*}_{j,p\mathrm{right}} \le x\}$.

Definition 15. $\xi_{[p(q(a_i)),*)}(x)$ is a function that returns the set of all Bucket IDs whose minimum artifact query probability $B^{*}_{j,p\mathrm{left}}$ is not less than $x$ when Buckets are partitioned twice; that is, $\xi_{[p(q(a_i)),*)}(x) = \{\mathrm{BID}^{*}_{j} \mid B^{*}_{j,p\mathrm{left}} \ge x\}$.

Definition 16. $\delta_{\mathrm{cond}}(C)$ is a function that translates specific query conditions into encrypted ones.

Definition 17. The query rewriting function is described as $\delta_{\mathrm{query}}(q) \Rightarrow q^*$, where $q$ is the original query and $q^*$ is the cryptograph query.

3.2.2. Query Rewriting Rules. In view of the grammatical rules, a query condition cond takes one of the forms $A\,\theta\,v$, $A\,\theta\,A$, $cond_1 \lor cond_2$, and $cond_1 \land cond_2$, where $\theta$ is a comparison operator such as equal, less than, not greater than, greater than, or not less than. We list the rewrite formulas for the various query conditions in formulas (6) to (8).

(1) $A\,\theta\,v$:

$$\begin{aligned}
\delta_{\mathrm{cond}}(x = e) &\overset{1}{\Longrightarrow} A^* = \xi_e(x), \\
\delta_{\mathrm{cond}}(x < e) &\overset{2}{\Longrightarrow} A^* \le \xi_e(x), \\
\delta_{\mathrm{cond}}(x < e) &\overset{3}{\Longrightarrow} A^* \in \xi_{(*,e]}(x), \\
\delta_{\mathrm{cond}}(x > e) &\overset{4}{\Longrightarrow} A^* \ge \xi_e(x), \\
\delta_{\mathrm{cond}}(x > e) &\overset{5}{\Longrightarrow} A^* \in \xi_{[e,*)}(x),
\end{aligned} \tag{6}$$

where both Map 2 and Map 4 are order preserving and both Map 3 and Map 5 are random.

(2) $A\,\theta\,A$:

$$\begin{aligned}
\delta_{\mathrm{cond}}(A_i < A_j) &\overset{1}{\Longrightarrow} \bigvee \left( A_i^* = \mathrm{Bid}_{A_i}(p_k) \wedge A_j^* \ge \xi_{A_i}(p_{\mathrm{left}}) \right), \\
\delta_{\mathrm{cond}}(A_i < A_j) &\overset{2}{\Longrightarrow} \bigvee \left( A_j^* = \mathrm{Bid}_{A_j}(p_l) \wedge A_i^* \ge \xi_{A_j}(p_{\mathrm{right}}) \right), \\
\delta_{\mathrm{cond}}(A_i < A_j) &\overset{3}{\Longrightarrow} \bigvee \left( \xi_{A_i}(p_{k.\mathrm{left}}) \le \xi_{A_j}(p_{l.\mathrm{right}}) \right), \\
\delta_{\mathrm{cond}}(A_i < A_j) &\overset{4}{\Longrightarrow} \bigvee \left( A_i^* = \mathrm{Bid}_{A_i}(p_k) \wedge A_j^* = \mathrm{Bid}_{A_j}(p_l) \right),
\end{aligned} \tag{7}$$

where $p_k \in \mathrm{partition}(A_i)$, $p_l \in \mathrm{partition}(A_j)$, and $p_{l.\mathrm{high}} \ge p_{k.\mathrm{low}}$.

When the condition is $A_i < A_j$, in Map 1 $A_i$ is order preserving, while in Map 3 both $A_i$ and $A_j$ are order preserving. Meanwhile, in Map 2 $A_j$ is order preserving, and in Map 4 both $A_i$ and $A_j$ are random.

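(3) $cond_1 \lor cond_2$ and $cond_1 \land cond_2$:

$$\begin{aligned}
\delta_{\mathrm{cond}}(cond_1 \lor cond_2) &\Longrightarrow \delta_{\mathrm{cond}}(cond_1) \lor \delta_{\mathrm{cond}}(cond_2), \\
\delta_{\mathrm{cond}}(cond_1 \land cond_2) &\Longrightarrow \delta_{\mathrm{cond}}(cond_1) \land \delta_{\mathrm{cond}}(cond_2).
\end{aligned} \tag{8}$$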

For instance, suppose there are two artifact plaintext tables in the cloud database, app (aid, aname, time, content, cid) and check (cid, aid, result), where the range of attribute aid is divided into 6 partitions: $id_{app.aid}([0, 100]) = 3$, $id_{app.aid}((100, 200]) = 7$, $id_{app.aid}((200, 300]) = 5$, $id_{app.aid}((300, 400]) = 1$, $id_{check.aid}([0, 200]) = 2$, and $id_{check.aid}((200, 400]) = 6$.

Given the above partition results, we rewrite the following query conditions based on the above formulas:

$$\begin{aligned}
\delta_{\mathrm{cond}}(aid = 256) &\Longrightarrow aid^* = 5, \\
\delta_{\mathrm{cond}}(aid < 180) &\Longrightarrow aid^* \in \{3, 7\}, \\
\delta_{\mathrm{cond}}(aid > 240) &\Longrightarrow aid^* \in \{5, 1\}.
\end{aligned} \tag{9}$$

$\delta_{\mathrm{cond}}(app.aid = check.aid) \Rightarrow (app^*.aid^* = 3 \wedge check^*.aid^* = 2) \vee (app^*.aid^* = 7 \wedge check^*.aid^* = 2) \vee (app^*.aid^* = 5 \wedge check^*.aid^* = 6) \vee (app^*.aid^* = 1 \wedge check^*.aid^* = 6)$.

$\delta_{\mathrm{cond}}(app.aid < check.aid) \Rightarrow (app^*.aid^* = 3 \wedge check^*.aid^* = 2) \vee (app^*.aid^* = 3 \wedge check^*.aid^* = 6) \vee (app^*.aid^* = 7 \wedge check^*.aid^* = 2) \vee (app^*.aid^* = 7 \wedge check^*.aid^* = 6) \vee (app^*.aid^* = 5 \wedge check^*.aid^* = 6) \vee (app^*.aid^* = 1 \wedge check^*.aid^* = 6)$.
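The rewrites in this example can be reproduced with a short Python sketch. It is only an illustration under stated assumptions: the function names are hypothetical, the random (non-order-preserving) mappings are used, a Bucket is selected whenever its range can overlap the queried range, and the less-than join uses a strict comparison of range boundaries, which is what reproduces the pairs listed above.

```python
# Partitions from the example as (low, high, bucket_id); each covers (low, high],
# and the first partition of an attribute also includes its low end.
APP_AID = [(0, 100, 3), (100, 200, 7), (200, 300, 5), (300, 400, 1)]
CHECK_AID = [(0, 200, 2), (200, 400, 6)]


def rewrite_eq(parts, value):
    """aid = value -> the single Bucket ID whose range contains value."""
    for i, (low, high, bid) in enumerate(parts):
        if low < value <= high or (i == 0 and value == low):
            return {bid}
    return set()


def rewrite_lt(parts, value):
    """aid < value -> Bucket IDs that can hold some stored value below it."""
    return {bid for low, _high, bid in parts if low < value}


def rewrite_gt(parts, value):
    """aid > value -> Bucket IDs that can hold some stored value above it."""
    return {bid for _low, high, bid in parts if high > value}


def rewrite_join_eq(left, right):
    """A_i = A_j -> pairs of Bucket IDs whose ranges overlap."""
    return {(lb, rb) for ll, lh, lb in left for rl, rh, rb in right
            if max(ll, rl) < min(lh, rh)}


def rewrite_join_lt(left, right):
    """A_i < A_j -> pairs where the left range starts below the right range's end."""
    return {(lb, rb) for ll, _lh, lb in left for _rl, rh, rb in right
            if ll < rh}
```

Each returned set corresponds to one disjunction in the rewritten condition; the server evaluates it on Bucket IDs only, and the client removes the Intf-Artifacts after decryption.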

3.2.3. Query Optimization Principles. Because data are encrypted and stored in various places, in order to reduce the transmission cost and improve the query efficiency we should run as many operations as possible on the cloud services, so that the answers can be computed with little effort by the client.

Figure 2: Initial syntax tree.

Figure 3: Syntax tree applied to cloud DB.

For clarity, the operation procedure is expressed as a syntax tree. The decryption operation splits the tree into cryptograph operations and plaintext operations. Because every operation on the original tree ends with a selection after decryption, the principle of query optimization using the syntax tree is to iteratively pull up the selections.

For example, given the query "SELECT chairman FROM Airway, Price, Plane WHERE price < 900 AND begin = 'shanghai' AND end = 'beijing' AND Price.planeid = Plane.planeid AND Plane.airway = Airway.airway", we use the query tree to illustrate how to optimize this query and describe the detailed procedure.

In Figure 2, the SQL statement is converted into an initial syntax tree. If the enterprise uses cloud services or other off-site storage platforms, we first need to decrypt the cryptograph and then query the data at the client, as shown in Figure 3, where the cryptograph database on the cloud is bounded by the dotted line. The query objects (Price, Plane, and Airway) are converted to the cryptograph tables (Price*, Plane*, and Airway*) in the cloud database.

Operations on the syntax tree are performed from the bottom up. In Figure 3, the first step is to execute the selections; the following steps rewrite the condition of each selection operation, convert it to a selection on the cryptograph in the cloud database, and then decrypt and further filter the result at the client. A new syntax tree is derived, as shown in Figure 4.

Figure 4: Rewriting syntax tree.

Figure 5: Moving selections in syntax tree.

According to the optimization principles described above, we should iteratively pull up selections. Therefore, by exchanging the positions of the selection operations (price < 900, begin = 'shanghai', and end = 'beijing') and the join operation and then combining the corresponding conditions, we obtain a new syntax tree, as shown in Figure 5.

Moreover, based on the operation rewriting rules, the join operation in Figure 5 should be converted into two parts: the join on the cryptograph in the cloud database and the selection on the decrypted provisional results, as shown in Figure 6. We repeat the above steps, rewriting all kinds of operations and continuously exchanging the positions of selection operations and other operations until no selection can be pulled up any further. As a result, we get the ultimate syntax tree, as shown in Figure 7. Operations within the dotted line are executed on the cloud service, whereas the user only needs to execute the last selection. From this we see that the above method takes full advantage of the cloud service to reduce the cost of transmission and postprocessing and to improve the efficiency of artifact querying in business processes.

4. Case Study

In this section, we introduce a business instance of a certain enterprise. Based on the method in Section 2, we complete the data modeling with artifact lifecycle from a given process instance and illustrate the query process through the query tree described in Section 3.

An enterprise's equipment purchase/scrap process involves the following steps. First, the equipment division fills out the equipment purchase/scrap application and hands it to the department managers and the company's leadership for approval. If the application is approved, it is archived; otherwise it is withdrawn. The purchasing department makes the purchase according to a copy of the application, and when the purchase is completed the documents are archived. The equipment division scraps the equipment based on specific methods and standards and then archives the processing results. The assets department regularly verifies the company's assets based on the purchase/scrap equipment information. The archive department has permission to query all the archived information.

Figure 6: Rewriting syntax tree.

This process involves multiple departments and multiple sets of information. If we manage the data alone, management is difficult because business data are complicated and even a single attribute may take different values in different events. If we manage the process alone, only the department activities are involved, while the business data in the process are ignored. In this context, we analyze the instance with respect to both data and process and describe it with an Artiflow $(N, S, R, C, Ru)$, where

$N$: "EPS"

$S$: FilloutEPA, Audit1, Audit2, Query, Purchase, AssetVerification

$R$: NewEPA, PrimaryEPA, FinalEPA, UnapprovedEPA, PO, FAL

$C$: FtoN, NtoA

$Ru$: constraint(EPA) = (FilloutEPA, Audit1, Audit2)

The model contains multiple artifacts. "EPA" describes the equipment purchase application; its lifecycle runs from filling and auditing to archiving. Once archived, it supports asset verification and query processing. "ESA" describes the equipment scrap application; its lifecycle likewise runs from filling and auditing to archiving, and it is associated with another artifact called "method & standard". The lifecycle of "PO"/"SL" captures the process from purchase/scrap to application archive. The whole process is shown in Figure 8.

Figure 7: The ultimate syntax tree.

Artifact Example

Artifact $(C, A, \tau, Q, s, F)$:

$C$ = "EPA"

$A$: EquipmentName, PurchaseAmount, UnitPrice, ApplicationDate, Applicant, AuditingComment, AuditingDate

$Q$: empty table (initial state $s$), basic information filling, delivery auditing, auditing completion, audited application archiving (terminate state $F$)

$\tau$: EquipmentName: Varchar, PurchaseAmount: Int, UnitPrice: Int, ApplicationDate: Date, Applicant: Varchar, AuditingComment: Varchar, AuditingDate: Date

Service Example

service = $(n, V_r, V_w, P, E)$, where

$n$ = "Audit2"

$V_r$: EPA

$V_w$: EPA

$P$: DEFINED(EquipmentName) ∧ DEFINED(PurchaseAmount) ∧ DEFINED(UnitPrice) ∧ DEFINED(ApplicationDate) ∧ DEFINED(Applicant) ∧ ¬DEFINED(AuditingComment) ∧ ¬DEFINED(AuditingDate)

$E$: DEFINED(EquipmentName) ∧ DEFINED(PurchaseAmount) ∧ DEFINED(UnitPrice) ∧ DEFINED(ApplicationDate) ∧ DEFINED(Applicant) ∧ DEFINED(AuditingComment) ∧ DEFINED(AuditingDate)

Repository Example

$R = (re, R_a, R_r, C_t)$:

$re$: "FinalEPA"

$R_a$: EPA

$R_r$: EPA

$C_t$: IsDefine(AuditingComment)

Figure 8: Lifecycle of business data in equipment purchase/scrap process.

Figure 9: Initial query on EPS.

There is a repository named "FinalEPA", which reads and stores the artifact "EPA" only if "AuditingComment" has been assigned.
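The service and repository examples above can be read as predicates over an artifact instance. The following sketch is one possible encoding, assuming an artifact is represented as a Python dictionary and that an attribute counts as DEFINED when it is present and not `None`; the function names are hypothetical and not part of the Artiflow formalism.

```python
PRE_FILLED = ["EquipmentName", "PurchaseAmount", "UnitPrice",
              "ApplicationDate", "Applicant"]
AUDIT_FIELDS = ["AuditingComment", "AuditingDate"]


def defined(artifact, attribute):
    """DEFINED(x): the attribute has been assigned a value."""
    return artifact.get(attribute) is not None


def audit2_precondition(epa):
    """P of service Audit2: basic fields filled, audit fields still empty."""
    return (all(defined(epa, a) for a in PRE_FILLED)
            and not any(defined(epa, a) for a in AUDIT_FIELDS))


def audit2_effect(epa):
    """E of service Audit2: every attribute of the artifact is assigned."""
    return all(defined(epa, a) for a in PRE_FILLED + AUDIT_FIELDS)


def final_epa_accepts(epa):
    """C_t of repository FinalEPA: AuditingComment must be assigned."""
    return defined(epa, "AuditingComment")
```

Under this reading, an EPA instance that satisfies the precondition of Audit2 is not yet accepted by FinalEPA, and becomes acceptable once the service has produced its effect.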

Given the query "SELECT app_name FROM FAL, PO, FinalEPA WHERE FAL.name = 'depth sounder' AND FAL.POid = PO.Id AND PO.EPAid = FinalEPA.Id AND FinalEPA.Audit2 = 'Doctor Li'", it can be converted into a syntax tree, as shown in Figure 9, which can be further converted into a new syntax tree, shown in Figure 10. Queries are issued on this syntax tree.

Figure 10: Ultimate query on EPS.

5. Conclusion

There is no doubt that ever larger datasets will be produced during business process execution, and these business data are extremely valuable. We therefore modeled business data through its lifecycle from the perspective of the process, which ensures the integrity of dynamic business data. Furthermore, we presented the notion of user interest on business data, which helps to minimize the number of interferential tuples during data partitioning and to lower the postprocessing cost that data partitioning incurs. Considering that business data are currently mostly stored in the cloud, we proposed a query rewriting strategy for off-site encrypted data, which reduces the postprocessing cost. Currently, there is little research on business data modeling and querying in the true sense. Our research lays a foundation for the application of business data in enterprises. It is the initial step of a business data management architecture, and we will further study business data analysis with its lifecycle to fully exploit the value of business data.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (61272098) and the Science and Technology Development Foundation of Shanghai Ocean University.

References

[1] W. Huang, Z. Chen, W. Dong, H. Li, B. Cao, and J. Cao, "Mobile internet big data platform in China Unicom," Tsinghua Science and Technology, vol. 19, no. 1, pp. 95–101, 2014.

[2] S. Sagiroglu and D. Sinanc, "Big data: a review," in Proceedings of the International Conference on Collaboration Technologies and Systems (CTS '13), pp. 42–47, San Diego, Calif, USA, May 2013.

[3] M. Chui, B. Brown, J. Bughin et al., Big Data: The Next Frontier for Innovation, Competition, and Productivity, McKinsey Global Institute, 2011.

[4] M. Singh and S. K. Jain, "A survey on dataspace," in Advances in Network Security and Applications, vol. 196 of Communications in Computer and Information Science, pp. 608–621, Springer, 2011.

[5] K. Belhajjame, N. W. Paton, S. M. Embury, A. A. A. Fernandes, and C. Hedeler, "Incrementally improving dataspaces based on user feedback," Information Systems, vol. 38, no. 5, pp. 656–687, 2013.

[6] J.-P. Dittrich and M. A. Vaz Salles, "IDM: a unified and versatile data model for Personal Dataspace Management," in Proceedings of the 32nd International Conference on Very Large Data Bases (VLDB '06), pp. 367–378, September 2006.

[7] S. Pradhan, "Towards a novel desktop search technique," in Database and Expert Systems Applications, pp. 192–201, Springer, Berlin, Germany, 2007.

[8] M. Zhong, M. Liu, and Q. Chen, "Modeling heterogeneous data in dataspace," in Proceedings of the IEEE International Conference on Information Reuse and Integration (IEEE IRI '08), pp. 404–409, Las Vegas, Nev, USA, July 2008.

[9] A. Sarma, X. Dong, and A. Halevy, "Data modeling in dataspace support platforms," in Conceptual Modeling: Foundations and Applications, pp. 122–138, Springer, Berlin, Germany, 2009.

[10] R. Hull and J. Su, Report on NSF Workshop on Data-Centric Workflows, 2012, http://dcw2009.cs.ucsb.edu/report.pdf.

[11] R. Hull, J. Su, and R. Vaculin, "Data management perspectives on business process management," in Proceedings of the ACM SIGMOD Conference on Management of Data (SIGMOD '13), pp. 943–947, June 2013.

[12] K. Bhattacharya, N. S. Caswell, S. Kumaran, A. Nigam, and F. Y. Wu, "Artifact-centered operational modeling: lessons from customer engagements," IBM Systems Journal, vol. 46, no. 4, pp. 703–721, 2007.

[13] K. Bhattacharya, R. Guttman, K. Lyman et al., "A model-driven approach to industrializing discovery processes in pharmaceutical research," IBM Systems Journal, vol. 44, no. 1, pp. 145–162, 2005.

[14] R. Vaculín, R. Hull, T. Heath, C. Cochran, A. Nigam, and P. Sukaviriya, "Declarative business artifact centric modeling of decision and knowledge intensive business processes," in Proceedings of the 15th IEEE International EDOC Enterprise Computing Conference (EDOC '11), pp. 151–160, September 2011.

[15] R. Vaculín, R. Hull, M. Vukovic, T. Heath, N. Mills, and Y. Sun, "Supporting collaborative decision processes," in Proceedings of the IEEE 10th International Conference on Services Computing (SCC '13), pp. 651–658, July 2013.

[16] A. Nigam and N. S. Caswell, "Business artifacts: an approach to operational specification," IBM Systems Journal, vol. 42, no. 3, pp. 428–445, 2003.

[17] G. Liu, X. Liu, H. Qin et al., "Automated realization of business workflow specification," in Proceedings of the 1st International Workshop on SOA, Globalization, People, and Work (SG-PAW '09), pp. 8–9, 2009.

[18] H. Hacigumus, B. Iyer, C. Li, and S. Mehrotra, "Executing SQL over encrypted data in the database-service-provider model," in Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 216–227, June 2002.



Each Artiflow comprises multiple Artifacts, so the quality measure for the whole Artiflow is $\pi = \sum_{i=1}^{j} \rho_i \pi_i / \sum_{i=1}^{j} \pi_i$, where $j$ is the number of artifacts, $\sum_{i=1}^{j} \rho_i = 1$, and $\rho_i$ represents the importance of Artifact $i$. Optimizing key Artifacts has a great impact on the whole model, whereas optimizing less valuable artifacts contributes little to model efficiency. Note that $\rho_i$ can be either given by the user or obtained by data analysis.

By integrating both repository element redundancy and service element granularity,

$$\pi = \frac{\sum_{i=1}^{j} \rho_i \pi_i}{\sum_{i=1}^{j} \pi_i} = \frac{\sum_{i=1}^{j} \rho_i \left( \alpha \frac{|S_i|}{n_i} + \beta \left( 1 - \frac{|R_i|}{|S_i| + |R_i|} \right) \right)}{\sum_{i=1}^{j} \left( \alpha \frac{|S_i|}{n_i} + \beta \left( 1 - \frac{|R_i|}{|S_i| + |R_i|} \right) \right)} \tag{3}$$

can be deduced and taken to measure the model quality.
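As a rough illustration of formula (3), the following Python sketch evaluates the Artiflow quality from per-artifact inputs. The names `ArtifactStats`, `artifact_quality`, and `artiflow_quality`, as well as the default weights, are assumptions made for this illustration; $|S_i|$, $|R_i|$, and $n_i$ are simply taken as given numbers.

```python
from dataclasses import dataclass

@dataclass
class ArtifactStats:
    rho: float      # importance weight rho_i (the weights should sum to 1)
    services: int   # |S_i|
    repos: int      # |R_i|
    n: int          # n_i, taken as a given input

def artifact_quality(a, alpha, beta):
    # pi_i = alpha * |S_i| / n_i + beta * (1 - |R_i| / (|S_i| + |R_i|))
    return alpha * (a.services / a.n) + beta * (1 - a.repos / (a.services + a.repos))

def artiflow_quality(artifacts, alpha=0.5, beta=0.5):
    # Formula (3): weighted sum of pi_i divided by the plain sum of pi_i.
    qualities = [artifact_quality(a, alpha, beta) for a in artifacts]
    weighted = sum(a.rho * q for a, q in zip(artifacts, qualities))
    return weighted / sum(qualities)
```

With two identical artifacts that each carry weight 0.5, the measure evaluates to 0.5 regardless of $\alpha$ and $\beta$, which is a quick sanity check on the implementation.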

3. Business Data Querying

Enterprises such as Google and Amazon provide many cloud services that offer an open storage solution for data, including process data, all over the world. However, off-site storage, even on a public cloud, is unsafe with respect to data privacy. Such data therefore need to be encrypted before they are stored in the database. It is hard to trade off data security against query speed, because process data are frequently queried, modified, and transmitted. In this section we study how to partition encrypted artifacts and how to devise a query plan for cryptograph queries that minimizes the execution cost.

3.1. Business Data Partition. To ensure the efficiency of the business process, a good data partition is required. With the Bucket partition method, the result of a query on the cryptograph is actually a superset of the true results; it is generated by the relevant operators and then filtered at the client after decryption. A good partition method therefore aims to minimize this extra work, for example, by minimizing the number of interferential results.

3.1.1. Data Analysis

Definition 9 (Bucket [18]). The domain of attribute $A$ is mapped into a set of partitions $\{p_1, \ldots, p_M\}$, where $p_i \cap p_j = \emptyset$ for $1 \le i, j \le M$ and $i \ne j$. Each partition $p_i$ is called a Bucket, and $M$ is the number of Buckets.

Definition 10 (user interest on an artifact). Suppose attribute $A$ of an Artifact is queried $n$ times, and let $q(a_i)$ denote any single result of a query $q$ that contains the value $a_i$. Let $f(q(a_i))$ be the frequency with which $q(a_i)$ occurs in the $n$ trials. As $n$ increases, the frequency stabilizes at a certain value, which is denoted $p(q(a_i))$. In other words, $p(q(a_i))$ is the probability that the artifact attribute value $a_i$ appears in a query result dataset; it is called the user interest.
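Definition 10 can be approximated empirically from a query log. The sketch below is only an illustration under that assumption: the function name `user_interest` and the representation of the log as a list of result sets are hypothetical and do not come from the paper.

```python
from collections import Counter

def user_interest(query_results):
    """Estimate p(q(a_i)) as the share of logged queries whose result contains a_i.

    query_results: one set of attribute values per logged query.
    """
    n = len(query_results)
    counts = Counter(value for result in query_results for value in set(result))
    return {value: count / n for value, count in counts.items()}
```

For a short log the estimate is noisy; as the definition states, the frequency is only expected to stabilize as the number of queries grows.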

Definition 11 (interferential artifact). An interferential artifact (Intf-Artifact) is an artifact that belongs to the cryptograph query result $q^*(a_i)$ but is not a correct result. The set of such artifacts is denoted $\mathrm{INTFA}(q^*(a_i))$.

3.1.2. Min-Interference Partition. In a Bucket-based cryptograph partition, all Artifacts in a Bucket correspond to a given index number. A cryptograph query returns all the encrypted Artifacts in every Bucket that contains a true result. The remaining Artifacts in the Bucket are transmitted to the user as Intf-Artifacts and must then be deciphered and filtered out. Hence, the Bucket partition method determines the number of Intf-Artifacts, which in turn affects the query processing cost.

Suppose a cryptograph relation contains $n$ tuples $\mathrm{Artifact}_1, \mathrm{Artifact}_2, \ldots, \mathrm{Artifact}_n$, and let $k$ be a large integer; we pose $k$ random queries. In total there are $k_i$ queries whose final query result is $\mathrm{Artifact}_i$, and for each of them $l_i$ other tuples are returned as part of the provisional result. In this case the expected number of Intf-Artifacts is $l_i * (k_i / k)$.

Since there are $n$ tuples in the relation, the expected total number of Intf-Artifacts is

$$l_1 * \left( \frac{k_1}{k} \right) + l_2 * \left( \frac{k_2}{k} \right) + \cdots + l_n * \left( \frac{k_n}{k} \right). \tag{4}$$
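Formula (4) can be evaluated directly once a Bucket assignment and the query counts are known. In the sketch below, the function name and the input layout (Buckets as lists of tuple indices, `query_counts[i]` playing the role of $k_i$) are assumptions for illustration, and each tuple is assumed to be the sole true result of its queries, so that $l_i$ is the Bucket size minus one.

```python
def expected_intf_artifacts(buckets, query_counts):
    """Formula (4): sum over all tuples of l_i * (k_i / k)."""
    k = sum(query_counts)
    expected = 0.0
    for bucket in buckets:
        for i in bucket:
            l_i = len(bucket) - 1  # other tuples returned together with Artifact_i
            expected += l_i * query_counts[i] / k
    return expected
```

For example, with Buckets `[[0, 1], [2]]` and query counts `[5, 3, 2]`, only queries for the first two tuples drag one extra tuple along, so the expectation is $1 \cdot 5/10 + 1 \cdot 3/10 + 0 \cdot 2/10$.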

For a Bucket $B$ containing $n$ different attribute values, its user interest is $p(q(B)) = p(q(a_1)) + p(q(a_2)) + \cdots + p(q(a_n))$. If the user interest on the $i$th artifact in a given Bucket is $p(q(a_i)) / p(q(B))$ ($1 \le i \le n$), then the number of Intf-Artifacts brought by a query for $a_i$ is $|\mathrm{INTFA}(q^*(a_i))| = f_1 + f_2 + \cdots + f_{i-1} + f_{i+1} + \cdots + f_n$, where $f_i$ is the number of Artifacts in the Bucket whose attribute value is $a_i$.

For Bucket $j$ ($1 \le j \le M$), based on the user interest on each artifact and the number of Intf-Artifacts in each Bucket, we can describe the Bucket Intf-Artifact as follows:

$$\begin{aligned}
\mathrm{WINTFA} &= \sum_{j=1}^{M} \left[ p(q(B_j)) \right] * \left| \mathrm{INTFA}(\mathrm{Bucket}_j) \right| \\
&= \sum_{j=1}^{M} \left[ p(q(B_j)) * \sum_{i=1}^{n} \left( \frac{p(q(a_i))}{p(q(B_j))} \left| \mathrm{INTFA}(q^*(a_i)) \right| \right) \right] \\
&= \sum_{j=1}^{M} \left[ p(q(B_j)) * \sum_{i=1}^{n} \left( \frac{p(q(a_i))}{p(q(B_j))} \left( \sum_{t=1}^{n} f_t^j - f_i^j \right) \right) \right] \\
&= \sum_{j=1}^{M} \left[ \sum_{i=1}^{n} p(q(a_i)) \sum_{t=1}^{n} f_t^j - \sum_{i=1}^{n} p(q(a_i)) f_i^j \right] \\
&= \sum_{j=1}^{M} \left[ \sum_{i=1}^{n} p(q(a_i)) F^j - \sum_{i=1}^{n} p(q(a_i)) f_i^j \right],
\end{aligned} \tag{5}$$

where $F^j = \sum_{t=1}^{n} f_t^j$ is the total number of Artifacts in Bucket $j$.

From formula (5) we see that, for a fixed number of Buckets, the smaller the value of formula (5), the better the index. A larger value brings a heavier cost when querying and means a less efficient Bucket partition. From the probability angle, a Bucket that holds an artifact with higher user interest should contain fewer Artifacts. Therefore, the user interest on each artifact should be viewed as a weight in the whole process. Moreover, when the index is being built, formula (5) is used to determine which Bucket each artifact is stored in, which helps to obtain an optimal partition result.
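To make the role of formula (5) concrete, the sketch below scores a candidate partition and searches exhaustively for the cheapest one. It is only an illustration under stated assumptions: attribute values are taken to be sorted, Buckets are taken to be contiguous runs of values, the brute-force search is practical only for small inputs, and the names `wintfa` and `best_partition` are hypothetical. The paper does not specify the search procedure.

```python
from itertools import combinations

def wintfa(buckets, p, f):
    """Formula (5): sum over Buckets of sum_i p_i * (F_j - f_i)."""
    total = 0.0
    for bucket in buckets:
        bucket_size = sum(f[i] for i in bucket)  # F^j
        total += sum(p[i] * (bucket_size - f[i]) for i in bucket)
    return total

def best_partition(p, f, m):
    """Try every split of the sorted values into m contiguous Buckets."""
    n = len(p)
    best = None
    for cuts in combinations(range(1, n), m - 1):
        bounds = (0, *cuts, n)
        buckets = [list(range(bounds[t], bounds[t + 1])) for t in range(m)]
        cost = wintfa(buckets, p, f)
        if best is None or cost < best[0]:
            best = (cost, buckets)
    return best
```

With interests `p = [0.7, 0.1, 0.1, 0.1]`, one Artifact per value, and two Buckets, the search isolates the high-interest value in its own Bucket, which matches the observation that a Bucket holding a frequently queried artifact should contain fewer Artifacts.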

3.2. Business Data Query. The cloud service stores the encrypted artifact information and the corresponding index information, while other information, such as the partitioning of attributes and the mapping functions, is stored at the client. When a user issues a query request, query $q$ is rewritten into its server-side cryptograph query $q^*$, which is then executed on the cloud. The purpose of rewriting SQL queries is to split the query computation across the client and the cloud.

3.2.1. Basic Definitions

Definition 12. $\xi_{(*,a]}(x)$ is a function that returns the set of all Bucket IDs whose right boundary value $B^*_{j,\mathrm{right}}$ is not greater than $x$ when the Bucket is partitioned once; that is, $\xi_{(*,a]}(x) = \{\mathrm{BID}^*_j \mid B^*_{j,\mathrm{right}} \le x\}$.

Definition 13. $\xi_{[a,*)}(x)$ is a function that returns the set of all Bucket IDs whose left boundary value $B^*_{j,\mathrm{left}}$ is not less than $x$ when the Bucket is partitioned once; that is, $\xi_{[a,*)}(x) = \{\mathrm{BID}^*_j \mid B^*_{j,\mathrm{left}} \ge x\}$.

Definition 14. $\xi_{(*,p(q(a_i))]}(x)$ is a function that returns the set of all Bucket IDs whose maximum artifact query probability $B^*_{j,p\text{-}\mathrm{right}}$ is not greater than $x$ when the Bucket is partitioned twice; that is, $\xi_{(*,p(q(a_i))]}(x) = \{\mathrm{BID}^*_j \mid B^*_{j,p\text{-}\mathrm{right}} \le x\}$.

Definition 15. $\xi_{[p(q(a_i)),*)}(x)$ is a function that returns the set of all Bucket IDs whose minimum artifact query probability $B^*_{j,p\text{-}\mathrm{left}}$ is not less than $x$ when the Bucket is partitioned twice; that is, $\xi_{[p(q(a_i)),*)}(x) = \{\mathrm{BID}^*_j \mid B^*_{j,p\text{-}\mathrm{left}} \ge x\}$.

Definition 16. $\delta_{\mathrm{cond}}(C)$ is a function that translates a specific query condition $C$ into its encrypted counterpart.

Definition 17. The query rewriting function is described as $\delta_{\mathrm{query}}(q) \Rightarrow q^*$, where $q$ is the original query and $q^*$ is the cryptograph query.
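Definitions 12 and 13 can be read as simple filters over the Bucket metadata kept at the client. The following sketch illustrates that reading; the `Bucket` record and the function names `xi_upto` and `xi_from` are assumptions introduced here, not notation from the paper.

```python
from typing import NamedTuple

class Bucket(NamedTuple):
    bid: int      # Bucket ID stored alongside the encrypted artifacts
    left: float   # left boundary value
    right: float  # right boundary value

def xi_upto(buckets, x):
    """Definition 12: IDs of Buckets whose right boundary is not greater than x."""
    return {b.bid for b in buckets if b.right <= x}

def xi_from(buckets, x):
    """Definition 13: IDs of Buckets whose left boundary is not less than x."""
    return {b.bid for b in buckets if b.left >= x}
```

Definitions 14 and 15 follow the same pattern, with the boundary values replaced by the maximum and minimum artifact query probabilities of each Bucket.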

3.2.2. Query Rewriting Rules. In view of the grammatical rules, a query condition cond takes one of the forms $A\,\theta\,v$, $A\,\theta\,A$, $\mathrm{cond}_1 \vee \mathrm{cond}_2$, and $\mathrm{cond}_1 \wedge \mathrm{cond}_2$, where "$\theta$" is a comparison operator such as equal, less than, not greater than, greater than, and not less than. We list the rewrite formulas for the various query conditions in Formulas (6) to (8).

(1) $A\,\theta\,v$:

$$\begin{aligned}
\delta_{\mathrm{cond}}(x = e) &\stackrel{1}{\Longrightarrow} A^* = \xi_e(x), \\
\delta_{\mathrm{cond}}(x < e) &\stackrel{2}{\Longrightarrow} A^* \le \xi_e(x), \\
\delta_{\mathrm{cond}}(x < e) &\stackrel{3}{\Longrightarrow} A^* \in \xi_{(*,e]}(x), \\
\delta_{\mathrm{cond}}(x > e) &\stackrel{4}{\Longrightarrow} A^* \ge \xi_e(x), \\
\delta_{\mathrm{cond}}(x > e) &\stackrel{5}{\Longrightarrow} A^* \in \xi_{[e,*)}(x),
\end{aligned} \tag{6}$$

where both Map 2 and Map 4 are order preserving and both Map 3 and Map 5 are random.

(2) $A\,\theta\,A$:

$$\begin{aligned}
\delta_{\mathrm{cond}}(A_i < A_j) &\stackrel{1}{\Longrightarrow} \bigvee \left( A_i^* = \mathrm{Bid}_{A_i}(p_k) \wedge A_j^* \ge \xi_{A_i}(p_{\mathrm{left}}) \right), \\
\delta_{\mathrm{cond}}(A_i < A_j) &\stackrel{2}{\Longrightarrow} \bigvee \left( A_j^* = \mathrm{Bid}_{A_j}(p_l) \wedge A_i^* \ge \xi_{A_j}(p_{\mathrm{right}}) \right), \\
\delta_{\mathrm{cond}}(A_i < A_j) &\stackrel{3}{\Longrightarrow} \bigvee \left( \xi_{A_i}(p_{k,\mathrm{left}}) \le \xi_{A_j}(p_{l,\mathrm{right}}) \right), \\
\delta_{\mathrm{cond}}(A_i < A_j) &\stackrel{4}{\Longrightarrow} \bigvee \left( A_i^* = \mathrm{Bid}_{A_i}(p_k) \wedge A_j^* = \mathrm{Bid}_{A_j}(p_l) \right),
\end{aligned} \tag{7}$$

where $p_k \in \mathrm{partition}(A_i)$, $p_l \in \mathrm{partition}(A_j)$, and $p_{l,\mathrm{high}} \ge p_{k,\mathrm{low}}$.

When the condition is $A_i < A_j$, in Map 1 $A_i$ is order preserving, while in Map 3 both $A_i$ and $A_j$ are order preserving. Meanwhile, in Map 2 $A_j$ is order preserving, and in Map 4 both $A_i$ and $A_j$ are random.

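(3) $\mathrm{cond}_1 \vee \mathrm{cond}_2$ and $\mathrm{cond}_1 \wedge \mathrm{cond}_2$:

$$\begin{aligned}
\delta_{\mathrm{cond}}(\mathrm{cond}_1 \vee \mathrm{cond}_2) &\Longrightarrow \delta_{\mathrm{cond}}(\mathrm{cond}_1) \vee \delta_{\mathrm{cond}}(\mathrm{cond}_2), \\
\delta_{\mathrm{cond}}(\mathrm{cond}_1 \wedge \mathrm{cond}_2) &\Longrightarrow \delta_{\mathrm{cond}}(\mathrm{cond}_1) \wedge \delta_{\mathrm{cond}}(\mathrm{cond}_2).
\end{aligned} \tag{8}$$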

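For instance, suppose there are two artifact plaintext tables in the cloud database, app (aid, aname, time, content, cid) and check (cid, aid, result), where the range of attribute aid is divided into 6 partitions: $\mathrm{id}_{\mathrm{app.aid}}([0, 100]) = 3$, $\mathrm{id}_{\mathrm{app.aid}}((100, 200]) = 7$, $\mathrm{id}_{\mathrm{app.aid}}((200, 300]) = 5$, $\mathrm{id}_{\mathrm{app.aid}}((300, 400]) = 1$, $\mathrm{id}_{\mathrm{check.aid}}([0, 200]) = 2$, and $\mathrm{id}_{\mathrm{check.aid}}((200, 400]) = 6$.

Given the above partition results, we rewrite the following query conditions based on the above formulas:

$$\begin{aligned}
\delta_{\mathrm{cond}}(\mathrm{aid} = 256) &\Longrightarrow \mathrm{aid}^* = 5, \\
\delta_{\mathrm{cond}}(\mathrm{aid} < 180) &\Longrightarrow \mathrm{aid}^* \in \{3, 7\}, \\
\delta_{\mathrm{cond}}(\mathrm{aid} > 240) &\Longrightarrow \mathrm{aid}^* \in \{5, 1\}.
\end{aligned} \tag{9}$$

$\delta_{\mathrm{cond}}(\mathrm{app.aid} = \mathrm{check.aid}) \Rightarrow (\mathrm{app}^*.\mathrm{aid}^* = 3 \wedge \mathrm{check}^*.\mathrm{aid}^* = 2) \vee (\mathrm{app}^*.\mathrm{aid}^* = 7 \wedge \mathrm{check}^*.\mathrm{aid}^* = 2) \vee (\mathrm{app}^*.\mathrm{aid}^* = 5 \wedge \mathrm{check}^*.\mathrm{aid}^* = 6) \vee (\mathrm{app}^*.\mathrm{aid}^* = 1 \wedge \mathrm{check}^*.\mathrm{aid}^* = 6)$.

$\delta_{\mathrm{cond}}(\mathrm{app.aid} < \mathrm{check.aid}) \Rightarrow (\mathrm{app}^*.\mathrm{aid}^* = 3 \wedge \mathrm{check}^*.\mathrm{aid}^* = 2) \vee (\mathrm{app}^*.\mathrm{aid}^* = 3 \wedge \mathrm{check}^*.\mathrm{aid}^* = 6) \vee (\mathrm{app}^*.\mathrm{aid}^* = 7 \wedge \mathrm{check}^*.\mathrm{aid}^* = 2) \vee (\mathrm{app}^*.\mathrm{aid}^* = 7 \wedge \mathrm{check}^*.\mathrm{aid}^* = 6) \vee (\mathrm{app}^*.\mathrm{aid}^* = 5 \wedge \mathrm{check}^*.\mathrm{aid}^* = 6) \vee (\mathrm{app}^*.\mathrm{aid}^* = 1 \wedge \mathrm{check}^*.\mathrm{aid}^* = 6)$.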
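The rewrites in this example can be reproduced with a few lines of Python. The sketch below is an illustration only: the `Part` record and the function names are hypothetical, and the value rewrite follows the worked example by returning every Bucket that may contain a matching value (so a partially covered Bucket is included), which is broader than a literal reading of Definitions 12 and 13.

```python
from typing import NamedTuple

class Part(NamedTuple):
    lo: float
    hi: float
    lo_closed: bool  # True for [lo, hi], False for (lo, hi]
    bid: int

# Partitions of app.aid and check.aid from the example above.
APP_AID = [Part(0, 100, True, 3), Part(100, 200, False, 7),
           Part(200, 300, False, 5), Part(300, 400, False, 1)]
CHECK_AID = [Part(0, 200, True, 2), Part(200, 400, False, 6)]

def contains(p, v):
    return (p.lo <= v if p.lo_closed else p.lo < v) and v <= p.hi

def rewrite_value(parts, op, v):
    """Bucket IDs that may hold a value satisfying `A op v`."""
    if op == "=":
        return {p.bid for p in parts if contains(p, v)}
    if op == "<":
        return {p.bid for p in parts if p.lo < v}
    if op == ">":
        return {p.bid for p in parts if p.hi > v}
    raise ValueError(f"unsupported operator: {op}")

def overlaps(a, b):
    lo, hi = max(a.lo, b.lo), min(a.hi, b.hi)
    return lo < hi or (lo == hi and contains(a, lo) and contains(b, lo))

def rewrite_join(left, right, op):
    """Pairs of Bucket IDs that may hold values satisfying `A_i op A_j`."""
    if op == "=":
        return {(a.bid, b.bid) for a in left for b in right if overlaps(a, b)}
    if op == "<":
        # Some value in a can be below some value in b iff a.lo < b.hi.
        return {(a.bid, b.bid) for a in left for b in right if a.lo < b.hi}
    raise ValueError(f"unsupported operator: {op}")
```

Each returned ID or ID pair corresponds to one disjunct of the rewritten condition; the client still has to filter the decrypted rows, because a matching Bucket usually contains non-matching values as well.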

3.2.3. Query Optimization Principles. Because data are encrypted and stored in various places, to reduce the transmission cost and improve the query efficiency we should run as many operations as possible on the cloud services, so that the client can compute the answers with little effort.

Figure 2: Initial syntax tree.

Figure 3: Syntax tree applied to cloud DB.

For clarity, the operation procedure is expressed with a syntax tree. The decryption operation splits the tree into cryptograph operations and plaintext operations. Because every operation on the original tree ends with a selection after decryption, the principle of query optimization with the syntax tree is to iteratively pull up the selection.

For example, given the query "SELECT chairman FROM Airway, Price, Plane WHERE price < 900 AND begin = 'Shanghai' AND end = 'Beijing' AND Price.planeid = Plane.planeid AND Plane.airway = Airway.airway", we use the query tree to illustrate how to optimize this query and describe the detailed procedure.

In Figure 2, the SQL statement is converted into an initial syntax tree. If the enterprise uses cloud services or other off-site storage platforms, we first need to decrypt the cryptograph and then query the data at the client, as shown in Figure 3, where the cryptograph database on the cloud is bounded by the dotted line. The query objects (Price, Plane, and Airway) are converted to cryptograph tables (Price*, Plane*, and Airway*) in the cloud database.

Operations on the syntax tree are performed from bottom to top. In Figure 3, the first step is to execute the selections; the following steps are to rewrite the condition of each selection operation, convert it to a selection on the cryptograph in the cloud database, and then decrypt and further filter the result at the client. A new syntax tree is derived, as shown in Figure 4.

Figure 4: Rewriting syntax tree.

Figure 5: Moving selections in syntax tree.

According to the optimization principles described above, we should iteratively pull up selections. Therefore, by exchanging the positions of the selection operations (price < 900, begin = "Shanghai", and end = "Beijing") and the join operation and then combining the corresponding conditions, we obtain a new syntax tree, as shown in Figure 5.

Moreover, based on the operation rewriting rules, the join operation in Figure 5 should be converted into two parts: the join on the cryptograph in the cloud database and the selection on the decrypted provisional results, as shown in Figure 6. We repeat the above steps, rewriting all kinds of operations and repeatedly exchanging the positions of selection operations and other operations until no selection can be pulled up any further. As a result, we get the final syntax tree, as shown in Figure 7. Operations within the dotted line are executed on the cloud service, whereas the user only needs to execute the last selection. The above method thus takes full advantage of the cloud service to reduce the cost of transmission and postprocessing and to improve the efficiency of artifact querying in the business process.
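The split between cloud-side and client-side work can be illustrated with a small simulation of the selection price < 900. Everything in this sketch is an assumption made for illustration: the Bucket boundaries and IDs are invented, the function names are hypothetical, and `toy_encrypt` uses base64 purely as a stand-in for a real cipher (base64 provides no confidentiality).

```python
import base64
import json

# Hypothetical price Buckets: (low, high, bucket_id), each covering (low, high].
PRICE_BUCKETS = [(0, 500, 4), (500, 1000, 9), (1000, 2000, 2)]

def toy_encrypt(row):
    # Placeholder only: base64 is NOT encryption; a real system would use a cipher.
    return base64.b64encode(json.dumps(row).encode()).decode()

def toy_decrypt(blob):
    return json.loads(base64.b64decode(blob))

def bucket_id(price):
    for low, high, bid in PRICE_BUCKETS:
        if low < price <= high:
            return bid
    raise ValueError(f"price {price} is outside the partitioned domain")

def upload(rows):
    # The cloud side stores only the opaque row plus its Bucket index.
    return [(toy_encrypt(row), bucket_id(row["price"])) for row in rows]

def server_select(stored, bucket_ids):
    # Cryptograph selection: runs on the cloud and sees Bucket IDs only.
    return [blob for blob, bid in stored if bid in bucket_ids]

def query_price_below(stored, limit):
    # Client side: rewrite the condition, then decrypt and filter the result.
    ids = {bid for low, _high, bid in PRICE_BUCKETS if low < limit}
    provisional = server_select(stored, ids)
    final = [row for row in map(toy_decrypt, provisional) if row["price"] < limit]
    return final, len(provisional)
```

For four rows priced 300, 850, 950, and 1500, the cloud returns three encrypted rows and the client keeps two of them; the row priced 950 is the Intf-Artifact that shares a Bucket with a true result.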

4. Case Study

In this section, we introduce a business instance of a certain enterprise. Based on the method in Section 2, we complete the data modeling with artifact lifecycle for a given process instance and illustrate the query process through the query tree described in Section 3.

An enterprise's process of equipment purchase/scrap involves the following steps. At first, the equipment division fills out the equipment purchase/scrap application and hands it to the department managers and the company's leadership for approval. If the application is approved, it is archived; otherwise, it is withdrawn. The purchasing department makes the purchase according to a copy of the application, and when the purchase is completed, the documents are archived. The equipment division scraps the equipment based on specific methods and standards and then archives the processing results. The assets department regularly verifies the company's assets based on the purchase/scrap equipment information. The archive department has permission to query all the archived information.

Figure 6: Rewriting syntax tree.

This process involves multiple departments and multiple sets of information. If we manage the data alone, management is difficult, because business data are complicated and even a single attribute can take different values in different events. If we manage the process alone, only the department activities are involved, while the business data in the process are ignored. In this context, we analyze the process with respect to both data and process and describe this instance with an Artiflow $(N, S, R, C, \mathrm{Ru})$, where

$N$: "EPS"

$S$: FilloutEPA, Audit1, Audit2, Query, Purchase, AssetVerification

$R$: NewEPA, PrimaryEPA, FinalEPA, UnapprovedEPA, PO, FAL

$C$: FtoN, NtoA

Ru: constraint(EPA) = (FilloutEPA, Audit1, Audit2)

The model contains multiple artifacts. "EPA" describes the equipment purchase application; its lifecycle runs from filling and auditing to archiving. Once archived, it supports asset verification and query processing. "ESA" describes the equipment scrap application; its lifecycle likewise runs from filling and auditing to archiving, and it is associated with another artifact called "method & standard." The lifecycle of "PO"/"SL" captures the process from purchase/scrap to application archive. The whole process is shown in Figure 8.

Figure 7: The final syntax tree.

Artifact Example

Artifact $(C, A, \tau, Q, s, F)$, where

$C$ = "EPA"

$A$: EquipmentName, PurchaseAmount, UnitPrice, ApplicationDate, Applicant, AuditingComment, AuditingDate

$Q$: empty table (initial state $s$), basic information filling, delivery auditing, auditing completion, audited application archiving (terminal state $F$)

$\tau$: EquipmentName: Varchar, PurchaseAmount: Int, UnitPrice: Int, ApplicationDate: Date, Applicant: Varchar, AuditingComment: Varchar, AuditingDate: Date

Service Example

service = $(n, V_r, V_w, P, E)$, where

$n$ = "Audit2"

$V_r$: EPA

$V_w$: EPA

$P$: DEFINED(EquipmentName) $\wedge$ DEFINED(PurchaseAmount) $\wedge$ DEFINED(UnitPrice) $\wedge$ DEFINED(ApplicationDate) $\wedge$ DEFINED(Applicant) $\wedge$ $\neg$DEFINED(AuditingComment) $\wedge$ $\neg$DEFINED(AuditingDate)

$E$: DEFINED(EquipmentName) $\wedge$ DEFINED(PurchaseAmount) $\wedge$ DEFINED(UnitPrice) $\wedge$ DEFINED(ApplicationDate) $\wedge$ DEFINED(Applicant) $\wedge$ DEFINED(AuditingComment) $\wedge$ DEFINED(AuditingDate)
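The precondition $P$ and effect $E$ of the "Audit2" service can be checked mechanically. The sketch below is one possible encoding, assuming an artifact is represented as a dictionary in which an undefined attribute is absent or `None`; the function names are hypothetical.

```python
BASIC = ["EquipmentName", "PurchaseAmount", "UnitPrice", "ApplicationDate", "Applicant"]
AUDIT = ["AuditingComment", "AuditingDate"]

def defined(artifact, attr):
    return artifact.get(attr) is not None

def precondition(epa):
    # P: basic attributes are defined, auditing attributes are not yet defined.
    return all(defined(epa, a) for a in BASIC) and not any(defined(epa, a) for a in AUDIT)

def effect(epa):
    # E: every attribute of the EPA artifact is defined.
    return all(defined(epa, a) for a in BASIC + AUDIT)

def audit2(epa, comment, date):
    """Apply the Audit2 service to an EPA artifact, enforcing P and E."""
    if not precondition(epa):
        raise ValueError("precondition P of Audit2 does not hold")
    updated = {**epa, "AuditingComment": comment, "AuditingDate": date}
    if not effect(updated):
        raise ValueError("effect E of Audit2 does not hold")
    return updated
```

Applying `audit2` a second time to its own output fails the precondition, which reflects that the service moves the artifact forward along its lifecycle exactly once.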

Figure 8: Lifecycle of business data in equipment purchase/scrap process.

Figure 9: Initial query on EPS.

Repository Example

$R = (\mathrm{re}, R_a, R_r, C_t)$, where

re: "FinalEPA"

$R_a$: EPA

$R_r$: EPA

$C_t$: IsDefine(AuditingComment)

There is a repository named "FinalEPA", which reads and stores artifact "EPA" only if "AuditingComment" has been assigned.
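The behavior of the "FinalEPA" repository can be sketched as a container guarded by its condition $C_t$. The `Repository` class below, including its constructor arguments and the `store` method, is a hypothetical encoding chosen for illustration.

```python
class Repository:
    def __init__(self, name, accepts, condition):
        self.name = name            # re
        self.accepts = accepts      # artifact classes the repository handles
        self.condition = condition  # C_t, checked before an artifact is stored
        self.items = []

    def store(self, kind, artifact):
        """Store the artifact only if its class is accepted and C_t holds."""
        if kind not in self.accepts or not self.condition(artifact):
            return False
        self.items.append(artifact)
        return True

final_epa = Repository(
    "FinalEPA",
    accepts={"EPA"},
    condition=lambda a: a.get("AuditingComment") is not None,
)
```

An "EPA" artifact without an auditing comment is rejected, whereas the same artifact is accepted once "AuditingComment" has been assigned.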

Given the query "SELECT app_name FROM FAL, PO, FinalEPA WHERE FAL.name = 'depth sounder' AND FAL.POid = PO.Id AND PO.EPAid = FinalEPA.Id AND FinalEPA.Audit2 = 'Doctor Li'", it can be converted into a syntax tree as shown in Figure 9, which can be further converted into a new syntax tree as shown in Figure 10. Queries are issued on this syntax tree.

Figure 10: Final query on EPS.

5. Conclusion

There is no doubt that more and more large datasets will be generated during business process execution, and these business data are extremely valuable. In this paper, we modeled business data through its lifecycle from the perspective of the process, which ensures the integrity of dynamic business data. Furthermore, we presented the notion of user interest on business data, which helps to minimize the number of interferential tuples during data partition and to keep the postprocessing cost brought by data partition low. Considering that business data are now mostly stored on the cloud, we proposed a query rewriting strategy for off-site encrypted data, which has a significant advantage in reducing the postprocessing cost. Currently, there is little research on business data modeling and querying in the true sense. Our research lays a foundation for the application of business data in enterprises. It is the initial step toward a business data management architecture, and we will further study business data analysis with its lifecycle to fully exploit the value of business data.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (61272098) and the Science and Technology Development Foundation of Shanghai Ocean University.

References

[1] W. Huang, Z. Chen, W. Dong, H. Li, B. Cao, and J. Cao, "Mobile internet big data platform in China Unicom," Tsinghua Science and Technology, vol. 19, no. 1, pp. 95–101, 2014.

[2] S. Sagiroglu and D. Sinanc, "Big data: a review," in Proceedings of the International Conference on Collaboration Technologies and Systems (CTS '13), pp. 42–47, San Diego, Calif, USA, May 2013.

[3] M. Chui, B. Brown, J. Bughin et al., Big Data: The Next Frontier for Innovation, Competition, and Productivity, McKinsey Global Institute, 2011.

[4] M. Singh and S. K. Jain, "A survey on dataspace," in Advances in Network Security and Applications, vol. 196 of Communications in Computer and Information Science, pp. 608–621, Springer, 2011.

[5] K. Belhajjame, N. W. Paton, S. M. Embury, A. A. A. Fernandes, and C. Hedeler, "Incrementally improving dataspaces based on user feedback," Information Systems, vol. 38, no. 5, pp. 656–687, 2013.

[6] J.-P. Dittrich and M. A. Vaz Salles, "IDM: a unified and versatile data model for Personal Dataspace Management," in Proceedings of the 32nd International Conference on Very Large Data Bases (VLDB '06), pp. 367–378, September 2006.

[7] S. Pradhan, "Towards a novel desktop search technique," in Database and Expert Systems Applications, pp. 192–201, Springer, Berlin, Germany, 2007.

[8] M. Zhong, M. Liu, and Q. Chen, "Modeling heterogeneous data in dataspace," in Proceedings of the IEEE International Conference on Information Reuse and Integration (IEEE IRI '08), pp. 404–409, Las Vegas, Nev, USA, July 2008.

[9] A. Sarma, X. Dong, and A. Halevy, "Data modeling in dataspace support platforms," in Conceptual Modeling: Foundations and Applications, pp. 122–138, Springer, Berlin, Germany, 2009.

[10] R. Hull and J. Su, Report on NSF Workshop on Data-Centric Workflows, 2012, http://dcw2009.cs.ucsb.edu/report.pdf.

[11] R. Hull, J. Su, and R. Vaculin, "Data management perspectives on business process management," in Proceedings of the ACM SIGMOD Conference on Management of Data (SIGMOD '13), pp. 943–947, June 2013.

[12] K. Bhattacharya, N. S. Caswell, S. Kumaran, A. Nigam, and F. Y. Wu, "Artifact-centered operational modeling: lessons from customer engagements," IBM Systems Journal, vol. 46, no. 4, pp. 703–721, 2007.

[13] K. Bhattacharya, R. Guttman, K. Lyman et al., "A model-driven approach to industrializing discovery processes in pharmaceutical research," IBM Systems Journal, vol. 44, no. 1, pp. 145–162, 2005.

[14] R. Vaculín, R. Hull, T. Heath, C. Cochran, A. Nigam, and P. Sukaviriya, "Declarative business artifact centric modeling of decision and knowledge intensive business processes," in Proceedings of the 15th IEEE International EDOC Enterprise Computing Conference (EDOC '11), pp. 151–160, September 2011.

[15] R. Vaculín, R. Hull, M. Vukovic, T. Heath, N. Mills, and Y. Sun, "Supporting collaborative decision processes," in Proceedings of the IEEE 10th International Conference on Services Computing (SCC '13), pp. 651–658, July 2013.

[16] A. Nigam and N. S. Caswell, "Business artifacts: an approach to operational specification," IBM Systems Journal, vol. 42, no. 3, pp. 428–445, 2003.

[17] G. Liu, X. Liu, H. Qin et al., "Automated realization of business workflow specification," in Proceedings of the 1st International Workshop on SOA, Globalization, People, and Work (SG-PAW '09), pp. 8–9, 2009.

[18] H. Hacigumus, B. Iyer, C. Li, and S. Mehrotra, "Executing SQL over encrypted data in the database-service-provider model," in Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 216–227, June 2002.



in the whole process. Moreover, when the index is being built, formula (5) is used to determine which Bucket each artifact is stored in, which helps to obtain an optimal partition result.

3.2. Business Data Query. The cloud service stores encrypted artifact information and the corresponding index information, while other information, such as the partitioning of attributes, the mapping function, and so forth, is stored at the client. When a user issues a query request, query $q$ should be rewritten to its server-side cryptograph query $q^{*}$, which is then executed on the cloud. The purpose of rewriting SQL queries is to split the query computation across the client and the cloud.

3.2.1. Basic Definitions

Definition 12. $\xi_{(*,a]}(x)$ is a function that returns the set of all Bucket IDs whose right boundary value $B^{*}_{j,\mathrm{right}}$ is not greater than $x$ after the first partitioning of Buckets; that is, $\xi_{(*,a]}(x) = \{\mathrm{BID}^{*}_{v} \mid B^{*}_{v,\mathrm{right}} \le x\}$.

Definition 13. $\xi_{[a,*)}(x)$ is a function that returns the set of all Bucket IDs whose left boundary value $B^{*}_{j,\mathrm{left}}$ is not less than $x$ after the first partitioning of Buckets; that is, $\xi_{[a,*)}(x) = \{\mathrm{BID}^{*}_{v} \mid B^{*}_{v,\mathrm{left}} \ge x\}$.

Definition 14. $\xi_{(*,p(q(v_i))]}(x)$ is a function that returns the set of all Bucket IDs whose maximum artifact query probability $B^{*}_{j,p_{\mathrm{right}}}$ is not greater than $x$ after the second partitioning of Buckets; that is, $\xi_{(*,p(q(v_i))]}(x) = \{\mathrm{BID}^{*}_{j} \mid B^{*}_{j,p_{\mathrm{right}}} \le x\}$.

Definition 15. $\xi_{[p(q(a_i)),*)}(x)$ is a function that returns the set of all Bucket IDs whose minimum artifact query probability $B^{*}_{j,p_{\mathrm{left}}}$ is not less than $x$ after the second partitioning of Buckets; that is, $\xi_{[p(q(a_i)),*)}(x) = \{\mathrm{BID}^{*}_{j} \mid B^{*}_{j,p_{\mathrm{left}}} \ge x\}$.

Definition 16. $\delta_{\mathrm{cond}}(C)$ is a function that translates specific query conditions to encrypted ones.

Definition 17. The query rewriting function is described as $\delta_{\mathrm{query}}(q) \Rightarrow q^{*}$, where $q$ is the original query and $q^{*}$ is the cryptograph query.

3.2.2. Query Rewriting Rules. In view of grammatical rules, a query condition cond takes one of the forms $A\ \mathrm{op}\ v$, $A\ \mathrm{op}\ A$, $\mathrm{cond}_1 \vee \mathrm{cond}_2$, and $\mathrm{cond}_1 \wedge \mathrm{cond}_2$, where op is a comparison operator such as equal, less than, not greater than, greater than, and not less than. We list the rewrite formulas for the various query conditions in Formulas (6) to (8).

(1) $A\ \mathrm{op}\ v$:

\[
\begin{aligned}
\delta_{\mathrm{cond}}(x = e) &\stackrel{1}{\Longrightarrow} A^{*} = \xi_{e}(x),\\
\delta_{\mathrm{cond}}(x < e) &\stackrel{2}{\Longrightarrow} A^{*} \le \xi_{e}(x),\\
\delta_{\mathrm{cond}}(x < e) &\stackrel{3}{\Longrightarrow} A^{*} \in \xi_{(*,e]}(x),\\
\delta_{\mathrm{cond}}(x > e) &\stackrel{4}{\Longrightarrow} A^{*} \ge \xi_{e}(x),\\
\delta_{\mathrm{cond}}(x > e) &\stackrel{5}{\Longrightarrow} A^{*} \in \xi_{[e,*)}(x),
\end{aligned}
\tag{6}
\]

where both Map 2 and Map 4 are order preserving and both Map 3 and Map 5 are random.

(2) $A\ \mathrm{op}\ A$:

\[
\begin{aligned}
\delta_{\mathrm{cond}}(A_i < A_j) &\stackrel{1}{\Longrightarrow} \bigvee \bigl(A_i^{*} = \mathrm{Bid}_{A_i}(p_k) \wedge A_j^{*} \ge \xi_{A_i}(p_{\mathrm{left}})\bigr),\\
\delta_{\mathrm{cond}}(A_i < A_j) &\stackrel{2}{\Longrightarrow} \bigvee \bigl(A_j^{*} = \mathrm{Bid}_{A_j}(p_l) \wedge A_i^{*} \ge \xi_{A_j}(p_{\mathrm{right}})\bigr),\\
\delta_{\mathrm{cond}}(A_i < A_j) &\stackrel{3}{\Longrightarrow} \bigvee \bigl(\xi_{A_i}(p_{k,\mathrm{left}}) \le \xi_{A_j}(p_{l,\mathrm{right}})\bigr),\\
\delta_{\mathrm{cond}}(A_i < A_j) &\stackrel{4}{\Longrightarrow} \bigvee \bigl(A_i^{*} = \mathrm{Bid}_{A_i}(p_k) \wedge A_j^{*} = \mathrm{Bid}_{A_j}(p_l)\bigr),
\end{aligned}
\tag{7}
\]

where $p_k \in \mathrm{partition}(A_i)$, $p_l \in \mathrm{partition}(A_j)$, and $p_{l,\mathrm{high}} \ge p_{k,\mathrm{low}}$. When the condition is $A_i < A_j$, in Map 1, $A_i$ is order preserving, while in Map 3, both $A_i$ and $A_j$ are order preserving. Meanwhile, in Map 2, $A_j$ is order preserving, and in Map 4, both $A_i$ and $A_j$ are random.

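(3) $\mathrm{cond}_1 \vee \mathrm{cond}_2$ and $\mathrm{cond}_1 \wedge \mathrm{cond}_2$:

\[
\begin{aligned}
\delta_{\mathrm{cond}}(\mathrm{cond}_1 \vee \mathrm{cond}_2) &\Longrightarrow \delta_{\mathrm{cond}}(\mathrm{cond}_1) \vee \delta_{\mathrm{cond}}(\mathrm{cond}_2),\\
\delta_{\mathrm{cond}}(\mathrm{cond}_1 \wedge \mathrm{cond}_2) &\Longrightarrow \delta_{\mathrm{cond}}(\mathrm{cond}_1) \wedge \delta_{\mathrm{cond}}(\mathrm{cond}_2).
\end{aligned}
\tag{8}
\]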
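Because the rewriting distributes over the two connectives, a compound condition can be translated by recursing down to its atomic parts. The following Python sketch illustrates this idea only; the tuple encoding of conditions, the function name `rewrite_cond`, and the lookup table standing in for the atomic rewriting rules are assumptions made for illustration, and the Bucket IDs in the table are illustrative values.

```python
def rewrite_cond(cond, rewrite_atom):
    """Push the rewriting through "or"/"and" down to atomic conditions."""
    if cond[0] in ("or", "and"):
        connective, left, right = cond
        return (connective,
                rewrite_cond(left, rewrite_atom),
                rewrite_cond(right, rewrite_atom))
    # Anything else is treated as an atomic condition (attribute, operator, value).
    return rewrite_atom(cond)


# Hypothetical stand-in for the atomic rules: a fixed table that maps
# plaintext conditions to conditions on Bucket IDs.
ATOM_TABLE = {
    ("aid", "<", 180): ("aid*", "in", (3, 7)),
    ("aid", ">", 240): ("aid*", "in", (5, 1)),
}

plain = ("or", ("aid", "<", 180), ("aid", ">", 240))
encrypted = rewrite_cond(plain, ATOM_TABLE.__getitem__)
print(encrypted)
```

The connective structure of the condition is preserved; only the leaves change, which is why the server can evaluate the rewritten condition without seeing plaintext values.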

For instance, suppose there are two artifact plaintext tables in the cloud database, app(aid, aname, time, content, cid) and check(cid, aid, result), where the range of attribute aid is divided into 6 partitions: $\mathrm{id}_{\mathrm{app.aid}}([0, 100]) = 3$, $\mathrm{id}_{\mathrm{app.aid}}((100, 200]) = 7$, $\mathrm{id}_{\mathrm{app.aid}}((200, 300]) = 5$, $\mathrm{id}_{\mathrm{app.aid}}((300, 400]) = 1$, $\mathrm{id}_{\mathrm{check.aid}}([0, 200]) = 2$, and $\mathrm{id}_{\mathrm{check.aid}}((200, 400]) = 6$.

Given the above partition results, we rewrite the following query conditions based on the above formulas:

\[
\begin{aligned}
\delta_{\mathrm{cond}}(\mathrm{aid} = 256) &\Longrightarrow \mathrm{aid}^{*} = 5,\\
\delta_{\mathrm{cond}}(\mathrm{aid} < 180) &\Longrightarrow \mathrm{aid}^{*} \in \{3, 7\},\\
\delta_{\mathrm{cond}}(\mathrm{aid} > 240) &\Longrightarrow \mathrm{aid}^{*} \in \{5, 1\}.
\end{aligned}
\tag{9}
\]
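The three rewrites above can be reproduced with a small Python sketch. It assumes, following the worked example rather than any stated algorithm, that a condition is rewritten to the IDs of all Buckets that may contain a matching value; the function names and the tuple layout of the partition table are hypothetical.

```python
# Partition of app.aid from the example: (low, high, bucket_id).
# The first interval is closed, [0, 100]; the others are half-open, (low, high].
APP_AID = [(0, 100, 3), (100, 200, 7), (200, 300, 5), (300, 400, 1)]


def contains(index, low, high, value):
    """True if value lies in partition number `index` (only the first is closed at low)."""
    return (low <= value <= high) if index == 0 else (low < value <= high)


def rewrite_value_cond(partitions, op, value):
    """Return the Bucket IDs that may hold rows satisfying `attribute op value`."""
    ids = []
    for index, (low, high, bid) in enumerate(partitions):
        if op == "=" and contains(index, low, high, value):
            ids.append(bid)
        elif op == "<" and low < value:    # the Bucket can hold something below value
            ids.append(bid)
        elif op == ">" and high > value:   # the Bucket can hold something above value
            ids.append(bid)
    return ids


print(rewrite_value_cond(APP_AID, "=", 256))
print(rewrite_value_cond(APP_AID, "<", 180))
print(rewrite_value_cond(APP_AID, ">", 240))
```

The result is a superset of the Buckets that actually contain matches (for example, Bucket 7 also holds values from 180 to 200), so the client still has to filter the decrypted rows.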

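\[
\begin{aligned}
\delta_{\mathrm{cond}}(\mathrm{app.aid} = \mathrm{check.aid}) \Longrightarrow{} & (\mathrm{app}^{*}.\mathrm{aid}^{*} = 3 \wedge \mathrm{check}^{*}.\mathrm{aid}^{*} = 2) \vee (\mathrm{app}^{*}.\mathrm{aid}^{*} = 7 \wedge \mathrm{check}^{*}.\mathrm{aid}^{*} = 2)\\
& \vee (\mathrm{app}^{*}.\mathrm{aid}^{*} = 5 \wedge \mathrm{check}^{*}.\mathrm{aid}^{*} = 6) \vee (\mathrm{app}^{*}.\mathrm{aid}^{*} = 1 \wedge \mathrm{check}^{*}.\mathrm{aid}^{*} = 6),\\
\delta_{\mathrm{cond}}(\mathrm{app.aid} < \mathrm{check.aid}) \Longrightarrow{} & (\mathrm{app}^{*}.\mathrm{aid}^{*} = 3 \wedge \mathrm{check}^{*}.\mathrm{aid}^{*} = 2) \vee (\mathrm{app}^{*}.\mathrm{aid}^{*} = 3 \wedge \mathrm{check}^{*}.\mathrm{aid}^{*} = 6)\\
& \vee (\mathrm{app}^{*}.\mathrm{aid}^{*} = 7 \wedge \mathrm{check}^{*}.\mathrm{aid}^{*} = 2) \vee (\mathrm{app}^{*}.\mathrm{aid}^{*} = 7 \wedge \mathrm{check}^{*}.\mathrm{aid}^{*} = 6)\\
& \vee (\mathrm{app}^{*}.\mathrm{aid}^{*} = 5 \wedge \mathrm{check}^{*}.\mathrm{aid}^{*} = 6) \vee (\mathrm{app}^{*}.\mathrm{aid}^{*} = 1 \wedge \mathrm{check}^{*}.\mathrm{aid}^{*} = 6).
\end{aligned}
\]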
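The two attribute-to-attribute rewrites can be checked by enumerating pairs of partitions. The sketch below is a minimal Python illustration under the assumption that both attributes use random (non-order-preserving) Bucket IDs, so every admissible pair must be listed explicitly; the function name and table layout are hypothetical, and the strict comparisons reflect the half-open intervals of the example.

```python
# (low, high, bucket_id) for each partition of the aid attribute in the two tables.
APP_AID = [(0, 100, 3), (100, 200, 7), (200, 300, 5), (300, 400, 1)]
CHECK_AID = [(0, 200, 2), (200, 400, 6)]


def rewrite_join_cond(left, right, op):
    """Return Bucket-ID pairs that may contain rows with `left.attr op right.attr`."""
    pairs = []
    for l_low, l_high, l_id in left:
        for r_low, r_high, r_id in right:
            if op == "=":
                # The two ranges share more than a single boundary point.
                admissible = max(l_low, r_low) < min(l_high, r_high)
            elif op == "<":
                # Some left value can be smaller than some right value.
                admissible = l_low < r_high
            else:
                raise ValueError(f"unsupported operator: {op}")
            if admissible:
                pairs.append((l_id, r_id))
    return pairs


print(rewrite_join_cond(APP_AID, CHECK_AID, "="))
print(rewrite_join_cond(APP_AID, CHECK_AID, "<"))
```

Each returned pair corresponds to one conjunct of the rewritten condition, and the pairs are joined by disjunction.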

3.2.3. Query Optimization Principles. Because data is encrypted and stored in various places, in order to reduce the transmission cost and improve the query efficiency, we should run operations on cloud services as much as possible, so that the answers can be computed with little effort by the client.

Figure 2: Initial syntax tree.

Figure 3: Syntax tree applied to cloud DB.

For clarity, the operation procedure is expressed using a syntax tree. The decryption operation splits the tree into cryptograph operations and plaintext operations. Because any single operation on the original tree ends with a selection after decryption, the principle of query optimization using the syntax tree is to iteratively pull up the selection.

For example, given the query “SELECT chairman FROM Airway, Price, Plane WHERE price < 900 AND begin = 'shanghai' AND end = 'beijing' AND Price.planeid = Plane.planeid AND Plane.airway = Airway.airway,” we use a query tree to illustrate how to optimize this query and describe the detailed procedure.

In Figure 2, the SQL statement is converted into an initial syntax tree. If the enterprise uses cloud services or other off-site storage platforms, we need to first decrypt the cryptograph and then query the data at the client, as shown in Figure 3, where the cryptograph database on the cloud is bounded by the dotted line. The query objects (Price, Plane, and Airway) are converted to cryptograph tables (Price*, Plane*, and Airway*) in the cloud database.

Operations on the syntax tree are performed from bottom to top. In Figure 3, the first step is to execute the selection, while the following steps include rewriting the condition of the selection operations, converting it to a selection on cryptograph in the cloud database, and then decrypting and further filtering the result at the client. A new syntax tree is derived, as shown in Figure 4.

Figure 4: Rewriting syntax tree.

Figure 5: Moving selections in syntax tree.

According to the optimization principles described above, we should iteratively pull up selections. Therefore, by exchanging the positions of the selection operations (price < 900, begin = “shanghai,” and end = “beijing”) and the join operation and then combining the corresponding conditions, we obtain a new syntax tree, as shown in Figure 5.
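The pull-up step rests on the relational algebra equivalence $\sigma_{c_1}(R) \bowtie \sigma_{c_2}(S) \equiv \sigma_{c_1 \wedge c_2}(R \bowtie S)$, which holds when each condition refers only to attributes of its own input. A minimal Python sketch of this step follows. The tuple encoding of tree nodes is hypothetical, Decrypt nodes are left out for brevity, and placing the begin and end conditions on the Plane table is an assumption.

```python
def pull_up(node):
    """Lift all selections out of `node`; return (conditions, remaining tree)."""
    kind = node[0]
    if kind == "table":
        return [], node
    if kind == "select":
        _, cond, child = node
        conds, rest = pull_up(child)
        return [cond] + conds, rest
    if kind == "join":
        _, on, left, right = node
        left_conds, left_rest = pull_up(left)
        right_conds, right_rest = pull_up(right)
        return left_conds + right_conds, ("join", on, left_rest, right_rest)
    raise ValueError(f"unknown node kind: {kind}")


tree = (
    "join", "Plane.airway = Airway.airway",
    ("join", "Price.planeid = Plane.planeid",
     ("select", "price < 900", ("table", "Price")),
     ("select", "begin = 'shanghai'",
      ("select", "end = 'beijing'", ("table", "Plane")))),
    ("table", "Airway"),
)

conds, joins_only = pull_up(tree)
optimized = ("select", " AND ".join(conds), joins_only)
print(optimized[1])
```

After the call, the joins sit directly on the tables and a single combined selection sits above them, which mirrors the shape described for Figure 5.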

Moreover, based on the operation rewriting rules, the join operation in Figure 5 should be converted into two parts: the join on cryptograph in the cloud database and the selection on the decrypted provisional results, as shown in Figure 6. We repeat the above steps, rewriting all kinds of operations and continuously exchanging the positions of selection operations and other operations until no selection can be pulled up any further. As a result, we get the ultimate syntax tree, as shown in Figure 7. Operations within the dotted line are executed on the cloud service, whereas the user only needs to execute the last selection. The above method thus takes full advantage of the cloud service to reduce the cost of transmission and postprocessing and to improve the efficiency of artifact querying in the business process.
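The division of work between cloud and client can be simulated in a few lines of Python. This is only a sketch under stated assumptions: the price Buckets and the sample rows are invented for illustration, and base64 encoding is a placeholder for a real cipher, not a secure encryption scheme.

```python
import base64
import json

# Hypothetical partition of Price.price: (low, high, bucket_id), each covering [low, high).
PRICE_BUCKETS = [(0, 500, 11), (500, 1000, 12), (1000, 2000, 13)]


def bucket_of(price):
    """Return the Bucket ID whose range contains the given price."""
    for low, high, bid in PRICE_BUCKETS:
        if low <= price < high:
            return bid
    raise ValueError(f"price {price} is outside every bucket")


def encrypt(row):
    """Placeholder for a real cipher: base64 only hides the row from casual reading."""
    return base64.b64encode(json.dumps(row).encode()).decode()


def decrypt(blob):
    """Inverse of the placeholder above."""
    return json.loads(base64.b64decode(blob))


rows = [
    {"planeid": 1, "price": 450},
    {"planeid": 2, "price": 880},
    {"planeid": 3, "price": 950},
    {"planeid": 4, "price": 1200},
]

# What the cloud stores: the encrypted row plus the Bucket ID used as index.
cloud_table = [(encrypt(r), bucket_of(r["price"])) for r in rows]

# Cloud side: the rewritten selection for price < 900 keeps every Bucket
# that can contain a price below 900.
wanted = {bid for low, _high, bid in PRICE_BUCKETS if low < 900}
shipped = [blob for blob, bid in cloud_table if bid in wanted]

# Client side: decrypt the shipped rows and apply the original selection.
answer = [r for r in map(decrypt, shipped) if r["price"] < 900]

print(len(shipped), "rows shipped;", [r["planeid"] for r in answer], "match")
```

In this run, three of the four rows are shipped and the client discards one false positive (price 950), which is the postprocessing cost that a finer partition would reduce.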

4. Case Study

In this section, we introduce a business instance of a certain enterprise. Based on the method in Section 2, we complete the data modeling with artifact lifecycle from a given process instance and illustrate the query process through the query tree mentioned in Section 3.

An enterprise’s process of equipment purchase/scrap involves the following steps. At first, the equipment division fills out the equipment purchase/scrap application and hands it to the department managers and the company’s leadership for approval. If the application is approved, it is archived; otherwise, it is withdrawn. The purchasing department makes the purchase according to a copy of the application, and when the purchase is completed, the documents are archived. The equipment division scraps the equipment based on specific methods and standards and then archives the processing results. The assets department regularly verifies the company’s assets based on the purchase/scrap equipment information. The archive department has permission to query all the archived information.

Figure 6: Rewriting syntax tree.

This process involves multiple departments and multiple sets of information. If we manage the data alone, management is difficult because business data are complicated and even a single attribute takes different values in different events. If we manage the process alone, only the department activities are involved, while the business data in the process are ignored. In this context, we analyze the process with respect to both data and process and describe this instance with an Artiflow $(N, S, R, C, \mathrm{Ru})$, where

$N$: “EPS”,

$S$: {FilloutEPA, Audit1, Audit2, Query, Purchase, AssetVerification},

$R$: {NewEPA, PrimaryEPA, FinalEPA, UnapprovedEPA, PO, FAL},

$C$: {FtoN, NtoA},

Ru: constraint(EPA) = (FilloutEPA, Audit1, Audit2).

The model contains multiple artifacts. “EPA” describes the equipment purchase application; its lifecycle runs from filling and auditing to archiving. Once archived, it supports asset verification and query processing. “ESA” describes the equipment scrap application; its lifecycle also runs from filling and auditing to archiving. It is associated with another artifact called “method & standard.” The lifecycle of “PO”/“SL” captures the process from purchase/scrap to application archive. The whole process is shown in Figure 8.

Figure 7: The ultimate syntax tree.

Artifact Example

Artifact $(C, A, \tau, Q, s, F)$, where

$C$ = “EPA”,

$A$: {EquipmentName, PurchaseAmount, UnitPrice, ApplicationDate, Applicant, AuditingComment, AuditingDate},

$Q$: {empty table (initial state $s$), basic information filling, delivery auditing, auditing completion, audited application archiving (terminate state $F$)},

$\tau$: {EquipmentName: Varchar, PurchaseAmount: Int, UnitPrice: Int, ApplicationDate: Date, Applicant: Varchar, AuditingComment: Varchar, AuditingDate: Date}.

Service Example

service = $(n, V_r, V_w, P, E)$, where

$n$ = “Audit2”,

$V_r$: {EPA},

$V_w$: {EPA},

$P$: DEFINED(EquipmentName) $\wedge$ DEFINED(PurchaseAmount) $\wedge$ DEFINED(UnitPrice) $\wedge$ DEFINED(ApplicationDate) $\wedge$ DEFINED(Applicant) $\wedge$ $\neg$DEFINED(AuditingComment) $\wedge$ $\neg$DEFINED(AuditingDate),

$E$: DEFINED(EquipmentName) $\wedge$ DEFINED(PurchaseAmount) $\wedge$ DEFINED(UnitPrice) $\wedge$ DEFINED(ApplicationDate) $\wedge$ DEFINED(Applicant) $\wedge$ DEFINED(AuditingComment) $\wedge$ DEFINED(AuditingDate).
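The precondition $P$ and effect $E$ of the service can be read as executable checks on an EPA artifact. The Python sketch below is one possible reading, assuming the artifact is held as a dictionary in which an unassigned attribute is absent or None; the function names and all attribute values are illustrative.

```python
BASIC = ["EquipmentName", "PurchaseAmount", "UnitPrice", "ApplicationDate", "Applicant"]
AUDIT = ["AuditingComment", "AuditingDate"]


def defined(artifact, attr):
    """DEFINED(attr): the attribute has been assigned a value."""
    return artifact.get(attr) is not None


def precondition(artifact):
    """P: basic information is filled in and no auditing result exists yet."""
    return (all(defined(artifact, a) for a in BASIC)
            and not any(defined(artifact, a) for a in AUDIT))


def effect(artifact):
    """E: every attribute, including the auditing result, is defined."""
    return all(defined(artifact, a) for a in BASIC + AUDIT)


def audit2(artifact, comment, date):
    """Run the service: check P, write the auditing attributes, check E."""
    if not precondition(artifact):
        raise ValueError("precondition P of Audit2 does not hold")
    updated = dict(artifact, AuditingComment=comment, AuditingDate=date)
    if not effect(updated):
        raise ValueError("effect E of Audit2 does not hold")
    return updated


# Illustrative values only.
epa = {
    "EquipmentName": "depth sounder",
    "PurchaseAmount": 2,
    "UnitPrice": 900,
    "ApplicationDate": "2014-08-28",
    "Applicant": "equipment division",
}
audited = audit2(epa, "approved", "2014-09-11")
print(effect(audited), precondition(audited))
```

Because $P$ requires the auditing attributes to be undefined, applying the service a second time to the audited artifact is rejected, so the check also enforces the order of the lifecycle.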

Repository Example

$R = (\mathrm{re}, R_a, R_r, C_t)$, where

re: “FinalEPA”,

$R_a$: {EPA},

$R_r$: {EPA},

$C_t$: IsDefine(AuditingComment).

Figure 8: Lifecycle of business data in equipment purchase/scrap process.

Figure 9: Initial query on EPS.

There is a repository named “FinalEPA,” which reads and stores artifact “EPA” only if “AuditingComment” has been assigned.
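The guard on the repository can be sketched in Python as follows. The class, its method names, and the sample artifacts are hypothetical; only the repository name, the artifact class, and the guard on AuditingComment come from the example.

```python
class Repository:
    """A repository that stores an artifact only if its guard condition holds."""

    def __init__(self, name, accepted_class, condition):
        self.name = name
        self.accepted_class = accepted_class  # artifact class it may read and store
        self.condition = condition            # guard, a predicate on the artifact
        self.items = []

    def store(self, artifact_class, artifact):
        """Store the artifact and return True only if class and guard both match."""
        if artifact_class != self.accepted_class or not self.condition(artifact):
            return False
        self.items.append(artifact)
        return True


def is_defined(attr):
    """Build a predicate that checks whether `attr` has been assigned."""
    return lambda artifact: artifact.get(attr) is not None


final_epa = Repository("FinalEPA", "EPA", is_defined("AuditingComment"))
pending = {"EquipmentName": "depth sounder", "AuditingComment": None}
audited = {"EquipmentName": "depth sounder", "AuditingComment": "approved"}
print(final_epa.store("EPA", pending))  # guard fails, nothing is stored
print(final_epa.store("EPA", audited))  # guard holds, the artifact is stored
```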

Given the query “SELECT app_name FROM FAL, PO, FinalEPA WHERE FAL.name = 'depth sounder' AND FAL.POid = PO.Id AND PO.EPAid = FinalEPA.Id AND FinalEPA.Audit2 = 'Doctor Li',” it can be converted into a syntax tree, as shown in Figure 9, which can be further converted into the new syntax tree shown in Figure 10. Queries will be issued on this syntax tree.

Figure 10: Ultimate query on EPS.

5. Conclusion

There is no doubt that more and more large datasets will be produced during business process execution, and these business data are extremely valuable. In this paper, we modeled business data through its lifecycle from the perspective of process, which ensures the integrity of dynamic business data. Furthermore, we presented the notion of user interest on business data, which helps to minimize the number of interferential tuples during data partition and to lower the postprocessing cost brought by data partition. Considering that current business data are mostly stored on the cloud, we proposed a query rewriting strategy for off-site encrypted data, which has a significant advantage in reducing the postprocessing cost. Currently, there is little research on business data modeling and querying in the true sense. Our research lays a foundation for the application of business data in enterprises. It is an initial step toward a business data management architecture, and we will further study business data analysis with its lifecycle to fully exploit the value of business data.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (61272098) and the Science and Technology Development Foundation of Shanghai Ocean University.

References

[1] W. Huang, Z. Chen, W. Dong, H. Li, B. Cao, and J. Cao, “Mobile internet big data platform in China Unicom,” Tsinghua Science and Technology, vol. 19, no. 1, pp. 95–101, 2014.

[2] S. Sagiroglu and D. Sinanc, “Big data: a review,” in Proceedings of the International Conference on Collaboration Technologies and Systems (CTS ’13), pp. 42–47, San Diego, Calif, USA, May 2013.

[3] M. Chui, B. Brown, J. Bughin et al., Big Data: The Next Frontier for Innovation, Competition, and Productivity, McKinsey Global Institute, 2011.

[4] M. Singh and S. K. Jain, “A survey on dataspace,” in Advances in Network Security and Applications, vol. 196 of Communications in Computer and Information Science, pp. 608–621, Springer, 2011.

[5] K. Belhajjame, N. W. Paton, S. M. Embury, A. A. A. Fernandes, and C. Hedeler, “Incrementally improving dataspaces based on user feedback,” Information Systems, vol. 38, no. 5, pp. 656–687, 2013.

[6] J.-P. Dittrich and M. A. Vaz Salles, “IDM: a unified and versatile data model for Personal Dataspace Management,” in Proceedings of the 32nd International Conference on Very Large Data Bases (VLDB ’06), pp. 367–378, September 2006.

[7] S. Pradhan, “Towards a novel desktop search technique,” in Database and Expert Systems Applications, pp. 192–201, Springer, Berlin, Germany, 2007.

[8] M. Zhong, M. Liu, and Q. Chen, “Modeling heterogeneous data in dataspace,” in Proceedings of the IEEE International Conference on Information Reuse and Integration (IEEE IRI ’08), pp. 404–409, Las Vegas, Nev, USA, July 2008.

[9] A. Sarma, X. Dong, and A. Halevy, “Data modeling in dataspace support platforms,” in Conceptual Modeling: Foundations and Applications, pp. 122–138, Springer, Berlin, Germany, 2009.

[10] R. Hull and J. Su, Report on NSF Workshop on Data-Centric Workflows, 2012, http://dcw2009.cs.ucsb.edu/report.pdf.

[11] R. Hull, J. Su, and R. Vaculin, “Data management perspectives on business process management,” in Proceedings of the ACM SIGMOD Conference on Management of Data (SIGMOD ’13), pp. 943–947, June 2013.

[12] K. Bhattacharya, N. S. Caswell, S. Kumaran, A. Nigam, and F. Y. Wu, “Artifact-centered operational modeling: lessons from customer engagements,” IBM Systems Journal, vol. 46, no. 4, pp. 703–721, 2007.

[13] K. Bhattacharya, R. Guttman, K. Lyman et al., “A model-driven approach to industrializing discovery processes in pharmaceutical research,” IBM Systems Journal, vol. 44, no. 1, pp. 145–162, 2005.

[14] R. Vaculín, R. Hull, T. Heath, C. Cochran, A. Nigam, and P. Sukaviriya, “Declarative business artifact centric modeling of decision and knowledge intensive business processes,” in Proceedings of the 15th IEEE International EDOC Enterprise Computing Conference (EDOC ’11), pp. 151–160, September 2011.

[15] R. Vaculín, R. Hull, M. Vukovic, T. Heath, N. Mills, and Y. Sun, “Supporting collaborative decision processes,” in Proceedings of the IEEE 10th International Conference on Services Computing (SCC ’13), pp. 651–658, July 2013.

[16] A. Nigam and N. S. Caswell, “Business artifacts: an approach to operational specification,” IBM Systems Journal, vol. 42, no. 3, pp. 428–445, 2003.

[17] G. Liu, X. Liu, H. Qin et al., “Automated realization of business workflow specification,” in Proceedings of the 1st International Workshop on SOA, Globalization, People, and Work (SG-PAW ’09), pp. 8–9, 2009.

[18] H. Hacigumus, B. Iyer, C. Li, and S. Mehrotra, “Executing SQL over encrypted data in the database-service-provider model,” in Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 216–227, June 2002.


Page 6: Research Article Modeling and Querying Business Data with ...downloads.hindawi.com/journals/mpe/2015/506272.pdf · monitoring for performance or business concerns, auditing, and compliance

6 Mathematical Problems in Engineering

Planeairway = airwayairway

Airway

Price Plane Plane

Πchairman

120590price lt 900 120590begin = ldquoShanghairdquo 120590end = ldquoBeijingrdquo

Planeplaneid = planeplaneid

Figure 2 Initial syntax tree

120590begin = ldquoShanghairdquo 120590end = ldquoBeijingrdquo

Planeairway = airwayairway

Πchairman

120590price lt 900

Pricelowast Planelowast Planelowast

DecryptDecryptDecrypt

Decrypt

Airwaylowast

Planeplaneid = planeplaneid

Figure 3 Syntax tree applied to cloud DB

should run operations on cloud services as much as possibleand the answers can be computed with little effort by theclient

For clear expression operation procedure is expressedby using the syntax tree The decryption operation splits thetree into cryptograph operations and plaintext operationsBecause any single operation on the original tree ends withthe selection after decryption thereby the principle of queryoptimization by using syntax tree is to iteratively pull up theselection

For example given a selection ldquoSELECT chairman FROMAirway Price PlaneWHERE price lt 900 AND begin = ldquoshang-hairdquo AND end = ldquobeijingrdquo ANDPriceplaneid = PlaneplaneidAND Planeairway = Airwayairwayrdquo we take query tree toillustrate how to optimize this query and describe its detailedprocedures

In Figure 2 the SQL statement is converted into an initialsyntax tree If the enterprise use cloud services or other off-site storage platforms we need to first decrypt the crypto-graph then query the data at client as shown in Figure 3where cryptograph database on cloud is bounded by the dot-ted line Query objects (Price Plane and Airway) are con-verted to cryptograph tables (Pricelowast Planelowast and Airwaylowast) inthe cloud database

Operations on syntax tree are performed from bottom toup In Figure 3 the first step is to execute selection while thefollowing steps include rewriting the condition of selectionoperations converting it to a selection on cryptograph incloud database and then decrypting and further filtering theresult at client A new syntax tree is derived as shown inFigure 4

Planeairway = airwayairway

Planeplaneid = planeplaneid

Πchairman

120590price lt 900

Pricelowast Planelowast Planelowast

Decrypt

Decrypt Decrypt Decrypt

Airwaylowast

120590lowast120575cond(price lt 900) 120590

lowast120575cond(begin = ldquoShanghairdquo) 120590

lowast120575cond(end = ldquoBeijingrdquo)

120590begin = ldquoShanghairdquo 120590end = ldquoBeijingrdquo

Figure 4 Rewriting syntax tree

Planeairway = airwayairway

Planeplaneid = planeplaneid

Πchairman

Pricelowast Planelowast

Decrypt

Decrypt Decrypt

Airwaylowast

120590lowast120575cond(price lt 900)

120590price lt 900 and begin = ldquoShanghairdquo and = ldquoBeijingrdquoend

(begin = ldquoShanghairdquo and end = ldquoBeijingrdquo)120590lowast120575cond

Figure 5 Moving selections in syntax tree

According to optimization principles described abovewe should iteratively pull up selections Therefore by bothexchanging the positions between selection operations(pricelt 900 begin = ldquoshanghairdquo and end = ldquobeijingrdquo) and joinoperation and then combing corresponding conditions weobtain a new syntax tree as shown in Figure 5

Moreover based on operation rewriting rules join opera-tion in Figure 5 should be converted into two parts includingthe join on cryptograph in the cloud database and the selec-tion on decrypted provisional results as shown in Figure 6Repeat the above steps rewrite all kinds of operations andcontinuously exchange the positions between selection oper-ations and other operations till all the selections cannot bepulled up As a result we get the ultima syntax tree as shownin Figure 7 Operations within dotted line would be executedon cloud service whereas user only needs to execute the lastselection From here we see that the above method takes fulladvantage of cloud service to reduce the cost of transmittingand postprocessing and improve the efficiency of artifactquerying in business process

4 Case Study

In this section we will introduce a business instance of acertain enterprise Based on themethod in Section 2we com-plete the data modeling with artifact lifecycle from a givenprocess instance and illustrate the query process throughquery tree mentioned in Section 3

An enterprisersquos process of equipment purchasescrapinvolves the following steps At first equipment division fillsout the equipment purchasescrap application and hands it to

Mathematical Problems in Engineering 7

Planeairway = airwayairway

Πchairman

Pricelowast Planelowast

Decrypt

Decrypt

Airwaylowast

lowast

120590priceplaneid = planeplaneid

120575cond(priceplaneid = planeplaneid)

120590lowast120575cond(price lt 900)

120590price lt 900 and begin = ldquoShanghairdquo and = ldquoBeijingrdquoend

(begin = ldquoShanghairdquo and end = ldquoBeijingrdquo)120590lowast120575cond

Figure 6 Rewriting syntax tree

departmentmanagers and companyrsquos leadership for approvalIf the application is consented then we should archive itelse we withdraw it Purchasing department does purchaseaccording to a copy of application and when the purchaseis completed documents should be archived Equipmentdivision scraps the equipment based on specific methods andstandards and then archives the processing results Assetsdepartment regularly verifies companyrsquos assets based onpurchasescrap equipment information Archive departmenthas permission to query all the archived information

This process involves multiple departments and multiplesets of information If we manage the data alone as businessdata are complicated and even one attribute has differencevalue in different event thereby it is difficult to manage If wemanage the process alone only the department activities willbe involvedwhile business data in the process will be ignoredIn this context we analyze the process concerning both dataand process and describe this instance with an Artiflow (119873119878 119877 119862 Ru) where

119873 ldquoEP119878rdquo

119878 FilloutEPAAudit1 Audit2Query Purchase AssetVerification

119877 NewEPA PrimaryEPA FinalEPA Unapro-vedEPA PO FAL

119862 FtoN NtoA

Ru constraint (EPA) = (FilloutEPA Audit1 Aud-it2)

The model contains multiple artifacts. “EPA” describes an equipment purchase application; its lifecycle runs from filling and auditing to archiving. Once archived, it supports asset verification and query processing. “ESA” describes an equipment scrap application; its lifecycle likewise runs from filling and auditing to archiving, and it is associated with another artifact called “method & standard.” The lifecycle of “PO”/“SL” captures the process from purchase/scrap to application archive. The whole process is shown in Figure 8.

Figure 7: The ultimate syntax tree.

Artifact Example

Artifact (C, A, τ, Q, s, F):

C = “EPA”

A: EquipmentName, PurchaseAmount, UnitPrice, ApplicationDate, Applicant, AuditingComment, AuditingDate

Q: empty table (initial state s), basic information filling, delivery auditing, auditing completion, audited application archiving (terminal state F)

τ: EquipmentName: Varchar, PurchaseAmount: Int, UnitPrice: Int, ApplicationDate: Date, Applicant: Varchar, AuditingComment: Varchar, AuditingDate: Date
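As an illustration only, the EPA artifact above can be sketched in Python as a typed record with a linear lifecycle. The class name `EpaArtifact`, the state identifiers, and the `advance` method are assumptions introduced for this sketch and do not come from the paper; the attribute names and types follow A and τ, and the order of states follows Q.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Lifecycle states Q of the EPA artifact, in order. The identifiers are
# illustrative renderings of the states listed in the artifact example.
STATES = [
    "empty_table",                    # initial state s
    "basic_information_filling",
    "delivery_auditing",
    "auditing_completion",
    "audited_application_archiving",  # terminal state F
]


@dataclass
class EpaArtifact:
    """Hypothetical record for the EPA artifact; None means 'undefined'."""
    equipment_name: Optional[str] = None
    purchase_amount: Optional[int] = None
    unit_price: Optional[int] = None
    application_date: Optional[date] = None
    applicant: Optional[str] = None
    auditing_comment: Optional[str] = None
    auditing_date: Optional[date] = None
    state: str = STATES[0]

    def advance(self) -> str:
        """Move to the next lifecycle state; the terminal state has no successor."""
        index = STATES.index(self.state)
        if index == len(STATES) - 1:
            raise ValueError("artifact is already in its terminal state")
        self.state = STATES[index + 1]
        return self.state
```

A strictly linear lifecycle is the simplest reading of Q; a branch to a withdrawn application (the UnapprovedEPA repository) would need an explicit transition table instead of a list.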

Service Example

service = (n, V_r, V_w, P, E), where

n = “Audit2”

V_r: EPA

V_w: EPA

P: DEFINED(EquipmentName) ∧ DEFINED(PurchaseAmount) ∧ DEFINED(UnitPrice) ∧ DEFINED(ApplicationDate) ∧ DEFINED(Applicant) ∧ ¬DEFINED(AuditingComment) ∧ ¬DEFINED(AuditingDate)

E: DEFINED(EquipmentName) ∧ DEFINED(PurchaseAmount) ∧ DEFINED(UnitPrice) ∧ DEFINED(ApplicationDate) ∧ DEFINED(Applicant) ∧ DEFINED(AuditingComment) ∧ DEFINED(AuditingDate)
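The precondition P and effect E of the “Audit2” service can be read as checks on which attributes of the EPA artifact are defined. The following is a minimal sketch under that reading; representing the artifact as a dictionary and the function names `defined`, `precondition`, `effect`, and `audit2` are assumptions made for illustration, not part of the paper.

```python
from typing import Dict, Optional

Artifact = Dict[str, Optional[object]]

# Attributes that must already be defined before Audit2 runs.
REQUIRED_BEFORE = ["EquipmentName", "PurchaseAmount", "UnitPrice",
                   "ApplicationDate", "Applicant"]
# Attributes that Audit2 writes.
WRITTEN_BY_AUDIT = ["AuditingComment", "AuditingDate"]


def defined(artifact: Artifact, attribute: str) -> bool:
    """DEFINED(x): the attribute is present and has a value."""
    return artifact.get(attribute) is not None


def precondition(artifact: Artifact) -> bool:
    """P: application data are defined, auditing data are not yet defined."""
    return (all(defined(artifact, a) for a in REQUIRED_BEFORE)
            and not any(defined(artifact, a) for a in WRITTEN_BY_AUDIT))


def effect(artifact: Artifact) -> bool:
    """E: every attribute, including the auditing data, is defined."""
    return all(defined(artifact, a) for a in REQUIRED_BEFORE + WRITTEN_BY_AUDIT)


def audit2(artifact: Artifact, comment: str, audit_date: str) -> Artifact:
    """Apply Audit2 to a copy of the artifact, enforcing P before and E after."""
    if not precondition(artifact):
        raise ValueError("precondition P of Audit2 does not hold")
    result = dict(artifact, AuditingComment=comment, AuditingDate=audit_date)
    if not effect(result):
        raise ValueError("effect E of Audit2 does not hold")
    return result
```

Returning a new dictionary rather than mutating the input keeps the state before and after the service separately inspectable, which is convenient when checking P and E.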

Repository Example

R = (re, R_a, R_r, C_t), where

re: “FinalEPA”

R_a: EPA

R_r: EPA

C_t: IsDefine(AuditingComment)

That is, there is a repository named “FinalEPA” that reads and stores the artifact “EPA” only if “AuditingComment” has been assigned.

Figure 8: Lifecycle of business data in the equipment purchase/scrap process.

Figure 9: Initial query on EPS.
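The admission condition of the “FinalEPA” repository described above can be sketched as a small class. This is an illustrative sketch only: the class `Repository`, its methods, and the dictionary representation of an artifact are assumptions, and the condition C_t is modeled simply as "the named attribute has a value."

```python
from typing import Dict, List, Optional

Artifact = Dict[str, Optional[object]]


class Repository:
    """Hypothetical repository that admits artifacts of one type under a condition."""

    def __init__(self, name: str, artifact_type: str, required_attribute: str) -> None:
        self.name = name                              # re
        self.artifact_type = artifact_type            # R_a and R_r (the same type here)
        self.required_attribute = required_attribute  # attribute tested by C_t
        self._items: List[Artifact] = []

    def admits(self, artifact_type: str, artifact: Artifact) -> bool:
        """C_t: the artifact has the expected type and the attribute is defined."""
        return (artifact_type == self.artifact_type
                and artifact.get(self.required_attribute) is not None)

    def store(self, artifact_type: str, artifact: Artifact) -> bool:
        """Store a copy of the artifact if admitted; report whether it was stored."""
        if not self.admits(artifact_type, artifact):
            return False
        self._items.append(dict(artifact))
        return True

    def read(self) -> List[Artifact]:
        """Return copies of the stored artifacts."""
        return [dict(item) for item in self._items]


# The repository from the example: EPA artifacts with a defined AuditingComment.
final_epa = Repository("FinalEPA", "EPA", "AuditingComment")
```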

The query “SELECT app_name FROM FAL, PO, FinalEPA WHERE FAL.name = 'depth sounder' AND FAL.POid = PO.Id AND PO.EPAid = FinalEPA.Id AND FinalEPA.Audit2 = 'Doctor Li'” can be converted into the syntax tree shown in Figure 9, which can be further converted into the new syntax tree shown in Figure 10. Queries are then issued on this rewritten syntax tree.

Figure 10: Ultimate query on EPS.
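The rewritten tree appears to evaluate coarse conditions on the encrypted relations at the server and to leave decryption and exact selection to the client. A minimal sketch of such a split for a single equality selection, loosely in the spirit of the bucket-index approach of [18], is given below. Everything in it is an assumption made for illustration: the function names, the number of buckets, and in particular `seal`/`unseal`, which use base64 only as a stand-in for a real cipher and provide no security.

```python
import base64
import hashlib
import json
from typing import Dict, List, Tuple

NUM_BUCKETS = 4  # assumed coarseness of the server-side index


def bucket(value: str) -> int:
    """Coarse index value stored next to the encrypted tuple."""
    return hashlib.sha256(value.encode("utf-8")).digest()[0] % NUM_BUCKETS


def seal(row: Dict[str, str]) -> bytes:
    """Placeholder for encryption. base64 is NOT secure; it only hides structure."""
    return base64.b64encode(json.dumps(row, sort_keys=True).encode("utf-8"))


def unseal(blob: bytes) -> Dict[str, str]:
    """Placeholder for decryption, the inverse of seal."""
    return json.loads(base64.b64decode(blob).decode("utf-8"))


def outsource(rows: List[Dict[str, str]], attribute: str) -> List[Tuple[bytes, int]]:
    """What the server stores: the sealed tuple plus a bucket id for one attribute."""
    return [(seal(row), bucket(row[attribute])) for row in rows]


def server_select(stored: List[Tuple[bytes, int]], wanted_bucket: int) -> List[bytes]:
    """Server side: filter on the bucket id only; returns a superset of the answer."""
    return [blob for blob, bucket_id in stored if bucket_id == wanted_bucket]


def client_select(stored: List[Tuple[bytes, int]], attribute: str,
                  constant: str) -> List[Dict[str, str]]:
    """Client side: decrypt the candidates and apply the exact selection."""
    candidates = server_select(stored, bucket(constant))
    return [row for row in map(unseal, candidates) if row[attribute] == constant]
```

For example, `client_select(outsource(rows, "name"), "name", "depth sounder")` returns exactly the rows whose name is "depth sounder". The server may return extra candidates that share a bucket, and filtering those out is the postprocessing cost the rewriting tries to keep small.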

5. Conclusion

There is no doubt that ever larger datasets will be generated during business process execution, and these business data are extremely valuable. We therefore modeled business data through its lifecycle from the perspective of the process, which preserves the integrity of dynamic business data. Furthermore, we presented the notion of user interest on business data, which helps to minimize the number of interferential tuples during data partition and to lower the postprocessing cost that data partition incurs. Considering that business data are now mostly stored in the cloud, we proposed a query rewriting strategy for off-site encrypted data, which reduces the postprocessing cost. Currently there is little research on business data modeling and querying in the true sense. Our research lays a foundation for the application of business data in enterprises. It is an initial step toward a business data management architecture, and we will further study business data analysis with its lifecycle to fully exploit the value of business data.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (61272098) and the Science and Technology Development Foundation of Shanghai Ocean University.

References

[1] W. Huang, Z. Chen, W. Dong, H. Li, B. Cao, and J. Cao, “Mobile internet big data platform in China Unicom,” Tsinghua Science and Technology, vol. 19, no. 1, pp. 95–101, 2014.

[2] S. Sagiroglu and D. Sinanc, “Big data: a review,” in Proceedings of the International Conference on Collaboration Technologies and Systems (CTS ’13), pp. 42–47, San Diego, Calif, USA, May 2013.

[3] M. Chui, B. Brown, J. Bughin et al., Big Data: The Next Frontier for Innovation, Competition, and Productivity, McKinsey Global Institute, 2011.

[4] M. Singh and S. K. Jain, “A survey on dataspace,” in Advances in Network Security and Applications, vol. 196 of Communications in Computer and Information Science, pp. 608–621, Springer, 2011.

[5] K. Belhajjame, N. W. Paton, S. M. Embury, A. A. A. Fernandes, and C. Hedeler, “Incrementally improving dataspaces based on user feedback,” Information Systems, vol. 38, no. 5, pp. 656–687, 2013.

[6] J.-P. Dittrich and M. A. Vaz Salles, “IDM: a unified and versatile data model for Personal Dataspace Management,” in Proceedings of the 32nd International Conference on Very Large Data Bases (VLDB ’06), pp. 367–378, September 2006.

[7] S. Pradhan, “Towards a novel desktop search technique,” in Database and Expert Systems Applications, pp. 192–201, Springer, Berlin, Germany, 2007.

[8] M. Zhong, M. Liu, and Q. Chen, “Modeling heterogeneous data in dataspace,” in Proceedings of the IEEE International Conference on Information Reuse and Integration (IEEE IRI ’08), pp. 404–409, Las Vegas, Nev, USA, July 2008.

[9] A. Sarma, X. Dong, and A. Halevy, “Data modeling in dataspace support platforms,” in Conceptual Modeling: Foundations and Applications, pp. 122–138, Springer, Berlin, Germany, 2009.

[10] R. Hull and J. Su, Report on NSF Workshop on Data-Centric Workflows, 2012, http://dcw2009.cs.ucsb.edu/report.pdf.

[11] R. Hull, J. Su, and R. Vaculin, “Data management perspectives on business process management,” in Proceedings of the ACM SIGMOD Conference on Management of Data (SIGMOD ’13), pp. 943–947, June 2013.

[12] K. Bhattacharya, N. S. Caswell, S. Kumaran, A. Nigam, and F. Y. Wu, “Artifact-centered operational modeling: lessons from customer engagements,” IBM Systems Journal, vol. 46, no. 4, pp. 703–721, 2007.

[13] K. Bhattacharya, R. Guttman, K. Lyman et al., “A model-driven approach to industrializing discovery processes in pharmaceutical research,” IBM Systems Journal, vol. 44, no. 1, pp. 145–162, 2005.

[14] R. Vaculín, R. Hull, T. Heath, C. Cochran, A. Nigam, and P. Sukaviriya, “Declarative business artifact centric modeling of decision and knowledge intensive business processes,” in Proceedings of the 15th IEEE International EDOC Enterprise Computing Conference (EDOC ’11), pp. 151–160, September 2011.

[15] R. Vaculín, R. Hull, M. Vukovic, T. Heath, N. Mills, and Y. Sun, “Supporting collaborative decision processes,” in Proceedings of the IEEE 10th International Conference on Services Computing (SCC ’13), pp. 651–658, July 2013.

[16] A. Nigam and N. S. Caswell, “Business artifacts: an approach to operational specification,” IBM Systems Journal, vol. 42, no. 3, pp. 428–445, 2003.

[17] G. Liu, X. Liu, H. Qin et al., “Automated realization of business workflow specification,” in Proceedings of the 1st International Workshop on SOA, Globalization, People, and Work (SG-PAW ’09), pp. 8–9, 2009.

[18] H. Hacigumus, B. Iyer, C. Li, and S. Mehrotra, “Executing SQL over encrypted data in the database-service-provider model,” in Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 216–227, June 2002.

departmentmanagers and companyrsquos leadership for approvalIf the application is consented then we should archive itelse we withdraw it Purchasing department does purchaseaccording to a copy of application and when the purchaseis completed documents should be archived Equipmentdivision scraps the equipment based on specific methods andstandards and then archives the processing results Assetsdepartment regularly verifies companyrsquos assets based onpurchasescrap equipment information Archive departmenthas permission to query all the archived information

This process involves multiple departments and multiplesets of information If we manage the data alone as businessdata are complicated and even one attribute has differencevalue in different event thereby it is difficult to manage If wemanage the process alone only the department activities willbe involvedwhile business data in the process will be ignoredIn this context we analyze the process concerning both dataand process and describe this instance with an Artiflow (119873119878 119877 119862 Ru) where

119873 ldquoEP119878rdquo

119878 FilloutEPAAudit1 Audit2Query Purchase AssetVerification

119877 NewEPA PrimaryEPA FinalEPA Unapro-vedEPA PO FAL

119862 FtoN NtoA

Ru constraint (EPA) = (FilloutEPA Audit1 Aud-it2)

The model contains multiple artifacts ldquoEPArdquo is fordescribing equipment purchase application while its lifecyclestarts from filling auditing to archiving When having beenarchived it will provide asset verification and support queryprocessing ldquoESArdquo is for describing equipment scrap applica-tion while its lifecycle starts from filling auditing to archiv-ing It is associated with another artifact called ldquomethod ampstandardrdquo The lifecycle of ldquoPOrdquoldquoSLrdquo captures process from

Πchairman

Pricelowast Planelowast

Decrypt

Airwaylowastlowast

lowast

120575cond(priceplaneid = planeplaneid)

120575condplaneairway = airwayairway

priceplaneid = planeplaneid and planeairway = airwayairway120590price lt 900 and begin = ldquoShanghairdquo and = ldquoBeijingrdquoend

120590lowast120575cond(price lt 900) (begin = ldquoShanghairdquo and end = ldquoBeijingrdquo)120590

lowast120575cond

Figure 7 The ultima syntax tree

purchasescrap to application archive The whole process isshown in Figure 8

Artifact Example

Artifact (119862 A 120591 119876 119904 119865)119862 = ldquoEPArdquoA EquipmentName PurchaseAmount UnitPriceApplicationDate Applicant AuditingComment Audit-ingDate119876 empty table (initial state 119904) basic information fill-ing delivery auditing auditing completion auditedapplication archiving (terminate state 119865)120591 EquipmentName verchar PurchaseAmount InUnitPrice Int ApplicationDate Date Applicant Ver-char AuditingComment Verchar AuditingDate Date

Service Example

service = (119899 119881119903 119881119908 119875 119864) where

119899 = ldquoAudit2rdquo119881119903 EPA

119881119908 EPA

119875 DEFINED (EquipmentName) and DEFINED (Pur-chaseAmount) and DEFINED (UnitPrice) and DEFINED(ApplicationDate) and DEFINED (Applicant) and

notDEFINED (AuditingComment) and notDEFINED(AuditingDate)119864 DEFINED (EquipmentName) and DEFINED (Pur-chaseAmount) and DEFINED (UnitPrice) and DEFINED(ApplicationDate) and DEFINED (Applicant) and

DEFINED (AuditingComment) and DEFINED (Audit-ingDate)

Repository Example

119877 = (re 119877119886 119877119903 119862119905)

re ldquoFinalEPArdquo119877119886 EPA

8 Mathematical Problems in Engineering

M1 purchaserequest

EPA

EPA

EPA

EPA EPA

EPA

EPA

EPA

EPA

EPA

EPA

E1

Fill outEPA

E3

E3

E3

NewEPA M3 submit 1

E4

E4

NewESA

M5 scrap request

Fill outESA

UnapprovedEPA

EPA

Unapproved

PrimaryEPA

PrimaryESA

ESA

ESA

ESAESA

ESAESA

ESA

ESA

ESA

ESA

Methodand

standard

M4 submit 2

Audit2

E5

E6

E6

E6

E2

Scrap

FinalESA SL

SL

SL

SLMS

FAL

FAL FAL

FAL

Query

M2 query

M6 perform

M6 perform

Purchase

PO

PO

PO

PO

Assetverification

Final

Audit1

Figure 8 Lifecycle of business data in equipment purchasescrap process

DecryptDecrypt

Decrypt

POEPAid = finalEPAId

FAL Poid = POId 120590finalEPAaudit1 = ldquoDoctor Lirdquo

120590FALname = ldquodepth sounderrdquo

FinalEPAlowast

FAClowast

POlowast

Πapp name

Figure 9 Initial query on EPS

119877119903 EPA

119862119905 IsDefine(AuditingComment)

There is a repository named ldquoFinalEPArdquo which reads andstores artifact ldquoEPArdquo only if ldquoAuditingCommentrdquo has beenassigned

Given a query ldquoSELECT app name FROM FAL POFinalEPA WHERE Falname = lsquodepthsounderrsquo AND FALPOid=POIdANDPOEPAid= FinalEPAIdANDFinalEPAAudit2 = lsquoDoctor Lirdquorsquo it can be converted into a syntax treeas shown in Figure 9 which can be further converted into anew syntax tree shown in Figure 10 Queries will be issued onthis syntax tree

Decrypt

FinalEPAlowast

FAClowast

POlowast

Πapp name

120590FALname = ldquodepth sounderrdquo and FAL POId=POId and

POEPAid = finalEPAId and finalEPA audit2 = ldquoDoctor Lirdquo

120575cond(POEAPId = finalEPAId)

120575cond(FAL Poid = POId) 120590lowast120575cond (finalEPA audit2 = ldquoDoctor Lirdquo)

120590lowast120575cond FAL name = ldquodepth sounderrdquo

lowast

lowast

Figure 10 Ultima query on EPS

5 Conclusion

There is no doubt that more and more large datasets will bepoured out during business process execution meanwhilethese business data are extremely valuable In this case wemodeled business data through its lifecycle from the per-spective of process which ensures the integrity of dynamicbusiness data Furthermore we present the notion of userinterest on business data which has a superior function in

Mathematical Problems in Engineering 9

countingminimum interferential tuples during data partitionand ensuring a lower cost of postprocessing brought bydata partition Considering current business data are mostlystored on cloud we proposed a query rewriting strategy foroff-site encrypted data which has a significant advantage inreducing the postprocessing cost Currently there is littleresearch on business data modeling and querying in the truesense Our research lays great foundation for business datarsquosapplication in enterprise That is the initial step of businessdata management architecture and we will further researchon business data analysis with its lifecycle to fully dig thesignificant value of business data

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Acknowledgments

This work is supported by National Natural Science Foun-dation of China (61272098) and Science and TechnologyDevelopment Foundation of Shanghai Ocean University

References

[1] W Huang Z ChenW Dong H Li B Cao and J Cao ldquoMobileinternet big data platform in china unicomrdquo Tsinghua Scienceand Technology vol 19 no 1 pp 95ndash101 2014

[2] S Sagiroglu andD Sinanc ldquoBig data a reviewrdquo in Proceedings ofthe International Conference on Collaboration Technologies andSystems (CTS rsquo13) pp 42ndash47 San Diego Calif USA May 2013

[3] M Chui B Brown J Bughin et al Big Data The Next Frontierfor Innovation Competition and Productivity McKinsey GlobalInstitute 2011

[4] M Singh and S K Jain ldquoA survey on dataspacerdquo in Advances inNetwork Security and Applications vol 196 of Communicationsin Computer and Information Science pp 608ndash621 Springer2011

[5] K Belhajjame N W Paton S M Embury A A A Fernandesand C Hedeler ldquoIncrementally improving dataspaces based onuser feedbackrdquo Information Systems vol 38 no 5 pp 656ndash6872013

[6] J-P Dittrich and M A Vaz Salles ldquoIDM a unified and versa-tile data model for Personal Dataspace Managementrdquo in Pro-ceedings of the 32nd International Conference onVery LargeDataBases (VLDB rsquo06) pp 367ndash378 September 2006

[7] S Pradhan ldquoTowards a novel desktop search techniquerdquoin Database and Expert Systems Applications pp 192ndash201Springer Berlin Germany 2007

[8] M Zhong M Liu and Q Chen ldquoModeling heterogeneousdata in dataspacerdquo in Proceedings of the IEEE InternationalConference on Information Reuse and Integration (IEEE IRI rsquo08)pp 404ndash409 Las Vegas Nev USA July 2008

[9] A Sarma X Dong andAHalevy ldquoDatamodeling in dataspacesupport platformsrdquo in Conceptual Modeling Foundations andApplications pp 122ndash138 Springer Berlin Germany 2009

[10] R Hull and J Su Report on NSF Workshop on Data-CentricWorkflows 2012 httpdcw2009csucsbedureportpdf

[11] R Hull J Su and R Vaculin ldquoData management perspectiveson business process managementrdquo in Proceedings of the ACMSIGMODConference onManagement ofData (SIGMOD rsquo13) pp943ndash947 June 2013

[12] K Bhattacharya N S Caswell S Kumaran A Nigam and FY Wu ldquoArtifact-centered operational modeling lessons fromcustomeengagementsrdquo IBM Systems Journal vol 46 no 4 pp703ndash721 2007

[13] K Bhattacharya R Guttman K Lyman et al ldquoA model-drivenapproach to industrializing discovery processes in pharmaceu-tical researchrdquo IBM Systems Journal vol 44 no 1 pp 145ndash1622005

[14] R Vaculın R Hull T Heath C Cochran A Nigam and PSukaviriya ldquoDeclarative business artifact centric modeling ofdecision and knowledge intensive business processesrdquo in Pro-ceedings of the 15th IEEE International EDOC Enterprise Com-puting Conference (EDOC rsquo11) pp 151ndash160 September 2011

[15] R Vaculın R Hull M Vukovic T Heath N Mills and Y SunldquoSupporting collaborative decision processesrdquo in Proceedings ofthe IEEE 10th International Conference on Services Computing(SCC rsquo13) pp 651ndash658 July 2013

[16] A Nigam and N S Caswell ldquoBusiness artifacts an approach tooperational specificationrdquo IBM Systems Journal vol 42 no 3pp 428ndash445 2003

[17] G Liu X Liu H Qin et al ldquoAutomated realization of businessworkflow specificationrdquo in Proceedings of the 1st InternationalWorkshop on SOA Globalization People and Work (SG-PAWrsquo09) pp 8ndash9 2009

[18] H Hacigumus B Iyer C Li and S Mehrotra ldquoExecuting SQLover encrypted data in the database-service-providermodelrdquo inProceedings of the ACM SIGMOD International Conference onManagment of Data pp 216ndash227 June 2002

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 8: Research Article Modeling and Querying Business Data with ...downloads.hindawi.com/journals/mpe/2015/506272.pdf · monitoring for performance or business concerns, auditing, and compliance

8 Mathematical Problems in Engineering

M1 purchaserequest

EPA

EPA

EPA

EPA EPA

EPA

EPA

EPA

EPA

EPA

EPA

E1

Fill outEPA

E3

E3

E3

NewEPA M3 submit 1

E4

E4

NewESA

M5 scrap request

Fill outESA

UnapprovedEPA

EPA

Unapproved

PrimaryEPA

PrimaryESA

ESA

ESA

ESAESA

ESAESA

ESA

ESA

ESA

ESA

Methodand

standard

M4 submit 2

Audit2

E5

E6

E6

E6

E2

Scrap

FinalESA SL

SL

SL

SLMS

FAL

FAL FAL

FAL

Query

M2 query

M6 perform

M6 perform

Purchase

PO

PO

PO

PO

Assetverification

Final

Audit1

Figure 8 Lifecycle of business data in equipment purchasescrap process

DecryptDecrypt

Decrypt

POEPAid = finalEPAId

FAL Poid = POId 120590finalEPAaudit1 = ldquoDoctor Lirdquo

120590FALname = ldquodepth sounderrdquo

FinalEPAlowast

FAClowast

POlowast

Πapp name

Figure 9 Initial query on EPS

119877119903 EPA

119862119905 IsDefine(AuditingComment)

There is a repository named ldquoFinalEPArdquo which reads andstores artifact ldquoEPArdquo only if ldquoAuditingCommentrdquo has beenassigned

Given a query ldquoSELECT app name FROM FAL POFinalEPA WHERE Falname = lsquodepthsounderrsquo AND FALPOid=POIdANDPOEPAid= FinalEPAIdANDFinalEPAAudit2 = lsquoDoctor Lirdquorsquo it can be converted into a syntax treeas shown in Figure 9 which can be further converted into anew syntax tree shown in Figure 10 Queries will be issued onthis syntax tree

Decrypt

FinalEPAlowast

FAClowast

POlowast

Πapp name

120590FALname = ldquodepth sounderrdquo and FAL POId=POId and

POEPAid = finalEPAId and finalEPA audit2 = ldquoDoctor Lirdquo

120575cond(POEAPId = finalEPAId)

120575cond(FAL Poid = POId) 120590lowast120575cond (finalEPA audit2 = ldquoDoctor Lirdquo)

120590lowast120575cond FAL name = ldquodepth sounderrdquo

lowast

lowast

Figure 10 Ultima query on EPS

5 Conclusion

There is no doubt that more and more large datasets will bepoured out during business process execution meanwhilethese business data are extremely valuable In this case wemodeled business data through its lifecycle from the per-spective of process which ensures the integrity of dynamicbusiness data Furthermore we present the notion of userinterest on business data which has a superior function in

Mathematical Problems in Engineering 9

countingminimum interferential tuples during data partitionand ensuring a lower cost of postprocessing brought bydata partition Considering current business data are mostlystored on cloud we proposed a query rewriting strategy foroff-site encrypted data which has a significant advantage inreducing the postprocessing cost Currently there is littleresearch on business data modeling and querying in the truesense Our research lays great foundation for business datarsquosapplication in enterprise That is the initial step of businessdata management architecture and we will further researchon business data analysis with its lifecycle to fully dig thesignificant value of business data

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Acknowledgments

This work is supported by National Natural Science Foun-dation of China (61272098) and Science and TechnologyDevelopment Foundation of Shanghai Ocean University

References

[1] W Huang Z ChenW Dong H Li B Cao and J Cao ldquoMobileinternet big data platform in china unicomrdquo Tsinghua Scienceand Technology vol 19 no 1 pp 95ndash101 2014

[2] S Sagiroglu andD Sinanc ldquoBig data a reviewrdquo in Proceedings ofthe International Conference on Collaboration Technologies andSystems (CTS rsquo13) pp 42ndash47 San Diego Calif USA May 2013

[3] M Chui B Brown J Bughin et al Big Data The Next Frontierfor Innovation Competition and Productivity McKinsey GlobalInstitute 2011

[4] M Singh and S K Jain ldquoA survey on dataspacerdquo in Advances inNetwork Security and Applications vol 196 of Communicationsin Computer and Information Science pp 608ndash621 Springer2011

[5] K Belhajjame N W Paton S M Embury A A A Fernandesand C Hedeler ldquoIncrementally improving dataspaces based onuser feedbackrdquo Information Systems vol 38 no 5 pp 656ndash6872013

[6] J-P Dittrich and M A Vaz Salles ldquoIDM a unified and versa-tile data model for Personal Dataspace Managementrdquo in Pro-ceedings of the 32nd International Conference onVery LargeDataBases (VLDB rsquo06) pp 367ndash378 September 2006

[7] S Pradhan ldquoTowards a novel desktop search techniquerdquoin Database and Expert Systems Applications pp 192ndash201Springer Berlin Germany 2007

[8] M Zhong M Liu and Q Chen ldquoModeling heterogeneousdata in dataspacerdquo in Proceedings of the IEEE InternationalConference on Information Reuse and Integration (IEEE IRI rsquo08)pp 404ndash409 Las Vegas Nev USA July 2008

[9] A Sarma X Dong andAHalevy ldquoDatamodeling in dataspacesupport platformsrdquo in Conceptual Modeling Foundations andApplications pp 122ndash138 Springer Berlin Germany 2009

[10] R Hull and J Su Report on NSF Workshop on Data-CentricWorkflows 2012 httpdcw2009csucsbedureportpdf

[11] R Hull J Su and R Vaculin ldquoData management perspectiveson business process managementrdquo in Proceedings of the ACMSIGMODConference onManagement ofData (SIGMOD rsquo13) pp943ndash947 June 2013

[12] K Bhattacharya N S Caswell S Kumaran A Nigam and FY Wu ldquoArtifact-centered operational modeling lessons fromcustomeengagementsrdquo IBM Systems Journal vol 46 no 4 pp703ndash721 2007

[13] K Bhattacharya R Guttman K Lyman et al ldquoA model-drivenapproach to industrializing discovery processes in pharmaceu-tical researchrdquo IBM Systems Journal vol 44 no 1 pp 145ndash1622005

[14] R Vaculın R Hull T Heath C Cochran A Nigam and PSukaviriya ldquoDeclarative business artifact centric modeling ofdecision and knowledge intensive business processesrdquo in Pro-ceedings of the 15th IEEE International EDOC Enterprise Com-puting Conference (EDOC rsquo11) pp 151ndash160 September 2011

[15] R Vaculın R Hull M Vukovic T Heath N Mills and Y SunldquoSupporting collaborative decision processesrdquo in Proceedings ofthe IEEE 10th International Conference on Services Computing(SCC rsquo13) pp 651ndash658 July 2013

[16] A Nigam and N S Caswell ldquoBusiness artifacts an approach tooperational specificationrdquo IBM Systems Journal vol 42 no 3pp 428ndash445 2003

[17] G Liu X Liu H Qin et al ldquoAutomated realization of businessworkflow specificationrdquo in Proceedings of the 1st InternationalWorkshop on SOA Globalization People and Work (SG-PAWrsquo09) pp 8ndash9 2009

[18] H Hacigumus B Iyer C Li and S Mehrotra ldquoExecuting SQLover encrypted data in the database-service-providermodelrdquo inProceedings of the ACM SIGMOD International Conference onManagment of Data pp 216ndash227 June 2002

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 9: Research Article Modeling and Querying Business Data with ...downloads.hindawi.com/journals/mpe/2015/506272.pdf · monitoring for performance or business concerns, auditing, and compliance

Mathematical Problems in Engineering 9

countingminimum interferential tuples during data partitionand ensuring a lower cost of postprocessing brought bydata partition Considering current business data are mostlystored on cloud we proposed a query rewriting strategy foroff-site encrypted data which has a significant advantage inreducing the postprocessing cost Currently there is littleresearch on business data modeling and querying in the truesense Our research lays great foundation for business datarsquosapplication in enterprise That is the initial step of businessdata management architecture and we will further researchon business data analysis with its lifecycle to fully dig thesignificant value of business data

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (61272098) and the Science and Technology Development Foundation of Shanghai Ocean University.

References

[1] W. Huang, Z. Chen, W. Dong, H. Li, B. Cao, and J. Cao, “Mobile internet big data platform in China Unicom,” Tsinghua Science and Technology, vol. 19, no. 1, pp. 95–101, 2014.

[2] S. Sagiroglu and D. Sinanc, “Big data: a review,” in Proceedings of the International Conference on Collaboration Technologies and Systems (CTS ’13), pp. 42–47, San Diego, Calif, USA, May 2013.

[3] M. Chui, B. Brown, J. Bughin et al., Big Data: The Next Frontier for Innovation, Competition, and Productivity, McKinsey Global Institute, 2011.

[4] M. Singh and S. K. Jain, “A survey on dataspace,” in Advances in Network Security and Applications, vol. 196 of Communications in Computer and Information Science, pp. 608–621, Springer, 2011.

[5] K. Belhajjame, N. W. Paton, S. M. Embury, A. A. A. Fernandes, and C. Hedeler, “Incrementally improving dataspaces based on user feedback,” Information Systems, vol. 38, no. 5, pp. 656–687, 2013.

[6] J.-P. Dittrich and M. A. Vaz Salles, “IDM: a unified and versatile data model for Personal Dataspace Management,” in Proceedings of the 32nd International Conference on Very Large Data Bases (VLDB ’06), pp. 367–378, September 2006.

[7] S. Pradhan, “Towards a novel desktop search technique,” in Database and Expert Systems Applications, pp. 192–201, Springer, Berlin, Germany, 2007.

[8] M. Zhong, M. Liu, and Q. Chen, “Modeling heterogeneous data in dataspace,” in Proceedings of the IEEE International Conference on Information Reuse and Integration (IEEE IRI ’08), pp. 404–409, Las Vegas, Nev, USA, July 2008.

[9] A. Sarma, X. Dong, and A. Halevy, “Data modeling in dataspace support platforms,” in Conceptual Modeling: Foundations and Applications, pp. 122–138, Springer, Berlin, Germany, 2009.

[10] R. Hull and J. Su, Report on NSF Workshop on Data-Centric Workflows, 2012, http://dcw2009.cs.ucsb.edu/report.pdf.

[11] R. Hull, J. Su, and R. Vaculin, “Data management perspectives on business process management,” in Proceedings of the ACM SIGMOD Conference on Management of Data (SIGMOD ’13), pp. 943–947, June 2013.

[12] K. Bhattacharya, N. S. Caswell, S. Kumaran, A. Nigam, and F. Y. Wu, “Artifact-centered operational modeling: lessons from customer engagements,” IBM Systems Journal, vol. 46, no. 4, pp. 703–721, 2007.

[13] K. Bhattacharya, R. Guttman, K. Lyman et al., “A model-driven approach to industrializing discovery processes in pharmaceutical research,” IBM Systems Journal, vol. 44, no. 1, pp. 145–162, 2005.

[14] R. Vaculín, R. Hull, T. Heath, C. Cochran, A. Nigam, and P. Sukaviriya, “Declarative business artifact centric modeling of decision and knowledge intensive business processes,” in Proceedings of the 15th IEEE International EDOC Enterprise Computing Conference (EDOC ’11), pp. 151–160, September 2011.

[15] R. Vaculín, R. Hull, M. Vukovic, T. Heath, N. Mills, and Y. Sun, “Supporting collaborative decision processes,” in Proceedings of the IEEE 10th International Conference on Services Computing (SCC ’13), pp. 651–658, July 2013.

[16] A. Nigam and N. S. Caswell, “Business artifacts: an approach to operational specification,” IBM Systems Journal, vol. 42, no. 3, pp. 428–445, 2003.

[17] G. Liu, X. Liu, H. Qin et al., “Automated realization of business workflow specification,” in Proceedings of the 1st International Workshop on SOA, Globalization, People, and Work (SG-PAW ’09), pp. 8–9, 2009.

[18] H. Hacigumus, B. Iyer, C. Li, and S. Mehrotra, “Executing SQL over encrypted data in the database-service-provider model,” in Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 216–227, June 2002.
