Overview
Our research activities concern the implementation of Web information systems for eGovernment applications
Due to the development of eGovernment initiatives, more and more on-line resources and services are being made available by Public Administrations (PAs)
We make use of temporal database and semantic Web techniques to provide personalized access to such resources and services
In particular, we consider multi-version norm texts (stored in XML format) available in Web repositories
[Figure: timeline showing the original normative text (1) followed by new versions (2) and (3) over time]
Importance of versioning
Temporal concerns are ubiquitous in the law domain
Each normative text changes in time due to different modifications, but keeps its identity
The ability to model temporal dimensions is essential for the management of evolving norms
  it is crucial to reconstruct the consolidated version of a norm
  past versions are also still important
Importance of versioning
Applicability (semantic) versioning also plays an important role
  some norms or some of their parts have or acquire a limited applicability
  personalized version of the norm: a version only containing provisions which are applicable to a citizen’s personal case
[Figure: example norm with Art. 1 (unemployed), Art. 2 (self-employed) and Art. 3 (retired), shown next to the label "Self-employed"; article bodies are placeholder text]
Motivation
Large XML collections of norms are made available by the PA on the Web, but personalization is:
  Absent, e.g. http://www.normeinrete.it (temporal versioning partially supported)
  Predefined in the Website structure and contents, e.g. http://www.italia.gov.it (hardwired by human experts following the life-events approach)
Lack of an effective, flexible, on-demand (“intelligent”, efficient) personalization facility
Objectives
Development of an effective and efficient Web information system where:
  norms are represented as XML documents
  the dynamics of norms in time are captured
  the limited applicability of norms (and their parts) is captured
  selective access and reconstruction of versions is supported by a query engine
Aimed at:
  enabling citizens to access personalized versions of multiversion resources
  improving and optimizing the involvement of citizens in the eGovernance process
The Technological Infrastructure
[Figure: architecture diagram with the components "Web services of Public Administration", "Web services with ontology O_C", "Public Administration DB" (creation/update), "XML repository of annotated norms" and "Simple elaboration unit"; the numbered arrows 1, 2, 3 correspond to the phases below, and the classification step yields a class C_x]
1. identification phase: on-the-fly reconstruction of the digital identity of the authenticated user
2. classification phase: use of the collected digital identity to classify the citizen with respect to the civic ontology O_C
3. querying phase: access and reconstruction of all and only the norms which are applicable to the class C_x
The Civic Ontology
Embodies a classification of citizens based on the distinctions introduced by successive norms that imply some limitations in their applicability (founding acts)
At this stage of the project, we manage “tree-like” ontologies (i.e. class taxonomies induced by the IS-A relationship)
[Figure: class taxonomy rooted at Citizen, with children Unemployed, Employee and Retired; Employee has children Subordinate and Self-employed; Subordinate has children Public and Private]
The modeling approach
Extension of a previous temporal XML model (D&KE 2005) including:
  a temporal multi-version XML schema
    it is based on the hierarchical organization of normative texts: contents-section-article-paragraph
    at each level of the hierarchy, the history of changes is represented by the (time-stamped) versions produced
    it supports ancestor-descendant inheritance
  temporal manipulation operations
Addition of applicability annotations in order to support semantic versioning
The temporal XML schema
4 Temporal Dimensions:
  Publication time: time of publication on the Official Journal
  Validity time: time the norm is in force
  Efficacy time: time the norm can be applied
  Transaction time: time the norm is stored in the system
[Figure: temporal XML schema tree. Law (attributes Num, Type, Publication, each marked R) contains Title and Contents; Contents is organized into Section, Article and Paragraph levels, each with a Num attribute (R), Ver elements and, where shown, a Heading. Each Ver carries Num (R), An_ref (O) and the temporal attributes (TA) Vt_Start (R), Vt_End (O), Tt_Start (R), Tt_End (O), Et_Start (R), Et_End (O). R and O presumably mark required and optional attributes.]
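The start/end attribute pairs of the schema can be read as time periods attached to each version. The following is a minimal sketch of that reading, not the system's actual data model: the names `Period`, `VersionTimestamps` and `valid_and_in_effect` are hypothetical, closed intervals are assumed, and a missing optional end attribute is assumed to mean "still open".

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

FOREVER = date.max  # assumption: an omitted *_End attribute means "open-ended"


@dataclass(frozen=True)
class Period:
    """A closed time interval; end=None models a missing optional end attribute."""
    start: date
    end: Optional[date] = None

    def overlaps(self, other: "Period") -> bool:
        # Two closed intervals overlap when each one starts before the other ends.
        self_end = self.end or FOREVER
        other_end = other.end or FOREVER
        return self.start <= other_end and other.start <= self_end


@dataclass(frozen=True)
class VersionTimestamps:
    """Per-version temporal attributes (Vt_*, Et_*, Tt_* in the schema)."""
    validity: Period     # time the norm is in force
    efficacy: Period     # time the norm can be applied
    transaction: Period  # time the norm is stored in the system


def valid_and_in_effect(ts: VersionTimestamps, window: Period) -> bool:
    """True if both validity and efficacy time overlap the requested window."""
    return ts.validity.overlaps(window) and ts.efficacy.overlaps(window)
```

Keeping the dimensions as separate periods lets a query constrain any subset of them independently.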
Semantic versioning
A pre-order and post-order numbering scheme is introduced in the tree-like ontology
Classes are identified by means of their pre-order code
Encoding is exploited in query processing for quick ancestor-descendant checking
Applicability annotations (AA) are added to semantic versions of document parts as references to the ontology classes
[Figure: the civic ontology tree annotated with (pre-order, post-order) codes: Citizen (1,8); Unemployed (2,1), Employee (3,6), Retired (8,7); under Employee, Subordinate (4,4) and Self-employed (7,5); under Subordinate, Public and Private with codes (5,2) and (6,3)]
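The numbering scheme can be illustrated with a short sketch. It assumes the taxonomy from the figure, with Public listed before Private under Subordinate (the order of these two is a guess from the figure residue), and the function names are hypothetical. A class A is an ancestor of (or equal to) a class D exactly when A's pre-order code is not greater and its post-order code is not smaller than D's, so the check needs only two integer comparisons.

```python
from typing import Dict, List, Tuple

# Civic ontology as a parent -> children map (child order assumed from the figure).
TAXONOMY: Dict[str, List[str]] = {
    "Citizen": ["Unemployed", "Employee", "Retired"],
    "Employee": ["Subordinate", "Self-employed"],
    "Subordinate": ["Public", "Private"],
}


def number_tree(root: str, children: Dict[str, List[str]]) -> Dict[str, Tuple[int, int]]:
    """Assign (pre-order, post-order) codes to every class in one traversal."""
    codes: Dict[str, Tuple[int, int]] = {}
    pre = 0
    post = 0

    def visit(node: str) -> None:
        nonlocal pre, post
        pre += 1
        node_pre = pre          # numbered on the way down
        for child in children.get(node, []):
            visit(child)
        post += 1               # numbered on the way back up
        codes[node] = (node_pre, post)

    visit(root)
    return codes


def is_ancestor_or_self(anc: Tuple[int, int], desc: Tuple[int, int]) -> bool:
    """Ancestor test using only the two codes, with no tree navigation."""
    return anc[0] <= desc[0] and anc[1] >= desc[1]


CODES = number_tree("Citizen", TAXONOMY)
```

For example, `is_ancestor_or_self(CODES["Employee"], CODES["Self-employed"])` compares (3,6) with (7,5) and holds, while the same check from Subordinate (4,4) fails.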
Semantic versioning
Applicability is inherited by descendant nodes unless locally redefined
By means of redefinitions we can also introduce, for each part of a document, complex applicability properties
  Restrictions with respect to ancestors
  Extensions with respect to ancestors
<article num="1">
  <ver num="1">
    <aa applies_to="3"/>
    [… Temporal attributes … ]
    <paragraph num="1">
      <ver num="1">
        [ … Text … ]
        <aa applies_to="4"/>
        [… Temporal attributes … ]
      </ver>
    </paragraph>
    <paragraph num="2">
      <ver num="1">
        [ … Text … ]
        <aa applies_also="8"/>
        [… Temporal attributes … ]
      </ver>
    </paragraph>
  </ver>
</article>
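One possible reading of these annotations is sketched below. It assumes that `applies_to` replaces the inherited set of classes (a restriction or redefinition) and that `applies_also` adds to it (an extension). The `SAMPLE` document is a simplified copy of the fragment above without text and temporal attributes, the `POST` table is taken from the codes in the ontology figure, the comma-separated multi-class value is an assumption, and all function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import FrozenSet, List, Tuple

# pre-order code -> post-order code, as shown in the ontology figure
POST = {1: 8, 2: 1, 3: 6, 4: 4, 5: 2, 6: 3, 7: 5, 8: 7}

SAMPLE = """
<article num="1"><ver num="1">
  <aa applies_to="3"/>
  <paragraph num="1"><ver num="1"><aa applies_to="4"/></ver></paragraph>
  <paragraph num="2"><ver num="1"><aa applies_also="8"/></ver></paragraph>
</ver></article>
"""


def parse_classes(value: str) -> FrozenSet[int]:
    # Assumption: several classes could be given as a comma-separated list.
    return frozenset(int(code) for code in value.split(","))


def effective_classes(ver: ET.Element, inherited: FrozenSet[int]) -> FrozenSet[int]:
    """Apply a local <aa> redefinition, or keep the inherited applicability."""
    aa = ver.find("aa")
    if aa is None:
        return inherited
    if "applies_to" in aa.attrib:
        return parse_classes(aa.attrib["applies_to"])
    if "applies_also" in aa.attrib:
        return inherited | parse_classes(aa.attrib["applies_also"])
    return inherited


def covers(classes: FrozenSet[int], citizen: int) -> bool:
    """True if the citizen's class is a descendant-or-self of any listed class."""
    return any(c <= citizen and POST[c] >= POST[citizen] for c in classes)


def applicable_paragraphs(article_xml: str, citizen: int) -> List[Tuple[str, str]]:
    """Return (paragraph num, version num) pairs applicable to the citizen class."""
    article = ET.fromstring(article_xml.strip())
    result = []
    for art_ver in article.findall("ver"):
        art_classes = effective_classes(art_ver, frozenset())
        for par in art_ver.findall("paragraph"):
            for par_ver in par.findall("ver"):
                if covers(effective_classes(par_ver, art_classes), citizen):
                    result.append((par.get("num"), par_ver.get("num")))
    return result
```

Under these assumptions, paragraph 1 is restricted to class 4 and its descendants, while paragraph 2 ends up with the set {3, 8}, so a citizen of class 7 (Self-employed) sees only paragraph 2.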
Example of full search
John Smith is a self-employed citizen. He is interested in the text of all the norms which contain paragraphs dealing with health care, which were valid and in effect between 2002 and 2004, and which are applicable to his case (civic class 7).
4 orthogonal constraints: structural, textual, temporal, semantic
FOR $a IN norms
WHERE textConstr($a//paragraph//text(), 'health AND care')
AND tempConstr("vTime OVERLAPS PERIOD('2002-01-01','2004-12-31')")
AND tempConstr("eTime OVERLAPS PERIOD('2002-01-01','2004-12-31')")
AND applConstr('class 7')
RETURN $a
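To show how the four constraints combine independently, here is a minimal in-memory sketch. It is not the query engine described in these slides: the record type `ParagraphVersion`, the helper names and the sample data in `DB` are all hypothetical, dates are ISO strings (which compare correctly as text), and the structural constraint is implicit because every record is already a paragraph version.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Tuple

# pre-order code -> post-order code, as shown in the ontology figure
POST = {1: 8, 2: 1, 3: 6, 4: 4, 5: 2, 6: 3, 7: 5, 8: 7}


@dataclass(frozen=True)
class ParagraphVersion:
    norm_id: str
    text: str
    v_time: Tuple[str, str]  # validity period (start, end) as ISO dates
    e_time: Tuple[str, str]  # efficacy period (start, end) as ISO dates
    classes: FrozenSet[int]  # effective applicability (pre-order codes)


Constraint = Callable[[ParagraphVersion], bool]


def text_constr(*keywords: str) -> Constraint:
    """All keywords must occur in the paragraph text (case-insensitive AND)."""
    return lambda pv: all(k.lower() in pv.text.lower() for k in keywords)


def temp_constr(field: str, start: str, end: str) -> Constraint:
    """The chosen time dimension must overlap the closed period [start, end]."""
    def check(pv: ParagraphVersion) -> bool:
        s, e = getattr(pv, field)
        return s <= end and start <= e
    return check


def appl_constr(citizen: int) -> Constraint:
    """The citizen's class must be a descendant-or-self of an annotated class."""
    return lambda pv: any(c <= citizen and POST[c] >= POST[citizen] for c in pv.classes)


def full_search(versions: List[ParagraphVersion], constraints: List[Constraint]) -> List[str]:
    """Return ids of norms with at least one version satisfying every constraint."""
    hits: List[str] = []
    for pv in versions:
        if all(c(pv) for c in constraints) and pv.norm_id not in hits:
            hits.append(pv.norm_id)
    return hits


# Hypothetical sample data, for illustration only.
DB = [
    ParagraphVersion("norm-A", "Health care for employees",
                     ("2001-01-01", "2003-12-31"), ("2001-06-01", "2003-12-31"), frozenset({3, 8})),
    ParagraphVersion("norm-B", "Health care for subordinate workers",
                     ("2002-01-01", "9999-12-31"), ("2002-01-01", "9999-12-31"), frozenset({4})),
    ParagraphVersion("norm-C", "Public health funding",
                     ("2002-01-01", "9999-12-31"), ("2002-01-01", "9999-12-31"), frozenset({1})),
    ParagraphVersion("norm-D", "Health care",
                     ("1990-01-01", "1999-12-31"), ("1990-01-01", "1999-12-31"), frozenset({1})),
]

QUERY = [
    text_constr("health", "care"),
    temp_constr("v_time", "2002-01-01", "2004-12-31"),
    temp_constr("e_time", "2002-01-01", "2004-12-31"),
    appl_constr(7),
]
```

Because each constraint is an independent predicate, any of them can be dropped or added without touching the others, which mirrors the orthogonality stated on the slide.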
Example of full search
[Figure: evaluation of the example query against the normative DB and the civic ontology. The structural/textual part of the query (norm//paragraph//text()) is matched against a norm tree containing Article 1 (Ver 1, AA=3) with Par 1 and Par 2, and Article 2; the versions shown carry temporal attributes (TA) and applicability annotations (AA=4 and AA=3,8), with the texts X ("Health care…"), Y ("Public health…") and Z ("Health care…"). The semantic part ('class 7') is matched against the civic ontology tree with its (pre-order, post-order) codes.]
Our prototype system (“native” approach)
The query engine is able to access and retrieve only the strictly necessary data
  selection relies on ad-hoc data structures supporting multi-versioning
  storage granularity is finer than the entire documents used by standard XML engines (including our previous prototype, the “stratum” approach)
Only the parts which satisfy the temporal and applicability constraints are used for the reconstruction of the retrieved documents
There is no need to retrieve whole XML documents and build space-consuming structures such as DOM trees
Enhanced query processing efficiency
Reduced memory requirements
Evaluation benchmark
Three XML document sets
  5000 documents (120MB)
  10000 documents (240MB)
  20000 documents (480MB)
Variable document size: min = 2KB, avg = 24KB, max = 125KB
Five different query types
  Queries on keywords (structural + textual constraints)
    Q1: keywords in contents
    Q2: keywords in type and contents
  Temporal queries (structural + temporal constraints)
    Q3: conditions on publication, validity and transaction time
  Mixed queries (structural + textual + temporal constraints)
    Q4, Q5: with keywords and temporal conditions
Five variants with semantic constraints (personalization of the queries)
  Qx-A: with additional applicability constraints
Performance evaluation
The new system outperforms its predecessor (“stratum” approach) as far as temporal queries are concerned
The new system showed very high efficiency in personalization query processing
  selection of qualifying versions is improved by a technique based on simple comparisons of pre-post encodings
  0.5-1% more time than for the original versions
  3-4% storage space overhead
The new system showed good scalability figures in every type of query context
  the computing time grows sublinearly with the number of documents (it depends mainly on the size of the results)
Conclusions
We presented our research work concerning the design and implementation of efficient Web-based information systems for eGovernment applications
We introduced support for personalized access to resources on the basis of the digital identity of citizens (relying on semantic versioning and ontology mapping)
We developed an efficient platform (“native” approach) for which a specialized Multi-version XML Query Processor has been designed and implemented
We showed our approach to be very efficient in a large set of experimental situations with good scale-up figures under growing load configurations
Future Work
Extensions of the current framework
  more advanced application requirements may include a more sophisticated ontology definition (graph-like), possibly versioned, and more advanced reasoning services
Completion of the technological infrastructure usable in a large Web-based eGovernment scenario, including identification and classification services
Assessment of our prototype systems in a concrete working environment with real users and with a large repository of real norms
Extension to a more general application domain (Web personalization via ontology-based user profiling)